...and another thing

Published: 01/09/2009 | Last Revision: 07/07/2010

I must confess I am somewhat astonished at the rate at which the computing world is moving to the virtualised space. It started as a trickle, quickly turned into a flood and is now a raging torrent. The business world appears to have woken up to the need to do something about a number of factors that are really starting to hurt. Firstly, we have run out of rack space, and buying or renting more is almost prohibitively expensive. Secondly, even a four-year-old server offers a price/performance ratio that is poor value compared with new hardware. Thirdly, new server hardware has a performance/power consumption ratio which is almost beyond comparison with servers from some years back. Hence the Return On Investment (ROI) of moving to brand new, cutting-edge hardware is almost overwhelming.
And that’s just the start. New hardware comes in all sorts of exotic flavours. Want a 1U server with 8 processing cores, 32GB of RAM and 6 hard disks in a hardware RAID configuration? Just tick the boxes at HP, Dell or any one of a number of top-flight vendors. Of course, these can come with several handfuls of gigabit Ethernet ports, and a couple of PSUs for good measure. All of which means that the workload you can put onto these boxes is vastly bigger than the boxes of old could handle.
And therein lies a problem. How do we get more workload onto a box when our workload is fairly static, or growing at a controllable rate? The orchestra starts its big sweeping tune, the walls roll back and your data centre magically transforms into a ballroom. “Shall we dance?” whispers your IT Director. Anyone can get a 4:1 virtualisation ratio these days – that means four servers onto one new box. If you have a bunch of tired old servers doing core roles which are deeply undemanding, like DC or DNS or so forth, then the ratio goes up and up. And as the ratio goes up and up, the old box count falls away with a locked-in reduction in rack space requirement, power consumption and air-con load.
Is there any downside to this? Well, yes – you never get something for nothing, and there are some wrinkles that we need to be aware of, from both a development and a deployment point of view.

Getting out of Kansas

Firstly, you need to wake up to the big issue – your software will be virtualised. You might not like this, but tough. It’s 2009 going on 2010, and we are not stuck in Kansas, aka NT4 Server land, any more. If you don’t like this idea, go plant potatoes because your customers will not take no for an answer. Secondly, it means that it is going to be incumbent upon you to do much more testing.
Let me be absolutely crystal clear about this, just in case there is any hint of confusion. Windows Server 2003, 2008, 2008 R2 and so forth are certified to run on Microsoft’s Hyper-V and on VMware’s ESX family. Your application runs on Windows Server, and that’s what you specify to your customers. Now here comes the crunch. Not only do you have to certify your app as running on bare metal; you need to certify it on Hyper-V and on ESX too. If you want to fill in the other bits of market sector coverage, then there are other virtualisation platforms that should be on the list too. But I will concede that it’s OK to start with the two biggest, and those are Hyper-V and ESX.
To make it even more crystal clear: a customer will phone you with a technical support question, and of course everything is progressing just fine right up to the point where the customer says “Oh, and it’s running on ESX4.” Your reaction at this point will determine much about the future viability of your company. If you suck air through your teeth and say “ah well, hmmm, well we don’t support running in a virtualised environment”, I can guarantee that you have upset your customer. Said customer will, with great justification, be hugely unamused at your reaction and see no reason for you to hide behind this pathetic excuse. Not only that, you will find that the customer will probably start an internal costing and logistics project to determine just how easy it will be for the business to get off your platform and onto someone else’s.
The truth of the matter is that a well-written application should be just fine running on a modern, leading-edge hypervisor environment. It should neither know nor care about these things. Your customer knows this, and will judge you accordingly. However, there are some gotchas that you really do need to consider before you brazenly state that you are supported on the various virtualised environments. As they say, assumption is the mother of all screw-ups. And many assumptions in the software world are down to sheer laziness, or a bad design which has magically crept into a production-ready form.
Let’s take some examples. If you have a modern leading-edge virtualisation environment, then it is perfectly possible for you to change the virtual hardware on the fly. If the box looks like it is going a bit slowly, then add in more CPU cores. If it is idling and doesn’t need them any more, take some away. The same goes for memory. Got a big workload to handle? Then shovel a few more gigabytes of RAM into the virtual machine. And this can happen in real time! Now, clearly the operating system your app is running on is going to have to support this – and you need to be looking at the latest versions of Windows Server to get proper capability here. But how will your app react? It might not be too upset at getting more processor cores to play with, in that it might just ignore them. But having some taken away? What about assumptions about RAM on start-up? Is this a value that is being read once and then stored away to be referred to during the runtime life of the application? What nasty little bodges and secrets do we have lurking away both in our own code, and in that of the frameworks that we use?
For example, I was completely amazed to find that Adobe’s Photoshop Elements couldn’t cope with the removal of a printer driver while it was running. For some reason, Adobe decided to enumerate the printer pool on start-up, and then steadfastly refused to believe a new printer had been installed unless you restarted the app. Worse still, it was fixated on the thought that a printer still existed when you had just deleted it. These might seem like stupid examples, and to an extent they are. But if assumptions like this are being made in a product of that stature, it is entirely likely that nasty little assumptions are littering your codebase too. Not only do you need to ensure that they have been swept away, but you need to be testing for them by having the appropriate VM environments as part of your code testing.
And there is one last little sting in the tail. It is almost traditional to push the customer towards a new version. After all, who wants to maintain multiple codebases? But there may well be plenty of users out there who are perfectly happy on a version of your product which is a few years old. Paying to fix coding bugs which have been highlighted by hosting in a VM is the sort of cost that many users will be unhappy to swallow. A loud and confident, “Ah yes, but we never promised it back then...” will fall on very deaf ears if the customer feels they have to fork out a wodge of cash just to get their existing working environment moved onto a VM installation. A wise company would think very carefully about charging for such bug fixes, and maybe even come to an accommodation for a partial-fee upgrade if such a problem occurred. Of course, everyone has to eat, support costs money, and no-one wants to give away features at a reduced cost to keep a customer happy. But it is clear that putting application code onto VM-hosted environments can throw up nasties, and it is not clear that the customer should have to pay for this sort of fix.
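To make the start-up assumption problem concrete, here is a minimal sketch in Python. It is an illustration only, not anything taken from a real product: the class names SnapshotSizer and LiveSizer are hypothetical, and the changing core count is simulated with a dictionary rather than a real hypervisor. The first class reads the core count once and trusts it forever, which is the bodge described above; the second asks again every time it needs the answer.

```python
import os
from typing import Callable, Optional

# A probe is any callable that reports how many cores are visible right now.
CoreProbe = Callable[[], Optional[int]]


class SnapshotSizer:
    """Fragile: reads the core count once at start-up and trusts it forever."""

    def __init__(self, probe: CoreProbe = os.cpu_count) -> None:
        self._cores = probe() or 1  # stored away, never refreshed

    def workers(self) -> int:
        return self._cores


class LiveSizer:
    """More robust: re-queries the core count each time work is sized."""

    def __init__(self, probe: CoreProbe = os.cpu_count) -> None:
        self._probe = probe

    def workers(self) -> int:
        # os.cpu_count() may return None, so fall back to a single worker.
        return max(1, self._probe() or 1)


if __name__ == "__main__":
    # Simulated environment (an assumption for the demo): an administrator
    # takes cores away from the virtual machine while the app is running.
    visible = {"cores": 8}
    probe = lambda: visible["cores"]

    stale, live = SnapshotSizer(probe), LiveSizer(probe)
    visible["cores"] = 2

    print(stale.workers(), live.workers())  # prints "8 2"
```

Whether a real call such as os.cpu_count() actually reflects a hot-added or removed core depends on the guest operating system and the runtime, which is precisely why this sort of behaviour needs to be checked in a real VM test environment rather than assumed.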
