So how important is LiveMigration/VMotion now?

One of Microsoft’s big marketing statements, which I’ve heard several times, is that LiveMigration isn’t that important since clients don’t change when they do work on hardware, even with LiveMigration. I’ll cover in depth why this is a flawed thought for an enterprise company in a future blog entry.

Along comes a critical use case this past week. MS08-067 came out and threw most companies I know of into some serious chaos while they rolled this patch out ASAP. This one impacts every Windows OS, including Server Core, so anyone using Hyper-V is obviously affected right now. Let’s walk through trying to deploy this to 120 Hyper-V hosts with Quick Migration (which causes a service interruption) as fast as humanly possible, with business buy-off to do it ASAP outside of maintenance zones. Let’s assume we are talking about a patch that ONLY affects virtualization hosts (I know, I know... not realistic with Hyper-V, bear with me).

Hyper-V Scenario with Quick Migration:

Assumptions to setup:

6 people * 4 hosts per person per hour gives us roughly 24 hosts, and their fail-over pairs, getting updated each hour and a half. 240 hosts divided by 24 gives us 10 hours to do all these migrations in a rush, with patch start times staggered by about 7 minutes per server, which is not unreasonable considering console connect and login times. This also assumes each person is perfect in their execution.
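
The back-of-envelope arithmetic above can be sketched as a few lines of Python. This is only a rough sketch: the function name and the `batch_hours` parameter are made up for illustration. With one-hour batches it reproduces the 10-hour figure; if each batch of 24 really takes the hour and a half mentioned above, the same numbers stretch to 15 hours.

```python
import math


def rollout_hours(total_hosts, people, hosts_per_person_per_batch, batch_hours=1.0):
    """Estimate wall-clock hours to patch every host in fixed-size batches.

    Assumes perfect execution: no failed patches, no retries, no waiting
    on business units (all hypothetical simplifications).
    """
    hosts_per_batch = people * hosts_per_person_per_batch
    batches = math.ceil(total_hosts / hosts_per_batch)
    return batches * batch_hours


# 6 people * 4 hosts each = 24 hosts per batch; 240 hosts = 10 batches
print(rollout_hours(240, 6, 4))       # one-hour batches
print(rollout_hours(240, 6, 4, 1.5))  # hour-and-a-half batches
```

Either way, every one of those batches is a service interruption with Quick Migration.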

This doesn’t take into account the issues with the business units that are dependent on your services:

VMware Scenario with DRS & VMotion:

I have been able to do a full rushed patch deployment like this in my environment with an average of 30 Servers per Host in about 6 hours by myself.

We would start this patch application immediately upon notification since VMotion does not cause a network outage or service interruption.   The window of potential infection is incredibly small at this point as I don’t wait for a maintenance zone and start the update immediately on the Hosts.

So the question for a real enterprise is: how much is this worth? For me it’s pretty obviously worth it. No downtime. No service impact. Just a continuously available service for my clients, who don’t have to care about the latest patch.

Posted on October 28, 2008 at 3:37 am by iguy
In: Server Virtualization

6 Responses

  1. Written by Stu Fox
    on 5 November 2008 at 10:21 pm

    So why would you have people doing that work? Wouldn’t you have an automated software distribution system taking care of this for you (e.g. System Center Configuration Manager)? You just script the process, scheduling & maintenance windows take care of when it happens. And then you have Operations Manager monitoring and alerting, and sending you a text message should a node/patch fail. That’s what management is all about.

    And if you’re going to do Vmotion during the day, what’s your failback position should a node fail during the patch? Have you been through change control and had a move of your machines approved for daytime? Remember, it only has to go wrong once for you to look bad.

    Even though Hyper-V v2 has live migration, I’m still a fan of change control & outage windows, regardless of your hypervisor. I don’t like the possibility of things affecting my users during their working hours.

    Cheers

    Stu
    Disclaimer: I work for Microsoft NZ.

  2. Written by Iguy
    on 6 November 2008 at 8:00 am

    It’s all a question of which technology you believe in, have experience with, and know the failure points of.

    My fail-back is that VMware’s infrastructure is much more advanced than just “VMotion”. You set up a cluster of 10 hosts. You then take one host out of production using maintenance mode, update it in that cluster, check the host out, and return it to service. If it fails, you skip it and keep going. A typical well-designed cluster can handle at least one failure, if not two, with no impact on the performance level of the cluster.

    In a perfect world every company would have a perfect deployment system, something to help automate the patch deployment. I personally have had less than ideal experiences using SCCM to deploy patches, so I’d use something else. The idea is the same, though.

  3. Written by Malaysia VMware Communities
    on 6 November 2008 at 3:53 pm

    Live migration is a very, very important feature for me and my team. I am used to VMotioning anytime I like.

  4. Written by Mike DiPetrillo
    on 8 November 2008 at 1:04 pm

    Stu,

    Change control is important in an enterprise. I used to work for several large banks and if something came in that was marked emergency then we’d have an emergency change control meeting, get the change approved, and schedule a window. Now companies can make that window happen during the day. Not only does this get things patched sooner/faster (after all this is emergency) it also saves the company money from not having to pay overtime. Should something go wrong you have your full, normal working staff there to take care of it. If you’re doing the auto patch thing at night you have a few problems:

    1) You need to spend time scripting the whole process and then finding some node to test the script on. What happens if the process can’t be scripted?

    2) When something goes wrong you now have to wait until the appropriate people can be notified, wake up, get some coffee, drive to the office, and finally fix the problem.

    VMotion is about freedom and flexibility. It goes beyond just patches. What happens if 100 more people start hitting my app and it needs to move to another host? Do you take downtime to move it with Quick Migration? People are using it a lot at this point. Do you just say sorry, you guys get bad performance until change control approves a move tonight or this weekend? Or do you move it with VMotion and fix the issue right away? Same thing with the storage side and Storage VMotion.

    I’m really sorry that Microsoft is the *only* virtualization vendor that hasn’t figured out the simple task of live migration or VMotion. Every other vendor on the planet has it today. Perhaps once you all get it you’ll realize what a true liberator it is for customers and change your mind on its usefulness.

    P.S. I work for VMware, but you already know that, Stu.

  5. Written by Stu Fox
    on 17 November 2008 at 12:09 am

    Mike

    I never said live migration isn’t useful – I think it’s a powerful technology and I can’t wait for us to have it as well. I just said that it’s not always as straightforward as it sounds. I think you’re also well aware that Microsoft has figured it out (the demos are out there of R2).

    Yeah, auto patching isn’t without potential issues, but neither is live migration. If you’ve got an IT shop mature enough to have Vmotion as part of its standard processes, then you’ve probably got an IT shop that is mature enough to auto patch and alert as well. Remember that live migration doesn’t help you when your application fails – so if you’ve got a business critical application you’ve likely got other measures in place to ensure availability (Clustering/load balancing/really expensive synchronisations). You still have to deal with the same issues when patching those as well.

    As always, I enjoy the dialog!

    Cheers

    Stu

  6. Written by Vikash Kumar Roy
    on 6 January 2009 at 1:04 pm

    I read Stu’s comments on my VMware blog and it seems like he is only aware of MS virtualization and its capabilities. FYI, I have used both and I have no hesitation in admitting that VMware is very mature in its virtualization technologies. Take any component of virtualization and you can see the result for yourself.
    My view: if you really want to compare apples to oranges, you need to taste both and then judge which one you like most.
    I am neither an MS employee nor a VMware employee, just a devotee of VMware.
