Monster VMs, Mass Storage & How to Check Storage Used

Jason Boche posted an interesting discovery regarding Monster VMs and the amount of storage one can attach to them.  One of the large projects being designed today is likely to hit this limit with just a single VM, since each such VM will have about 75TB assigned to it.  With 4 or 5 VMs of this size in the design, our standard ESXi host will have around 300+ TB assigned to it.  Thankfully RDMs are, in theory, not impacted by this limit, so a workaround is available.  Not ideal, yet doable.

One of the questions that came out of this news is “Does this issue impact the rest of our environment?”  Will it explain some of the random stability issues we have seen?

The script below reports the total VMDK capacity behind every single host.  To use it, first check whether each host itself is under the limits.  Then look at the cluster level, based on the HA rules, and see what the totals would become if a host failed.  That should give an idea of how close your clusters might be to hitting this limit.

if (-not (Get-PSSnapin | Where-Object { $_.Name -eq "VMware.VimAutomation.Core" })) {
    Add-PSSnapin -Name "VMware.VimAutomation.Core"
}

$vcenters = "vCenter"

Connect-VIServer -Server $vcenters -Credential (Get-Credential)

$diskuse = @()

foreach ($cluster in Get-Cluster) {
    "Working on cluster $cluster"
    foreach ($vmhost in Get-VMHost -Location $cluster) {
        "`tWorking on host $vmhost"
        $totalDisk = 0
        foreach ($vm in Get-VM -Location $vmhost) {
            $view = $vm | Get-View
            foreach ($disk in $view.Guest.Disk) {
                # Guest-reported partition capacity, rounded to MB
                $totalDisk += [math]::Round($disk.Capacity / 1MB)
            }
        }
        "`t`tHost total disk: $totalDisk MB"
        $diskuse += New-Object PSObject -Property @{
            Cluster     = $cluster.Name
            VMHost      = $vmhost.Name
            TotalDiskMB = $totalDisk
        }
    }
}

$diskuse | Export-Csv "OpenFileDiskUse.csv" -NoTypeInformation
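One caveat: the script sums guest-reported partition capacity (Guest.Disk), which counts RDM-backed disks too, even though RDMs in theory do not count against this limit.  A hedged alternative, assuming PowerCLI's Get-HardDisk cmdlet, sums only provisioned flat-VMDK capacity per host and skips RDMs:

```powershell
# Sum only flat VMDK provisioned capacity per host; RDM disks (the
# RawPhysical / RawVirtual types) are excluded since they are not
# affected by the heap limit. Requires a live vCenter connection.
foreach ($vmhost in Get-VMHost) {
    $sum = (Get-VM -Location $vmhost | Get-HardDisk -DiskType Flat |
            Measure-Object -Property CapacityKB -Sum).Sum
    "{0}: {1:N0} MB in flat VMDKs" -f $vmhost.Name, ($sum / 1KB)
}
```

This is a sketch, not a drop-in replacement for the script above; it measures provisioned rather than guest-seen capacity, which is closer to what the limit actually cares about.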

With default settings, the magic numbers to look for are:

  • ESXi 4.x – 4TB
  • ESXi 5.1 – 8TB
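As a minimal sketch of checking the exported data against these numbers (the 80% warning threshold and the helper name are my assumptions, not part of the original script):

```powershell
# Flag hosts approaching the default heap limit. The threshold here
# assumes ESXi 5.x (8TB); use 4TB for 4.x hosts.
$limitMB = 8TB / 1MB      # limit expressed in MB, matching TotalDiskMB
$warnPct = 0.8            # start worrying at 80% of the limit

function Test-HeapLimit {
    param($Rows)          # rows shaped like the script's CSV output
    $Rows | Where-Object { [double]$_.TotalDiskMB -ge ($limitMB * $warnPct) }
}

# Usage against the CSV exported above:
#   Test-HeapLimit (Import-Csv "OpenFileDiskUse.csv")
```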

EUC1453 – Managing from the Middle with Horizon

The Post-PC world is not Mac vs. Windows.  It is all about the shift to multi-device, data mobility & anywhere access.  As such, applications and the means of access need to change.

Life for end clients has gotten much more complicated due to multiple devices, multiple types of applications and so on, so management of the environment is very complex behind the scenes.  The requirements for cost management, ease of use & security still exist; those basic back-end necessities are crucial to IT.

How did this happen?  It's the piecemeal approach.  IT had to get a tool to manage the Mac, then another for iOS, then Android and so forth.  The same goes for Web tools, SaaS tools, mobile apps, etc.

Consumerization of IT really means it has to be easy, or the end user will go somewhere else.  To deliver this, VMware’s approach is to manage from the middle: My Apps, My Files, Native Experience.

The Horizon suite aims to manage from the middle and be the portal that facilitates access and control, across the whole end-user computing space, to applications and devices.  By doing this, a catalog can be offered, and file sharing can be offered and controlled.  Central management is also required to offer security policy controls.

The three ways of managing, from Horizon’s viewpoint, are Identity, Policy & Context.  Identity is who you are.  Context is what you are trying to access, from where, and on what OS.  This is different than before, when the PC was a known entity; now there are far more choices and options.  Policy takes Identity and Context and applies rules across both of them.

Horizon offers a place to manage, secure, track, deliver and make things functional.  The goal is to encourage end users to opt in instead of forcing things down on them.  End customers will find a way around IT wherever possible; getting them to buy in is the future.

INF-SEC1840 – ESXi Hardening Guide and Security Practices

The ESXi Hardening Guide started in the 2.x days as a best-practices document around security.  In 4.x, VMware started breaking it into categories of information, with improvements over the years.  Some feedback on the 4.x guide: it was presented as a PDF, offered limited ways to resolve and mitigate the listed risks, and was not usable in a programmatic format.

The new format in 5.0 is a spreadsheet guide with categorization by component and sub-components.  Now this spreadsheet includes techniques to implement and apply these recommendations.  Look at http://vmware.com/go/securityguides for more information.

Automating these solutions and suggestions is the next obvious step in addressing security concerns.  This is where SCAP, the NIST standard for security audit and validation, comes in.  The standard enables checklist-style validation approaches.  XCCDF is the primary XML rule-based format for validation, testing and assessment: it sets up the checklist approach, and then OVAL definitions plus a fixtext area provide the checks and remediation, which can be programmatic or just manual steps.  Together, these two give a standardized approach to building and using security information and validation.

OVAL is the Open Vulnerability Assessment Language.  Today there are over 13,500 definitions covering different versions of all the OS platforms.  It is open and community driven, with a significant amount of information being created every day.  The standard links with CVE to show vulnerability scoring, which is timely and updated reasonably quickly.

INF-VSP1365 – Software Defined Security & Networking

The 4 main layers that make Software Defined Networking & Security work are, from the bottom up:

  • Abstraction
  • Pooling
  • Service Insertion
  • Administration

Today our networking and security structures are built around the physical entities that we set up, so the design is tied to those physical entities and how they are configured.  In the hypervisor world, where VMs and services can be tied together into virtual datacenters, the physical entities become the limiting factor for better containment of security.

With Software Defined Networking, it is now possible to take a set of VMs tied together in a vApp container and apply security/networking around that container.  No longer tied just to physical devices or IPs, policy can cover the entirety of the VM container.  This changes the security conversation to be about the smallest realistic entity, the OS instance, instead of just one property of that OS instance (its IP address).

VMware has been working with some integrations in this space with F5, RiverBed, Symantec, SafeNet, Brocade, Emulex as a short list.

F5, one of the leaders in load balancing, is helping speed up application provisioning.  Today provisioning can easily take 4+ weeks in some businesses, due to internal over-the-wall handoffs with high failure rates: setting up the policy for infrastructure, then selecting and applying the policy from the service catalog.  F5 has reduced this to 25 minutes to fully automatically deploy these rules, by integrating their Enterprise Manager through the vShield Manager API.

The RiverBed Cascade product is another integration point, with vXLAN.  One of the big challenges is the loss of network visibility & control.  RiverBed will address this with IPFIX monitoring in Q4 this year.  They can dig into the vXLAN protocol with deep inspection, by reading into the Virtual Distributed Switch.  This gives performance management for Software Defined Networking, and it can break down and work across multi-tenancy designs.

Symantec, a heavyweight in security, has some deep integrations and some overlays.  Not much to be said here; it depends on which product line is examined.

The thing that makes this possible is the simplicity of the integration capabilities.  For many of these Partners they can continue to focus heavily on their products and be able to create a view into their data via a vCenter Plugin reasonably easily.

Intuit, maker of Quicken, has been working hard on delivering the Software Defined Datacenter.  They have many steps from R&D to production: Dev -> Test -> Performance -> Pre-Production -> Production -> Ongoing Compliance.  They need to support all of this on top of adding security, stability and HA/DR, and of course lowering costs.  In the end, the goal is to provide IT agility.

Legacy designs meant they had to create unique zones of systems, including physical routers and firewalls, to handle public, private, dev and sensitive compartmentalization.  This made it very slow and difficult to provision new systems for teams to use; the average app is a 3-tier application.  CapEx costs were very high, and there was no isolation within each zone.  This over-the-wall throwing of tickets means it takes 3+ weeks at best to get something set up.

The new design allows Intuit to pool everything together into a virtual hosting zone from which the business gets systems.  This software-defined datacenter is their new solution.  They capture all the information in a “blueprint form” that includes the VMs, storage, ACLs, network and various customizations needed.  This gets fed into the request and is built out for them.  From there, they have distinct provider zones that are compartmentalized.  With this set of abstractions they have seen a 3x CapEx improvement, 2x density, and secure multi-tenancy, all self-service based.  Deployment now takes about 30 minutes on average from start to finish.

This change in viewpoint and approach has allowed IT at Intuit to turn from IT provider into customer enabler, and it has significantly changed what IT folks do in their day-to-day jobs.

Keynote Day 2 – Herrod and Future of End User Computing

Today’s Keynote is all about the End User Computing experience and the Battle of the Platinum partners.

Branch in a box through the View Rapid Deployment Program.   Take an appliance, some configuration and deploy as many workstations as you need.

Mirage from Wanova is being presented as the future of secured solutions and centralized management.  ACE and Local Mode are great solutions for a limited set of end use cases; Mirage is aimed at handling all the various other end devices out there.  It will offer disaster recovery and centralized management.

After an entertaining canned demo around Mirage, they presented some more advanced demos using tablets and existing OS instances.  Project AppShift is R&D work making a more swipe-style interface for Windows 7, using User Interface Virtualization techniques.  It takes several of Windows’ basic interface experiences and makes them tablet friendly, with swiping and copy/paste across the system.

Horizon Suite Administration was announced today, with a quick demo.  One of the points is that Horizon can manage XenApp applications.  Horizon Mobile on iOS will wrap applications into a secured workspace, separating applications and applying control around security policies.  It now allows you to manage end devices cleanly and safely, across multiple devices, from a single interface tool.

VMworld Challenge: what are the things that partners are doing to improve the VM space?  Each gets 4 minutes to give their presentation.  Then everyone at VMworld votes, using the mobile app, on who gave the best preso (or is doing the most interesting thing).  The winner’s charity gets a sizeable donation from VMware.

What are partners doing

Cisco – playing for Kaboom, which helps build playgrounds in dense city centers for kids.

Techwise TV, presented by Cisco, aims to bring networking closer to people.  The gist: L.I.S.P.  An interestingly cute little preso around VM mobility, with all the different means of moving VMs between datacenters across IPv4/IPv6.  LISP is a free-to-use technology from Cisco.

Dell – playing for Girl Scouts of America.

Dell vStart 1000 is a stack that gives you the full Dell solution, from storage to networking to compute, in a rack with a simple management interface.  Fully integrated.

EMC – charity is Wounded Warriors

EMC believes that more things should be built in or easily available.  Directly from the Web Client, Chad Sakac of EMC created a backup job in the vSphere environment.  His demo was live as he clicked through, not pre-recorded.  Chad ran out of time; he was first, and brave.

HP – playing for Big Brothers and Big Sisters

HP showed how they have integrated their HP Matrix Infrastructure orchestration suite and how it works with vCloud Director.  It organizes and automates the integration and backend creation of a provider DC.

NetApp – playing for Be The Match, which helps with DNA matching for bone marrow donors.

The Data ONTAP infrastructure is the demo; how do you demo that, per Dave Hitz?  Peak Colo is used as an example of how NetApp helps them out.  Every customer at Peak Colo gets a vSAN: all the infrastructure is shared overall, with a vSAN per customer.

NetApp won and VMware is donating $10,000 to Be The Match.

Keynote VMworld 2012 – Day 1 – Right Here Right Now

In the past I did live blogging, and it was an interesting thing to do.  Realistically it wasn’t a great experience for those watching or for me writing it.  Nowadays they are streaming live with VMware NOW, with near-instant Twitter responses.  So this is more for me to document and enjoy the event.

A Stomp-like act starts off the event: loud and exciting, with drums at the VMworld center stage.  Never one to let down a good show, VMworld hit it off great.

Rick Johnson, CMO, starts off with a presentation around the VMworld agenda, general session info and other scheduling.  Tuesday’s keynote will cover the future of End User Computing with Steve Herrod, followed by the battle royale among the Platinum sponsors: live demo presentations, with live voting using the mobile app.  The best live demo wins by individual votes.

VMUG Rocks the house with some Green Shirts.

Paul Maritz now takes the stage, with some interesting numbers on how the world has changed from 2008 to 2012; basically his history at VMware: ~25K VCPs grown to ~125K, and workload virtualization up from 25% to 60% of all loads.  One of his great contributions is the Cloud direction.

For the past 30 years, the primary driver of change is moving from Paper to Computer and that was considered innovative.  This generation is more about streams of data and consumption of that data in different ways.  It is about real-time versus specific static reporting.   Tying this data to how it impacts a person day to day is the mobile/social link.

One consistent direction for VMware, coming from Paul, is the 3 categories: Server to Cloud, Existing Apps to Apps & Big Data, and PC to Multiple End Points.  These are the same breakouts of Server Layer, Development Environment & End User Computing as they have been for the past several years.

Paul at this point hands over to Pat Gelsinger as the new CEO.  An interesting note from Pat: as the R&D director he had a short meeting at an Intel conference with Mendel from VMware.  They held up the conference 10 minutes as Mendel told Pat about VMotion; Mendel had it working, and they were both excited about the use cases for it.

Software Defined Datacenter

Pat had said back in 2007 that VMware needed to setup the Virtual Datacenter.   Significant progress has been made over the years.  In order to do this the stacks and environment needs to be completely automated and virtualized.

vCloud Suite

This suite includes many of the pieces to get the Software Defined Datacenter.

  • vCloud Director
  • SRM
  • vSphere
  • vCOPs
  • APIs

Comprehensive, highest performance and proven reliability are the only way to put mission-critical workloads on something.  vSphere 5.1 offers all of this as the 9th major release.

vRAM is gone.  Hear it loud and clear.  VMware is going to a per-CPU licensing model with no limits.  Woot!

Cloud OPs

So how do you keep all this stuff running and build it?

  • Process for Operational Readiness
  • Role Based Certifications
  • Training expansions

Multi-Cloud is the future and today.

VMware already covers PaaS with Cloud Foundry.  To handle the automation & orchestration layer, they have added DynamicOps.  At the bottom, in the Software Defined Datacenter layer, is Nicira for network virtualization.  Finally, VMware joined the OpenStack organization.  They have a strong spread covering many of the layers.

New Application Environment

vFabric and Cloud Foundry is going well.  Nothing really to note as new here.  Just continued development and movement forward.

End User Computing

Wanova/Mirage to manage PCs both physical and virtual.  Horizon is turning into the central portal for all application space access.   More to come here.

Pat is very respectful of what Paul Maritz has given to VMware: the kind of visionary VMware needed.  Both of these men have been revolutionary IT leaders over the past 30 years.

On to Steve Herrod’s presentation.

One of the steps forward in the Server Virtualization space is moving from a single VM up to multiple VMs in a single Virtual Datacenter. To do this compute, storage, networking and management have to be included.

In the compute space in 2011, with vSphere 5, it was 32 CPUs and 1 million IOPS per host.  Now with vSphere 5.1, it is 64 CPUs and 1 million IOPS per VM.  Yes, that’s right: per VM!  Serious improvements.

Epic Medical Systems is now offering its critical medical applications on x86, and will ONLY support them on vSphere.  This is an amazing announcement.

This goes along with all the advancements in vSphere that help low-latency and jitter-sensitive applications such as telephony.

Hadoop got some time around Project Serengeti.  This is a management application for Hadoop.

Software Defined Datacenter Storage

Interesting that Steve flew right past the Storage VMotion that can now be done with no shared storage.  This is huge, as it is a major marketing capability of Hyper-V 3.0.  Lots of advancements and announcements here too: Virtual Volumes, virtualizing Flash, Virtual SAN.  SDDC for storage is changing this world pretty seriously.

Software Defined Datacenter Networking

Many of the issues that happen now are networking issues.  Configuring and setting up load balancing and network configurations can be significantly complex.  How do we move this along?

The vXLAN ecosystem is how Steve talks about this.  There are all sorts of different networking capabilities, and I just can’t keep up with it all.  This space is moving fast, from server offload to layer 3 edge interfaces and beyond.  It might finally be time to make the VM that goes around the world.

After having all this great tech, the next challenge is how to manage it.  Going forward, the suites will include vCOPs, for example, and integrate it in.  A good example is vCOps shown directly in the vCloud interface.

Ultimately, expanding this space comes down to partner enhancements: being able to utilize the existing API suite (the vCloud API) so partners do not have to deal with permissions or structure separately.  Just use the existing space.

The finish was a cool demo of the ability to set up a VDC in a couple of minutes between the private and public cloud.  Sounds good; it was just very techie and quick, with no time for it to sink in.

Slipped in quickly.. Enterprise+ gets a free upgrade to the vCloud Suite.

VMware is pushing the world forward.  I haven’t been this wowed since VMotion was first announced.

LAS4001: VMworld Labs Automation & Workflow Architecture

Disclaimer:  I scribbled notes as fast as I could during the presentation and may have mistyped/miswrote some information.  If you have solid information to make this more accurate please let me know.

At VMworld, they offer some amazing Hands-on Labs.  This year there are 27 labs that can be taken.  The goal, after last year’s success, is to hit over 250,000 VMs deployed and destroyed in 5 days of Hands-on Labs.

vPod

This is built on an architecture called vPod.  In this architecture, vLayer0 is the physical layer, where the hardware sits with the initial ESXi install.  Inside that is vLayer1, the first layer of VMs; this is where the labs’ ESXi instances live, along with the VSAs (Virtual Storage Appliances) and generally the vCenter instance.  Because this layer exists, they can create SRM labs with unique datacenters for the labs.  Under that is vLayer2, which holds the VMs actually seen in most of the labs as testing lab VMs.

Each physical cluster this year is 22-24 physical hosts, using vSphere 5, DRS, and vCD 1.5.  Previously, the labs environment used Lab Manager.  This environment often runs the newest of the newest, beta-level code, and is a fantastic ground for load testing.

Automation

The automation they created is called LabCloud Automation.  It is a VM that runs Apache, Django and Adobe Message Framework, with a Postgres backend database.  It talks to vCenter & vCloud Director via the public APIs; nothing special or funky just because they work at VMware.  All the API calls go through a set of “bridge” Python code, with one bridge per vCD instance (1:1).  The bridges help throttle communications going to vCD, as there are limits to how much the vCD API layer can process at a given time.  The environment is configured with one vCD per datacenter.

Every year there have been issues of some sort.  This is to be fully expected, considering the scale and nature of what is going on.  Some history:

2009

This environment had 100% of the labs onsite: all the compute, storage, everything.  Putting a serious datacenter on the show floor introduces its own issues, as convention centers are typically not wired for that much power draw, so at best there was one power path for each device.  To make setup easier, the racks were built offsite and then shipped directly to the convention center with everything still in them.  That worked well enough until one of the racks had a forklift arm driven right through the middle.

2010

The datacenter was split between onsite and a cloud.  The thin-client connections via PCoIP worked great, but there were even more power issues since the labs grew in size again.

2011

This year everything is offsite, at 3 different datacenters around the world; nothing is onsite, so no power issues this year.  They have lost UCS blades and a storage controller, crashed some of the switches, and completely crashed vCenter (which was a first).  They also managed to toast the View environment, and they have found they lose 15-20% of the thin clients a year.  Originally the timers for the labs were only 1 hour long; most labs were taking longer than that, so the timer was increased to 1.5 hours.  This dug into the time available to deliver labs to everyone.

People and Labs

Human dynamics are huge in events like this.  On the first day, 200 people came in, sat down, and all picked Lab 1.  Just as in vCD/View, they stage up a certain number of labs so there are always some ready to go.  They simply didn’t have enough of Lab 1 spun up ahead of time, so many folks sitting there had to wait 10 minutes for their labs to get up and running.  Not ideal.

Another challenge is tracking who is where and what each station’s condition is.  Is it available?  Is it being reimaged?  Is the person done and the seat freed up?  Is the seat working right or not?  So they developed a seat map that tracks every single person and which lab they are assigned to, with Adobe Flex for the interface.  Issues were found in keeping it consistently up to date with the back-end database as the conference went on, since the people scanning badges were doing so in parallel.  More improvements are coming next year.

Troubleshooting and Monitoring

Each of the 8 vCDs generates over 200 MB of logs every 10 minutes; log rotation alone doesn’t cut it for figuring out what’s going on.  Other tools were needed to alert on issues before they got too big.  vCenter Operations was extremely useful in diagnosing some of the issues seen during the conference.

Predicting Demand

One of the tools written to stay ahead of demand tracks how many vApps have been deployed to each vCD and what each lab’s status is.  The tool has a minimum and maximum number of pre-deployed labs it will make available.  By monitoring this, the lab team was able to adjust to human elements such as “Paul Maritz mentioned Horizon, so we should make sure we have more of those labs ready to go when the General Session ends.”  At night, pre-populating the environment takes about an hour to deploy 600-700 vApps when there is no load.  Doing that much deployment on top of people using the system to take labs can take a bit longer.  One of the keys is being ready for the first rush in the morning.
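The pre-deploy behavior described above can be sketched as a tiny function (the function name and the simple refill-to-maximum policy are my assumptions, not the actual LabCloud code):

```powershell
# Keep the pool of ready-to-go labs between a floor and a ceiling:
# if staged labs drop below the minimum, refill up to the maximum.
function Get-DeployCount {
    param([int]$Ready, [int]$Min, [int]$Max)
    if ($Ready -lt $Min) { return $Max - $Ready }   # refill to the ceiling
    return 0                                        # enough already staged
}

Get-DeployCount -Ready 3 -Min 5 -Max 20   # 17 vApps to stage
```

A real controller would also weigh lab size and per-cluster capacity, which is exactly the load-balancing headache the team describes below.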

Some of the behind the scenes: they do load balancing across the environment.  This is tough, and needs to take into account how big the labs are, along with the complication that the hardware clusters are not all equal.  Some hardware has 96GB of RAM per ESXi host, some has 24GB.  So one of the big questions is how to spread big labs across clusters, vCDs and so on.  Some reporting captures how the labs are deployed; this is a very tough item to automate today.  It has gotten better now that DRS is enabled through vCloud Director, though that introduces other issues: since it is a 3-layer-deep virtualized space, changes often occur over the network, not directly to storage.

Performance labs were especially challenging.  “Well, I want to make the storage go IOP-crazy in the troubleshooting lab.”  “Uh, no.  You are in a shared environment and could cause the actual physical environment to implode.”  Some SIOC and resource limits helped make these labs work well enough this year.

Stats

There is a ton of available stats.  One of the things we in IT don’t do enough of is taking these stats and showing them off, getting the information out to say “Look here.  This is cool.”

In the first year there were:

  • 4,000 labs completed
  • 9 lab options
  • 40,000 VMs.

They quickly realized this is awesome.  The HUD up on the screen is huge in terms of selling the awesomeness going on; it really made things happen, and the labs got more popular.  Last year the HUD was based on a car analogy.  You know you’re doing something right when Paul Maritz sits down, asks about the HUD, and says “you mean we are doing that so far?”

This year the original plan was a jet-fighter HUD that showed people’s names as they selected a lab and when they finished, to give it a more human touch (along with missiles and explosions).  That ended up not getting displayed due to some bugs found at the last minute.

Image used from http://twitpic.com/6cpg8l

Instead an Aquarium was used that showed each vApp as a fish (larger sized vApps meant a larger lead fish, smaller meant smaller fish).  Behind that lead fish was a school of fish that would follow it around.  Each fish in that school represented one of the VMs in the vApp/Lab.

Wrap Up

Performing the VMworld Labs, and building a space to do it in, is the ultimate load/QA test.  Numerous times it has been found that the engineers/product managers never expected anyone to do something like that, or in that way.  Significant amounts of feedback go back into the products being used.  The VMworld Labs are the single largest deployment of vCD in any single effort.

It is a fantastic capability to be offered at VMworld and shows just what is possible with the product suite.

Another post covering the LAS4001 labs here:  http://www.thinkmeta.net/index.php/2011/09/01/las4001-lab-architecture-session-cloud-and-virtualization-architecture/

 

Updated: 3 Sept 2011@6:31pm CST – Adding Aquarium HUD HoL screen shot and link to another LAS4001 lab blog post.

CIM1644: Lab Manager to vCloud for SAP

We know that Lab Manager is end of life; it is just a matter of time.  vCloud Director has most of the features and functionality needed to start migrating over.

Some of the nice new benefits are:

  • Everything in vCloud Director follows the full standard vSphere approach; no more missing data in vCenter as with Lab Manager.
  • VMs/vApps can be exported as OVF and sent somewhere to start working immediately in a different environment.
  • No more SSMOVE needed; just use Storage VMotion.
  • No more Lab Manager tools; straight-up VMware Tools, using the APIs instead.

Lab Manager has some issues and limitations, such as no more than 8 nodes per cluster.  vCloud Director has limits of its own; be sure to take those limits into account.

Overall, this session covered things at a very high level.

 

BCA1360 – Global Enterprise virtualizing Exchange 2010

Exchange 2010 can be virtualized and this session covers how they did it.

Some of the design points that need to be covered are:

  • DAS vs SAN
  • Scale up or Scale out

The choices made here are arbitrary and dependent on how you manage your datacenter and what you like/don’t like.

Their layout is:

  • 4 datacenters, 2 DCs in US & 2 in Europe
  • If they have a DC failure, can run around 25% reduced capacity
  • 3 Hosts per datacenter
  • 2 Hosts are active, 1 failover
  • SAN backend with 1TB 7k rpm SATA disks

How did they do it?

  1. Virtuals are manually balanced across the hosts per role
  2. DRS set to level 1 – don’t VMotion naturally
  3. No Reservations
  4. Dedicated Farm versus using the general farm
    • Exchange, all roles, all support systems etc

The Exchange 2010 Role layout is defined per OS instance, minimal sharing here.

CAS Role

  • 4GB RAM, 2 vCPUs
  • VMDK based

Hub Role

  • 4GB RAM, 4 vCPUs
  • VMDK based

MBX Role

  • 2000 mailboxes per server
  • 6 vCPU
  • 36GB of RAM
  • 3 NIC (MAPI, Backup & Replication)
  • VMDK for OS & Pagefile
  • RDM for Log & DB disks
  • For the 1TB LUN sizes use the 8MB block size format

SAN configuration

  • EMC CLARiiON CX4, 1TB 7200rpm SATA disks
  • RAID 6
  • Datastores formatted with the 8MB block size
  • Presented as 500GB and 1TB
  • OS, Pagefiles, & Misc Storage are VMDK
  • Logfile & Databases are RDM

LoadGen Physical versus Virtual

They ran some testing with VMware assistance, and the performance numbers came in under what Microsoft states is required, in most cases significantly under.

Lessons Learned:

Backups and disk contention started to become an issue as load was added; the symptom was dropped connections.  They moved the backups to the passive copies instead, which addressed much of the concern.

When doing the migrations, take breaks between each batch to iron out any issues.  They found problems such as pockets of users with unique issues, and needed time to work out the gotchas.

Database sizes introduce issues around backup, replication, etc.  Make sure you can manage them given the demands of your environment.

An interesting discussion point: Hyper-Threading is not supported for production, as it complicates performance discussions with Microsoft.  VMware can do either, so be sure to follow the Microsoft standards at the VM level.

Memory is a big question.  Basically set

Storage: the main point is to make sure you have appropriate IOPS capability behind the scenes.  The other is that when setting up VMDK files, you should eagerZeroedThick the VMDKs.  If you check the box to enable FT during creation, the disk is eagerZeroedThick automatically.  Otherwise this should be done while the machine is powered off, by running vmkfstools from the command line.
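A rough sketch of the vmkfstools approach (the datastore path, VM name, and size here are hypothetical; run this on the ESXi host with the VM powered off):

```shell
# Create a new eagerzeroedthick VMDK up front (hypothetical size/path):
vmkfstools -c 40g -d eagerzeroedthick /vmfs/volumes/datastore1/myvm/myvm_1.vmdk

# Or inflate an existing thin-provisioned disk to eagerzeroedthick in place:
vmkfstools --inflatedisk /vmfs/volumes/datastore1/myvm/myvm.vmdk
```

Pre-zeroing this way trades a one-time wait at creation for avoiding the first-write zeroing penalty at run time.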

16 months later…

  • Success doing VMotions and DAG failovers
  • Backups are running lights out
  • Will add more hosts to expand the environment
  • Pain Points:
    • Service Desk adoption of the new processes
    • Integration with legacy in-house tools

After all is said and done, this has done quite a bit for the company.

  1. Datacenter savings
  2. TCO is down and has been passed on to the business
  3. Much greater flexibility
  4. Scale out or Scale up very quickly
  5. Lower Administrative overhead so far
  6. More options for disaster recovery and scenarios

Exchange 2010 is possible.

VMworld 2011 – General Session #2 with Steve Herrod

Live Blog and notes taken during the session

Steve Herrod, VMware CTO, presenting for the General Session 2.

One of the great changes is the shift from servers and technology to End Devices and Services.  Universal access to the environment is a critical part of this.  As such, these come with high expectations compared to past computing environments.  DUH! (Devices, Universal access, High expectations).

The direction needed to address DUH: IT needs to simplify, manage, and connect end users to the services they care about.  Desktop services will need to be contained and compartmentalized.  This includes the full Desktop (View) or just the applications (ThinApp).  If we can separate the Desktop, the Applications, and the Data from each other, powerful policies can do quite a bit.  Once this is done, User/Application/Data policies can take place.

View is the direct solution to handle the Desktop Service, with the well-known and now-announced View 5.0.

To handle applications, ThinApp is the direction being followed.  There is a new service called ThinApp Factory as part of the application store.  This tool helps auto-create ThinApp packages: fully automated creation of packages following recipes.

Project Octopus deals with the data service layer.  This is built around providing a dropbox equivalent services in an enterprise.  http://www.vmwareoctopus.com

Horizon has a Mobile piece that allows IT to securely gain access to the application space.  Along with that is the Mobile Virtual Machine space.  As part of pushing out the access and applications, IT can have a controlled, contained environment on the phone.

One of the great things coming is a feature called AppBlast.  It can present just the application, not a full desktop, to devices like an iThing.  This works over HTML5, not a custom proprietary protocol.  So to edit an Excel spreadsheet on an iPad, just connect to internal IT and work with Excel.

vSphere 5 is the basis for all of this.  The hypervisor is still not a commodity, as many of these powerful features come from the fact that the base is solid and functional.

vSphere 5 can now handle 32 vCPUs, 1TB of RAM, and 1 million IOPS per host.  Melvin the Monster VM has come.

Some of the other great features are around performance guarantees.  Storage I/O assurance is a good one.  By creating storage pools, you can organize by application, purpose, performance, and other validations.  Once a pool is created, automated placement can take place.  Along with that, DRS can now be performed on storage using Storage VMotion, and this can be done automatically.  Set up the policies and forget about it as the infrastructure deals with it.

vSphere 5 allows the IT admin to set Storage I/O Control to assure performance.  Dealing with the noisy neighbor is a significant issue, and the hypervisor is the central place to control it.  Network I/O control has arrived as well.

As the future comes to pass, one of the largest challenges is the IP problem.  Right now the TCP/IP stack is built with Location = Identifier.  If a machine wants to move around, the location and identifier are tied together.  For example, moving a VM from an internal cloud to an external cloud means the VM needs to be given a new IP.  This can be a very disruptive activity.

VMware has a solution now called VXLAN, Virtual eXtensible LAN.  This technology has been worked on with Intel, Emulex, Cisco, Arista, and Broadcom, and the spec has been submitted to the IETF.  This is a big step for the virtualization solution.  Once this tie has been broken, cloud mobility becomes feasible and deliverable.

Management is the end game after a solid, reportable infrastructure is available.  From there the approach is:

  1. Monitor
  2. Correlate
  3. Remediate

At the end, the focus is that End User Computing is making a major shift soon.  This shift is coming with massive improvements in the client experience, into something that clients actually want.  All this is possible by simplifying the layers underneath so time can be spent on this new functionality.  To do that, management of these spaces needs to become more automated.