Windows Server 2012 with Hyper-V & The New VHDX Format Leads The Way

Introduction

Whether you realize this or not but our trusted old VHD format is getting a bit old in the tooth. As a matter of fact it has been around since the last century. It has served us well but now it needs a major overhaul to better serve us at present and to prepare us for the decades to come. We (at least in the environments I support) see a continuing demand for bigger virtual disks & ever better performance. This should be no surprise. Not only does the amount of data produced keep going up year after year but we’re virtualizing more very resource intensive workloads than ever. Think image intensive data that has to be processed by number crunching virtual machines or large databases like SQL Servers. Sure 64 vCPUs and 1TB of memory are great and impressive but we also need loads of fast and ever more reliable storage. Trying to serve and support these needs with combined 2TB disks is very cumbersome (to be polite) and pass trough disks take a way a lot of the flexibility & options the VHD format gives us. So here comes the new VHDX format.  There is no back porting here, the only OS at the moment that supports VHDX is Windows Server 2012. The good news here is that we have in box tools to convert between VHD & VHDX.

Bigger, Better & Faster

Size

The VHDX format supports up to 64TB now. Yes that is 32 times more than the current VHD. As a matter of fact al lot of SANs still in use today won’t give you that size of LUN. Is there a need for this?  Well, I circle in some places with huge files in massive amounts so I can use big LUNs and large data VHDX files. Concatenating disks is something I do no like to do. Come upgrade/maintenance/renewal time that one bites too much for comfort.

There are also some other virtual disk formats that need to wake up and break that 2TB size boundary . Especially when Microsoft states that this is not a File format hard limitation. By that they mean they have room to increase it. Wow!

Protection Against Disk Corruption

The VHDX format also provides corruption protection during power failures for the VHDX files. This is done by a logging mechanism for the updates of the VHDX metadata structures. The logging mechanism is contained within the VHDX file so no worries, you won’t have to worry about managing log files. The overhead is minimal, as they only log metadata such as block allocations, block state updates and NOT the actual data stored. So no, it has not become a database Smile you need to manage, don’t worry. The protection works only for the VHDX file and not the data that is written to it. That job falls to NTFS or ReFS. What we discussed here was protection against VHDX file corruption.

The Need For Speed

With VHDX we also get larger block sizes up to 256MB for dynamic & differencing disks, meaning they perform better with workloads that allocate in larger chunks.

Modern Large Sector Disks

We get support to run VHDX on large sector disks without loosing performance.

I refer you to KB articles Using Hyper-V with large sector drives on Windows Server 2008 and Windows Server 2008 R2 and Information about Microsoft support policy for large-sector drives in Windows.

As you can read there the performance hit for both non fixed VHDs and applications is pretty bad. The 512e (4K physical and 512-byte logical sector size) approach is bad due to the Read-Modify-Write (RMW) process overhead in dynamic & differencing disks. 4K native (4K logical sector size) just isn’t supported by Hyper-V before Windows Server 2012. The maximum logical & physical sector size is now 4KB and that means that we get a lot better performance when running applications that are designed to use 4KB workloads in Hyper-V 3.0. VHDX structures are aligned on MB boundaries, so the need for the RMW from the disk is eliminated if the physical sector size of the virtual disk is set to 4K.

image

Storing Custom Metadata

We also get the ability to store custom metadata in the VHDX  file for information we find relevant. This could be about what’s on there, OS version or patches applied.
ODX Support. This custom data is stored using key/value pairs that support up to 1024 entries of 1MB. That should be adequate for a while Winking smile.

VHDX Leverages Offline Data Transfer (ODX)

The virtual stack allows ODX requests from the guest to flow down all the way to the hardware and as such VHDX operations benefit from this as well. Examples of this are:

  • Creating VHDX files, even such large ones has gotten an whole lot faster. Especially if you can offload this to the SAN. If your storage vendor supports ODX then you’re in VHDX creation speed heaven! As a bonus  even VHD files created in Windows Server 2012 benefit from this technology.
  • On top of that Merge & Mirror operation are also offloaded to the hardware which is great for merging snapshots or live storage migration.
  • In the future the virtual machines themselves might/will be able to pass through offload operations. This is hard core stuff  and due to the file layout far from trivial.

Please note that this only works with SCSI attached VHDX files. IDE devices have no ODX support capabilities.

TRIM/UNMAP Support

With Windows Server 2012 / VHDX we get what is described in the documentation “’Efficiency in representing data (also known as “trim”), which results in smaller file size and allows the underlying physical storage device to reclaim unused space. (Trim requires physical disks directly attached to a virtual machine or SCSI disks in the VM, and trim-compatible hardware.) It also requires Windows Server 2012 on hosts & guests.

It’s a major benefit in the “Stay Thin” philosophy associated with thin provisioning. No more running “sdelete” in your windows VMs (tedious, slow, resource intensive) or installing an agent (less tedious) to support reclaiming space. This is important to many of us and this level of support and integration makes our lives a lot easier & speeds things up. So choose you storage wisely.

TRIM is the specification for this functionality by Technical Committee T13, that handles all standards for ATA interfaces. UNMAP is the Technical Committee T10 specification for this and is the full equivalent of TRIM but for SCSI disks. UNMAP is used to remove physical blocks from the storage allocation in thinly provisioned Storage Area Networks. My understanding is that is what is used on the physical storage depends on what storage it is (SSD/SAS/SATA/NL-SAS or SAN with one or all or the above).

Basically VHDX disks report themselves as thin provision capable. That means that any deletes as well as defrag operation in the guests will send down “unmaps” to the VHDX file, which will be used to ensure that block allocations within the VHDX file is freed up for subsequent allocations as well as the same requests are forwarded to the physical hardware which can reuse it for it’s thin provisioning purpose. This means that an VHDX will only consume storage for really stored data & not for the entire size of the VHDX, even when it is a fixed one. You can see that not t just the operating system but also the application/hypervisor that owns the file systems on which the VHDX lives needs to be TRIM/UNMAP aware to pull this off. It is worth nothing this mean that it only works on the SCSI attached storage in the virtual machine, not on IDE connected VHDX disks.

Closing Thoughts On The Future Proof VHDX Format

For anyone interested in developing against the VHDX formats the specifications will be published. So that’s good news for ISVs, big and small. For all the reasons mentioned above I’m a fan of the VHDX format Open-mouthed smile and it’s yet one more reason to go full speed ahead with testing Windows 2012 so we can move forward fast and reap the benefits of reliability & scalability without sacrificing performance.

Hyper-V Shared Nothing Live Migration In Windows Server 2012– VM Mobility Rules

I see and hear some people shrug at the idea of Shared Nothing Live Migration, dismissing it as marginally useful. Some do state they’ll have it as well but that it’s not that valuable. Well I disagree totally. A lot of the time these remarks are due to a lack of understanding about how several technologies in the Microsoft stack work together. Combine this with tunnel vision and the fear of some vendors and you get a lot of FUD.

I advise you to look beyond the virtualization stack, to the issues that people who are building infrastructure for dynamic, flexible and * cloud  data centers are dealing with.

Look, as “architects” we have to design & build for failure. We all know that it’s just a matter of time before things go BOINK.  So we build in redundancy, some of this within a silo, some of this is between silos. The two approaches compliment each other. What this gives you is options and everybody who knows me, especially those who work  with me has heard my mantras: “Assumptions are the mother of all F* Ups” and “Options, options, options”. Make sure you design & build in options. This way you can maneuver your self out of a bad situation. Don’t ever assume you’re out of options, especially not when you put some in the design on purpose Winking smile. It’s also very useful beyond that because a lot of you might agree with me that silos and fork lift, down time inducing upgrades, migrations, transitions or replacements are expensive and bad. This is where Share Nothing Live Migrations comes into play. You gain mobility over silos. That silo might be a server, a cluster, storage or mixtures of them all.

With Shared Nothing Live Migration we can migrate virtual machines between those silos with nothing more than a network cable.This is huge people. You are no longer trapped in that silo. In this context it provides you with all the options & flexibility mobility gives you. even it the technology itself is not about high availability.

Some very useful scenarios

Migrate virtual machines from an old cluster to a new cluster with out any down time

  1. Migrate virtual machines from stand alone hyper-V hosts to a fail over cluster with out any down time
  2. Migrate virtual machines from one stand alone host to another one for maintenance, again, without any down time
  3. Choose different types if storage & Hyper-V deployment depending in IOPS, redundancy, availability, manageability needs. With Shared Nothing Live Migration you can be confident  that  you can move your virtual machine from one environment to the other when needs change. This is breaking the storage silo boundaries open people! This is huge … think about it.

How it works

The details are for another post but basically is made possible by the combination of Live Storage Migration and Live Migration.

First the Storage is Live Migrated

image

After the Live Storage Migration is done the state of the virtual machines is copied and synchronized.

image

This Is Mobility

I hear the competition shrug.  It isn’t high availability. Well indeed no one who understands the feature ever said it was. It’s virtual machine mobility. Look at the scenarios above and you’ll see that this ability could very well be game changer in how we look at storage & design solutions.

Speed & Performance

What did we hear on this front: “it will be too slow to be really useful”. Really? Well let’s see:

  1. The world is converging to 10Gbps and after that 40Gbps and up will come
  2. NIC Teaming in box With Windows 2012 which can provide more bandwidth.
  3. SMB 3.0 Multichannel. This provides multiple channels per connection spreading the load over multiple CPUs
  4. SMB Direct, have you seen the speeds this achieves?

Before you state that this doesn’t work on Live Migration … as confirmed at TechEd 2012 Europe with Jose Baretto this does work when both the source AND the target is an SMB 3.0 share. This means yet another reason to use SMB 3.0 share for your Hyper-V storage needs! So unlike what Tad at vLimited keeps saying, unhindered by any knowledge, it is a very valuable feature and it can be extremely fast given the right connectivity and storage that can handle the IOPS. And no, the fact that it’s unbuffered doesn’t impact this to much. Test this by using xcopy/robocopy /J with a VHD over your infrastructure.

image

Even if you’re on a budget and cannot go for the RDMA NICs & SMB 3.0 you have several options to get very decent virtual machine mobility and not be stuck in a silo. And for those who want to leverage this feature to create and agile & mobile virtual environment you have some very nice technologies available to optimize to your needs & budgets.

Conclusion

Virtual Machine mobility and storage mobility are very interesting features that provide for a previously unknown flexibility. Windows Server 2012 makes us rethink our storage approaches (I sure am) and I’m very interested in seeing how this will evolve.

Answer to Brad at TechEd Europe 2012 Keynote: Pessimists & Tad Don’t like Windows Server 2012

Brad is on stage for the opening keynote asking if the glass is half full or half empty. Well it depends on where you are in the ecosystem. For us the glass is half full and filling up fast.

Some people nag me about the fact that Windows Server 2012 is so different and that it’s wrong to turn the world upside down. Yes, it is different and new in many ways.  There are also many improvements to features that already exist. There is a lot to learn and understand. Why are some people so pessimistic?

Ever since I got my hands on the BUILD Developer Preview bits I have personally invested a lot of my time in Windows Server 2012. With the beta that only increased. Why? Well, that’s the way forward, because that’s where the improvements are. We can’t do tomorrows jobs and meet tomorrows demands with yesterdays technology.

pessimistsbanner

The picture above is basically the pessimists view of the world. Enjoy your cupper but I’m not joining you. Windows Server 2012 rocks and it’s going to do a whole lot for our industry and businesses. But wait a minute, I do understand why Tad is so pessimistic. But that’s about the future of vLimited and being stuck in the past. Listen Tad, you’d better empty that cup because this is where vLimited becomes history rather than write it.

Does that mean I’ll be throwing away Windows 2008 R2? Nope. I expect to deal a lot with it in the next few years but I’m not going to build future infrastructure on the previous version. I will introduce Windows Server 2012 where and when we benefit from it. For me that is from day one the bits RTM. The benefits are so overwhelming we’d hurt ourselves by not doing it. Your mileage may vary. But don’t get stuck in the past  Here’s a link to your escape pod: Microsoft Virtual Machine Converter Solution Accelerator I’m happy it’s here. That’s what people are asking me more and more about, how to move to Hyper-V.

But what’s with the negativism of some? Sure people are still running Windows Server 2000/2003. Sometimes for good reasons, often for (very) bad ones. Are some going to go through all this again with people clinging to Windows2008 R2? No doubt. Been there, seen it. Very predictable. Is Windows Server 2012 going to fail?  No way.  And what I’m seeing in Windows Server 2012 is great technology. Will it be perfect? No. I already have feature requests for vNext Smile. But this is pushing the ball forward, this is ambitious in the best sense of that word.  There will be bugs, there will be challenges and hiccups. That’s part of the business and the realities of life.  But look at all what’s available in there. Don’t just read some industry press articles. Did you test it your self already? Did you do any clustering? Tested all the new functionality in Hyper-V? The innovations in Live Migration options and networking? Looked at the amount of PowerShell support in there? Notice the improvements in Active Directory, DHCP and other core infrastructure services? Have you used Windows Server 2012 at all yet? You didn’t look at SMB 3.0 and all the storage improvements in there did you? Go talk to Jeff Woolsey, he’s passionate about it and for good reasons. Put in some effort, live a little, get out of your comfort zone and you’ll be going places. Don’t be a pessimist. Think positive or you’ll end up like Tad who was the joke of the party at MMS2012

image

Windows Server 2012 Cluster Reset Recent Events Feature

There are various small improvements in Windows Server 2012 Failover Clustering that make live a little easier. When playing in the lab one of the things I like to do is break stuff. You know, like pull out the power plug  of a host during a live migration or remove a network cable  for one or more of the networks, flip the power of the switch off and on again, crash the vmms.exe process and other really bad things …Smile Just getting a feel for what happens and how Windows 2012 & Hyper-V responds.

As you can imagine this fills up the cluster event logs real fast. It also informs you in that you’ve had issues in the past 24 hours. Those recent cluster events could not be cleared or set to “acknowledged” up to Windows 2008 R2 except by deleting the log files. Now this has to be done on all nodes and is something you should not do in production and is probably even prohibited. There are environments where this is indeed a “resume generating” action. But it’s annoying that you can leave a client with a healthy looking environment after you have fixed an issue.

image

For the lab or environments where event log auditing is a no issue I used to run a little script that would clear the event logs of the lab cluster nodes not to be dealing with to much noise between tests or to leave a GUI that represents the healthy state of the cluster for the customer.

This has become a lot easier and better in Windows Server 2012 we now have a feature for this build in to the Failover Cluster Manager GUI. Just right click the cluster events and select “Reset Recent Events”.

image

 

The good thing is this ignores the recent events before “now” but it does not clear the event log. You can configure the query to show older events again. This is nice during testing in the lab. Even in a production environment where this is a big no-no, you can’t do this you can now get rid of noise from previous issues,focus on the problem you working on or leave the scene with a clean state after fixing an issue without upsetting any auditors.

image