A first look at shared virtual disks in Windows Server 2016

Introduction to shared virtual disks in Windows Server 2016

Time to take a first look at shared virtual disks in Windows Server 2016 and how they are set up. Shared VHDX was first introduced in Windows Server 2012 R2. It provides shared storage for use by virtual machines without having to “break through” the virtualization layer. This way is still available to us in Windows Server 2016. The benefit of this is that you will not be forced to upgrade your Windows Server 2012 R2 guest clusters when you move them to Windows Server 2016 Hyper-V cluster hosts.

The new way is based on a VHD Set. This consists of a vhds virtual hard disk file of 260 MB and a fixed or dynamically expanding avhdx which contains the actual data. The avhdx is the “backing storage file” in Microsoft speak. The vhds file is used to handle the coordination of actions on the shared disk between the guest cluster nodes.

Note that an avhdx is often associated with a differencing disk or checkpoints, but the “a” stands for “automatic”. This means the virtual disk file can be manipulated by the hypervisor and you shouldn’t really do anything with it yourself. That said, you can rename this offline avhdx file to vhdx, mount it and get to the data. Whether this virtual disk is fixed or dynamically expanding doesn’t matter.

You can create one in the GUI, where it’s just a new option in the New Virtual Hard Disk Wizard.

Or via PowerShell in the way you’re used to with New-VHD, the only difference being that you specify vhds as the virtual disk extension.

In both cases both the vhds and the avhdx file are created for you; you do not need to specify this.

You just add it to all nodes of the guest cluster by selecting a “Shared Drive” to add to a SCSI controller …

… browsing to the vhds, selecting it and applying the settings to the virtual machine. Do this for all guest cluster nodes.

Naturally PowerShell is your friend, simple and efficient.

Rules & Restrictions

As before, shared virtual disk files have to be attached to a vSCSI controller in the virtual machines that access them and they need to be stored on a CSV. Either block level storage or an SMB 3 file share on a Scale Out File Server will do for this purpose. If you don’t store the shared VHDX or VHD Set on a CSV you’ll get an error.

Sure, for lab purposes you can use a non highly available SMB 3 share “simulating” a real SOFS share, but that’s only good for your lab or laptop.
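The storage rules above can be turned into a rough pre-flight check. What follows is a minimal Python sketch, not anything Hyper-V ships: the function name check_shared_disk_path is made up, and it assumes CSVs are mounted under the default C:\ClusterStorage root and that any UNC path is an SMB 3 share (it cannot tell whether that share is a real SOFS).

```python
from pathlib import PureWindowsPath

# Extensions for the two flavours discussed here: shared VHDX and VHD Set.
SHARED_DISK_EXTENSIONS = {".vhdx", ".vhds"}


def check_shared_disk_path(path, csv_root=r"C:\ClusterStorage"):
    """Return a list of problems; an empty list means the path passes this rough check."""
    problems = []
    p = PureWindowsPath(path)

    if p.suffix.lower() not in SHARED_DISK_EXTENSIONS:
        problems.append(f"unexpected extension {p.suffix!r}")

    # Assumption: CSVs live under csv_root (the default mount point).
    on_csv = PureWindowsPath(csv_root) in p.parents
    # Assumption: a UNC path (\\server\share\...) points at an SMB 3 share.
    on_smb = p.drive.startswith("\\\\")
    if not (on_csv or on_smb):
        problems.append("not under the CSV root and not a UNC (SMB) path")

    return problems


print(check_shared_disk_path(r"C:\ClusterStorage\Volume1\Guest\Shared.vhds"))
print(check_shared_disk_path(r"D:\VMs\Shared.vhd"))
```

A check like this only catches the obvious mistakes before you start attaching disks; the error Hyper-V throws remains the real authority.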

The virtual machines will see this shared VHDX as shared storage and as such it can be used as cluster storage. This is an awesome concept as it does away with iSCSI or virtual FC to the virtual machines in an attempt to get shared storage when SMB 3 via SOFS is not an option for some reason. Shared VHDX introduces operational ease as it avoids the complexities and drawbacks of not using virtual disks with iSCSI or vFC.

In Windows Server 2012 R2 we did miss some capabilities and features we have come to love and leverage with virtual hard disks in Hyper-V. The reason for this was the complexity involved in coordinating such storage actions across all the virtual machines accessing it. These virtual machines might be running on different hosts and, potentially, the shared VHDX could reside on different CSVs. The big five limitations that proved to be show stoppers for some use cases are, in my personal order of importance:

  1. No host level backup
  2. No online dynamic resize
  3. No storage live migration
  4. No checkpoints
  5. No Hyper-V Replica support

I’m happy to report most of these limitations have been taken care of in Windows Server 2016. We can do host level backups, we can resize a shared VHDX online and we have support for Hyper-V Replica.

Currently in 2016 TPv4 storage live migration and checkpoints (both production and standard checkpoints) are still missing in action but who knows what Microsoft is working on or has planned. To the best of my knowledge they have a pretty good understanding of what’s needed, what should have priority and what needs to be planned in. We’ll see.

Other good news is that shared VHDX works with the new storage resiliency feature in Windows Server 2016. See Virtual Machine Storage Resiliency in Windows Server 2016 for more information. Due to the nature of clustering when a virtual machine loses access to a shared VHDX the workload (role) will move to another guest cluster node that still has access to the shared VHDX. Naturally if the cause of the storage outage is host cluster wide (the storage fabric or storage array is toast) this will not help, but other than that it provides for a good experience. The virtual machine guest cluster node that has lost storage doesn’t go into critical pause but keeps polling to see if it regains access to the shared VHDX. When it does it’s reattached and that VM becomes a happy fully functional node again.

It also supports the new Storage QoS policies in Windows Server 2016, which is something I found during testing.

Thanks for reading!

ReFS vNext Block Cloning and ODX

Introduction

With Windows Server 2016 we also get a new version of ReFS, which I’ll designate aptly as ReFS vNext. It offers a few new abstractions that allow applications and virtualization to control how files are created, copied and moved. The features that are crucial to this are block cloning and data tiering.

Block cloning allows you to clone any block of a file into any other block of another file. These operations are based on low cost metadata actions. Block cloning remaps logical clusters (physical locations on a volume) from the source region to the destination region. It’s important to note that this works within the same file or between files. This, combined with “allocate on write”, ensures isolation between those regions, which basically means files will not overwrite each other’s data if they happen to reference the same region and one of them writes to that region. Likewise, for a single file, if a region is written to, that changed data will not pop up in the other region. You can learn more about it on the MSDN page on block cloning, which explains this further.
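To make the idea of reference counted clusters and “allocate on write” a bit more tangible, here is a toy model in Python. It is only an illustration of the concept as described above, not how ReFS is implemented: the ToyVolume class and all its method names are invented, and it clones whole files rather than arbitrary regions to keep things short.

```python
class ToyVolume:
    """Toy model of block cloning with reference counts and allocate-on-write."""

    def __init__(self):
        self.clusters = {}   # cluster id -> data stored in that cluster
        self.refcount = {}   # cluster id -> number of file regions pointing at it
        self.files = {}      # file name -> ordered list of cluster ids
        self._next_id = 0

    def _allocate(self, data):
        cid = self._next_id
        self._next_id += 1
        self.clusters[cid] = data
        self.refcount[cid] = 1
        return cid

    def create(self, name, blocks):
        self.files[name] = [self._allocate(b) for b in blocks]

    def clone(self, src, dst):
        """Metadata-only copy: the new file points at the same clusters."""
        self.files[dst] = list(self.files[src])
        for cid in self.files[dst]:
            self.refcount[cid] += 1

    def write(self, name, index, data):
        cid = self.files[name][index]
        if self.refcount[cid] > 1:
            # Shared cluster: allocate a fresh one so the other reference is untouched.
            self.refcount[cid] -= 1
            self.files[name][index] = self._allocate(data)
        else:
            self.clusters[cid] = data

    def read(self, name):
        return [self.clusters[cid] for cid in self.files[name]]


vol = ToyVolume()
vol.create("parent.vhdx", ["A", "B", "C"])
vol.clone("parent.vhdx", "copy.vhdx")      # no data is copied here
vol.write("copy.vhdx", 1, "X")             # only now a new cluster is allocated
print(vol.read("parent.vhdx"), vol.read("copy.vhdx"), len(vol.clusters))
```

The clone itself touches only the bookkeeping, which is why it is fast. The extra cluster appears only when one of the two files writes to a shared region, and the other file never sees that change.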

ReFS vNext does not do this for every workload by default. It’s done on behalf of an application that calls block cloning, such as Hyper-V when merging VHDX files. For these purposes the metadata operations that count references to regions make data copies within a volume fast, as they avoid reading and writing all the data when creating a new file from an existing one, which would mean a full data copy. It also allows reordering data in a new file, as with checkpoint merging, and it allows for “data projection”, where data can be projected from one area into another without an actual copy. That’s how speed is achieved.

Now some of the benefits of ReFS vNext are tied into Storage Spaces Direct, such as the tiering capability in relation to the use of erasure coding / parity to get the best out of a fast tier and a slower tier without losing too much capacity due to multiple full data copies. See Storage Spaces Direct in Technical Preview 4 for more information on this.

I’m still very much a student of all this and I advise you to follow up via blogs and documentation from Microsoft as they become available.

What does it mean?

In the end it’s all about making the best use of available resources: the ones that you already have and the ones that you will own in the future. This lowers TCO and increases ROI. It’s not just about being fast but also about optimizing the use of capacity while protecting data. There is one golden rule in storage: “Thou shalt not lose data”.

For now, even when you’re not yet in a position to evaluate Storage Spaces Direct, ReFS vNext on existing storage shows a lot of promise as well. I have blogged about file creation speeds (VHDX files) in this blog post: Lightning Fast Fixed VHDX File Creation Speed With ReFS vNext on Windows Server 2016. In another blog post, Accelerated Checkpoint merging with ReFS vNext in Windows Server 2016, you can read about the early results I’ve seen with Hyper-V checkpoint merging in Windows Server 2016 Technical Preview 4. These two examples are pretty amazing and those results are driven by ReFS metadata operations as well.

Does it replace ODX?

While the results so far are impressive and I’m looking forward to more of this, it does not replace ODX. It complements it. But why would we want that you might ask, as we’ve seen in some early testing that it seems to beat ODX in speed? It’s high time to take a look at ReFS vNext Block Cloning and ODX in Windows Server 2016 TPv4.

The reality is that sometimes you probably won’t want ODX to be used, as the capabilities of ReFS vNext will provide better (faster) results. But sometimes ReFS vNext cannot do this. When? Block cloning for all practical purposes works within a volume. That means it can only do its magic with data living on the same volume. So when you copy data between two volumes on the same LUN or between volumes on different LUNs you will not see those speed improvements. So for quickly deploying templates stored on another LUN/CSV it’s not that useful. Likewise, if for space or performance reasons you were storing your checkpoints on a different LUN, you will not see the benefits of ReFS vNext block cloning when merging those checkpoints. So you will have to revise certain design and deployment decisions you made in the past. Sometimes you can do this, sometimes you can’t.

But as ODX works at the array level (or beyond in certain federated systems) you can get excellent speeds while copying data between volumes / LUNs on the same server or between volumes / LUNs on different servers. You can also leverage SMB 3.0 to have ODX kick in when it makes sense, to avoid senseless data copies etc. So ODX has its own strengths and benefits that ReFS vNext cannot touch, and vice versa. But they complement each other beautifully.

So while ReFS vNext demonstrates ODX like behavior, often outperforming ODX, you cannot just compare those two head on. They have their own strengths. Just remember and realize that ReFS vNext actually does support ODX, so when it’s applicable it can be leveraged. That’s one thing I did not understand from the start. This is beginning to sound like an ideal world where ReFS vNext shines whenever its features are the better choice, while it can leverage the strengths of ODX, if the underlying storage array provides it, for those scenarios where ReFS vNext cannot do its magic as described above.
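The reasoning above boils down to a small decision: same volume means block cloning can help, otherwise ODX can if the array offers it, and otherwise you are back to a plain copy. A minimal sketch of that decision in Python follows. The function pick_copy_path and its parameters are purely illustrative; in reality Windows and the storage stack make this choice, not your script.

```python
def pick_copy_path(src_volume, dst_volume, filesystem="ReFS", array_supports_odx=False):
    """Return which mechanism can accelerate a copy, following the reasoning in the text."""
    # Block cloning only works within a single ReFS vNext volume.
    if src_volume == dst_volume and filesystem == "ReFS":
        return "block clone"
    # ODX works at the array level, so it can help across volumes / LUNs.
    if array_supports_odx:
        return "ODX"
    # Nothing to offload to: the data is read and written in full.
    return "full copy"


print(pick_copy_path("CSV1", "CSV1"))                           # checkpoint merge on one volume
print(pick_copy_path("CSV1", "CSV2", array_supports_odx=True))  # template deployed from another CSV
print(pick_copy_path("CSV1", "CSV2"))                           # no acceleration available
```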

The Future

I’m not the architect at Microsoft working on ReFS vNext. I do know, however, that a bunch of very smart people are working on that file system. They see, hear and listen to our experiments, results, and requests. ReFS is getting a lot of renewed attention in Windows Server 2016 as the preferred file system for Storage Spaces Direct and as such for CSVs. Hyper-V is clearly very much on board with leveraging the capabilities of ReFS vNext. The excellent results of that, which we can see in speeding up VHDX creation / extending and checkpoint merges, are testimony to this. So I’m guessing this file system is far from done and is going places. I’m expecting more and more workloads to start leveraging the ReFS vNext capabilities. I can see ReFS itself also become more and more feature complete. For example, Microsoft has now stated that they are working on deduplication for ReFS, although they do not yet have any specifics on release plans. It makes sense that they are doing this. To me, a more feature complete ReFS being leveraged in ever more use cases is the way forward. For now, we’ll have to wait and see what the future holds, but I am positive, albeit a bit impatient. As always I’m providing Microsoft with my feedback and wishes. If and when they make sense and are feasible they probably have them on their roadmap, or I might give them an idea for a better product, which is good for me as a customer or partner.

Upgrade the firmware on a Brocade Fibre Channel Switch

NOTE: content available as pdf download here.

In order to maintain a secure, well-functioning fibre channel fabric over the years you’ll need to perform a firmware upgrade now and again. Brocade fibre channel switches are expensive but they do deliver a very solid experience. This experience is also obvious in the firmware upgrade process. We’ll walk through this as a guide on how to upgrade the firmware in a Brocade fibre channel switch environment.

Have a FTP/SFTP/SCP server in place

If you have some switches in your environment you’re probably already running a TFTP or FTP server for upgrading those. For TFTP I use the free, simple and good one provided by SolarWinds. They also offer a free SCP/SFTP solution. For FTP it depends: either we have IIS with FTP (and FTPS) set up or we use FileZilla FTP Server, which also offers FTPS. In any case this is not a blog about these solutions. If you’re responsible for keeping network gear in tip top shape you should have this little piece of infrastructure set up for both downloads and uploads of configurations (backup/restore), firmware and boot code. If you don’t have this, it’s about time you set one up, sport! A virtual machine will do just fine, and we back it up as well since we store our firmware and backups on that VM. For mobile scenarios I just keep TFTP & FileZilla Server installed and ready to go on my laptop in a stopped state until I need ’em.

Getting the correct Fabric OS firmware

It’s up to your SAN & switch vendors to inform you about support for firmware releases. Some OEMs will publish those on their own support sites, some will coordinate with Brocade to deliver them as a download for specific models sold and supported by them. Dell does this. To get it, select your switch version on the Dell support site and under downloads you’ll find a link.

That link takes you to the Brocade download page for DELL customers.

Make sure you download the correct firmware for your switch. Read the release notes and make sure the hardware you use is supported. Do your homework and go through the Brocade Fabric OS (FOS) 7.x Compatibility Matrix. There is no reason to shoot yourself in the foot when this can be avoided. I always contact DELL Compellent CoPilot support to verify the version is supported with the Compellent Storage Center firmware.

When you have downloaded the firmware for your operating system (I’m on Windows), unzip it and place the content of the resulting folder in your FTP root or desired folder. I tend to put the active firmware under the root and archive older ones as they get replaced. You can copy it there over RDP or via an FTP client. If the FTP server is running on your laptop, it’s just a local copy.

The upgrade process

A word on upgrading the firmware

If you move from a single major level/version to the next, or upgrade within a single major level/version, you can do non-disruptive upgrades with a High Availability (HA) reboot. This means that while the switch reloads it will not impact the data flow; the FC ports stay online. Everything keeps running, bar that you lose connectivity to the switch console for a short time.

Some non-disruptive upgrade examples:

v6.3.2e to v6.4.3g

v7.4.0a to v7.4.0b

v7.3.0c to v7.4.0b

Note that this way you can step from an old version to a new one, step by step, without ever needing downtime. I have always found this a really cool capability.

You can find Brocade’s recommendations on what the desired version of a major release is in https://www.brocade.com/content/dam/common/documents/content-types/target-path-selection-guide/brocade-fos-target-path.pdf

I tend to wait a bit with the latest release, as the newer ones often need some wrinkles taken care of, as we can see now with 7.4.1, which is susceptible to memory leaks.

Some disruptive upgrade examples (FC ports go down):

7.1.2b to 7.4.0a

6.4.3h to 7.4.0b

Our upgrade here from 7.4.0a to 7.4.0b is non-disruptive, as was the upgrade from 7.3.0c to 7.4.0a. You can jump more than one version at once, but it will require a reboot that takes the switch out of action. Not a huge issue if you have (and you should) two redundant fabrics, but it can be avoided by moving between versions one at a time. It takes longer but it’s totally non-disruptive, which I consider a good thing in production. I reserve disruptive upgrades for green field scenarios or new switches that will be added to the fabric after I’m done upgrading.
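The rule of thumb from the examples above (stay within a release level or move to the very next one) is easy to capture in a few lines of Python when you have a pile of switches to plan for. This is only a sketch of that rule: the RELEASE_LEVELS list is my assumption about the order of the release levels mentioned in this post, and the function names are made up. Always check the release notes and the target path guide for the real supported paths.

```python
import re

# Assumed ordering of FOS release levels, limited to the range mentioned in this post.
RELEASE_LEVELS = ["6.3", "6.4", "7.0", "7.1", "7.2", "7.3", "7.4"]


def release_level(version):
    """Reduce a version such as 'v7.4.0b' or '7.4.0b' to its release level, '7.4'."""
    match = re.match(r"[vV]?(\d+)\.(\d+)", version)
    if not match:
        raise ValueError(f"cannot parse version {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def is_non_disruptive(current, target):
    """True when the target is in the same release level as current, or the next one."""
    step = (RELEASE_LEVELS.index(release_level(target))
            - RELEASE_LEVELS.index(release_level(current)))
    return step in (0, 1)


def stepping_levels(current, target):
    """Release levels to pass through, one at a time, to keep every hop non-disruptive."""
    start = RELEASE_LEVELS.index(release_level(current))
    end = RELEASE_LEVELS.index(release_level(target))
    return RELEASE_LEVELS[start + 1:end + 1]


print(is_non_disruptive("v7.3.0c", "v7.4.0b"))
print(is_non_disruptive("7.1.2b", "7.4.0a"))
print(stepping_levels("7.1.2b", "7.4.0a"))
```

For the disruptive 7.1.2b to 7.4.0a example, stepping_levels lists the release levels you would hop through one by one to keep every step non-disruptive.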

Prior to the upgrade

There is no need to run a copy run or write memory on a Brocade FC switch. It persists what you do, and you have to save and activate your zoning configuration anyway when you configure it (cfgsave). All other changes are persisted automatically. So in that regard you should be all good to go.

Make a backup copy of your configuration as is. This gives you a way out if the shit hits the fan and you need to restore to a switch you had to reset or so. Don’t forget to do this for the switches in both fabrics, which normally you have in production!

Log on to the switch with your username and password over telnet or ssh (I use PuTTY or KiTTY).

MySwitchName:admin> configupload

Hit ENTER

Select the protocol of the backup target server you are using

Protocol (scp, ftp, sftp, local) [ftp]: ftp

Hit ENTER

Server Name or IP Address [host]: 10.1.1.12

Hit ENTER

Enter the user, here I’m using anonymous

User Name [user]: anonymous

Hit ENTER

Give the backup file a clear and identifying name

Path/Filename [<home dir>/config.txt]: MySwitchNameConfig20151208.txt

Hit ENTER

Select all (default)

Section (all|chassis|switch [all]): all

configUpload complete: All selected config parameters are uploaded

That’s it. You can verify you have a readable backup file on your FTP server now.
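If you back up more than a handful of switches, it helps to generate the backup file name the same way every time. The snippet below is a small Python sketch of the naming convention used above (switch name, “Config”, date as yyyymmdd). The helper name config_backup_name is just something I made up for this illustration.

```python
from datetime import date


def config_backup_name(switch_name, when=None):
    """Build a backup file name like MySwitchNameConfig20151208.txt."""
    when = when or date.today()
    return f"{switch_name}Config{when:%Y%m%d}.txt"


print(config_backup_name("MySwitchName", date(2015, 12, 8)))
```

You can paste the result at the Path/Filename prompt of configupload, so the backups of both fabrics sort nicely per switch and per date on the FTP server.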

The Upgrade

A production environment normally has 2 fabrics for redundancy. Each fabric consists of 1 or more switches. It’s wise to start with one fabric and complete the upgrade there. Only after all is proven well there should you move on to the second fabric. To avoid any impact on production I tend to plan these early or late in the day, also avoiding any backup activity. Depending on your environment you could see some connectivity drops on any FC-IP links (remote SAN replication, FC to IP ⇔ IP to FC), but when you work one fabric at a time you can mitigate this during production hours via redundancy.

Log on to the first Brocade fabric switch with your username and password over telnet or ssh (I use PuTTY or KiTTY). At the console prompt type

firmwaredownload

This is the command for the non-disruptive upgrade. If you need or want to do a disruptive one, you’ll need to use firmwaredownload -s.

Hit Enter

Enter the IP address of the FTP server (or the name if you have name resolution set up and working)

Server Name or IP address: 10.1.1.12

User name: I fill out anonymous as this gives me the best results. Leaving it blank doesn’t always work depending on your FTP server.

User Name: anonymous

Enter the path to the firmware, I placed the firmware folder in the root of the FTP server so that is

Path: /v7.4.0b

Hit enter

At the password prompt leave the password empty. Anonymous FTP doesn’t need one.

Password:

Hit enter and the upgrade process preparation starts. After the checks have passed you’ll be asked if you want to continue. We enter Y for yes and hit Enter. The firmware download starts and you’ll see lots of packages being downloaded. Just let it run.

This goes on for a while. At one point you’ll see the PROM update happening.

When it’s done it starts removing unneeded files and when done it will inform you that the download is done and the HA rebooting starts. HA stands for high availability. Basically it fails over to the next CP (Control Processor, see http://www.brocade.com/content/html/en/software-upgrade-guide/FOS_740_UPGRADE/GUID-20EC78ED-FA91-4CA6-9044-E6700F4A5DA1.html) while the other one reboots and loads the new firmware. All this happens while data traffic keeps flowing through the switch. Pretty neat.

When you keep a continuous ping to the FC switch running during the HA reboot you’ll see a short drop in connectivity.

But do realize that since this is an HA reboot the data traffic is not interrupted at all. When you get connectivity back, SSH to the switch and verify the reported version, which here is now 7.4.0b.
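Instead of staring at a continuous ping you can also let a small script tell you when the management interface accepts SSH connections again. Below is a minimal Python sketch, assuming SSH listens on the default TCP port 22. The function names and the retry values are arbitrary choices of mine, and it only checks that the port answers, not which firmware version is running.

```python
import socket
import time


def port_is_open(host, port=22, timeout=2.0):
    """Return True when a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_switch(host, port=22, attempts=60, delay=5.0):
    """Poll until the switch accepts connections again, or give up after 'attempts' tries."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    # 10.1.1.50 is a placeholder management IP for the switch, not taken from the walkthrough.
    print(wait_for_switch("10.1.1.50", attempts=3, delay=1.0))
```

Once it reports the switch is back, log on and verify the version by hand as described above.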

That’s it. Move on to the next switch in the same fabric until you’re done. But stop there before you move on to your second fabric (failure domain). It pays to go slow with firmware upgrades in an existing environment.

This doesn’t just mean waiting a while before installing the very latest firmware to see whether any issues pop up in the forums. It also means you should upgrade one fabric at a time and evaluate the effects. If no problems arise, you can move on with the second fabric. By doing so you will always have a functional fabric even if you need to bring down the other one in order to resolve an issue.

On the other hand, don’t leave fabrics unattended for years. Even if you have no functional issues, bugs are getting fixed and perhaps more importantly security issues are addressed as well as browser and Java issues for GUI management. I do wish that the 6.4.x series of the firmware got an update in order for it to work well with Java 8.x.