I attended and presented at E2EVC 2015 in Berlin from June 12th to June 14th. The networking was a blast. No “marchitecure” bull shit or vendor fairy tales what so ever and lots of very open discussions on the realities we’re seeing and facing in virtualization and cloud. Most account managers and esoteric presales would die a painful (but fast) death in this environment.
One session was with my Hyper-V Amigo buddy Carsten Rachfahl and was pure demo extravaganza, so no slides. My own session was “SMB Direct – The Secret Decoder Ring” and was an attempt to position this technology what by looking at the why and where followed by the how by who and when.
I hope a lot of people had at least a better understanding of SMB Direct, RDMA and DCB. The second aim was to take away the fear many people have of this tech by showcasing it in short demos. Time constraints where a challenge so it was not a 200 level session.
There was a brief moment of “this can’t be good” the sys admin looked at the file size of the backup folders and compared it to the size reported for the files. Sure I had told him that Windows inbox deduplication rocked but this had to be too good to be true or deduplication had just eaten all the backup files and he was “toast”. It was neither. But that requires some explanation. The good news is that Windows Data Deduplication combined with a backup product that supports it like VEEAM will save you a ton of money on deduplication licenses some charge and storage costs.
This is what he saw, and what caused the raised eye brow. 12.4TB reduced to 285GB.
Deduplication can’t be that great, right? Did something go wrong? Checking the properties of ALL selected files themselves did not report anything else but compared to the volume info for used space something seems very wrong. That’s supposed to be 5.34 TB.
The volume properties report the effective spaces consumed on the volume, so that reflects the true deduplication results. You can confirm this with PowerShell
A savings rate of 57% and 5.34 TB of actually consumes space (5880575557632 bytes) and an unoptimized size of 12.4 TB. Just as server manager reports.
So what is explorer up to at the folder and file level? Nothing, it just can’t show you the complete picture. Windows Data Deduplication stores duplicated chunks into the System Volume Information folder. Windows explorer runs under your account and has no access to that folder and doesn’t report the size of all chunks in there. The only thing it does reports are the non duplicated bits that are left in the source folder. In our case where the backups reside. The result is, as said, raised eyebrows.
The same is true for any other tool actually, like WinDirStat in the blow screenshot.
When we run this tools as system we get a different picture and you can navigate to the actual ChunkStore and learn more about the internals.
For the third type we need to create a new virtual disk using the fragmented one as the source. See Checking and Correcting Virtual Hard Disk Fragmentation. This easily done but it does cause down time unless you leverage storage live migration. So that’s my preferred method, especially as I leverage ODX when I do this, so it’s pretty fast. So always leave yourself some margin on storage to be able to perform maintenance operations. That has always been true and still is.
But how do you find out that you have this issue? PowerShell is your friend! Here’s a snippet to show you can check all VMs their vhdx files on a cluster:
Here’s a screenshot of some output of this snippet
As said the best solution that does not incur down time is to storage (live) migrate the virtual disks affected. We can automate this and put in some logic to do this for all virtual hard disks that are more than X% fragmented. Do take care to also check for disk space or the migration will fail.
Many MVP’s attended Microsoft Ignite 2015 in Chicago to see what our future will look like.
Carsten published the “Hyper-V Amigo Chat” we did right after Ignite. The conference was a blast for us all. Tired but happy we chat about storage space direct, Nano Server, ReFS, Dedupe, Azure Stack, … Enjoy!