Check/repair/defragment an XFS volume

Introduction

As I have started to use XFS in bite-size deployments to gain experience with it I wanted to write up some of the toolings I found to manage XFS file systems. Here’s how to check/repair/defragment an XFS volume.

My main use case for XFS volumes is on hardened Linux repositories with immutability to use with Veeam Backup & Replication v11 and higher. It’s handy to be able to find out if XFS needs repairing and if they do, repair them. Another consideration is fragmentation. You can also check that and defrag the volume.

Check XFS Volume and repair it

xfs_repair is the tool you need. You can both check if a volume needs repair and actually repair it with the same tool. Note that the use of xfs_check has been depreciated or is not even available (anymore).

To work with xfs_repair you have to unmount the filesystem, so there will be downtime. Plan for a maintenance window.

To check the file system use the -n switch

sudo xfs_repair -n /dev/sdc
Check, repair, and defragment an XFS volume
Check, repair and defragment an XFS volumea dry run with xfs_repair -n

There is nothing much to do but we’ll now let’s run the repair.

sudo xfs_repair /dev/sdc
Check, repair, and defragment an XFS volume
Repairing an XFS file system

The output is similar as for the check we did for anything to repair is basically a dry run of what will be done. In this case, nothing.

Now, don’t forget to mount the file system again!

sudo mount /dev/sdc /mnt/veeamsfxrepo01-02

Check a volume for fragmentation and defrag it

Want to check the fragmentation of an XFS volume? You can but again, with xfs_db. The file system has to be unmounted for that or you will get the error xfs_db: can’t determine device size. To check for fragmentation run the following command against the storage device /file system.

sudo xfs_db -c frag -r /dev/sdc
Check, repair, and defragment an XFS volume
A lab simulation of sudo xfs_db -c frag -r /dev/sdc – Yeah know it’s meaningless 😉

Cool, now we know that we can defrag it online. For that we use xfs_fsr.

xfs_fsr /devsdc /mnt/veeamxfsrepo01-02
Check, repair, and defragment an XFS volume
There is nothing to do in our example

xfs_scrub – the experimental tool

xfs_scrub is a more recent addition but the program is still experimental. The good news is it will check and repair a mounted XFS filesystem. At least it sounds promising, right? It does, but it doesn’t work (Ubuntu 20.04.1 LTS).

No joy – still a confirmed bug – not assigned yet, importance undecided. Not yet my friends.

Conclusion

That’s it. I hope this helps you when you decide to take XFS for a spin for your storage needs knowing a bit more about the tooling. As said, for me, the main use case is hardened Linux repositories with immutability to use with Veeam Backup & Replication v11. In a Hyper-V environment of course.

ReFS vNext Block Cloning and ODX

Introduction

With Windows Server 2016 we also get a new version of ReFS, which I’ll designate aptly as ReFS vNext. It offers a few new abstractions that allow applications and virtualization to control how files are created, copied and moved. The features that are crucial to this are block cloning and data tiering.

Block cloning allows to clone any block of a file into any other block of another file. These operations are based on low cost metadata actions. Block cloning remaps logical clusters (physical locations on a volume) form the source region to the destination region. It’s important to note that this works within the same file or between files. This combined with “allocate on write” ensures isolation between those regions, which basically means files will not over write each other’s data if they happen to reference the same region and one of them writes to that region. Likewise, for a single file, if a region is written to, that changed data will not pop up in the other region. You can learn more about it on this MSDN page on block cloning which explains this further.

ReFS vNext does not do this for every workload by default. It’s done on behalf of an application that calls block cloning, such as Hyper-V for example when merging VHDX files. For these purposes the meta data operations counting references to regions make data copies within a volume fast as it avoids reading and writing of all the data when creating a new file from an existing one, which would mean a full data copy. It also allows reordering data in a new file as with checkpoint merging and it also allows for “data projection” where data can be projected form one area in to another without an actual copy. That’s how speed is achieved.

Now some of the benefits of ReFS vNext are tied into Storage Space Direct. Such as the tiering capability in relation to the use of erasure encoding / parity to get the best out of a fast tier and a slower tier without losing too much capacity due to multiple full data copies. See Storage Spaces Direct in Technical Preview 4 for more information on this.

I’m still very much a student of all this and I advise you to follow up via blogs and documentation form Microsoft as they become available.

What does it mean?

In the end it’s all about making the best use of available resources. The one that you already have and the one that you will own in the future. This lowers TCO and increases ROI. It’s not just about being fast but also optimizing the use of capacity while protecting data. There is one golden rule in storage: “Thou shalt not lose data”.

For now, even when you’re not yet in a position to evaluate Storage Space Direct, ReFS vNext on existing storage show a lot of promise as well. I have blogged about file creation speeds (VHDX files) in this blog post: Lightning Fast Fixed VHDX File Creation Speed With ReFS vNext on Windows Server 2016. In another blog post, Accelerated Checkpoint merging with ReFS vNext in Windows Server 2016 you can read about the early results I’ve seen with Hyper-V checkpoint merging in Windows Server 2016 Technical Preview 4. These two examples are pretty amazing and those results are driven by ReFS metadata operations as well.

Does it replace ODX?

While the results so far are impressive and I’m looking forward to more of this, it does not replace ODX. It complements it. But why would we want that you might ask, as we’ve seen in some early testing that it seems to beat ODX in speed? It’s high time to take a look at ReFS vNext Block Cloning and ODX in Windows Server 2016 TPv4.

The reality is that sometimes you’ll probably don’t want ODX to be used as the capabilities of ReFS vNext will provide for better (faster) results. But sometimes ReFS vNext cannot do this. When? Block cloning for all practical purposes works within a volume. That means can only do its magic with data living on the same volume. So when you copy data between two volumes on the same LUN or between volumes on a different LUNs you will not see those speed improvements. So for deploying templates stored on another LUN/CSV fast it’s not that useful. Likewise, if for space issues or performance issues you were storing your checkpoints on a different LUN you will not see the benefits of ReFS vNext block cloning when merging those checkpoints. So you will have to revise certain design and deployment decisions you made in the past. Sometimes you can do this, sometime you can’t. But as ODX works at the array level (or beyond in certain federated systems) you can get excellent speeds wile copying data between volumes / LUNs on the same server, between volumes / LUNs on different servers. You can also leverage SMB 3.0 to have ODX kick in when it makes sense to avoid senseless data copies etc. So ODX has its own strengths and benefits ReFS vNext cannot touch and vice versa. But they complement each other beautifully.

So as ReFS vNext demonstrates ODX like behavior, often outperforming ODX, you cannot just compare those two head on. They have their own strengths. Just remember and realize that ReFS vNext actually does support ODX so when it’s applicable it can be leveraged. That’s one thing I did not understand form the start. This is beginning to sound like an ideal world where ReFS vNext shines whenever its features are the better choice while it can leverage the strengths of ODX – if the underlying storage array provides it – for those scenarios where ReFS vNext cannot do its magic as described above.

The Future

I’m not the architect at Microsoft working on ReFS vNext. I do know however, that a bunch of very smart people is working on that file system. They see, hear and listen to our experiments, results, and requests. ReFS is getting a lot of renewed attention in Windows Server 2016 as the preferred file system for Storage Space Direct and as such for CSVs. Hyper-V is clearly very much on board with leveraging the capabilities of ReFS vNext. The excellent results of that, which we can see in speeding up VHDX creation/ extending and checkpoint merges, are testimony to this. So I’m guessing this file system is far from done and is going places. I’m expecting more and more workloads to start leveraging the ReFS vNext capabilities. I can see ReFS itself also become more and more feature complete and for example Microsoft has now stated that they are working on deduplication for ReFS, although they do not yet have any specifics on release plans. It makes sense that they are doing this. To me, a more feature complete ReFS being leveraged in ever more uses cases is the way forward. For now, we’ll have to wait and see what the future holds but I am positive albeit a bit impatient. As always I’m providing Microsoft with my feedback and wishes. If and when they make sense and are feasible they probably have them on their roadmap or I might give them an idea for a better product, which is good for me as a customer or partner.

Windows Deduplication And Mysterious Folder & File Sizes

There was a brief moment of “this can’t be good” the sys admin looked at the file size of the backup folders and compared it to the size reported for the files. Sure I had told him that Windows inbox deduplication rocked but this had to be too good to be true or deduplication had just eaten all the backup files and he was “toast”. It was neither. But that requires some explanation. The good news is that Windows Data Deduplication combined with a backup product that supports it like VEEAM will save you a ton of money on deduplication licenses some charge and storage costs.

This is what he saw, and what caused the raised eye brow. 12.4TB reduced to 285GB.

image

Deduplication can’t be that great, right? Did something go wrong? Checking the properties of ALL selected files themselves did not report anything else but compared to the volume info for used space something seems very wrong. That’s supposed to be 5.34 TB.

image

The volume properties report the effective spaces consumed on the volume, so that reflects the true deduplication results. You can confirm this with PowerShell

image
A savings rate of 57% and  5.34 TB of actually consumes space (5880575557632 bytes) and an unoptimized size of 12.4 TB.  Just as server manager reports.

image

So what is explorer up to at the folder and file level? Nothing, it just can’t show you the complete picture. Windows Data Deduplication stores duplicated chunks into the System Volume Information folder. Windows explorer runs under your account and has no access to that folder and doesn’t report the size of all chunks in there. The only thing it does reports are the non duplicated bits that are left in the source folder. In our case where the backups reside. The result is, as said, raised eyebrows.

The same is true for any other tool actually, like WinDirStat in the blow screenshot.

image

When we run this tools as system we get a different picture and you can navigate to the actual ChunkStore and learn more about the internals.

image

REFS Is Going Places At Microsoft Ignite 2015

I just loved the strengths of ReFS when I started looking into them when it was first announced. However it has been a bit quiet around ReFS due to some limitations or support issues.

But we’re seeing progress again and it seems that ReFS will be taken on a bigger role, even as the preferred file system for certain use cases. This is awesome. We’ll get the benefits REFS brings for less expensive types of disks in combination with storage spaces which are quite good already:

  • Integrity: ReFS stores data so that it is protected from many of the common errors that can cause data loss. File system metadata is always protected. Optionally, user data can be protected on a per-volume, per-directory, or per-file basis. If corruption occurs, ReFS can detect and, when configured with Storage Spaces, automatically correct the corruption. In the event of a system error, ReFS is designed to recover from that error rapidly, with no loss of user data.
  • Availability: ReFS is designed to prioritize the availability of data. With ReFS, if corruption occurs, and it cannot be repaired automatically, the online salvage process is localized to the area of corruption, requiring no volume down-time. In short, if corruption occurs, ReFS will stay online.
  • Scalability: ReFS is designed for the data set sizes of today and the data set sizes of tomorrow; it’s optimized for high scalability.
  • App Compatibility: To maximize AppCompat, ReFS supports a subset of NTFS features plus Win32 APIs that are widely adopted.
  • Proactive Error Identification: The integrity capabilities of ReFS are leveraged by a data integrity scanner (a “scrubber”) that periodically scans the volume, attempts to identify latent corruption, and then proactively triggers a repair of that corrupt data.

But, there’s more as ReFS has been improved and those improvements qualify it as the best default choice for a file system on Storage Spaces Direct (SSDi or D2S). What they have done to make it fast for certain data operations (VM creation, resizing, merges, snapshots) can only be described as “ODX” like. We’re getting speed, scalability, auto repair, high availability and data protection in a budget friendly storage. What’s not to like. There are many uses cases now where the need and benefits are clear but the economics worked against us. Well that’s about to be solved with the new and improved storage offerings in Windows Server 2016. I’m looking forward to more information on the evolution of ReFS as it matures as a file systems and take its place in on the front stage over the years. If you haven’t yet, I suggest you start looking at ReFS (again). I’ll be watching to see how far they take this.