Extend the Capacity and Life of Existing Storage Infrastructure by Tiering Files to Azure
Data storage requirements continue to increase every year, forcing infrastructure / storage teams to constantly review capacity and search for new ways to reduce the cost curve for storage and retention.
Much of what is being stored and saved resides in files – placed on shares and often untouched for months or years, if ever again – yet left unchecked, files will replicate like tribbles… clogging and cluttering storage subsystems in computer rooms and closets until it’s time for a costly upgrade to increase capacity.
Why not transparently tier old files to lower-cost storage in Azure to free up local storage capacity on Windows file servers, and avoid costly server or storage upgrades? You can do just that with Azure File Sync without disrupting users or even (hopefully) rebooting.
Azure File Sync is a service for Windows Server that replicates files to Azure and, if desired, tiers them / “stubs them out” on the file server to reduce required storage capacity. It is installed via an agent for supported versions of Windows Server (2012 R2, 2016, 2019), and connects to an Azure File Share using a Storage Sync Service in Azure:
The benefit to your storage infrastructure in terms of capacity comes from enabling tiering when you connect the agent to the Azure File Share (through the Sync Service), which you can do on a per-server basis – as well as set a free capacity target and modification date thresholds for keeping files local:
Once files are synced from your server to Azure (and tiered / “stubbed”), file access for users and applications doesn’t really change, but the space on your server certainly can! Here’s an example where 800+GB of files were tiered from a file server:
The files still look like they are local to the file server, since the file metadata is local, but the actual data may be in Azure.
When a user accesses the file, the File Sync agent transparently recalls the file from Azure and caches it on the local server – very similar to how OneDrive or OneDrive for Business might perform.
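The same tiering settings can also be applied from PowerShell instead of the portal. This is a minimal sketch, assuming the Az.StorageSync module is installed, you are signed in with Connect-AzAccount, and the server is already registered with the Storage Sync Service. The resource group, service, sync group, server, and endpoint names are hypothetical placeholders, and the 20% free space and 30-day values are illustrative only:

```powershell
# Hypothetical names: replace with your own resources.
$resourceGroup = 'rg-filesync'
$syncService   = 'my-storage-sync-service'
$syncGroup     = 'file-shares-sync-group'

# Look up the registered server (assumes exactly one match).
$server = Get-AzStorageSyncServer -ResourceGroupName $resourceGroup -StorageSyncServiceName $syncService |
    Where-Object { $_.FriendlyName -like 'fileserver01*' }

# Server endpoint settings: tiering on, keep 20% of the volume free,
# and tier files that have not been touched for 30 days.
$endpoint = @{
    ResourceGroupName      = $resourceGroup
    StorageSyncServiceName = $syncService
    SyncGroupName          = $syncGroup
    Name                   = 'fileserver01-endpoint'
    ServerResourceId       = $server.ResourceId
    ServerLocalPath        = 'E:\File Shares'
    CloudTiering           = $true
    VolumeFreeSpacePercent = 20
    TierFilesOlderThanDays = 30
}
New-AzStorageSyncServerEndpoint @endpoint
```

Splatting the parameters keeps the policy values in one place, which makes it easier to reuse the same settings for another server.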
Imagine the space that could be freed up on constrained branch office servers as well as large file servers… pushing out server replacements or SAN capacity upgrades! The added benefit is backup centralization…with files replicated to Azure, you can manage backups there… taking additional strain off your data center infrastructure – I’ve shown how to install and configure Azure File Sync previously along with using it for backup and recovery.
Now I’d like to show you a real-life example: the steps I used to significantly reduce the local storage used by a VMware-based Windows file server.
My server is a Windows Server 2016 VM with two VMDKs: one 60GB boot disk and a 1TB data disk (thick provisioned):
It has the same 800+GB of files I showed earlier, except that they are all local and inside the data VMDK on the E: drive:
After installing Azure File Sync and synchronizing the files, I tiered all the local files on the E: drive by running this simple bit of PowerShell inside the VM:
Import-Module "C:\Program Files\Azure\StorageSyncAgent\StorageSync.Management.ServerCmdlets.dll"
Invoke-StorageSyncCloudTiering -Path "E:\File Shares"
I could have waited for the bulk of the files to “stub out” on their own, but why wait when you can run some PowerShell! This process had a significant impact on the storage used inside the VMDK:
The typical way I approach freeing up space for a VM is to compress the disk file – there are several tools (including those from VMware) to do this.
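One way to confirm the effect from inside the VM is to count the files that now carry the Offline attribute, which tiered files are expected to have, and then look at the volume's free space. This is a rough sketch, assuming the same E:\File Shares path; listing files this way reads only metadata and should not recall them:

```powershell
# Enumerate every file under the share (metadata only).
$path  = 'E:\File Shares'
$files = @(Get-ChildItem -Path $path -Recurse -File -Force)

# Tiered (stubbed) files are expected to have the Offline attribute set.
$tiered = @($files | Where-Object {
    $_.Attributes -band [System.IO.FileAttributes]::Offline
})

$tieredCount = $tiered.Count
$totalCount  = $files.Count
"$tieredCount of $totalCount files are tiered"

# Compare total and remaining space on the data volume.
Get-Volume -DriveLetter E |
    Select-Object DriveLetter,
        @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 1) } },
        @{ Name = 'FreeGB'; Expression = { [math]::Round($_.SizeRemaining / 1GB, 1) } }
```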
Depending on your storage subsystem, you likely will want to zero out the now empty portion of your VMDK – that would allow the zeroed blocks to be deduplicated. To zero out Windows VMs, I use SDelete (https://docs.microsoft.com/en-us/sysinternals/downloads/sdelete) with the “-z” option – something like:
sdelete.exe -z E:
I used the 64-bit version that’s in the download, and it took about 6 hours to do the cleaning and purging process – all while the file server was still up and available.
Again, depending upon your storage platform, you may at this point be able to deduplicate the VMDK and save a ton of space. If your storage doesn’t support deduplication, then you likely have a few more steps, depending on your ultimate plans and goals.
For a thin provisioned VMDK, you should be able to reduce the file size using vmkfstools as outlined here (and documented on the VMware site). Thick provisioned VMDKs are trickier… but still something worth tackling.
If your ultimate goal is to move the VM somewhere else – like to Azure (either native IaaS or to VMware in Azure using Azure VMware Solution) – then the migration tools themselves will likely resolve the VMDK size issue as part of the move (using Azure Site Recovery or VMware HCX to AVS).
If, however, you want the VM to remain on premises, you’ll still want to make it smaller. You can convert the VMDK from thick to thin provisioned simply by using VMware Converter or by cloning the VM in vCenter. Either way, the result is what matters: less storage utilization (or an end to uncontrolled storage growth in your data center). I merely shut down the existing VM and cloned it – changing the storage to “thin” for the new VM. This process did take some time (about 2 ½ hours of downtime), but took the data disk down from 1TB to less than 9GB:
Yes, there was downtime as part of this process – but that’s the only server downtime that occurred through the entire process!
If my storage supported deduplication or I used something like HCX to move the VM, I likely would have seen little if any downtime.
Don’t Want to Shrink? Replace the VM Instead!
Another way to avoid downtime would be to create a new VM that’s part of the Azure File Sync setup. Azure File Sync supports replicating files to multiple hosts from a single Azure File Share – that means you could create a new VM with a smaller, thin provisioned VMDK and use it to host the tiered files.
The new VM would just need the File Sync agent installed and configured with tiering enabled (just like the original VM), with its own new file system.
For everything to work seamlessly, the new file server will need to be in the same AD domain, and there are a few other things to consider:
- Server name – you may want the new server to have the same name as the old VM (so it responds to the same requests for files). You could rename the server to match, use DFS-N, or just use a new server name (not a bad option for some situations!)
- Shares and share permissions – Azure File Sync replicates directories, files, and permissions, but doesn’t do anything with the actual shares defined on the server. You’ll want to migrate (back up and restore) those shares / permissions to the new server.
Ned P. pointed me at a great (if very old) article on migrating file / storage services that is worth reading:
That pointed me to the Windows Server Migration Tools:
Which (if I remember correctly) I installed, and then “extracted” my share info using just PowerShell and sent it to the new, remote server:
Install-WindowsFeature Migration
Add-PSSnapin Microsoft.Windows.ServerManager.Migration
Send-SmigServerData -ComputerName ws2019 -SourcePath "d:\File Shares" -DestinationPath "d:\File Shares" -Include Share -Recurse -Verbose
Sending from the source server is only half the code; you also need to “receive” the share info on the other side:
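A minimal sketch of the receiving side, assuming the Windows Server Migration Tools feature is also installed on the destination and the commands run in an elevated session. As I understand the Migration Tools documentation, both cmdlets prompt for a shared password and need to be started within a few minutes of each other, so treat the details here as assumptions to check against your own environment:

```powershell
# Run on the destination server (ws2019 in the example above).
Install-WindowsFeature Migration
Add-PSSnapin Microsoft.Windows.ServerManager.Migration

# Waits for the connection from Send-SmigServerData on the source and
# prompts for the same password that was entered on the sending side.
Receive-SmigServerData
```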
…and here’s what the target server looked like before, and after the share migration:
Azure File Sync’s tiering capability can be used to easily reduce storage utilization and help avoid expensive storage upgrades. If you have a standalone file server (like in a branch location), you may not have to go through all these steps to see the value of upgrade avoidance – you’ll save space you’ll just reuse.
Here I went through a more challenging (worst case) example:
- VM in a centralized data center on VMware (tough for me – I’m a Hyper-V guy, but I’m learning!)
- “thick” provisioned storage
I wanted to summarize the time and effort to reclaim this ~1TB of storage, so I put together the table below:
|Operation|Elapsed Time|Work Time|Downtime?|
|---|---|---|---|
|Install / Configure File Sync|20 minutes|20 minutes|None|
|Replicate 800+GB of files to Azure|~2 days|None|None|
|Tier data (free space in VMDK)|seconds|5 minutes|None|
|Zero out free space in VMDK|6 hours|5 minutes|None|
|Shrink 1 TB VMDK and reattach to VM|2 ½ hours|10 minutes|2 ½ hours*|
|Total Time|~3 days|40 minutes|2 ½ hours|
*Note that downtime could be avoided / reduced based on use of deduplication or migration (HCX).
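As a quick check of the numbers, the hands-on time and the space reclaimed can be tallied in a few lines of PowerShell. This sketch simply re-enters the figures from the table and the 1TB to under 9GB result; the step labels are shortened, and 9GB is used as an upper bound for the final disk size:

```powershell
# Hands-on minutes per step, copied from the "Work Time" column.
$workMinutes = [ordered]@{
    'Install / Configure File Sync' = 20
    'Replicate files to Azure'      = 0
    'Tier data'                     = 5
    'Zero out free space'           = 5
    'Shrink VMDK and reattach'      = 10
}
$totalWork = ($workMinutes.Values | Measure-Object -Sum).Sum
"Total hands-on time: $totalWork minutes"

# Data disk size before and after the thin-provisioned clone, in GB.
$beforeGB = 1024
$afterGB  = 9
$savedPct = [math]::Round((1 - $afterGB / $beforeGB) * 100, 1)
"Data disk reduced by at least $savedPct percent"
```

The point of the tally is how little of the elapsed time was actual work: most of the roughly three days was unattended replication and zeroing.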