In case you missed it, Microsoft has been making a serious play for replacing your next SAN purchase with Scale-Out File Servers (SOFSs) and storage spaces. First presented in Windows Server 2012 and enhanced in 2012 R2, these technologies, together with the SMB 3.0 protocol and support for RDMA NICs, match most SANs, feature by feature. And they are much cheaper because you’ll run them on commodity hardware and stick the disks in “dumb” JBOD enclosures without the need for expensive RAID controllers.
I was looking forward to what was coming in the next version of the server, and Microsoft didn’t disappoint, again bringing a technology with its roots in Azure: storage replication (SR). This is the last piece of the puzzle: replicating data within a datacenter or to a remote location has so far been a feature of high-end SANs. The other improvement is an enhanced storage QoS. Introduced in 2012 R2 (but only on a per-VHD/VHDX file basis, not across the entire storage stack), it can now apply to one or more virtual disks, VMs, or a tenant with centralized control.
Storage replication
This in-box replication technology is block-level (meaning it doesn’t care about the data type) and storage agnostic, so different storage types can be replicated. This is in contrast to most SAN replication add-ons, which require the same brand and often model of SAN at each end.
For shorter distances over fast links, you can do this synchronously, where the write isn’t acknowledged by the storage until it has happened on both the primary and the replica storage. For longer distances, the asynchronous replication option works better, where the write is acknowledged as soon as it’s processed on the primary side, leading to the possibility of some data loss if there’s a sudden outage.
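To make the difference between the two modes concrete, here is a toy Python model of when a write is acknowledged. It is only a sketch of the idea described above, not how SR is implemented; the class and method names (ReplicatedVolume, drain, lost_on_primary_failure) are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """A trivially simple block store: block number -> data."""
    blocks: dict = field(default_factory=dict)


@dataclass
class ReplicatedVolume:
    """Hypothetical model of a primary volume and its replica."""
    primary: Volume = field(default_factory=Volume)
    replica: Volume = field(default_factory=Volume)
    synchronous: bool = True
    # Writes acknowledged to the application but not yet on the replica.
    pending: list = field(default_factory=list)

    def write(self, block_no, data):
        self.primary.blocks[block_no] = data
        if self.synchronous:
            # Synchronous: the replica is written before we acknowledge.
            self.replica.blocks[block_no] = data
        else:
            # Asynchronous: acknowledge now, ship the block later.
            self.pending.append((block_no, data))
        return "ack"

    def drain(self):
        """Ship all queued writes to the replica (asynchronous mode)."""
        while self.pending:
            block_no, data = self.pending.pop(0)
            self.replica.blocks[block_no] = data

    def lost_on_primary_failure(self):
        """Blocks that would be missing if the primary died right now."""
        return [block_no for block_no, _ in self.pending]


sync_vol = ReplicatedVolume(synchronous=True)
sync_vol.write(1, b"payroll")
print(sync_vol.lost_on_primary_failure())   # nothing is at risk

async_vol = ReplicatedVolume(synchronous=False)
async_vol.write(1, b"payroll")
print(async_vol.lost_on_primary_failure())  # block 1 is at risk until drained
```

In the synchronous case the application waits for the round trip to the replica on every write, which is why link latency matters so much; in the asynchronous case the queue of pending writes is exactly the data you could lose in a sudden outage.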
Replication is enabled on a per-volume basis and can take place over ordinary TCP/IP networks or using RDMA (iWARP or InfiniBand with Mellanox Metro-X) networks. Replication also supports multichannel and can work together with both Deduplication and BitLocker.
It’s important to point out that SR doesn’t replace Distributed File System Replication (DFSR) for branch office scenarios. SR won’t work well over typical slow WAN links with high latency. SR is also not a backup technology; if you delete a whole bunch of files on the primary side by mistake, those blocks will be instantly erased on the replica side as well. SR also isn’t a replacement for application-aware replication technologies such as Hyper-V Replica, SQL AlwaysOn, or Exchange Database Availability Groups. Whenever an application has built-in replication, it’s going to be much better tailored for the specific workload compared to a generic storage replication technology such as SR.
This is one of the few new technologies in the Windows Server 10 Technical Preview that comes with a guide. Setting up the full lab it describes requires four servers and two SANs or JBOD enclosures with RDMA NICs.
Microsoft goes to great lengths to point out (at least four times in the guide) that this preview is NOT supported for production workloads, nor has its performance been optimized yet.
Installing Windows Volume Replication
How to use storage replication
One example workload is a stretched Hyper-V cluster with a few nodes in a primary datacenter with their own shared storage (SAN or a JBOD enclosure) and another few hosts in another nearby datacenter with their own shared storage. This would give you a true stretched cluster with automatic failover of VMs between the primary and secondary sites—unlike Hyper-V Replica, which requires manual intervention for failover.
SR is also referred to as Windows Volume Replication (WVR), an older codename, which you’ll see in the guide and in the GUI of the Technical Preview.
For each volume that you’re going to replicate, you’ll need a log volume on both the source and the destination storage. The two log volumes must be the same size, and each must be at least 10% of the size of the data volume; Microsoft recommends SSD-based storage for them. For data volumes, you can optionally seed the data beforehand through an out-of-band transport mechanism.
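The log sizing rule is simple arithmetic, but it is easy to get wrong when planning a lab. Here is a small Python sketch of the check, assuming sizes are given in whole GiB and that the minimum is rounded up to the next GiB (the rounding is my assumption, not something the guide specifies). The function names are hypothetical.

```python
def min_log_volume_gib(data_volume_gib, percent=10):
    """Smallest log volume (in whole GiB) for a given data volume.

    Uses integer ceiling division so that, for example, 10% of
    2048 GiB (204.8 GiB) is rounded up to 205 GiB.
    """
    return -(-data_volume_gib * percent // 100)


def log_volumes_ok(data_volume_gib, source_log_gib, dest_log_gib):
    """Check the two rules from the text: same size on both sides,
    and at least 10% of the data volume."""
    same_size = source_log_gib == dest_log_gib
    big_enough = source_log_gib >= min_log_volume_gib(data_volume_gib)
    return same_size and big_enough


print(min_log_volume_gib(2048))          # minimum log size for a 2 TiB data volume
print(log_volumes_ok(2048, 256, 256))    # matching and large enough
print(log_volumes_ok(2048, 256, 200))    # sizes differ between the sites
```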
Storage Quality of Service
Microsoft introduced storage QoS in Windows Server 2012 R2. On an individual VHD(X) file basis, you could define minimum and maximum IOPS limits, measured in 8K chunks. The problem was that no centralized traffic cop existed; therefore, although the Hyper-V host would limit the maximum IO demanded by a VM and try to guarantee the minimum setting, there was no way the SAN or SOFS cluster storage could actually balance IO load among several hosts filled with demanding VMs.
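Measuring in 8K chunks means that one large IO counts as several IOs against the limit. The Python sketch below shows the arithmetic, assuming that anything up to 8 KB counts as one normalized IO and that larger IOs are rounded up to the next 8 KB multiple. The round-up behavior is my assumption about how the normalization works, and the function names are made up for illustration.

```python
NORMALIZED_IO_BYTES = 8 * 1024  # the 8K chunk size mentioned in the text


def normalized_io_count(io_size_bytes):
    """How many normalized IOs a single IO of this size counts as.

    Assumption: sizes are rounded up, so 1 byte .. 8 KB -> 1,
    just over 8 KB -> 2, and so on.
    """
    if io_size_bytes <= 0:
        raise ValueError("IO size must be positive")
    return -(-io_size_bytes // NORMALIZED_IO_BYTES)  # ceiling division


def normalized_iops(io_sizes_in_one_second):
    """Total normalized IOPS for the IOs issued during one second."""
    return sum(normalized_io_count(size) for size in io_sizes_in_one_second)


# A VM issuing one hundred 64 KB IOs per second counts as far more than
# 100 IOPS against its limit, because each IO is eight 8K chunks.
print(normalized_iops([64 * 1024] * 100))
```

The practical consequence is that a workload doing large sequential IO hits a given maximum sooner than one doing small random IO at the same raw IO rate.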
Setting storage QoS in Hyper-V 2012 R2
The new mechanism is much more comprehensive. Built on top of SOFS, the mechanism allows you to define policies for a virtual disk, for a VM or a group of VMs, or for a tenant/service. Because it doesn’t rely on storage spaces, it will work even if you have an iSCSI or FC SAN backing the SOFS cluster. A guide for this feature was recently published.
Using PowerShell (I REALLY hope there will be a GUI for this in the final version), you can create policies, assign policies, and monitor the performance of each flow in each policy. Each policy can be either single-instance or multi-instance: a single-instance policy applies its limits to a set of VMs or disks as a combined total, whereas a multi-instance policy applies them to each VM individually.
So, if you define a maximum of 800 IOPS in a single instance policy and then apply that policy to five virtual hard disks, the combined total IO from all five disks won’t exceed that limit. A multi-instance policy, on the other hand, is suitable if you want to specify a similar set of minimum and maximum limits across many VMs because the limits apply to each individual VM.
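The difference between the two policy types is easiest to see with numbers. The Python sketch below models only the maximum limit; the proportional sharing used for the single-instance case is my own simplifying assumption, since how the real scheduler divides IO among the disks isn't described here. The function name and the "single"/"multi" strings are hypothetical, not actual cmdlet parameters.

```python
def granted_iops(policy_type, max_iops, demands):
    """Return the IOPS each disk gets, given what each disk asks for.

    policy_type: "single" -> max_iops is a combined cap for all disks
                 "multi"  -> max_iops is a cap for each disk on its own
    """
    if policy_type == "multi":
        # Every disk is capped individually.
        return [min(demand, max_iops) for demand in demands]
    if policy_type == "single":
        total = sum(demands)
        if total <= max_iops:
            return list(demands)
        # Assumption: scale everyone down proportionally to fit the cap.
        scale = max_iops / total
        return [demand * scale for demand in demands]
    raise ValueError("policy_type must be 'single' or 'multi'")


five_busy_disks = [400] * 5  # each disk wants 400 IOPS

single = granted_iops("single", 800, five_busy_disks)
multi = granted_iops("multi", 800, five_busy_disks)

print(sum(single))  # combined total stays at the 800 IOPS cap
print(sum(multi))   # each disk is under its own cap, so all demand is met
```

With the single-instance policy the five disks share 800 IOPS between them; with the multi-instance policy the same setting allows up to 800 IOPS per disk, so the combined load on the storage can be five times higher.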
You can change settings in a policy after the policy has been created, but you can’t change the policy type. The PowerShell cmdlets for applying policies to virtual hard disks aren’t included in the Technical Preview; Microsoft wants you to look at this script instead. Also, policies can’t be applied to shared VHDX files.
PowerShell for storage QoS
Conclusion
I’m often asked to identify the most important fabric component of a well-balanced private cloud/virtualized environment. Although my students point to memory or processing power, the correct answer is storage. Providing reliable, performant, and scalable storage for many VMs is challenging; doing so at a much lower price point than proprietary SANs is innovative. Both new technologies revealed in the Technical Preview complete the picture for virtualization storage. Therefore, if your business is looking at Hyper-V or considering the purchase of more storage (when the final version is released in the middle of next year), definitely include SOFS in your short list.
In the next post of this series I'll cover the new networking features.