One of the big benefits S2D has over "traditional" Storage Spaces is the simpler networking. In 2012/2012 R2, you had two NICs dedicated to storage traffic between the SOFS and Hyper-V nodes as well as another network between the cluster nodes and another set of NICs for client and VM-to-VM traffic.
In a hyper-converged S2D solution, you need only two NICs, over which storage, cluster heartbeat and VM-to-VM/client traffic all flow. Switch Embedded Teaming (SET) allows you to mix these networking loads together and use QoS or DCB to control bandwidth allocation. RDMA network interfaces are among the best options here: on average, they deliver a 30% increase in throughput over standard Ethernet while reducing CPU usage by approximately 30%. This last point is important because in a hyper-converged cluster, you want as many CPU cycles as possible to be available to run your VM workloads.
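As a rough illustration of the SET approach described above, the following sketch teams two physical NICs into one Hyper-V switch and exposes RDMA-enabled host vNICs for storage traffic. The adapter names ("NIC1", "NIC2"), the switch name and the vNIC names are placeholders I have assumed, not values from the lab in this article; adjust them to your own hardware.

```powershell
# Sketch only: "NIC1"/"NIC2", "S2DSwitch", "SMB1" and "SMB2" are assumed placeholder names.

# Create a Hyper-V virtual switch with Switch Embedded Teaming across both physical NICs.
New-VMSwitch -Name "S2DSwitch" -NetAdapterName "NIC1","NIC2" -EnableEmbeddedTeaming $true

# Add two host (management OS) virtual NICs to carry SMB/storage traffic.
Add-VMNetworkAdapter -ManagementOS -SwitchName "S2DSwitch" -Name "SMB1"
Add-VMNetworkAdapter -ManagementOS -SwitchName "S2DSwitch" -Name "SMB2"

# Enable RDMA on the host vNICs, then list the RDMA state of all adapters to confirm.
Enable-NetAdapterRdma -Name "vEthernet (SMB1)","vEthernet (SMB2)"
Get-NetAdapterRdma
```

Run this on each node from a console session rather than over one of the NICs being teamed, since creating the switch briefly interrupts connectivity on those adapters.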
Setting up an S2D cluster
In my test lab, I have four physical computers running Windows Server 2016 Hyper-V, each with 32 GB of RAM. Two nodes have Chelsio 40 Gb RDMA NICs (2 x 40 Gb ports), and two nodes have Chelsio 10 Gb RDMA NICs (2 x 10 Gb ports). A 10 Gb spider cable connects the 40 Gb NICs to a Dell 12-port 10 Gb switch, providing a total bandwidth of 40 Gbps per host. The 10 Gb hosts have both ports connected, providing a total of 20 Gbps bandwidth per node.
I prefer to use iWARP RDMA networking over RoCE/InfiniBand. It's easier because it avoids the complex DCB configuration of the switch. In addition, because it's just TCP/IP, you can use ordinary switches. This means you don't need all new infrastructure to implement iWARP, unlike RoCE. RDMA isn't limited to servers either. Microsoft now supports it in Windows 10, enabling scenarios where a high-speed trading application or video-editing station can connect to backend S2D storage over RDMA. As a side note, Chelsio is releasing a solution where their NICs can act as switches, thus eliminating the need for an expensive 10 or 40 Gbps switch altogether.
Each host has two SATA SSDs and two SATA 2 TB HDDs installed. I started by installing the OS on each node, followed by the latest driver for the Chelsio NIC. I ran the Test-Cluster cmdlet to see if the nodes met the requirements for a Failover Cluster.
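For reference, a validation run along these lines might look like the sketch below. The node names are hypothetical, and it assumes the Failover Clustering feature and its PowerShell tools are already installed on the machine where you run it.

```powershell
# Sketch only: the node names are assumed placeholders for the first three lab hosts.
$nodes = "S2D-N1", "S2D-N2", "S2D-N3"

# Validate the nodes, including the S2D-specific test category.
# The cmdlet writes an HTML report and returns its location.
Test-Cluster -Node $nodes -Include "Storage Spaces Direct", "Inventory", "Network", "System Configuration"
```

Review any warnings in the report before moving on; storage and network findings are the ones most likely to matter for S2D.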
I then used PowerShell to create a new cluster (starting with three nodes; I wanted to see how easy it was to add a fourth node later). To check availability of all drives for use, I ran Get-PhysicalDisk.
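The two steps above can be sketched as follows. The cluster and node names are assumptions for illustration only, and the commented-out last line shows one way the fourth node could be joined later.

```powershell
# Sketch only: "S2D-CL1" and the node names are assumed placeholders.
$nodes = "S2D-N1", "S2D-N2", "S2D-N3"

# Create the cluster without adding any disks as classic clustered storage.
New-Cluster -Name "S2D-CL1" -Node $nodes -NoStorage

# On a node, list the drives that are eligible to join a storage pool.
Get-PhysicalDisk -CanPool $true |
    Sort-Object MediaType |
    Format-Table FriendlyName, MediaType, BusType, Size, CanPool

# Later, a fourth node can be joined to the existing cluster:
# Add-ClusterNode -Cluster "S2D-CL1" -Name "S2D-N4"
```

Drives that still hold partitions or belong to an old pool will not report CanPool as True, so this listing is a quick way to spot disks that need cleaning first.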
Finally, I ran Enable-ClusterStorageSpacesDirect, which automatically grabbed all SSDs and HDDs and added them to a single pool. The last step is creating a virtual disk on top of the pool. Here's where things are a bit different from Storage Spaces. In the old world, you had to choose your tiering and whether to use mirroring or parity manually. S2D automatically picks this for you based on the number of nodes and the number of disks.
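A minimal sketch of these last two steps, run locally on one of the cluster nodes, could look like this. The volume name and the 1 TB size are arbitrary values I have assumed; the "S2D*" wildcard relies on the default pool name beginning with "S2D".

```powershell
# Sketch only: run on a cluster node; "Volume01" and the 1TB size are assumed values.

# Claim all eligible SSDs and HDDs into a single pool and configure the cache.
Enable-ClusterStorageSpacesDirect

# Create a Cluster Shared Volume formatted with ReFS on top of the pool.
# No resiliency setting is given, so S2D chooses it automatically.
New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName "Volume01" -FileSystem CSVFS_ReFS -Size 1TB

# Inspect what S2D picked for the new virtual disk.
Get-VirtualDisk |
    Format-Table FriendlyName, ResiliencySettingName, NumberOfDataCopies, Size, FootprintOnPool
```

Comparing Size with FootprintOnPool in the last command shows how much raw pool capacity the chosen resiliency actually consumes.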
Overall, setting up S2D is a lot more straightforward in the RTM release than the fiddling around required in the Technical Previews.
If you have System Center Virtual Machine Manager 2016 (VMM) you can attach an existing S2D cluster to it. On the other hand, if you have new physical servers, you can use VMM's deployment technology to install Windows Server 2016, turn the nodes into Hyper-V hosts and deploy S2D to them, all with a single checkbox in the VMM create cluster wizard.
Monitoring and managing S2D
System Center Operations Manager provides a management pack for S2D with a dashboard to visualize performance metrics and warn about issues. But unlike in earlier releases, the logic to gather the necessary data is not in the management pack. Instead, Microsoft has built a health service for storage (also covering Storage Replica and Storage QoS) directly into the OS. This provides information about health state as well as performance metrics. You can run Get-StorageSubSystem Name | Get-StorageHealthReport, where Name is the friendly name of the cluster's storage subsystem, to access cluster-wide health data.
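As a quick sketch of querying the health service from PowerShell, the commands below assume you run them on a cluster node and that the clustered storage subsystem's friendly name begins with "Cluster", which is why a wildcard is used instead of the full name.

```powershell
# Sketch only: assumes the clustered subsystem's friendly name starts with "Cluster".

# Cluster-wide performance and capacity metrics from the health service.
Get-StorageSubSystem Cluster* | Get-StorageHealthReport

# Current faults (for example failed drives), with descriptions and recommended actions.
Get-StorageSubSystem Cluster* | Debug-StorageSubSystem
```

A script that polls these cmdlets on a schedule is one plausible way to feed an external dashboard.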
Datacom is an early adopter of S2D here in Australia. They built their own dashboard in Grafana using a PowerShell script to gather the data from the health service. DataON Storage is a popular vendor in Storage Spaces/S2D; they have built a beautiful HTML5-based dashboard for their products. You can opt in to report disk failures directly to them so they can ship a replacement drive, possibly before you're even aware of the failure.
Cluster-Aware Updating (CAU) is the inbox and cluster-aware engine for patching each cluster node in an orchestrated fashion. As of 2016, it's aware of S2D. It will only patch a node in which all virtual disks are healthy.
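To illustrate, a manual check followed by an on-demand updating run might look like the sketch below. The cluster name is a placeholder I have assumed, and the retry and failure limits are example values rather than recommendations.

```powershell
# Sketch only: "S2D-CL1" is an assumed cluster name; limits are example values.

# On a cluster node, confirm all virtual disks are healthy and no repair jobs are running.
Get-VirtualDisk | Format-Table FriendlyName, HealthStatus, OperationalStatus
Get-StorageJob

# Start an on-demand CAU run that patches one node at a time.
Invoke-CauRun -ClusterName "S2D-CL1" -MaxFailedNodes 0 -MaxRetriesPerNode 3 -RequireAllNodesOnline -Force
```

The manual health check is optional, but it shows the same state CAU waits for before it moves on to the next node.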
Storage Spaces Direct is a game changer. I suspect it'll be the default deployment for Hyper-V clusters going forward. This comes from the general hype about hyper-convergence in the IT industry and the benefits S2D brings. Such benefits include ease of setup, cost-effective components and fantastic performance. As for performance, Microsoft has demonstrated six million read IOPS in an S2D cluster. With SQL Server 2016 supported for storing databases directly on S2D, that'll also be an interesting option, since many database deployments are very storage intensive. S2D is also another reminder to look seriously at RDMA networking. The higher throughput and minimal CPU overhead are very attractive.
There are some words of caution, however. S2D is not a small business/branch office solution in most cases, since it requires a Windows Server Datacenter license at approximately five times the cost of the Standard license. For these smaller deployments, virtual SAN solutions such as those from StarWind still have their place.
Since you can use S2D with almost any type of local storage, you can easily set up a lab with a few VMs and learn the technology in preparation for production deployments.