- Storage Replica in Windows Server 2016 - Tue, Sep 20 2016
The main features of Storage Replica:
- It performs block-level replication of data, with zero data loss when configured synchronously.
- It is storage-agnostic (but it requires that we have data volumes, which are NTFS formatted).
- It is configurable as synchronous or asynchronous.
- Replication is based on volume source and destination.
- It uses SMB 3 as the transport protocol and is supported using TCP/IP or RDMA.
- It can replicate open files, as it operates on block level.
It supports different use cases, including host-to-host replication, cluster-to-cluster replication, and same-host replication (if we want to synchronize data from one volume to another).
Nano Server also supports Storage Replica, but you need to add it as a separate component when building the server image.
The diagram below describes how Storage Replica works in a synchronous configuration. (1) When an application writes data to the file system (for instance, the D:\ drive), an I/O filter intercepts the write and (2) writes it to the log volume on the same host. (3) The data is then replicated to the secondary site and written to the log volume there. Once the data has been written to both log volumes, the secondary server (4) sends an acknowledgment to the primary server, which in turn (5) sends an acknowledgment to the application. The data is also flushed from the logs to the data volume using write-through.
The purpose of the log volume is to record all block changes as they occur, similar to a SQL database transaction log that stores all transactions and modifications. After a power outage at the remote site, for example, the remote server needs all the changes that occurred during the outage in order to compare them and bring its copy up to date.
It is important to be aware that in a synchronous configuration, the application has to wait for the acknowledgment from the remote site, so a constrained network will affect application performance considerably. Most TCP/IP networks add about 2–5 ms of latency, which can create a bad user experience. Consider using RDMA instead: it has considerably lower overhead, and therefore much lower latency, because it bypasses most of the host's TCP/IP processing. For synchronous replication, the recommendation is a maximum of 5 ms round-trip latency and high bandwidth between the source and destination.
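As a rough way to check a link against the 5 ms guideline, the sketch below pings the destination from the source and averages the results. It assumes Windows PowerShell 5.1 (where Test-Connection returns objects with a ResponseTime property in whole milliseconds), that ICMP is allowed between the hosts, and that NTX-SR02 is the destination server used later in this post. ICMP echo only approximates what SMB replication traffic will experience, so treat the result as an indication rather than a verdict.

```powershell
# Assumptions: Windows PowerShell 5.1, ICMP echo allowed between the two hosts.
$destination = 'NTX-SR02'   # destination server name used later in this post

# Send 20 pings and compute the average and maximum round-trip time.
$stats = Test-Connection -ComputerName $destination -Count 20 |
    Measure-Object -Property ResponseTime -Average -Maximum

'Average round trip: {0:N1} ms, maximum: {1} ms' -f $stats.Average, $stats.Maximum

if ($stats.Average -gt 5) {
    Write-Warning 'Average latency exceeds the 5 ms guideline for synchronous replication.'
}
```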
By design, the Data and Log volumes on the remote site will be unmounted and marked as non-accessible.
Now, if we were to configure asynchronous replication, the picture would be quite different. Storage Replica would first write the data to the local log volume and then send an acknowledgment to the application, giving the same application performance as if Storage Replica were not installed. Next, it would replicate the data from the local log volume to the other site and write it to the log volume there. Because the application does not have to wait for the remote site, the requirements on the network layer are less strict; this allows asynchronous deployment in WAN scenarios. It is also important to be aware that if the network link between the two sites fails, the log volume on the source stores all block changes until the link comes back up; Storage Replica then replicates the changes that happened while the link was down.
There are some requirements we need to be aware of before configuring this feature.
- We need to have two volumes available at each location, one for data and one for logs.
- Disks need to be initialized as GPT, and data volumes need to be the same size at the source and destination.
- Log volumes need to be the same size at the source and destination.
- Data volumes should not be larger than 10 TB.
- We need to have Windows Server 2016 as the source and as the target resource.
It is important to note that Storage Replica is a Windows feature that is available only in the Datacenter edition of Windows Server 2016.
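The requirements above can be checked quickly from PowerShell before going any further. The following is a minimal sketch to run on each server; the drive letters E and F are assumptions matching the data and log volumes used in the partnership example later in this post.

```powershell
# Run on both the source and the destination server.
# Assumption: E is the data volume and F is the log volume.

# Partition style of each disk (should be GPT for the replicated disks).
Get-Disk | Select-Object Number, FriendlyName, PartitionStyle,
    @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 1) } }

# File system and size of the data and log volumes; compare the output between servers.
Get-Volume -DriveLetter E, F | Select-Object DriveLetter, FileSystem,
    @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 1) } }

# Operating system edition (should report Windows Server 2016 Datacenter).
(Get-CimInstance -ClassName Win32_OperatingSystem).Caption
```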
Installing Storage Replica
Launch a PowerShell console with administrator privileges and execute the following command:
Install-WindowsFeature Storage-Replica, FS-FileServer -Restart -IncludeManagementTools
This PowerShell command also installs the File Server role service (FS-FileServer). Although this role is not required for Storage Replica to work, we will use it later in this post to run the Test-SRTopology command. After we have successfully run Test-SRTopology, we can safely remove the file server role from the servers we want to use with Storage Replica.
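Since both servers need the feature, the installation can also be driven from one console. The sketch below assumes the two server names used later in this post (NTX-SR01 and NTX-SR02) and that remote management is enabled between them; the $removeFileServer switch is a hypothetical guard so that the role is only removed once Test-SRTopology has completed.

```powershell
# Assumption: server names as used later in this post; remote management is enabled.
$servers = 'NTX-SR01', 'NTX-SR02'

# Install Storage Replica and the File Server role service on both servers.
foreach ($server in $servers) {
    Install-WindowsFeature -ComputerName $server -Name Storage-Replica, FS-FileServer -IncludeManagementTools -Restart
}

# After the reboot, confirm the install state on each server.
foreach ($server in $servers) {
    Get-WindowsFeature -ComputerName $server -Name Storage-Replica, FS-FileServer
}

# Hypothetical guard: set to $true only after Test-SRTopology has finished.
$removeFileServer = $false
if ($removeFileServer) {
    foreach ($server in $servers) {
        Uninstall-WindowsFeature -ComputerName $server -Name FS-FileServer
    }
}
```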
For this setup, I have two virtual machines, which have two additional volumes each; I will use them as data and log volumes. Since Storage Replica does not have any UI management, we must use PowerShell to do all configuration.
Before we set up any storage replication options, we need to verify support for our topology. We can do this using the PowerShell command Test-SRTopology; it generates an HTML report that shows whether we have a supported topology. We can run the cmdlet in a requirements-only mode for a quick test or in a long-running performance-evaluation mode. It is also important to generate some IO against the source volume while the test is running to get more detailed information about the benchmark.
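If no real workload is available during the evaluation, some write activity can be produced artificially. The following is a hypothetical load generator, not part of Storage Replica: it writes files of random data to the source data volume while the test runs. The folder F:\SRTestLoad, the 16 MB chunk size, and the file count are arbitrary assumptions (about 1.6 GB of free space is needed); a dedicated tool such as Diskspd would produce a more realistic workload.

```powershell
# Hypothetical load generator: run on the source server in a second PowerShell
# session while Test-SRTopology is running.
$targetFolder = 'F:\SRTestLoad'          # assumption: F: is the source data volume
$chunk        = New-Object byte[] (16MB) # 16 MB buffer, refilled with random data per file
$random       = New-Object System.Random

New-Item -Path $targetFolder -ItemType Directory -Force | Out-Null

foreach ($i in 1..100) {
    $random.NextBytes($chunk)
    [System.IO.File]::WriteAllBytes((Join-Path $targetFolder "load-$i.bin"), $chunk)
    Start-Sleep -Milliseconds 200        # short pause so the writes are spread over time
}

# Clean up the test files afterwards.
Remove-Item -Path $targetFolder -Recurse -Force
```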
Open PowerShell and make sure that the Storage Replica module is present.
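One way to do that check, as a small sketch: list the module and the cmdlets it exports. If the first command returns nothing, the feature or its management tools are not installed on this server.

```powershell
# Confirm that the StorageReplica module is installed on this server.
Get-Module -ListAvailable -Name StorageReplica

# List the cmdlets the module provides (Test-SRTopology, New-SRPartnership, and so on).
Get-Command -Module StorageReplica
```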
Then we need to test our Storage Replica topology.
Test-SRTopology -SourceComputerName NTX-SR01 -SourceVolumeName f: -SourceLogVolumeName g: -DestinationComputerName NTX-SR02 -DestinationVolumeName f: -DestinationLogVolumeName g: -DurationInMinutes 30 -ResultPath c:\temp
NOTE: If you are using non-US regional settings on the Windows Server 2016 TP5, the Test-SRTopology cmdlet might fail when generating the report, giving you the following error message:
WARNING: Plotting chart from file c:\temp\SRDestinationDataVolumeBytesPerSec.csv failed.
In that case, you need to switch the regional settings to US, reboot, and rerun the cmdlet.
After running the command, PowerShell will generate an HTML report, which will list whether the environment meets all the requirements.
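For the quick requirements-only run mentioned earlier, the sketch below skips the performance measurement and then opens the newest report. It assumes that the -IgnorePerfTests switch is available in your build (verify with Get-Help Test-SRTopology -Full) and that the report is written to c:\temp as an .html file, as in the example above.

```powershell
# Requirements-only check. Assumption: -IgnorePerfTests exists in this build;
# verify with: Get-Help Test-SRTopology -Full
Test-SRTopology -SourceComputerName NTX-SR01 -SourceVolumeName f: -SourceLogVolumeName g: `
    -DestinationComputerName NTX-SR02 -DestinationVolumeName f: -DestinationLogVolumeName g: `
    -DurationInMinutes 1 -ResultPath c:\temp -IgnorePerfTests

# Open the most recently written HTML report in the default browser.
Get-ChildItem -Path c:\temp -Filter *.html |
    Sort-Object -Property LastWriteTime -Descending |
    Select-Object -First 1 |
    Invoke-Item
```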
After we successfully run the cmdlet, we can start setting up our replication configuration.
New-SRPartnership -SourceComputerName NTX-SR01 -SourceRGName SR01 -SourceVolumeName e: -SourceLogVolumeName f: -DestinationComputerName NTX-SR02 -DestinationRGName SR02 -DestinationVolumeName e: -DestinationLogVolumeName f:
We can now run Get-SRGroup to see the configuration's properties. By default, the partnership is set up with synchronous replication and a log size of 8 GB. You can change it to asynchronous by running Set-SRPartnership with the -ReplicationMode Asynchronous parameter.
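A sketch of inspecting the partnership and switching the replication mode could look like this. It reuses the server and replication group names from the New-SRPartnership command above; the selected property names are what Get-SRGroup returns to my knowledge, so adjust them if your output differs.

```powershell
# Show the replication group on the local (source) server and the partnership.
Get-SRGroup
Get-SRPartnership

# Switch the existing partnership to asynchronous replication.
# The names match the New-SRPartnership example above.
Set-SRPartnership -SourceComputerName NTX-SR01 -SourceRGName SR01 `
    -DestinationComputerName NTX-SR02 -DestinationRGName SR02 `
    -ReplicationMode Asynchronous

# Confirm the change.
Get-SRGroup | Select-Object Name, ReplicationMode, LogSizeInBytes
```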
If we open File Explorer on the destination machine, we will also notice that the E:\ drive is inaccessible and that the log file is stored on the F:\ drive.
When we start to write data to the E:\ drive on the source computer, it will replicate block by block to the destination computer. The easiest way to follow the progress is Performance Monitor, since Storage Replica includes a set of built-in counters.
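The same information is reachable from PowerShell. The sketch below first lists the Storage Replica counter sets (the wildcard avoids assuming exact set names) and then polls the destination replication group for the number of bytes still to be copied; NTX-SR02 is the destination server from the example above, and the five-second interval is arbitrary.

```powershell
# List the performance counter sets that Storage Replica registers.
Get-Counter -ListSet 'Storage Replica*' | Select-Object CounterSetName, Description

# Poll the destination server until the initial block copy has finished.
do {
    $replica = (Get-SRGroup -ComputerName NTX-SR02).Replicas |
        Select-Object DataVolume, ReplicationStatus, NumOfBytesRemaining
    $replica | Format-Table -AutoSize
    Start-Sleep -Seconds 5
} while (($replica | Measure-Object -Property NumOfBytesRemaining -Sum).Sum -gt 0)
```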
In upcoming posts, we will take a closer look at more advanced configuration of Storage Replica using delegated access, sizing, and network configuration; we will also look at how to configure Storage Replica in a Stretched Cluster environment.
Here is my environment:
NAS unit with 18TB of space running Storage Server 2012 R2
Hyper-V environment with a Server 2016 DataCenter VM
NAS units with 18TB of space running Storage Server 2012 R2
Hyper-V environment with a Server 2016 DataCenter VM
The sites are connected via a 100 Mbps connection. Is the best option to make the NAS units iSCSI servers and present a volume to each of the Server 2016 VMs? It doesn't appear that you can use a CIFS share with Storage Replica?
I successfully established cluster-to-cluster synchronous Storage Replica. At creation time everything looked OK, but when I look at the event log, I see event 10448: Storage Replica has failed an application IO.
Can you help me on this problem?
Imagine that your source machine fails and doesn't come back online. How can you make the destination available again? I cannot remove the partnership because it cannot find the replication group (because the source machine is gone), and the disk is locked on the destination.