If you're a VMware vSAN user or planning to use vSAN in your cluster, you probably know that vSAN, as distributed storage, stores chunks of data on disks within one or more disk groups on each host. Each disk group comprises one cache device and one or more capacity devices.
As spinning media are phased out, both tiers are now usually flash: high-performance, high-endurance NVMe devices for the caching tier and SATA/SAS flash devices for the capacity tier.
The data in vSAN storage is spread across all of the cluster's ESXi hosts that participate in vSAN storage. (Note that a host can be part of the vSAN cluster as a compute node without contributing any storage.)
The initial placement of data is directed by the cluster-level object manager. Over time, however, you might see Skyline Health reporting disks with high space usage, and the Skyline health alarm triggers (yellow).
When you look at the disks, you'll see that some are heavily used and some are not. You may need to run Proactive/Automatic Rebalance and let the system redistribute the data across disks so that space usage is equalized. We'll look at how this is done in the latest vSphere 7.0 Update 2.
VMware has two major rebalancing functions
vSAN Reactive Rebalancing: This happens when vSAN detects over 80% usage on any storage device. The system tries to move some of the data to other disks and disk groups to free capacity.
vSAN tracks which disks are underutilized and which are overutilized. This reactive rebalancing is completely automatic. VMware calls it "Capacity Constrained Rebalancing." From the name, you can see that the system simply wants to keep the capacity on all storage devices at the same level.
I should mention that reactive rebalancing needs at least some disks to be below 80% usage, because the data has to move somewhere. If all disks are over 80%, the function won't trigger.
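The reactive rule can be illustrated with a small sketch. This is not vSAN code or a VMware API; the function name and the input format (a dictionary of disk name to used fraction) are assumptions made for illustration. Only the 80% figure comes from the text above.

```python
# Illustrative model only; names and data layout are hypothetical.
REACTIVE_THRESHOLD = 0.80  # used fraction that triggers reactive rebalancing


def reactive_rebalance_plan(disk_usage):
    """Return (sources, targets) for a reactive rebalance, or None.

    disk_usage maps a disk name to its used fraction (0.0 to 1.0).
    Sources are disks above the threshold; targets are disks below it.
    """
    sources = sorted(d for d, u in disk_usage.items() if u > REACTIVE_THRESHOLD)
    targets = sorted(d for d, u in disk_usage.items() if u < REACTIVE_THRESHOLD)
    # Nothing to do if no disk is over the threshold, and nowhere to move
    # data if every disk is already over it.
    if not sources or not targets:
        return None
    return sources, targets


if __name__ == "__main__":
    print(reactive_rebalance_plan({"disk-a": 0.85, "disk-b": 0.40}))
    print(reactive_rebalance_plan({"disk-a": 0.85, "disk-b": 0.90}))
```

The second call returns None because there is no disk below 80% to receive data, which matches the limitation described above.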
vSAN Proactive Rebalancing: This happens when vSAN detects that a storage device is consuming far more capacity than other disks. By default, its usage has to be roughly 30% higher than that of the other disks. VMware calls this "Capacity Symmetry Rebalancing."
Note: Older vSAN releases (vSphere 6.7 U2 and earlier) had a manual Proactive Rebalance option, which could be triggered through the vSAN Health Plugin on the vCenter Server or via Ruby vCenter Console (RVC).
vSphere 6.7 U3 and higher do not need a manual rebalance operation. All rebalance operations are completely automatic, based on the threshold value stated above. The default threshold is 30%.
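As a rough illustration of the 30% rule, the sketch below compares the most-used and least-used disk. It is a simplified model, not how vSAN computes the value internally; the function names and the percentage-based input are assumptions.

```python
# Illustrative model only; names and data layout are hypothetical.
DEFAULT_THRESHOLD_PCT = 30  # default rebalancing threshold from the article


def usage_variance(disk_usage_pct):
    """Difference in percentage points between the fullest and emptiest disk."""
    if len(disk_usage_pct) < 2:
        return 0  # a single disk has nothing to be compared against
    values = disk_usage_pct.values()
    return max(values) - min(values)


def needs_proactive_rebalance(disk_usage_pct, threshold_pct=DEFAULT_THRESHOLD_PCT):
    """True when the usage gap between two disks exceeds the threshold."""
    return usage_variance(disk_usage_pct) > threshold_pct


if __name__ == "__main__":
    cluster = {"disk-a": 75, "disk-b": 40, "disk-c": 50}
    print(usage_variance(cluster), needs_proactive_rebalance(cluster))
```

With the sample values, the gap between disk-a (75%) and disk-b (40%) is 35 points, which is above the default threshold.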
Service responsible for Automatic Rebalance
There is a vSAN service called Automatic Rebalance, which must be running. By default, it is not running, so you must enable it.
You can check whether the service is active by connecting via the vSphere Client and selecting the vSAN cluster > Configure tab > vSAN > Services > Edit Advanced Options, where you can enable or disable Automatic Rebalance.
In the same dialog, you can also change the rebalancing threshold, which is the difference in capacity usage between two storage devices that triggers a rebalance.
Rebalancing continues until the difference drops to half of the threshold (15% by default) and then stops. If you disable the service while a rebalance is in progress, the rebalance stops immediately.
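The start and stop behavior amounts to a simple hysteresis: start above the threshold, stop at half of it. The sketch below models only that decision. It is an illustration under assumed names, not vSAN's actual implementation.

```python
# Illustrative model only; function and parameter names are hypothetical.
def rebalance_should_run(currently_rebalancing, variance_pct,
                         threshold_pct=30, service_enabled=True):
    """Decide whether rebalancing should be running after this check.

    variance_pct is the usage difference between two disks in
    percentage points.
    """
    if not service_enabled:
        # Disabling the service stops a running rebalance immediately.
        return False
    if currently_rebalancing:
        # Keep going until the difference falls to half the threshold.
        return variance_pct > threshold_pct / 2
    # Otherwise start only when the difference exceeds the threshold.
    return variance_pct > threshold_pct


if __name__ == "__main__":
    print(rebalance_should_run(False, 35))  # idle, 35 points apart
    print(rebalance_should_run(True, 20))   # running, 20 points apart
    print(rebalance_should_run(True, 15))   # running, 15 points apart
```

The gap between the start and stop values keeps the service from repeatedly starting and stopping when the difference hovers around 30%.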
Where else can I see the state of my vSAN disks' health?
You can do a vSAN Disk Balance health check, which enables you to see disk usage details on your cluster.
Normally, the health check should show a green status indicator if Automatic Rebalance is on. vSAN tries to keep this health check green, but if, for some reason, the color is yellow or red, you should start investigating. VMware recommends keeping the rebalancing threshold at its default of 30%.
Note that, by default, the Automatic Rebalance is disabled. This means that the status of this health check shows you a yellow color if the imbalance exceeds a system-determined threshold. When Automatic Rebalance is enabled, vSAN is able to automatically rebalance the data and keep the status green.
One interesting thing is that the rebalance can wait up to 30 minutes before it starts because other tasks take priority. vSAN Rebalance is aware of object repairs and other higher-priority operations that may be running and lets them go first.
A screenshot from the lab shows how to get a view of the vSAN Disk Balance and its color. Note the Configure Automatic Rebalance link above. This allows you to get to the pop-up window, as shown in the screenshot above.
Final thoughts
I think that Automatic and Proactive Rebalance are good additions to VMware vSAN, but as an admin, you must activate the service, as it's disabled by default. Once done, you can benefit from automatic balancing of your disks' capacity, and you won't have unbalanced clusters in terms of storage.
VMware updates vSAN and vSphere with every release. The screenshots in this post were taken from the latest vSphere 7.0 U2. I believe that vSAN will bring more granular options, which might give you hands-on control over resource utilization or network throughput. But those are just my thoughts. We'll see what VMware brings us next.