We saw that the capacity and deduplication numbers give us an overview of the space utilization, deduplication ratio, and savings by deduplication and compression. Here, I’ll explain how deduplication and compression work within a VSAN environment.
Each time a duplicate copy of data is stored, space is wasted. These blocks of data should only be stored once to ensure data is stored efficiently.
The blocks of data stay in the cache tier when they are being accessed on a regular basis. Once those blocks stop being accessed, the deduplication engine checks to see if the block of data that is in the cache tier has already been stored in the capacity tier. If so, the engine doesn’t store the block twice. Only unique chunks of data are stored in the capacity tier.
Deduplication and compression happen only when data is destaged from the cache tier to the capacity tier, not in the write path of the virtual machine, which keeps the performance overhead small.
To track those blocks of data, a technique called hashing is used. Hashing is the process of creating a short, fixed-length data string from a large block of data. The hash identifies the data chunk and is used in the deduplication process to determine if the chunk has been stored before.
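The hashing idea can be illustrated with a minimal Python sketch. It is not how VSAN is implemented internally; the 4 KB block size, the SHA-1 hash, and the names `dedupe` and `rebuild` are assumptions chosen for illustration. A dictionary stands in for the capacity tier and keeps one copy of each unique block.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch


def dedupe(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one copy per unique hash."""
    store = {}  # hash -> block contents (stands in for the capacity tier)
    refs = []   # ordered hashes needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha1(block).hexdigest()
        if digest not in store:
            # First time this content is seen: store it once.
            store[digest] = block
        refs.append(digest)
    return store, refs


def rebuild(store, refs) -> bytes:
    """Reassemble the original data from the stored unique blocks."""
    return b"".join(store[digest] for digest in refs)


# Three identical blocks plus one different block.
sample = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
store, refs = dedupe(sample)
print(len(refs), "logical blocks,", len(store), "unique blocks stored")
```

In this example, four logical blocks are written but only two are stored, because three of them hash to the same value.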
As data in the cache tier becomes colder, it is destaged, and if a block is unique, it is compressed with the very efficient LZ4 algorithm. This is near-line compression rather than in-line compression, which would take place in RAM or in the cache tier. Only data moving from the write buffer to the capacity tier is touched. Data that is still “hot” isn’t compressed because it is still being modified. It is important to note that data is compressed after deduplication.
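The "compress after deduplication" step can be sketched as follows. This is an illustration only: `zlib` from the Python standard library stands in for the actual compression algorithm, and the rule of keeping the compressed form only when a 4 KB block shrinks to 2 KB or less is an assumption of this sketch. The function name `destage_block` is hypothetical.

```python
import os
import zlib

BLOCK_SIZE = 4096
COMPRESS_THRESHOLD = 2048  # assumed: keep compressed form only if <= 2 KB


def destage_block(block: bytes, threshold: int = COMPRESS_THRESHOLD):
    """Return (stored_bytes, is_compressed) for one unique block.

    zlib is a stand-in compressor for this sketch.
    """
    compressed = zlib.compress(block)
    if len(compressed) <= threshold:
        return compressed, True
    # Poorly compressible data is stored as-is.
    return block, False


# Repetitive data compresses well; random data does not.
text_block = (b"vsan " * 1000)[:BLOCK_SIZE]
random_block = os.urandom(BLOCK_SIZE)

for name, block in (("text", text_block), ("random", random_block)):
    stored, is_compressed = destage_block(block)
    print(name, "compressed" if is_compressed else "uncompressed", len(stored))
```

The repetitive block is stored compressed, while the random block is written unchanged because compressing it would not save enough space.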
VMware published a good diagram that explains VSAN deduplication and compression. As you can see in the image below, deduplication and compression work on disk groups.
Capacity monitoring
Capacity information for a VSAN environment is available in several places. For a quick look, log in to the vSphere Web Client and go to the datastore view.
A more detailed view with all the VSAN components is available when you select VSAN Cluster > Monitor > Virtual SAN > Capacity View.
This screen gives you both a capacity overview and a deduplication and compression overview (if activated).
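The relationship between the deduplication and compression figures can be reproduced with simple arithmetic. The sketch below assumes two inputs, the space used before and after deduplication and compression; the function name `dedup_summary` and the sample values are hypothetical and are not taken from a real cluster.

```python
def dedup_summary(used_before_gb: float, used_after_gb: float):
    """Return (ratio, savings_gb) for deduplication and compression."""
    if used_after_gb <= 0:
        raise ValueError("used_after_gb must be positive")
    ratio = used_before_gb / used_after_gb        # e.g. 3.0 means 3:1
    savings_gb = used_before_gb - used_after_gb   # space not consumed
    return ratio, savings_gb


# Hypothetical values: 1200 GB of logical data stored in 400 GB.
ratio, savings = dedup_summary(1200.0, 400.0)
print(f"ratio {ratio:.2f}x, savings {savings:.0f} GB")
```

With these sample numbers, the ratio is 3x and the savings amount to 800 GB.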
You can get a more granular view of VSAN datastore consumption in the Used capacity breakdown section. As you can see in the screenshot below, there are VMDKs, VM home namespaces, and swap objects for virtual machines. There are also performance management objects when the performance service is enabled. File system overhead is the space consumed by the on-disk format file system, and checksum overhead is the space used to store checksums.
Capacity of individual disks
You can view details of the capacity of individual disks within the VSAN cluster. For this, select VSAN Cluster > Monitor > Virtual SAN > Physical Disks.
VMware VSAN performance monitoring
Performance monitoring of a VSAN cluster is easily accessible through the vSphere Web Client. Log in to the vSphere Web Client and select VSAN Cluster > Monitor > Performance > Virtual SAN.
There you can choose from two different performance monitoring options:
- Virtual SAN – Virtual Machine Consumption (shows cluster metrics from the perspective of virtual machine consumption)
- Virtual SAN – Backend (shows cluster metrics from the perspective of the Virtual SAN backend)
VMware VSAN performance monitoring – VM consumption
You can see there are different metrics:
- IOPS – IO operations per second (IOPS) consumed by all VSAN clients, like VMs and stats objects
- Throughput – shows throughput consumed by all VSAN clients, like VMs and stats objects
- Latency – average latency of IOs generated by all VSAN clients
- Congestions – congestions of IOs generated by VSAN clients
- Outstanding IO – outstanding IO from all VSAN clients in the cluster. The outstanding IO value is determined when an application requests a certain IO to be performed (read or write). These commands are sent to storage devices, and until the commands finish executing, they are considered outstanding IO commands.
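When reading these charts together, a useful rule of thumb is Little's Law: in a steady state, the average number of outstanding IOs is roughly IOPS multiplied by average latency. The sketch below is a generic queueing estimate, not something VSAN reports or computes this way; the function name and the sample values are hypothetical.

```python
def expected_outstanding_io(iops: float, latency_ms: float) -> float:
    """Estimate average outstanding IO with Little's Law (L = lambda * W)."""
    if iops < 0 or latency_ms < 0:
        raise ValueError("iops and latency_ms must be non-negative")
    return iops * (latency_ms / 1000.0)  # convert latency to seconds


# Hypothetical cluster: 20,000 IOPS at 2 ms average latency.
estimate = expected_outstanding_io(20_000, 2.0)
print(f"about {estimate:.1f} outstanding IOs")
```

If the Outstanding IO chart shows values far above such an estimate while IOPS stay flat, latency is probably climbing, so the Latency and Congestions charts are worth checking next.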
The other view, Virtual SAN – Backend, is used to show cluster metrics from the perspective of the Virtual SAN backend. You can find the same values as in the first option but from the VSAN cluster perspective instead.
Then, at the individual ESXi level, you can show the disk group performance. ESXi host views are performance views into disk groups and disk devices. They can be found by selecting the host object, then Monitor > Performance > Virtual SAN – Disk Group, as shown below:
You can also monitor VM performance. In order to obtain views related to virtual machines, select VM > Monitor > Performance and the appropriate view. Below is the virtual disk view:
Performance and capacity monitoring in VMware VSAN 6.2 are good enough to give you an overview of various crucial performance parameters. You can dive deep into individual hosts, disk groups, or VMs if you have to track a particular performance issue. Previous versions of VSAN only offered rudimentary monitoring without detailed views, and you often had to work with third-party tools to monitor your VSAN environment.