If you are a VMware administrator, you are probably familiar with ESXi host maintenance mode. We typically place an ESXi host into maintenance mode for vSphere upgrades, hardware maintenance, and troubleshooting. In a fully automated DRS cluster, vMotion automatically migrates the virtual machines on that host to other ESXi hosts in the cluster before the host enters maintenance mode.
In partially automated or manual DRS mode, we need to migrate the virtual machines to other ESXi hosts in the cluster ourselves with vMotion. This is a simple, straightforward process. Typically, the data resides on a shared datastore (for example, FC or iSCSI LUNs or an NFS export) that all ESXi hosts in the cluster can access.
It is not that simple when you place an ESXi host into maintenance mode in a vSAN cluster. A vSAN cluster is a group of ESXi hosts in which each contributing host provides at least one flash (SSD) device for caching and at least one capacity disk (a magnetic disk, in a hybrid configuration) for virtual machine data. Together, these devices form a shared, high-performance datastore (the vSAN datastore).
In a VMware vSAN cluster, all virtual machine data resides on the vSAN datastore, which is built from the local disks of the ESXi hosts in the cluster. Because virtual machine data is stored on the hosts' local disks, we need to understand the available data evacuation modes before placing a vSAN node into maintenance mode.
Pre-requisites for placing the ESXi host into maintenance mode
Before you can put an ESXi host into maintenance mode in a vSAN cluster, you have to perform the following tasks:
- Validate that there is enough flash capacity available from the remaining hosts to handle any flash read cache reservations.
- Ensure that there is enough free capacity on the remaining hosts to accommodate the amount of data that must be migrated from the host that is entering maintenance mode.
- Verify that enough hosts and storage capacities are available in the vSAN cluster to meet the configured primary level of the Failures to Tolerate (FTT) policy.
- If any stripe policy has been configured, ensure that enough capacity devices are available in the remaining hosts to handle the stripe width policy.
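The capacity and host-count checks above can be sketched as a small pre-flight function. This is only an illustration under stated assumptions: the `HostCapacity` model, the `precheck` helper, and the sample numbers are hypothetical, and the host-count rule (2 × FTT + 1) applies to RAID-1 mirroring policies. It does not query vCenter or vSAN, and it only matters when you want objects to stay compliant with their policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostCapacity:
    """Free capacity on one remaining host (hypothetical model, values in GB)."""
    name: str
    free_gb: float


def min_hosts_for_mirroring(ftt: int) -> int:
    """Hosts needed to keep a RAID-1 mirrored object compliant: 2 * FTT + 1."""
    return 2 * ftt + 1


def precheck(remaining: List[HostCapacity], data_to_move_gb: float, ftt: int) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    total_free = sum(h.free_gb for h in remaining)
    if total_free < data_to_move_gb:
        problems.append(f"only {total_free:.0f} GB free, {data_to_move_gb:.0f} GB must move")
    needed = min_hosts_for_mirroring(ftt)
    if len(remaining) < needed:
        problems.append(f"{len(remaining)} hosts remain, FTT={ftt} needs {needed}")
    return problems


# Example: a four-host cluster with one host about to enter maintenance mode.
remaining = [HostCapacity("esx02", 500), HostCapacity("esx03", 400), HostCapacity("esx04", 450)]
print(precheck(remaining, data_to_move_gb=300, ftt=1))  # prints []
```

With only two hosts remaining, the same call reports a host-count problem, which is why a three-node cluster cannot keep FTT=1 objects compliant while one node is in maintenance mode.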
Different types of vSAN data migration
vSAN data migration or evacuation options specify how vSAN will evacuate the data residing on the ESXi host before entering maintenance mode. Let's first look at the different vSAN data migration (data evacuation) modes:
- Evacuate all data to other ESXi hosts;
- Ensure data accessibility from other ESXi hosts;
- No data evacuation.
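If you script maintenance mode from the ESXi shell rather than the vSphere Client, each of the three options corresponds to a value of the `--vsanmode` option of `esxcli system maintenanceMode set`. The sketch below only builds the command line; the `VsanMode` enum and `maintenance_command` helper are hypothetical names used for illustration, and you should confirm the option values with `esxcli system maintenanceMode set --help` on your ESXi build before relying on them.

```python
from enum import Enum


class VsanMode(Enum):
    """Dialog option mapped to the value passed to esxcli's --vsanmode."""
    EVACUATE_ALL = "evacuateAllData"
    ENSURE_ACCESSIBILITY = "ensureObjectAccessibility"
    NO_EVACUATION = "noAction"


def maintenance_command(mode: VsanMode = VsanMode.ENSURE_ACCESSIBILITY) -> list:
    """Build the argv that puts the local ESXi host into maintenance mode."""
    return ["esxcli", "system", "maintenanceMode", "set",
            "--enable", "true", "--vsanmode", mode.value]


# Show the command for a full evacuation; nothing is executed here.
print(" ".join(maintenance_command(VsanMode.EVACUATE_ALL)))
```

The default argument mirrors the default selection in the dialog, "Ensure data accessibility from other hosts".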
The Confirm Maintenance Mode dialog box is self-explanatory. It displays the impact of selecting each data migration option, and it provides the information presented below:
- Whether or not sufficient capacity is available to perform the operation, as well as the amount of data that will be moved;
- Whether or not data will be moved, and how many objects will become non-compliant with their storage policy;
- The number of objects that will become inaccessible if you choose the "No data evacuation" option.
Let's discuss details about each of the vSAN data migration or data evacuation options.
Evacuate all data to other hosts
If you evacuate all data to other hosts, you have to consider the following points:
- This option is advisable if you are planning to remove the ESXi host permanently from the vSAN cluster.
- vSAN moves all data in this host to other ESXi hosts in the vSAN cluster.
- You will see the amount of data that needs to be moved and whether enough capacity is available in the vSAN cluster. Usually, a large amount of data has to be transferred.
- All VMs will have access to their storage components during the migration process.
- vSAN keeps all components compliant with their assigned storage policies, provided sufficient resources are available in the vSAN cluster.
- The ESXi host will not enter maintenance mode if a virtual machine object has data on the host that is not accessible and is not fully evacuated.
- If the resources are available and FTT is set to 1 or higher, the data is automatically re-protected against failure.
- This option cannot be used with a three-node vSAN cluster. It is recommended that vSAN clusters be designed with four or more hosts for maximum availability.
Ensure data accessibility from other hosts
This option is the default vSAN data migration option. Its main characteristics are listed below:
- This option is recommended if you want to place the ESXi host into maintenance mode for short maintenance activities, such as upgrading the ESXi host, installing patches, or short hardware maintenance activities (for instance, NIC or memory replacements). For disk replacements, individual disks or disk group(s) available in the vSAN can be evacuated instead of migrating all the data of the ESXi host. Refer to this VMware article for more information about disk replacements with vSAN hosts.
- The ESXi host enters maintenance mode faster than with the "Evacuate all data" option because vSAN migrates only the components that are essential for running the virtual machines rather than moving all the data.
- Your data will not be re-protected while the host is in maintenance mode, so the availability of the affected virtual machines is reduced.
- You might experience data loss if you encounter a failure while the host is in maintenance mode and the FTT is set to 1 for the vSAN cluster.
- You will see how many objects will become non-compliant with the assigned storage policy. This means that a virtual machine might not have access to all its replicas.
- For a three-node vSAN cluster or a vSAN cluster configured with three fault domains, this is the only data evacuation mode available.
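The data loss risk with FTT set to 1 comes down to simple arithmetic: the host in maintenance mode already uses up the one failure the policy tolerates. A minimal sketch, assuming the simplified case in which every unavailable host holds one component of the object (the function name is hypothetical):

```python
def failures_still_tolerated(ftt: int, hosts_unavailable: int) -> int:
    """Failures an object can still absorb while some of its hosts are unavailable.

    Simplified model: each unavailable host holds one component of the object.
    """
    return max(ftt - hosts_unavailable, 0)


# One host is in maintenance mode and its data was not re-protected.
for ftt in (1, 2):
    print(f"FTT={ftt}: {failures_still_tolerated(ftt, 1)} further failure(s) tolerated")
```

With FTT=1 the result is 0, so any additional host or disk failure during the maintenance window can make objects inaccessible.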
No data evacuation
With the last option, vSAN does not evacuate any data from the ESXi host. Thus, you have to consider these two points:
- If you power off or remove the ESXi host from the vSAN cluster, some of the virtual machines might become inaccessible.
- If you have to shut down the entire vSAN cluster, no data can be evacuated because no hosts remain to receive it, so this is the option to use in that case.
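The guidance for the three options can be condensed into a small decision helper. This is a rough sketch of the recommendations in this article, not a VMware tool; the function name and parameters are hypothetical, and the full cluster shutdown case reflects my reading of the last point above.

```python
def suggest_mode(host_count: int, permanent_removal: bool = False,
                 cluster_shutdown: bool = False) -> str:
    """Suggest a vSAN data evacuation option for one host entering maintenance mode."""
    if cluster_shutdown:
        # Every host goes down, so there is nowhere to evacuate data to.
        return "No data evacuation"
    if host_count <= 3:
        # Three-node clusters cannot perform a full evacuation.
        return "Ensure data accessibility from other hosts"
    if permanent_removal:
        return "Evacuate all data to other hosts"
    # Default for short maintenance such as patching or a NIC replacement.
    return "Ensure data accessibility from other hosts"


print(suggest_mode(host_count=4, permanent_removal=True))
```

Always confirm the suggestion against the impact figures shown in the Confirm Maintenance Mode dialog box before proceeding.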