vMotion can fail for many reasons during its different phases. In this article, I will cover the top 10 reasons vMotion could fail.

VMware vMotion is the zero-downtime live migration of workloads from one ESXi host to another. During the vMotion migration, the applications running on the virtual machine keep running, and end users continue to access their systems without noticing any interruption.

The virtual machine retains its network identity and connections, ensuring a seamless migration process. vMotion thus avoids downtime for applications while you perform maintenance on the underlying ESXi hardware and software. vMotion of a virtual machine is not only possible within an ESXi cluster but also across clusters, vSwitches, vCenter Servers, and even to the cloud.

vMotion phases ^

The steps to vMotion a virtual machine from one ESXi host to another are:

  1. vMotion request is sent to the vCenter Server
  2. vCenter Server sends the vMotion request to the destination ESXi host
  3. vCenter Server computes the specifications of the virtual machine to migrate
  4. vCenter Server sends the vMotion request to the source ESXi host to prepare the virtual machine for migration
  5. vCenter Server initiates the destination virtual machine
  6. vCenter Server initiates the source virtual machine
  7. vCenter Server switches the virtual machine's ESXi host from the source to the destination
  8. vCenter Server completes the vMotion task

Any of these steps is prone to failure. Let's have a look at the top 10 reasons for vMotion failures.

1. CPU incompatibility ^

VMware vMotion transfers the running state of a virtual machine between underlying VMware ESXi servers. vMotion compatibility requires that the processors of the target ESXi host be able to resume execution using instructions equivalent to what the processors of the source ESXi host were using when suspended. Processor clock speeds, cache sizes, and the number of processor cores may vary, but processors must come from the same vendor class (Intel or AMD) and the same processor family to be compatible for migration with vMotion.

When scaling a cluster over time, it is often not possible to buy the same CPU generation because of technology advances and new features. In that case, you might run into this error message:

The target host does not support the virtual machine's current hardware requirements.
Use a cluster with Enhanced vMotion Compatibility (EVC) enabled to create a uniform set of CPU features across the cluster, or use per-VM EVC for a consistent set of CPU features for a virtual machine and allow the virtual machine to be moved to a host capable of supporting that set of CPU features. See KB article 1003212 for cluster EVC information.

In this case, we can make use of the vSphere EVC (Enhanced vMotion Compatibility) feature. vSphere EVC ensures that virtual machines can be migrated live with vMotion between ESXi hosts running different CPU generations in a cluster. EVC provides uniform vMotion compatibility by enforcing a CPUID (instruction set) baseline for the virtual machines running on the ESXi hosts. That means EVC will only expose the CPU instruction sets to the virtual machines that match the chosen and supported compatibility level. EVC is a cluster-level feature. With the release of vSphere 6.7, we now have per-VM EVC to add more flexibility.
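
If you manage EVC with PowerCLI, a minimal sketch like the following can check the current EVC mode per cluster and the maximum mode each host supports; the vCenter, cluster, and baseline names are placeholders for your environment.

# Check the current EVC mode of each cluster and the maximum EVC mode
# each host supports.
Connect-VIServer -Server vcsa.lab.local        # hypothetical vCenter name

Get-Cluster | Select-Object Name, EVCMode
Get-VMHost  | Select-Object Name, MaxEVCMode

# Enable a baseline that every host in the cluster supports; EVC mode keys
# look like "intel-broadwell" or "amd-zen2".
Get-Cluster -Name 'Prod-Cluster' | Set-Cluster -EVCMode 'intel-broadwell' -Confirm:$false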

2. vMotion not enabled on VMkernel Interfaces ^

One of the prerequisites for vMotion is to have at least one VMkernel interface configured and enabled for vMotion. This vMotion network is used to perform the data transfer securely during vMotion operations. In many scenarios, vMotion fails simply because vMotion is not enabled on the VMkernel interface of the source or target ESXi host.

vMotion not enabled on VMkernel interface

It is very important to have both the source and the destination host configured with vMotion VMkernel interfaces with unique IP addresses.
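
A minimal PowerCLI sketch to verify and fix this; the host and adapter names ('esxi01.lab.local' and 'vmk1') are placeholders:

# List all VMkernel adapters and whether vMotion is enabled on them.
Get-VMHost | Get-VMHostNetworkAdapter -VMKernel |
    Select-Object VMHost, Name, IP, VMotionEnabled

# Enable vMotion on a specific VMkernel adapter.
$vmk = Get-VMHostNetworkAdapter -VMHost 'esxi01.lab.local' -Name 'vmk1'
Set-VMHostNetworkAdapter -VirtualNic $vmk -VMotionEnabled $true -Confirm:$false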

3. Misconfigured ESXi network settings ^

Many network-level misconfigurations on the ESXi hosts may also cause vMotion failures. During vMotion, the source host transfers the memory pages of the virtual machine to the destination host. If the destination host does not receive any data from the source host for a default period of 120 seconds, vMotion fails.

Some common settings are:

  • Misconfigured Jumbo Frames

It is very important to have the same MTU settings configured between the ESXi hosts and across the network layer (port groups, virtual switches, and physical switches). Mismatched jumbo frame settings will cause vMotion to fail. A quick way to verify this is shown in the sketch after this list.

  • IP Conflict for vMotion interface

A unique IP address must be configured for the vMotion VMkernel interface of each ESXi host. If two hosts share the same vMotion VMkernel interface IP address, the destination host refuses the source's initial handshake message, which suggests that you are not connecting to the correct destination host over the vMotion network. Normally, this is due to an IP address conflict within the vMotion network, with two hosts sharing the destination's IP address.

  • vSwitch security settings

It is important to have the same security settings configured across the ESXi hosts if you are using standard vSwitches. You can use vSphere Distributed Switches to maintain network consistency across the connected ESXi hosts.

  • Packet loss and Network Latency

Packet loss and network latency are major concerns when it comes to networking. They may be caused by the network adapter driver or firmware, or by other factors such as a cabling fault, a faulty SFP, or even the physical switch. Heavy packet loss on the vMotion VMkernel interface may cause vMotion to fail or to time out.
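
Here is a minimal PowerCLI sketch that checks MTU consistency and tests the vMotion path with jumbo-sized, don't-fragment pings (the esxcli equivalent of vmkping -I vmk1 -d -s 8972 <destination>); the host name, interface, and IP address are placeholders:

# Compare MTU settings of all VMkernel adapters across hosts.
Get-VMHost | Get-VMHostNetworkAdapter -VMKernel |
    Select-Object VMHost, Name, IP, Mtu, VMotionEnabled

# Send jumbo-sized pings with the don't-fragment bit across the vMotion network.
$esxcli = Get-EsxCli -VMHost 'esxi01.lab.local' -V2
$ping = $esxcli.network.diag.ping.CreateArgs()
$ping.host      = '192.168.50.12'   # destination host's vMotion IP (assumed)
$ping.interface = 'vmk1'            # source vMotion VMkernel adapter (assumed)
$ping.size      = 8972              # 9000-byte MTU minus 28 bytes of headers
$ping.df        = $true             # fail instead of fragmenting
$esxcli.network.diag.ping.Invoke($ping).Summary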

4. Non-Shared storage between hosts in the cluster ^

Shared storage was a hard requirement for vMotion prior to vSphere 5.1. During vMotion, the migrated virtual machine must reside on storage that is accessible to both the source and target ESXi hosts. This is especially important when DRS migrates VMs automatically within the cluster.

vMotion between non-shared storage

Since vSphere 5.1, we can migrate virtual machines with vMotion even without shared storage. In that case, we must select the option "Change both compute resource and storage." This option performs both a vMotion and a Storage vMotion. The migration takes longer than a plain vMotion because the VM's data must be migrated as well.

Migrate both compute and storage
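
In PowerCLI, the equivalent of "Change both compute resource and storage" is a single Move-VM call with both a destination host and a datastore; a minimal sketch with placeholder names:

# Migrate compute and storage in one operation (combined vMotion +
# Storage vMotion, no shared storage required).
$vm   = Get-VM -Name 'app01'
$dest = Get-VMHost -Name 'esxi02.lab.local'
$ds   = Get-Datastore -Name 'local-ds-esxi02'

Move-VM -VM $vm -Destination $dest -Datastore $ds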

5. Inaccessible CD/DVD or ISO image ^

If a virtual machine has a mounted ISO image residing on storage that is not accessible to the ESXi host you want to migrate the VM to, vMotion will fail. Detach the ISO image and change the CD/DVD device to be a client device.

CD/DVD drive connected to host
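
A minimal PowerCLI sketch to find VMs with a mounted ISO and detach it; the VM name is a placeholder:

# Find VMs that still have an ISO image mounted.
Get-VM | Get-CDDrive |
    Where-Object { $_.IsoPath } |
    Select-Object Parent, IsoPath

# Detach the ISO so the CD/DVD device no longer blocks vMotion.
Get-VM -Name 'app01' | Get-CDDrive | Set-CDDrive -NoMedia -Confirm:$false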

6. Anti-Affinity rules ^

Affinity and anti-affinity rules are DRS rules. While affinity rules keep a group of virtual machines together, anti-affinity rules do the opposite: you can create an anti-affinity rule to spread a specific group of VMs across multiple hosts in the cluster so that each VM in the group runs on a different ESXi host. This improves redundancy.

If an anti-affinity rule separates a group of virtual machines, DRS will not migrate one of these VMs to an ESXi host that already runs another VM from the same anti-affinity group. You typically notice this when placing an ESXi host into maintenance mode; a manually initiated vMotion will usually still work.

The workaround is to temporarily disable the anti-affinity rule that is blocking the vMotion and then migrate the virtual machine, for example, when you are planning ESXi maintenance activities in the cluster.
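A minimal PowerCLI sketch for that workaround, assuming placeholder cluster and rule names ('Prod-Cluster', 'Separate-DB-Nodes'):

$cluster = Get-Cluster -Name 'Prod-Cluster'

# KeepTogether = $false marks an anti-affinity ("separate VMs") rule.
Get-DrsRule -Cluster $cluster |
    Where-Object { -not $_.KeepTogether } |
    Select-Object Name, Enabled

# Disable the rule for the maintenance window, then re-enable it afterwards.
$rule = Get-DrsRule -Cluster $cluster -Name 'Separate-DB-Nodes'
Set-DrsRule -Rule $rule -Enabled $false
# ... migrate the VM / perform the maintenance ...
Set-DrsRule -Rule $rule -Enabled $true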

7. Resource starvation at target ESXi host ^

Imagine your target host is already consuming more than 95% of its CPU or memory. This prevents correct operation of the ESXi host. In another case, consider a virtual machine configured with a memory reservation. The memory reservation is the amount of memory guaranteed to be available to the virtual machine. If the target host does not have enough memory to satisfy the reservation of the virtual machine, vMotion will fail.

Target host does not have sufficient memory

To fix this, migrate the virtual machine to another ESXi host that can provide the guaranteed memory for the VM, or reduce the memory reservation of the virtual machine.
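
A minimal PowerCLI sketch for inspecting and reducing the reservation; the VM name is a placeholder, and a value of 0 removes the reservation entirely:

# Show the VM's current memory reservation.
Get-VM -Name 'app01' | Get-VMResourceConfiguration |
    Select-Object VM, MemReservationGB

# Reduce (here: remove) the memory reservation.
Get-VM -Name 'app01' | Get-VMResourceConfiguration |
    Set-VMResourceConfiguration -MemReservationGB 0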

8. vCenter Server is hung ^

The vCenter Server is a key component in vMotion. Without the vCenter Server, vMotion is not possible. A vMotion request is first sent to the vCenter Server, which then coordinates between the source and target ESXi hosts to complete the vMotion. If the vCenter Server is hung or not responding, vMotion might fail. Consider restarting the vCenter services (on the vCenter Server Appliance, for example, with the service-control utility) or rebooting the vCenter Server to fix the issue.

Restart vCenter Server Appliance services

9. VMware Tools installation is in progress ^

If a VMware Tools installation is in progress for the virtual machine, we will not be able to migrate it, as vMotion will show this error: "The virtual machine is installing VMware Tools and cannot initiate a migration operation." This may not be the case every time, because there are instances where the VMware Tools installation has completed, but we have forgotten to unmount the VMware Tools ISO from the virtual machine, which may be preventing the vMotion.

Virtual machine installing VMware Tools

We must unmount the VMware Tools ISO image from the virtual machine before we can migrate it with vMotion.
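
A minimal PowerCLI sketch to find and unmount a leftover Tools installer ISO; the VM name is a placeholder:

# Find VMs that still have the VMware Tools installer ISO mounted.
Get-VM | Where-Object { $_.ExtensionData.Runtime.ToolsInstallerMounted } |
    Select-Object Name

# Unmount the installer ISO from a specific VM.
Dismount-Tools -VM (Get-VM -Name 'app01')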

10. Check ESXi advanced settings ^

In some instances, the advanced setting "Migrate.Enabled" is set to 0 (disabled) by backup software to ensure the completion of backup jobs. Usually, the backup software reverts the setting once the backup jobs are completed, but it may leave the setting unchanged after a host or network outage. Ensure the value of "Migrate.Enabled" is always set to 1 (enabled) for vMotion to work.

Migrate.Enabled advanced setting
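
A minimal PowerCLI sketch that checks the setting on all hosts and re-enables it where needed; the host name is a placeholder:

# List the Migrate.Enabled value on every host.
Get-AdvancedSetting -Entity (Get-VMHost) -Name 'Migrate.Enabled' |
    Select-Object Entity, Value

# Set it back to 1 on a host where a backup product left it disabled.
Get-AdvancedSetting -Entity (Get-VMHost -Name 'esxi01.lab.local') -Name 'Migrate.Enabled' |
    Set-AdvancedSetting -Value 1 -Confirm:$false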
