One important factor in performance tuning is the number of applications and services that run on a particular VM. Think about a desktop PC where you see a user running 100 tabs on a Chrome browser, complaining that their machine is slow. We all know that Chrome is memory hungry, so most likely the machine lacks crucial physical memory in this example.
The rule of thumb for achieving the best performance in a VM is to strip the applications and services running in it to the bare minimum, because vCPU and memory consumption are most often the first two limiting factors. Do this before adding vCPUs to the VM, and then check whether application performance improves. I can recommend a free tool from VMware called the VMware OS Optimization Tool (OSOT), which is designed to help with guest OS optimization.
You can disable services, change registry keys, and adjust many other options while still being able to revert to the defaults. OSOT uses templates, VMware recommendations, and best practices. In addition to desktop operating systems, such as Windows 7 through Windows 10, the tool also supports server operating systems, such as Windows Server 2008 through Windows Server 2019.
You can copy existing templates, save them as your own, and personalize their content so you only keep the optimizations that you really need.
Other VM performance improvements
Other than the guest OS improvements, we can also improve the way each VM consumes the overall host resources, such as CPU and memory.
In fact, we can reduce the virtual hardware configured on (or attached to) each VM to the minimum. Each of these steps might reduce physical CPU consumption only a little, but applied at scale, they save a lot of CPU cycles.
VMs often waste cycles checking whether a CD-ROM has been inserted. To prevent this, disable the CD-ROM drive on each VM. To configure this, select VM > Actions > Edit Settings. Then expand the CD-ROM drive and clear the Connected and Connect At Power On checkboxes.
Check CPU overload and CPU limit
If you suspect that your ESXi host is CPU-overloaded, connect via an SSH client such as PuTTY and run the esxtop command to check the %RDY (CPU ready) field. It shows the percentage of time the VM was ready to run but could not be scheduled on a physical CPU. This value should stay under 5% under normal workloads.
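The 5% check can be sketched in a few lines of Python. This is only an illustration: the VM names and numbers are made up, and the helper names (`ready_percent`, `vms_over_threshold`) are hypothetical. The conversion assumes you have the accumulated ready time in milliseconds for a known sampling interval; if you read a percentage straight from esxtop, you only need the threshold check.

```python
READY_THRESHOLD_PCT = 5.0  # threshold quoted in the article


def ready_percent(ready_ms: float, interval_s: float) -> float:
    """Convert ready time (ms) accumulated over a sampling interval (s) into a percentage."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return ready_ms / (interval_s * 1000.0) * 100.0


def vms_over_threshold(samples: dict, threshold: float = READY_THRESHOLD_PCT) -> list:
    """Return (vm_name, ready_pct) pairs above the threshold, worst first."""
    flagged = [(name, pct) for name, pct in samples.items() if pct > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real host.
    samples = {
        "web01": 1.2,
        "db01": 8.5,
        "app01": ready_percent(ready_ms=1400, interval_s=20),
    }
    for name, pct in vms_over_threshold(samples):
        print(f"{name}: {pct:.1f}% ready")
```

Sorting worst first makes it easier to see which VM to look at before touching limits or vCPU counts.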
Make sure the VM is not configured with a CPU limit that is holding it back. The CPU limit can be set at the per-VM level or at the resource pool level (if used).
If no CPU limit is configured on the VM, then the load on the host is simply too high. You can either increase the number of physical CPUs in the host (if possible) or reduce the number of virtual CPUs assigned to the VMs. Alternatively, you can move some VMs to other hosts to free up resources.
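Before choosing between those options, it can help to see how many vCPUs are configured relative to the physical cores, and which VMs hold the most. The following is a minimal Python sketch with made-up VM names and counts; the function names are hypothetical, and no particular ratio is meant as a hard limit.

```python
def vcpu_to_pcpu_ratio(vcpus_per_vm: dict, physical_cores: int) -> float:
    """Total configured vCPUs divided by the host's physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm.values()) / physical_cores


def largest_vms(vcpus_per_vm: dict, count: int = 3) -> list:
    """Return the VMs with the most vCPUs, as candidates for right-sizing or migration."""
    ranked = sorted(vcpus_per_vm.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    # Hypothetical inventory for a single host with 16 physical cores.
    inventory = {"web01": 4, "db01": 16, "app01": 8, "test01": 4}
    print(f"vCPU:pCPU ratio = {vcpu_to_pcpu_ratio(inventory, 16):.2f}")
    print("Largest VMs:", largest_vms(inventory, count=2))
```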
We can also check whether ESXi is struggling from a memory standpoint and reclaiming memory from VMs. This is often the case when the physical memory of the host is fully utilized and you have configured more memory for your VMs than the host physically has (overcommitment).
Again, we can use the esxtop command to check this. On the main screen, type m for memory, then type f for fields and select the letter J (memory ballooning statistics). Then look at the MCTLSZ value, which shows the amount of guest physical memory that was reclaimed (if any) by the balloon driver. If you see only values of zero, as in our case, then your ESXi host still has enough physical memory, and this memory reclaiming technique has not been used (yet).
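The two checks described here, overcommitment and ballooning, come down to simple arithmetic. Below is a minimal Python sketch, assuming you have noted the configured memory per VM and the MCTLSZ values by hand; all names and numbers are hypothetical.

```python
def overcommit_ratio(configured_mb: dict, host_physical_mb: float) -> float:
    """Memory configured across all VMs divided by the host's physical memory."""
    if host_physical_mb <= 0:
        raise ValueError("host_physical_mb must be positive")
    return sum(configured_mb.values()) / host_physical_mb


def ballooned_vms(mctlsz_mb: dict) -> dict:
    """Return only the VMs whose MCTLSZ value is above zero."""
    return {name: size for name, size in mctlsz_mb.items() if size > 0}


if __name__ == "__main__":
    # Hypothetical values: three VMs on a host with 24 GB of physical RAM.
    configured = {"web01": 8192, "db01": 16384, "app01": 8192}
    mctlsz = {"web01": 0.0, "db01": 512.0, "app01": 0.0}

    ratio = overcommit_ratio(configured, host_physical_mb=24576)
    print(f"Memory overcommit ratio: {ratio:.2f}")  # above 1.0 means overcommitted
    print("VMs with ballooned memory:", ballooned_vms(mctlsz))
```

A ratio above 1.0 on its own is not a problem; it becomes one when the second check starts returning VMs.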
While the newest virtual infrastructure deployments get the best of the best, all-flash storage, an older infrastructure might still be using spinning storage, or "spinning rust," as we say. Spinning media often causes poor performance in virtualization infrastructures. Volumes created on top of spinning media are slow, unreliable, and take a long time to rebuild if any of the underlying hard drives fails. You can easily test whether storage is the bottleneck by migrating your VMs to different, more performant storage attached to the same host.
If you're not using VMware vSAN (where LUNs are not relevant), you can also reduce the number of VMs per LUN.
If you identify a storage-related problem, you can also verify that:
- Your host bus adapters (HBAs) are on the VMware HCL and are certified for ESXi.
- Your physical hosts have the latest BIOS.
- The firmware of your HBAs is up to date.
Troubleshooting a performance problem is a job in and of itself. In this post, we gave you a few paths to explore, but we have just scratched the surface.
One recommendation here is to isolate the problem first and determine whether it lies in the VM itself or in the physical ESXi host. Then you can investigate further to see whether it is a shared storage, network, physical CPU, or physical RAM problem.
Often there is only one limiting factor that causes problems, but there are cases where two or more limiting factors might be involved. Good luck with your troubleshooting.