By Leos Marek
Performance Charts ^
Performance Charts, as the name suggests, display performance information graphically in different chart types. The ESXi host's own web client shows only a very limited set of them (total CPU socket % used, total host memory GB, basic network, and disk), and only for the last hour. To see a longer time range in more detail, you need vCenter Server.
vCenter Server not only collects statistical data from all connected ESXi hosts and VMs but also aggregates, calculates, and archives the data at configured intervals. To access the data, start vSphere Client, select your VM or host, and go to Monitor > Performance > Overview to get a summary of the main indicators, such as CPU, Memory, Disk, and Network. Go to Advanced and use the View drop-down menu at the top-right corner to see more details about each indicator.
Performance Charts are really useful for basic, very quick, and easy-to-understand performance overviews. If access is delegated, non-administrator users can also view them. They are the right tool to see how a VM is doing over longer periods of time. They are less suited to advanced, real-time monitoring because the refresh interval is fixed at 20 seconds. This is where ESXTOP comes into play.
Starting ESXTOP ^
ESXTOP (esxtop) is a built-in command-line tool that provides very detailed real-time information. To use ESXTOP, you need direct access to the specific host where the VM is running, either via the Direct Console User Interface (DCUI) or, preferably, via Secure Shell (SSH). To start ESXTOP, open an SSH session to the host (root privileges required) and type esxtop.
ESXTOP views ^
ESXTOP has several separate screens or views. A specific view exists for each metric. Changing a view is easy: just press the associated key, such as c for CPU or m for Memory. There is no need to remember any of these; press h or ? for help on all available commands.
When you start ESXTOP, you will get the CPU view by default.
Several global statistics appear at the top. The first row, from left to right, shows the following:
- Host uptime
- The number of worlds running; world is VMware terminology for a schedulable entity, similar to a process or thread
- The number of VMs running and the number of virtual CPUs (vCPUs) assigned to them
- The total CPU load average in intervals of 1, 5, and 15 minutes
In the second marked section, you can see the actual load for each physical CPU (PCPU UTIL/USED%) and the average (AVG) across all of them. CORE UTIL% is displayed only when hyper-threading is enabled, and it presents the load of the full core.
Refresh interval ^
ESXTOP starts with a default refresh interval of five seconds. Press s followed by a number of seconds to change it. The minimum value is two seconds. The maximum is so high that, for practical purposes, there is no upper limit.
Note: If you are running ESXTOP for the very first time, it might be a good idea to change the interval to a higher value (such as 20 to 30 seconds). It is a bit easier to get familiar with the tool if it does not refresh too quickly.
Limiting results ^
ESXTOP shows many rows of different (system) information, especially in the CPU and Memory views, and it tends to be difficult to find what you are looking for. Luckily, you can limit the results. Press the V key to see only VM entities. Press V again to get all information. Another option is to limit the view to a single entity. Press l (lowercase L) followed by an entity identifier (GID). Press l again followed by 0 to reset the view.
Expanding results ^
ESXTOP groups statistics by entities: VMs, logical unit numbers (LUNs), and so on. Let's take my VM named DC1 as an example. In the picture above, you can see %WAIT shows almost 1100%. Now you may think this is a strange number, right? How is this possible?
To answer this tricky question, we need to expand the group by pressing the e key followed by the GID.
Suddenly, it shows 11 rows. Each of them represents a world. The worlds named vmx-vcpu-0:DC1 and vmx-vcpu-1:DC1 are the two vCPUs I have assigned to this VM. The other worlds are for the network stack, disk system, and some "auxiliary" worlds, like vmx-mks for the console and vmx-svga for the graphics card. As you can see, vmx-mks and vmx-svga are using some CPU resources (%USED and %RUN), which is because I was actually doing some work on the VM via Remote Console.
Now you understand that the %WAIT close to 1100% is the sum across all the VM's worlds: 11 worlds, each waiting close to 100% of the time. If I add more vCPUs, I will have higher overall numbers. That's why you always need to expand the entity to see exactly where the issue might be.
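The arithmetic behind the group value can be sketched in a few lines of shell. The per-world numbers below are made up for illustration (they are not taken from the DC1 screen), and sum_waits is a hypothetical helper name. The only point is that the group row adds up its worlds.

```shell
#!/bin/sh
# Hypothetical per-world %WAIT values for a VM group with 11 worlds.
# Illustrative numbers only, not real esxtop output.
world_waits="99.5 98.7 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0"

# sum_waits "v1 v2 ...": add up space-separated percentages
# and print the total with one decimal place.
sum_waits() {
    printf '%s\n' "$1" | tr ' ' '\n' | awk '{ total += $1 } END { printf "%.1f\n", total }'
}

group_wait=$(sum_waits "$world_waits")
echo "Group %WAIT: ${group_wait}%"   # prints: Group %WAIT: 1098.2%
```

With two busy vCPU worlds and nine mostly idle helper worlds, the total still lands near 1100%, which is why the group value alone says little about any single world.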
Customizing fields ^
You may want to add or remove some fields or change the order of appearance. To add or remove fields, press the f key. For each view, you have different field options. Add or remove fields by pressing the associated key. To change column order, press o and follow the instructions as shown in the picture.
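Use the W key to save your modified configuration, either to the default file that esxtop suggests at the prompt (the name depends on the ESXi version, for example .esxtop4rc on 4.x hosts) or to a new file.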
Capturing ESXTOP results ^
To run ESXTOP in batch mode, use this command:
esxtop -b -d 5 -n 200 > results.csv
Here, -b means batch mode, -d 5 is a delay of five seconds, and -n 200 is the number of samples.
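When choosing values for -d and -n, it helps to know roughly how long the capture will run. A minimal sketch, assuming the run time is simply delay multiplied by samples (capture_seconds is a hypothetical helper, not part of esxtop):

```shell
#!/bin/sh
# capture_seconds DELAY SAMPLES
#   Rough wall-clock length of a batch run, assuming it is
#   simply delay (seconds) multiplied by the number of samples.
capture_seconds() {
    echo $(( $1 * $2 ))
}

total=$(capture_seconds 5 200)
echo "About ${total} s (~$(( total / 60 )) min $(( total % 60 )) s)"
# prints: About 1000 s (~16 min 40 s)
```

So the command above keeps esxtop running for a little under 17 minutes before the CSV file is complete.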
To zip the file directly, use this command:
esxtop -b -d 5 -n 200 | gzip -9c > results.csv.gz
You can analyze results with tools like VisualEsxtop or Windows Performance Monitor.
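For a quick look without a graphical tool, the CSV can also be inspected with plain shell and awk on any workstation. The sketch below assumes every field in the file is wrapped in double quotes and that the first row holds the counter names, as in the Performance Monitor style CSV. The function names list_columns and column_average are hypothetical helpers, not part of esxtop.

```shell
#!/bin/sh
# list_columns FILE PATTERN
#   Print "index<TAB>name" for every header column whose name
#   contains PATTERN (plain substring match, not a regex).
list_columns() {
    PAT="$2" awk -F '","' '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                name = $i
                gsub(/^"|"$/, "", name)      # strip the outer quotes
                if (index(name, ENVIRON["PAT"]) > 0)
                    printf "%d\t%s\n", i, name
            }
            exit
        }' "$1"
}

# column_average FILE INDEX
#   Average of the numeric values in column INDEX over all
#   sample rows (every row after the header).
column_average() {
    awk -F '","' -v col="$2" '
        NR > 1 {
            val = $col
            gsub(/"/, "", val)               # strip any leftover quotes
            sum += val
            n++
        }
        END { if (n > 0) printf "%.2f\n", sum / n }' "$1"
}

# Example usage (after copying results.csv from the host):
#   list_columns results.csv "% Ready"
#   column_average results.csv 42
```

Splitting on the three-character sequence `","` rather than on a bare comma keeps counter names that contain commas in one piece. A typical workflow is to find the column index with list_columns first and then pass that index to column_average.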
ESXTOP is definitely the number-one tool to analyze vSphere performance issues properly. After reading this post, you should be familiar with its usage. On the other hand, understanding all the various metrics is much more complex. Check out my next article, where I explain the most important counters and thresholds.