Analyzing and solving performance issues in Ubuntu VM after OS upgrades inside VM and on physical host


Last updated on December 10th 2022 - Listed in KVM, Linux, Virtualization, Kubernetes, Docker

After a physical server was upgraded from Debian 10 (Buster) to 11 (Bullseye), one Virtual Machine started to show a significant decrease in performance. Monitoring stats showed pretty clearly that IOWAIT inside the guest OS was responsible for the slowness:

Yes, these stats are pretty bad, especially considering that the host itself is running on SSD drives.

So what has changed?

OS Upgrades inside VM and on the host

This particular VM was running Ubuntu 18.04 as guest OS and was upgraded to Ubuntu 20.04. At the same time the host's OS was upgraded from Debian 10 (Buster) to 11 (Bullseye).

The VM itself is a Kubernetes node and therefore part of a Kubernetes cluster. The guest OS upgrade also changes the container runtime:

  • Docker is upgraded from 20.10.7 to 20.10.12
  • containerd is upgraded from 1.5.5 to 1.5.9

The host's OS upgrade also changes the hypervisor (KVM/QEMU) and the VM management layer (libvirt):

  • KVM/QEMU is upgraded from 3.1 to 5.2 (relevant package: qemu-system-x86)
  • libvirt is upgraded from 5.0 to 7.0 (relevant package: libvirt-daemon)

Following the reboot of the host, we can see the VM running much slower, with a very high percentage of IOWAIT. These are the CPU stats seen by the guest OS itself:

You don't need to be a pro in analytics to see the major change since the VM was started again under the new Host OS.

But why would a newer Hypervisor version be slower than before?

Upgrading QEMU? Don't forget the VM machine type!

How a VM is created using libvirt and QEMU can have a significant impact on a Virtual Machine's performance. I have already seen this in the past, when I needed to analyze packet losses on an Ubuntu 18.04 KVM guest running on a Debian 9 physical host. Back then the culprit was that the VM suffering from packet losses didn't use the virtio network driver. But in this new case the VM is already configured to use virtio, including (and especially) on the virtual disk (raw device).

Let's create a new Ubuntu 20.04 VM, now under Debian Bullseye, and see what the XML definition of that VM will look like.

root@bullseye:~# virt-install --connect qemu:///system --virt-type kvm -n kvmtest --import --memory 2048 --vcpus 2 --disk /dev/vgdata/kvmtest --network=bridge:virbr1 --graphics vnc,password=secret,port=5999 --noautoconsole --os-variant ubuntu20.04
Starting install...
Domain creation completed.

The VM's virtual hardware configuration can be retrieved with virsh dumpxml kvmtest. This output can then be compared with the dumpxml of the VM having the IOWAIT issues. The relevant differences are:

root@bullseye:~# diff /tmp/kvmtest.xml /tmp/vm-with-iowait.xml
< <domain type='kvm' id='5'>
> <domain type='kvm' id='4'>
<       <libosinfo:os id=""/>
>       <libosinfo:os id=""/>

<     <type arch='x86_64' machine='pc-q35-5.2'>hvm</type>
>     <type arch='x86_64' machine='pc-q35-3.1'>hvm</type>

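When more than two VMs need to be compared, reading a full diff for each of them gets tedious. The following is only a minimal sketch: it assumes the XML definitions were saved with virsh dumpxml into files such as /tmp/kvmtest.xml, and the function name machine_type is made up for this example.

```shell
#!/bin/sh
# Print the machine type found in a saved libvirt domain XML dump.
# Assumption: the file was created with "virsh dumpxml <domain> > file.xml".
machine_type() {
    # Only the <type ...> element carries the machine='...' attribute.
    sed -n "s/.*<type [^>]*machine='\([^']*\)'.*/\1/p" "$1"
}

# Typical use (not executed here):
#   machine_type /tmp/kvmtest.xml
#   machine_type /tmp/vm-with-iowait.xml
```

For the two dumps compared above, this would print pc-q35-5.2 and pc-q35-3.1 respectively.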
<libosinfo />

First up: libosinfo.

Yes, our vm-with-iowait is meanwhile running Ubuntu 20.04, but the VM's configuration still mentions Ubuntu 18.04 as the OS version. But is this information actually relevant for a VM's performance? After all, the <libosinfo> element is located inside the <metadata> element. Unfortunately this is poorly (if at all) documented. But a hint can be found in an old Red Hat mailing list entry from Cole Robinson:

Right now in virt-manager we only track a VM's OS name (win10, fedora28, etc.) during the VM install phase. This piece of data is important post-install though: if the user adds a new disk to the VM later, we want to be able to ask libosinfo about what devices the installed OS supports, so we can set optimal defaults, like enabling virtio. [...] I want to add something similar to virt-manager but it seems a shame to invent our own private schema for something that most non-trivial virt apps will want to know about. I was thinking a schema we could document with libosinfo, something like

Updating the OS version in the XML is therefore a wise choice but, according to this information, it only has an effect when devices are added later. And as mentioned before, virtio drivers are already configured for the virtual devices.

<type />

And then there is the VM type.

Unfortunately this is even less documented than libosinfo. Either I was looking at the completely wrong parts of the Internet, the documentation got lost or deleted, it is buried somewhere inside the source code, or it was never officially documented in the first place. Yes, sure, there are some hints that this defines the "machine type", but what impact this has on a VM is not really described.

The best (and pretty much only) description I found is in the Ubuntu Server Guide on Virtualization with QEMU:

If you are unsure what this is, you might consider this as buying (virtual) Hardware of the same spec but a newer release date. You are encouraged in general and might want to update your machine type of an existing defined guests in particular to:

to pick up latest security fixes and features
continue using a guest created on a now unsupported release

In general it is recommended to update machine types when upgrading qemu/kvm to a new major version. But this can likely never be an automated task as this change is guest visible. The guest devices might change in appearance, new features will be announced to the guest and so on. Linux is usually very good at tolerating such changes, but it depends so much on the setup and workload of the guest that this has to be evaluated by the owner/admin of the system. Other operating systems where known to often have severe impacts by changing the hardware. Consider a machine type change similar to replacing all devices and firmware of a physical machine to the latest revision - all considerations that apply there apply to evaluating a machine type upgrade as well.

Reading this, the <type> setting of a QEMU Virtual Machine is therefore something like the "firmware version" of a VM. Or in terms of VMware: Virtual Hardware version.

Looking at the diff above again shows that the VM with our IOWAIT issue runs with machine type pc-q35-3.1, while the VM newly created from scratch shows pc-q35-5.2. Looking closer at these numbers reveals that 3.1 and 5.2 match the QEMU version installed at the time each VM was created (see upgrade notes above). But could this really have such a strong impact on a VM's performance? Let's try.

Changes to the Virtual Machine

Both the libosinfo and the machine type values were changed on our problem VM using virsh edit. They were adjusted to the values seen in the diff for the new VM kvmtest.

Besides this, swap was also disabled on this VM (inside /etc/fstab) as the Ubuntu 20.04 guest is a node and part of a Kubernetes cluster.

But these changes are not applied immediately. The VM needs to be shut down and started again in order to read the updated XML definition.
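
I made the change interactively with virsh edit. For several VMs, a scripted variant might look like the sketch below. This is not what was done here, just an illustration: the function name set_machine_type and the file paths are assumptions, and because a machine type change is visible to the guest, the result should be reviewed before it is defined.

```shell
#!/bin/sh
# Rewrite the machine='...' attribute of the <type> element in a saved
# domain XML. The modified XML is written to stdout, the input file is
# left untouched.
#   $1 = XML file, $2 = new machine type (e.g. pc-q35-5.2)
set_machine_type() {
    sed "s/\(<type [^>]*machine='\)[^']*'/\1$2'/" "$1"
}

# Typical use on the host (not executed here, paths are examples):
#   virsh dumpxml --inactive vm-with-iowait > /tmp/vm.xml
#   set_machine_type /tmp/vm.xml pc-q35-5.2 > /tmp/vm-new.xml
#   virsh define /tmp/vm-new.xml
#   virsh shutdown vm-with-iowait   # wait until the VM is really off
#   virsh start vm-with-iowait
```

Only the <type> line is touched; the libosinfo entry inside <metadata> would still need to be adjusted separately.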

Once the VM in question had started up again, the stats were checked during the next few hours. And what a significant change this shows!

Once the Ubuntu 20.04 VM was booted again with the updated XML specs and swap disabled, no IOWAIT could be seen anymore. And to make this even nicer: the performance turned out to be even better than before, when it was running as an Ubuntu 18.04 VM under a Debian Buster host!

But which one of these changes helped fix the IOWAIT issue? 

High IOWAIT issue reproducible on a similar VM

The question remains open whether the QEMU/libvirt XML changes or the swap change solved the performance issue. But I didn't need to look far, because another VM showed the exact same high IOWAIT stats. And this VM also has the exact same upgrade path behind it:

  • Inside the VM, the guest OS was upgraded from Ubuntu 18.04 to Ubuntu 20.04
  • The VM is a Kubernetes node, part of a Kubernetes cluster
  • The VM's physical host was upgraded from Debian 10 to Debian 11
  • VM running with libosinfo set to Ubuntu 18.04, machine type pc-q35-3.1
  • High IOWAIT since VM start after all upgrades are completed

Time to find out what exactly changes when libosinfo and the VM machine type are adjusted.

libosinfo and machine type adjusted: Changes inside the VM

Before making any actual changes, several pieces of information were collected:

  • The complete boot output (using journalctl -k)
  • The block device paths (using ls -la /sys/block/)
  • The virtual "hardware" output (using dmidecode)
  • The modules loaded by the Kernel (using lsmod)
  • Some CPU stats after 10min runtime (using iostat)
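
To make sure the same data is collected before and after the change, the commands from the list can be wrapped in a small helper. This is a sketch only: the function name collect_baseline and the file name block-<label>.txt are made up for this example, and iostat is limited to three reports here so that the script terminates on its own.

```shell
#!/bin/sh
# Save the outputs listed above into files tagged with a label,
# e.g. "3.1" before and "5.2" after the machine type change.
#   $1 = label, $2 = output directory (defaults to the current one)
# Needs to run as root inside the VM (dmidecode, journalctl -k).
collect_baseline() {
    label=$1
    outdir=${2:-.}
    mkdir -p "$outdir"
    journalctl -k     > "$outdir/boot-$label.txt" 2>&1
    ls -la /sys/block/ > "$outdir/block-$label.txt" 2>&1
    dmidecode         > "$outdir/dmidecode-$label.txt" 2>&1
    lsmod             > "$outdir/lsmod-$label.txt" 2>&1
    # Three reports, five seconds apart, preceded by the uptime.
    { uptime; iostat -c 5 3; } > "$outdir/iostat-$label.txt" 2>&1
    return 0
}

# Typical use (not executed here):
#   collect_baseline 3.1 /root/baseline
```

The resulting files can then be compared pairwise with diff, for example boot-3.1.txt against boot-5.2.txt.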

After libosinfo and machine type were adjusted, again by using virsh edit, and this VM was restarted, the output of the previous commands was saved again and compared. Here are the changes inside the VM.

Boot + block device output

The output of journalctl -k was saved twice: first after booting with the old libosinfo and machine type pc-q35-3.1, and again with the updated libosinfo and machine type pc-q35-5.2.

$ diff boot-3.1.txt boot-5.2.txt
<  kvm-clock: cpu 0, msr 32d201001, primary cpu clock
<  kvm-clock: using sched offset of 9680708323 cycles
>  kvm-clock: cpu 0, msr 435a01001, primary cpu clock
>  kvm-clock: using sched offset of 16819579002 cycles
<  kvm-clock: cpu 1, msr 32d201041, secondary cpu clock
>  kvm-clock: cpu 1, msr 435a01041, secondary cpu clock
<  kvm-clock: cpu 2, msr 32d201081, secondary cpu clock
>  kvm-clock: cpu 2, msr 435a01081, secondary cpu clock
<  kvm-clock: cpu 3, msr 32d2010c1, secondary cpu clock
>  kvm-clock: cpu 3, msr 435a010c1, secondary cpu clock
<  kvm-clock: cpu 4, msr 32d201101, secondary cpu clock
>  kvm-clock: cpu 4, msr 435a01101, secondary cpu clock
<  kvm-clock: cpu 5, msr 32d201141, secondary cpu clock
>  kvm-clock: cpu 5, msr 435a01141, secondary cpu clock
<  kvm-clock: cpu 6, msr 32d201181, secondary cpu clock
>  kvm-clock: cpu 6, msr 435a01181, secondary cpu clock
<  kvm-clock: cpu 7, msr 32d2011c1, secondary cpu clock
>  kvm-clock: cpu 7, msr 435a011c1, secondary cpu clock
<  lpc_ich 0000:00:1f.0: I/O space for GPIO uninitialized
>  input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input4
>  input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input3
>  lpc_ich 0000:00:1f.0: I/O space for GPIO uninitialized
>  virtio_blk virtio2: [vda] 209715200 512-byte logical blocks (107 GB/100 GiB)
>   vda: vda1
<  virtio_blk virtio2: [vda] 209715200 512-byte logical blocks (107 GB/100 GiB)
<  input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input4
<  input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input3
<  virtio_net virtio0 enp1s0: renamed from eth0
<  usb 1-1: SerialNumber: 42
>  usb 1-1: SerialNumber: 28754-0000:00:1d.7-1

The first thing which looks like a difference is the msr value in the kvm-clock output. However, this value is not fixed: it is the guest memory address of the per-CPU clock structure which the kernel registers with the hypervisor, so it can differ from one boot to the next. The sched offset is time-dependent as well. This explains the changed values and shows that, in terms of CPU, nothing has changed.

A bit further down we see a VirtualPS/2 (virtual mouse) connected, but here it's just a slightly different order of the boot output. All the entries are basically the same with one exception: The line "vda: vda1" only shows up in the newer boot output. But given that the previous line already indicates vda detected as a virtio2 drive, no major change here either.

This can also be verified by looking at the output of ls -la /sys/block/. Both vda devices show up under the same PCI device path (under 3.1 and 5.2 boot).

root@vm:~# ll /sys/block/
total 0
drwxr-xr-x  2 root root 0 Dec  9 17:07 ./
dr-xr-xr-x 13 root root 0 Dec  9 17:07 ../
lrwxrwxrwx  1 root root 0 Dec  9 17:07 loop0 -> ../devices/virtual/block/loop0/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 loop1 -> ../devices/virtual/block/loop1/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 loop2 -> ../devices/virtual/block/loop2/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 loop3 -> ../devices/virtual/block/loop3/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 loop4 -> ../devices/virtual/block/loop4/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 loop5 -> ../devices/virtual/block/loop5/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 loop6 -> ../devices/virtual/block/loop6/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 loop7 -> ../devices/virtual/block/loop7/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 sr0 -> ../devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sr0/
lrwxrwxrwx  1 root root 0 Dec  9 17:07 vda -> ../devices/pci0000:00/0000:00:02.2/0000:03:00.0/virtio2/block/vda/

dmidecode + lsmod output

Maybe some virtual hardware change can be spotted with dmidecode? Well there is indeed some change, but it only contains the updated machine type version:

$ diff dmidecode-3.1.txt dmidecode-5.2.txt
<     Version: pc-q35-3.1
>     Version: pc-q35-5.2
<     Version: pc-q35-3.1
>     Version: pc-q35-5.2

And what about the Kernel modules? Maybe the new machine type or libosinfo update triggered additional Kernel modules to be loaded?

$ diff lsmod-3.1.txt lsmod-5.2.txt
< ipt_REJECT             16384  42
> ipt_REJECT             16384  51
< xt_comment             16384  1426
> xt_comment             16384  1384
< xt_nat                 16384  103
> xt_nat                 16384  92
< xt_statistic           16384  32
< xt_tcpudp              20480  299
> xt_statistic           16384  30
> xt_tcpudp              20480  279

But this is another "disappointment": the diff shows the same modules loaded, just with different "used by" values.
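
Since the "used by" counters change all the time, it can help to compare only the module names. A minimal sketch, assuming the saved files are plain lsmod outputs including the header line (the function names module_names and module_diff are made up for this example):

```shell
#!/bin/sh
# Print the sorted module names of a saved lsmod output.
module_names() {
    # Skip the header line, keep only the first column.
    awk 'NR > 1 { print $1 }' "$1" | sort
}

# Print modules which are loaded in only one of the two files.
# Empty output means both files list the same set of modules.
module_diff() {
    old=$(mktemp)
    new=$(mktemp)
    module_names "$1" > "$old"
    module_names "$2" > "$new"
    # comm -3 hides lines common to both files; names only found
    # in the second file are indented with a tab.
    comm -3 "$old" "$new"
    rm -f "$old" "$new"
}

# Typical use (not executed here):
#   module_diff lsmod-3.1.txt lsmod-5.2.txt
```

With the two files from above, the output should be empty, as only the counters differ.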

iostat output

And what about the iostat output? Are the IOWAITs gone or still here after a 10min runtime?

$ cat iostat-3.1.txt
root@vm:~# uptime && iostat -c 5
 17:05:22 up 11 min,  1 user,  load average: 6.33, 6.84, 4.72
Linux 5.4.0-126-generic (vm)     12/09/22     _x86_64_    (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          21.95    0.00    5.79   31.11    0.15   41.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.42    0.00    3.54   24.77    0.08   66.20

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.20    0.00    3.60   20.55    0.05   70.59

$ cat iostat-5.2.txt
root@vm:~# uptime && iostat -c 5
 17:16:28 up 9 min,  1 user,  load average: 5.10, 6.71, 4.21
Linux 5.4.0-126-generic (vm)     12/09/22     _x86_64_    (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          27.01    0.00    5.80   27.63    0.18   39.38

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.84    0.00    2.97   55.14    0.05   37.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.05    0.03    3.91   42.49    0.05   47.47

The IOWAIT values are still way too high.

This means: A change of libosinfo and the machine type did not change anything inside the VM and did not fix the IOWAIT issue!
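
To put a single number on each run, the %iowait column of the saved iostat outputs can be averaged. A rough sketch, assuming the files look like the iostat -c outputs above (the function name avg_iowait is made up for this example). Keep in mind that the first report of iostat covers the time since boot, so it is not directly comparable to the reports that follow.

```shell
#!/bin/sh
# Average the %iowait column over all reports in a saved "iostat -c" output.
avg_iowait() {
    awk '
        # The values are on the line following each "avg-cpu:" header;
        # %iowait is the 4th value (%user %nice %system %iowait ...).
        /^avg-cpu:/ { getline; sum += $4; n++ }
        END { if (n) printf "%.2f\n", sum / n }
    ' "$1"
}

# Typical use (not executed here):
#   avg_iowait iostat-3.1.txt
#   avg_iowait iostat-5.2.txt
```

For the two outputs shown above this gives 25.48 (machine type 3.1) and 41.75 (machine type 5.2), so the IOWAIT clearly did not improve.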

Kubernetes + swap = no love

In the recent past, I've already run into problems when trying to run Kubernetes on nodes with swap enabled. Funnily enough, this seems to have worked fine before though. Could the upgrade of Docker and containerd be the actual cause of the IOWAIT issue?

Now with the VM running an updated libosinfo and VM machine type, swap was disabled (swapoff -a). This took about five minutes to complete. And my mind was blown.

Roughly two minutes after completely disabling swap, the IOWAITs disappeared (besides one lonely spike). 
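
Whether a node still has swap in use, and whether swap would come back after a reboot, can be checked with a few lines of shell. The following is a sketch under assumptions: the function names swap_used_kib and comment_out_swap are made up, and the fstab helper only prints a modified copy instead of changing /etc/fstab in place.

```shell
#!/bin/sh
# Sum up the "Used" column (in KiB) of /proc/swaps or of a saved copy of it.
# Prints 0 when no swap device is active.
swap_used_kib() {
    awk 'NR > 1 { used += $4 } END { print used + 0 }' "${1:-/proc/swaps}"
}

# Print the given fstab with all active swap entries commented out.
# An entry counts as swap when its third field (fs type) is "swap".
comment_out_swap() {
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# Typical use on the Kubernetes node (not executed here):
#   swap_used_kib                          # how much swap is in use right now
#   swapoff -a                             # may take minutes if a lot is swapped out
#   comment_out_swap /etc/fstab > /tmp/fstab.new   # review, then replace /etc/fstab
```

The more swap is in use, the longer swapoff -a needs, as everything has to be read back into memory first.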

Conclusion: High IOWAIT caused by enabled swap on Kubernetes node

Combining all these findings leads to an (unexpected) conclusion:

  • Neither the libosinfo nor the VM machine type changes had an effect on the virtual hardware and performance
  • The high IOWAIT was caused by swap being enabled on this Kubernetes node

The fact that this IOWAIT issue did not occur before the upgrades (swap was already enabled, by mistake, back then) puts the blame on the apt-get dist-upgrade inside the guest OS. The newer Docker and containerd packages don't seem to handle swap well (if at all).

But it could also be a combination of several factors. The VM which suffered from the IOWAIT issue is part of a cluster which does some heavy lifting, with a lot of deployments. Yet another VM, which was upgraded in the exact same way and is hosted on the same physical host, is part of a much quieter Kubernetes cluster with little load. Inside this second VM the IOWAIT issue cannot be reproduced.
