While running some tests in a newly created LXC container running Debian 12 (Bookworm), I was stunned to see all the physical CPU cores of the host appearing in the htop output.
This container is supposed to have two CPUs, which are set in the container's config file using cgroup limits. Yet the output clearly shows many more CPUs...
Did I forget to set the cgroup limits on the LXC host (running Debian 11)? I verified and nope, the cgroup v2 CPU limits are set:
root@lxchost ~ # cat /var/lib/lxc/bookworm/config |grep cgroup
lxc.cgroup2.cpuset.cpus = 12-13
lxc.cgroup2.cpu.weight = 100
lxc.cgroup2.memory.max = 10G
lxc.cgroup2.memory.high = 10G
I also double-checked that lxcfs is installed on the host, which is a requirement for the containers to correctly interpret the limits; and yes, it's installed as well.
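To see what lxcfs actually provides inside a container, one can list the paths it overlays. The following is a minimal sketch, assuming lxcfs mounts show up in /proc/mounts with the filesystem type fuse.lxcfs; the function name lxcfs_mount_points is made up for this example.

```shell
# Print every mount point backed by lxcfs.
# Assumption: lxcfs mounts use the filesystem type "fuse.lxcfs" (third column).
# Optional argument: a mounts file to read (defaults to /proc/mounts).
lxcfs_mount_points() {
    local mounts_file=${1:-/proc/mounts}
    awk '$3 == "fuse.lxcfs" { print $2 }' "$mounts_file"
}
```

Running lxcfs_mount_points inside the container lists the files that lxcfs replaces with container-aware versions. A tool that gets its CPU information from a path that is not in this list would likely see the host's values instead.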
Compared to other LXC containers running on the same host, only the new Debian 12 container showed this issue.
My first thought was: Hmm... maybe in Debian 12 something changed with interpreting the cgroup limits set by and on the LXC host? Was the (older) LXC version 4.x on the host (running an older Debian 11) the problem?
To verify this, I opened a topic in the LXC discussion forums, but after a hint from Stéphane Graber I quickly realized something: Although the htop output above shows 24 CPUs, only the first two of them are actually used. The others remained at 0% usage. So the cgroup limits actually seem to work, but the CPU limit is not reflected in the display (the memory limits are shown correctly, by the way).
With the current findings, the problem seems to be htop itself. Either the CPU information is read from the wrong place, or the cgroup limits for CPUs are ignored.
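A quick way to check which CPU count the container itself reports is to compare a few sources side by side. This is only a sketch under assumptions: the helper names are invented, the cgroup path assumes a cgroup v2 unified hierarchy mounted at /sys/fs/cgroup, and it does not claim to read the same files htop does.

```shell
# Count the CPUs named in a cpuset list such as "12-13" or "0-3,8,10-11".
count_cpus_in_list() {
    local list=$1 total=0 part start end
    local IFS=,
    for part in $list; do
        case $part in
            *-*) start=${part%-*}; end=${part#*-}
                 total=$((total + end - start + 1)) ;;
            '')  ;;                      # ignore empty fields
            *)   total=$((total + 1)) ;;
        esac
    done
    echo "$total"
}

# Print the CPU count as seen through different interfaces.
# Assumption: cgroup v2 with the cpuset controller enabled for this container.
report_cpu_views() {
    local cpuset_file=/sys/fs/cgroup/cpuset.cpus.effective
    echo "nproc:         $(nproc)"
    echo "/proc/cpuinfo: $(grep -c '^processor' /proc/cpuinfo)"
    if [ -r "$cpuset_file" ]; then
        echo "cgroup cpuset: $(count_cpus_in_list "$(cat "$cpuset_file")")"
    else
        echo "cgroup cpuset: not available"
    fi
}
```

Calling report_cpu_views inside the container shows whether the individual sources agree with the cpuset configured on the host (12-13 in this case, which count_cpus_in_list counts as two CPUs).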
Debian Bookworm comes with htop 3.2.2:
root@bookworm:~# dpkg -l|grep htop
ii htop 3.2.2-2 amd64 interactive processes viewer
On Bullseye it was an older version 3.0.5:
root@bullseye:~# dpkg -l|grep htop
ii htop 3.0.5-7 amd64 interactive processes viewer
In the changelog of the latest htop release (3.2.2) there is a line hinting at a behaviour change for containers and cgroup limits:
On Linux, improvements to cgroup and container identification
Well, maybe this caused a regression?
Let's find out by using an older version of htop!
I turned to my lab environment and decided to compile one htop version after another until the problematic version was found. Luckily htop is a pretty small piece of software and doesn't require hours of compiling. A different release can therefore quickly be downloaded and compiled.
To get all the necessary compiling tools, a few packages must be installed first:
root@bookworm:~# apt install libncursesw5-dev autotools-dev autoconf automake build-essential
After this we can download, unpack and compile the older version - 3.2.1 in this case:
root@bookworm:~# wget https://github.com/htop-dev/htop/releases/download/3.2.1/htop-3.2.1.tar.xz
root@bookworm:~# tar -xf htop-3.2.1.tar.xz
root@bookworm:~# cd htop-3.2.1
root@bookworm:~/htop-3.2.1# ./autogen.sh && ./configure && make
This results in an htop binary in the same directory:
root@bookworm:~/htop-3.2.1# ls -ltr| tail
-rw-r--r-- 1 root root 30408 Nov 20 20:48 TasksMeter.o
-rw-r--r-- 1 root root 44096 Nov 20 20:48 TraceScreen.o
-rw-r--r-- 1 root root 28408 Nov 20 20:48 UptimeMeter.o
-rw-r--r-- 1 root root 8544 Nov 20 20:48 UsersTable.o
-rw-r--r-- 1 root root 29288 Nov 20 20:48 Vector.o
-rw-r--r-- 1 root root 40416 Nov 20 20:48 XUtils.o
drwxr-xr-x 3 fhadm 121 4096 Nov 20 20:48 generic
drwxr-xr-x 3 fhadm 121 4096 Nov 20 20:48 linux
drwxr-xr-x 3 fhadm 121 4096 Nov 20 20:48 zfs
-rwxr-xr-x 1 root root 1301968 Nov 20 20:48 htop
And this can be executed and compared to the other htop binary, installed through the Debian repos:
root@bookworm:~/htop-3.2.1# ./htop
The ncurses output speaks for itself:
Two CPUs are showing with htop 3.2.1, which is the correct number as set by the cgroup limit. The problem must indeed be some change in htop 3.2.2.
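The download, unpack and compile steps from above can be wrapped into a small helper so that further releases are quick to test. This is a sketch only: the function names are made up, and it assumes that other releases follow the same download URL pattern as the 3.2.1 tarball used above.

```shell
# Build the release tarball URL for a given htop version.
# Assumption: every release uses the same URL layout as 3.2.1.
htop_tarball_url() {
    printf 'https://github.com/htop-dev/htop/releases/download/%s/htop-%s.tar.xz\n' "$1" "$1"
}

# Download, unpack and compile one htop version in the current directory.
# The resulting binary ends up in ./htop-<version>/htop.
build_htop() {
    local version=$1 url
    url=$(htop_tarball_url "$version")
    wget -q "$url" &&
        tar -xf "htop-${version}.tar.xz" &&
        ( cd "htop-${version}" && ./autogen.sh && ./configure && make )
}
```

For example, build_htop 3.2.1 followed by ./htop-3.2.1/htop reproduces the test above; the same call with another version number builds that release next to it for a side-by-side comparison.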
This looks pretty much like a regression to me and I opened up issue #1332. Hopefully this is confirmed and fixed soon, but even then, it might take quite some time until the upstream fix makes it into the Debian repositories.