Regression found in latest Varnish Enterprise (varnish-plus) 6.0.16r2


Updating Varnish, whether the Open Source project "Varnish Cache" or the Enterprise Edition "Varnish Enterprise" (requires a license), has been a no-brainer in past years. I have encountered very few issues; in 99% of all cases Varnish just continues to work like a champ after an update.

Stumbled on a regression in 6.0.16r2

But there's that 1% that even the best software can run into. When I saw the changelog of 6.0.16r2 I thought "cool, that sounds like a great new feature". Being able to handle caching of objects (URLs) without a defined TTL or grace time can be helpful in certain situations.

After 6.0.16r2 was installed on a new testing environment, it didn't take long until the disk space on this Varnish server was fully used. Disk space? Wasn't Varnish supposed to be ultra-fast and keep the objects in memory using the MSE memory storage? Yes - of course.

The alerts from monitoring forced me to investigate, only to find out that the Varnish child process frequently died and created crash dumps (through apport) - which in turn filled up the /var file system.
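
To see how much of the file system is taken up by crash dumps, something like the following could be used. This is only a minimal sketch: it assumes that apport writes its dumps to /var/crash (the usual default on Ubuntu), and the function names are made up for this example.

```python
import os
import shutil


def crash_dump_usage(crash_dir):
    """Return (file_count, total_bytes) for regular files directly in crash_dir."""
    count = 0
    total = 0
    with os.scandir(crash_dir) as entries:
        for entry in entries:
            # Only count regular files; skip subdirectories and symlinks.
            if entry.is_file(follow_symlinks=False):
                count += 1
                total += entry.stat(follow_symlinks=False).st_size
    return count, total


def filesystem_used_percent(path):
    """Return the used space of the file system containing path, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    crash_dir = "/var/crash"  # assumption: apport's default dump directory
    if os.path.isdir(crash_dir):
        files, size = crash_dump_usage(crash_dir)
        print(f"{files} crash files, {size / 1024 ** 2:.1f} MiB in {crash_dir}")
        print(f"file system usage: {filesystem_used_percent(crash_dir):.1f}%")
```

If the number of crash files keeps growing between two runs, the process is still crashing and the file system will eventually fill up again.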

Thanks to the check_varnish monitoring plugin, to which I have contributed several times, we could quickly confirm that a lot of crashes (counter MGT.child_died) were indeed happening inside Varnish.
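
The MGT.child_died counter can also be read without the plugin. The sketch below is not how check_varnish does it; it simply parses the JSON output of `varnishstat -j`. It assumes the flat JSON layout (counter names as top-level keys) and also accepts a nested "counters" object, which newer Varnish Cache releases use as far as I know. The helper names are hypothetical.

```python
import json
import shutil
import subprocess


def read_counter(stats, name):
    """Return a counter value from parsed `varnishstat -j` output, or None."""
    # Assumption: either a flat layout or a nested "counters" object.
    nested = stats.get("counters")
    counters = nested if isinstance(nested, dict) else stats
    entry = counters.get(name)
    if not isinstance(entry, dict) or "value" not in entry:
        return None
    return int(entry["value"])


def crashes_since(previous, current):
    """Return how many child crashes happened between two counter readings."""
    # A lower value than before means the counter restarted from zero.
    if current < previous:
        return current
    return current - previous


if __name__ == "__main__":
    # Only call varnishstat if it is actually installed on this machine.
    if shutil.which("varnishstat"):
        raw = subprocess.run(
            ["varnishstat", "-j"], capture_output=True, text=True, check=True
        ).stdout
        print("MGT.child_died =", read_counter(json.loads(raw), "MGT.child_died"))
```

Comparing two readings with crashes_since() gives the number of crashes within the check interval, which is more useful for alerting than the absolute counter value.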

A closer look at the logs revealed recurring crashes, showing these errors and backtrace:

Sep 15 11:32:57 varnish varnishd[2281691]: Child (2438673) died signal=6 (core dumped)
Sep 15 11:32:57 varnish varnishd[2281691]: Child (2438673) Panic at: Mon, 15 Sep 2025 09:32:57 GMT
Assert error in ban_start_list_try_remove_oc(), cache/cache_ban.c line 397:
  Condition(bsl->refcount > 0) not true.

version = varnish-plus-6.0.16r2 revision c8612d31fee299b8c525c235a3dd6ab37b6e60e8, vrt api = 6011.0
ident = Linux,6.14.0-1012-aws,x86_64,-junix,-smse,-hcritbit,epoll
now = 248868.331797 (mono), 1757928776.468119 (real)
Backtrace:
  ip=0x5b44872195bd sp=0x7e5f4e3f6e10 <VBT_format+0x2d>
  ip=0x5b448711025b sp=0x7e5f4e3f6e30 <pan_ic+0x25b>
  ip=0x5b448720aee9 sp=0x7e5f4e3f6f40 <VAS_Fail_Dump+0x19>
  ip=0x5b448720af13 sp=0x7e5f4e3f6f50 <VAS_Fail+0x13>
  ip=0x5b44870d6306 sp=0x7e5f4e3f6f60 <BAN_DestroyObj+0x246>
  ip=0x5b448710637c sp=0x7e5f4e3f6f90 <HSH_DerefObjCoreUnlock+0x12c>
  ip=0x5b44872570e2 sp=0x7e5f4e3f6ff0 <exp_derefobjcore.isra.0+0x92>
  ip=0x5b44870efd7e sp=0x7e5f4e3f7020 <exp_thread+0x47e>
  ip=0x5b448714641a sp=0x7e5f4e3f70a0 <wrk_bgthread_run+0x11a>
  ip=0x5b4487146598 sp=0x7e5f4e3f7c20 <wrk_bgthread+0x68>
  ip=0x7e5f6409caa4 sp=0x7e5f4e3f7c50 <pthread_condattr_setpshared+0x684>
  ip=0x7e5f64129c3c sp=0x7e5f4e3f7d00 <__clone+0x24c>
addr = (nil),
thread = (cache-exp)
thr.req = (nil) {
},
thr.busyobj = (nil) {
},
vmods = {
  std = {Varnish Plus 6.0.16r2 c8612d31fee299b8c525c235a3dd6ab37b6e60e8, 0.0},
  stale = {Varnish Plus 6.0.16r2 c8612d31fee299b8c525c235a3dd6ab37b6e60e8, 0.0},
},
vge = 0x7e5f64913000 {
  epitaphs = 0/3,
},
Sep 15 11:32:57 varnish varnishd[2281691]: Child cleanup complete
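
To check whether all crashes share the same cause, the log can be summarized by signal and by failed assertion. The following is a rough sketch that relies only on the two line formats visible in the excerpt above ("Child (PID) died signal=N" and "Assert error in FUNCTION(), FILE line N:"); the function name is made up for this example.

```python
import re
from collections import Counter

# Patterns derived from the log excerpt above.
CHILD_DIED = re.compile(r"Child \((\d+)\) died signal=(\d+)")
ASSERT_ERROR = re.compile(r"Assert error in (\w+)\(\), (\S+) line (\d+):")


def summarize_panics(lines):
    """Count child deaths per signal and assert errors per (function, file, line)."""
    signals = Counter()
    asserts = Counter()
    for line in lines:
        died = CHILD_DIED.search(line)
        if died:
            signals[int(died.group(2))] += 1
            continue
        failed = ASSERT_ERROR.search(line)
        if failed:
            key = (failed.group(1), failed.group(2), int(failed.group(3)))
            asserts[key] += 1
    return signals, asserts


if __name__ == "__main__":
    sample = [
        "Sep 15 11:32:57 varnish varnishd[2281691]: Child (2438673) died signal=6 (core dumped)",
        "Assert error in ban_start_list_try_remove_oc(), cache/cache_ban.c line 397:",
    ]
    signals, asserts = summarize_panics(sample)
    print(dict(signals))
    print(dict(asserts))
```

If every crash maps to the same function and line, as it did here, that is a strong hint for a single bug rather than several unrelated problems.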

Cache invalidation using BAN triggers the regression

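In the affected environment we use bans (BAN) as the cache invalidation method. This seems to have triggered the assert error and therefore the regression. It fits the panic above: the failed assertion is in ban_start_list_try_remove_oc() in cache/cache_ban.c, and the backtrace passes through BAN_DestroyObj on the expiry thread (cache-exp).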

It didn't take long until my support request was answered by the Varnish team. After collecting data, they were quickly able to reproduce the crash. As always, it's fantastic to work together with the ladies and gentlemen from Varnish Software - even though the reason wasn't such a "joy" this time ;-).

As a result, the release 6.0.16r2 was pulled from their package repositories (hosted on packagecloud). A bug fix release is already in the works.

