I hit a very strange problem today. After a server running openSUSE 13.2 (yes, a server, don't ask...) was patched and rebooted overnight, our Icinga monitoring started sending alerts for a special mount point:
DISK CRITICAL - /special is not accessible: Permission denied
It is special because this path (/special) is a CIFS/Samba mount:
opensuse:~ # mount | grep /special
//sambaserver.example.local/special on /special type cifs (rw,...)
The permissions on /special were correct (0755), so the "nagios" user should have been able to access the mount point. But check_disk failed even when run as root:
opensuse:~ # /usr/lib/nagios/plugins/check_disk -w 10% -c 5% -p /special ; date
DISK CRITICAL - /special is not accessible: Permission denied
Thu Dec 10 08:14:04 CET 2015
In systemd's journal I found the following errors:
opensuse:~ # journalctl --system | grep "Dec 10 08:14:04"
Dec 10 08:14:04 opensuse kernel: CIFS VFS: Error -13 sending data on socket to server
At first I suspected that one of the installed patches had introduced a bug. But a copy of the same check_disk binary ran just fine from /tmp:
opensuse:~ # cp /usr/lib/nagios/plugins/check_disk /tmp/
opensuse:~ # /tmp/check_disk -w 15% -c 5% -p /special
DISK OK - free space: /special 64882412 MB (49% inode=-);| /special=66640360MB;111794356;124946633;0;131522772
I launched check_disk from several other directories and it worked from all of them, except the one that matters: /usr/lib/nagios/plugins.
By grepping for "plugins" in /etc/*, I came across the following result:
/etc/apparmor.d/usr.lib.nagios.plugins.check_disk:2:/usr/lib/nagios/plugins/check_disk {
/etc/apparmor.d/usr.lib.nagios.plugins.check_disk:7: /usr/lib/nagios/plugins/check_disk rm,
AppArmor!
The profile's file name encodes the full path to the executable, with the slashes replaced by dots. The file itself looks like this:
opensuse:~ # cat /etc/apparmor.d/usr.lib.nagios.plugins.check_disk
#include
/usr/lib/nagios/plugins/check_disk {
#include
#include
/etc/mtab r,
@{PROC}/[0-9]*/mounts r,
/usr/lib/nagios/plugins/check_disk rm,
}
This AppArmor profile is attached to the full path of the check_disk executable, so it only confines a binary launched from exactly that path. That explains why check_disk failed in /usr/lib/nagios/plugins but worked from anywhere else, and it fits the kernel message: error -13 is EACCES (permission denied).
As a quick fix, I disabled the profile by moving it into the disable directory and then restarting AppArmor:
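To confirm that AppArmor is really the one saying no, it can help to look for its denial records. The following is a minimal sketch, assuming the denials are logged in the usual audit format with apparmor="DENIED" and profile="..." fields. Where those records end up (the audit log or the kernel log) depends on whether auditd is running, and the function name apparmor_denials is made up for this example.

```shell
#!/usr/bin/env bash
# Print AppArmor denial records for one profile from a log stream on stdin.
# Usage: apparmor_denials <profile-name> < logfile
# (apparmor_denials is a hypothetical helper, not part of AppArmor.)
apparmor_denials() {
    local profile="$1"
    # -F treats the patterns as fixed strings, so slashes and dots in the
    # profile name are matched literally.
    grep -F 'apparmor="DENIED"' | grep -F "profile=\"${profile}\""
}
```

Possible usage, depending on where the system logs audit events: `apparmor_denials /usr/lib/nagios/plugins/check_disk < /var/log/audit/audit.log` or `journalctl -k | apparmor_denials /usr/lib/nagios/plugins/check_disk`.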
opensuse:~ # mv /etc/apparmor.d/usr.lib.nagios.plugins.check_disk /etc/apparmor.d/disable/
opensuse:~ # systemctl restart apparmor
Afterwards check_disk was working again:
opensuse:~ # /usr/lib/nagios/plugins/check_disk -w 10% -c 5% -p /special
DISK OK - free space: /special 64881742 MB (49% inode=-);| /special=66641030MB;118370494;124946633;0;131522772
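Moving the file and restarting AppArmor works, but it reloads every profile on the system. A narrower approach might be to leave the file in place, symlink it into the disable directory and unload only that one profile with apparmor_parser -R, which is roughly what aa-disable from the AppArmor utilities does if it is installed. Below is a sketch of that idea. PROFILE_DIR, DRY_RUN and the function names are made up for this example, and by default it only prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: disable a single AppArmor profile, identified by executable path.
# PROFILE_DIR and DRY_RUN are illustrative knobs, not AppArmor settings.
PROFILE_DIR="${PROFILE_DIR:-/etc/apparmor.d}"
DRY_RUN="${DRY_RUN:-1}"

# /usr/lib/nagios/plugins/check_disk -> usr.lib.nagios.plugins.check_disk
# (the naming convention seen for the packaged profiles above)
profile_file_for() {
    local exe="${1#/}"            # drop the leading slash
    printf '%s\n' "${exe//\//.}"  # replace the remaining slashes with dots
}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

disable_profile() {
    local file
    file="$(profile_file_for "$1")"
    # A symlink in disable/ keeps the profile from being loaded again later.
    run ln -s "${PROFILE_DIR}/${file}" "${PROFILE_DIR}/disable/${file}"
    # -R removes the already loaded profile from the kernel.
    run apparmor_parser -R "${PROFILE_DIR}/${file}"
}
```

Calling `disable_profile /usr/lib/nagios/plugins/check_disk` prints the two commands. With DRY_RUN=0 it would run them, which requires root.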
But why did it work until the system was rebooted?
When the monitoring-plugins package was installed (on December 8th), the AppArmor profile files came with it:
opensuse:~ # stat /etc/apparmor.d/usr.lib.nagios.plugins.check_* | egrep "File|Change"
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_dhcp’
Change: 2015-12-08 12:43:54.558360191 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_icmp’
Change: 2015-12-08 12:43:54.837370296 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_ide_smart’
Change: 2015-12-08 12:43:54.928373593 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_load’
Change: 2015-12-08 12:43:54.962374824 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_ntp_time’
Change: 2015-12-08 12:43:55.152381708 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_ping’
Change: 2015-12-08 12:43:55.266385838 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_procs’
Change: 2015-12-08 12:43:55.296386925 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_ssh’
Change: 2015-12-08 12:43:55.388390258 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_swap’
Change: 2015-12-08 12:43:55.439392106 +0100
File: ‘/etc/apparmor.d/usr.lib.nagios.plugins.check_users’
Change: 2015-12-08 12:43:55.571396888 +0100
2015-12-08 12:43 is exactly when the monitoring-plugins package was installed with zypper. However, AppArmor was not restarted at that time, as journalctl shows:
opensuse:~ # journalctl -u apparmor.service
[...]
-- Reboot --
Oct 26 09:56:17 opensuse boot.apparmor[798]: Starting AppArmor ..done
Oct 26 10:16:32 opensuse boot.apparmor[2839]: Unloading AppArmor profiles ..done
-- Reboot --
Oct 26 10:16:50 opensuse boot.apparmor[901]: Starting AppArmor ..done
Dec 09 20:09:19 opensuse boot.apparmor[22683]: Unloading AppArmor profiles ..done
-- Reboot --
Dec 09 20:12:16 opensuse boot.apparmor[1019]: Starting AppArmor ..done
Dec 10 10:32:22 opensuse boot.apparmor[3016]: Unloading AppArmor profiles ..done
Dec 10 10:32:23 opensuse boot.apparmor[3032]: Starting AppArmor ..done
AppArmor only loaded the new profiles when the server booted up on December 9th at 20:12. From that moment on, the check_disk plugin stopped working on /special.
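A surprise like this could probably be spotted before the next reboot by comparing the profiles on disk with the ones the kernel has actually loaded. Here is a rough sketch, assuming the loaded profiles are listed one per line as "name (mode)" in /sys/kernel/security/apparmor/profiles (readable by root), and that a profile starts with a line of the form "/path/to/binary {" as in the file shown earlier. Profiles declared in another way (named profiles, flags before the brace) are not detected by this simplification, and the function name unloaded_profiles is made up.

```shell
#!/usr/bin/env bash
# List profiles that are defined in a profile directory but missing from the
# kernel's list of loaded profiles.
# Usage: unloaded_profiles [profile_dir] [loaded_list]
unloaded_profiles() {
    local dir="${1:-/etc/apparmor.d}"
    local loaded="${2:-/sys/kernel/security/apparmor/profiles}"
    local file name

    for file in "$dir"/*; do
        [ -f "$file" ] || continue   # skip subdirectories such as disable/
        # Simplification: only lines like "/path/to/binary {" count as
        # profile headers; print the path part of each such line.
        sed -n 's|^\(/[^ ]*\)[[:space:]]*{[[:space:]]*$|\1|p' "$file" |
        while IFS= read -r name; do
            # A loaded profile appears as "name (mode)"; compare field 1.
            awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$loaded" ||
                printf '%s\n' "$name"
        done
    done
}
```

Run as root, `unloaded_profiles` prints the profiles that would only take effect after the next AppArmor restart or reboot. A single profile could then be loaded deliberately with `apparmor_parser -r` on its file and tested before it bites.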
Instead of just ranting here, I will contact the maintainer of the monitoring-plugins package (I installed it from the zypper repository server:monitoring) and ask for a correction of the AppArmor profile. Information about building and maintaining this package (in this case monitoring-plugins-disk) can be found here: Monitoring Plug-Ins on openSUSE Build. According to Request 341451, the AppArmor profiles were added on August 1, 2015:
+-------------------------------------------------------------------
+Sat Aug 1 19:09:11 UTC 2015 - lars@linux-schulserver.de
+
+- add apparmor profiles for the following checks:
+ + check_disk
+ + check_load
+ + check_procs
+ + check_swap
+ + check_users