Monitoring plugin check_smart 6.12.1 released: Security fix, NVMe perfdata fix, Erase_Fail_Count_Total


Published on December 10th 2021 - last updated on December 10th 2021 - Listed in Hardware Monitoring Security


A new version of check_smart, an open source monitoring plugin to monitor the health of hard drives, solid state drives and NVMe drives, is now available!

Release 6.12.0 adds a couple of important changes to the plugin, and 6.12.1 fixes a regression that 6.12.0 introduced. All check_smart users are encouraged to update to 6.12.1 as soon as possible.

Security fix in trailing path of pseudo-devices

The plugin supports so-called pseudo-devices. These devices are (in most cases) physical drives "hiding" behind a RAID controller. Depending on the controller, the kernel then presents the drives under a path such as /dev/bus/N.

When support for pseudo-devices was added, a security vulnerability was introduced along with it. This gave check_smart the "honour" of its own CVE (CVE-2021-42257). However, the security fix in version 6.9.1 only covered part of the vulnerability. After discussions with Wolfgang Frisch from SUSE and John Runyon, an additional vulnerability was found in the trailing path of pseudo-devices. By appending a trailing string to the pseudo-device path, an attacker could break out of the plugin and execute additional commands with sudo privileges:

$ sudo ./check_smart.pl -d '/dev/bus/1 >/dev/null 2>&1; whoami' -i auto
root
UNKNOWN: Drive  S/N : |

The trailing path is now checked as well, and the plugin rejects the manipulated device with the following output:

$ sudo ./check_smart.pl -d '/dev/bus/1 >/dev/null 2>&1; whoami' -i auto
Could not find any valid block/character special device for device /dev/bus/1 >/dev/null 2>&1; whoami  !
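
To illustrate the general idea behind this kind of fix, here is a minimal sketch in Python. It is not the code used by check_smart (the plugin is written in Perl). The function names and the assumption that a pseudo-device is exactly /dev/bus/ followed by digits are mine, for illustration only. The sketch matches the whole device string against a strict pattern and builds the command as an argument list, so no shell ever interprets the input:

```python
import re
from typing import List

# Assumption for this sketch: a pseudo-device is "/dev/bus/" plus digits, nothing else.
PSEUDO_DEVICE = re.compile(r"/dev/bus/[0-9]+")


def is_valid_pseudo_device(device: str) -> bool:
    """Return True only if the entire string is a plain /dev/bus/N path."""
    # fullmatch() rejects anything trailing the path, e.g. "; whoami".
    return PSEUDO_DEVICE.fullmatch(device) is not None


def build_smartctl_command(device: str) -> List[str]:
    """Build an argument list for smartctl, or raise on a suspicious device."""
    if not is_valid_pseudo_device(device):
        raise ValueError(f"not a valid pseudo-device: {device!r}")
    # A list like this can be handed to subprocess.run() without shell=True,
    # so the device is passed as a single argument and never parsed by a shell.
    return ["smartctl", "-a", device]
```

With this check, the manipulated device string from the example above raises a ValueError instead of reaching a shell, while a plain /dev/bus/1 passes.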

Added Erase_Fail_Count_Total to default raw list

In issue #73, additional health monitoring of Samsung SSDs was discussed. This led to further research on Samsung SSDs, and an official Samsung document revealed four important ATA attributes:

The four SMART attributes listed in the table below are the most important indicators of drive health. If any of the normalized values drop below the 10% threshold, it’s recommended to replace the drive as soon as possible because it’s approaching the end of its life and may become unreliable if used longer.

179 Unused Reserved Block Count (Used_Rsvd_Blk_Cnt_Tot)
181 Program Fail Count (Program_Fail_Cnt_Total)
182 Erase Fail Count (Erase_Fail_Count_Total)
183 Runtime Bad Count (Runtime_Bad_Block)

The attributes Program_Fail_Cnt_Total and Runtime_Bad_Block were already part of the default raw list. The Erase_Fail_Count_Total attribute has now been added to it as well.
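
As a rough illustration of what checking raw values means, the following Python sketch scans an ATA attribute table in the column layout printed by smartctl -A and reports watched attributes with a non-zero raw value. This is not the plugin's implementation. The sample table is made up for this example, and treating any raw value above zero as noteworthy is an assumption of the sketch:

```python
# Attributes named in this article as part of the default raw list.
WATCHED = {"Program_Fail_Cnt_Total", "Erase_Fail_Count_Total", "Runtime_Bad_Block"}

# Made-up sample in the smartctl -A column layout (not from a real drive).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       3
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
"""


def nonzero_raw_values(smartctl_output: str) -> dict:
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


print(nonzero_raw_values(SAMPLE))
```

In the made-up sample, only Erase_Fail_Count_Total carries a non-zero raw value, so it is the only attribute reported.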

Bugfix in NVMe performance data

Where a human codes, there might be errors. Unfortunately, this happened when check_smart 6.11.0 was released. The "handling dots in attribute names" request introduced a regression that stripped the labels from the performance data on NVMe drives, leaving only the bare values:

# /usr/lib/nagios/plugins/check_smart.pl -d /dev/nvme1n1 -i nvme
OK: Drive  UCS-SDHPCIE 800GB S/N XXX: no SMART errors detected. |=0x00 =42 =100 =10 =0 =242 =2913064 =12586 =13282120 =26 =57 =4140 =44 =0 =0

Unfortunately, I did not test the suggested code change properly (I did not have any NVMe devices at hand back then), which is how the regression slipped in. Sorry!

Version 6.12.0 now fixes the regression and the performance data are back for NVMe drives:

# /usr/lib/nagios/plugins/check_smart.pl -d /dev/nvme1n1 -i nvme
OK: Drive  UCS-SDHPCIE 800GB S/N XXX: no SMART errors detected. |Temperature=42 Available_Spare=100 Available_Spare_Threshold=10 Percentage_Used=0 Data_Units_Read=242 Data_Units_Written=2913064 Host_Read_Commands=12586 Host_Write_Commands=13282120 Controller_Busy_Time=26 Power_Cycles=57 Power_On_Hours=4141 Unsafe_Shutdowns=44 Media_and_Data_Integrity_Errors=0 Error_Information_Log_Entries=0
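
A regression like this is easy to catch with a small check on the plugin output. The following Python sketch is my own illustration and not part of check_smart. It splits the performance data after the "|" separator into label/value pairs and returns every value that has lost its label. It assumes simple, unquoted labels without spaces, which holds for the output shown above:

```python
from typing import List, Tuple


def split_perfdata(plugin_output: str) -> List[Tuple[str, str]]:
    """Return (label, value) pairs from the part after the first '|'."""
    _, _, perfdata = plugin_output.partition("|")
    pairs = []
    for token in perfdata.split():
        # "Temperature=42" -> ("Temperature", "42"); "=42" -> ("", "42")
        label, _, value = token.partition("=")
        pairs.append((label, value))
    return pairs


def unlabeled_values(plugin_output: str) -> List[str]:
    """Return the values whose perfdata label is empty."""
    return [value for label, value in split_perfdata(plugin_output) if label == ""]


broken = "OK: no SMART errors detected. |=0x00 =42 =100"
fixed = "OK: no SMART errors detected. |Temperature=42 Available_Spare=100"
print(unlabeled_values(broken))
print(unlabeled_values(fixed))
```

The two shortened strings mirror the broken and the fixed output from above. Running such a check against a real NVMe drive before a release would have flagged the empty labels.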

Regression in 6.12.0: Invalid interface

Unfortunately, 6.12.0 introduced yet another regression. Interfaces with an additional comma-separated parameter (for example -i megaraid,1) are rejected by the plugin and the following error message is shown:

# ./check_smart.pl -d /dev/sda -i megaraid,14
invalid interface megaraid,14 for /dev/sda!
[...]

This is fixed in release 6.12.1, which was also released today (December 10th, 2021).
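
The difficulty here is that interface validation has to be strict enough to block injected commands, yet still accept a parameter after the comma. A minimal Python sketch of such a check could look like the following. It is not the plugin's code, and the list of accepted interfaces is deliberately limited to the ones mentioned in this article, which is an assumption of the sketch:

```python
import re

# Assumption: illustrative subset, only the interfaces named in this article.
PLAIN_INTERFACES = {"auto", "nvme"}
# "megaraid" must be followed by a comma and a numeric drive index.
INDEXED_INTERFACE = re.compile(r"megaraid,[0-9]+")


def is_valid_interface(interface: str) -> bool:
    """Accept plain interfaces and megaraid,N; reject everything else."""
    if interface in PLAIN_INTERFACES:
        return True
    # fullmatch() ensures nothing may follow the numeric index.
    return INDEXED_INTERFACE.fullmatch(interface) is not None
```

In this sketch, megaraid,14 is accepted, while an incomplete megaraid, or a value with trailing shell syntax is refused.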
