check_mssql_health: Monitoring MSSQL with SQL query and negative thresholds

Written by - 0 comments

Published on - Listed in Windows Monitoring Database DB


When monitoring a MSSQL server, my first choice is the monitoring plugin check_mssql_health by Gerhard Lausser. By using its different modes, the most important checks are already pre-defined and ready to use. 

The plugin also supports to run SQL queries on the target MSSQL server. And this is actually what I needed to do to collect performance data from a specially created database (which itself collects statistics from the SQL instance).

But first a SQL query needs to be encoded so check_mssql_health can understand it:

$ echo 'SELECT TOP (1) [buffer_cache_hit] FROM [PerformanceDB].dbo.tblPerformanceSampling ORDER BY [timestamp] DESC;' | /usr/lib/nagios/plugins/check_mssql_health --mode encode
SELECT%20TOP%20%281%29%20%5Bbuffer%5Fcache%5Fhit%5D%20FROM%20%5BPerformanceDB%5D%2Edbo%2EtblPerformanceSampling%20ORDER%20BY%20%5Btimestamp%5D%20DESC%3B

Now the plugin can be run with mode "sql" followed by the encrypted SQL query as "name" parameter:

$ /usr/lib/nagios/plugins/check_mssql_health --server="mssqlserver\instance001" --port=1433 --username="sql_monitoring" --password=supersecret --mode=sql --name="SELECT%20TOP%20%281%29%20%5Bbuffer%5Fcache%5Fhit%5D%20FROM%20%5BPerformanceDB%5D%2Edbo%2EtblPerformanceSampling%20ORDER%20BY%20%5Btimestamp%5D%20DESC%3B" --commit --warning=99 --critical=98
CRITICAL - select top (1) [buffer_cache_hit] from [PerformanceDB].dbo.tblperformancesampling order by [timestamp] desc;: 100.000000 | 'select'=100.00;99;98;;

So far so good. The SQL query worked and it read a result from the database: 100. But as this is a value of the buffer_cache_hit, everything below 100 is not good.

Note: The plugin allows to check the buffer hit ratio directly (without having to create SQL queries) using "--mode mem-pool-data-buffer-hit-ratio". But in this particular scenario, the DBA didn't want to give my monitoring user the needed privileges on the MSSQL server... Don't ask for details.

There's however a possibility to tell the plugin to use "negative thresholds" by using colons after the thresholds (source):

$ /usr/lib/nagios/plugins/check_mssql_health --server="mssqlserver\instance001" --port=1433 --username="sql_monitoring" --password=supersecret --mode=sql --name="SELECT%20TOP%20%281%29%20%5Bbuffer%5Fcache%5Fhit%5D%20FROM%20%5BPerformanceDB%5D%2Edbo%2EtblPerformanceSampling%20ORDER%20BY%20%5Btimestamp%5D%20DESC%3B" --commit --warning=99: --critical=98:
OK - select top (1) [buffer_cache_hit] from [PerformanceDB].dbo.tblperformancesampling order by [timestamp] desc;: 100.000000 | 'select'=100.00;99:;98:;;

Now the plugin will return a WARNING if the result from the SQL query is 99 and CRITICAL if the result is 98 or below.


Add a comment

Show form to leave a comment

Comments (newest first)

No comments yet.