Monitoring plugin check_netio 1.6 released: Limit tcp statistic output in performance data


Published on September 4th 2020 - Listed in Monitoring Linux Network


A new version of check_netio, a monitoring plugin to check network interfaces and their statistics (such as input/output and errors) on Linux, is available!

Release 1.6 introduces a new parameter, -r, which is used in combination with the existing -t parameter (which collects additional TCP statistics).

The problem with the -t parameter was that the performance data output could become very large, given the many TCP statistics found in /proc/net/netstat:

$ /usr/lib/nagios/plugins/check_netio.sh -i eth0 -t
NETIO OK - eth0: Receive 23284649421956 Bytes, Transmit 24036117884475 Bytes|NET_eth0_RX=23284649421956B;;;; NET_eth0_TX=24036117884475B;;;; NET_eth0_ERR_RX=0;;;; NET_eth0_ERR_TX=0;;;; NET_eth0_DROP_RX=236;;;; NET_eth0_DROP_TX=0;;;; SyncookiesSent=0;;;; SyncookiesRecv=0;;;; SyncookiesFailed=10708;;;; EmbryonicRsts=11;;;; PruneCalled=1132207;;;; RcvPruned=57163;;;; OfoPruned=210;;;; OutOfWindowIcmps=1756;;;; LockDroppedIcmps=0;;;; ArpFilter=0;;;; TW=8844650;;;; TWRecycled=0;;;; TWKilled=3966390849;;;; PAWSPassive=265;;;; PAWSActive=0;;;; PAWSEstab=69;;;; DelayedACKs=145526473;;;; DelayedACKLocked=86600;;;; DelayedACKLost=15639;;;; ListenOverflows=0;;;; ListenDrops=0;;;; TCPPrequeued=2257462558;;;; TCPDirectCopyFromBacklog=1634542275;;;; TCPDirectCopyFromPrequeue=910314455124;;;; TCPPrequeueDropped=0;;;; TCPHPHits=6748083892;;;; TCPHPHitsToUser=77529122;;;; TCPPureAcks=18521820705;;;; TCPHPAcks=2042634393;;;; TCPRenoRecovery=0;;;; TCPSackRecovery=23122;;;; TCPSACKReneging=157;;;; TCPFACKReorder=9886;;;; TCPSACKReorder=5568;;;; TCPRenoReorder=0;;;; TCPTSReorder=18825;;;; TCPFullUndo=19886;;;; TCPPartialUndo=95443;;;; TCPDSACKUndo=94;;;; TCPLossUndo=532376;;;; TCPLoss=6206;;;; TCPLostRetransmit=117;;;; TCPRenoFailures=0;;;; TCPSackFailures=629;;;; TCPLossFailures=241;;;; TCPFastRetrans=78667;;;; TCPForwardRetrans=3195;;;; TCPSlowStartRetrans=11752;;;; TCPTimeouts=795630;;;; TCPRenoRecoveryFail=0;;;; TCPSackRecoveryFail=105;;;; TCPSchedulerFailed=19;;;; TCPRcvCollapsed=11304426;;;; TCPDSACKOldSent=15653;;;; TCPDSACKOfoSent=4;;;; TCPDSACKRecv=30684;;;; TCPDSACKOfoRecv=2;;;; TCPAbortOnData=1363337089;;;; TCPAbortOnClose=104580;;;; TCPAbortOnMemory=0;;;; TCPAbortOnTimeout=11042;;;; TCPAbortOnLinger=0;;;; TCPAbortFailed=0;;;; TCPMemoryPressures=0;;;; TCPSACKDiscard=0;;;; TCPDSACKIgnoredOld=1;;;; TCPDSACKIgnoredNoUndo=23213;;;; TCPSpuriousRTOs=34;;;; TCPMD5NotFound=0;;;; TCPMD5Unexpected=0;;;; TCPSackShifted=39551;;;; TCPSackMerged=28792;;;; TCPSackShiftFallback=197072;;;; 
TCPBacklogDrop=1682;;;; TCPMinTTLDrop=0;;;; TCPOFOQueue=209658;;;; TCPOFODrop=22478;;;; TCPOFOMerge=4;;;; TCPChallengeACK=50266;;;; TCPSYNChallenge=49967;;;; BusyPollRxPackets=0;;;; TCPFromZeroWindowAdv=138540;;;; TCPToZeroWindowAdv=138540;;;; TCPWantZeroWindowAdv=3576590;;;; TCPACKSkippedSynRecv=0;;;; TCPACKSkippedPAWS=1;;;; TCPACKSkippedSeq=98514;;;; TCPACKSkippedFinWait2=0;;;; TCPACKSkippedTimeWait=0;;;; TCPACKSkippedChallenge=14;;;;
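For context, the number of counters comes from the layout of /proc/net/netstat, where each section is a pair of lines: one with the counter names and one with the values. The following is a minimal sketch of how such pairs can be turned into name=value tokens. The function name tcpext_pairs is hypothetical and this is not necessarily how check_netio.sh reads the file.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from check_netio.sh): print "Name=value"
# pairs for the TcpExt section of a netstat-style file.
tcpext_pairs() {
  awk '
    # First TcpExt line: remember the counter names by column.
    $1 == "TcpExt:" && !have_names {
      for (i = 2; i <= NF; i++) name[i] = $i
      have_names = 1
      next
    }
    # Second TcpExt line: pair each value with the name in the same column.
    $1 == "TcpExt:" && have_names {
      for (i = 2; i <= NF; i++) print name[i] "=" $i
      have_names = 0
    }
  ' "${1:-/proc/net/netstat}"
}

# Show the first few counters if the file is readable on this system.
if [ -r /proc/net/netstat ]; then
  tcpext_pairs | head -n 3
fi
```

Every counter in the TcpExt section ends up as one performance data entry, which explains the length of the output above.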

If check_netio is executed through NRPE, the output is truncated (due to an output limit in NRPE). Depending on where the output is cut, this can lead to performance data errors. See issue 11 for more details.
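To see whether a given plugin output is at risk of being cut, its size can be compared against the NRPE limit. The sketch below assumes a limit of 1024 bytes, the packet buffer size of classic NRPE v2 builds; newer NRPE versions may allow more, so treat the number as an assumption and adjust it. The function name fits_nrpe_limit is hypothetical.

```bash
#!/usr/bin/env bash
# Assumption: 1024 bytes, as in classic NRPE v2 builds. Adjust this to
# whatever your NRPE version and configuration actually allow.
NRPE_LIMIT=1024

# Hypothetical helper: returns 0 if the output fits within the limit,
# 1 otherwise. An optional second argument overrides the limit.
fits_nrpe_limit() {
  local output="$1" limit="${2:-$NRPE_LIMIT}"
  local bytes
  # wc -c counts bytes; the arithmetic expansion strips any padding
  # that some wc implementations add around the number.
  bytes=$(( $(printf '%s' "$output" | wc -c) ))
  [ "$bytes" -le "$limit" ]
}

# Demo with a short synthetic status line.
if fits_nrpe_limit "NETIO OK - eth0: Receive 1 Bytes, Transmit 1 Bytes"; then
  echo "output fits"
else
  echo "output would be truncated"
fi
```

In practice the first argument would be the captured output of the plugin, for example the result of running check_netio.sh with -t.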

The new -r parameter was added to address this. It expects a comma-separated list of strings. Each string is matched against the names of the TCP statistics as a regular expression. If a statistic matches, it is added to the performance data; otherwise it is skipped.
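The matching logic described above can be sketched roughly as follows. This is an illustration only, not the plugin's actual code: the function name filter_perfdata is hypothetical, the comma-separated list is simplified into a single alternation (loss,drop becomes loss|drop), and case-insensitive matching is an assumption based on the example in this post, where "loss" matches TCPLossUndo.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from check_netio.sh): keep only the
# perfdata tokens whose statistic name matches one of the patterns.
filter_perfdata() {
  # Turn the comma-separated list "loss,drop" into the regex "loss|drop".
  local regex="${1//,/|}"
  shift
  local token kept=()
  for token in "$@"; do
    # Match only the name (text before "="), ignoring case (assumption).
    if printf '%s\n' "${token%%=*}" | grep -Eiq -- "$regex"; then
      kept+=("$token")
    fi
  done
  # Join the surviving tokens with single spaces.
  printf '%s\n' "${kept[*]}"
}

filter_perfdata "loss,drop" "TCPLoss=6206;;;;" "TCPTimeouts=795630;;;;" "ListenDrops=0;;;;"
# prints: TCPLoss=6206;;;; ListenDrops=0;;;;
```

Matching against the name only avoids accidental hits on the value part of a token.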

Here's a practical example where only statistics matching "loss" or "drop" should show up in the performance data:

$ ./check_netio.sh -i enp5s0 -t -r "loss,drop"
NETIO OK - enp5s0: Receive 27690651157 Bytes, Transmit 13173160148 Bytes|NET_enp5s0_RX=27690651157B;;;; NET_enp5s0_TX=13173160148B;;;; NET_enp5s0_ERR_RX=0;;;; NET_enp5s0_ERR_TX=0;;;; NET_enp5s0_DROP_RX=0;;;; NET_enp5s0_DROP_TX=0;;;; LockDroppedIcmps=0;;;; ListenDrops=0;;;; TCPLossUndo=119;;;; TCPLossFailures=12;;;; TCPLossProbes=5241;;;; TCPLossProbeRecovery=2666;;;; TCPBacklogDrop=0;;;; PFMemallocDrop=0;;;; TCPMinTTLDrop=0;;;; TCPDeferAcceptDrop=0;;;; TCPReqQFullDrop=0;;;; TCPOFODrop=0;;;;

This output is now short enough for the NRPE server to send it in full to the remote NRPE check plugin.

Besides helping to cope with the NRPE output limit, this generally lets users decide for themselves which statistics are relevant and should be graphed over the long term.

