When I released the monitoring plugin check_es_store back in June this year, I didn't really think I'd need anything other than the disk check.
Yet a couple of days ago we hit a production downtime which was traced back to our Elasticsearch cluster running in the cloud. The reason: ES ran out of memory and started running garbage collection (GC), which slowed ES down a great deal.
Lesson learned. I added a memory usage check to the plugin. Because the plugin now does more than check the storage (hence "store" in the name), I renamed it to check_es_system (to check the underlying system).
To choose between the disk and the mem check, a new parameter (-t for checktype) was added.
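To illustrate what a memory check of this kind boils down to, here is a minimal Python sketch. It is not the plugin's actual code. It assumes the heap figures come from the Elasticsearch cluster stats response (fields `nodes.jvm.mem.heap_used_in_bytes` and `nodes.jvm.mem.heap_max_in_bytes`); the function name `check_heap` and the 80/95 percent thresholds are hypothetical and chosen only for this example.

```python
# Standard monitoring plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_heap(stats, warn_pct=80.0, crit_pct=95.0):
    """Return (exit_code, message) for JVM heap usage.

    `stats` is an already parsed cluster stats document (a dict).
    The thresholds are illustrative defaults, not the plugin's.
    """
    try:
        mem = stats["nodes"]["jvm"]["mem"]
        used = mem["heap_used_in_bytes"]
        maximum = mem["heap_max_in_bytes"]
    except (KeyError, TypeError):
        return UNKNOWN, "ES MEMORY UNKNOWN - heap figures missing from response"

    if maximum <= 0:
        return UNKNOWN, "ES MEMORY UNKNOWN - reported heap maximum is not positive"

    pct = 100.0 * used / maximum
    if pct >= crit_pct:
        return CRITICAL, f"ES MEMORY CRITICAL - heap usage at {pct:.1f}%"
    if pct >= warn_pct:
        return WARNING, f"ES MEMORY WARNING - heap usage at {pct:.1f}%"
    return OK, f"ES MEMORY OK - heap usage at {pct:.1f}%"


if __name__ == "__main__":
    # Hand-written sample data standing in for a real API response.
    sample = {"nodes": {"jvm": {"mem": {
        "heap_used_in_bytes": 900,
        "heap_max_in_bytes": 1000,
    }}}}
    code, message = check_heap(sample)
    print(message)
    raise SystemExit(code)
```

In a real check, the `stats` dict would be fetched from the cluster over HTTP rather than written by hand, and the warning and critical thresholds would be supplied by the person configuring the check.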
Please check out the documentation of the monitoring plugin check_es_system for more information.