Our monitoring informed me about an HTTP 500 error from a central reverse proxy running Nginx. Checking the error logs revealed the following issue:
2019/05/09 08:43:35 [crit] 25655#0: *524505514 open() "/usr/share/nginx/html/50x.html" failed (24: Too many open files)
2019/05/09 09:04:27 [alert] 28720#0: *59757 socket() failed (24: Too many open files) while connecting to upstream,
This basically means that the Nginx processes had too many files open, which could also be seen on the Nginx status page. Here is the graph from check_nginx_status.pl:
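Besides the graph, the number of file descriptors a worker currently holds open can also be counted directly on the command line (a quick sketch, assuming pgrep is available):
# ls /proc/$(pgrep -f 'nginx: worker' | head -n1)/fd | wc -l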
The default is set to a limit of 4096 files per (worker) process, which can be seen in /etc/default/nginx:
# cat /etc/default/nginx
# Note: You may want to look at the following page before setting the ULIMIT.
# Set the ulimit variable if you need defaults to change.
# Example: ULIMIT="-n 4096"
However, don't be fooled: changing this file doesn't help. Instead, the limit needs to be set in /etc/security/limits.conf:
# tail /etc/security/limits.conf
#@faculty hard nproc 50
#ftp hard nproc 0
#ftp - chroot /ftp
#@student - maxlogins 4
# Added Nginx limits
nginx soft nofile 30000
nginx hard nofile 50000
# End of file
Here a soft limit of 30k and a hard limit of 50k files are defined per nginx process.
Note: I first tried this with www-data (the user under which the Nginx workers run), but it didn't work, although a user name can be used as a "domain" in this config file...
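To verify which limits a particular user would actually get from limits.conf, a login shell can be simulated (just a sketch; whether a daemon started at boot picks these limits up additionally depends on PAM):
# su -s /bin/bash www-data -c 'ulimit -Sn; ulimit -Hn'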
Additionally, Nginx itself needs to be told how many files it may open. In the main config file /etc/nginx/nginx.conf add:
# head /etc/nginx/nginx.conf
# 2019-05-09 Increase open files
worker_rlimit_nofile 30000;
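Before the restart, the changed configuration can be checked for syntax errors:
# nginx -t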
After a service nginx restart, the limits of the worker processes can be checked:
# ps auxf | grep nginx
root 7027 0.0 0.3 103620 13348 ? Ss 09:21 0:00 nginx: master process /usr/sbin/nginx
www-data 7028 8.6 1.0 127900 40724 ? R 09:21 2:37 \_ nginx: worker process
www-data 7029 8.9 1.0 127488 40536 ? S 09:21 2:44 \_ nginx: worker process
www-data 7031 9.5 1.0 127792 40896 ? S 09:21 2:53 \_ nginx: worker process
www-data 7032 8.1 1.0 128472 41244 ? S 09:21 2:29 \_ nginx: worker process
# cat /proc/7028/limits | grep "open files"
Max open files 30000 30000 files
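To check all worker processes in one go, a small loop can be used (a sketch, again assuming pgrep is available):
# for pid in $(pgrep -f 'nginx: worker'); do grep 'open files' /proc/$pid/limits; done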
The "too many open files" errors disappeared from the Nginx logs after this change.
But what caused this sudden problem? As you can see in the graph above, the Writing (and Waiting) connections suddenly increased sharply. It turned out that an upstream server behind this reverse proxy was no longer working, and this particular virtual host received a lot of traffic, causing general slowness and holding files open while connections waited for a timeout from Nginx (504 in this case).
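As a possible mitigation for such cases, the timeouts towards the upstream could be lowered in the affected server or location block, so that connections to a broken upstream (and their file descriptors) are released faster. This is only a sketch with example values, not what was configured here:
# Fail faster when the upstream is down instead of waiting for the default 60s
proxy_connect_timeout 5s;
proxy_send_timeout    10s;
proxy_read_timeout    10s;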