Some servers not logging on remote syslog server (rejecting connection)


Published on March 11th 2014 - Listed in Linux

A set of servers (both virtual and physical) was configured to log to a syslog server running in an LXC (Linux Container).
I adapted all the syslog-ng config files to send certain logs to the remote syslog server, and after a syslog-ng restart the first log entries arrived on the syslog server.

But to my big surprise, only the LXCs were sending their logs to the LXC syslog server. The physical servers did not send anything.

I verified the connectivity with nc (netcat), telnet and tcpdump (in case you wonder about the choice of tools: the syslog server listens on TCP).
Here's the tcpdump output between a physical machine and the syslog server:

11:25:41.800587 IP (tos 0x0, ttl 64, id 32684, offset 0, flags [DF], proto TCP (6), length 60)
    physical-server.local.59783 > syslog-server.local.1000: Flags [S], cksum 0x78d5 (correct), seq 2384712082, win 14600, options [mss 1460,sackOK,TS val 334999102 ecr 0,nop,wscale 7], length 0
11:25:41.800634 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    syslog-server.local.1000 > physical-server.local.59783: Flags [S.], cksum 0x86f5 (incorrect -> 0xa494), seq 1047269898, ack 2384712083, win 14480, options [mss 1460,sackOK,TS val 23495371 ecr 334999102,nop,wscale 7], length 0
11:25:41.800818 IP (tos 0x0, ttl 64, id 32685, offset 0, flags [DF], proto TCP (6), length 52)
    physical-server.local.59783 > syslog-server.local.1000: Flags [.], cksum 0x0b7e (correct), seq 1, ack 1, win 115, options [nop,nop,TS val 334999102 ecr 23495371], length 0
11:25:41.801069 IP (tos 0x0, ttl 64, id 32686, offset 0, flags [DF], proto TCP (6), length 159)
    physical-server.local.59783 > syslog-server.local.1000: Flags [P.], cksum 0xba07 (correct), seq 1:108, ack 1, win 115, options [nop,nop,TS val 334999102 ecr 23495371], length 107

That looks about right. A connection is definitely established between the two (the handshake completes), and the physical server then sends 107 bytes of data.
I was focusing so much on a network-related issue that I completely forgot to check the logs on the syslog server itself.
One day later that idea finally came up. And I found the reason:

syslog syslog-ng[5549]: Number of allowed concurrent connections reached, rejecting connection; client='AF_INET(', local='AF_INET(', max='10'
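
For reference, a search along these lines would surface such rejections on the syslog server. This is only a sketch: the function name count_rejections is made up for this example, and the log path in the usage comment is an assumption, since the file that receives syslog-ng's own messages depends on the local configuration.

```shell
# count_rejections FILE
# Prints the number of lines in FILE in which syslog-ng reports that it
# rejected a client because the connection limit was reached.
count_rejections() {
    grep -c 'Number of allowed concurrent connections reached' "$1"
}

# Usage (the path is an assumption, adjust it to the local setup):
#   count_rejections /var/log/syslog
```

Any result above zero means clients are being turned away even though the TCP handshake itself succeeds.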

So this was never a networking issue but a default limit: syslog-ng, acting as syslog server, accepts at most 10 concurrent connections unless configured otherwise.
Although the max-connections value is not set anywhere in a default syslog-ng config, it can of course be increased.
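
To judge how close a server is to that limit, one could count the established connections on the listening port. The sketch below assumes the usual column layout of "ss -tn" (state in the first column, local address:port in the fourth); the function name count_established is hypothetical, and port 1000 is the one used in this setup.

```shell
# count_established PORT
# Reads "ss -tn" output on stdin and prints how many connections are in
# state ESTAB with PORT as the local port.
count_established() {
    awk -v port="$1" '
        $1 == "ESTAB" && $4 ~ (":" port "$") { n++ }
        END { print n + 0 }
    '
}

# Usage on the syslog server:
#   ss -tn | count_established 1000
```

If the number printed equals the max value from the rejection message, any further client will be refused.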

My config line before:

source s_net { tcp(ip( port(1000)); };

I added max-connections and so_keepalive values:

source s_net { tcp(ip( port(1000) max-connections(50) so_keepalive(yes)); };

A restart of syslog-ng then gave the following output:

/etc/init.d/syslog-ng restart
[ ok ] Stopping system logging: syslog-ng.
[....] Starting system logging: syslog-ngWARNING: window sizing for tcp sources were changed in syslog-ng 3.3, the configuration value was divided by the value of max-connections(). The result was too small, clamping to 100 entries. Ensure you have a proper log_fifo_size setting to avoid message loss.; orig_log_iw_size='20', new_log_iw_size='100', min_log_fifo_size='5000'
. ok

With the higher max-connections value, the result of dividing "log_iw_size" by "max-connections" (20, according to the warning) fell below a pre-defined threshold. Syslog-ng therefore automatically raised the window to 100 entries and reports a minimum "log_fifo_size" of 5000 to avoid message loss.
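
The numbers in the warning can be reproduced with a bit of shell arithmetic. This is a sketch under assumptions: the starting log_iw_size of 1000 is not stated anywhere above (it is chosen because 1000 / 50 gives the 20 shown in the warning), the relation "minimum fifo size = window x max-connections" is inferred from the reported values (100 x 50 = 5000), and both function names are made up for the example.

```shell
# clamped_window IW_SIZE MAX_CONNECTIONS
# Prints IW_SIZE divided by MAX_CONNECTIONS, raised to 100 when the
# result is smaller (the clamp mentioned in the warning).
clamped_window() {
    window=$(( $1 / $2 ))
    if [ "$window" -lt 100 ]; then
        window=100
    fi
    echo "$window"
}

# min_fifo_size WINDOW MAX_CONNECTIONS
# Prints WINDOW multiplied by MAX_CONNECTIONS (inferred relation).
min_fifo_size() {
    echo $(( $1 * $2 ))
}

per_conn=$(clamped_window 1000 50)   # 1000 is an assumed starting value
echo "$per_conn"                     # prints 100
min_fifo_size "$per_conn" 50         # prints 5000
```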

Logging now works from all machines, including the physical ones (it was a coincidence that the LXCs connected first and thereby used up the 10 available connections).
