
New version of check_equallogic features snmp connection check
Friday - Jul 25th 2014

The newest version of the Nagios/Icinga plugin check_equallogic, version 20140711, contains an SNMP connection check. This was frequently requested over the last few months, and since I published the plugin on GitHub (see https://github.com/Napsty/check_equallogic), some issues and pull requests were even opened for it (thanks, guys).

But instead of just creating a new check type (like -t snmp), I wanted all checks to use the SNMP connection check automatically. Otherwise every Nagios/Icinga admin would have to define service dependencies, which would complicate configurations. Lame.

So the SNMP connectivity check is defined as a function at the beginning of the plugin; it makes an SNMP query and retrieves all the member names of the EqualLogic group. This function is then used in all the checks, so if the SNMP connection fails for whatever reason, all checks return the connection failure. Before that, some of the check types still returned "OK" even though the values of an EqualLogic member couldn't be read.
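A minimal sketch of what such a built-in connectivity function could look like, assuming the net-snmp command line tools are installed; the function name, the OID and the output format are illustrative and not taken from the actual plugin code:

```shell
#!/bin/bash
# Sketch only: function name, OID and output format are illustrative,
# not copied from check_equallogic.
check_connection() {
  local host=$1 community=$2
  # Query the group for its member names; any SNMP failure makes the
  # whole connectivity check fail.
  members=$(snmpwalk -v 2c -t 2 -r 0 -c "$community" "$host" \
    .1.3.6.1.4.1.12740.2.1.1.1.9.1 2>/dev/null)
  if [ $? -ne 0 ] || [ -z "$members" ]; then
    echo "SNMP CRITICAL - cannot connect to $host"
    exit 2
  fi
}
```

Every check type then calls this function first, so a broken SNMP connection surfaces as CRITICAL instead of a misleading OK.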

The plan is also to use the information queried by the SNMP connectivity check as global information for future checks (e.g. to check the values of only one member).

So, to summarize: the new SNMP connectivity check is built in and you don't need to change your configurations to enable it. Simply replace the plugin with the new version and you're good to go.

Enjoy.

And I'll enjoy my birthday now. 

 

One year of collected data - a review of temperatures in Zurich
Friday - Jul 25th 2014

Back in August 2013 I wrote about a very hot previous month in Switzerland (July 2013 in Switzerland - a hot month (graph)). The data was collected from a temperature sensor integrated into Icinga (Nagios) monitoring and automatically graphed at one-minute intervals.

Now there's one year of collected data, and the hot month of July 2013 was a lot hotter than this year's July. It's interesting to see that Switzerland can have peaks of more than 30 degrees Celsius (as a 24-hour average!) but also periods around 0 degrees, even though the winter of 2013/2014 was pretty mild.

Here's the graph of temperatures in Zurich, Switzerland from the end of June 2013 to July 25th 2014:

Temperature Zurich Switzerland 2013-2014

 

Cannot connect to SSH: Read from socket failed: Connection reset by peer
Wednesday - Jul 23rd 2014

I cloned an LXC container from an existing one, and when I tried to connect to the new container through SSH, I got this error:

ssh lxc24
Read from socket failed: Connection reset by peer

After logging in through lxc-console, I found the following errors in /var/log/auth.log, which describe the source of the problem pretty clearly:

lxc24 login[1619]: pam_unix(login:session): session closed for user root
lxc24 sshd[1913]: error: Could not load host key: /etc/ssh/ssh_host_rsa_key
lxc24 sshd[1913]: error: Could not load host key: /etc/ssh/ssh_host_dsa_key
lxc24 sshd[1913]: error: Could not load host key: /etc/ssh/ssh_host_ecdsa_key

Somehow the host keys were removed during the clone process. I simply recreated them using:

ssh-keygen -b 1024 -t rsa -f /etc/ssh/ssh_host_key
ssh-keygen -b 1024 -t rsa -f /etc/ssh/ssh_host_rsa_key
ssh-keygen -b 1024 -t dsa -f /etc/ssh/ssh_host_dsa_key

or, as a quicker alternative, run dpkg-reconfigure openssh-server (thanks Fabien):

dpkg-reconfigure openssh-server

And the SSH login worked again (magic! lol):

ssh lxc24
Linux lxc24 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u1 x86_64


 

Automate Postfix installation in Debian and Ubuntu with debconf
Monday - Jul 21st 2014

Usually a Postfix installation on Debian or Ubuntu Linux is followed by an interactive question like this:

apt-get install postfix

 Postfix Installation

Nowadays, in the age of LXC, this can be annoying if the LXC template installs the postfix package.

But this can be automated with debconf-set-selections. I added the following lines to the "configure_debian" section in Debian Wheezy's /usr/share/lxc/templates/lxc-debian template and to the "configure_ubuntu" section in Ubuntu 14.04's /usr/share/lxc/templates/lxc-ubuntu template:

echo "postfix postfix/main_mailer_type select smarthost" | chroot $rootfs debconf-set-selections
echo "postfix postfix/mailname string $hostname.localdomain" | chroot $rootfs debconf-set-selections
echo "postfix postfix/relayhost string smtp.localdomain" | chroot $rootfs debconf-set-selections
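The same answers can also be kept in a separate preseed file and loaded in one go, which keeps the template a bit more readable; the file path and values below are illustrative:

```shell
# Illustrative sketch: the path and values are examples, not taken
# from the actual LXC templates.
cat > /tmp/postfix.seed <<'EOF'
postfix postfix/main_mailer_type select smarthost
postfix postfix/mailname string myhost.localdomain
postfix postfix/relayhost string smtp.localdomain
EOF
# In the template, the file would then be loaded into the container:
# chroot "$rootfs" debconf-set-selections < /tmp/postfix.seed
```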

This "pre-answers" the questions that come up during the Postfix installation, and the installation then runs through without asking anything:

apt-get install postfix
[...]
Setting up postfix (2.11.0-1) ...
Creating /etc/postfix/dynamicmaps.cf
Adding tcp map entry to /etc/postfix/dynamicmaps.cf
Adding sqlite map entry to /etc/postfix/dynamicmaps.cf
setting myhostname: myhostname
setting alias maps
setting alias database
changing /etc/mailname to myhostname.localdomain
setting myorigin
setting destinations: localhost.localdomain, localhost
setting relayhost: smtp.localdomain
setting mynetworks: 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
setting mailbox_size_limit: 0
setting recipient_delimiter: +
setting inet_interfaces: all

By the way, the settings set by "debconf-set-selections" can be verified or manually edited in /var/cache/debconf/config.dat:

cat /var/cache/debconf/config.dat | grep -B 4 seen
[...]
Name: postfix/mailname
Template: postfix/mailname
Value: myhostname.localdomain
Owners: postfix
Flags: seen
--
Name: postfix/main_mailer_type
Template: postfix/main_mailer_type
Value: smarthost
Owners: postfix
Flags: seen
--
Name: postfix/relayhost
Template: postfix/relayhost
Value: smtp.localdomain
Owners: postfix
Flags: seen


 

MySQL Galera cluster not starting (failed to open channel)
Monday - Jul 14th 2014

On a Galera Cluster test environment (two virtual servers on the same physical machine) which had previously been shut down, I got the following error message when I tried to start MySQL on the first cluster node:

/etc/init.d/mysql start
 * Starting MariaDB database server mysqld     [fail]

The detailed information was logged to syslog:

Jul 14 15:17:07 node1 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql
Jul 14 15:17:07 node1 mysqld_safe: WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.iuhkNF' --pid-file='/var/lib/mysql/node1-recover.pid'
Jul 14 15:17:09 node1 mysqld_safe: WSREP: Recovered position cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488
Jul 14 15:17:09 node1 mysqld: 140714 15:17:09 [Note] WSREP: wsrep_start_position var submitted: 'cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488'
[...]
Jul 14 15:17:13 node1 mysqld: 140714 15:17:13 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50224S), skipping check
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]: #007/usr/bin/mysqladmin: connect to server at 'localhost' failed
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111 "Connection refused")'
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]:
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [Note] WSREP: view((empty))
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
Jul 14 15:17:42 node1 mysqld: #011 at gcomm/src/pc.cpp:connect():141
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():202: Failed to open backend connection: -110 (Connection timed out)
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1291: Failed to open channel 'Galera Test' at 'gcomm://192.168.41.11,192.168.41.12': -110 (Connection timed out)
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: gcs connect failed: Connection timed out
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: wsrep::connect() failed: 7
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] Aborting
Jul 14 15:17:42 node1 mysqld:
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [Note] WSREP: Service disconnected.
Jul 14 15:17:43 node1 mysqld: 140714 15:17:43 [Note] WSREP: Some threads may fail to exit.
Jul 14 15:17:43 node1 mysqld: 140714 15:17:43 [Note] /usr/sbin/mysqld: Shutdown complete
Jul 14 15:17:43 node1 mysqld:
Jul 14 15:17:43 node1 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended

The important information here is the following line:

[ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)

When node1 starts MySQL, it tries to join an existing cluster. But because both nodes are currently down, there is no primary node available (see this page for a good and short explanation).

So when a Galera Cluster must be started from zero again, the first node must be started with the "wsrep-new-cluster" option (exactly as during the setup of a new cluster):

service mysql start --wsrep-new-cluster
 * Starting MariaDB database server mysqld                               [ OK ]
 * Checking for corrupt, not cleanly closed and upgrade needing tables.

In syslog, the following log entries can be found:

Jul 14 15:18:43 node1 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql
Jul 14 15:18:43 node1 mysqld_safe: WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.hK64YC' --pid-file='/var/lib/mysql/node1-recover.pid'
Jul 14 15:18:46 node1 mysqld_safe: WSREP: Recovered position cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: wsrep_start_position var submitted: 'cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488'
[...]
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Start replication
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Setting initial position to cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: protonet asio version 0
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: backend: asio
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: GMCast version 0
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: (62a145e9-0b59-11e4-9a2f-c62c46c73c36, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: (62a145e9-0b59-11e4-9a2f-c62c46c73c36, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: EVS version 0
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: PC version 0
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: gcomm: bootstrapping new group 'Galera Test'
[...]
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: gcomm: connected
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Opened channel 'Galera Test'
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
[...]
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Quorum results:
Jul 14 15:18:46 node1 mysqld: #011version    = 3,
Jul 14 15:18:46 node1 mysqld: #011component  = PRIMARY,
Jul 14 15:18:46 node1 mysqld: #011conf_id    = 0,
Jul 14 15:18:46 node1 mysqld: #011members    = 1/1 (joined/total),
[...]
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] /usr/sbin/mysqld: ready for connections.
Jul 14 15:18:46 node1 mysqld: Version: '10.0.10-MariaDB-1~trusty-wsrep-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution, wsrep_25.10.r3968
Jul 14 15:18:47 node1 /etc/mysql/debian-start[13061]: Upgrading MySQL tables if necessary.

The other nodes can then be started normally; they will automatically connect to the primary node.

 

LXC start fails with get_cgroup failed to receive response error
Monday - Jul 14th 2014

After a reboot of a physical test server, two out of five Linux Containers (LXC) didn't start up automatically anymore.

When I manually tried to start them, I got the following error:

lxc-start: command get_cgroup failed to receive response

Although my research on the web pointed me to an AppArmor bug (Ubuntu bug #1295774), I could rule this bug out because the "fixed" AppArmor version was already installed:

dpkg -l | grep appa
ii  apparmor          2.8.95~2430-0ubuntu5  amd64 User-space parser utility for AppArmor
ii  libapparmor-perl  2.8.95~2430-0ubuntu5  amd64 AppArmor library Perl bindings
ii  libapparmor1:amd64 2.8.95~2430-0ubuntu5 amd64 changehat AppArmor library

Interestingly, as mentioned at the beginning, the other LXCs started without problems. I checked the config files and found a difference: the containers that started were using the direct path of a logical volume (LV) as rootfs, while the two that didn't start were using a directory path.

Turns out... this path was not mounted (I had forgotten the entry in /etc/fstab). ^^
After mounting the LVs to the expected paths, lxc-start worked fine.

So the error message "get_cgroup failed to receive response" can also appear if the rootfs is missing or not mounted.
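A quick way to spot this in the future is to verify that the rootfs path from the container config is actually mounted before starting; here's a small sketch using /proc/mounts (the path in the example call is illustrative; the real one comes from lxc.rootfs in the container's config):

```shell
# Sketch: report whether a directory is an active mountpoint by
# looking it up in /proc/mounts (field 2 is the mount point).
check_rootfs() {
  if awk -v p="$1" '$2 == p { found=1 } END { exit !found }' /proc/mounts; then
    echo "mounted"
  else
    echo "not mounted"
  fi
}
check_rootfs /var/lib/lxc/mycontainer/rootfs   # path illustrative
```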

 

Bye Bye Windows XP
Monday - Jul 7th 2014

Microsoft's support for Windows XP already ended in April 2014, but only today did I see this warning message on a virtual machine running Windows XP:

Windows XP End of support warning 

Looks like an official bye bye wave.

 

Presenting new Nagios plugin: check_promise_vtrak
Friday - Jul 4th 2014

I'd like to announce the immediate availability of a new Nagios/Icinga plugin called check_promise_vtrak.pl, a plugin to monitor a Vtrak storage device from Promise.

It is based on the already existing open source plugin check_promise_chassis.pl, and much helpful information was taken from this plugin written by Barry O'Donovan.

Although both plugins do similar checks, check_promise_vtrak.pl was completely rewritten and follows the programming structure (and layout) of check_ibm_ts_tape.pl, a plugin I wrote in the past, which also allows separate checks through option parameters and check types.

The plugin page contains the official documentation of the parameters and how to use it. It also links to the corresponding github repository. Yes, this is an invitation to contribute to the plugin, to make it better and to report about bugs! At this point I'd like to thank the open source community, especially Barry O'Donovan for his original plugin (check_promise_chassis.pl) and Fabien Huttin for testing several Vtrak devices with the new plugin for me.

 

Bugfix in etherror check type in check_equallogic
Friday - Jun 27th 2014

Florian Dhomps detected a small inconsistency in the etherror check of the Nagios plugin check_equallogic and contributed a patch directly yesterday.

The patch was verified and merged today and can be traced with this commit.

Therefore a new version of check_equallogic was released (version number is 20140626).

Thanks to Florian for the discovery and fix.

 

Use of uninitialized value XXX in concatenation (.) or string (perl snmp)
Thursday - Jun 26th 2014

While I was working on a Perl script which uses Net::SNMP, I got the following error:

Need to query .1.3.6.1.4.1.7933.1.20.2.1.1.14.1.15
Odd number of elements in hash assignment at /usr/lib/perl5/vendor_perl/5.16.2/Net/SNMP.pm line 2278.
Use of uninitialized value $enclosure in concatenation (.) or string at ./myscriptpl line 288.

What I was doing in my Perl script was this:

      print "Need to query $oid_base.2.1.1.14.1.$oidend\n"; # debug
      my @oidlist2 = ($oid_base.2.1.1.14.1.$oidend);
      my $response = $session->get_request(-varbindlist => \@oidlist2);
      my $enclosure = $$response{"$oid_base.2.1.1.14.1.$oidend"};

Now the issue is that I missed the fact that I can't just concatenate variables and bare numbers like this. It works correctly in the "print" line, because there everything is inside a double-quoted string.
In the definition of oidlist2 I forgot the double-quotes, so the OID wasn't really an SNMP OID anymore, which caused the SNMP request to fail.

Simply putting the oidlist2 definition into double-quotes solved the issue:

      #print "Need to query $oid_base.2.1.1.14.1.$oidend\n"; # debug
      my @oidlist2 = ("$oid_base.2.1.1.14.1.$oidend");
      my $response = $session->get_request(-varbindlist => \@oidlist2);
      my $enclosure = $$response{"$oid_base.2.1.1.14.1.$oidend"};
      #print "This drive is in Enclosure: $enclosure\n"; # debug

When researching "perl snmp Use of uninitialized value in concatenation", I did not find a single website pointing me to the solution. So once I figured it out myself, I thought I'd share it.

 

