Solving Bind9 reload errors after Debian upgrade to Squeeze
Thursday - May 17th 2012 - 12.29 pm (+0200) - Prangins, Switzerland - (0 comments)

When a Debian server is upgraded from Lenny to Squeeze, the version of the DNS name server Bind changes from 8.4.7 to 9.7.3.
If the same configuration files are reused, Bind has problems reloading its configuration. What worked fine under Debian Lenny now produces errors on Squeeze.

The errors look like these:

# /etc/init.d/bind9 reload
Reloading domain name service...: bind9rndc: connect failed: 127.0.0.1#953: connection refused
 failed!

# rndc reload
rndc: connect failed: 127.0.0.1#953: connection refused

Port 953 is the control channel that the rndc command uses to talk to the running name server, typically to reload it.

In /etc/bind there is a file called rndc.key. This file (or its content) needs to be included in named.conf or named.conf.options. Furthermore, a controls definition needs to be added so that named listens on port 953 for rndc connections.
I defined both in named.conf.options:

# cat named.conf.options
options {
...
};

key "rndc-key" {
        algorithm hmac-md5;
        secret "xxxYOURSECRETKEYxxx==";
};

controls {
        inet 127.0.0.1 port 953 allow { 127.0.0.1; } keys { rndc-key; };
};

After a bind9 restart, the config could be successfully reloaded again:

# /etc/init.d/bind9 reload
Reloading domain name service...: bind9.

# rndc reload
server reload successful
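
The "connection refused" message above means that nothing accepted the TCP connection on 127.0.0.1 port 953. As a minimal sketch (not part of the original setup; the function name control_port_open is made up for illustration), the same condition can be checked from Python before calling rndc:

```python
import socket


def control_port_open(host="127.0.0.1", port=953, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (for example ConnectionRefusedError)
        # when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling control_port_open() with the defaults should return False while the controls statement is missing. It only tests the TCP connection, not whether the rndc key is accepted.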

 

Proftpd: 530 Login incorrect due to invalid shell
Monday - May 14th 2012 - 11.49 pm (+0200) - Prangins, Switzerland - (0 comments)

If you run a Proftpd FTP server and your FTP client log shows the following error message, it does not necessarily mean that your password is wrong:

Status:    Connecting to xxx.xxx.xxx.xxx:21...
Status:    Connection established, waiting for welcome message...
Response:    220 FTP Server ready.
Command:    USER web24
Response:    331 Password required for web24
Command:    PASS ********
Response:    530 Login incorrect.
Error:    Critical error

Obviously you first need to verify on the server that the password really is correct.
The next step is to use proftpd's debugging mode. Stop the daemon and launch the following command:

proftpd -nd6

This command launches proftpd in the foreground with debug level 6, so you can trace everything that happens:

# proftpd -nd6
 - using TCP receive buffer size of 87380 bytes
 - using TCP send buffer size of 16384 bytes
 - disabling runtime support for IPv6 connections
 - mod_tls/2.4.2: using OpenSSL 0.9.8o 01 Jun 2010
 - <IfModule>: using 'mod_tls.c' section at line 9
ftp.server.ip.address -
ftp.server.ip.address - Config for example.com:
ftp.server.ip.address - Limit
ftp.server.ip.address -  DenyGroup
ftp.server.ip.address - DefaultServer
ftp.server.ip.address - ServerIdent
ftp.server.ip.address - ListOptions
ftp.server.ip.address - IdentLookups
ftp.server.ip.address - TimesGMT
ftp.server.ip.address - LangEngine
ftp.server.ip.address - Umask
ftp.server.ip.address - UserID
ftp.server.ip.address - UserName
ftp.server.ip.address - GroupID
ftp.server.ip.address - GroupName
ftp.server.ip.address - TransferLog
ftp.server.ip.address - AllowOverwrite
ftp.server.ip.address - DefaultRoot
ftp.server.ip.address - TLSEngine
ftp.server.ip.address - TLSLog
ftp.server.ip.address - TLSRSACertificateFile
ftp.server.ip.address - TLSRSACertificateKeyFile
ftp.server.ip.address - TLSOptions
ftp.server.ip.address - TLSRequired
ftp.server.ip.address - mod_lang/0.9: skipping possible language 'it': not supported by setlocale(3); see `locale -a'
ftp.server.ip.address - mod_lang/0.9: skipping possible language 'ru': not supported by setlocale(3); see `locale -a'
ftp.server.ip.address - mod_tls/2.4.2: passphrase locked into memory
ftp.server.ip.address - ProFTPD 1.3.3a (maint) (built Sun Nov 13 2011 22:40:44 UTC) standalone mode STARTUP
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - session requested from client in unknown class
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - mod_cap/1.0: adding CAP_AUDIT_WRITE capability
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - mod_ident/1.0: ident lookup disabled
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - connected - local  : ftp.server.ip.address:21
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - connected - remote : my.remote.ip.address:52478
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - FTP session opened.
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'USER web24' to mod_rewrite
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'USER web24' to mod_tls
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'USER web24' to mod_core
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'USER web24' to mod_core
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'USER web24' to mod_delay
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'USER web24' to mod_auth
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching CMD command 'USER web24' to mod_auth
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching POST_CMD command 'USER web24' to mod_sql
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching POST_CMD command 'USER web24' to mod_delay
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching LOG_CMD command 'USER web24' to mod_sql
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching LOG_CMD command 'USER web24' to mod_log
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'PASS (hidden)' to mod_rewrite
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'PASS (hidden)' to mod_tls
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'PASS (hidden)' to mod_core
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'PASS (hidden)' to mod_core
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'PASS (hidden)' to mod_wrap
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'PASS (hidden)' to mod_sql
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'PASS (hidden)' to mod_delay
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching PRE_CMD command 'PASS (hidden)' to mod_auth
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching CMD command 'PASS (hidden)' to mod_auth
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - user 'web24' authenticated by mod_auth_pam.c
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - USER web24 (Login failed): Invalid shell: '/bin/false'
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching POST_CMD_ERR command 'PASS (hidden)' to mod_sql
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching POST_CMD_ERR command 'PASS (hidden)' to mod_delay
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching LOG_CMD_ERR command 'PASS (hidden)' to mod_sql
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching LOG_CMD_ERR command 'PASS (hidden)' to mod_log
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - dispatching LOG_CMD_ERR command 'PASS (hidden)' to mod_auth
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - mod_tls/2.4.2: scrubbing 1 passphrase from memory
ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - FTP session closed.

Yes.. the important line is this one:

ftp.server.ip.address (my.remote.ip.address[my.remote.ip.address]) - USER web24 (Login failed): Invalid shell: '/bin/false'
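
Finding that one line in a long debug trace is easier with a filter. The following is a rough sketch, assuming the debug output was saved to a text file; the function name login_failures and the regular expression are my own and only match the "Login failed" format shown above:

```python
import re

# Matches lines such as:
#   ... - USER web24 (Login failed): Invalid shell: '/bin/false'
LOGIN_FAILED = re.compile(r"USER (\S+) \(Login failed\): (.+)$")


def login_failures(lines):
    """Return (user, reason) tuples for every failed login in the debug output."""
    failures = []
    for line in lines:
        match = LOGIN_FAILED.search(line.rstrip())
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures
```

Passing an open file object (or any list of lines) returns only the failed logins together with the reason proftpd gave.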

Either the user web24 needs a valid shell (one that is listed in /etc/shells, such as /bin/bash), or proftpd.conf needs the following line:

# grep Shell /etc/proftpd/proftpd.conf
RequireValidShell             off

With this option set, proftpd accepts users without a valid shell and allows the FTP session.
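
When RequireValidShell is left on, a shell counts as valid if it is listed in /etc/shells. As a small sketch of that lookup (the function names are hypothetical, and this is not how proftpd itself is implemented), the check could look like this:

```python
def listed_shells(shells_text):
    """Parse the content of /etc/shells, skipping blank lines and comments."""
    shells = set()
    for line in shells_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            shells.add(line)
    return shells


def has_valid_shell(login_shell, shells_text):
    """Return True if login_shell appears in the given /etc/shells content."""
    return login_shell in listed_shells(shells_text)
```

On a real system the two arguments would come from pwd.getpwnam("web24").pw_shell and from reading /etc/shells.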

 

NTP servers should be physical
Monday - May 14th 2012 - 12.27 pm (+0200) - Geneva, Switzerland - (0 comments)

A while ago I experienced NTP synchronisation problems between Linux clients and the NTP server, a Windows 2003 server (Domain Controller role). As a temporary solution, a virtual machine was then set up which served as NTP server.

Because their hardware is virtualized, virtual machines often have trouble keeping their (virtual) time. Often they run too fast and are ahead of real time; sometimes they lag behind. So it was clear that a virtual NTP server could only be a temporary solution.
After the Windows settings had been reworked so that the server could answer NTP requests properly, the Linux clients were switched back to the physical NTP server.

But would it really make such a big difference? The following graphs prove it: Yes.

[Graph: Difference between virtual and physical NTP server]

As the graph shows, there were many offset spikes before the switch to the physical NTP server. They occurred on both physical and virtual clients, so they had nothing to do with the NTP client. Instead, all of these machines received the same time from the virtual NTP server, which passed on its own time, itself synchronized with an external source. Since the switch to a physical NTP server there are no more offset spikes, and synchronization is much steadier.

 

Webalizer Statistics stopped working after Debian upgrade
Monday - May 7th 2012 - 7.38 pm (+0200) - Prangins, Switzerland - (0 comments)

Just experienced a 'funny' issue on a recently upgraded Debian server (upgraded from Lenny to Squeeze): The Webalizer statistics were not created anymore.

On web servers with Confixx installed, the script which creates the Webalizer statistics is located in /root/confixx/runwebalizer.sh. After a manual launch, the following errors were shown:

~/confixx # ./runwebalizer.sh
Error Opening file /usr/share/GeoIP/GeoIP.dat
Error Opening file /usr/share/GeoIP/GeoIP.dat
Error Opening file /usr/share/GeoIP/GeoIP.dat
Error Opening file /usr/share/GeoIP/GeoIP.dat
Error Opening file /usr/share/GeoIP/GeoIP.dat
Error Opening file /usr/share/GeoIP/GeoIP.dat
Error Opening file /usr/share/GeoIP/GeoIP.dat
Error Opening file /usr/share/GeoIP/GeoIP.dat

After some quick research I found the following hint in Debian Bug #532123:

This file is now part of the geoip-database package, which webalizer should depend on.

So even if you have libgeoip1 installed (and this was sufficient on Debian Lenny), you now need to install geoip-database as well:

# apt-get install geoip-database

Shortly after that, Webalizer ran fine again.

 

Update May 11th 2012:
The issue went deeper than I previously thought. Although the errors of the GeoIP.dat file disappeared, the Webalizer statistics weren't updated automatically anymore. Only a manual launch of the mentioned runwebalizer.sh script worked.

So I tried to run the basic Webalizer command with the configuration file of a Confixx user:

# /usr/bin/webalizer -c /var/www/web24/.configs/webalizer.conf -d

Nothing.. There was no output at all (even with the debug parameter). A quick check of the statistics showed that no update had been made, so something still wasn't working. After doing some research, I came across a Swiss webpage mentioning problems with automatic updates of Webalizer statistics. It blames the history files (webalizer.current and webalizer.hist) found in each user's Webalizer folder. It doesn't hurt to try, so I deleted the file webalizer.current and relaunched the same command as before:

# /usr/bin/webalizer -c /var/www/web24/.configs/webalizer.conf -d
--> unresolved country for '199.19.249.196' (GeoIP says (null):(null))
--> unresolved country for '107.21.154.144' (GeoIP says (null):(null))

Ha! That looks different!!! And the statistics are now shown updated.
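
If several users are affected, the same step can be scripted. The sketch below is only an illustration under my own assumptions: it renames webalizer.current instead of deleting it, so the file can be restored, and the function name set_aside_history as well as the directory passed to it are hypothetical:

```python
import os


def set_aside_history(stats_dir, name="webalizer.current"):
    """Rename the Webalizer file so the next run starts without it.

    Returns the path of the renamed copy, or None if the file was not there.
    """
    current = os.path.join(stats_dir, name)
    if not os.path.isfile(current):
        return None
    backup = current + ".bak"
    os.replace(current, backup)  # overwrites an older backup, if any
    return backup
```

After calling it for a user's Webalizer folder, rerun the webalizer command shown above and check whether the statistics are updated.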


 

Using aptitude/apt-get with Proxy Authentication
Friday - May 4th 2012 - 11.28 am (+0200) - Geneva, Switzerland - (0 comments)

After making a few tests with openSUSE 12.1 and Ubuntu 12.04 LTS on a workstation, I ran into proxy authentication problems when trying to update the OS.

While on openSUSE zypper simply doesn't support proxy authentication (yet... a bug is still open), at least on Ubuntu there's the possibility to add the proxy settings to the apt configuration.

In your /etc/apt/apt.conf file you need to enter the following lines:

# cat /etc/apt/apt.conf
Acquire::http::proxy "http://proxyuser:password@proxyserver:8080/";
Acquire::https::proxy "http://proxyuser:password@proxyserver:8080/";
Acquire::ftp::proxy "http://proxyuser:password@proxyserver:8080/";

Note that if the proxy does not require authentication, the user and password can simply be left out of the URL.
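
To show how the three lines are built up with and without credentials, here is a small sketch. The function name apt_proxy_lines is made up, and the sketch does not handle passwords that contain characters such as @ or : which would need extra care:

```python
def apt_proxy_lines(host, port, user=None, password=None):
    """Build the Acquire::*::proxy lines for /etc/apt/apt.conf."""
    credentials = ""
    if user is not None:
        # Credentials are only needed when the proxy asks for authentication.
        credentials = user if password is None else user + ":" + password
        credentials += "@"
    url = "http://{0}{1}:{2}/".format(credentials, host, port)
    return ['Acquire::{0}::proxy "{1}";'.format(scheme, url)
            for scheme in ("http", "https", "ftp")]
```

Calling apt_proxy_lines("proxyserver", 8080, "proxyuser", "password") reproduces the three lines shown above; leaving out the last two arguments gives the variant without authentication.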

 

Bugfix in check_equallogic (info check)
Thursday - May 3rd 2012 - 10.25 am (+0200) - Geneva, Switzerland - (0 comments)

In previous versions, the info check of the Nagios plugin check_equallogic could not handle the output correctly when several members belonged to the same group. Each member reports information about all members of its group; this is by design on Equallogic.

The fix in version 20120503 addresses this issue, and the check now shows correct information for all devices.

Furthermore, a firmware check has been added (still within the 'info' check type). The firmware version of every member is checked, and a WARNING is issued if the versions differ. Reminder: All Equallogic members of the same group should have the same firmware version.
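
The idea behind the firmware comparison can be sketched in a few lines. This is an illustration only and not the plugin's actual code; the function name firmware_state is hypothetical, and the member names and version strings used with it are placeholders:

```python
OK, WARNING = 0, 1  # Nagios plugin exit codes


def firmware_state(firmware_by_member):
    """Compare the firmware versions of all members in one group."""
    versions = set(firmware_by_member.values())
    if len(versions) <= 1:
        return OK, "OK - all members run the same firmware"
    details = ", ".join(
        "{0}: {1}".format(member, version)
        for member, version in sorted(firmware_by_member.items())
    )
    return WARNING, "WARNING - firmware versions differ (" + details + ")"
```

The function takes a mapping of member name to firmware version and returns the exit code together with the status text a Nagios check would print.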

 

VMware creates new KB entry for slow hardware status bug
Wednesday - May 2nd 2012 - 4.45 pm (+0200) - Geneva, Switzerland - (0 comments)

As I already wrote in an article (Slow hardware discovery/check with ESXi 5.0 U1) in March, there is a bug in the current ESXi version (5.0 U1) which slows down the query of the CIM entries. VMware acknowledged the bug and announced a bugfix for Q3.

I just saw that VMware created a new Knowledge Base entry for this particular bug on April 17th:
http://kb.vmware.com/kb/2016538

This is not only great for ESXi admins to track this bug and find updated information but it is also great for the Nagios plugin check_esxi_hardware.py as it is mentioned directly in the KB article!
Take a look at the screenshot (or visit the KB link above):

[Screenshot: VMware Knowledge Base Article Slow Hardware Status]

Thanks to all the check_esxi_hardware users who contacted VMware concerning this bug. I'm pretty sure that users of check_esxi_hardware.py were the first ones hitting that bug ;-).


 

Bugfix in check_esxi_hardware (Manufacturer discovery)
Tuesday - May 1st 2012 - 12.10 pm (+0200) - Geneva, Switzerland - (0 comments)

Craig Hart discovered a bug in the monitoring plugin/script check_esxi_hardware.py which occurred on self-built servers or generally on servers which did not have a CIM entry for Manufacturer.

The typical error message, when the script tries to handle the Manufacturer value, looks like this:

TypeError: cannot concatenate 'str' and 'NoneType' objects

The bugfix added in today's version (20120501) handles this case and sets the manufacturer string to Unknown Manufacturer.
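
The error occurs because Python cannot concatenate a string with None. The following sketch shows the general pattern of such a fix; it is not the code from check_esxi_hardware.py, and the function names are made up for illustration:

```python
def manufacturer_label(manufacturer):
    """Return a printable manufacturer string even when the CIM value is missing."""
    if manufacturer is None:
        return "Unknown Manufacturer"
    return manufacturer


def describe(manufacturer, model):
    # Concatenating None with a str raises TypeError, so normalise first.
    return manufacturer_label(manufacturer) + " " + model
```

With this guard in place, a server without a Manufacturer entry is reported as "Unknown Manufacturer" instead of aborting the check.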

Big thanks to Craig for his discovery and fix.

 

Monitor Equallogic Snapshots with check_equallogic
Monday - Apr 30th 2012 - 4.10 pm (+0200) - Switzerland - (0 comments)

A while ago Roland Penner sent me an interesting patch that adds a new check type (snapshots) to the Nagios/Monitoring plugin check_equallogic.

As the name suggests, this check type monitors the disk space reserved for snapshots on your Dell Equallogic SAN.

This new feature is now available: version 20120430 of check_equallogic has been released.

Big thanks to Roland for his contribution!

 

Linux server crash due to defective memory
Friday - Apr 27th 2012 - 1.18 pm (+0200) - Switzerland - (0 comments)

Just recently I had to handle two crashes of the same Linux server. As soon as I launched some I/O intensive process (rsync in my case), the machine crashed.

The following log entries were written in the kern.log.

First crash:

Apr 25 20:12:15  kernel: [12156.863672] BUG: unable to handle kernel NULL pointer dereference at (null)
Apr 25 20:12:15  kernel: [12156.863728] IP: [] writeback_inodes_wb+0xf6/0x4ff
Apr 25 20:12:15  kernel: [12156.863765] PGD 0
Apr 25 20:12:15  kernel: [12156.863787] Oops: 0002 [#1] SMP
Apr 25 20:12:15  kernel: [12156.863812] last sysfs file: /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
Apr 25 20:12:15  kernel: [12156.863862] CPU 4
Apr 25 20:12:15  kernel: [12156.863883] Modules linked in: acpi_cpufreq cpufreq_conservative cpufreq_powersave cpufreq_stats cpufreq_userspace ext3 jbd loop snd_pcm snd_timer i2c_i801 snd soundcore snd_page_alloc i2c_core video wmi button output pcspkr evdev ext4 mbcache jbd2 crc16 dm_mod aacraid 3w_9xxx 3w_xxxx raid10 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 md_mod sata_nv sata_sil sata_via sd_mod crc_t10dif ahci libata ehci_hcd r8169 xhci scsi_mod usbcore thermal nls_base mii processor thermal_sys [last unloaded: scsi_wait_scan]
Apr 25 20:12:15  kernel: [12156.864195] Pid: 9876, comm: flush-253:1 Not tainted 2.6.32-5-amd64 #1 System Product Name
Apr 25 20:12:15  kernel: [12156.864246] RIP: 0010:[]  [] writeback_inodes_wb+0xf6/0x4ff
Apr 25 20:12:15  kernel: [12156.864298] RSP: 0018:ffff88043b4c9d00  EFLAGS: 00010286

Second crash, very similar log entries:

Apr 26 11:11:12 kernel: [ 2942.917788] BUG: unable to handle kernel NULL pointer dereference at (null)
Apr 26 11:11:12 kernel: [ 2942.917838] IP: [<(null)>] (null)
Apr 26 11:11:12 kernel: [ 2942.917862] PGD 0
Apr 26 11:11:12 kernel: [ 2942.917884] Oops: 0010 [#1] SMP
Apr 26 11:11:12 kernel: [ 2942.917907] last sysfs file: /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
Apr 26 11:11:12 kernel: [ 2942.917952] CPU 0
Apr 26 11:11:12 kernel: [ 2942.917971] Modules linked in: acpi_cpufreq cpufreq_conservative cpufreq_powersave cpufreq_stats cpufreq_userspace ext3 jbd loop i2c_i801 i2c_core video snd_pcm evdev output wmi snd_timer snd soundcore snd_page_alloc pcspkr button ext4 mbcache jbd2 crc16 dm_mod aacraid 3w_9xxx 3w_xxxx raid10 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 md_mod sata_nv sata_sil sata_via sd_mod crc_t10dif ahci libata ehci_hcd scsi_mod xhci r8169 mii thermal usbcore nls_base processor thermal_sys [last unloaded: scsi_wait_scan]
Apr 26 11:11:12 kernel: [ 2942.918246] Pid: 1288, comm: flush-253:1 Not tainted 2.6.32-5-amd64 #1 System Product Name
Apr 26 11:11:12 kernel: [ 2942.918292] RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
Apr 26 11:11:12 kernel: [ 2942.918320] RSP: 0018:ffff88043b651c28  EFLAGS: 00010087

At first I suspected a kernel bug related to EXT4 file systems, but an extended hardware stress test revealed a defective memory DIMM.

After replacing the DIMM I launched the same rsync process again, and this time no problems (and therefore no crashes) occurred.
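
To spot such entries quickly after a crash, kern.log can be scanned for "BUG:" lines. The sketch below assumes the log format shown above; the function name kernel_bugs and the regular expression are my own:

```python
import re

# The bracketed number is the time in seconds since boot.
BUG_LINE = re.compile(r"kernel: \[\s*(\d+\.\d+)\] BUG: (.+)$")


def kernel_bugs(lines):
    """Return (seconds_since_boot, message) for every 'BUG:' line in kern.log."""
    found = []
    for line in lines:
        match = BUG_LINE.search(line.rstrip())
        if match:
            found.append((float(match.group(1)), match.group(2)))
    return found
```

Repeated hits with changing addresses or call sites, as in the two crashes above, are one more reason to test the hardware and not only the kernel.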


 

