apt-get update error: Could not get lock (SystemD and unattended APT)
Friday - Jun 23rd 2017 - by - (0 comments)

On an Ubuntu 16.04 Xenial machine I got the following error:

 # apt-get update
Reading package lists... Done
E: Could not get lock /var/lib/apt/lists/lock - open (11: Resource temporarily unavailable)
E: Unable to lock directory /var/lib/apt/lists/

A quick look at the processes showed that a daily run of apt-get update, managed by systemd, appeared to be hanging:

# ps auxf| grep -i apt
root     28044  0.0  0.0   4508  1700 ?        Ss   03:34   0:00 /bin/sh /usr/lib/apt/apt.systemd.daily
root     28077  0.0  0.1  44628  7344 ?        S    03:34   0:03  \_ apt-get -qq -y update
_apt     28081  0.2  1.4 237124 57924 ?        S    03:34   1:44      \_ /usr/lib/apt/methods/https
_apt     28082  0.0  0.1  43212  5844 ?        S    03:34   0:00      \_ /usr/lib/apt/methods/http
_apt     28083  0.0  0.1  43276  5584 ?        S    03:34   0:00      \_ /usr/lib/apt/methods/http
_apt     28588  0.0  0.1  41036  5432 ?        S    03:36   0:00      \_ /usr/lib/apt/methods/gpgv

I tried to see what these processes were doing... For some reason the main process (PID 28077) had run into a timeout and was stuck in a loop:

# strace -s 1000 -f -p 28077
strace: Process 28077 attached
select(10, [5 6 7 9], [], NULL, {0, 76562}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}) = 0 (Timeout)
select(10, [5 6 7 9], [], NULL, {0, 500000}^Cstrace: Process 28077 detached
 

So what is causing this timeout? Let's check what exactly SystemD did at 03:34 this morning:

Jun 23 03:34:19 onl-lb04-s systemd[1]: Started Daily apt activities.
Jun 23 03:34:19 onl-lb04-s systemd[1]: apt-daily.timer: Adding 9h 3min 10.786515s random time.
Jun 23 03:34:19 onl-lb04-s systemd[1]: apt-daily.timer: Adding 6h 10min 44.777700s random time.
Jun 23 03:34:19 onl-lb04-s systemd[1]: Starting Daily apt activities...

WTF? SystemD seems to have added a total of 15 hours and 13 minutes of random delay? So this is why the process keeps hanging in a timeout and therefore locking apt?
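
The randomized delay can be inspected on the timer itself; a quick sketch:

# show when the apt-daily timer last fired and when it fires next
systemctl list-timers apt-daily.timer

# show the timer unit itself, including any configured randomized delay
systemctl cat apt-daily.timer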

There's quite some information on the Internet (if one knows what to look for) concerning this "problem". Obviously I'm not the only one stumbling over the automatic apt updates/upgrades, which have been enabled by default since Ubuntu 16.04 Xenial. Some good reads:

In general the apt folks agree that the current setup of adding random execution delays is not a good idea:

"We should think about this a bit more" - Julian Andres Klode (apt maintainer)

In the second bug, 1686470 (which serves as a general brainstorming and technical re-setup of the whole automatic apt update/upgrade mechanism), a definitive solution seems to be in the works. But it has yet to be released for Xenial:

Ubuntu Xenial Bug Automatic APT Unattended Updates

Until a definitive fix (re-setup of the automatic apt process) is released, there are the following workarounds:

  • Manually kill apt processes which appear to be "hanging" (due to the random and sometimes huge added delay)
  • Disable automatic updates/upgrades in /etc/apt/apt.conf.d/20auto-upgrades by setting both values to 0 (see the snippet below)
  • If you want automatic apt updates and/or upgrades on your Xenial system, do it the legacy way using cron
  • Wait until bug 1686470 is fixed
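
For the second workaround, the file should end up looking like this (a minimal sketch; these are the two standard directives shipped by Ubuntu, both set to "0" to disable the daily update and the unattended upgrade):

APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Unattended-Upgrade "0";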


 

Debian 9 Stretch and Nagios NRPE (command args and SSL compatibility)
Thursday - Jun 22nd 2017 - by - (0 comments)

Debian Stretch (Debian 9) was released a couple of days ago, on June 17th 2017. In March 2016 I wrote about Debian Jessie (Debian 8) and the problem that the NRPE package was compiled without command arguments allowed. I won't go into the details of why the command arguments were disabled (read the mentioned article for those details). This article is somewhat of a follow-up.

In Stretch this is still the same "default": command arguments are still disabled. But additionally Stretch ships the new NRPE version 3.x (3.0.1 to be exact). This can be seen as a breakthrough, because NRPE stayed on version 2.1x for many years. It's definitely a big and necessary change: NRPE had become outdated, yet it is still widely used in combination with Nagios and Icinga. The NRPE project is now also publicly developed (see the Nagios NRPE GitHub repository).

This means that one not only has to recompile the nagios-nrpe Debian source package to enable command arguments, but also needs to be aware of how to solve backward compatibility issues.

Let's tackle the first challenge: [ Compatibility between NRPE 2.x and 3.x ]

NRPE 3.x is built on much newer SSL/TLS protocols than NRPE 2.x. Therefore SSL communication between the two NRPE versions doesn't work.
Here I tried to connect from check_nrpe (2.15) to a nagios-nrpe-server (3.0.1):

$ ./check_nrpe -H 10.10.45.10
CHECK_NRPE: Error - Could not complete SSL handshake.

On the server side, the following log entries appeared (NRPE debug logging enabled):

Jun 22 09:04:51 stretch nrpe[1267]: Connection from 10.10.45.50 port 33773
Jun 22 09:04:51 stretch nrpe[1267]: Host address is in allowed_hosts
Jun 22 09:04:51 stretch nrpe[1267]: Error: Request packet version was invalid!
Jun 22 09:04:51 stretch nrpe[1267]: Could not read request from client 10.10.45.50, bailing out...
Jun 22 09:04:51 stretch nrpe[1267]: Connection from 10.10.45.50 closed.

The logs clearly show a problem between the packet versions. But if check_nrpe is launched without SSL encryption (using the -n parameter), the connection works:

$ ./check_nrpe -H 10.10.45.10 -n
NRPE v3.0.1

Server side logging now shows:

Jun 22 09:06:18 stretch nrpe[1301]: Connection from 10.10.45.50 port 5528
Jun 22 09:06:18 stretch nrpe[1301]: Host address is in allowed_hosts
Jun 22 09:06:18 stretch nrpe[1301]: Host 10.10.45.50 is asking for command '_NRPE_CHECK' to be run...
Jun 22 09:06:18 stretch nrpe[1301]: Response to 10.10.45.50: NRPE v3.0.1
Jun 22 09:06:18 stretch nrpe[1301]: Return Code: 0, Output: NRPE v3.0.1
Jun 22 09:06:18 stretch nrpe[1301]: Connection from 10.10.45.50 closed.

Disabling SSL encryption is not a good idea, I agree. But until all hosts (monitoring server and clients) are updated to a newer NRPE 3.x version, it is at least a workaround to ensure compatibility between NRPE 2.x and 3.x. As long as the NRPE connection only happens on internal networks, there's not too much to worry about either (but be careful if you happen to check servers across the Internet!).

And now to the second challenge: [ Enable command arguments ]

Heads-up: A ready-to-install package for Debian Stretch (and other Debian and Ubuntu versions) can be found here: https://www.claudiokuenzler.com/downloads/nrpe/

1) Add the deb-src line into your /etc/apt/sources.list file, if it doesn't exist yet. Use your preferred mirror:

deb-src http://mirror.switch.ch/ftp/mirror/debian/ stretch main

Update the repository list afterwards:

apt-get update

2) Install the build tools and dependencies needed to compile the package:

apt-get build-dep nagios-nrpe
apt-get install devscripts build-essential

3) Download the nagios-nrpe source package:

apt-get source nagios-nrpe

The files will be downloaded into the current directory.

4) Change into the package directory and adapt the debian/rules file:

cd nagios-nrpe-3.0.1/; vi debian/rules

At the end of the "override_dh_auto_configure" target, "--enable-command-args" needs to be added:

    dh_auto_configure -- \
        --prefix=/usr \
        --sysconfdir=/etc \
        --libdir=/usr/lib/nagios \
        --libexecdir=/usr/lib/nagios/plugins \
        --localstatedir=/var \
        --enable-ssl \
        --with-ssl-lib=/usr/lib/$(DEB_HOST_MULTIARCH) \
        --with-piddir=/var/run/nagios \
        --enable-command-args

5) Edit the changelog:

dch -i

This command will ask you to describe what exactly you changed in this package. In my case I entered the following information:

 nagios-nrpe (3.0.1-1) stable; urgency=medium

  * Non-maintainer upload.
  * Compiled with command arguments enabled

 -- Claudio Kuenzler   Thu, 22 Jun 2017 09:15:13 +0200

6) Create the new deb package:

debuild -us -uc -sa

7) Move one directory up and you will see the newly created files:

cd ..; ls -la | grep 3.0.1-1
-rw-r--r--  1 ckadm ckadm  53352 Jun 22 09:24 nagios-nrpe-plugin-dbgsym_3.0.1-1_amd64.deb
-rw-r--r--  1 ckadm ckadm  30118 Jun 22 09:24 nagios-nrpe-plugin_3.0.1-1_amd64.deb
-rw-r--r--  1 ckadm ckadm  73252 Jun 22 09:24 nagios-nrpe-server-dbgsym_3.0.1-1_amd64.deb
-rw-r--r--  1 ckadm ckadm 347278 Jun 22 09:24 nagios-nrpe-server_3.0.1-1_amd64.deb
-rw-r--r--  1 ckadm ckadm 347278 Jun 22 09:24 nagios-nrpe-server_3.0.1-1_amd64.stretch.deb
-rw-r--r--  1 ckadm ckadm  13792 Jun 22 09:24 nagios-nrpe_3.0.1-1.debian.tar.xz
-rw-r--r--  1 ckadm ckadm   1225 Jun 22 09:24 nagios-nrpe_3.0.1-1.dsc
-rw-r--r--  1 ckadm ckadm  50600 Jun 22 09:24 nagios-nrpe_3.0.1-1_amd64.build
-rw-r--r--  1 ckadm ckadm   5787 Jun 22 09:24 nagios-nrpe_3.0.1-1_amd64.buildinfo
-rw-r--r--  1 ckadm ckadm   2880 Jun 22 09:24 nagios-nrpe_3.0.1-1_amd64.changes

8) The deb package can now be installed:

root@stretch:/ # dpkg -i /home/ckadm/nagios-nrpe-server_3.0.1-1_amd64.deb
dpkg: warning: downgrading nagios-nrpe-server from 3.0.1-3 to 3.0.1-1
(Reading database ... 36589 files and directories currently installed.)
Preparing to unpack .../nagios-nrpe-server_3.0.1.1_amd64.deb ...
Unpacking nagios-nrpe-server (3.0.1-1) over (3.0.1-3) ...
Setting up nagios-nrpe-server (3.0.1-1) ...
Processing triggers for systemd (232-25) ...
Processing triggers for man-db (2.7.6.1-2) ...
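
Note that the compile-time option alone is not enough: at runtime, /etc/nagios/nrpe.cfg must also allow arguments (dont_blame_nrpe=1) and the command definition must use the $ARG$ macros. A minimal sketch; the check_load command definition is an assumption on my part, derived from the server log further below:

dont_blame_nrpe=1
command[check_load]=/usr/lib/nagios/plugins/check_load -w $ARG1$ -c $ARG2$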

To make sure the new binary is used, restart the daemon:

root@stretch:/etc/nagios# systemctl restart nagios-nrpe-server

NRPE checks using arguments are now working:

$ ./check_nrpe -H 10.10.45.10 -n -c check_load -a "1,2,3" "4,5,6"
OK - load average: 0.22, 0.09, 0.04|load1=0.220;1.000;4.000;0; load5=0.090;2.000;5.000;0; load15=0.040;3.000;6.000;0;

NRPE server side logging shows:

Jun 22 09:18:03 stretch nrpe[17194]: Connection from 10.10.45.50 port 26246
Jun 22 09:18:03 stretch nrpe[17194]: Host address is in allowed_hosts
Jun 22 09:18:03 stretch nrpe[17194]: Host 10.10.45.50 is asking for command 'check_load' to be run...
Jun 22 09:18:03 stretch nrpe[17194]: Running command: /usr/lib/nagios/plugins/check_load -w 1,2,3 -c 4,5,6
Jun 22 09:18:03 stretch nrpe[17194]: Command completed with return code 0 and output: OK - load average: 0.28, 0.14, 0.05|load1=0.280;1.000;4.000;0; load5=0.140;2.000;5.000;0; load15=0.050;3.000;6.000;0;
Jun 22 09:18:03 stretch nrpe[17194]: Return Code: 0, Output: OK - load average: 0.28, 0.14, 0.05|load1=0.280;1.000;4.000;0; load5=0.140;2.000;5.000;0; load15=0.050;3.000;6.000;0;
Jun 22 09:18:03 stretch nrpe[17194]: Connection from 10.10.45.50 closed.


 

Cannot boot Linux Mint 18.1 after running apt-get upgrade
Monday - Jun 19th 2017 - by - (0 comments)

Last Friday (June 16th 2017) I ran apt-get upgrade on a Linux Mint 18.1 machine and then rebooted it. To my big disbelief, the machine didn't come up anymore. After the grub2 menu, the following error appeared:

Loading Linux 4.4.0-53-generic ...
error: attempt to read or write outside of disk 'hd0'.
Loading initial ramdisk ...
error: you need to load the kernel first.

Press any key to continue...

Erm...Noooo! T_T

I booted the Linux Mint 18.1 Live CD, mounted the drive /dev/sda1 on /mnt, bind-mounted /sys, /proc and /dev, and recreated the initramfs inside a chroot of the installed Linux Mint:

mount /dev/sda1 /mnt
mount --bind /sys /mnt/sys
mount --bind /proc /mnt/proc
mount --bind /dev /mnt/dev
chroot /mnt /bin/bash
mkinitramfs -o /boot/initrd.img-4.4.0-53-generic
update-grub
grub-install /dev/sda
exit
reboot

But I still got the same error after the reboot... I also manually selected the correct partition in grub in order to rule out a grub2 error:

grub> ls
(hd0) (hd0,msdos5) (hd0,msdos1)
grub> ls (hd0,msdos5)
        Partition hd0,msdos5: No known filesystem detected...
grub> ls (hd0,msdos1)
        Partition hd0,msdos1: Filesystem type ext* - last modification time 2017-06-19 05:44:26....
Partition start at 1024KiB - Total size 486333440KiB
grub> set root=(hd0,msdos1)
grub> set prefix=(hd0,msdos1)/boot/grub
grub> insmod normal
grub> normal

This got me back to the (updated) grub menu, but once I selected the entry for Linux Mint I got the same error again.

I booted again with the Live CD, this time with the goal to reinstall the Linux Kernel.

mount /dev/sda1 /mnt
mount --bind /sys /mnt/sys
mount --bind /proc /mnt/proc
mount --bind /dev /mnt/dev
chroot /mnt /bin/bash
apt-get install linux-image-generic

To my big surprise a new Linux Kernel 4.4.0-79 was installed. I expected apt to return "already installed" or something like this.

The installer returned some errors complaining about missing Linux headers but still finished the installation of the new linux-image package. I installed the header package afterwards:

apt-get install linux-headers-4.4.0-79-generic

The installation of the new kernel packages created a new initramfs in /boot/initrd.img-4.4.0-79-generic and also added a new/updated entry with the new kernel version to /boot/grub/grub.cfg.

I exited the chroot environment and rebooted the machine:

exit
reboot

And this time Linux Mint 18.1 was booting again.

 

AWS EC2 instance unreachable after reboot ([Errno 101])
Friday - Jun 16th 2017 - by - (0 comments)

Two weeks ago I already had a problem with a newly installed EC2 instance launched from an AMI image on the AWS cloud (Amazon Web Services). After I rebooted the instance, an updated Ubuntu 16.04 Xenial machine, the system didn't come up anymore. Even a ping from another instance in the same subnet didn't work.

Today I came across the same problem. Different region (this time Europe-Ireland, the one from two weeks ago was Europe-Frankfurt), different VPC, different network segments, but the exact same problem: the instance didn't come up anymore after a reboot. Just prior to this instance I had set up two other instances the exact same way (except that a different Availability Zone was used) and neither of them had a problem after a reboot.

By using Instance Settings -> Get System Log I was able to get some information about what's going on:

[  OK  ] Reached target Network.
[    7.511380] cloud-init[770]: Cloud-init v. 0.7.9 running 'init' at Fri, 16 Jun 2017 11:46:18 +0000. Up 7.41 seconds.
[    7.515990] cloud-init[770]: ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++
[    7.520087] cloud-init[770]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
[    7.524021] cloud-init[770]: ci-info: | Device |   Up  |  Address  |    Mask   | Scope |     Hw-Address    |
[    7.528206] cloud-init[770]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
[    7.532219] cloud-init[770]: ci-info: |  ens3  | False |     .     |     .     |   .   | 0a:76:72:96:c7:52 |
[    7.536186] cloud-init[770]: ci-info: |   lo   |  True | 127.0.0.1 | 255.0.0.0 |   .   |         .         |
[    7.540244] cloud-init[770]: ci-info: |   lo   |  True |  ::1/128  |     .     |  host |         .         |
[    7.544243] cloud-init[770]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
[    7.548428] cloud-init[770]: 2017-06-16 13:46:18,547 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 101] Network is unreachable',))]
[    8.514438] cloud-init[770]: 2017-06-16 13:46:19,550 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [1/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 101] Network is unreachable',))]
[    9.517358] cloud-init[770]: 2017-06-16 13:46:20,553 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [2/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 101] Network is unreachable',))]
[   10.520266] cloud-init[770]: 2017-06-16 13:46:21,555 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [3/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 101] Network is unreachable',))]
[... messages repeating ...]

For some reason, the network interface ens3 didn't come up:

| Device |   Up  |  Address  |    Mask   | Scope | Hw-Address        |  
|  ens3  | False |     .     |     .     |   .   | 0a:76:72:96:c7:52 |

The log gave no reason whatsoever for this.

Even in the EC2 administration UI the status check was at 1/2, so something's definitely not working correctly:

AWS EC2 instance unreachable after reboot

As there is no console terminal access in AWS, I had no other choice than to:

  • reboot the instance in the EC2 administration -> didn't help
  • stop the instance, then start the instance -> didn't help
  • stop the instance, change the instance type -> didn't help

None of the tools provided by AWS helped. As the instance was not usable anymore, I terminated it and created a new instance from the same AMI image. Same instance type, same network settings, everything exactly the same. This time the instance rebooted with no problem whatsoever. I came across a question on ServerFault (amazon ec2 - Unable to connect to EC2 instance after "reboot") which describes the same problem I experienced (already twice now in two weeks), but unfortunately without a solution. For now the most plausible explanation to me is a bug in the AWS infrastructure which hits randomly (?). I didn't find a way to reproduce this problem.

Summary: You rebooted your EC2 instance (from within the guest OS)? It doesn't come up afterwards? Even pings from the same subnet don't work? You get the same message as above in the system log? Recreate the instance from scratch. :-(
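
By the way, the same system log can also be fetched without the web console. Assuming the AWS CLI is installed and configured, something along these lines should work (the instance ID is a placeholder):

aws ec2 get-console-output --instance-id i-0123456789abcdef0 --output text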

 

Easy FX currency converter in PHP
Wednesday - Jun 14th 2017 - by - (0 comments)

While I was programming an order form which at the end shows a price in the local currency (CHF), I also wanted to show the price in EUR and USD as a reference.

I already thought I'd have to implement something quite complex, but then I came across fixer.io, an open source exchange rate API. In the background it uses exchange rates updated and released once a day by the ECB (European Central Bank). As I don't need realtime exchange rates, that's perfect.

As my base currency is Swiss Francs (CHF), I set the base currency to CHF like this:

$ curl http://api.fixer.io/latest?base=CHF
{"base":"CHF","date":"2017-06-14","rates":{"AUD":1.359,"BGN":1.7986,"BRL":3.4051,"CAD":1.3607,"CNY":7.0027,"CZK":24.06,"DKK":6.8386,"GBP":0.8089,"HKD":8.0368,"HRK":6.8048,"HUF":281.76,"IDR":13684.0,"ILS":3.6363,"INR":66.245,"JPY":113.65,"KRW":1159.1,"MXN":18.584,"MYR":4.3868,"NOK":8.6573,"NZD":1.4211,"PHP":51.007,"PLN":3.8594,"RON":4.1994,"RUB":58.805,"SEK":8.9616,"SGD":1.4215,"THB":34.967,"TRY":3.6198,"USD":1.0303,"ZAR":13.123,"EUR":0.91962}}

As you can see, this results in JSON output. To make the output human-readable, one can use jshon:

$ curl -q http://api.fixer.io/latest?base=CHF |jshon
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   447  100   447    0     0   5623      0 --:--:-- --:--:-- --:--:--  5658
{
 "base": "CHF",
 "date": "2017-06-14",
 "rates": {
  "AUD": 1.359,
  "BGN": 1.7986,
  "BRL": 3.4051,
  "CAD": 1.3607,
  "CNY": 7.0026999999999999,
  "CZK": 24.059999999999999,
  "DKK": 6.8385999999999996,
  "GBP": 0.80889999999999995,
  "HKD": 8.0367999999999995,
  "HRK": 6.8048000000000002,
  "HUF": 281.75999999999999,
  "IDR": 13684.0,
  "ILS": 3.6362999999999999,
  "INR": 66.245000000000005,
  "JPY": 113.65000000000001,
  "KRW": 1159.0999999999999,
  "MXN": 18.584,
  "MYR": 4.3868,
  "NOK": 8.6572999999999993,
  "NZD": 1.4211,
  "PHP": 51.006999999999998,
  "PLN": 3.8593999999999999,
  "RON": 4.1993999999999998,
  "RUB": 58.805,
  "SEK": 8.9616000000000007,
  "SGD": 1.4215,
  "THB": 34.966999999999999,
  "TRY": 3.6198000000000001,
  "USD": 1.0303,
  "ZAR": 13.122999999999999,
  "EUR": 0.91961999999999999
 }
}

Nice! This gets me somewhere already!

Now I needed to use this data in my PHP code. And this is actually bloody simple:

<?php
// Currency converter
$fxrates = json_decode(file_get_contents('http://api.fixer.io/latest?base=CHF'), true);
//echo '<pre>' . print_r($fxrates, true) . '</pre>';
echo $fxrates['rates']['EUR'];

First the API URL is called and loaded into PHP using file_get_contents. The json content is immediately decoded using json_decode and stored in a nested array $fxrates.
The commented line can be used for printing the whole $fxrates array (which helped me find the final variable name).
The last line finally prints/echoes the exchange rate CHF/EUR.

$ php chfeur.php
0.91962

This works of course with all the other listed currencies, too.
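
To come back to the original goal (showing a CHF price in EUR and USD on the order form), the conversion itself is then just a multiplication. A minimal sketch; the price is a hypothetical example value:

<?php
// Fetch the ECB-based rates with CHF as base currency (same call as above)
$fxrates = json_decode(file_get_contents('http://api.fixer.io/latest?base=CHF'), true);

$priceCHF = 199.00; // hypothetical price from the order form
$priceEUR = round($priceCHF * $fxrates['rates']['EUR'], 2);
$priceUSD = round($priceCHF * $fxrates['rates']['USD'], 2);

echo "CHF $priceCHF = EUR $priceEUR / USD $priceUSD\n";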

Kudos to Hakan Ensari, who has also put fixer.io on Github.

 

Rename jpeg files to contain the date the picture was taken
Saturday - May 6th 2017 - by - (0 comments)

I have a ton of pictures on my mobile phone. From time to time I put them on my central backup machine. However I came across a few issues when I wanted to sort them (for example when I wanted to create a family album):

  • The filenames are often just a file counter. For example the camera simply counts up the picture number: DSC0001, DSC0002, DSC0003 etc. On Android this can (sometimes) be changed in the camera settings so that the file name contains the date.
  • When I moved pictures from the internal memory to the external (SD card) memory, all the pictures' file dates were changed to the day the files were moved.
  • Manually trying to find the original date and then rename the pictures takes hours.

Luckily I came across jhead, an open source tool written by Matthias Wandel which can be used on all platforms. For Windows, Linux and Mac OS X there's a pre-compiled executable which can be launched from the command line (no graphical interface).

jhead is able to read the Exif headers in JPEG files, which (should) still contain the original date the picture was taken. For example:

D:\Downloads\Grafik\JHead>jhead.exe D:\tmp\DSC*
File name    : D:\tmp\DSC_0261.JPG
File size    : 3264489 bytes
File date    : 2016:03:24 12:50:06
Camera make  : Sony
Camera model : D6503
Date/Time    : 2016:02:13 12:17:55
Resolution   : 3840 x 2160
Flash used   : No
Focal length :  4.9mm
Exposure time: 0.0010 s  (1/1000)
Aperture     : f/2.0
ISO equiv.   : 50
Whitebalance : Auto
Metering Mode: pattern
GPS Latitude : ? ?
GPS Longitude: ? ?
JPEG Quality : 97

( Yes, for once I'm on a Windows OS )

As you can see above, the file date and the date/time differ. The "File date" represents the date the picture was moved from the phone's internal memory to the SD card. The "Date/Time" value represents the date and time when the picture was taken.

But jhead is not only able to read the Exif JPEG headers. It comes with a function (-n) to rename the JPEG file using certain variables. In the following example I renamed all files starting with DSC in the folder D:\tmp using the Exif JPEG Date/Time in the format Month-Day-Year-Filename (%m-%d-%Y-%f):

D:\Downloads\Grafik\JHead>jhead.exe -n%m-%d-%Y-%f D:\tmp\DSC*
D:\tmp\DSC_0261.JPG --> D:\tmp\02-13-2016-DSC_0261.jpg
D:\tmp\DSC_0262.JPG --> D:\tmp\02-13-2016-DSC_0262.jpg

Jpeg photos renamed to date picture was taken 

The result is exactly what I needed: The date the picture was taken is now the prefix of the filename, followed by the original filename. Finally I'm able to quickly and properly sort all kinds of pictures - even when they were taken by different cameras and had different filenames.
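
The same works on Linux, of course. A small sketch (the paths are placeholders); putting the year first makes an alphabetical sort equal a chronological sort, and -ft additionally resets the file modification time to the Exif date:

jhead -n%Y-%m-%d-%f /path/to/pictures/DSC*.JPG
jhead -ft /path/to/pictures/*.jpg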

 

Ubuntu 16.04 reboot and shutdown hanging on LVM metadata daemon
Friday - May 5th 2017 - by - (0 comments)

I've seen this problem for a couple of months now, basically since we first started to install Ubuntu 16.04 Xenial machines. When a reboot or a shutdown was launched, the system kept hanging. In the VM console the shutdown was stuck either on

[ OK ] Stopped LVM2 metadata daemon.

Xenial Reboot hanging

or on

[ *** ] A start job is running for Unattended Upgrades Shutdown (1min 27s / no limit)

Xenial reboot hanging

At the beginning I suspected some problem in the LVM settings. All the affected machines were either clones of another VM or new VMs deployed from a template, and in /etc/lvm/backup/VGNAME the name of the template still appeared.

But further research showed that it is a problem in the unattended-upgrades package. In Ubuntu bug #1654600 it is reported that this shutdown hang happens when /var is a separate file system and LVM is being used, which is the case on almost all our machines. As of this writing (May 5th 2017) a fix of the unattended-upgrades package is in the pipeline (Fix Committed) but not yet released.

An immediate workaround was found in Ubuntu bug #1661611, which is marked as duplicate of #1654600:

 - fix /usr/share/unattended-upgrades/unattended-upgrade-shutdown to expect "false" instead of "False"

# sed -n "120p" /usr/share/unattended-upgrades/unattended-upgrade-shutdown
    if apt_pkg.config.find_b("Unattended-Upgrade::InstallOnShutdown", False):

# sed -i "120s/False/false/" /usr/share/unattended-upgrades/unattended-upgrade-shutdown

# sed -n "120p" /usr/share/unattended-upgrades/unattended-upgrade-shutdown
    if apt_pkg.config.find_b("Unattended-Upgrade::InstallOnShutdown", false):

But this only "works" because the Python code is now broken (Python wants "False", not "false"): the script throws a NameError when it reaches that line and exits instead of waiting.

So this is not advisable either (although it saves you a 10 minute wait until your machine reboots). The best solution is to wait for the definitive fix of bug #1654600 (or, in the meantime, uninstall the unattended-upgrades package).

Update May 10th 2017:
I just checked the current status of the unattended-upgrades package: version 0.90ubuntu0.5 (which contains the fix) has been released and can now be installed.
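
To check whether the fixed version has already reached your machine and to pull it in, something like this should do:

apt-cache policy unattended-upgrades
apt-get update && apt-get install unattended-upgrades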

 

Monitor a PostgreSQL database in AWS RDS with check_postgres
Thursday - Apr 27th 2017 - by - (0 comments)

Monitoring an AWS RDS database (using PostgreSQL as the database engine) is not particularly difficult. The most important points can be checked with check_postgres.pl. This monitoring plugin is already part of the Icinga 2 ITL definitions and can be used (in a standard setup) via the check command "postgres".

An example to monitor the number of connections on a DB:

# check postgres connections
object Service "Postgres Connections" {
  import "service-5m-normal"
  host_name = "aws-rds"
  check_command = "postgres"
  vars.postgres_host = "myfancypgsql.XXXXXXXXXXXX.eu-west-1.rds.amazonaws.com"
  vars.postgres_dbuser = "mydbuser"
  vars.postgres_dbpass = "mydbpass"
  vars.postgres_dbname = "mydbname"
  vars.postgres_action = "backends"
}

This results in the following output:

POSTGRES_BACKENDS OK: DB "mydbname" (host:myfancypgsql.XXXXXXXXXXXX.eu-west-1.rds.amazonaws.com) 4 of 648 connections (1%)

But certain checks require special attention. One of them is the plugin's action "database_size". This action checks all databases running on the target host for their current size and compares the size against thresholds:

# check postgres database size
object Service "Postgres DB Size" {
  import "service-5m-normal"
  host_name = "aws-rds"
  check_command = "postgres"
  vars.postgres_host = "myfancypgsql.XXXXXXXXXXXX.eu-west-1.rds.amazonaws.com"
  vars.postgres_dbuser = "mydbuser"
  vars.postgres_dbpass = "mydbpass"
  vars.postgres_dbname = "mydbname"
  vars.postgres_action = "database_size"
  vars.postgres_warning = "500MB"
  vars.postgres_critical = "1000MB"
}

This resulted in the following error message:

Permission denied on database rdsadmin.

That's right. AWS creates a database called "rdsadmin" on each RDS instance. Its purpose? Most likely to handle privileges coming from the AWS console/UI. But what else is in there? We'll never know, because only AWS knows the credentials for this database. However, this causes a problem with check_postgres.pl.
A way to handle this "hidden database" and exclude it is to only check database objects belonging to the instance's master user. This can be achieved with the plugin's option "includeuser". In Icinga 2's service object definition, this looks like:

# check postgres database size
object Service "Postgres DB Size" {
  import "service-5m-normal"
  host_name = "aws-rds"
  check_command = "postgres"
  vars.postgres_host = "myfancypgsql.XXXXXXXXXXXX.eu-west-1.rds.amazonaws.com"
  vars.postgres_dbuser = "mydbuser"
  vars.postgres_dbpass = "mydbpass"
  vars.postgres_dbname = "mydbname"
  vars.postgres_action = "database_size"
  vars.postgres_includeuser = "masteruser"
  vars.postgres_warning = "500MB"
  vars.postgres_critical = "1000MB"
}
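
For reference, the Icinga 2 variables above map to check_postgres.pl command line options. Run directly, the same check would presumably look like this (host and credentials are placeholders):

./check_postgres.pl --action=database_size --includeuser=masteruser \
  --host=myfancypgsql.XXXXXXXXXXXX.eu-west-1.rds.amazonaws.com \
  --dbuser=mydbuser --dbpass=mydbpass --dbname=mydbname \
  --warning=500MB --critical=1000MB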

This tells the plugin to look only for objects (like databases) which belong to "masteruser" (database owner). Result:

POSTGRES_DATABASE_SIZE CRITICAL: DB "mydbname" (host:myfancypgsql.XXXXXXXXXXXX.eu-west-1.rds.amazonaws.com) mydbname: 32215345336 (30 GB) postgres: 6744248 (6586 kB) template1: 6744248 (6586 kB)

As you can see, the database "rdsadmin" was excluded. All the checked databases are owned by the master user "masteruser".

Note: Another way would be to use "excludeuser", but at this moment I am not aware which username owns "rdsadmin".

 

Empty page returned from Nginx with fastcgi caching (HEAD vs GET)
Friday - Apr 21st 2017 - by - (0 comments)

A while ago I wrote about Enable caching in Nginx Reverse Proxy (and solve cache MISS only). That article focused on caching in a reverse proxy setup, meaning a proxy_cache. But Nginx can also cache responses from FastCGI backends.

For a customer I wanted to enable this kind of caching (fastcgi_cache) for the PHP backend using PHP-FPM. The setup is pretty much the same as proxy_cache - just call it fastcgi_cache instead ;-)

In Nginx's http section:

    ##
    # Caching settings
    ##
    fastcgi_cache_path  /var/www/cache levels=1:2 keys_zone=cachecool:100m max_size=1000m inactive=60m;
    fastcgi_cache_key "$scheme$host$request_uri";

And in the vhost config file within the .php file extension location:

  location ~ \.php$ {
    try_files $uri =404;
    default_type text/html; charset utf-8;
    fastcgi_split_path_info ^(.+\.php)(.*)$;
    fastcgi_pass unix:/run/php5-fpm.sock;
    fastcgi_next_upstream error timeout invalid_header http_500 http_503;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
    fastcgi_param HTTP_X_REAL_IP $remote_addr;

    # FastCGI Caching
    fastcgi_cache cachecool;
    fastcgi_cache_valid 200 60m;
    fastcgi_cache_methods GET;
    add_header X-Cache $upstream_cache_status;
  }

Then reload Nginx.

So far so good. Caching seemed to work, which I was able to verify with curl:

$ curl -I https://app.example.com/phpinfo.php -v
* Hostname was NOT found in DNS cache
*   Trying 1.2.3.4...
* Connected to app.example.com (1.2.3.4) port 443 (#0)
[...ssl stuff...]
> HEAD /phpinfo.php HTTP/1.1
> User-Agent: curl/7.35.0
> Host: app.example.com
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
* Server nginx/1.10.3 is not blacklisted
< Server: nginx/1.10.3
Server: nginx/1.10.3
< Date: Fri, 21 Apr 2017 06:16:56 GMT
Date: Fri, 21 Apr 2017 06:16:56 GMT
< Content-Type: text/html; charset=UTF-8
Content-Type: text/html; charset=UTF-8
< Connection: keep-alive
Connection: keep-alive
< Vary: Accept-Encoding
Vary: Accept-Encoding
< X-Cache: MISS
X-Cache: MISS

Second curl request:

$ curl -I https://app.example.com/phpinfo.php -v
* Hostname was NOT found in DNS cache
*   Trying 1.2.3.4...
* Connected to app.example.com (1.2.3.4) port 443 (#0)
[...ssl stuff...]
> HEAD /phpinfo.php HTTP/1.1
> User-Agent: curl/7.35.0
> Host: app.example.com
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
* Server nginx/1.10.3 is not blacklisted
< Server: nginx/1.10.3
Server: nginx/1.10.3
< Date: Fri, 21 Apr 2017 06:16:57 GMT
Date: Fri, 21 Apr 2017 06:16:57 GMT
< Content-Type: text/html; charset=UTF-8
Content-Type: text/html; charset=UTF-8
< Connection: keep-alive
Connection: keep-alive
< Vary: Accept-Encoding
Vary: Accept-Encoding
< X-Cache: HIT
X-Cache: HIT

The X-Cache header confirmed it: I hit the cache. So it's working.

But when I wanted to check the same URL in the browser, I got slapped in the face:

Cached php page empty in browser 

The page is empty?!

I manually removed the Nginx cache...

$ rm -rf /var/www/cache/*

... and then refreshed the browser. phpinfo.php was shown correctly now!
What's the difference between my curl and my browser request? Note the -I parameter in the curl command. It makes curl use the HEAD request method, and HEAD only returns the HTTP headers, no body. Accessing a website in the browser (usually) uses the GET method. Because the cache key didn't include the request method, the bodyless HEAD response was cached and then served to the browser's GET request as well, which explains the empty page.

So how do I solve this? The solution (found on ServerFault.com) is actually pretty self-explanatory: The caching key also needs to contain the request method. Remember the setup in Nginx's http section?

    fastcgi_cache_key "$scheme$host$request_uri";

Here I only set $scheme (https), $host (app.example.com) and $request_uri (/phpinfo.php). The cache key doesn't differentiate between a GET, a HEAD or even another request method like POST. So let's add the request method:

    fastcgi_cache_key "$scheme$request_method$host$request_uri";

After an Nginx reload, I tried to reproduce the empty page problem:

  • Removed the cache (rm -rf /var/www/cache/*)
  • Launched the curl command with HEAD request method (curl -I) from above - X-Cache was a MISS
  • Launched the same curl command again - X-Cache was a HIT
  • Opened the same URL in browser: phpinfo was shown correctly
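
To double-check the GET path without a browser, a plain curl GET (without -I) should presumably work as well; right after clearing the cache the first request would report a MISS, the second one a HIT:

curl -s -o /dev/null -D - https://app.example.com/phpinfo.php | grep -i X-Cache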

Solved!

 

Nginx compile fails with fatal error: ld terminated with signal 9
Friday - Apr 21st 2017 - by - (0 comments)

If you ever experience a failed nginx compile process, which ends with the following error:

collect2: fatal error: ld terminated with signal 9 [Killed]
compilation terminated.
objs/Makefile:355: recipe for target 'objs/nginx' failed
make[2]: *** [objs/nginx] Error 1
make[2]: Leaving directory '/home/builder/build/nginx-1.12.0'
Makefile:8: recipe for target 'build' failed
make[1]: *** [build] Error 2
make[1]: Leaving directory '/home/builder/build/nginx-1.12.0'

Check your memory usage; dmesg is your friend:

[237469.710196] Out of memory: Kill process 685 (ld) score 716 or sacrifice child
[237469.712587] Killed process 685 (ld) total-vm:374356kB, anon-rss:357624kB, file-rss:0kB

After increasing the memory capacity of this VM, the nginx compile worked just fine.
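
If resizing the VM is not immediately possible, temporarily adding a swap file should also give ld enough headroom. A minimal sketch (size and path are assumptions):

free -m                      # check current memory and swap usage
fallocate -l 1G /swapfile    # create a 1 GB swap file
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile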

 

