rsyslog not logging into /var/log/mail.log? Check permissions!
Monday - Aug 22nd 2016

For a week or so I wondered why on one SMTP server (Ubuntu 16.04 Xenial with Rsyslogd and Postfix) rsyslog never logged into /var/log/mail.log, although this is clearly defined in the rsyslog config file:

 # cat /etc/rsyslog.d/50-default.conf | grep mail
mail.*                -/var/log/mail.log
# Logging for the mail system.  Split it up so that
#mail.info            -/var/log/mail.info
#mail.warn            -/var/log/mail.warn
mail.err            /var/log/mail.err
#    news.none;mail.none    -/var/log/debug
#    mail,news.none        -/var/log/messages
#daemon,mail.*;\
daemon.*;mail.*;\

Instead all log entries from the mail facility were logged into /var/log/syslog.

Yet on another SMTP server the mail facility log entries were correctly logged into /var/log/mail.log. Strangely enough, both systems were set up the same way.

Today I got some time for investigation and found out that the permissions of the directory /var/log were different:

On SMTP01 (where mail logging happened into /var/log/syslog):

root@smtp01:/var# stat log
  File: 'log'
  Size: 4096          Blocks: 8          IO Block: 4096   directory
Device: fc00h/64512d    Inode: 1005        Links: 11
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (  108/  syslog)
Access: 2016-08-22 08:29:56.243493060 +0200
Modify: 2016-08-22 08:29:55.747484499 +0200
Change: 2016-08-22 08:29:55.747484499 +0200
 Birth: -

On SMTP02 (where mail logging happened correctly into /var/log/mail.log):

 root@smtp02:/var# stat log
  File: 'log'
  Size: 4096          Blocks: 8          IO Block: 4096   directory
Device: fc01h/64513d    Inode: 1005        Links: 11
Access: (0775/drwxrwxr-x)  Uid: (    0/    root)   Gid: (  108/  syslog)
Access: 2016-08-22 08:25:37.991669507 +0200
Modify: 2016-08-22 06:25:04.620044011 +0200
Change: 2016-08-22 06:25:04.620044011 +0200
 Birth: -

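On SMTP01 the permissions were 0755, on SMTP02 0775. Big difference! On Ubuntu, rsyslogd drops its privileges to the syslog user and group by default, so without group write permission on /var/log it presumably could not create mail.log there.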

After setting the same permissions on smtp01 and restarting rsyslogd, logging of the mail facility into /var/log/mail.log started.
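
A quick way to spot this kind of difference is to check the group-write bit directly. The following is only a minimal sketch: the function name dir_group_writable is hypothetical, and it assumes GNU stat (as shipped with Ubuntu). The commands in the trailing comments are one possible way to apply and test the fix, not a transcript of what was run on smtp01.

```shell
# Return 0 if the owning group may write to the directory, 1 otherwise.
# dir_group_writable is a hypothetical helper; "stat -c" assumes GNU coreutils.
dir_group_writable() {
    mode=$(stat -c '%a' "$1") || return 2
    # %a prints the octal mode (e.g. 755 or 2775); the group digit is the
    # second-to-last character.
    group_digit=$(printf '%s' "$mode" | sed 's/.*\(.\).$/\1/')
    case "$group_digit" in
        2|3|6|7) return 0 ;;
        *)       return 1 ;;
    esac
}

# Possible use on the affected server (as root):
#   dir_group_writable /var/log || chmod 0775 /var/log
#   service rsyslog restart
#   logger -p mail.info "mail facility test"   # should land in /var/log/mail.log
```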

However, I still don't know where this permission difference came from. In none of the log files (and I have command auditing active) was I able to find a command that would have changed the permissions.

 

Testing whether multicast is working with omping
Wednesday - Aug 10th 2016

Needed to see if multicast connections between two hosts were working. omping is designed to do this for you:

"The omping is program which uses User Datagram Protocol to determine if computer is able to send and/or receive IP unicast and multicast or Broadcast packets from the network."

In my case I needed to see if two CentOS 7 hosts are able to communicate with each other with multicast. On each host the program omping was installed:

yum install omping

And then omping is launched on both hosts:

omping host1 host2

 The output on both hosts should show the omping replies from the other host:

[root@host1 ~]# omping host1 host2
host2 : waiting for response msg
host2 : joined (S,G) = (*, 232.43.211.234), pinging
host2 :   unicast, seq=1, size=69 bytes, dist=0, time=0.122ms
host2 :   unicast, seq=2, size=69 bytes, dist=0, time=0.258ms
host2 : multicast, seq=2, size=69 bytes, dist=0, time=0.371ms
host2 :   unicast, seq=3, size=69 bytes, dist=0, time=0.328ms
host2 : multicast, seq=3, size=69 bytes, dist=0, time=0.434ms
host2 :   unicast, seq=4, size=69 bytes, dist=0, time=0.249ms
host2 : multicast, seq=4, size=69 bytes, dist=0, time=0.374ms
host2 :   unicast, seq=5, size=69 bytes, dist=0, time=0.270ms
host2 : multicast, seq=5, size=69 bytes, dist=0, time=0.379ms
host2 :   unicast, seq=6, size=69 bytes, dist=0, time=0.240ms
host2 : multicast, seq=6, size=69 bytes, dist=0, time=0.348ms
host2 :   unicast, seq=7, size=69 bytes, dist=0, time=0.272ms

[root@host2 ~]# omping host1 host2
host1 : waiting for response msg
host1 : waiting for response msg
host1 : joined (S,G) = (*, 232.43.211.234), pinging
host1 :   unicast, seq=1, size=69 bytes, dist=0, time=0.116ms
host1 : multicast, seq=1, size=69 bytes, dist=0, time=0.235ms
host1 :   unicast, seq=2, size=69 bytes, dist=0, time=0.219ms
host1 : multicast, seq=2, size=69 bytes, dist=0, time=0.329ms
host1 :   unicast, seq=3, size=69 bytes, dist=0, time=0.700ms
host1 : multicast, seq=3, size=69 bytes, dist=0, time=0.800ms
host1 :   unicast, seq=4, size=69 bytes, dist=0, time=0.249ms
host1 : multicast, seq=4, size=69 bytes, dist=0, time=0.360ms
host1 :   unicast, seq=5, size=69 bytes, dist=0, time=0.232ms
host1 : multicast, seq=5, size=69 bytes, dist=0, time=0.323ms
host1 :   unicast, seq=6, size=69 bytes, dist=0, time=0.291ms
host1 : multicast, seq=6, size=69 bytes, dist=0, time=0.377ms
host1 :   unicast, seq=7, size=69 bytes, dist=0, time=0.291ms
host1 : multicast, seq=7, size=69 bytes, dist=0, time=0.382ms

Multicast works between these two hosts running in the same data center.

Now I added a third host running outside of the data center, which means the traffic needs to pass through a couple of routers and switches. Will multicast still work?

[root@host1 ~]# omping host1 host2 host3
host2 : waiting for response msg
host3 : waiting for response msg
host2 : waiting for response msg
host3 : waiting for response msg
host2 : joined (S,G) = (*, 232.43.211.234), pinging
host2 :   unicast, seq=1, size=69 bytes, dist=0, time=0.158ms
host2 : multicast, seq=1, size=69 bytes, dist=0, time=0.283ms
host3 : waiting for response msg
host2 :   unicast, seq=2, size=69 bytes, dist=0, time=0.199ms
host2 : multicast, seq=2, size=69 bytes, dist=0, time=0.323ms
host3 : joined (S,G) = (*, 232.43.211.234), pinging
host3 :   unicast, seq=1, size=69 bytes, dist=6, time=4.050ms
host2 :   unicast, seq=3, size=69 bytes, dist=0, time=0.267ms
host2 : multicast, seq=3, size=69 bytes, dist=0, time=0.382ms
host3 :   unicast, seq=2, size=69 bytes, dist=6, time=3.990ms
host2 :   unicast, seq=4, size=69 bytes, dist=0, time=0.299ms
host2 : multicast, seq=4, size=69 bytes, dist=0, time=0.308ms
host3 :   unicast, seq=3, size=69 bytes, dist=6, time=4.110ms

[root@host2 ~]# omping host1 host2 host3
host1 : waiting for response msg
host3 : waiting for response msg
host1 : joined (S,G) = (*, 232.43.211.234), pinging
host1 :   unicast, seq=1, size=69 bytes, dist=0, time=0.146ms
host1 :   unicast, seq=2, size=69 bytes, dist=0, time=0.229ms
host1 : multicast, seq=2, size=69 bytes, dist=0, time=0.338ms
host3 : waiting for response msg
host1 :   unicast, seq=3, size=69 bytes, dist=0, time=0.314ms
host1 : multicast, seq=3, size=69 bytes, dist=0, time=0.436ms
host3 : joined (S,G) = (*, 232.43.211.234), pinging
host3 :   unicast, seq=1, size=69 bytes, dist=6, time=4.108ms
host1 :   unicast, seq=4, size=69 bytes, dist=0, time=0.208ms
host1 : multicast, seq=4, size=69 bytes, dist=0, time=0.316ms
host3 :   unicast, seq=2, size=69 bytes, dist=6, time=3.981ms
host1 :   unicast, seq=5, size=69 bytes, dist=0, time=0.257ms

claudio@host3 ~ omping host1 host2 host3
host1 : waiting for response msg
host2 : waiting for response msg
host1 : joined (S,G) = (*, 232.43.211.234), pinging
host2 : joined (S,G) = (*, 232.43.211.234), pinging
host2 :   unicast, seq=1, size=69 bytes, dist=6, time=3.948ms
host1 :   unicast, seq=1, size=69 bytes, dist=6, time=3.985ms
host1 :   unicast, seq=2, size=69 bytes, dist=6, time=4.076ms
host2 :   unicast, seq=2, size=69 bytes, dist=6, time=4.072ms
host2 :   unicast, seq=3, size=69 bytes, dist=6, time=4.072ms
host1 :   unicast, seq=3, size=69 bytes, dist=6, time=4.112ms
host1 :   unicast, seq=4, size=69 bytes, dist=6, time=4.045ms
host2 :   unicast, seq=4, size=69 bytes, dist=6, time=4.056ms

While multicast between host1 and host2 still works (as one can see in the replies), host3 only ever shows up with unicast replies. Multicast connectivity to host3 does not work in this case.
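
With longer runs it gets tedious to eyeball which host answered in which way. A small sketch that counts unicast and multicast replies per host from saved omping output; the function name omping_summary and the file name in the usage comment are hypothetical, and the parsing only assumes the output format shown above:

```shell
# Count unicast and multicast replies per host from omping output on stdin.
omping_summary() {
    awk '
        # reply lines look like: host2 : multicast, seq=2, size=69 bytes, ...
        $2 == ":" && ($3 == "unicast," || $3 == "multicast,") {
            kind = substr($3, 1, length($3) - 1)   # strip the trailing comma
            count[$1, kind]++
            hosts[$1] = 1
        }
        END {
            for (h in hosts)
                printf "%s unicast=%d multicast=%d\n", h, count[h, "unicast"], count[h, "multicast"]
        }
    ' | sort
}

# Possible use with output saved to a file:
#   omping_summary < omping-host1.log
```

A host that reports multicast=0 while its unicast counter keeps growing is reachable, but not via multicast.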

 

When the OS says you are old
Tuesday - Aug 2nd 2016

Just came across the following message on a reboot of a virtual machine running on SLES 10 SP4:

/dev/sda1 has gone 49710 days without being checked, check forced. 

fsck check forced

 

This makes me feel much older than I am ...
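
The number itself is probably no coincidence. 49710 is exactly the number of whole days that fit into an unsigned 32-bit counter of seconds, so my guess (an assumption, the boot message doesn't say so) is that the time since the last check was calculated from a bogus timestamp and wrapped around. The arithmetic as a quick sketch, with a hypothetical function name:

```shell
# Whole days that fit into the largest unsigned 32-bit number of seconds.
u32_seconds_in_days() {
    echo $(( 4294967295 / 86400 ))
}

u32_seconds_in_days   # prints 49710
```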

 

Reviewing 6 months with Linux Mint 17.3 as desktop OS
Monday - Jul 11th 2016

Back in January 2016 I decided to ditch openSUSE as my desktop OS of choice. At first this was a hard decision to make, as I had used openSUSE for years, and it is also worth mentioning that SuSE Linux Professional 7 was the first Linux installation I ever used.

But instability and issues have eased the letting-go process. First there was the upgrade issue from openSUSE 12.3 to Tumbleweed in November 2015. However, this was merely a test to see if an upgrade would work. I then used Tumbleweed (a new installation from scratch) until January 2016. Every now and then I installed system patches with "zypper dup", but this time the upgrade got stuck while upgrading systemd. The OS never recovered and I wasn't able to boot into it anymore.

Because I had only used Tumbleweed for two months (and didn't have any relevant or important data on this machine) I decided to ditch SUSE and go with the most popular distro (information from DistroWatch.com as of January 2016 and also as of this writing in July 2016): Linux Mint.

I was skeptical at first but I was positively surprised. Here are the pros and cons after having used Linux Mint 17.3 on a daily basis.

Pros

  • My Linux distributions of choice in server environments are Debian and Ubuntu. Because Linux Mint 17.3 is based on Ubuntu Trusty, the commands and packages in the background are the same. My "apt-get" commands work the same way on Linux Mint as on the servers I manage and the package names are also the same (Ubuntu 14.04 trusty packages). So the OS is familiar from what I work with.
  • Even though there are a lot of packages already in the Ubuntu base repositories, additional repositories can be added from PPA (openSUSE had such a feature too, to be fair).
  • Based on the points above: Why not take a "normal" Ubuntu desktop installation then? Because, even after several trials, I never came to love nor appreciate the Unity desktop. Something I always loved on SUSE was the KDE or the Gnome desktop which was kind of similar to MS Windows (start-button, task list etc) but yet very modifiable. With Linux Mint 17.3 and the Cinnamon desktop (which is a GNOME fork) I got a very stable desktop which is highly modifiable - yet I still intuitively know where to click.
  • At work I connect the notebook (a Dell Latitude E7440) to a docking station and use two 24" screens. The setup for this was quick and painless through the "Display" settings. The whole graphics and screen setup was without pain and just worked out of the box (something I wasn't used to when coming from openSUSE).
  • The company I work for "officially" runs on Windows clients. I virtualized the Windows OS which was installed previously on the very same machine and use it since then as a VMware virtual machine in VMware Workstation. Works great.
  • Compared to the Windows OS installed before, the increase of speed is enormous. The notebook has a 256GB SSD. With Linux Mint I really feel the disk speed while on Windows the whole OS appeared slow in comparison. I have to add that this probably isn't the fault of Microsoft's Windows alone but rather of the amount of GPOs, network share connections and distributed file system setups by my employer.
  • I work more efficiently. Again, this is compared to the pre-installed Windows, as I now run multiple terminal shells (also Terminator) and don't have to use dozens of PuTTY windows.
  • Bluetooth. I was very surprised that my Bluetooth headset (a Plantronics Voyager) connected so well. On Windows I once had the problem that the headset went out of range, and since then I wasn't able to successfully connect it anymore - even after deleting and reinstalling the drivers, etc.
  • By only using Linux Mint (without VMware Workstation and its virtual machines) I got a much longer battery life of the notebook than when I was running Windows OS. 
  • Debugging SD cards and other hardware. It's far easier to plug an SD card or an external HDD via USB into a Linux OS and run several diagnostic tools or file system checks on it than doing this with Windows. This was especially helpful when I needed to debug an SD card from a Raspberry Pi.

Cons

  • What works great in Windows is the support of audio devices. Especially if you have multiple audio sources and outputs, you can usually select them within the applications. On Linux Mint multiple outputs are still difficult to set up. A particular example is Skype. When I want to listen to music on my plugged-in headphones but communicate with my Bluetooth headset, it's not possible. I have to change the Pulseaudio settings to select the input and output device (globally for the whole machine) in order to use Skype with the devices I intended it for. To my knowledge, support for multiple audio devices is a general problem on Linux desktops, but Skype is also partly responsible in my case because it ONLY supports Pulseaudio and doesn't allow selecting the audio devices directly.
  • Some devices I want to use on Linux Mint and on the Windows virtual machine at the same time. For example the webcam (Creative VF0790) is used on my desktop OS in Skype and on appear.in, but on the Windows VM in Skype for Business. This doesn't work. But that's not the "fault" of Linux Mint but rather of VMware Workstation.
  • LibreOffice. I know that's just a question of "being used to", but if you've worked with Microsoft Office for a couple of years it's just more intuitive. Practical example: Changing the page format from Portrait to Landscape. But that's what I have my Windows VM for.

Linux Mint 17.3 with Dual Screen  

In general my main fear (problems with the hardware, especially the two screens) never became reality. I can live with the audio setup because most calls I receive are on Skype for Business (on Windows), for which I connected a Polycom hard phone and forwarded it to the VM. Altogether I'm positively surprised and work much faster on Linux Mint now (don't forget, I'm a Linux Systems Engineer, not a Windows Systems Engineer). So choosing Linux Mint as desktop OS was definitely worth it.

 

SD card of Raspberry Pi dead - after almost (or only?) a year runtime
Wednesday - Jul 6th 2016

In a previous article I wrote about a Raspberry Pi SD card issue after a power outage (constant red LED). That was a month ago, at the beginning of June 2016. Back then I was able to "fix" the SD card and therefore save the Raspbian OS by running fsck on it.

It seems that was just the beginning. As of today, the card is defective. I cannot even run fsck on it anymore. dmesg shows:

Jul  6 10:55:43 kernel: [14554.229237] mmc0: Timeout waiting for hardware interrupt.
Jul  6 10:55:43 kernel: [14554.231234] ------------[ cut here ]------------
Jul  6 10:55:43 kernel: [14554.231255] WARNING: CPU: 0 PID: 0 at /build/linux-lts-vivid-naHA4g/linux-lts-vivid-3.19.0/drivers/mmc/host/sdhci.c:1013 sdhci_send_command+0x336/0x390 [sdhci]()
Jul  6 10:55:43 kernel: [14554.231256] Modules linked in: xfs libcrc32c jfs nls_iso8859_1 mmc_block vmnet(OE) vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE) hid_plantronics cdc_mbim cdc_wdm cdc_ncm usbnet mii qcserial usb_wwan usbserial pn544_mei mei_phy pn544 hci nfc arc4 dell_wmi sparse_keymap dell_laptop dcdbas i8k rfcomm dm_multipath scsi_dh bnep intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_usb_audio binfmt_misc snd_usbmidi_lib hid_logitech_hidpp snd_hda_codec_hdmi uvcvideo iwlmvm videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev media snd_hda_codec_realtek snd_hda_codec_generic mac80211 joydev serio_raw snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep btusb snd_soc_rt5640 snd_soc_rl6231 bluetooth snd_soc_core snd_compress snd_pcm_dmaengine snd_seq_midi snd_seq_midi_event snd_rawmidi iwlwifi snd_seq snd_seq_device cfg80211 lpc_ich snd_pcm shpchp mei_me mei snd_timer 8250_fintek snd dell_smo8800 soundcore i2c_hid dw_dmac dw_dmac_core snd_soc_sst_acpi 8250_dw i2c_designware_platform dell_rbtn spi_pxa2xx_platform i2c_designware_core mac_hid parport_pc ppdev lp parport dm_mirror dm_region_hash dm_log hid_generic hid_logitech_dj usbhid hid psmouse i915 ahci libahci sdhci_pci i2c_algo_bit drm_kms_helper e1000e drm ptp pps_core wmi video sdhci_acpi sdhci
Jul  6 10:55:43 kernel: [14554.231298] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W  OE  3.19.0-32-generic #37~14.04.1-Ubuntu
Jul  6 10:55:43 kernel: [14554.231299] Hardware name: Dell Inc. Latitude E7440/07F3F4, BIOS A09 05/01/2014
Jul  6 10:55:43 kernel: [14554.231300]  ffffffffc02b2460 ffff88041ea03d28 ffffffff817af41b 0000000000000000
Jul  6 10:55:43 kernel: [14554.231301]  0000000000000000 ffff88041ea03d68 ffffffff81074daa 0000000000000004
Jul  6 10:55:43 kernel: [14554.231303]  0000000000000002 ffff88040b4bd500 ffff880407053310 ffff8804070532d0
Jul  6 10:55:43 kernel: [14554.231305] Call Trace:
Jul  6 10:55:43 kernel: [14554.231306]  <IRQ>  [<ffffffff817af41b>] dump_stack+0x45/0x57
Jul  6 10:55:43 kernel: [14554.231313]  [<ffffffff81074daa>] warn_slowpath_common+0x8a/0xc0
Jul  6 10:55:43 kernel: [14554.231315]  [<ffffffff81074e9a>] warn_slowpath_null+0x1a/0x20
Jul  6 10:55:43 kernel: [14554.231319]  [<ffffffffc02ae616>] sdhci_send_command+0x336/0x390 [sdhci]
Jul  6 10:55:43 kernel: [14554.231323]  [<ffffffffc02aee85>] sdhci_finish_data+0x115/0x3c0 [sdhci]
Jul  6 10:55:43 kernel: [14554.231325]  [<ffffffff817a9a6d>] ? printk+0x46/0x48
Jul  6 10:55:43 kernel: [14554.231329]  [<ffffffffc02af290>] ? sdhci_finish_command+0x160/0x160 [sdhci]
Jul  6 10:55:43 kernel: [14554.231333]  [<ffffffffc02af315>] sdhci_timeout_timer+0x85/0xc0 [sdhci]
Jul  6 10:55:43 kernel: [14554.231335]  [<ffffffff810daba9>] call_timer_fn+0x39/0x110
Jul  6 10:55:43 kernel: [14554.231339]  [<ffffffffc02af290>] ? sdhci_finish_command+0x160/0x160 [sdhci]
Jul  6 10:55:43 kernel: [14554.231341]  [<ffffffff810dc370>] run_timer_softirq+0x220/0x320
Jul  6 10:55:43 kernel: [14554.231344]  [<ffffffff8104a3e3>] ? lapic_next_deadline+0x33/0x40
Jul  6 10:55:43 kernel: [14554.231346]  [<ffffffff81078f04>] __do_softirq+0xe4/0x270
Jul  6 10:55:43 kernel: [14554.231348]  [<ffffffff810792cd>] irq_exit+0x9d/0xb0
Jul  6 10:55:43 kernel: [14554.231351]  [<ffffffff817b9e4a>] smp_apic_timer_interrupt+0x4a/0x60
Jul  6 10:55:43 kernel: [14554.231353]  [<ffffffff817b7e7d>] apic_timer_interrupt+0x6d/0x80
Jul  6 10:55:43 kernel: [14554.231354]  <EOI>  [<ffffffff8164ff30>] ? cpuidle_enter_state+0x70/0x170
Jul  6 10:55:43 kernel: [14554.231359]  [<ffffffff8164ff1d>] ? cpuidle_enter_state+0x5d/0x170
Jul  6 10:55:43 kernel: [14554.231360]  [<ffffffff816500e7>] cpuidle_enter+0x17/0x20
Jul  6 10:55:43 kernel: [14554.231362]  [<ffffffff810b5424>] cpu_startup_entry+0x334/0x3d0
Jul  6 10:55:43 kernel: [14554.231365]  [<ffffffff8179f987>] rest_init+0x77/0x80
Jul  6 10:55:43 kernel: [14554.231368]  [<ffffffff81d3e101>] start_kernel+0x499/0x4a6
Jul  6 10:55:43 kernel: [14554.231370]  [<ffffffff81d3da58>] ? set_init_arg+0x55/0x55
Jul  6 10:55:43 kernel: [14554.231371]  [<ffffffff81d3d120>] ? early_idt_handler_array+0x120/0x120
Jul  6 10:55:43 kernel: [14554.231373]  [<ffffffff81d3d5ee>] x86_64_start_reservations+0x2a/0x2c
Jul  6 10:55:43 kernel: [14554.231375]  [<ffffffff81d3d733>] x86_64_start_kernel+0x143/0x152
Jul  6 10:55:43 kernel: [14554.231375] ---[ end trace a7dffb2e9a4ede53 ]---
Jul  6 10:55:43 kernel: [14554.237089] mmcblk0: error -110 sending stop command, original cmd response 0x0, card status 0x400f00
Jul  6 10:55:43 kernel: [14554.237092] mmcblk0: error -110 transferring data, sector 8, nr 8, cmd response 0x0, card status 0x0
Jul  6 10:55:43 kernel: [14554.237094] mmcblk0: retrying using single block read
Jul  6 10:55:43 kernel: [14554.239266] mmcblk0: timed out sending r/w cmd command, card status 0x400f00
Jul  6 10:55:43 kernel: [14554.241358] mmcblk0: timed out sending r/w cmd command, card status 0x400f00
Jul  6 10:55:43 kernel: [14554.243460] mmcblk0: timed out sending r/w cmd command, card status 0x400f00
Jul  6 10:55:43 kernel: [14554.245545] mmcblk0: timed out sending r/w cmd command, card status 0x400f00
Jul  6 10:55:43 kernel: [14554.247592] mmcblk0: timed out sending r/w cmd command, card status 0x400f00
Jul  6 10:55:43 kernel: [14554.249648] mmcblk0: timed out sending r/w cmd command, card status 0x400f00
Jul  6 10:55:43 kernel: [14554.249651] blk_update_request: I/O error, dev mmcblk0, sector 8
Jul  6 10:55:43 kernel: [14554.249655] Buffer I/O error on dev mmcblk0, logical block 1, async page read

The SD card was bundled with the Raspberry Pi and was bought not even a year ago, on July 17 2015. A few days after this, the Pi started running and displaying our monitoring and was running 24/7.

Are SD cards considered "stable"? I wouldn't say so, as this wasn't the first time I experienced problems with SD cards (see article How to test if a SDHC card is defect or dying). Luckily we only used the Pi for displaying the monitoring status and graphs on the screen, no data was saved on it.

 

Plugged wlan access point - no link on switch (BPDU Guard disabled port)
Thursday - Jun 30th 2016

Just Plug'n'Play. Sure. 

For a small tech conference, I needed an additional WLAN access point and patched it (via a patch panel) to a Cisco Catalyst switch. The switch port quickly flashed once and went down again. At first I suspected a problem with the switch port, so I tried the same on a different port. Same effect: The port flashed once, then went dark.

On the switch itself I detected the following entries:

SWITCH#sh log
[...]
Jun 29 14:28:51.864 MEST: %SPANTREE-2-BLOCK_BPDUGUARD: Received BPDU on port Fa2/0/47 with BPDU Guard enabled. Disabling port. (SWITCH)
Jun 29 14:28:51.873 MEST: %PM-4-ERR_DISABLE: bpduguard error detected on Fa2/0/47, putting Fa2/0/47 in err-disable state (SWITCH)
Jun 29 14:30:17.891 MEST: %SPANTREE-2-BLOCK_BPDUGUARD: Received BPDU on port Fa2/0/45 with BPDU Guard enabled. Disabling port. (SWITCH)
Jun 29 14:30:17.891 MEST: %PM-4-ERR_DISABLE: bpduguard error detected on Fa2/0/45, putting Fa2/0/45 in err-disable state (SWITCH)
Jun 29 14:32:37.906 MEST: %SPANTREE-2-BLOCK_BPDUGUARD: Received BPDU on port Fa1/0/46 with BPDU Guard enabled. Disabling port.
Jun 29 14:32:37.906 MEST: %PM-4-ERR_DISABLE: bpduguard error detected on Fa1/0/46, putting Fa1/0/46 in err-disable state

At least this tells me that the switch port isn't defective. Neither is the RJ45 cable. But that's something I haven't seen before.

After some research I came across this article, explaining the ERR-DISABLE feature. The switch received spanning-tree BPDUs from the patched device (an access point acts as a bridge), and because BPDU Guard is enabled on the port, it put the port into the err-disable state as a protective measure.
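
When several ports were tried, the affected ones can be pulled out of a saved copy of the log output. A minimal sketch, assuming the log format shown above; the function name errdisabled_ports and the file name switch.log are hypothetical:

```shell
# Print the unique ports that were err-disabled by BPDU Guard,
# reading "show log" output from stdin.
errdisabled_ports() {
    sed -n 's/.*%PM-4-ERR_DISABLE: bpduguard error detected on \([^,]*\),.*/\1/p' | sort -u
}

# Possible use with a saved log:
#   errdisabled_ports < switch.log
```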

SWITCH#show run interface Fa1/0/46
Building configuration...

Current configuration : 344 bytes
!
interface FastEthernet1/0/46
 description *** User Port VLAN 111 ***
 switchport access vlan 111
 switchport mode access
 no logging event link-status
 priority-queue out
 mls qos trust dscp
 no snmp trap link-status
 storm-control broadcast level 70.00
 spanning-tree portfast
 spanning-tree bpduguard enable
 spanning-tree guard root
end


In order to temporarily allow this on this single port, the port needs to be reconfigured and the bpduguard feature disabled:

SWITCH#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
SWITCH(config)#interface FastEthernet1/0/46
SWITCH(config-if)#no spanning-tree bpduguard enable
SWITCH(config-if)#exit
SWITCH(config)#exit

But that's not enough, because the port is still down due to the err-disable feature:

SWITCH#show interfaces Fa1/0/46
FastEthernet1/0/46 is down, line protocol is down (err-disabled)
  Hardware is Fast Ethernet, address is 0099.1234.5678 (bia 0099.1234.5678)
  Description: *** User Port VLAN 111 ***
  MTU 1500 bytes, BW 10000 Kbit, DLY 1000 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Auto-duplex, Auto-speed, media type is 10/100BaseTX
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:30:05, output 00:30:06, output hang never
  Last clearing of "show interface" counters 5w5d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     900189 packets input, 81407154 bytes, 0 no buffer
     Received 446654 broadcasts (135913 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 135913 multicast, 0 pause input
     0 input packets with dribble condition detected
     3821572 packets output, 552713989 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out

In order to re-enable the port, a shutdown followed by a no shut is necessary:

SWITCH#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
SWITCH(config)#interface Fa1/0/46
SWITCH(config-if)#shut
SWITCH(config-if)#no shut
SWITCH(config-if)#exit
SWITCH(config)#exit
SWITCH#show interfaces Fa1/0/46
FastEthernet1/0/46 is up, line protocol is up (connected)
  Hardware is Fast Ethernet, address is 0099.1234.5678 (bia 0099.1234.5678)
  Description: *** User Port VLAN 111 ***
  MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 100Mb/s, media type is 10/100BaseTX
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:35:35, output 00:00:00, output hang never
  Last clearing of "show interface" counters 5w5d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 2000 bits/sec, 2 packets/sec
  5 minute output rate 2000 bits/sec, 2 packets/sec
     900282 packets input, 81423898 bytes, 0 no buffer
     Received 446735 broadcasts (135971 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 135971 multicast, 0 pause input
     0 input packets with dribble condition detected
     3821603 packets output, 552717323 bytes, 0 underruns
     0 output errors, 0 collisions, 2 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out

Now the Access Point is working.

 

Dynamically increase physical volume (PV) in an LVM setup on a VM
Tuesday - Jun 28th 2016

The VMs (running in a VMware environment) serving as LXC hosts were set up in such a way that the volume group (VG) used for the containers can be dynamically increased without downtime.

The classical and probably easiest way is to just add a new virtual disk to the VM, create a new PV on this disk and use vgextend to add the new PV to the existing VG.

A nicer method however (at least imho) is to increase the already used virtual disk/PV.

Before I touched anything, I checked the current available space of the volume group:

vgs
  VG       #PV #LV #SN Attr   VSize  VFree
  vglxc      1   5   0 wz--n- 50.00g 6.00g
  vgsystem   1   2   0 wz--n-  6.61g 1.96g

1. In VMware you can increase the disk's size given you still have space in your volume.

2. You need to tell your OS (in my case this is an Ubuntu 14.04) that it should rescan the scsi bus to detect changes. As I increased the second disk, which is seen as /dev/sdb, I launched the following command for this:

echo 1 > /sys/class/scsi_device/2\:0\:1\:0/device/rescan

If you're not sure which device number your disk (again, in my case sdb) has, you can easily double-check the device path by checking /sys/block/sdb/device:

ll /sys/block/sdb/device
lrwxrwxrwx 1 root root 0 Jun 28 09:08 /sys/block/sdb/device -> ../../../2:0:1:0

3. Now you can verify that your disk's size has changed. I used fdisk:

fdisk -l /dev/sdb

Disk /dev/sdb: 75.2 GB, 75161927680 bytes
255 heads, 63 sectors/track, 9137 cylinders, total 146800640 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

(the previous size was 20GB smaller)

4. All you need to do now is to tell LVM that the physical volume (ergo /dev/sdb in my case) has changed its size:

pvresize /dev/sdb
  Physical volume "/dev/sdb" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized

5. Check again the available space of the volume group:

vgs
  VG       #PV #LV #SN Attr   VSize  VFree
  vglxc      1   5   0 wz--n- 70.00g 26.00g
  vgsystem   1   2   0 wz--n-  6.61g  1.96g

Voilà. 20GB more space for my containers without downtime.
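
In case the numbers look inconsistent: fdisk reports the disk in decimal gigabytes (75.2 GB), while vgs uses binary units, so the 70.00g of the volume group refer to the same disk. A small sketch of the conversion, using the sector count from the fdisk output and assuming 512-byte sectors (sectors_to_gib is a hypothetical helper name):

```shell
# Convert a sector count to GiB, assuming 512-byte sectors as in the fdisk output.
sectors_to_gib() {
    echo $(( $1 * 512 / 1024 / 1024 / 1024 ))
}

sectors_to_gib 146800640   # prints 70
```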

Note: This only works "that easy" because I used the complete disk /dev/sdb as physical volume. There are no partitions on sdb. If there were, it would be mandatory to increase the partition as well.

 

Very slow VMware Perl SDK with newer Perl libwww-perl version (solved)
Tuesday - Jun 21st 2016

For a new VMware and monitoring environment, I wanted to add the ESXi hosts to the monitoring. In the past years I successfully used the plugin "check_esx" from OP5 for this purpose. Meanwhile I had to learn that the plugin was renamed to "check_vmware_api" and that this plugin in turn had been forked at least twice.

Now there are the following plugins out there which do basically the same but slightly differ in development and code:

They all have one thing in common: They rely on the VMware Perl SDK in the background. And here starts the problem.

With all three plugins I experienced timeouts. When I showed a little more patience, I finally saw that it takes more than 3 minutes (!!!) to get some results:

$ time /tmp/check_vmware_esx.pl -H esx001 -u root -p secret --select=runtime
SOAP request error - possibly a protocol issue:


[...]

real    3m46.739s
user    0m0.199s
sys     0m0.022s

The most interesting part is the first line of the output: SOAP request error - possibly a protocol issue. After some research I stumbled across the following two pages:

The first site is a discussion in the VMware community forums where the problem is identified in the SDK itself. Seems the SDK was programmed with an old version of libwww-perl. On newer systems (and my monitoring server runs Ubuntu 14.04) this causes problems.

On the second site there is actually a nice workaround presented where the author, Bob Clary, manually installed an older libwww-perl version (5.837) into a separate location and exported the path in shell scripts as a wrapper around the Perl SDK.

While the workaround seems like the way to go, I didn't want to build wrappers around the SDK (and therefore maintain this after every potential update). If I work around it, then directly in the file which causes the problems, which in my case is /usr/share/perl/5.18/VMware/VICommon.pm.

Before I did anything, I wanted to verify once again and directly from the Perl SDK itself, that the problem is not from one of the plugins but from the SDK:

time /usr/lib/vmware-vcli/apps/vm/vminfo.pl --server esx001 --username root --password secret
SOAP request error - possibly a protocol issue:
 xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
[...]

real    3m48.655s
user    0m0.184s
sys     0m0.028s

Here again it took more than 3 minutes until a result came back.

Now it's time to install the older libwww-perl version, as described in the article by Bob Clary:

cd src/
mkdir libwww-perl
cd libwww-perl
wget https://github.com/libwww-perl/libwww-perl/archive/libwww-perl/5.837.tar.gz
tar -xzf 5.837.tar.gz
cd libwww-perl-libwww-perl-5.837/
perl Makefile.PL INSTALL_BASE=/opt/check_vmware_esx
make
make install

This installs the following Perl folder in /opt/check_vmware_esx/lib/perl5:

ll /opt/check_vmware_esx/lib/perl5/
total 96
drwxr-xr-x 2 root root  4096 Jun 21 10:14 Bundle
drwxr-xr-x 2 root root  4096 Jun 21 10:14 File
drwxr-xr-x 2 root root  4096 Jun 21 10:14 HTML
drwxr-xr-x 5 root root  4096 Jun 21 10:14 HTTP
drwxr-xr-x 5 root root  4096 Jun 21 10:14 LWP
-r--r--r-- 1 root root  9137 Sep 20  2010 lwpcook.pod
-r--r--r-- 1 root root 21343 Sep 20  2010 LWP.pm
-r--r--r-- 1 root root 25528 Sep 20  2010 lwptut.pod
drwxr-xr-x 3 root root  4096 Jun 21 10:14 Net
drwxr-xr-x 3 root root  4096 Jun 21 10:14 WWW
drwxr-xr-x 3 root root  4096 Jun 21 10:14 x86_64-linux-gnu-thread-multi

Now to the modification of /usr/share/perl/5.18/VMware/VICommon.pm (which was installed by the VMware Perl SDK installer). The beginning of the file looks like this:

#
# Copyright 2006 VMware, Inc.  All rights reserved.
#

use 5.006001;
use strict;
use warnings;

use Carp qw(confess croak);
use XML::LibXML;
use LWP::UserAgent;
use LWP::ConnCache;
use HTTP::Request;
use HTTP::Headers;
use HTTP::Response;
use HTTP::Cookies;
use Data::Dumper;
[...]

To use the manually installed libwww-perl module (and not the default one installed from the Ubuntu repository), I simply added "use lib" right before the Perl version is defined:

#
# Copyright 2006 VMware, Inc.  All rights reserved.
#

use lib "/opt/check_vmware_esx/lib/perl5";

use 5.006001;
use strict;
use warnings;

use Carp qw(confess croak);
use XML::LibXML;
use LWP::UserAgent;
use LWP::ConnCache;
use HTTP::Request;
use HTTP::Headers;
use HTTP::Response;
use HTTP::Cookies;
use Data::Dumper;

Let's try the vminfo.pl command from above again:

time /usr/lib/vmware-vcli/apps/vm/vminfo.pl --server esx001 --username root --password secret

Information of Virtual Machine vm001

Name:            vm001
No. of CPU(s):           1
Memory Size:             4096
Virtual Disks:           1
Template:                0
vmPathName:              [esxz108_T2] vm001/vm001.vmx
Guest OS:                Microsoft Windows Server 2008 R2 (64-bit)
guestId:                 windows7Server64Guest
Host name:               vm001.local
IP Address:              10.10.20.157
VMware Tools:            VMware Tools is running and the version is current
Cpu usage:               39 MHz
Host memory usage:               4141 MB
Guest memory usage:              327 MB
Overall Status:          The entity is OK

Information of Virtual Machine vm002

[...]

real    0m1.498s
user    0m0.834s
sys     0m0.035s

Wow, what a change! All VMs running on esx001 were listed with detailed information. And all that in 1.498 seconds!
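To put the two measurements side by side, here is a small Python sketch that converts the "real" values printed by time into seconds and computes the speedup. The helper name parse_time_real is my own invention for this example; the two values are the ones from the vminfo.pl runs above.

```python
import re


def parse_time_real(value: str) -> float:
    """Convert a value as printed by the shell's `time` (e.g. '3m48.655s') into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" values of the two vminfo.pl runs shown above
before = parse_time_real("3m48.655s")  # with the libwww-perl from the Ubuntu repository
after = parse_time_real("0m1.498s")    # with the manually installed libwww-perl 5.837

print(f"before: {before:.3f}s, after: {after:.3f}s, speedup: {before / after:.0f}x")
```

The run with the older libwww-perl is roughly 150 times faster than the run before.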

The Perl SDK seems fixed. Does that also apply to the plugins? I tried it with check_vmware_esx:

time /tmp/check_vmware_esx.pl -H esx001 -u root -p secret -S cpu --nosession
OK: CPU wait=424610.00 ms - CPU ready=97.00 ms - CPU usage=7.10%|'cpu_wait'=424610.00ms;;;; 'cpu_ready'=97.00ms;;;; 'cpu_usage'=7.10%;;;;

real    0m0.687s
user    0m0.351s
sys     0m0.029s

Mission accomplished!

 

Migrated web-application from PHP 5.4 to PHP 7: 3x faster!
Friday - Jun 10th 2016 - by - (0 comments)

This week a web-application was migrated from PHP 5.4 to a newer server with PHP 7. It required a few modifications to the application, but it finally ran correctly.

The most interesting part: The application running under PHP 7 is now 3x faster than before! Take a look at the following graph which comes from our monitoring and measures the response time of the web-application:

PHP 5.4 vs. PHP 7 response times

Without a doubt you saw the "drop" in response time on Wednesday afternoon.

Here is some additional information:
The "old" server was a CentOS 6 VM with Nginx and PHP-FPM 5.4. The "new" server is an Ubuntu 16.04 LXC container with Nginx and PHP-FPM 7.

 

Create a fake mail server to test mail functions in applications
Tuesday - Jun 7th 2016 - by - (0 comments)

Developing and testing applications sometimes requires you to go the "non-standard" way. A good example is a web-application which has a mail function. The mails must be sent correctly, but remember, this is just a test of the application and you don't want your users to get spammed with test mails.

A fake/dummy mail server comes in handy. A developer suggested MailCatcher, which looked good to me except that it's written in Ruby and depends on gems which I didn't want to put on this system. At least not if I find something easier...

Then I came across Dummy-SMTP, a simple listener written in Python (so no additional software needs to be installed on my Ubuntu test servers) which just saves all received mails as text files in a folder.
By default, Dummy-SMTP runs on port 25, which requires root privileges to start, and it must also be started from within the correct folder. I made some modifications for the following purposes:

  • Launch the listener as non-root user
  • Launch the listener from anywhere with the absolute path

In my case, I set the listener port to 1025 and chose /srv/Dummy-SMTP as the absolute path from which the listener runs.
To install Dummy-SMTP either clone the original repository or my forked one:

cd /srv/
git clone https://github.com/Napsty/Dummy-SMTP.git
chown -R appuser:appuser /srv/Dummy-SMTP

Once the repository is cloned, become "appuser" if you aren't already and launch the listen.py file:

appuser@app01-test:~$ /srv/Dummy-SMTP/listen.py &
[1] 32189
Running fake smtp server on port 1025
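To illustrate what such a listener does, here is a minimal sketch in Python 3. It is not the code of Dummy-SMTP's listen.py; the class name FakeSMTPHandler, the function serve and the mail_dir attribute are made up for this example. The sketch answers every SMTP command with "250 Ok" and writes whatever arrives after DATA into a text file named after the current timestamp:

```python
import socketserver
import time
from pathlib import Path


class FakeSMTPHandler(socketserver.StreamRequestHandler):
    """Speak just enough SMTP to accept a mail and store it as a text file."""

    # Assumption for this sketch: the folder used in this article
    mail_dir = Path("/srv/Dummy-SMTP/mails")

    def reply(self, text: str) -> None:
        self.wfile.write((text + "\r\n").encode("ascii"))

    def handle(self) -> None:
        self.reply("220 fake smtp ready")
        while True:
            line = self.rfile.readline()
            if not line:
                break  # client closed the connection
            command = line.decode("ascii", "replace").strip().upper()
            if command.startswith("DATA"):
                self.reply("354 End data with <CR><LF>.<CR><LF>")
                self.store(self.read_body())
                self.reply("250 Ok")
            elif command.startswith("QUIT"):
                self.reply("221 Bye")
                break
            else:
                # HELO, EHLO, MAIL FROM, RCPT TO, ...: accept everything
                self.reply("250 Ok")

    def read_body(self) -> bytes:
        """Read the mail content up to the terminating line with a single dot."""
        body = []
        while True:
            line = self.rfile.readline()
            if not line or line.rstrip(b"\r\n") == b".":
                break
            # undo SMTP dot-stuffing (clients double a leading dot)
            body.append(line[1:] if line.startswith(b"..") else line)
        return b"".join(body)

    def store(self, body: bytes) -> None:
        self.mail_dir.mkdir(parents=True, exist_ok=True)
        (self.mail_dir / f"{time.time():.2f}.txt").write_bytes(body)


def serve(port: int = 1025) -> None:
    """Run the listener on localhost until interrupted."""
    with socketserver.TCPServer(("127.0.0.1", port), FakeSMTPHandler) as server:
        print(f"Running fake smtp server on port {port}")
        server.serve_forever()
```

Calling serve() starts the listener on port 1025. A port above 1024 is what allows it to run as a non-root user, and the absolute mail_dir path is what makes it independent of the folder it is started from.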

Now I adapted the Postfix installation and defined that all mails should be relayed to this fake smtp server by setting the relayhost parameter:

cat /etc/postfix/main.cf|grep relayhost
relayhost = 127.0.0.1:1025

service postfix reload

From now on a mail sent from this server should be relayed to the dummy SMTP server and simply be stored as a text file in /srv/Dummy-SMTP/mails. Let's try this:

echo "Dummy SMTP Test" | mailx -s "Test" recipient@example.com

mail.log shows:

app01-test postfix/pickup[30289]: 3559024A6A: uid=0 from=
app01-test postfix/cleanup[4940]: 3559024A6A: message-id=<20160607120336.3559024A6A@app01-test.local>
app01-test postfix/qmgr[30290]: 3559024A6A: from=, size=385, nrcpt=1 (queue active)
app01-test postfix/smtp[4942]: 3559024A6A: to=, relay=127.0.0.1[127.0.0.1]:1025, delay=0.03, delays=0.02/0.01/0/0, dsn=2.0.0, status=sent (250 Ok)
app01-test postfix/qmgr[30290]: 3559024A6A: removed

And let's check out the Dummy SMTP mail folder:

ls /srv/Dummy-SMTP/mails
1465301016.24.txt

cat /srv/Dummy-SMTP/mails/1465301016.24.txt
Received: by app01-test.local (Postfix, from userid 0)
    id 3559024A6A; Tue,  7 Jun 2016 14:03:36 +0200 (CEST)
Subject: Test
To:
X-Mailer: mail (GNU Mailutils 2.99.98)
Message-Id: <20160607120336.3559024A6A@app01-test.local>
Date: Tue,  7 Jun 2016 14:03:36 +0200 (CEST)
From: root@nzzshop-app01-test (root)

Dummy SMTP Test

Neat! It worked.
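Because the stored mails are plain text files with regular mail headers, they are also easy to inspect from a script. The following Python sketch lists every stored mail with its sender and subject. The function name list_mails is my own, and the default folder is the one used in this article:

```python
from email import message_from_bytes
from pathlib import Path


def list_mails(mail_dir: str = "/srv/Dummy-SMTP/mails"):
    """Yield (file name, From, Subject) for every stored mail, sorted by file name."""
    for path in sorted(Path(mail_dir).glob("*.txt")):
        message = message_from_bytes(path.read_bytes())
        yield path.name, message.get("From", ""), message.get("Subject", "")
```

Looping over list_mails() and printing each tuple gives a quick overview of what the application under test tried to send.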

The Ruby alternative MailCatcher mentioned above has one nice feature though: the mails can be viewed in a browser. Well, this isn't very complicated either. The mail folder of the dummy SMTP server is known (/srv/Dummy-SMTP/mails), so by creating a simple "Alias" on the Apache already running on this test server, I was able to display all sent mails in the browser, too:

cat /etc/apache2/conf-enabled/dummysmtp.conf
Alias /mails "/srv/Dummy-SMTP/mails/"
<Directory "/srv/Dummy-SMTP/mails/">
    Require all granted
    Options +FollowSymLinks +Indexes
    AllowOverride All
</Directory>

service apache2 reload

Nothing fancy of course, but it works:

Fake dummy smtp server showing mails in browser


 

