
Search field in Firefox is gone, how to get it back
Friday - Jun 15th 2018

In recent Firefox versions I noticed that the search field next to the address bar has disappeared. While it is still possible to simply enter keywords in the address bar and use it as a search field, you can no longer dynamically switch the search engine there, which used to be pretty handy at times.

Search field in Firefox gone 

But the search field can be made visible again. It simply requires a quick change of Firefox's settings.

Open a new tab, enter "about:config" in the address bar and press Enter. Accept the warning that you'll be careful.

In the search field (inside the about:config tab), enter: "browser.search.widget.inNavBar".

Firefox change config to show search field again 

As you can see, the value is set to "False". Double-click on the line of the preference and it will change to "True" (and the text will become bold). You'll also see that the search field has magically re-appeared next to the address bar:

Firefox showing search field again

 

Ansible: Detect and differentiate between LXC containers and hosts
Wednesday - Jun 13th 2018

While looking for a way to handle certain tasks differently inside an LXC container versus on a (physical) host, I first tried to use Ansible variables based on hardware facts.

But the problem, as you might know, is that LXC containers basically see the same hardware as the host because they use the same kernel (there is no hardware virtualization layer in between).
Note: That's what makes the containers much faster than VMs, just sayin'.

So checking for hardware will not work, as both host and container see the same:

$ ansible host -m setup | grep ansible_system_vendor
        "ansible_system_vendor": "HP",

$ ansible container -m setup | grep ansible_system_vendor
        "ansible_system_vendor": "HP",

When I looked through all available variables coming from "-m setup", I stumbled across ansible_virtualization_role at the end of the output. Looks interesting!

$ ansible host -m setup | grep ansible_virtualization_role
        "ansible_virtualization_role": "host",

$ ansible container -m setup | grep ansible_virtualization_role
        "ansible_virtualization_role": "guest",

Awesome!
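
This fact can now be used directly as a condition in playbooks. A minimal sketch (the tasks themselves are just made-up examples to illustrate the "when" condition):

- name: Install smartmontools on physical hosts only
  apt:
    name: smartmontools
    state: present
  when: ansible_virtualization_role == "host"

- name: Skip hardware related tasks inside LXC containers
  debug:
    msg: "Running inside a container ({{ ansible_virtualization_type }})"
  when: ansible_virtualization_role == "guest"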

 

Ubuntu 18.04: /etc/rc.local does not exist anymore - does it still work?
Monday - Jun 11th 2018

I was about to create a new Galera Cluster where a third node is simply being used as Arbitrator, not as a data node.

As this is just a simple process to start up, I usually added a line to /etc/rc.local for it. But in the new Ubuntu 18.04 this file does not exist anymore:

root@arbitrator:~# file /etc/rc.local
/etc/rc.local: cannot open `/etc/rc.local' (No such file or directory)

Sure, I could write a SystemD unit file for a garbd.service now, but lazy, you know.

So I wondered if and how /etc/rc.local still works in Ubuntu 18.04 and stumbled upon the manpage of systemd-rc-local-generator:

"systemd-rc-local-generator is a generator that checks whether /etc/rc.local exists and is executable, and if it is pulls the rc-local.service unit into the boot process[...]"

Cool, let's give it a shot.

root@arbitrator:~# echo "/usr/bin/garbd --cfg /etc/garbd.conf &" > /etc/rc.local
root@arbitrator:~# echo "exit 0" >> /etc/rc.local
root@arbitrator:~# chmod 755 /etc/rc.local

After a reboot, the process didn't show up. What happened?

root@arbitrator:~# systemctl status rc-local
● rc-local.service - /etc/rc.local Compatibility
   Loaded: loaded (/lib/systemd/system/rc-local.service; enabled-runtime; vendor preset: enabled)
  Drop-In: /lib/systemd/system/rc-local.service.d
           └─debian.conf
   Active: failed (Result: exit-code) since Mon 2018-06-11 16:53:47 CEST; 1min 53s ago
     Docs: man:systemd-rc-local-generator(8)
  Process: 1182 ExecStart=/etc/rc.local start (code=exited, status=203/EXEC)

Jun 11 16:53:46 arbitrator systemd[1]: Starting /etc/rc.local Compatibility...
Jun 11 16:53:47 arbitrator systemd[1182]: rc-local.service: Failed to execute command: Exec format error
Jun 11 16:53:47 arbitrator systemd[1182]: rc-local.service: Failed at step EXEC spawning /etc/rc.local: Exec format error
Jun 11 16:53:47 arbitrator systemd[1]: rc-local.service: Control process exited, code=exited status=203
Jun 11 16:53:47 arbitrator systemd[1]: rc-local.service: Failed with result 'exit-code'.
Jun 11 16:53:47 arbitrator systemd[1]: Failed to start /etc/rc.local Compatibility.

Exec format error? Huh?

I came across an older article from SUSE (SLES12) which looked kind of similar: the after.local.service fails to start with an exec format error. The resolution in that knowledge base article:

"Add a hashpling to the begining [...]"

D'oh! I compared it with an /etc/rc.local from another system (Xenial) and indeed it starts with a shebang line, like a normal shell script. I changed my /etc/rc.local and added #!/bin/bash:

root@arbitrator:~# cat /etc/rc.local
#!/bin/bash
/usr/bin/garbd --cfg /etc/garbd.conf &
exit 0

The rc-local service could now be restarted with success:

root@arbitrator:~# systemctl restart rc-local

root@arbitrator:~# systemctl status rc-local
● rc-local.service - /etc/rc.local Compatibility
   Loaded: loaded (/lib/systemd/system/rc-local.service; enabled-runtime; vendor preset: enabled)
  Drop-In: /lib/systemd/system/rc-local.service.d
           └─debian.conf
   Active: active (running) since Mon 2018-06-11 16:59:31 CEST; 11s ago
     Docs: man:systemd-rc-local-generator(8)
  Process: 1948 ExecStart=/etc/rc.local start (code=exited, status=0/SUCCESS)
    Tasks: 3 (limit: 4664)
   CGroup: /system.slice/rc-local.service
           └─1949 /usr/bin/garbd --cfg /etc/garbd.conf

Jun 11 16:59:31 arbitrator systemd[1]: Starting /etc/rc.local Compatibility...
Jun 11 16:59:31 arbitrator systemd[1]: Started /etc/rc.local Compatibility.

And garbd is started:

root@arbitrator:~# ps auxf | grep garb | grep -v grep
root      1949  0.0  0.2 181000  9664 ?        Sl   16:59   0:00 /usr/bin/garbd --cfg /etc/garbd.conf

TL;DR: /etc/rc.local still works in Ubuntu 18.04, when
1) it exists
2) it is executable
3) it starts with a valid shebang line ("hashpling" - is that really the name for it?)
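
And for the record, the non-lazy alternative would not have been much work either. A minimal sketch of a dedicated unit file (untested, simply re-using the command line from the rc.local above and saved as /etc/systemd/system/garbd.service):

[Unit]
Description=Galera Arbitrator Daemon
After=network-online.target

[Service]
ExecStart=/usr/bin/garbd --cfg /etc/garbd.conf
Restart=on-failure

[Install]
WantedBy=multi-user.target

It would then be enabled with "systemctl daemon-reload && systemctl enable --now garbd".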

 

Filesystem full because of Filebeat still hanging on to rotated log
Monday - Jun 11th 2018

In the past weeks I've seen recurring file system warnings on certain servers. After some investigation it turned out to be Filebeat still hanging on to an already rotated log file and therefore not releasing the inode, ergo not giving back the available disk space to the file system.

Let's start at the beginning. Icinga sent a disk warning for /var:

root@server:~# df -h
Filesystem            Type        Size  Used Avail Use% Mounted on
sysfs                 sysfs          0     0     0    - /sys
proc                  proc           0     0     0    - /proc
udev                  devtmpfs    2.0G   12K  2.0G   1% /dev
devpts                devpts         0     0     0    - /dev/pts
tmpfs                 tmpfs       396M  736K  395M   1% /run
/dev/sda1             ext4        4.5G  2.0G  2.3G  48% /
none                  tmpfs       4.0K     0  4.0K   0% /sys/fs/cgroup
none                  fusectl        0     0     0    - /sys/fs/fuse/connections
none                  debugfs        0     0     0    - /sys/kernel/debug
none                  securityfs     0     0     0    - /sys/kernel/security
none                  tmpfs       5.0M     0  5.0M   0% /run/lock
none                  tmpfs       2.0G     0  2.0G   0% /run/shm
none                  tmpfs       100M     0  100M   0% /run/user
none                  pstore         0     0     0    - /sys/fs/pstore
/dev/mapper/vg0-lvvar ext4         37G   33G  3.0G  92% /var
/dev/mapper/vg0-lvtmp ext4        922M  1.2M  857M   1% /tmp
systemd               cgroup         0     0     0    - /sys/fs/cgroup/systemd

As you can see, /var is at 92% full.

With lsof we checked for open yet deleted files:

root@server:~# lsof +L1
COMMAND    PID     USER   FD   TYPE DEVICE   SIZE/OFF NLINK   NODE NAME
init         1     root   11w   REG  252,1        309     0     79 /var/log/upstart/systemd-logind.log.1 (deleted)
filebeat 30336     root    3r   REG  252,1 1538307897     0    326 /var/log/haproxy.log.1 (deleted)
filebeat 30336     root    4r   REG  252,1 1474951702     0   2809 /var/log/haproxy.log.1 (deleted)
filebeat 30336     root    6r   REG  252,1 1513061121     0   1907 /var/log/haproxy.log.1 (deleted)
filebeat 30336     root    7r   REG  252,1 1566966965     0     72 /var/log/haproxy.log.1 (deleted)
filebeat 30336     root    8r   REG  252,1 1830485663     0   2558 /var/log/haproxy.log.1 (deleted)
filebeat 30336     root    9r   REG  252,1 1426600050     0    163 /var/log/haproxy.log.1 (deleted)
nginx    31673 www-data   64u   REG  252,1     204800     0   2978 /var/lib/nginx/proxy/9/53/0008826539 (deleted)
nginx    31674 www-data  154u   REG  252,1     204800     0 131334 /var/lib/nginx/proxy/2/95/0008825952 (deleted)
nginx    31676 www-data  111u   REG  252,1     241664     0   4064 /var/lib/nginx/proxy/0/54/0008826540 (deleted)

There it is: Filebeat still hanging on to (meanwhile) rotated HAProxy log files (which are quite big, as you can see in the SIZE/OFF column).

To release the inodes, Filebeat can either be restarted or force-reloaded.

root@server:~# /etc/init.d/filebeat force-reload
 * Restarting Filebeat sends log files to Logstash or directly to Elasticsearch. filebeat
2018/05/23 13:39:20.004534 beat.go:297: INFO Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat]
2018/05/23 13:39:20.004580 beat.go:192: INFO Setup Beat: filebeat; Version: 5.6.9
2018/05/23 13:39:20.004623 logstash.go:91: INFO Max Retries set to: 3
2018/05/23 13:39:20.004663 outputs.go:108: INFO Activated logstash as output plugin.
2018/05/23 13:39:20.004681 metrics.go:23: INFO Metrics logging every 30s
2018/05/23 13:39:20.004727 publish.go:300: INFO Publisher name: server
2018/05/23 13:39:20.004850 async.go:63: INFO Flush Interval set to: 1s
2018/05/23 13:39:20.006018 async.go:64: INFO Max Bulk Size set to: 2048
Config OK

Verification with lsof again:

root@server:~# lsof +L1
COMMAND   PID     USER   FD   TYPE DEVICE SIZE/OFF NLINK   NODE NAME
init        1     root   11w   REG  252,1      309     0     79 /var/log/upstart/systemd-logind.log.1 (deleted)
nginx   31674 www-data  154u   REG  252,1   204800     0 131334 /var/lib/nginx/proxy/2/95/0008825952 (deleted)

Looks better! What about the file system size?

root@server:~# df -h
Filesystem            Type        Size  Used Avail Use% Mounted on
sysfs                 sysfs          0     0     0    - /sys
proc                  proc           0     0     0    - /proc
udev                  devtmpfs    2.0G   12K  2.0G   1% /dev
devpts                devpts         0     0     0    - /dev/pts
tmpfs                 tmpfs       396M  736K  395M   1% /run
/dev/sda1             ext4        4.5G  2.0G  2.3G  48% /
none                  tmpfs       4.0K     0  4.0K   0% /sys/fs/cgroup
none                  fusectl        0     0     0    - /sys/fs/fuse/connections
none                  debugfs        0     0     0    - /sys/kernel/debug
none                  securityfs     0     0     0    - /sys/kernel/security
none                  tmpfs       5.0M     0  5.0M   0% /run/lock
none                  tmpfs       2.0G     0  2.0G   0% /run/shm
none                  tmpfs       100M     0  100M   0% /run/user
none                  pstore         0     0     0    - /sys/fs/pstore
/dev/mapper/vg0-lvvar ext4         37G   24G   12G  68% /var
/dev/mapper/vg0-lvtmp ext4        922M  1.2M  857M   1% /tmp
systemd               cgroup         0     0     0    - /sys/fs/cgroup/systemd

The /var partition is now back to 68% used. Good!

But how does one prevent this from happening again? In the logrotate config for HAProxy (/etc/logrotate.d/haproxy) I added a postrotate command to reload Filebeat:

root@server:~# cat /etc/logrotate.d/haproxy
/var/log/haproxy.log {
    daily
    rotate 52
    missingok
    notifempty
    compress
    delaycompress
    postrotate
        invoke-rc.d rsyslog rotate >/dev/null 2>&1 || true
        service filebeat force-reload >/dev/null 2>&1
    endscript
}

For the last couple of weeks I've been watching this on that particular server and it turns out this definitely solved the problem of Filebeat hanging on to inodes that are no longer in use. The following graph shows the disk usage of /var:

Filebeat still hanging on to logfiles after rotation 

The red circles show when I manually forced a reload of Filebeat.
The blue circle marks the day when I added the "service filebeat force-reload >/dev/null 2>&1" line to the postrotate section of the logrotate file - and when it was first executed (note the significant drop compared to the days before).

I also had to add the reload line to the logrotate config of Nginx, as I'm using Filebeat to go through HAProxy and Nginx logs.
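
The principle there is the same: the reload line is simply appended inside the already existing postrotate block of /etc/logrotate.d/nginx (sketch only, whatever rotate command is already in there stays untouched):

    postrotate
        [... existing nginx postrotate command ...]
        service filebeat force-reload >/dev/null 2>&1
    endscript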

Note: This happened under both Ubuntu 14.04 and 16.04 with Filebeat 5.6.9. Meanwhile Filebeat 6.x is out, maybe this fixes the rotated log file issue but I didn't have the time to upgrade yet. 

 

Network config in Ubuntu 18.04 Bionic is now elsewhere and handled by netplan
Friday - Jun 8th 2018

While creating a new template for Ubuntu 18.04, I stumbled across something I didn't expect.

At the end of the template creation process, I usually comment out the used IP addresses (to avoid human error when someone powers up the template with the network card enabled). So I wanted to edit /etc/network/interfaces, as always. But:

root@ubuntu1804:~# cat /etc/network/interfaces
root@ubuntu1804:~#

Umm... the config file is empty. Huh? What's going on?

But during the setup I defined a static IP (10.10.4.44)?

root@ubuntu1804:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:8d:5c:45 brd ff:ff:ff:ff:ff:ff
    inet 10.10.4.44/24 brd 10.10.4.255 scope global ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe8d:5c45/64 scope link
       valid_lft forever preferred_lft forever

Yep, the static IP is configured. Where the hell is this stored now?

root@ubuntu1804:~# grep -rni 10.10.4.44 /etc/*
/etc/hosts:2:10.10.4.44    ubuntu1804.nzzmg.local    ubuntu1804
/etc/netplan/01-netcfg.yaml:8:      addresses: [ 10.10.4.44/24 ]

/etc/netplan? Never heard of it... 

Let's check out this config file (which is written in YAML format):

root@ubuntu1804:~# cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    ens192:
      addresses: [ 10.10.4.44/24 ]
      gateway4: 10.10.4.1
      nameservers:
          search: [ example.com ]
          addresses:
              - "1.1.1.1"
              - "8.8.8.8"

OK, this doesn't look that complicated and is quite understandable. Also important here is the "renderer": according to the documentation, it refers to the network daemon that reads the netplan config file:

"Netplan reads network configuration from /etc/netplan/*.yaml which are written by administrators, installers, cloud image instantiations, or other OS deployments. During early boot, Netplan generates backend specific configuration files in /run to hand off control of devices to a particular networking daemon."

The Ubuntu default is to hand off control to networkd, which is part of the SystemD jungle:

root@ubuntu1804:~# ps auxf|grep networkd
systemd+   750  0.0  0.1 71816  5244 ?        Ss   13:27   0:00 /lib/systemd/systemd-networkd
root       930  0.0  0.4 170424 17128 ?        Ssl  13:27   0:00 /usr/bin/python3 /usr/bin/networkd-dispatcher
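
Side note: systemd-networkd also ships its own small CLI, networkctl, which lists the links it currently manages and their state:

root@ubuntu1804:~# networkctl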

So if I change the IP address to something else, how does one apply the new IP and will it be immediate?

root@ubuntu1804:~# ip address show ens192
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:8d:5d:37 brd ff:ff:ff:ff:ff:ff
    inet 10.10.4.44/25 brd 10.150.2.127 scope global ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe8d:5d37/64 scope link
       valid_lft forever preferred_lft forever

root@ubuntu1804:~# sed -i "s/10.10.4.44/10.150.2.98/g" /etc/netplan/01-netcfg.yaml

root@ubuntu1804:~# netplan apply

# [ Note: Here my SSH session got disconnected - so yes, IP change was immediate ]

admck@ubuntu1804:~$ ip address show ens192
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:8d:5d:37 brd ff:ff:ff:ff:ff:ff
    inet 10.150.2.98/25 brd 10.150.2.127 scope global ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe8d:5d37/64 scope link
       valid_lft forever preferred_lft forever

Right after "netplan apply", the IP was changed. I lost my SSH connection (of course) and could log back in using the new IP. So yes, IP change was immediate.
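
Note: netplan also comes with a "try" command which applies the new configuration but automatically rolls it back if it is not confirmed within a timeout. That would have been the safer choice when changing the address of the very interface the SSH session runs over:

root@ubuntu1804:~# netplan try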

I just wonder how often I will still use "vi /etc/network/interfaces" as this has been stuck in my head for more than 10 years. Old habits die hard.

 

Rancher 1.6: Github user not able to login (Internal Server Error)
Thursday - May 31st 2018

On our Rancher 1.6 environment one particular developer was unable to log in to Rancher using his Github account. Whenever he tried, he got an "Internal Server Error".

Rancher Login Internal Server Error 

But all other users were able to log in, so I first suspected a problem with his Github account (e.g. OAuth disabled or similar). While researching the problem we came across an issue in Rancher's Github repository describing similar symptoms:

Some of the users from the organization added in the environment can log in with no problem, but some get Internal Server Error

The issue was closed by the OP himself with the reason:

It was because of mysql being in latin and not utf8

Could this be our problem, too?

I verified and indeed the database and all its tables were set to latin1_swedish_ci (the "old default" of MySQL). The Rancher 1.6 install documentation clearly states to create the database with utf8 though:

Instead of using the internal database that comes with Rancher server, you can start Rancher server pointing to an external database. The command would be the same, but appending in additional arguments to direct how to connect to your external database.

Here is an example of a SQL command to create a database and users.

> CREATE DATABASE IF NOT EXISTS cattle COLLATE = 'utf8_general_ci' CHARACTER SET = 'utf8';
> GRANT ALL ON cattle.* TO 'cattle'@'%' IDENTIFIED BY 'cattle';
> GRANT ALL ON cattle.* TO 'cattle'@'localhost' IDENTIFIED BY 'cattle';

Why wasn't our database created with utf8? Turns out we're using AWS RDS for this database and there is no option to set the collation and character set at creation time:

RDS Aurora DB Options 

But can the failed login really be blamed on the database's character set? I entered the Rancher container and checked the logs:

root@rancher01:/# docker exec -it 1d0ec49acc20 bash

root@1d0ec49acc20:/# cd /var/lib/cattle/logs

root@1d0ec49acc20:/var/lib/cattle/logs# cat cattle-debug.log|grep -i github
[...]
Query is: insert into `account` (`name`, `kind`, `uuid`, `state`, `created`, `data`, `external_id`, `external_id_type`, `health_state`, `version`) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?), parameters ['Paweł Lewandowski','user','de7b9940-e699-422f-8586-5a964adec35a','requested','2018-05-30 10:10:27.511','{}','6796097','github_user','healthy','2']
[...]

Yes, there is indeed a special character ("L with stroke") in the first name. This character is not part of the latin1 character set (collation latin1_swedish_ci). The failed login can therefore really be blamed on the wrong database/table character set.
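
A possible fix (we have not applied it yet, so treat it as a sketch and take a backup first) would be to convert the database default and the affected tables to utf8. Using "cattle" as the database name from the documentation example above (adjust to your own setup):

mysql> ALTER DATABASE cattle CHARACTER SET utf8 COLLATE utf8_general_ci;
mysql> ALTER TABLE cattle.account CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;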

 

Quickly show character sets and collations on a MySQL database
Wednesday - May 30th 2018

To find out which character sets and collations all tables of a given database (here: rancher_dev) use, the following query helps:

mysql> SELECT TABLE_SCHEMA, TABLE_NAME, COLLATION_NAME, CHARACTER_SET_NAME FROM information_schema.`TABLES` T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "rancher_dev";
+--------------+-----------------------------------------------+-------------------+--------------------+
| TABLE_SCHEMA | TABLE_NAME                                    | COLLATION_NAME    | CHARACTER_SET_NAME |
+--------------+-----------------------------------------------+-------------------+--------------------+
| rancher_dev  | DATABASECHANGELOG                             | latin1_swedish_ci | latin1             |
| rancher_dev  | DATABASECHANGELOGLOCK                         | latin1_swedish_ci | latin1             |
| rancher_dev  | account                                       | latin1_swedish_ci | latin1             |
| rancher_dev  | account_link                                  | latin1_swedish_ci | latin1             |
| rancher_dev  | agent                                         | latin1_swedish_ci | latin1             |
| rancher_dev  | agent_group                                   | latin1_swedish_ci | latin1             |
| rancher_dev  | audit_log                                     | latin1_swedish_ci | latin1             |
| rancher_dev  | auth_token                                    | latin1_swedish_ci | latin1             |
| rancher_dev  | backup                                        | latin1_swedish_ci | latin1             |
| rancher_dev  | backup_target                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | catalog                                       | latin1_swedish_ci | latin1             |
| rancher_dev  | catalog_category                              | latin1_swedish_ci | latin1             |
| rancher_dev  | catalog_file                                  | latin1_swedish_ci | latin1             |
| rancher_dev  | catalog_label                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | catalog_template                              | latin1_swedish_ci | latin1             |
| rancher_dev  | catalog_template_category                     | latin1_swedish_ci | latin1             |
| rancher_dev  | catalog_version                               | latin1_swedish_ci | latin1             |
| rancher_dev  | catalog_version_label                         | latin1_swedish_ci | latin1             |
| rancher_dev  | certificate                                   | latin1_swedish_ci | latin1             |
| rancher_dev  | cluster_host_map                              | latin1_swedish_ci | latin1             |
| rancher_dev  | cluster_membership                            | latin1_swedish_ci | latin1             |
| rancher_dev  | config_item                                   | latin1_swedish_ci | latin1             |
| rancher_dev  | config_item_status                            | latin1_swedish_ci | latin1             |
| rancher_dev  | container_event                               | latin1_swedish_ci | latin1             |
| rancher_dev  | credential                                    | latin1_swedish_ci | latin1             |
| rancher_dev  | credential_instance_map                       | latin1_swedish_ci | latin1             |
| rancher_dev  | data                                          | latin1_swedish_ci | latin1             |
| rancher_dev  | deployment_unit                               | latin1_swedish_ci | latin1             |
| rancher_dev  | dynamic_schema                                | latin1_swedish_ci | latin1             |
| rancher_dev  | dynamic_schema_role                           | latin1_swedish_ci | latin1             |
| rancher_dev  | environment                                   | latin1_swedish_ci | latin1             |
| rancher_dev  | external_event                                | latin1_swedish_ci | latin1             |
| rancher_dev  | external_handler                              | latin1_swedish_ci | latin1             |
| rancher_dev  | external_handler_external_handler_process_map | latin1_swedish_ci | latin1             |
| rancher_dev  | external_handler_process                      | latin1_swedish_ci | latin1             |
| rancher_dev  | generic_object                                | latin1_swedish_ci | latin1             |
| rancher_dev  | global_load_balancer                          | latin1_swedish_ci | latin1             |
| rancher_dev  | healthcheck_instance                          | latin1_swedish_ci | latin1             |
| rancher_dev  | healthcheck_instance_host_map                 | latin1_swedish_ci | latin1             |
| rancher_dev  | host                                          | latin1_swedish_ci | latin1             |
| rancher_dev  | host_ip_address_map                           | latin1_swedish_ci | latin1             |
| rancher_dev  | host_label_map                                | latin1_swedish_ci | latin1             |
| rancher_dev  | host_template                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | host_vnet_map                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | image                                         | latin1_swedish_ci | latin1             |
| rancher_dev  | image_storage_pool_map                        | latin1_swedish_ci | latin1             |
| rancher_dev  | instance                                      | latin1_swedish_ci | latin1             |
| rancher_dev  | instance_host_map                             | latin1_swedish_ci | latin1             |
| rancher_dev  | instance_label_map                            | latin1_swedish_ci | latin1             |
| rancher_dev  | instance_link                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | ip_address                                    | latin1_swedish_ci | latin1             |
| rancher_dev  | ip_address_nic_map                            | latin1_swedish_ci | latin1             |
| rancher_dev  | ip_association                                | latin1_swedish_ci | latin1             |
| rancher_dev  | ip_pool                                       | latin1_swedish_ci | latin1             |
| rancher_dev  | label                                         | latin1_swedish_ci | latin1             |
| rancher_dev  | load_balancer                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | load_balancer_certificate_map                 | latin1_swedish_ci | latin1             |
| rancher_dev  | load_balancer_config                          | latin1_swedish_ci | latin1             |
| rancher_dev  | load_balancer_config_listener_map             | latin1_swedish_ci | latin1             |
| rancher_dev  | load_balancer_host_map                        | latin1_swedish_ci | latin1             |
| rancher_dev  | load_balancer_listener                        | latin1_swedish_ci | latin1             |
| rancher_dev  | load_balancer_target                          | latin1_swedish_ci | latin1             |
| rancher_dev  | machine_driver                                | latin1_swedish_ci | latin1             |
| rancher_dev  | mount                                         | latin1_swedish_ci | latin1             |
| rancher_dev  | network                                       | latin1_swedish_ci | latin1             |
| rancher_dev  | network_driver                                | latin1_swedish_ci | latin1             |
| rancher_dev  | network_service                               | latin1_swedish_ci | latin1             |
| rancher_dev  | network_service_provider                      | latin1_swedish_ci | latin1             |
| rancher_dev  | network_service_provider_instance_map         | latin1_swedish_ci | latin1             |
| rancher_dev  | nic                                           | latin1_swedish_ci | latin1             |
| rancher_dev  | offering                                      | latin1_swedish_ci | latin1             |
| rancher_dev  | physical_host                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | port                                          | latin1_swedish_ci | latin1             |
| rancher_dev  | process_execution                             | latin1_swedish_ci | latin1             |
| rancher_dev  | process_instance                              | latin1_swedish_ci | latin1             |
| rancher_dev  | project_member                                | latin1_swedish_ci | latin1             |
| rancher_dev  | project_template                              | latin1_swedish_ci | latin1             |
| rancher_dev  | resource_pool                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | scheduled_upgrade                             | latin1_swedish_ci | latin1             |
| rancher_dev  | secret                                        | latin1_swedish_ci | latin1             |
| rancher_dev  | service                                       | latin1_swedish_ci | latin1             |
| rancher_dev  | service_consume_map                           | latin1_swedish_ci | latin1             |
| rancher_dev  | service_event                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | service_expose_map                            | latin1_swedish_ci | latin1             |
| rancher_dev  | service_index                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | service_log                                   | latin1_swedish_ci | latin1             |
| rancher_dev  | setting                                       | latin1_swedish_ci | latin1             |
| rancher_dev  | snapshot                                      | latin1_swedish_ci | latin1             |
| rancher_dev  | snapshot_storage_pool_map                     | latin1_swedish_ci | latin1             |
| rancher_dev  | storage_driver                                | latin1_swedish_ci | latin1             |
| rancher_dev  | storage_pool                                  | latin1_swedish_ci | latin1             |
| rancher_dev  | storage_pool_host_map                         | latin1_swedish_ci | latin1             |
| rancher_dev  | subnet                                        | latin1_swedish_ci | latin1             |
| rancher_dev  | subnet_vnet_map                               | latin1_swedish_ci | latin1             |
| rancher_dev  | task                                          | latin1_swedish_ci | latin1             |
| rancher_dev  | task_instance                                 | latin1_swedish_ci | latin1             |
| rancher_dev  | ui_challenge                                  | latin1_swedish_ci | latin1             |
| rancher_dev  | user_preference                               | latin1_swedish_ci | latin1             |
| rancher_dev  | vnet                                          | latin1_swedish_ci | latin1             |
| rancher_dev  | volume                                        | latin1_swedish_ci | latin1             |
| rancher_dev  | volume_storage_pool_map                       | latin1_swedish_ci | latin1             |
| rancher_dev  | volume_template                               | latin1_swedish_ci | latin1             |
| rancher_dev  | zone                                          | latin1_swedish_ci | latin1             |
| rancher_dev  | cluster                                       | utf8_general_ci   | utf8               |
+--------------+-----------------------------------------------+-------------------+--------------------+
104 rows in set (0.01 sec)

To adapt the query to your own database, simply replace "rancher_dev" with your database name.
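
The database's own default character set and collation can be checked in a similar way:

mysql> SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE SCHEMA_NAME = "rancher_dev";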

Based on the information found on https://stackoverflow.com/questions/1049728/how-do-i-see-what-character-set-a-mysql-database-table-column-is .

 

Install Linux Mint 18.3 on software raid (mdraid) device
Saturday - May 26th 2018

When I revamped my computer (bought in 2011) a few days ago, I replaced the two internal 750GB hard drives with three SSDs.

  • Drive 1 (/dev/sda) 500GB Samsung 850 Evo
  • Drive 2 (/dev/sdb) 500GB Samsung 850 Evo
  • Drive 3 (/dev/sdc) 120GB ADATA SSD S510

I wanted to use both Samsung drives as a raid-1 for my main installation, Linux Mint 18.3.

The standalone SSD would be used for a Windows 7 installation for dual booting.

When I launched the Linux Mint 18.3 installation, I couldn't find any option to create a software raid. So I created the raid devices manually (mdadm...) and restarted the installer. At the end of the installation the installer asks to reboot, which is what I did - only to end up in the grub loader which couldn't find any operating system. Great :-/

After some trial and error, I finally found a way that works. If you want to install Linux Mint 18.3 on a software raid, follow these steps. Make sure you are using the correct device names; in my case they were /dev/sda and /dev/sdb.

1) Create the partitions on /dev/sda

I chose a very simple approach here with two partitions. The main partition almost fills up the whole disk, leaving only 4GB for the second partition (swap).

mint ~ # sfdisk -l /dev/sda
Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x4f1db047

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1  *         2048 968384511 968382464 461.8G 83 Linux
/dev/sda2       968384512 976773119   8388608     4G 82 Linux swap / Solaris

2) Copy the partition table from SDA to SDB

The following command dumps (-d) the partition table from /dev/sda and writes it to /dev/sdb:

mint ~ # sfdisk -d /dev/sda | sfdisk /dev/sdb
Checking that no-one is using this disk right now ... OK

Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x16a9c579

Old situation:

>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new DOS disklabel with disk identifier 0x4f1db047.
Created a new partition 1 of type 'Linux' and of size 461.8 GiB.
/dev/sdb2: Created a new partition 2 of type 'Linux swap / Solaris' and of size 4 GiB.
/dev/sdb3:
New situation:

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sdb1  *         2048 968384511 968382464 461.8G 83 Linux
/dev/sdb2       968384512 976773119   8388608     4G 82 Linux swap / Solaris

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

3) Create the raid devices

First I created /dev/md0, which will hold the Linux Mint installation:

mint ~ # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm: /dev/sda1 appears to contain an ext2fs file system
       size=484191232K  mtime=Fri May 25 15:31:47 2018
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: /dev/sdb1 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Fri May 25 12:50:09 2018
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

Then I created /dev/md1, which will be used as the swap partition:

mint ~ # mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: /dev/sdb2 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Fri May 25 12:50:23 2018
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.

4) Wait for the sync to be completed

mint ~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
      4190208 blocks super 1.2 [2/2] [UU]
          resync=DELAYED
     
md0 : active raid1 sdb1[1] sda1[0]
      484060160 blocks super 1.2 [2/2] [UU]
      [>....................]  resync =  0.7% (3421824/484060160) finish=39.7min speed=201283K/sec
      bitmap: 4/4 pages [16KB], 65536KB chunk

unused devices:

Yes, patience you must have.
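
To keep an eye on the resync progress without re-typing the command all the time, watch does the job:

mint ~ # watch -n 10 cat /proc/mdstat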

mint ~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
      4190208 blocks super 1.2 [2/2] [UU]
     
md0 : active raid1 sdb1[1] sda1[0]
      484060160 blocks super 1.2 [2/2] [UU]
      bitmap: 0/4 pages [0KB], 65536KB chunk

unused devices:

5) Format the raid device /dev/md0

I will be using an ext4 filesystem, so:

mint ~ # mkfs.ext4 /dev/md0
mke2fs 1.42.13 (17-May-2015)
Discarding device blocks: done                           
Creating filesystem with 121015040 4k blocks and 30261248 inodes
Filesystem UUID: 8f662d46-4759-4b81-b879-eb60dd643f41
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
    102400000

Allocating group tables: done                           
Writing inode tables: done                           
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

6) Launch the installer

But launch it from the command line:

mint ~ # ubiquity -b

7) In the installer...

When the installer asks about the installation type, select "Something else":

Linux Mint Install Something Else 

The raid devices /dev/md0 (with ext4 as type) and /dev/md1 (with swap as type) should be shown in the list:

Linux Mint 18.3 partitions 

Double-click on the /dev/md0 row and select to use the partition as ext4, mounted as /.

Double-click on the /dev/md1 row and select to use the partition as swap.

Important: Make sure the other swap partitions (on the devices SDA and SDB) are set to "do not use this partition". Otherwise the installer will fail and somehow crash...

Make sure the /dev/md0 row is selected, then click on "Install now":

Linux Mint 

8) At the end of the installation...

Very important: DO NOT click on "Restart Now". Click on "Continue Testing" instead. Otherwise you will run into the same failing boot I described at the beginning of this article.

Linux Mint installation completed 

9) Prepare the Linux Mint installation to chroot into

Launch a terminal window and mount /dev/md0:

mint ~ # mount /dev/md0 /mnt

Also bind-mount the dev, dev/pts, sys and proc file systems into /mnt:

mint ~ # for i in /dev /dev/pts /sys /proc; do mount --bind $i /mnt/$i; done

In case the resolv.conf inside the Linux Mint installation is empty, enter a nameserver manually:

mint ~ # cat /mnt/etc/resolv.conf
mint ~ # echo "nameserver 1.1.1.1" > /mnt/etc/resolv.conf

Now chroot into your Linux Mint installation, mounted as /mnt:

mint ~ # chroot /mnt

10) Fix grub in the terminal

Now install the package mdadm into the Linux Mint installation. I will show the full output here:

mint / # apt-get install mdadm
Reading package lists... Done
Building dependency tree      
Reading state information... Done
Suggested packages:
  default-mta | mail-transport-agent dracut-core
The following NEW packages will be installed:
  mdadm
0 upgraded, 1 newly installed, 0 to remove and 326 not upgraded.
Need to get 394 kB of archives.
After this operation, 1,208 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 mdadm amd64 3.3-2ubuntu7.6 [394 kB]
Fetched 394 kB in 0s (1,491 kB/s)
Preconfiguring packages ...
Selecting previously unselected package mdadm.
(Reading database ... 199757 files and directories currently installed.)
Preparing to unpack .../mdadm_3.3-2ubuntu7.6_amd64.deb ...
Unpacking mdadm (3.3-2ubuntu7.6) ...
Processing triggers for systemd (229-4ubuntu21) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for doc-base (0.10.7) ...
Processing 4 added doc-base files...
Registering documents with scrollkeeper...
Processing triggers for man-db (2.7.5-1) ...
Setting up mdadm (3.3-2ubuntu7.6) ...
Generating mdadm.conf... done.
update-initramfs: deferring update (trigger activated)
Generating grub configuration file ...
Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.
Found linux image: /boot/vmlinuz-4.10.0-38-generic
Found initrd image: /boot/initrd.img-4.10.0-38-generic
Found memtest86+ image: /boot/memtest86+.elf
Found memtest86+ image: /boot/memtest86+.bin
ERROR: isw: Could not find disk /dev/sdd in the metadata
ERROR: isw: Could not find disk /dev/sdd in the metadata
ERROR: isw: Could not find disk /dev/sdd in the metadata
ERROR: isw: Could not find disk /dev/sdd in the metadata
ERROR: isw: Could not find disk /dev/sdd in the metadata
ERROR: isw: Could not find disk /dev/sdd in the metadata
ERROR: isw: Could not find disk /dev/sdd in the metadata
ERROR: isw: Could not find disk /dev/sdd in the metadata
File descriptor 3 (pipe:[1799227]) leaked on lvs invocation. Parent PID 29155: /bin/sh
  /run/lvm/lvmetad.socket: connect failed: No such file or directory
  WARNING: Failed to connect to lvmetad. Falling back to internal scanning.
Found Windows 7 (loader) on /dev/sdd1
done
Running in chroot, ignoring request.
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
Processing triggers for systemd (229-4ubuntu21) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for initramfs-tools (0.122ubuntu8.9) ...
update-initramfs: Generating /boot/initrd.img-4.10.0-38-generic
Warning: No support for locale: en_US.utf8

You can ignore the errors about drive /dev/sdd (it was an additional USB drive, nothing to do with the installation).

Very important here: when mdadm was installed into the Linux Mint installation, a new kernel initramfs was generated, the grub config was created and mdadm.conf was written. Obviously this step (installing mdadm into the target installation, thereby making it aware that it sits on a Linux software raid) was skipped by the installer...

11) Verification

After mdadm was installed, a bunch of necessary files were created. Let's start with grub:

mint / # ll /boot/grub/
total 2368
drwxr-xr-x 2 root root    4096 May 26 08:08 ./
drwxr-xr-x 3 root root    4096 May 26 08:08 ../
-rw-r--r-- 1 root root     712 Nov 24  2017 gfxblacklist.txt
-r--r--r-- 1 root root    9734 May 26 08:08 grub.cfg
-rw-r--r-- 1 root root 2398585 Nov 24  2017 unicode.pf2

grub.cfg was only created once mdadm was installed. No wonder, a boot was not possible without this manual fix.

What does it contain?

mint / # cat /boot/grub/grub.cfg
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#
[...]
insmod part_msdos
insmod part_msdos
insmod diskfilter
insmod mdraid1x
insmod ext2
set root='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'
if [ x$feature_platform_search_hint = xy ]; then
  search --no-floppy --fs-uuid --set=root --hint='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'  8f662d46-4759-4b81-b879-eb60dd643f41
else
  search --no-floppy --fs-uuid --set=root 8f662d46-4759-4b81-b879-eb60dd643f41
fi
    font="/usr/share/grub/unicode.pf2"
fi
[...]
menuentry 'Linux Mint 18.3 Cinnamon 64-bit' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-8f662d46-4759-4b81-b879-eb60dd643f41' {
    recordfail
    load_video
    gfxmode $linux_gfx_mode
    insmod gzio
    if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
    insmod part_msdos
    insmod part_msdos
    insmod diskfilter
    insmod mdraid1x
    insmod ext2
    set root='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'
    if [ x$feature_platform_search_hint = xy ]; then
      search --no-floppy --fs-uuid --set=root --hint='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'  8f662d46-4759-4b81-b879-eb60dd643f41
    else
      search --no-floppy --fs-uuid --set=root 8f662d46-4759-4b81-b879-eb60dd643f41
    fi
        linux    /boot/vmlinuz-4.10.0-38-generic root=UUID=8f662d46-4759-4b81-b879-eb60dd643f41 ro  quiet splash $vt_handoff
    initrd    /boot/initrd.img-4.10.0-38-generic
}
submenu 'Advanced options for Linux Mint 18.3 Cinnamon 64-bit' $menuentry_id_option 'gnulinux-advanced-8f662d46-4759-4b81-b879-eb60dd643f41' {
    menuentry 'Linux Mint 18.3 Cinnamon 64-bit, with Linux 4.10.0-38-generic' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.10.0-38-generic-advanced-8f662d46-4759-4b81-b879-eb60dd643f41' {
        recordfail
        load_video
        gfxmode $linux_gfx_mode
        insmod gzio
        if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
        insmod part_msdos
        insmod part_msdos
        insmod diskfilter
        insmod mdraid1x
        insmod ext2
        set root='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'  8f662d46-4759-4b81-b879-eb60dd643f41
        else
          search --no-floppy --fs-uuid --set=root 8f662d46-4759-4b81-b879-eb60dd643f41
        fi
        echo    'Loading Linux 4.10.0-38-generic ...'
            linux    /boot/vmlinuz-4.10.0-38-generic root=UUID=8f662d46-4759-4b81-b879-eb60dd643f41 ro  quiet splash $vt_handoff
        echo    'Loading initial ramdisk ...'
        initrd    /boot/initrd.img-4.10.0-38-generic
    }
    menuentry 'Linux Mint 18.3 Cinnamon 64-bit, with Linux 4.10.0-38-generic (upstart)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.10.0-38-generic-init-upstart-8f662d46-4759-4b81-b879-eb60dd643f41' {
        recordfail
        load_video
        gfxmode $linux_gfx_mode
        insmod gzio
        if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
        insmod part_msdos
        insmod part_msdos
        insmod diskfilter
        insmod mdraid1x
        insmod ext2
        set root='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'  8f662d46-4759-4b81-b879-eb60dd643f41
        else
          search --no-floppy --fs-uuid --set=root 8f662d46-4759-4b81-b879-eb60dd643f41
        fi
        echo    'Loading Linux 4.10.0-38-generic ...'
            linux    /boot/vmlinuz-4.10.0-38-generic root=UUID=8f662d46-4759-4b81-b879-eb60dd643f41 ro  quiet splash $vt_handoff init=/sbin/upstart
        echo    'Loading initial ramdisk ...'
        initrd    /boot/initrd.img-4.10.0-38-generic
    }
    menuentry 'Linux Mint 18.3 Cinnamon 64-bit, with Linux 4.10.0-38-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.10.0-38-generic-recovery-8f662d46-4759-4b81-b879-eb60dd643f41' {
        recordfail
        load_video
        insmod gzio
        if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
        insmod part_msdos
        insmod part_msdos
        insmod diskfilter
        insmod mdraid1x
        insmod ext2
        set root='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint='mduuid/650864013e2d41cf1f2acfeafc5c2bd7'  8f662d46-4759-4b81-b879-eb60dd643f41
        else
          search --no-floppy --fs-uuid --set=root 8f662d46-4759-4b81-b879-eb60dd643f41
        fi
        echo    'Loading Linux 4.10.0-38-generic ...'
            linux    /boot/vmlinuz-4.10.0-38-generic root=UUID=8f662d46-4759-4b81-b879-eb60dd643f41 ro recovery nomodeset
        echo    'Loading initial ramdisk ...'
        initrd    /boot/initrd.img-4.10.0-38-generic
    }
}

### END /etc/grub.d/10_linux ###
[...]

As the root device (set root), an mdraid UUID (mduuid/650864013e2d41cf1f2acfeafc5c2bd7) is used. Let's double-check that against the entries in mdadm.conf:

mint / # cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md/0  metadata=1.2 UUID=65086401:3e2d41cf:1f2acfea:fc5c2bd7 name=mint:0
ARRAY /dev/md/1  metadata=1.2 UUID=ad052a0a:eb1f9198:ec842848:215b650b name=mint:1
ARRAY metadata=imsm UUID=0b24ad7f:9b251541:a98a3748:f6333faa
ARRAY /dev/md/RAID1 container=0b24ad7f:9b251541:a98a3748:f6333faa member=0 UUID=aaa62640:f0d57fc8:6c097c8f:547b9c8f

# This file was auto-generated on Sat, 26 May 2018 08:08:42 +0200
# by mkconf $Id$

The UUID for /dev/md/0 looks familiar ;-). It's the same UUID as used in the grub config. So far so good.

Let's check /etc/fstab, too:

mint / # cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
#              
# / was on /dev/md0 during installation
UUID=8f662d46-4759-4b81-b879-eb60dd643f41 /               ext4    errors=remount-ro 0       1
# swap was on /dev/md1 during installation
UUID=b8277371-01fb-4aa3-bec1-9c0a4295deea none            swap    sw              0       0

Here another UUID is used (because it's a device UUID, not an mdraid UUID). We can verify these by checking /dev/disk/by-uuid:

mint / # ls -la /dev/disk/by-uuid/
total 0
drwxr-xr-x 2 root root 180 May 26 08:05 .
drwxr-xr-x 6 root root 120 May 25 18:59 ..
lrwxrwxrwx 1 root root  10 May 26 08:05 101EC9371EC9171E -> ../../sdd2
lrwxrwxrwx 1 root root  10 May 26 08:05 1EF881D5F881AB99 -> ../../sdc1
lrwxrwxrwx 1 root root   9 May 26 08:05 2017-11-24-13-25-42-00 -> ../../sr0
lrwxrwxrwx 1 root root   9 May 26 08:08 8f662d46-4759-4b81-b879-eb60dd643f41 -> ../../md0
lrwxrwxrwx 1 root root   9 May 26 08:08 b8277371-01fb-4aa3-bec1-9c0a4295deea -> ../../md1
lrwxrwxrwx 1 root root  10 May 26 08:05 CEAABB7CAABB601F -> ../../sdd1
lrwxrwxrwx 1 root root  10 May 26 08:05 E646DD2A46DCFBED -> ../../sdd3

Both UUIDs used in fstab (for the root partition and for swap) are here.
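
Just for reference: should these files ever be missing or outdated (for example after changing the raid setup later on), they can be regenerated from within the chroot:

mint / # update-initramfs -u
mint / # update-grub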

12) Grub install on the physical drives

Now that everything looks in order, we can install grub to /dev/sda and /dev/sdb.

mint / # grub-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.

mint / # grub-install /dev/sdb
Installing for i386-pc platform.
Installation finished. No error reported.

Very good, no errors.
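
Optionally double-check the array itself before rebooting; mdadm --detail should list both sda1 and sdb1 as active sync devices and the array state as clean:

mint / # mdadm --detail /dev/md0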

Now I exited the chroot environment and rebooted the machine.

mint / # exit

mint ~ # reboot

13) Booting

And finally, success: Linux Mint 18.3 now boots from the software raid-1 /dev/md0 device.

Grub 

Linux Mint finished booting on software raid

Update June 11th 2018:
After I installed Linux Mint on the raid device, I discovered that Windows 7 (on /dev/sdc) did not boot anymore and failed with the following error:

MBR

BOOTMGR is missing
Press Ctrl+Alt+Del to restart

All attempts to repair the Windows boot with the Windows 7 DVD failed. I had to physically unplug the two Samsung SSDs and then boot from the Windows 7 DVD again.

This time the Windows 7 repair was able to fix the boot loader. It seems the Windows repair requires the Windows drive to be detected as the first drive (/dev/sda); only then does the repair work.

After I was able to boot into my Windows 7 installation on the ADATA SSD again, I replugged the two Samsung SSDs. This made the Windows drive /dev/sdc again, but with the fixed boot loader I can now boot into Windows 7 from the Grub menu.

 

Hardware Error: Parity error during data load. Or: Clean me!!!
Thursday - May 24th 2018

For a couple of months I've been wondering about the following error messages, appearing every five minutes on my NAS, an HP Proliant N40L Microserver running Debian 7 Wheezy:

[Hardware Error]: CPU:0 (10:6:3) MC2_STATUS[-|CE|-|-|AddrV|CECC]: 0x940040000000018a
[Hardware Error]: MC2_ADDR: 0x00000000d3b42540
[Hardware Error]: MC2 Error: : SNP error during data copyback.
[Hardware Error]: cache level: L2, tx: GEN, mem-tx: SNP
[Hardware Error]: Corrected error, no action required.

I came across a few articles about these messages, but none offered a real solution to the problem. Some even said these logged error messages could simply be ignored...

A couple of days ago I upgraded the NAS server from Debian Wheezy to Jessie (as a mid-way upgrade to Stretch) and realized after the successful OS upgrade that the log entries now appear ALL THE TIME. I couldn't even use the terminal anymore because it was flooded by these messages:

[ 1026.904428] [Hardware Error]: CPU:0 (10:6:3) MC2_STATUS[-|CE|-|-|AddrV|CECC]: 0x940040000000018a
[ 1026.910229] [Hardware Error]: MC2_ADDR: 0x00000000d3b42540
[ 1026.915945] [Hardware Error]: MC2 Error: : SNP error during data copyback.
[ 1026.921690] [Hardware Error]: cache level: L2, tx: GEN, mem-tx: SNP
[ 1027.182836] [Hardware Error]: Corrected error, no action required.
[ 1027.188553] [Hardware Error]: CPU:0 (10:6:3) MC2_STATUS[-|CE|-|-|AddrV|CECC]: 0x940040000000018a
[ 1027.194345] [Hardware Error]: MC2_ADDR: 0x0000000001af2540
[ 1027.200132] [Hardware Error]: MC2 Error: : SNP error during data copyback.
[ 1027.205915] [Hardware Error]: cache level: L2, tx: GEN, mem-tx: SNP
[ 1027.338890] [Hardware Error]: Corrected error, no action required.
[ 1027.344632] [Hardware Error]: CPU:0 (10:6:3) MC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151
[ 1027.350428] [Hardware Error]: MC1_ADDR: 0x0000ffff81012550
[ 1027.356222] [Hardware Error]: MC1 Error: Parity error during data load.
[ 1027.361997] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
[ 1027.430924] [Hardware Error]: Corrected error, no action required.
[ 1027.436645] [Hardware Error]: CPU:0 (10:6:3) MC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151
[ 1027.442419] [Hardware Error]: MC1_ADDR: 0x0000ffff810b2550
[ 1027.448216] [Hardware Error]: MC1 Error: Parity error during data load.
[ 1027.453960] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
[ 1027.939102] [Hardware Error]: Corrected error, no action required.

Damn. It's time to dig into that problem again. This time I got luckier and came across a forum thread. The most interesting post in it was:

"It is most likely a CPU fan dust bunny. That's the signal from the kernel to clean those out."

As easy as this sounds, it made sense. The microserver has been running day and night since it became my NAS server in December 2012 (see article Building a home file server with HP Proliant N40L). That's more than 5 years of total run time. As you might be aware, the motherboard of this Microserver sits under the drive cage and is not easily accessible - and therefore not easily cleanable either.

I gave it a shot, shut down the server, removed the cables from the motherboard and pulled it out.

Dust on the heat sink causing hardware errors in kernel log 

There it is. A thick layer of dust sitting on the CPU's heat sink.

I cleaned the motherboard (vacuumed the dust off), re-attached the cables and pushed the motherboard back in position. Moment of truth. I booted the server.

Checking syslog, you can easily see when I turned the server off (15:28) and booted it again (15:42):

May 24 15:28:04 nas kernel: [77872.129490] [Hardware Error]: CPU:0 (10:6:3) MC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151
May 24 15:28:04 nas kernel: [77872.135237] [Hardware Error]: MC1_ADDR: 0x0000ffff810b2550
May 24 15:28:04 nas kernel: [77872.140955] [Hardware Error]: MC1 Error: Parity error during data load.
May 24 15:28:04 nas kernel: [77872.146656] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
May 24 15:28:04 nas kernel: [77872.263866] [Hardware Error]: Corrected error, no action required.
May 24 15:28:04 nas kernel: [77872.269509] [Hardware Error]: CPU:0 (10:6:3) MC2_STATUS[-|CE|-|-|AddrV|CECC]: 0x940040000000018a
May 24 15:28:04 nas kernel: [77872.275283] [Hardware Error]: MC2_ADDR: 0x0000000001af2540
May 24 15:28:04 nas kernel: [77872.280990] [Hardware Error]: MC2 Error: : SNP error during data copyback.
May 24 15:28:04 nas kernel: [77872.286694] [Hardware Error]: cache level: L2, tx: GEN, mem-tx: SNP
May 24 15:28:04 nas kernel: [77872.323890] [Hardware Error]: Corrected error, no action required.
May 24 15:28:04 nas kernel: [77872.329552] [Hardware Error]: CPU:0 (10:6:3) MC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151
May 24 15:28:04 nas kernel: [77872.335294] [Hardware Error]: MC1_ADDR: 0x0000ffff810b2550
May 24 15:28:04 nas kernel: [77872.341013] [Hardware Error]: MC1 Error: Parity error during data load.
May 24 15:28:04 nas kernel: [77872.346716] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
May 24 15:28:04 nas kernel: [77872.371793] [Hardware Error]: Corrected error, no action required.
May 24 15:28:04 nas kernel: [77872.377085] [Hardware Error]: CPU:0 (10:6:3) MC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151
May 24 15:28:04 nas kernel: [77872.382397] [Hardware Error]: MC1_ADDR: 0x0000ffff810b2540
May 24 15:28:04 nas kernel: [77872.387718] [Hardware Error]: MC1 Error: Parity error during data load.
May 24 15:28:04 nas kernel: [77872.393030] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
May 24 15:42:13 nas kernel: [    0.000000] Initializing cgroup subsys cpuset
May 24 15:42:13 nas kernel: [    0.000000] Initializing cgroup subsys cpu
May 24 15:42:13 nas kernel: [    0.000000] Initializing cgroup subsys cpuacct
May 24 15:42:13 nas kernel: [    0.000000] Linux version 3.16.0-6-amd64 (debian-kernel@lists.debian.org) (gcc version 4.9.2 (Debian 4.9.2-10+deb8u1) ) #1 SMP Debian 3.16.56-1+deb8u1 (2018-05-08)
May 24 15:42:13 nas kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.0-6-amd64 root=UUID=e00b8ddf-5247-4b9f-834c-d557df90f575 ro quiet
May 24 15:42:13 nas kernel: [    0.000000] e820: BIOS-provided physical RAM map:

Then, I waited. From the logs above (which flooded my terminal) you can see that the hardware errors had already appeared after 1026 seconds of uptime.

Now, after 1200 seconds of uptime, still no hardware errors:

root@nas:~# uptime
 16:03:00 up 20 min,  1 user,  load average: 0.04, 0.15, 0.09

root@nas:~# echo $((20 * 60 ))
1200

root@nas:~# dmesg | tail
[   10.257700] RPC: Registered named UNIX socket transport module.
[   10.257706] RPC: Registered udp transport module.
[   10.257709] RPC: Registered tcp transport module.
[   10.257711] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   10.272263] FS-Cache: Loaded
[   10.321299] FS-Cache: Netfs 'nfs' registered for caching
[   10.376030] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[   11.809469] tg3 0000:02:00.0 eth0: Link is up at 1000 Mbps, full duplex
[   11.809478] tg3 0000:02:00.0 eth0: Flow control is on for TX and on for RX
[   11.809506] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

Even now, after 41 minutes (= 2460 seconds) of uptime, still no errors:

root@nas:~# uptime && dmesg |tail
 16:23:49 up 41 min,  1 user,  load average: 0.02, 0.03, 0.01
[   10.257700] RPC: Registered named UNIX socket transport module.
[   10.257706] RPC: Registered udp transport module.
[   10.257709] RPC: Registered tcp transport module.
[   10.257711] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   10.272263] FS-Cache: Loaded
[   10.321299] FS-Cache: Netfs 'nfs' registered for caching
[   10.376030] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[   11.809469] tg3 0000:02:00.0 eth0: Link is up at 1000 Mbps, full duplex
[   11.809478] tg3 0000:02:00.0 eth0: Flow control is on for TX and on for RX
[   11.809506] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

These error messages really turned out to be a warning from the OS to clean the server. Who would have thought that when looking at these hardware error messages...
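
If you want to get alerted about such corrected errors before they flood the terminal again, a minimal sketch of a cron-friendly check could look like this. This is just an idea, not part of my actual setup; it assumes a working local mail command, and the threshold and recipient are placeholders you'd adjust:

#!/bin/bash
# Sketch: count corrected [Hardware Error] lines in the kernel ring buffer
# and send a mail once they exceed a threshold (both values are placeholders).
THRESHOLD=10
MAILTO="root"
COUNT=$(dmesg | grep -c "\[Hardware Error\]")
if [ "$COUNT" -gt "$THRESHOLD" ]; then
  echo "Found $COUNT [Hardware Error] lines in dmesg - time to check (or clean) the server" \
  | mail -s "Hardware Error warning on $(hostname)" "$MAILTO"
fi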

 

How to build a generic Icinga2 service graph in Grafana using InfluxDB
Friday - May 11th 2018 - by - (0 comments)

In the past weeks I've spent quite some time (whenever I had time) to slowly kick off the new monitoring architecture: a dual-master Icinga2 installation, InfluxDB as the metrics database and Grafana as the graphing front end. See the previous articles "Icinga2 graphing with InfluxDB and Grafana" and "Create separate measurement tables in InfluxDB for Icinga 2 NRPE checks" for further information.

I was quite happy so far with the dashboard I created in Grafana, based on the Icinga2 Grafana dashboard template:

Grafana Icinga2 Linux Dashboard 

But I was missing a way to display graphs dynamically. We currently have around 850 host objects in our "old" Icinga2 monitoring and not all the hosts are the same. Some have additional database checks, some have HTTP checks, some are running on Windows, others again have very specific application checks. It's difficult to represent all these services with the (fixed) graph elements in the main Grafana dashboard.

Eventually I came across a question from user TryTryAgain on serverfault, which was basically more about creating a dynamic action_url to point to Grafana. The question itself was irrelevant to me, but something interesting hit my eye:

" I'd like this to work: action_url = "http://grafana-server.example/grafana/dashboard/db/generic-check?var-device=$HOSTNAME$&var-check=$SERVICEDESC$&var-checkmetric=$SERVICECHECKCOMMAND$&var-datatype=perfdata&var-value=value" "

So the user prepared a new template "generic-check" and uses dynamic variables to display the graph for one exact service. That's a great idea!

As I couldn't find a pre-existing template for such a generic graph, I went on to create one myself. And had to dig deeper into InfluxDB's queries and schemas...

 

1. Prepare the graph

I added a single graph, first with a static data query:

SELECT mean("value") FROM hostalive WHERE ("hostname" =~ /mytesthost/) AND $timeFilter GROUP BY time($__interval) fill(previous)

This graph needs to be adjusted in the next steps, as I added dynamic variables.
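
Side note: such a query can also be tested directly in the influx CLI before building the graph. $timeFilter and $__interval are Grafana macros, so when testing by hand they have to be replaced by concrete values (the database name "icinga2" and the host name are just assumptions from my setup):

$ influx -database icinga2
> SELECT mean("value") FROM hostalive WHERE ("hostname" =~ /mytesthost/) AND time > now() - 1h GROUP BY time(1m) fill(previous)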

 

2. Variable $hostname

I decided I want the generic service template to start with the actual host object. This is usually the most important marker (from which host object do I need the graph?). I created the $hostname variable in the template's templating variables:

$hostname = SHOW TAG VALUES FROM "hostalive" WITH KEY = "hostname"

This query looks up the "hostalive" measurements table and shows all unique values of the tag key "hostname". Remember, basically explained, InfluxDB works like a key-value store (comparable to Redis).
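
To get a feeling for that schema, the tag keys and values of a measurement can also be listed in the influx CLI. Just a quick sketch from my own poking around; the tag names depend on how the Icinga2 InfluxdbWriter is configured:

> SHOW TAG KEYS FROM "hostalive"
> SHOW TAG VALUES FROM "hostalive" WITH KEY = "hostname"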

This one was pretty easy and immediately showed all the hosts prepared in the new Icinga2 architecture:

Grafana variable hostname 

To use the dynamic variable $hostname in the graph, the graph's query needs to be adjusted:

SELECT mean("value") FROM hostalive WHERE ("hostname" =~ /^$hostname$/) AND $timeFilter GROUP BY time($__interval) fill(previous)

 

3. Variable $check

Now it gets slightly more complicated. With the host object selected via $hostname, Grafana needs to look up which services it can display graphs for. I decided the best way would be to look into the different measurement tables. I did this with:

$check = SHOW measurements

But I wasn't happy with that because it just showed all measurement tables, even irrelevant ones like "http" for a non-webserver.

Luckily the show measurements query also allows a WHERE clause:

$check = SHOW measurements WHERE "hostname" =~ /^$hostname$/

This way InfluxDB only shows measurement tables in which the selected host object already has some data entries.

Grafana Dynamic Variable Check 

To use the dynamic variable $check in the graph, the graph's query needs to be adjusted:

SELECT mean("value") FROM /^$check$/ WHERE ("hostname" =~ /^$hostname$/) AND $timeFilter GROUP BY time($__interval) fill(previous)

 

4. Variable $service

At first I thought my template was almost complete with the defined $check. It worked, for example, for "ssh" checks, where there is (normally) only a single service check per host object. But a very good counter-example is a disk check: you usually run disk usage checks (check_disk) on several partitions of the same host object and therefore have multiple service objects in Icinga 2. In such a case, the query for the selected $check returns multiple results, and the graph would simply lump all the values together, no matter whether a value came from the partition "/" or from "/tmp". This is wrong.

So I needed to create another variable $service which represents the already existing data for the selected $check:

$service = SHOW TAG VALUES FROM $check WITH KEY = "service" WHERE "hostname" =~ /^$hostname$/
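
For illustration, with a disk check selected this variable query would expand to something like the following (the "disk" measurement and the host name are just examples, they may be named differently in your setup):

SHOW TAG VALUES FROM "disk" WITH KEY = "service" WHERE "hostname" =~ /^mytesthost$/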

In the following example, a host object with several tcp checks gives the following selection:

Grafana dynamic variable service 

 

5. Variable $metric

But a check can return multiple values! For example check_http usually returns two sets of performance data: the size of the response and the response time. To get the graph we actually want, for example the response time of a http check, another flexible variable $metric was defined:

$metric = SHOW TAG VALUES FROM "$check" WITH KEY = "metric" WHERE "hostname" =~ /^$hostname$/

The new variable now allows to select the relevant data:

Grafana dynamic variable metric

To use the dynamic variable $metric in the graph, the graph's query needs to be adjusted:

SELECT mean("value") FROM /^$check$/ WHERE ("hostname" =~ /^$hostname$/ AND "metric" =~ /^$metric$/) AND $timeFilter GROUP BY time($__interval) fill(previous)

 

6. Variable $aggregation

I thought I was done and the first few tests looked promising. Until I came across a host with MySQL running on it. The graphs for MySQL (metric: connections) just kept growing:

Grafana MySQL Connections wrong 

This is a misinterpretation by the graph, because the MySQL connections value is a so-called counter (connection = connection + 1 for every new connection). And because this is a single generic graph, how should Grafana know what kind of data it gets?

The solution is to provide yet another dynamic variable $aggregation. With this variable, Grafana can be told how to display the data. I created a custom variable for this purpose with two values:

Grafana dynamic variable aggregation

$aggregation = mean("value"),derivative(mean("value"))

To use the dynamic variable $aggregation in the graph, the graph's query needs to be adjusted:

SELECT $aggregation FROM /^$check$/ WHERE ("hostname" =~ /^$hostname$/ AND "metric" =~ /^$metric$/) AND $timeFilter GROUP BY time($__interval) fill(previous)
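
For a counter like the MySQL connections, the resolved query would then look roughly like this (measurement, host and metric names are placeholders; your check may write into a differently named measurement):

SELECT derivative(mean("value")) FROM /^mysql_health$/ WHERE ("hostname" =~ /^mydbhost$/ AND "metric" =~ /^connections$/) AND $timeFilter GROUP BY time($__interval) fill(previous)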

The graph for the counter data now dynamically adapts:

Grafana MySQL connections correct 

 

The full picture

Grafana Icinga2 Dynamic Generic Service Graph 

The template "Generic Service" can be downloaded here as json export.

 

Next steps

Now that I am able to dynamically display a graph for a generic service (by manually selecting the values from the drop-down fields or by using the variables in the URL), I will try to use that somehow for the "action_url" in Icinga 2 or integrate it differently.

 

Open issues:

The graphs for disk/partition usage are not showing up due to an error in the query. I haven't found out yet why this happens (probably a conflict between $service and $metric, but I'm not sure); for now I can live with it.

Grafana generic graph failing for disk

 

