Header RSS Feed
 
If you only want to see the articles of a certain category, please click on the desired category below:
ALL Android Backup BSD Database Hacks Hardware Internet Linux Mail MySQL Monitoring Network Personal PHP Proxy Shell Solaris Unix Virtualization VMware Windows Wyse

MySQL Galera cluster not starting (failed to open channel)
Monday - Jul 14th 2014 - by - (0 comments)

On a Galera Cluster test environment which was previously shut down (two virtual servers on the same physical machine), I got the following error message when I tried to start MySQL on the first cluster node:

/etc/init.d/mysql start
 * Starting MariaDB database server mysqld     [fail]

The detailed information was logged in the background in syslog:

Jul 14 15:17:07 node1 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql
Jul 14 15:17:07 node1 mysqld_safe: WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.iuhkNF' --pid-file='/var/lib/mysql/node1-recover.pid'
Jul 14 15:17:09 node1 mysqld_safe: WSREP: Recovered position cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488
Jul 14 15:17:09 node1 mysqld: 140714 15:17:09 [Note] WSREP: wsrep_start_position var submitted: 'cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488'
[...]
Jul 14 15:17:13 node1 mysqld: 140714 15:17:13 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50224S), skipping check
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]: #007/usr/bin/mysqladmin: connect to server at 'localhost' failed
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111 "Connection refused")'
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!
Jul 14 15:17:38 node1 /etc/init.d/mysql[11978]:
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [Note] WSREP: view((empty))
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
Jul 14 15:17:42 node1 mysqld: #011 at gcomm/src/pc.cpp:connect():141
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():202: Failed to open backend connection: -110 (Connection timed out)
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1291: Failed to open channel 'Galera Test' at 'gcomm://192.168.41.11,192.168.41.12': -110 (Connection timed out)
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: gcs connect failed: Connection timed out
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] WSREP: wsrep::connect() failed: 7
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [ERROR] Aborting
Jul 14 15:17:42 node1 mysqld:
Jul 14 15:17:42 node1 mysqld: 140714 15:17:42 [Note] WSREP: Service disconnected.
Jul 14 15:17:43 node1 mysqld: 140714 15:17:43 [Note] WSREP: Some threads may fail to exit.
Jul 14 15:17:43 node1 mysqld: 140714 15:17:43 [Note] /usr/sbin/mysqld: Shutdown complete
Jul 14 15:17:43 node1 mysqld:
Jul 14 15:17:43 node1 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended

The important information here is the following line:

[ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)

When node1 starts MySQL, it tries to join an existing cluster. But because both nodes are currently down, there is no primary node available (see this page for a good and short explanation).

So when a Galera Cluster must be started from "zero" again, the first node must be started with the "wsrep-new-cluster" command (exactly during the set up of a new cluster):

service mysql start --wsrep-new-cluster
 * Starting MariaDB database server mysqld                               [ OK ]
 * Checking for corrupt, not cleanly closed and upgrade needing tables.

In syslog, the following log entries can be found:

Jul 14 15:18:43 node1 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql
Jul 14 15:18:43 node1 mysqld_safe: WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.hK64YC' --pid-file='/var/lib/mysql/node1-recover.pid'
Jul 14 15:18:46 node1 mysqld_safe: WSREP: Recovered position cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: wsrep_start_position var submitted: 'cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488'
[...]
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Start replication
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Setting initial position to cc4fb7ad-e5ab-11e3-8fae-d3fd14daa6a4:391488
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: protonet asio version 0
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: backend: asio
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: GMCast version 0
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: (62a145e9-0b59-11e4-9a2f-c62c46c73c36, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: (62a145e9-0b59-11e4-9a2f-c62c46c73c36, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: EVS version 0
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: PC version 0
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: gcomm: bootstrapping new group 'Galera Test'
[...]
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: gcomm: connected
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Opened channel 'Galera Test'
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
[...]
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] WSREP: Quorum results:
Jul 14 15:18:46 node1 mysqld: #011version    = 3,
Jul 14 15:18:46 node1 mysqld: #011component  = PRIMARY,
Jul 14 15:18:46 node1 mysqld: #011conf_id    = 0,
Jul 14 15:18:46 node1 mysqld: #011members    = 1/1 (joined/total),
[...]
Jul 14 15:18:46 node1 mysqld: 140714 15:18:46 [Note] /usr/sbin/mysqld: ready for connections.
Jul 14 15:18:46 node1 mysqld: Version: '10.0.10-MariaDB-1~trusty-wsrep-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution, wsrep_25.10.r3968
Jul 14 15:18:47 node1 /etc/mysql/debian-start[13061]: Upgrading MySQL tables if necessary.

The other nodes can be started normally and they will automatically connect to the primary node.

 

Add a comment

Show form to leave a comment

Comments (newest first):

No comments yet.

Go to Homepage home
Linux Howtos how to's
Nagios Plugins nagios plugins
Links links

Valid HTML 4.01 Transitional
Valid CSS!
[Valid RSS]

7633 Days
until Death of Computers
Why?