Galera Arbitrator garbd not starting: Exception in creating receive loop
Wednesday - Mar 29th 2017

For a new setup with an HA database I decided to create a Galera cluster, as I have already installed a couple of Galera clusters so far (see MySQL Galera cluster not starting (failed to open channel)). But this time I decided to create a two-node cluster with an Arbitrator service to handle split-brain situations.

The Galera Arbitrator service is a daemon process (garbd) which simply connects to the Galera cluster and is from then on part of the cluster. However, no databases are synced to its disk: it's a pure member, not a data node. That works great for this scenario because we have a dual data center setup anyway and I don't need three copies of the same data across two data centers.
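
Why the third member matters can be shown with a bit of arithmetic. The following is only a minimal sketch, assuming that every member (data node or arbitrator) has the same weight and that a partition stays primary only with a strict majority of the members. The function name has_quorum is made up for this illustration and is not part of Galera.

```shell
#!/bin/sh
# has_quorum ALIVE TOTAL  (hypothetical helper, not a Galera tool)
# Succeeds if ALIVE members form a strict majority of TOTAL members.
# Assumption: all members, including the arbitrator, have equal weight.
has_quorum() {
    alive=$1
    total=$2
    [ $((alive * 2)) -gt "$total" ]
}

# Print a one-line verdict for a given partition.
report() {
    if has_quorum "$1" "$2"; then
        echo "$1 of $2 members: quorum"
    else
        echo "$1 of $2 members: no quorum"
    fi
}

report 1 2   # two data nodes, one lost: the survivor has no majority
report 2 3   # two data nodes plus garbd, one lost: still a majority
```

With only two data nodes, the surviving node cannot tell a crashed peer from a network split. With garbd as third member, the side that still sees two of three members keeps running.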

I created a config file for garbd, according to the Galera Arbitrator documentation:

root@garb:~# cat /etc/garbd.conf
# arbitrator.config
group = ATLDB
address = gcomm://10.161.206.45,10.161.206.46

But when I tried to start garbd, it failed with the error "Exception in creating receive loop":

root@garb:~# garbd --cfg /etc/garbd.conf
2017-03-29 15:47:01.480  INFO: CRC-32C: using hardware acceleration.
2017-03-29 15:47:01.480  INFO: Read config:
    daemon:  0
    name:    garb
    address: gcomm://10.161.206.45,10.161.206.46
    group:   ATLDB
    sst:     trivial
    donor:  
    options: gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
    cfg:     /etc/garbd.conf
    log:  

I came across a GitHub issue which stated that ports are required:

"garbd" consistently failed to start unless the configuration [...] explicitly provided the port number.

It is important to note that we're talking about Galera ports here, not MySQL/MariaDB ports (3306).
The default Galera port is 4567, which can be verified on one of the Galera data nodes:

root@galera-node1:~# netstat -lntup | grep mysql
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      2971/mysqld    
tcp        0      0 0.0.0.0:4567            0.0.0.0:*               LISTEN      2971/mysqld 

Using the port 4567, I adapted /etc/garbd.conf:

root@garb:~# cat /etc/garbd.conf
# arbitrator.config
group = ATLDB
address = gcomm://10.161.206.45:4567,10.161.206.46:4567
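
For longer node lists, the port can be appended with a small helper instead of by hand. This is just a sketch: add_default_port is a hypothetical function name, and it only handles plain IPv4 addresses or hostnames (no IPv6 literals, no gcomm options after a "?").

```shell
#!/bin/sh
# add_default_port ADDRESS [PORT]  (hypothetical helper)
# Appends PORT (default: 4567, the default Galera port) to every host in a
# gcomm:// address that does not already carry an explicit port.
add_default_port() {
    addr=${1#gcomm://}      # strip the scheme, keep the host list
    port=${2:-4567}
    out=
    old_ifs=$IFS
    IFS=,                   # split the host list on commas
    for host in $addr; do
        case $host in
            *:*) ;;                      # port already present, keep as-is
            *)   host="$host:$port" ;;   # no port, append the default
        esac
        out="${out:+$out,}$host"
    done
    IFS=$old_ifs
    printf 'gcomm://%s\n' "$out"
}

add_default_port "gcomm://10.161.206.45,10.161.206.46"
```

The printed value is what goes into the address line of /etc/garbd.conf.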

Start test:

root@garb:~# garbd --cfg /etc/garbd.conf
2017-03-29 15:48:31.289  INFO: CRC-32C: using hardware acceleration.
2017-03-29 15:48:31.289  INFO: Read config:
    daemon:  0
    name:    garb
    address: gcomm://10.161.206.45:4567,10.161.206.46:4567
    group:   ATLDB
    sst:     trivial
    donor:  
    options: gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
    cfg:     /etc/garbd.conf
    log:    

2017-03-29 15:48:31.290  INFO: protonet asio version 0
2017-03-29 15:48:31.290  INFO: Using CRC-32C for message checksums.
2017-03-29 15:48:31.290  INFO: backend: asio
2017-03-29 15:48:31.290  INFO: gcomm thread scheduling priority set to other:0
2017-03-29 15:48:31.290  WARN: access file(./gvwstate.dat) failed(No such file or directory)
2017-03-29 15:48:31.290  INFO: restore pc from disk failed
2017-03-29 15:48:31.291  INFO: GMCast version 0
2017-03-29 15:48:31.291  INFO: (6520b85a, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2017-03-29 15:48:31.291  INFO: (6520b85a, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2017-03-29 15:48:31.291  INFO: EVS version 0
2017-03-29 15:48:31.291  INFO: gcomm: connecting to group 'ATLDB', peer '10.161.206.45:4567,10.161.206.46:4567'
2017-03-29 15:48:31.293  INFO: (6520b85a, 'tcp://0.0.0.0:4567') connection established to 6a1ea4ef tcp://10.161.206.45:4567
2017-03-29 15:48:31.293  INFO: (6520b85a, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2017-03-29 15:48:31.296  INFO: (6520b85a, 'tcp://0.0.0.0:4567') connection established to 5d311f46 tcp://10.161.206.46:4567
2017-03-29 15:48:31.585  INFO: declaring 5d311f46 at tcp://10.161.206.46:4567 stable
2017-03-29 15:48:31.585  INFO: declaring 6a1ea4ef at tcp://10.161.206.45:4567 stable
2017-03-29 15:48:31.586  INFO: Node 5d311f46 state prim
2017-03-29 15:48:31.587  INFO: view(view_id(PRIM,5d311f46,5) memb {
    5d311f46,0
    6520b85a,0
    6a1ea4ef,0
} joined {
} left {
} partitioned {
})
2017-03-29 15:48:31.587  INFO: save pc into disk
2017-03-29 15:48:31.792  INFO: gcomm: connected
2017-03-29 15:48:31.792  INFO: Changing maximum packet size to 64500, resulting msg size: 32636
2017-03-29 15:48:31.792  INFO: Shifting CLOSED -> OPEN (TO: 0)
2017-03-29 15:48:31.792  INFO: Opened channel 'ATLDB'
2017-03-29 15:48:31.792  INFO: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3
2017-03-29 15:48:31.792  INFO: STATE EXCHANGE: Waiting for state UUID.
2017-03-29 15:48:31.792  INFO: STATE EXCHANGE: sent state msg: 64f868a2-1486-11e7-b76e-b64f3ece7e23
2017-03-29 15:48:31.792  INFO: STATE EXCHANGE: got state msg: 64f868a2-1486-11e7-b76e-b64f3ece7e23 from 0 (inf-atldb02-p)
2017-03-29 15:48:31.792  INFO: STATE EXCHANGE: got state msg: 64f868a2-1486-11e7-b76e-b64f3ece7e23 from 2 (inf-atldb01-p)
2017-03-29 15:48:31.793  INFO: STATE EXCHANGE: got state msg: 64f868a2-1486-11e7-b76e-b64f3ece7e23 from 1 (garb)
2017-03-29 15:48:31.793  INFO: Quorum results:
    version    = 4,
    component  = PRIMARY,
    conf_id    = 4,
    members    = 2/3 (joined/total),
    act_id     = 0,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 6a1f102a-13a3-11e7-b710-b2876418a643
2017-03-29 15:48:31.793  INFO: Flow-control interval: [9999999, 9999999]
2017-03-29 15:48:31.793  INFO: Shifting OPEN -> PRIMARY (TO: 0)
2017-03-29 15:48:31.793  INFO: Sending state transfer request: 'trivial', size: 7
2017-03-29 15:48:31.795  INFO: Member 1.0 (garb) requested state transfer from '*any*'. Selected 0.0 (inf-atldb02-p)(SYNCED) as donor.
2017-03-29 15:48:31.795  INFO: Shifting PRIMARY -> JOINER (TO: 0)
2017-03-29 15:48:31.796  INFO: 0.0 (inf-atldb02-p): State transfer to 1.0 (garb) complete.
2017-03-29 15:48:31.796  INFO: 1.0 (garb): State transfer from 0.0 (inf-atldb02-p) complete.
2017-03-29 15:48:31.796  INFO: Shifting JOINER -> JOINED (TO: 0)
2017-03-29 15:48:31.797  INFO: Member 0.0 (inf-atldb02-p) synced with group.
2017-03-29 15:48:31.797  INFO: Member 1.0 (garb) synced with group.
2017-03-29 15:48:31.797  INFO: Shifting JOINED -> SYNCED (TO: 0)

It does indeed look better now! A verification on data node 1 confirmed that the cluster size increased from 2 to 3:

root@galera-node1:~# mysql -B -e "SHOW STATUS WHERE variable_name ='wsrep_local_state_comment' \
OR variable_name ='wsrep_cluster_size' \
OR variable_name ='wsrep_incoming_addresses' \
OR variable_name ='wsrep_cluster_status' \
OR variable_name ='wsrep_connected' \
OR variable_name ='wsrep_ready' \
OR variable_name ='wsrep_local_state_uuid' \
OR variable_name ='wsrep_cluster_state_uuid';"

Variable_name    Value
wsrep_cluster_size    3
wsrep_cluster_state_uuid    6a1f102a-13a3-11e7-b710-b2876418a643
wsrep_cluster_status    Primary
wsrep_connected    ON
wsrep_incoming_addresses    ,10.161.206.46:3306,10.161.206.45:3306
wsrep_local_state_comment    Synced
wsrep_local_state_uuid    6a1f102a-13a3-11e7-b710-b2876418a643
wsrep_ready    ON

Note that the garbd machine doesn't show up in the row "wsrep_incoming_addresses". It's merely showing up "empty" (note the comma). That makes sense, because there is no MySQL running on the Arbitrator Service machine, ergo no 3306 listener.
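
The manual check of the status output can also be scripted, for example for a cron job or a monitoring plugin. The following is a minimal sketch, assuming the tab-separated output format of mysql -B; check_cluster is a hypothetical function name, and the sample input simply repeats two values from the output above.

```shell
#!/bin/sh
# check_cluster EXPECTED_SIZE  (hypothetical helper)
# Reads "name<TAB>value" lines on stdin, as produced by `mysql -B`, and
# succeeds only if wsrep_cluster_size equals EXPECTED_SIZE and
# wsrep_cluster_status is Primary.
check_cluster() {
    awk -F '\t' -v want="$1" '
        $1 == "wsrep_cluster_size"   { size = $2 }
        $1 == "wsrep_cluster_status" { status = $2 }
        END {
            if (size == want && status == "Primary") {
                print "OK: " size " members, Primary"
                exit 0
            }
            print "PROBLEM: size=" size " status=" status
            exit 1
        }'
}

# Sample input taken from the status output above.
# On a data node the input would come from something like:
#   mysql -B -e "SHOW STATUS LIKE 'wsrep_cluster_%'" | check_cluster 3
printf 'wsrep_cluster_size\t3\nwsrep_cluster_status\tPrimary\n' | check_cluster 3
```

Because garbd counts as a member, the expected size for two data nodes plus the Arbitrator is 3.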

 
