Galera: Recover a complete environment failure

asked 2020-02-28 07:49:37 -0500

CKi

updated 2020-02-28 16:16:16 -0500

Hi All,

Two weeks ago one of our switches failed, and as a result all three nodes of our cluster went offline. They are back online now, but nobody can log in on the OpenStack website anymore: the page takes a long time to load when you try to log in and then returns 504 Gateway Time-Out. According to the logs, there are problems with Keystone, and further troubleshooting has led me to Galera. None of our nodes is running the MariaDB service. According to the docs, we have a "complete environment failure", because cat /var/lib/mysql/grastate.dat shows seqno: -1 on all nodes.

How can I recover from that?

You'll have to bootstrap the Galera cluster again. How is it managed? If it's a Pacemaker environment, it should recover by itself once you clean up the failed resources. If you have to do it manually, I would first try galera_recovery (or similar). To do that you'll probably have to edit the grastate.dat file...

eblock (2020-02-28 12:57:14 -0500)
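
When every node reports seqno -1, the manual route usually comes down to recovering the last committed position on each node and bootstrapping from the node that is furthest ahead. Below is a minimal Python sketch of that comparison. It assumes you have already run the recovery step on each node (for example mysqld_safe --wsrep-recover, or the galera_recovery helper on MariaDB) and copied the "WSREP: Recovered position" line from each log. The node names, the UUID, and both helper functions are hypothetical and only serve as illustration.

```python
import re

# Matches a line such as "WSREP: Recovered position: <uuid>:<seqno>".
_RECOVERED = re.compile(r"Recovered position:\s*([0-9a-fA-F-]+):(-?\d+)")


def parse_recovered_position(log_text):
    """Return (uuid, seqno) from the last 'Recovered position' line, or None."""
    matches = _RECOVERED.findall(log_text)
    if not matches:
        return None
    uuid, seqno = matches[-1]
    return uuid, int(seqno)


def pick_bootstrap_node(positions):
    """positions maps node name -> (uuid, seqno); return the most advanced node."""
    if not positions:
        raise ValueError("no recovered positions given")
    uuids = {uuid for uuid, _ in positions.values()}
    if len(uuids) > 1:
        # Different cluster UUIDs mean the nodes disagree about history.
        raise ValueError("nodes report different cluster UUIDs; inspect manually")
    return max(positions, key=lambda name: positions[name][1])


# Hypothetical log excerpts collected from three nodes.
EXAMPLE_UUID = "37bb872a-ad73-11e6-819f-f3b71d9c5ada"
example_logs = {
    "node1": f"[Note] WSREP: Recovered position: {EXAMPLE_UUID}:1200",
    "node2": f"[Note] WSREP: Recovered position: {EXAMPLE_UUID}:1207",
    "node3": f"[Note] WSREP: Recovered position: {EXAMPLE_UUID}:1198",
}
example_positions = {
    name: parse_recovered_position(text) for name, text in example_logs.items()
}
bootstrap_candidate = pick_bootstrap_node(example_positions)
```

The node picked this way is the one you would typically bootstrap first (on MariaDB with systemd, commonly via galera_new_cluster), starting the remaining nodes normally afterwards so they can rejoin. Treat this as a sketch and check the exact procedure against the documentation for your deployment.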

The MySQL log should usually mention something like that (I don't have a cluster at hand right now).

eblock (2020-02-28 12:58:34 -0500)
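
In recent Galera versions, the hint in the log usually amounts to setting safe_to_bootstrap to 1 in grastate.dat on the node you want to bootstrap from. The Python sketch below shows that edit on the file's text, assuming the usual "key: value" layout. The function name and the sample content are made up for illustration. Only apply such a change on the node with the most advanced state, with MariaDB stopped and a backup copy of the file at hand.

```python
import re

# Matches the "safe_to_bootstrap: <digit>" line without touching the newline.
_SAFE_LINE = re.compile(r"^(safe_to_bootstrap:[ \t]*)\d+[ \t]*$", re.MULTILINE)


def set_safe_to_bootstrap(grastate_text, value=1):
    """Return grastate.dat content with safe_to_bootstrap set to `value`."""
    new_text, count = _SAFE_LINE.subn(
        lambda m: f"{m.group(1)}{value}", grastate_text
    )
    if count == 0:
        raise ValueError("no safe_to_bootstrap line found")
    return new_text


# Hypothetical file content as it might look after an unclean shutdown.
sample = (
    "# GALERA saved state\n"
    "version: 2.1\n"
    "uuid:    37bb872a-ad73-11e6-819f-f3b71d9c5ada\n"
    "seqno:   -1\n"
    "safe_to_bootstrap: 0\n"
)
updated = set_safe_to_bootstrap(sample)
```

All other lines, including uuid and seqno, are left exactly as they were.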

This already happened to us; we used the kolla-ansible playbook to recover the cluster. Even if you don't use Kolla, you can follow the playbook:

chalans (2020-02-28 15:49:21 -0500)

@eblock Thanks for the suggestion. mysql just returns ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111 "Connection refused"). /etc/mysql/my.cnf mentions a log named /var/log/mysql_logs/galera_server_error.log, but this file is empty.

CKi (2020-02-28 16:20:18 -0500)

Of course mysql returns an error, since Galera is not running. But depending on your config, you should find something in one of the logs. In my environment Galera writes to /var/log/mysql/mysqld.log.

eblock (2020-03-02 02:54:37 -0500)
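
Since the log location depends on the configuration, one quick way to narrow it down is to check which candidate files exist and actually contain data. Here is a small Python sketch of that check. The two paths are the ones mentioned in this thread, and the helper name is hypothetical. Extend the list with whatever log settings your own my.cnf points to.

```python
import os


def non_empty_logs(paths):
    """Return the paths that exist as regular files and are not empty."""
    found = []
    for path in paths:
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            found.append(path)
    return found


# Candidate locations taken from the comments above; adjust for your setup.
candidates = [
    "/var/log/mysql_logs/galera_server_error.log",
    "/var/log/mysql/mysqld.log",
]

for log_path in non_empty_logs(candidates):
    print(log_path)
```

An empty result suggests the server is logging somewhere else, for example to the system journal, so that would be the next place to look.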