
Intermittent access when configured using MariaDB Galera

asked 2015-01-12 20:41:29 -0500 by senyapsudah

It has been a while since my last question. I'm currently implementing OpenStack Juno and would like to achieve high availability for the SQL database server, so I have configured a 3-node MariaDB Galera cluster as the SQL backend. So far I can see the database is replicated between the 3 nodes. However, I hit an issue where I keep getting the error below when I run any keystone, glance, cinder, neutron, or nova client command:

Authorization Failed: An unexpected error prevented the server from fulfilling your request: (OperationalError) (2006, 'MySQL server has gone away') 'SELECT user.id AS user_id, user.name AS user_name, user.domain_id AS user_domain_id, user.password AS user_password, user.enabled AS user_enabled, user.extra AS user_extra, user.default_project_id AS user_default_project_id \nFROM user \nWHERE user.name = %s AND user.domain_id = %s' ('myusername', 'default') (Disable debug mode to suppress these details.) (HTTP 500)

It is an intermittent issue. At first I thought it was due to a setting on HAProxy, but then I removed HAProxy and pointed the SQL connections directly at one node, and the same issue still happened.

Is there anything else that I need to take note of when configuring with MariaDB Galera? Please find my Galera status output below.

MariaDB [(none)]> SHOW STATUS LIKE 'wsrep%';
+------------------------------+-------------------------------------------------+
| Variable_name                | Value                                           |
+------------------------------+-------------------------------------------------+
| wsrep_local_state_uuid       | 4ebb6111-9a04-11e4-ad66-9ff2b3cfa0eb            |
| wsrep_protocol_version       | 5                                               |
| wsrep_last_committed         | 15388                                           |
| wsrep_replicated             | 3550                                            |
| wsrep_replicated_bytes       | 2151539                                         |
| wsrep_repl_keys              | 16488                                           |
| wsrep_repl_keys_bytes        | 213557                                          |
| wsrep_repl_data_bytes        | 1710782                                         |
| wsrep_repl_other_bytes       | 0                                               |
| wsrep_received               | 38                                              |
| wsrep_received_bytes         | 1143                                            |
| wsrep_local_commits          | 3550                                            |
| wsrep_local_cert_failures    | 0                                               |
| wsrep_local_replays          | 0                                               |
| wsrep_local_send_queue       | 0                                               |
| wsrep_local_send_queue_avg   | 0.000559                                        |
| wsrep_local_recv_queue       | 0                                               |
| wsrep_local_recv_queue_avg   | 0.000000                                        |
| wsrep_local_cached_downto    | 11839                                           |
| wsrep_flow_control_paused_ns | 0                                               |
| wsrep_flow_control_paused    | 0.000000                                        |
| wsrep_flow_control_sent      | 0                                               |
| wsrep_flow_control_recv      | 0                                               |
| wsrep_cert_deps_distance     | 1.025915                                        |
| wsrep_apply_oooe             | 0.007324                                        |
| wsrep_apply_oool             | 0.000000                                        |
| wsrep_apply_window           | 1.007324                                        |
| wsrep_commit_oooe            | 0.000000                                        |
| wsrep_commit_oool            | 0.000000                                        |
| wsrep_commit_window          | 1.000000                                        |
| wsrep_local_state            | 4                                               |
| wsrep_local_state_comment    | Synced                                          |
| wsrep_cert_index_size        | 26                                              |
| wsrep_causal_reads           | 0                                               |
| wsrep_cert_interval          | 0.008169                                        |
| wsrep_incoming_addresses     | 10.10.10.1:3306,10.10.10.3:3306,10.10.10.2:3306 |
| wsrep_cluster_conf_id        | 3                                               |
| wsrep_cluster_size           | 3                                               |
| wsrep_cluster_state_uuid     | 4ebb6111-9a04-11e4-ad66-9ff2b3cfa0eb            |
| wsrep_cluster_status         | Primary                                         |
| wsrep_connected              | ON                                              |
| wsrep_local_bf_aborts        | 0                                               |
| wsrep_local_index            | 0                                               |
| wsrep_provider_name          | Galera                                          |
| wsrep_provider_vendor        | Codership Oy <info@codership.com>               |
| wsrep_provider_version       | 25.3.5-wheezy(rXXXX)                            |
| wsrep_ready                  | ON                                              |
| wsrep_thread_count           | 2                                               |
+------------------------------+-------------------------------------------------+
48 rows in set (0.00 sec)

MariaDB [(none)]>

Or is there any known issue when implementing with Galera? Please advise.


7 answers


answered 2015-01-14 15:07:50 -0500 by capsali

I managed to bypass that error by increasing wait_timeout from the default 600 to 28800 and connect_timeout from 5 to 20 in the MySQL config. I have not gotten any errors since!
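For reference, a minimal sketch of what that change might look like on the MySQL/MariaDB side (the values are the ones from this answer; the exact file varies by distro, e.g. /etc/mysql/my.cnf or a file under /etc/my.cnf.d/):

[mysqld]
wait_timeout    = 28800   # seconds an idle client connection is kept before the server drops it
connect_timeout = 20      # seconds allowed for the connection handshake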

Also, I dropped HAProxy for Galera, since it doesn't load balance anything in active/backup/backup mode. I instead use a keepalived VIP on all MySQL servers and point all OpenStack services to that VIP in their database section.
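A rough keepalived sketch of that VIP setup, assuming eth0 as the interface and 10.10.10.100 as a free address for the VIP (both are placeholders; the other two MySQL nodes would run the same block with state BACKUP and a lower priority):

vrrp_instance DB_VIP {
    state MASTER               # BACKUP on the other two nodes
    interface eth0             # placeholder: the NIC on the 10.10.10.0/24 network
    virtual_router_id 51
    priority 150               # lower values (e.g. 100, 90) on the backups
    advert_int 1
    virtual_ipaddress {
        10.10.10.100/24        # placeholder VIP that the OpenStack services connect to
    }
}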

The drawback is that MySQL will keep connections open for a much longer time! I have had no problems until now, but I'm only halfway into production with the system, so not that many queries are made.


Comments

I have tested this also; it still gives me the same issue. I have changed the wait timeout and connect timeout, and also the idle timeout in every service, but still no luck for me. I fell back to a standalone MariaDB server and it works like a charm again. :(

senyapsudah ( 2015-01-15 02:00:06 -0500 )

wait_timeout and connect_timeout should be changed in /etc/mysql/my.cnf, not in the OpenStack services!

capsali ( 2015-01-15 06:40:35 -0500 )

Hi capsali, yes, I did change it inside my.cnf, but still no luck.

senyapsudah ( 2015-01-15 07:03:58 -0500 )

Are you running Galera through HAProxy? Because if so, it will not work... Use a VIP on the MySQL servers!

capsali ( 2015-01-15 07:58:13 -0500 )

answered 2015-01-13 05:22:21 -0500 by marcyb17

Ensure that your sql_idle_timeout setting for all services is under 300. Once we set ours to 240, we stopped getting the "MySQL server has gone away" errors.
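As a sketch, in a Juno-era service config (e.g. /etc/nova/nova.conf or /etc/keystone/keystone.conf) the setting would look roughly like this; older configs use the deprecated name sql_idle_timeout under [DEFAULT], newer ones idle_timeout under [database]:

[database]
# reap idle SQL connections after 240 s, i.e. below the MySQL-side wait_timeout,
# so the connection pool recycles them before the server drops them
idle_timeout = 240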


Comments

Hi marcyb17, just to confirm, the setting should be in my.cnf, right? :) Thanks.

senyapsudah ( 2015-01-13 06:58:24 -0500 )

My bad, it should be in all services like keystone and others. But I'm not sure whether it still works, as I can see in Juno it states this:

senyapsudah ( 2015-01-13 07:07:01 -0500 )

Timeout before idle SQL connections are reaped. (integer value)

Deprecated group/name - [DEFAULT]/sql_idle_timeout

Deprecated group/name - [DATABASE]/sql_idle_timeout

Deprecated group/name - [sql]/idle_timeout

#idle_timeout = 3600

Should I use idle_timeout instead?

senyapsudah ( 2015-01-13 07:07:21 -0500 )

No, we have a sql_idle_timeout in the following conf files: /etc/nova/nova.conf, /etc/glance/glance-registry.conf, /etc/cinder/cinder.conf.

Wherever there is a sql_idle_timeout value in one of the configuration files, it should be set to under 300.

marcyb17 ( 2015-01-13 07:22:03 -0500 )

Thanks, bro. Found it, and it seems to be working. I'm not hitting the MySQL server error anymore. You saved my day. :)

senyapsudah ( 2015-01-13 08:16:28 -0500 )

answered 2015-04-19 07:08:53 -0500 by Tung Nguyen

I have the same issue. I increased wait_timeout and connect_timeout, but it did not work for me.


answered 2015-01-13 08:17:33 -0500 by senyapsudah, updated 2015-01-14 08:09:48 -0500

It seems the issue is still not resolved; after some time it came back. Anything else that we should check?


answered 2015-11-20 21:22:28 -0500

I had the same issue with keystone and a MariaDB Galera cluster behind HAProxy. I increased the idle_timeout in the keystone configuration to 3600, and I also increased the limits in the my.cnf file to the values below (a sketch of the keystone side follows them):

max_allowed_packet = 128M

wait_timeout = 3600
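A sketch of the keystone side of that change, assuming the option sits in the [database] group of /etc/keystone/keystone.conf:

[database]
# keep pooled connections no longer than the server-side wait_timeout above
idle_timeout = 3600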


answered 2015-01-12 21:41:15 -0500

You cannot run active/active/active for your SQL cluster due to the way OpenStack components do record/table locking on the DB.

You can run active/backup/backup on HAProxy for the DB cluster.

The DB will still be active/active, but HAProxy needs to send traffic to one active DB node and keep the other two nodes as standby.
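A minimal HAProxy sketch of that active/backup/backup layout, using the node addresses from the question; the bind address 10.10.10.100 is a placeholder for whatever frontend/VIP address the OpenStack services point at:

listen galera
    bind 10.10.10.100:3306     # placeholder frontend address for the DB endpoint
    mode tcp
    option tcpka
    server db1 10.10.10.1:3306 check
    server db2 10.10.10.2:3306 check backup   # only used if db1 fails its health check
    server db3 10.10.10.3:3306 check backup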


Comments

That is what I'm thinking also, so I pointed all configuration directly at the master MariaDB server, but the same thing still happened. I keep getting "MySQL server has gone away". Any suggestion on how to troubleshoot this?

senyapsudah ( 2015-01-13 02:11:42 -0500 )

answered 2016-06-23 22:00:42 -0500 by xugangsh, updated 2016-06-23 22:02:44 -0500

I hit the same issue when using OpenStack Mitaka on CentOS 7 with MariaDB 10.1.*.

It seemed that after all services were online, it would throw the above error.

I did not change anything except restarting the mariadb service, then waited for some time.

Everything worked.


