
Keystone authentication failure in an HA setup

asked 2016-04-06 08:31:06 -0500

zekken

updated 2016-04-07 02:11:13 -0500

I am running two controllers as active/active behind two HAProxy nodes in active/passive mode using a VIP. All the services on the controller nodes are load balanced. I am facing a strange situation: whenever I try to run a keystone command, the first two attempts give me this error: An unexpected error prevented the server from fulfilling your request. (HTTP 500). When I then run commands continuously without much of a time gap, everything works fine. But when I try the same commands again after some time, the same situation comes back. No significant error message is noticed in the logs. I am very new to load balancing, so I am unable to figure out whether this is a load-balancing issue or something else. This is my haproxy.cfg:

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode    http
    option  httplog
    option  dontlognull
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

listen galera 192.168.1.64:3306
        balance source
        mode tcp
        option tcpka
        option mysql-check user haproxy
        server Controller1 192.168.1.61:3306 check weight 1
        server Controller2 192.168.1.62:3306 check weight 1

listen keystone_admin 192.168.1.64:35357
        balance source
        option tcpka
        option httpchk
        maxconn 10000
        server Controller1 192.168.1.61:35357 check inter 2000 rise 2 fall 5
        server Controller2 192.168.1.62:35357 check inter 2000 rise 2 fall 5

listen keystone_api 192.168.1.64:5000
        balance source
        option tcpka
        option httpchk
        maxconn 10000
        server Controller1 192.168.1.61:5000 check inter 2000 rise 2 fall 5
        server Controller2 192.168.1.62:5000 check inter 2000 rise 2 fall 5

listen glance-api 192.168.1.64:9292
        balance source
        option tcpka
        option httpchk
        maxconn 10000
        server Controller1 192.168.1.61:9292 check inter 2000 rise 2 fall 5
        server Controller2 192.168.1.62:9292 check inter 2000 rise 2 fall 5

listen glance-registry 192.168.1.64:9191
        balance source
        option tcpka
        option httpchk
        maxconn 10000
        server Controller1 192.168.1.61:9191 check inter 2000 rise 2 fall 5
        server Controller2 192.168.1.62:9191 check inter 2000 rise 2 fall 5

listen nova_ec2 192.168.1.64:8773
        balance source
        option tcpka
        option httpchk
        maxconn 10000
        server Controller1 192.168.1.61:8773 check inter 2000 rise 2 fall 5
        server Controller2 192.168.1.62:8773 check inter 2000 rise 2 fall 5

listen nova_osapi 192.168.1.64:8774
        balance source
        option tcpka
        option httpchk
        maxconn 10000
        server Controller1 192.168.1.61:8774 check inter 2000 rise 2 fall 5
        server Controller2 192.168 ...
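As a side note, the contimeout, clitimeout, and srvtimeout keywords used in the defaults section above are deprecated in HAProxy 1.5 and later; the equivalent modern form would be:

```
defaults
    timeout connect 5000
    timeout client  50000
    timeout server  50000
```

The values are in milliseconds and behave the same as the old keywords; this doesn't change behavior, it only silences deprecation warnings on newer HAProxy versions.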

Comments

I am facing the same issue in the dashboard as well. It gives me this error: Error: Unable to retrieve usage information. Then I have to log out and log in again to avoid it. From this, I am pretty sure the issue is with load balancing.

zekken ( 2016-04-07 04:51:31 -0500 )

3 answers


answered 2016-04-20 05:48:31 -0500

zekken

I have fixed the issue by removing the MySQL/Galera backend from the load balancer and configuring a virtual IP on top of the Galera DB cluster instead.
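For reference, a virtual IP on top of a Galera cluster is commonly implemented with keepalived (VRRP). A minimal sketch, where the interface name and the DB VIP 192.168.1.65 are assumptions, not values from this thread:

```
# /etc/keepalived/keepalived.conf on Controller1 (hypothetical)
vrrp_instance galera_vip {
    state MASTER          # BACKUP on Controller2
    interface eth0        # assumed interface name
    virtual_router_id 51
    priority 101          # lower (e.g. 100) on Controller2
    virtual_ipaddress {
        192.168.1.65      # hypothetical DB VIP, separate from the HAProxy VIP .64
    }
}
```

The OpenStack services would then point their database connection strings at the DB VIP, so all writes land on one node at a time and the load balancer is taken out of the database path.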


answered 2016-04-07 03:40:22 -0500

Emrvb

To run Keystone in an active/active setup, the Keystone nodes must:

  1. Be connected to a clustered (or replicating) database
  2. Be using the same pool of memcache servers

Before anyone can help you further, you should share more about your setup. Are the above two requirements met?
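To illustrate requirement 2, both Keystone nodes would point at the same list of memcached servers. A sketch of what the relevant keystone.conf sections might look like on a 2016-era (Kilo/Liberty) deployment; exact section names and the driver path vary by release, so treat these as assumptions to check against your version's docs:

```
[memcache]
servers = 192.168.1.61:11211,192.168.1.62:11211

[token]
# one possible memcache-backed token persistence driver
driver = keystone.token.persistence.backends.memcache_pool.Token
```

With both nodes configured identically like this, a token issued by either controller is visible to the other, which is what active/active validation depends on.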


Comments

I have clustered my database using a MariaDB Galera cluster; both nodes are replicating and load balanced. Memcached is running on both controller nodes. Does that meet the requirements?

zekken ( 2016-04-07 04:24:57 -0500 )

You say memcached is running on both controller nodes. Do both controllers use both memcached servers? Did you specify the memcached "host" or "hosts" option?

Are you sure your galera cluster is operating correctly? Which version of mysql-(galera)-server are you running and how did you start it?

Emrvb ( 2016-04-07 04:46:24 -0500 )

I have specified memcached_servers=192.168.1.61:11211,192.168.1.62:11211 in keystone.conf. I am running mysql-(galera) version 5.5.48. I started the Galera cluster (on one of the nodes) with the command:

service mysql start --wsrep-new-cluster

and on the other node simply with:

service mysql start
zekken ( 2016-04-07 05:36:59 -0500 )
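To double-check that both nodes really joined one cluster, the usual place to look is the wsrep status variables (SHOW STATUS LIKE 'wsrep_%' in a mysql client). A toy checker for that output; the sample values below are made up for illustration, not taken from this thread:

```python
def parse_wsrep_status(text):
    """Parse tab-separated `SHOW STATUS LIKE 'wsrep_%'` output into a dict."""
    status = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("\t")
        status[key] = value.strip()
    return status

def cluster_healthy(status, expected_size):
    # A healthy Galera node reports itself Synced, part of a Primary
    # component, and sees the expected number of cluster members.
    return (status.get("wsrep_local_state_comment") == "Synced"
            and status.get("wsrep_cluster_status") == "Primary"
            and status.get("wsrep_cluster_size") == str(expected_size))

# Hypothetical output for a healthy two-node cluster:
sample = ("wsrep_local_state_comment\tSynced\n"
          "wsrep_cluster_status\tPrimary\n"
          "wsrep_cluster_size\t2")

print(cluster_healthy(parse_wsrep_status(sample), expected_size=2))  # → True
```

If wsrep_cluster_size reports 1 on each node, the two servers started as independent clusters rather than joining each other, which would also explain inconsistent reads behind a balancer.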

OK, your Galera cluster should be fine. Which version of OpenStack are you running? I'm not familiar with the memcached_servers directive.

Emrvb ( 2016-04-07 06:02:07 -0500 )

Anyway, you could easily find out whether caching is the cause by disabling it. It is also worth noting that before a certain release (I think it was Liberty), active/active was not supported by Keystone.

Emrvb ( 2016-04-07 06:07:15 -0500 )

answered 2016-04-06 15:31:58 -0500

kaustubh

Perhaps it's the load balancer. How do you load balance the requests?
I can make a __guess__ as follows:
When you try to execute a command, your credentials are sent to controller1, which generates a token and returns it to you. You use this token to execute a command, but this time the request is sent to controller2, which considers the token invalid (as it was generated by controller1). On the second attempt, controller2 generates a token, but it is sent to controller1 (which rejects it). Now your subsequent commands succeed, because you hold tokens from both controllers (and I think HAProxy somehow keeps track of connections), so they are sent to the correct nodes. You face the problem again when the tokens expire.

One way to prevent this would be to synchronize tokens across the backends. Of course, the sync would have to be fast enough that both nodes have the token before the next request arrives. Alternatively, you could place the service in active/passive mode; however, you might then face the same issue when the active node fails.
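The guess above can be sketched as a toy model: two nodes that each keep their own private token store, behind a balancer with no session affinity. All names here are hypothetical, purely for illustration:

```python
import uuid

class KeystoneNode:
    """Toy Keystone: tokens live only in this node's local store."""
    def __init__(self, name):
        self.name = name
        self.tokens = set()

    def issue_token(self):
        token = uuid.uuid4().hex
        self.tokens.add(token)
        return token

    def validate(self, token):
        return token in self.tokens

nodes = [KeystoneNode("controller1"), KeystoneNode("controller2")]

# Round-robin balancing with no token replication:
token = nodes[0].issue_token()     # auth request lands on controller1
print(nodes[1].validate(token))    # next request lands on controller2 → False
```

With a shared backend (memcached or a replicated database), both nodes would see the same token set and validation would succeed regardless of which node the balancer picks.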


Comments

Hi. I have added my haproxy.cfg file. My load balancer IP is 192.168.1.64, Controller1 IP is 192.168.1.61, Controller2 IP is 192.168.1.62. I have taken help from this blog

zekken ( 2016-04-07 02:14:20 -0500 )

What you can do as a workaround is to place keystone in active/standby mode. You can do this by using the backup keyword in keystone and memcache config stanzas. More info here
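For context, the backup keyword in HAProxy marks a server as standby: it receives traffic only when every non-backup server in the listener is down. Applied to one of the keystone listeners from the question, it would look like this (same addresses as above, only the last keyword added):

```
listen keystone_api 192.168.1.64:5000
        balance source
        option tcpka
        option httpchk
        maxconn 10000
        server Controller1 192.168.1.61:5000 check inter 2000 rise 2 fall 5
        server Controller2 192.168.1.62:5000 check inter 2000 rise 2 fall 5 backup
```

This effectively makes Keystone active/passive behind the same VIP, so all tokens come from one node at a time.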

kaustubh ( 2016-04-08 15:09:44 -0500 )

However, you will face the same problem when the active server goes down.

kaustubh ( 2016-04-08 15:10:17 -0500 )


Stats


Seen: 1,066 times

Last updated: Apr 20 '16