RDO: Keystone dies after 1-3 days, can't log in to Horizon dashboard
On an RDO installation (one controller and 2 compute nodes) I had the following problem: I couldn't log in to the dashboard.
The message at login was:
An error occurred authenticating. Please try again later.
The solution was to start keystone again:
[root@csky03 ~]# /etc/init.d/openstack-keystone status
keystone dead but pid file exists
[root@csky03 ~]# /etc/init.d/openstack-keystone start
Starting keystone: [ OK ]
[root@csky03 ~]# /etc/init.d/openstack-keystone status
keystone (pid 18754) is running...
But the problem remains, since keystone dies again after 1-3 days.
Is this specific to RDO with an all-in-one controller and 2 compute nodes, or does it also happen on other distros?
I found some similar questions here that suggested setting SELinux to permissive or disabling it altogether, but in my case that's already in place.
Is anybody else facing a similar problem?
Any ideas why keystone dies after 1 - 3 days?
Note: a funny thing I've observed over some weeks: if I work with the system the whole day, this never happens, but it seems that if I sleep, keystone likes to sleep too :-) Sometimes, though, it takes more than 2 days to go into sleep mode.
As a workaround, I'll now create a cron job to start keystone and will let you know whether it fixes the problem after 1-3 days:
[root@controller ~]# crontab -e
0 3 * * * /etc/init.d/openstack-keystone start
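A variation on this workaround is to check the service first and start it only when it is actually down, so that each restart leaves a timestamp behind. The following is only a sketch: the function name check_and_restart, the INIT_SCRIPT and LOG_FILE variables, and the log path are made up for illustration, and it assumes the init script follows the usual convention of returning 0 from "status" when the service is running.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: start keystone only when "status" says it is down.
# INIT_SCRIPT and LOG_FILE are illustrative names; adjust them to your system.
INIT_SCRIPT="${INIT_SCRIPT:-/etc/init.d/openstack-keystone}"
LOG_FILE="${LOG_FILE:-/var/log/keystone-watchdog.log}"

check_and_restart() {
    # By convention, "status" exits 0 when the service is running.
    if "$INIT_SCRIPT" status >/dev/null 2>&1; then
        return 0
    fi
    # Record when the service was found dead, then start it again.
    echo "$(date '+%Y-%m-%d %H:%M:%S') keystone not running, restarting" >> "$LOG_FILE"
    "$INIT_SCRIPT" start >> "$LOG_FILE" 2>&1
}

# Only act when called with "run", so the file can also be sourced safely.
if [ "${1:-}" = "run" ]; then
    check_and_restart
fi
```

Saved under a path of your choice (for example /usr/local/bin/keystone-watchdog.sh, a hypothetical location) and called from cron every few minutes with the "run" argument, this would also show when keystone died, which may help to correlate the failure with other events on the host.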
What sort of errors do you see in the keystone log file after it has died?
Hi Lars, there are no errors, only INFO messages:
2014-03-17 22:07:45.380 2964 INFO keystone.common.environment.eventlet_server [-] Starting /usr/bin/keystone-all on 0.0.0.0:5000
2014-03-20 18:21:46.511 18754 INFO keystone.common.environment [-] Environment configured as: eventlet
Which openstack-keystone version is that? Please add debug = True to keystone.conf before the next restart; hopefully that will catch more info. Or run it from a shell with keystone-all -d and capture the terminal output, e.g. using script(1).
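The two suggestions above could be scripted roughly as follows. This is a sketch under assumptions: the helper names debug_enabled and run_captured are invented here, the config path /etc/keystone/keystone.conf is the usual packaged location but may differ, and the -c option of script(1) is the util-linux one.

```shell
#!/bin/sh
# Succeeds if the given config file has an uncommented "debug = True" line.
# (Hypothetical helper; it does not check which section the line is in.)
debug_enabled() {
    grep -Eiq '^[[:space:]]*debug[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# Runs a command under script(1) so that its terminal output is saved to a file.
# Usage: run_captured OUTPUT_FILE COMMAND [ARGS...]
run_captured() {
    out="$1"
    shift
    script -c "$*" "$out"
}
```

For example, `debug_enabled /etc/keystone/keystone.conf || echo "debug is not enabled"` checks the setting before a restart, and `run_captured /root/keystone-debug.txt keystone-all -d` keeps whatever keystone prints to the terminal before it dies (the output file name is arbitrary).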
RIP Keystone.
Have you looked in the system logs to see whether the process was killed because the machine ran out of memory?
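One way to check for this is to search the syslog for messages from the kernel's OOM killer. The sketch below assumes the system log is /var/log/messages (the usual location on RHEL-like systems) and uses a made-up helper name, find_oom_events; the patterns are a broad guess at typical kernel wording, not an exhaustive list.

```shell
#!/bin/sh
# Print lines that look like kernel OOM-killer messages from a syslog file.
# The default path is an assumption; pass another file as the first argument.
find_oom_events() {
    log="${1:-/var/log/messages}"
    grep -Ei 'out of memory|oom-killer|killed process' "$log"
}
```

Running `find_oom_events | grep -i keystone` narrows the result to lines that mention keystone; no output means nothing matching was logged in that file, so rotated logs may need to be checked as well.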