I've been considering the same question myself at the moment...

3 nodes (Dell R420s w/16GB RAM):

  • Controller/Network: running nova/neutron/glance/cinder etc., plus Horizon, MySQL, RabbitMQ and memcached
  • 2x Compute nodes: running nova/neutron/cinder

I believe the problem is that MySQL and RabbitMQ are using inordinate amounts of CPU and memory, causing the Controller1 node to constantly dip into swap. Horizon's page requests hit APIs that are backed by MySQL, so I suspect it comes down to slow database queries.
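To test the slow-query theory, one option is to turn on MySQL's slow query log and watch it while clicking around Horizon. A minimal sketch (the file path and threshold here are assumptions, adjust for your distro):

    # e.g. /etc/mysql/conf.d/slow-query.cnf (assumed path)
    [mysqld]
    slow_query_log      = 1
    slow_query_log_file = /var/log/mysql/mysql-slow.log
    long_query_time     = 1    # seconds; log anything slower than this

Restart mysqld (or set the same variables at runtime with SET GLOBAL) and, if the log stays quiet while Horizon crawls, the time is being lost somewhere other than the database.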

The compute nodes and the VMs themselves are running along nicely. It's just Horizon, which can take 15-30 seconds between page clicks on some occasions. We're only running 5-6 VMs, so I can't figure out how or why the MySQL and RabbitMQ processes could be using the kind of CPU/memory that the process list is reporting:

  PID USER PR NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
 9595  999 20  0 6690688 1.537g  4408 S 56.4  9.8 35:37.97 beam.smp
23598  999 20  0 13.317g 412044 12116 S  1.3  2.5 21:13.65 mysqld
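For what it's worth, top's per-process view turned out to be a bit misleading here: summing resident memory per command name gives a much clearer picture of where the RAM actually goes. A rough sketch, assuming GNU ps and awk:

    # Sum resident memory (MB) per command name, biggest consumers first
    ps -eo rss,comm --no-headers \
      | awk '{ sum[$2] += $1 } END { for (c in sum) printf "%8.1f MB  %s\n", sum[c]/1024, c }' \
      | sort -rn

Something like this is how the pile of API worker processes mentioned in the edit below becomes obvious.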

It's been like this since Icehouse (we're on Mitaka now), although I suspect it's less to do with OpenStack itself and more to do with the services it relies on. I've read other articles that all suggest it's related to stale Keystone tokens, but I've cleared them and it's still the same.
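For completeness, the token clean-up amounts to the standard keystone-manage token_flush run periodically; a typical cron entry (assuming SQL-backed UUID tokens and default paths) looks something like this:

    # /etc/cron.d/keystone-token-flush (assumed path)
    0 * * * * keystone /usr/bin/keystone-manage token_flush >> /var/log/keystone/keystone-tokenflush.log 2>&1

Flushing keeps the token table from growing without bound, but as noted above it made no difference to the Horizon slowness here.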

Here's a typical Nova API call from the logs... (15 secs!?)

2016-04-28 06:33:17.438 16517 INFO nova.osapi_compute.wsgi.server [req-4e774ef7-447f-4ec7-9035-4456d1035e95 47ed0175d27e4485b91fee5d076e8aae 9809424358874fe189b6392b8468f177 - - -] 10.0.0.13 "GET /v2/9809424358874fe189b6392b8468fabc/servers/detail?project_id=9809424358874fe189b6392b8468f177 HTTP/1.1" status: 200 len: 3581 time: 15.7520719

Surely this isn't 'standard' performance for all OpenStack deployments?

EDIT: Seems like I've been barking up the wrong tree. The majority of the memory was actually being consumed by the various API worker processes: most API services were spawning one worker process per CPU core. We don't need 70+ nova-api processes to serve a handful of VMs for a handful of staff, so I dug into the '*_workers' configuration parameters for the nova-api, nova-conductor, glance-api, cinder-api and neutron-api/metadata (IIRC) services. Setting these to more modest values means far fewer unnecessary processes are spawned, and considerably less memory (and swap) is consumed. The Horizon dashboard, and things in general on the controller node, are now running a lot more happily.
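For anyone hitting the same thing, the relevant options look roughly like the ones below. Treat this as a sketch: the names are the Mitaka-era ones as I remember them and may differ slightly in other releases, and 2 is just a sensible value for a small deployment like ours, not a general recommendation.

    # nova.conf
    [DEFAULT]
    osapi_compute_workers = 2
    metadata_workers = 2
    [conductor]
    workers = 2

    # glance-api.conf
    [DEFAULT]
    workers = 2

    # cinder.conf
    [DEFAULT]
    osapi_volume_workers = 2

    # neutron.conf
    [DEFAULT]
    api_workers = 2

    # neutron metadata_agent.ini
    [DEFAULT]
    metadata_workers = 2

Restart each service after changing its config; every worker you drop frees a whole Python process worth of RAM.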

-- Ross