OpenStack Newton Performance Issue

asked 2019-02-05 12:33:51 -0500

jaysonwor

I'm running into an issue very similar to ones that others have previously reported in OpenStack Newton.

Both CLI and API calls to nova (example: GET http://$url:8774/v2/$id/servers/detail) become slower and slower as VMs accumulate. Roughly every 50 additional VMs add 1-2 seconds to the response time, both when requesting/building new VMs and when getting the list. In other words, latency increases as nova builds more VMs and as more VMs are active.
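For anyone who wants to reproduce the measurement, here is a minimal sketch of how the list call could be timed as the VM count grows. It assumes a valid Keystone token passed in the X-Auth-Token header; the function names, the timeout, and the placeholder values in the usage comment are illustrative only, not part of any OpenStack tooling.

```python
import statistics
import time
import urllib.request


def fetch_servers_detail(base_url, project_id, token, timeout=60):
    """GET <base_url>/v2/<project_id>/servers/detail and return the raw body.

    base_url, project_id and token are supplied by the caller; the values
    depend on the deployment and are assumptions here.
    """
    url = f"{base_url}/v2/{project_id}/servers/detail"
    req = urllib.request.Request(url, headers={"X-Auth-Token": token})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times and return the elapsed seconds of each call."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return the min, median and max of a list of timings."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


# Example usage (placeholder values, adjust to your environment):
#   samples = time_calls(
#       lambda: fetch_servers_detail("http://controller:8774", "PROJECT_ID", "TOKEN"))
#   print(summarize(samples))
```

Running this at a few different VM counts (say 50, 100, 150) and comparing the medians would show whether the growth is roughly linear.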

I've tried everything from increasing to decreasing the number of threads and workers, and nothing seems to help. We're using memcached, and the max connection settings in all the supporting components (Galera, HAProxy, RabbitMQ, etc.) look okay. Furthermore, the controllers' system specs are far more than needed (only about 20% of a 72-core / 500 GB bare-metal blade is in use).

Has anyone run into this? Is it a known problem in Newton? Are there any workarounds or upgrades known to fix it?


Comments

Your bottleneck is most likely the database or the message queue. You may want to find out how to measure and improve Galera and Rabbit performance.

You say you use 20% - of what precisely? If your servers have enough CPU and memory capacity, how about network and storage bandwidth?

Bernd Bausch ( 2019-02-05 17:38:22 -0500 )

Thanks for the response. Right, 20% CPU/memory; network and storage are all fine. I checked that neither the DB nor the queues are piling up, and also checked memcached, no issue there. I am finding that the OpenStack nova-scheduler is very slow to schedule. Any other thoughts?

jaysonwor ( 2019-02-14 08:49:14 -0500 )