
During concurrent instance deployment, some instances go into the Error state

asked 2013-12-03 05:04:11 -0500 by Sreedhar Nathani, updated 2013-12-03 12:07:18 -0500

Setup
  - All the systems are installed with Ubuntu Precise + Havana bits from the Ubuntu Cloud Archive
  - Controller node running on a VM
  - Separate network node running on a VM
  - 16x compute nodes

I am creating 30 instances at a time using the min_count option, in a loop with a 20-minute sleep between batches (a sketch of the loop follows below). Once we cross 200+ active instances in the tenant, most of the subsequent instances fail to create their network ports and go into the Error state.
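
A minimal sketch of that loop, assuming python-novaclient's Havana-era v1_1 client; the credentials, image, flavor and network IDs below are placeholders, not values from the original setup:

    import time
    from novaclient.v1_1 import client

    # Placeholder credentials and endpoint -- substitute your own.
    nova = client.Client('admin', 'secret', 'demo',
                         'http://controller:5000/v2.0/')

    for batch in range(8):                       # 8 batches of 30 = 240 instances
        nova.servers.create(
            name='scale-test-%d' % batch,
            image='IMAGE_UUID',                  # placeholder image ID
            flavor='FLAVOR_ID',                  # placeholder flavor ID
            min_count=30, max_count=30,          # ask Nova for 30 instances in one request
            nics=[{'net-id': 'NETWORK_UUID'}])   # placeholder tenant network
        time.sleep(20 * 60)                      # 20-minute gap between batches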

The instances go into the Error state due to a Neutron server error while creating the ports:

2013-12-02 11:47:19.884 56214 ERROR nova.network.neutronv2.api [-] [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9] Neutron error creating port on network fb4fd94f-9d44-4f22-a347-ffdf8476c148
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9] Traceback (most recent call last):
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]   File "/usr/lib/python2.7/dist-packages/nova/network/neutronv2/api.py", line 182, in _create_port
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]     port_id = port_client.create_port(port_req_body)['port']['id']
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]   File "/usr/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 108, in with_params
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]     ret = self.function(instance, *args, **kwargs)
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]   File "/usr/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 308, in create_port
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]     return self.post(self.ports_path, body=body)
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]   File "/usr/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 1188, in post
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]     headers=headers, params=params)
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]   File "/usr/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 1103, in do_request
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]     resp, replybody = self.httpclient.do_request(action, method, body=body)
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]   File "/usr/lib/python2.7/dist-packages/neutronclient/client.py", line 185, in do_request
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]     **kwargs)
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]   File "/usr/lib/python2.7/dist-packages/neutronclient/client.py", line 152, in _cs_request
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9]     raise exceptions.ConnectionFailed(reason=e)
2013-12-02 11:47:19.884 56214 TRACE nova.network.neutronv2.api [instance: 2eba02dd-329d-4b21-bc46-700ba285a6c9] ConnectionFailed: Connection to neutron failed: timed out


1 answer


answered 2013-12-03 21:44:09 -0500 by Ashokb

These types of issues were observed even in Grizzly; it looks like the Neutron server is unable to handle the load. With Havana you have the option to increase the number of API child processes: set the api_workers parameter in neutron.conf to a value greater than 1, depending on the CPU cores available on your Neutron server (a sketch follows below). Personally I have not tried this option, but it should address your concurrency problem to some extent.
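
For reference, a minimal neutron.conf sketch of that setting; the value of 4 is purely illustrative (size it to the CPU cores on the Neutron server), and the option must actually exist in your Havana build to take effect:

    # /etc/neutron/neutron.conf -- illustrative value only
    [DEFAULT]
    api_workers = 4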


Comments

Thanks for the reply. When I ran these tests on Grizzly I had similar issues, for which I tuned the following values: sqlalchemy_pool_size = 60, sqlalchemy_max_overflow = 120, sqlalchemy_pool_timeout = 2. After that tuning I could deploy 240 instances successfully; I tested this more than 10 times and all the instances ended up in the Active state. In Havana I tuned the same parameters but still could not get more than 150 instances active with 30 concurrent requests. I am using the same hardware and network configuration in Havana that was used for Grizzly, but with a fresh installation. Regarding the use of api_workers in neutron.conf, I don't see such an option mentioned in /usr/lib/python2.7/dist-packages/neutron/common/config.py. Was this introduced recently? I have already increased osapi_compute_workers and the nova-conductor workers.

Sreedhar Nathani ( 2013-12-04 02:51:09 -0500 )
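
For reference, the tuning quoted in the comment above, written out as a config sketch. These are the Grizzly-era option names, and the nova worker values are examples rather than the values actually used; section and option names changed between releases, so check the configuration reference for your build before copying them:

    # neutron.conf (Grizzly-era option names, values as quoted in the comment)
    [DATABASE]
    sqlalchemy_pool_size = 60
    sqlalchemy_max_overflow = 120
    sqlalchemy_pool_timeout = 2

    # nova.conf -- the worker settings mentioned above (example values only)
    [DEFAULT]
    osapi_compute_workers = 8

    [conductor]
    workers = 8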
