Unable to Launch Instance After Following the Installation Guide for Queens

asked 2018-05-16 17:47:47 -0600

codylab

updated 2018-05-16 17:53:19 -0600

I followed the Installation Guide step by step to prepare my Controller and Compute nodes (both are KVM-based virtual machines) and completed the Minimal Deployment for Queens after two days of tinkering. Sadly, I got an error while trying to launch my very first instance.

Before I ask for help, here is my virtual system configuration:

== Virtual Networks ==
Management: 10.0.0.0/24 NAT
Provider: 192.168.113.0/24 NAT

== Controller VM ==
eth0 attached to Management, 10.0.0.11/24
eth1 attached to Provider (no IP)

== Compute VM ==
eth0 attached to Management, 10.0.0.31/24
eth1 attached to Provider (no IP)
vCPU nested virtualization enabled

The installation went relatively trouble-free except for the following two issues:

1) While starting the compute service (nova) on the controller node, I received an error message like this:

/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:332: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported exception.NotSupportedWarning

I found a fix at https://bugs.launchpad.net/nova/+bug/1746530 and edited oslo_db/sqlalchemy/enginefacade.py according to https://github.com/openstack/oslo.db/commit/c432d9e93884d6962592f6d19aaec3f8f66ac3a2. That seems to have solved the problem.

2) After installing Horizon, I was unable to open the dashboard web page and got the following error in the Apache log:

[core:error] [pid 11304] [client 10.0.0.11:59676] Script timed out before returning headers: django.wsgi

This issue was fixed by adding WSGIApplicationGroup %{GLOBAL} to /etc/httpd/conf.d/openstack-dashboard.conf.

As for my original question, I don't know which logs you may need for troubleshooting, so I have attached the entries that seem related to the launch failure. Let me know if you need any further information.

From /var/log/nova/nova-conductor.log

1612 ERROR nova.scheduler.utils [req-b673853d-9feb-4ed0-8f22-405dd535fab0 1b229a3a688f40f6ae83705d8212ccf1 18d2d3bfa79149a397d9587d16d66436 - default default] [instance: 2a73d1a8-219b-4cf2-85b2-d5674f7253a0] Error from last host: compute1 (node compute1): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1840, in _do_build_and_run_instance\n filter_properties, request_spec)\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2117, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 2a73d1a8-219b-4cf2-85b2-d5674f7253a0 was re-scheduled: Binding failed for port 5d3d7278-8217-43b8-b444-1d4227f4fda8, please check neutron logs for more information.\n']

1612 WARNING nova.scheduler.utils [req-b673853d-9feb-4ed0-8f22-405dd535fab0 1b229a3a688f40f6ae83705d8212ccf1 18d2d3bfa79149a397d9587d16d66436 - default default] Failed to compute_task_build_instances: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 2a73d1a8-219b-4cf2-85b2-d5674f7253a0.: MaxRetriesExceeded: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 2a73d1a8-219b-4cf2-85b2-d5674f7253a0.

1612 WARNING nova.scheduler.utils [req-b673853d-9feb-4ed0-8f22-405dd535fab0 1b229a3a688f40f6ae83705d8212ccf1 18d2d3bfa79149a397d9587d16d66436 - default default] [instance: 2a73d1a8-219b-4cf2-85b2-d5674f7253a0] Setting instance to ERROR state.: MaxRetriesExceeded: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 2a73d1a8-219b-4cf2-85b2-d5674f7253a0.
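
The useful clue in these entries is the nested "Binding failed for port ..." message rather than MaxRetriesExceeded, which only says that every retry ended the same way. As a minimal sketch, assuming Python 3 and a hypothetical helper name (failed_ports is not part of Nova), the affected port ID can be pulled out of such log text like this:

```python
import re

# Hypothetical helper, not part of Nova: finds the Neutron port IDs that
# Nova reported as failing to bind in a chunk of log text.
BINDING_FAILED = re.compile(
    r"Binding failed for port "
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def failed_ports(log_text):
    """Return the unique port UUIDs named in 'Binding failed' messages, in order."""
    seen = []
    for port_id in BINDING_FAILED.findall(log_text):
        if port_id not in seen:
            seen.append(port_id)
    return seen


# Excerpt of the conductor log entry quoted above.
sample = (
    "RescheduledException: Build of instance "
    "2a73d1a8-219b-4cf2-85b2-d5674f7253a0 was re-scheduled: "
    "Binding failed for port 5d3d7278-8217-43b8-b444-1d4227f4fda8, "
    "please check neutron logs for more information."
)
print(failed_ports(sample))
```

The port ID can then be searched for in the neutron-server log on the controller, which is where the reason for the binding failure is normally recorded.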

Could anyone shed some light? Thank you!


Comments


Nova creates a Neutron port to connect the instance to the network. This port must be mapped to physical resources on the Compute node, such as the Linuxbridge used by the install guide. This mapping fails.

Check the Neutron logs, in particular the neutron-server log and the linuxbridge-agent logs.

Bernd Bausch ( 2018-05-17 00:34:44 -0600 )

Hi Bernd, thanks so much. I got ERROR messages in the neutron-server log indicating that no physical network was available ('physical_network': None). I set "physical_interface_mappings = provider:eth1" in linuxbridge_agent.ini on both the controller and compute nodes. Any suggestions?

codylab ( 2018-05-17 18:11:26 -0600 )

Also, the output of the command "openstack network agent list" from the controller node is missing the "Linux bridge agent" from the compute node.
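
A minimal sketch of that check, assuming Python 3 and that the agent list has been exported with "openstack network agent list -f json". The key names ("Agent Type", "Host", "Alive"), the sample rows, and the helper name hosts_missing_agent are assumptions for illustration, not output captured from this deployment:

```python
import json


def hosts_missing_agent(agents, expected_hosts, agent_type="Linux bridge agent"):
    """Return the expected hosts that have no live agent of the given type.

    `agents` is a list of dicts as parsed from the JSON agent list. The key
    names are assumed to match the column headers of the table output.
    """
    alive = {
        agent.get("Host")
        for agent in agents
        # Accept a JSON boolean or the ":-)" marker used in the table view.
        if agent.get("Agent Type") == agent_type
        and agent.get("Alive") in (True, ":-)")
    }
    return [host for host in expected_hosts if host not in alive]


# Illustrative data only: an agent list with nothing registered from compute1.
sample = json.loads("""[
  {"Agent Type": "Linux bridge agent", "Host": "controller", "Alive": true},
  {"Agent Type": "DHCP agent", "Host": "controller", "Alive": true},
  {"Agent Type": "Metadata agent", "Host": "controller", "Alive": true}
]""")
print(hosts_missing_agent(sample, ["controller", "compute1"]))
```

A host that shows up in the result has no agent reporting to neutron-server, so ports scheduled to it cannot be bound.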

codylab ( 2018-05-17 18:57:29 -0600 )

My guess is that the port can't be bound because there is no Linuxbridge agent.

If there is a Linuxbridge agent log on the compute node, it should contain clues as to why the agent crashed. If there is none, perhaps you didn't deploy the agent. Check again how you deployed and configured it on the compute node.

Bernd Bausch ( 2018-05-17 20:02:36 -0600 )

Problem solved! The root cause was a typo in the RabbitMQ password in /etc/neutron/neutron.conf (transport_url = rabbit://openstack:PASSWORD@controller) on the compute node. Because of it, the Linux bridge agent on the compute node could not connect to the message queue and never registered with neutron-server. Thank you so much for pointing me in the right direction!
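
For anyone hitting the same thing, a typo like this can be spotted by comparing the transport_url credentials across nodes. This is only a sketch, assuming Python 3, that transport_url sits in the [DEFAULT] section, and that the file is simple enough for configparser to read. The helper name rabbit_credentials and both sample passwords are made up for illustration:

```python
import configparser
from urllib.parse import urlsplit


def rabbit_credentials(conf_text):
    """Return (username, password, host) from transport_url in [DEFAULT]."""
    # interpolation=None keeps '%' in passwords literal; strict=False
    # tolerates repeated options, which can occur in OpenStack config files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    url = urlsplit(parser.get("DEFAULT", "transport_url"))
    return url.username, url.password, url.hostname


# Illustrative contents only; the second password has a deliberate typo.
controller_conf = (
    "[DEFAULT]\n"
    "transport_url = rabbit://openstack:RABBIT_PASS@controller\n"
)
compute_conf = (
    "[DEFAULT]\n"
    "transport_url = rabbit://openstack:RABBIT_PAS@controller\n"
)

# Compare without printing the secrets themselves.
print(rabbit_credentials(controller_conf) == rabbit_credentials(compute_conf))
```

In practice the two strings would be read from /etc/neutron/neutron.conf on each node. A mismatch means one side is using credentials RabbitMQ will reject.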

codylab ( 2018-05-17 20:35:56 -0600 )