Kilo - Adding second compute node

asked 2015-10-05 21:51:05 -0500

nubber

I am trying to bring a second compute node into service, but so far I have not succeeded in launching an instance on it. I followed steps similar to those for adding a compute node in the OpenStack documentation.

Error: Failed to perform requested operation on instance "Instancetest", the instance has an error status: Please try again later [Error: No valid host was found. There are not enough hosts available.].

Below are a few logs I noticed. Can anyone suggest what I could be missing?

[root@compute2 nova]# neutron agent-list
+--------------------------------------+--------------------+----------+-------+----------------+---------------------------+
| id                                   | agent_type         | host     | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+----------+-------+----------------+---------------------------+
| 0cb8d0d4-140a-4446-add8-7881f0a07dda | DHCP agent         | network  | :-)   | True           | neutron-dhcp-agent        |
| 2a164cb6-0c6d-418a-ab7d-f68a0f3a3032 | L3 agent           | network  | :-)   | True           | neutron-l3-agent          |
| abbe20eb-8f49-43e6-a0d7-d2625ef07084 | Open vSwitch agent | network  | :-)   | True           | neutron-openvswitch-agent |
| d4e1b8a7-f826-4213-9f5f-9ab936d4f004 | Open vSwitch agent | compute1 | :-)   | True           | neutron-openvswitch-agent |
| ec116376-6d3f-4623-b5b5-78736ac41a5a | Metadata agent     | network  | :-)   | True           | neutron-metadata-agent    |
+--------------------------------------+--------------------+----------+-------+----------------+---------------------------+
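
One thing worth checking in the listing above is whether every compute host reports a live Open vSwitch agent. The sketch below parses the table text as pasted and reports hosts without one. The function names and the `expected_hosts` list are assumptions made for illustration, not part of any OpenStack tool.

```python
# Rows copied from the `neutron agent-list` output above (borders omitted).
AGENT_LIST = """\
| id                                   | agent_type         | host     | alive | admin_state_up | binary                    |
| 0cb8d0d4-140a-4446-add8-7881f0a07dda | DHCP agent         | network  | :-)   | True           | neutron-dhcp-agent        |
| 2a164cb6-0c6d-418a-ab7d-f68a0f3a3032 | L3 agent           | network  | :-)   | True           | neutron-l3-agent          |
| abbe20eb-8f49-43e6-a0d7-d2625ef07084 | Open vSwitch agent | network  | :-)   | True           | neutron-openvswitch-agent |
| d4e1b8a7-f826-4213-9f5f-9ab936d4f004 | Open vSwitch agent | compute1 | :-)   | True           | neutron-openvswitch-agent |
| ec116376-6d3f-4623-b5b5-78736ac41a5a | Metadata agent     | network  | :-)   | True           | neutron-metadata-agent    |
"""


def hosts_with_live_ovs_agent(table_text):
    """Return the set of hosts that report a live Open vSwitch agent."""
    hosts = set()
    for line in table_text.splitlines():
        # Data rows start with "|"; border rows start with "+".
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 6 or cells[0] == "id":
            continue  # skip the header row and anything malformed
        _id, agent_type, host, alive, _admin_state, _binary = cells
        if agent_type == "Open vSwitch agent" and alive == ":-)":
            hosts.add(host)
    return hosts


def missing_ovs_agents(table_text, expected_hosts):
    """Return expected hosts (hypothetical input) with no live OVS agent."""
    return sorted(set(expected_hosts) - hosts_with_live_ovs_agent(table_text))


print(missing_ovs_agents(AGENT_LIST, ["compute1", "compute2"]))
```

For the listing in the question, compute2 has no Open vSwitch agent row at all, which would be consistent with a port binding failure on that host.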

From the nova conductor log on Controller:

2015-10-05 21:22:57.378 4737 WARNING nova.scheduler.utils [req-78fe3a75-24fc-48c8-8041-44414315acea 96771cdf5d2844ed919a172556dec532 e4b31c40067e473c8c07fe5ff1021ac7 - - -] [instance: 8e3ae49a-27e4-4c06-8813-c0b900c4b9e5] Setting instance to ERROR state.
2015-10-05 22:27:52.634 4738 ERROR nova.scheduler.utils [req-f335dd94-73fc-489d-b6bd-cb7187a79c1c 96771cdf5d2844ed919a172556dec532 e4b31c40067e473c8c07fe5ff1021ac7 - - -] [instance: baf256cf-3c0f-49ea-9535-f1aea5bb0827] Error from last host: compute2 (node compute2): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2235, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2381, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance baf256cf-3c0f-49ea-9535-f1aea5bb0827 was re-scheduled: Unexpected vif_type=binding_failed\n']
2015-10-05 22:27:52.652 4738 WARNING nova.scheduler.utils [req-f335dd94-73fc-489d-b6bd-cb7187a79c1c 96771cdf5d2844ed919a172556dec532 e4b31c40067e473c8c07fe5ff1021ac7 - - -] Failed to compute_task_build_instances: No valid host was found. There are not enough hosts available.
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 86, in select_destinations
    filter_properties)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 80, in select_destinations
    raise exception.NoValidHost(reason=reason)

NoValidHost: No valid host was found. There are not enough hosts available.

2015-10-05 22:27:52.653 4738 WARNING nova.scheduler.utils [req-f335dd94-73fc-489d-b6bd-cb7187a79c1c 96771cdf5d2844ed919a172556dec532 e4b31c40067e473c8c07fe5ff1021ac7 - - -] [instance: baf256cf-3c0f-49ea-9535-f1aea5bb0827] Setting instance to ERROR state.
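
"No valid host was found" is only the final message after rescheduling gives up. The more specific reason appears to be in the "Error from last host" entry. A minimal sketch of pulling the host and reason out of such a line follows. The regular expression and function name are assumptions for illustration, and the sample is an abbreviated copy of the conductor log line above.

```python
import re

# Abbreviated from the conductor log entry above; "\n" is a literal
# backslash followed by "n", exactly as it appears in the log text.
LOG_LINE = (
    r"2015-10-05 22:27:52.634 4738 ERROR nova.scheduler.utils "
    r"[instance: baf256cf-3c0f-49ea-9535-f1aea5bb0827] "
    r"Error from last host: compute2 (node compute2): "
    r"[u'RescheduledException: Build of instance "
    r"baf256cf-3c0f-49ea-9535-f1aea5bb0827 was re-scheduled: "
    r"Unexpected vif_type=binding_failed\n']"
)

# Hypothetical pattern: host name after "Error from last host:", then the
# text between "was re-scheduled: " and the literal "\n" that ends it.
RESCHEDULE_RE = re.compile(
    r"Error from last host: (?P<host>\S+) .*"
    r"was re-scheduled: (?P<reason>.+?)\\n"
)


def reschedule_reason(line):
    """Return (host, reason) for a reschedule entry, or None if absent."""
    match = RESCHEDULE_RE.search(line)
    if match is None:
        return None
    return match.group("host"), match.group("reason")


print(reschedule_reason(LOG_LINE))
```

Here the reason is a VIF binding failure on compute2, which points at networking on that node rather than at CPU or RAM capacity.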

From /var/log/nova/nova-compute.log

2015-10-05 22:46:34.397 1092 INFO nova.scheduler.client.report [req-f107233e-76a4-4dd3-aa39-8739920e327f - - - - -] Compute_service record updated for ('compute2', 'compute2')
2015-10-05 22:46:34.397 1092 INFO nova.compute.resource_tracker [req-f107233e-76a4-4dd3-aa39-8739920e327f - - - - -] Compute_service record updated for compute2:compute2
2015-10-05 22:47:35.127 1092 INFO nova.compute.resource_tracker [req-f107233e-76a4-4dd3-aa39-8739920e327f - - - - -] Auditing locally available compute resources for node compute2
2015-10-05 22:47:35.313 1092 INFO nova.compute.resource_tracker [req-f107233e-76a4-4dd3-aa39-8739920e327f - - - - -] Total usable vcpus: 8, total allocated vcpus: 0
2015-10-05 22:47:35.313 1092 INFO nova.compute.resource_tracker [req-f107233e-76a4-4dd3-aa39-8739920e327f - - - - -] Final resource view: name=compute2 phys_ram=31986MB used_ram=5632MB phys_disk=464GB used_disk=42GB total_vcpus=8 used_vcpus=0 pci_stats=<nova.pci.stats.PciDeviceStats object at 0x5a43b10>

Comments

Hi. First, please post the output of

# tail -f /var/log/{nova,neutron}/*.log

on both the controller and the compute nodes.

Second, do you get this error on both compute nodes?

Moe ( 2015-10-06 08:24:54 -0500 )

The compute1 logs look clean.

compute2, however, keeps throwing this exception:

2015-10-06 09:52:21.665 6478 ERROR oslo_messaging._drivers.impl_rabbit [-] AMQP server on 127.0.0.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 2 seconds.

Is this message in the log a cause for concern?

nubber ( 2015-10-06 08:54:04 -0500 )
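
The target address in that message is a useful clue. A service on compute2 that tries 127.0.0.1:5672 is looking for RabbitMQ on the compute node itself, which suggests it never picked up a `rabbit_host` setting. The sketch below extracts the target from such a line and flags loopback addresses. The function names and the regular expression are assumptions for illustration.

```python
import ipaddress
import re

# Copied from the compute2 log entry quoted above.
LOG_LINE = (
    "2015-10-06 09:52:21.665 6478 ERROR oslo_messaging._drivers.impl_rabbit "
    "[-] AMQP server on 127.0.0.1:5672 is unreachable: [Errno 111] "
    "ECONNREFUSED. Trying again in 2 seconds."
)

AMQP_RE = re.compile(
    r"AMQP server on (?P<host>[^\s:]+):(?P<port>\d+) is unreachable"
)


def amqp_target(line):
    """Return (host, port) from an 'AMQP server ... unreachable' line."""
    match = AMQP_RE.search(line)
    if match is None:
        return None
    return match.group("host"), int(match.group("port"))


def is_loopback(host):
    """True if host is 'localhost' or a loopback IP address."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname such as "controller").
        return False


host, port = amqp_target(LOG_LINE)
if is_loopback(host):
    print("Service is trying a local broker at %s:%d" % (host, port))
```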

Compute1: Auditing locally available compute resources for node compute1 2015-10-06 09:56:31.233 1079 INFO nova.compute.resource_tracker [req-152e1f80-6382-4d71-8fed-7b631c2eaf77 - - - - -] Total usable vcpus: 8, total allocated vcpus: 9 2015-10-06 09:56:31.234 1079 INFO nova.compute.resource_tr

nubber ( 2015-10-06 08:57:13 -0500 )

3 answers


answered 2015-10-05 22:55:09 -0500

nubber

updated 2015-10-06 10:15:29 -0500

Is it possible that the hypervisor stats and the dashboard display are misleading? The two compute nodes have 8 + 8 = 16 CPUs. I get the error once the number of instances exceeds the number of CPUs present (if vCPU ~ CPU). If that is the case, why do the errored instances appear as if they are being spawned on the second compute node?

Instance Overview
Information

Name
compute2test-3
ID
62c1e1e9-e9fb-4a5e-ab86-baca549f287e
Status
Error
Availability Zone
nova
Created
Oct. 6, 2015, 3:44 a.m.
Time Since Created
9 minutes
Host
compute2
Fault

Message
No valid host was found. There are not enough hosts available

[root@compute1 ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz
Stepping:              9
CPU MHz:               3499.933
BogoMIPS:              6600.13
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7
[root@compute1 ~]#

[root@compute2 ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz
Stepping:              9
CPU MHz:               1675.265
BogoMIPS:              6600.18
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7

Answer: Thanks, Raffie.

I followed the logs you suggested (tail -f /var/log/{nova,neutron}/*.log).

On compute2, the node was attempting to connect to the AMQP server and failing. On checking the config further with

[root@compute2 ~]# egrep -v '(^#|^$)' /etc/neutron/neutron.conf

I noticed that I had entered the rabbit details in the wrong section.

**Incorrect for my setup:**
[oslo_messaging_amqp]
rabbit_host = controller
rabbit_userid = openstack
rabbit_password = openstack
[oslo_messaging_qpid]
[oslo_messaging_rabbit]

**Correct for my setup:**
[oslo_messaging_amqp]
[oslo_messaging_qpid]
[oslo_messaging_rabbit]
rabbit_host = controller
rabbit_userid = openstack
rabbit_password = openstack
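
This kind of misplacement is easy to miss by eye, so a small check can help. The sketch below uses Python's standard `configparser` to report `rabbit_*` options that sit outside `[oslo_messaging_rabbit]`. The function name and the option set are assumptions for illustration. Skipping `[DEFAULT]` is also an assumption, based on older releases reading these options from there, so check the behaviour of your own release.

```python
import configparser

INCORRECT = """\
[oslo_messaging_amqp]
rabbit_host = controller
rabbit_userid = openstack
rabbit_password = openstack
[oslo_messaging_qpid]
[oslo_messaging_rabbit]
"""

CORRECT = """\
[oslo_messaging_amqp]
[oslo_messaging_qpid]
[oslo_messaging_rabbit]
rabbit_host = controller
rabbit_userid = openstack
rabbit_password = openstack
"""

# Options taken from the snippets above; extend as needed.
RABBIT_OPTIONS = {"rabbit_host", "rabbit_userid", "rabbit_password"}
# Assumption: [DEFAULT] is tolerated because older releases read it there.
ACCEPTED_SECTIONS = {"oslo_messaging_rabbit", "DEFAULT"}


def misplaced_rabbit_options(conf_text):
    """Return sorted (section, option) pairs for misplaced rabbit options."""
    parser = configparser.ConfigParser(
        interpolation=None,             # values may contain "%"
        strict=False,                   # tolerate repeated sections/options
        default_section="__unused__",   # treat [DEFAULT] as a normal section
    )
    parser.read_string(conf_text)
    misplaced = []
    for section in parser.sections():
        if section in ACCEPTED_SECTIONS:
            continue
        for option in parser.options(section):
            if option in RABBIT_OPTIONS:
                misplaced.append((section, option))
    return sorted(misplaced)


print(misplaced_rabbit_options(INCORRECT))
print(misplaced_rabbit_options(CORRECT))
```

To run it against a real file, read the text of /etc/neutron/neutron.conf and pass it to `misplaced_rabbit_options`. An empty list means none of the three options were found in an unexpected section.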

After correcting this, the Open vSwitch agent on compute2 connected successfully.

[root@compute2 ~]# neutron agent-list
+--------------------------------------+--------------------+----------+-------+----------------+---------------------------+
| id                                   | agent_type         | host     | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+----------+-------+----------------+---------------------------+
| 0cb8d0d4-140a-4446-add8-7881f0a07dda | DHCP agent         | network  | :-)   | True           | neutron-dhcp-agent        |
| 29951dc3-8070-4fe9-8e08-d24fb420c0dd | Open vSwitch agent | compute2 | :-)   | True           | neutron-openvswitch-agent |
| 2a164cb6-0c6d-418a-ab7d-f68a0f3a3032 | L3 agent           | network  | :-)   | True           | neutron-l3-agent          |
| abbe20eb-8f49-43e6-a0d7-d2625ef07084 | Open vSwitch agent | network  | :-)   | True           | neutron-openvswitch-agent |
| d4e1b8a7-f826-4213-9f5f-9ab936d4f004 | Open vSwitch agent | compute1 | :-)   | True           | neutron-openvswitch-agent |
| ec116376-6d3f-4623-b5b5-78736ac41a5a | Metadata agent     | network  | :-)   | True           | neutron-metadata-agent    |
+--------------------------------------+--------------------+----------+-------+----------------+---------------------------+

answered 2015-10-06 00:04:01 -0500

soumitrakarmakar

The default CPU:vCPU ratio is 1:16 unless you have changed it in nova.conf, so this is not a CPU issue. You may need to restart the nova services on the controller and compute nodes.
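
To put numbers on that ratio, here is a rough sketch of the vCPU capacity the scheduler would allow on the two 8-CPU hosts from the lscpu output. It assumes the ratio is controlled by the `cpu_allocation_ratio` option in nova.conf and left at 16.0. The helper name is hypothetical.

```python
def vcpu_capacity(physical_cpus, cpu_allocation_ratio=16.0):
    """vCPUs schedulable on a host under the given overcommit ratio."""
    return int(physical_cpus * cpu_allocation_ratio)


# CPU counts taken from the lscpu output for each compute node.
hosts = {"compute1": 8, "compute2": 8}

per_host = {name: vcpu_capacity(cpus) for name, cpus in hosts.items()}
total = sum(per_host.values())

for name, capacity in sorted(per_host.items()):
    print("%s: up to %d vCPUs" % (name, capacity))
print("total: %d vCPUs" % total)
```

With these assumptions each host allows 128 vCPUs, so running slightly more than 16 vCPUs in total is far from the limit. The compute1 log line showing 9 allocated vCPUs on 8 usable ones also suggests overcommit is already in effect.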


answered 2015-10-06 23:59:46 -0500

xu-haiwei

Is this problem solved? This log message seems to be the cause: 'AMQP server on 127.0.0.1:5672 is unreachable'. The second compute node cannot reach the controller's RabbitMQ server; it seems the controller node's IP is not configured on it.

Stats

Asked: 2015-10-05 21:51:05 -0500

Seen: 808 times

Last updated: Oct 06 '15