nova-compute host not found

asked 2015-12-10 10:24:36 -0600

dwoods

I've been working on this for two days and can't seem to find an answer anywhere. I've seen a lot of people with a similar problem to mine, but no clear answer or one that fits my situation.

I've been following this guide, http://docs.openstack.org/kilo/install-guide/install/apt/content/index.html, step-by-bloody-step. I'm trying to install Kilo with legacy Nova network on Ubuntu 14.04. The controller is an old Dell 1950 with 2x Xeon 5120 and 16GB RAM. I have four compute nodes, each a dual 12-core AMD with 128GB RAM. So far I have not even got the first node working.

I've been through the steps twice now. The first time I got as far as starting an instance, but it failed due to 'lack of resources'. That's when I realized I had a compute issue. I wiped the drives, reinstalled everything in order, and found that the issue occurs when the compute node is created: it shows status "UP" for a minute, then goes "DOWN" and stays down. I've rebooted and restarted services, with the same results.

I've gotten this far by finding that I need to change "logdir" to "log_dir" in the nova.conf files and that I need to move the three Rabbit lines in the compute node's nova.conf under [DEFAULT], but I just can't find an answer to this issue. All help is appreciated. If there is other data that would help, just tell me what you want to see. Thanks

Compute node error:

2015-12-10 09:06:45.217 6731 INFO nova.openstack.common.periodic_task [-] Skipping periodic task _periodic_update_dns because its interval is negative
2015-12-10 09:06:45.247 6731 INFO nova.virt.driver [-] Loading compute driver 'libvirt.LibvirtDriver'
2015-12-10 09:06:45.271 6731 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller:5672
2015-12-10 09:06:45.289 6731 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on controller:5672
2015-12-10 09:06:45.317 6731 AUDIT nova.service [-] Starting compute node (version 2014.1.5)
2015-12-10 09:06:45.686 6731 AUDIT nova.compute.resource_tracker [-] Auditing locally available compute resources
2015-12-10 09:06:45.971 6731 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 128411
2015-12-10 09:06:45.972 6731 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 332
2015-12-10 09:06:45.972 6731 AUDIT nova.compute.resource_tracker [-] Free VCPUS: 24
2015-12-10 09:06:45.992 6731 WARNING nova.compute.resource_tracker [-] No service record for host cnode1
2015-12-10 09:06:46.014 6731 ERROR nova.openstack.common.threadgroup [-] Compute host cnode1 could not be found.
Traceback (most recent call last):

  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 142, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 286, in service_get_all_by
    context, result['host'])

  File "/usr/lib/python2.7/dist-packages/nova/objects/base.py", line 163, in wrapper
    result = fn(cls, context, *args, **kwargs)

  File ...

Comments

I have been stuck on the same problem for many days.

atkaushik (2015-12-25 06:49:17 -0600)

3 answers


answered 2016-01-03 13:34:51 -0600

jguy

I had a similar issue and found that the RabbitMQ connection was going to localhost on the compute node. It turns out to be an Ubuntu bug found back in Kilo: https://bugs.launchpad.net/openstack-manuals/+bug/1453682

The RabbitMQ settings appear to still be affected by this bug and must be configured under the [DEFAULT] section, rather than the [oslo_messaging_rabbit] section as the Liberty guide instructs. Just move them there. Then in the nova-compute log (/var/log/nova/nova-compute.log on Ubuntu) you will see that it actually contacts the controller, and it should work.
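To see which section of a nova.conf currently holds the RabbitMQ settings, a small check like the one below may help. It is only a sketch: it assumes the "three Rabbit lines" are rabbit_host, rabbit_userid and rabbit_password, and the sample config text and the function name are made up for illustration.

```python
# Assumed names of the three RabbitMQ options; adjust to match your nova.conf.
RABBIT_KEYS = ("rabbit_host", "rabbit_userid", "rabbit_password")


def rabbit_option_sections(conf_text):
    """Return {option: [sections that define it]} for the RabbitMQ keys."""
    found = {key: [] for key in RABBIT_KEYS}
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # A "[name]" line starts a new section.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        key = line.split("=", 1)[0].strip()
        if key in found and section is not None:
            found[key].append(section)
    return found


# Hypothetical config text, laid out the way the guide instructs.
sample = """
[DEFAULT]
rpc_backend = rabbit
log_dir = /var/log/nova

[oslo_messaging_rabbit]
rabbit_host = controller
rabbit_userid = openstack
rabbit_password = RABBIT_PASS
"""

for option, sections in rabbit_option_sections(sample).items():
    print(option, "->", sections or "not set")
```

To check a real node, read the file instead of the sample, for example `rabbit_option_sections(open("/etc/nova/nova.conf").read())`. A hand-written parser is used here rather than configparser because nova.conf values can contain characters such as "%" that configparser may try to interpolate.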


Comments

I have seen that posted and I noted that I had already tried it with no effect. I did find the answer - see below.

dwoods (2016-01-05 15:19:31 -0600)

answered 2016-01-05 15:27:07 -0600

dwoods

I don't know if this will fix anybody else's, but I did find the solution - I had to relearn to read.

In the Environment --> OpenStack packages section, I missed the line "Perform these procedures on all nodes" and had only installed the packages on the controller. I reinstalled everything from scratch, installed all of these packages on all nodes, and now the compute node stays UP on the controller. Oddly, to jguy's point, once I had installed the packages on the compute nodes, I no longer had to move the settings from [oslo_messaging_rabbit] to [DEFAULT]. Everything works as per the documentation (except the 'logdir' to 'log_dir' issue in all the nova.conf files).
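One hint pointing the same way is already in the compute log from the question: it reports "Starting compute node (version 2014.1.5)". Version numbers of the form 2014.1.x belong to Icehouse, which is what stock Ubuntu 14.04 ships, while Kilo packages are 2015.1.x. That fits a compute node that never had the Kilo package setup applied, although this reading is an inference and not something confirmed in the thread. A minimal sketch for checking the version line in a log (the function name and the small release table are illustrative, not part of any OpenStack tool):

```python
import re

# Date-based Nova version series and their release names.
RELEASES = {"2014.1": "Icehouse", "2014.2": "Juno", "2015.1": "Kilo"}

# Matches the startup line nova-compute writes, as seen in the question's log.
VERSION_RE = re.compile(r"Starting compute node \(version (\d+\.\d+)(?:\.\d+)*\)")


def release_from_log_line(line):
    """Return the release name for a startup log line, or None if no match."""
    match = VERSION_RE.search(line)
    if match is None:
        return None
    series = match.group(1)
    return RELEASES.get(series, "unknown (" + series + ")")


line = ("2015-12-10 09:06:45.317 6731 AUDIT nova.service [-] "
        "Starting compute node (version 2014.1.5)")
print(release_from_log_line(line))
```

If the release printed for a compute node differs from the one running on the controller, the package sources on that node are worth checking before touching nova.conf.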


answered 2015-12-17 03:02:44 -0600

Vinoth

I guess you might be missing a hosts entry on your compute node. Check whether cnode1 is mapped to your management address.

The first check is to try pinging cnode1 from cnode1 itself. If the ping succeeds, then the problem could be with the endpoints or the RabbitMQ credentials.
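A ping can succeed even when the name resolves to a loopback address, so it may be worth looking at what the hosts file actually maps cnode1 to. The sketch below parses hosts-file text and lists every address given for a name. The sample entries and IP addresses are invented for illustration, and the function name is hypothetical.

```python
def addresses_for(hostname, hosts_text):
    """Return the IP addresses that hosts-file text maps to hostname."""
    addresses = []
    for raw in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname in names:
            addresses.append(ip)
    return addresses


# Made-up hosts file: cnode1 appears on a loopback line and a management line.
sample = """
127.0.0.1   localhost
127.0.1.1   cnode1
10.0.0.11   controller
10.0.0.31   cnode1
"""

print(addresses_for("cnode1", sample))
```

On a real node this would be called as `addresses_for("cnode1", open("/etc/hosts").read())`. If a 127.x address shows up for the node's own name, the name is resolving to loopback instead of the management address.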

Thanks, Vinoth


Comments

Pings work from each machine to the others and to themselves. I'm getting really frustrated with this. I've now tried Liberty with CentOS 7 on VMs instead of hardware, and now Liberty with Ubuntu on VMs, and it all breaks down at the same place with the same errors in the logs.

dwoods (2015-12-17 09:01:18 -0600)
