Why are nova-conductor and nova-compute not showing in multi-node setup?

asked 2013-05-06 08:54:56 -0600

henrik16

updated 2013-05-06 17:35:40 -0600

smaffulli

Hi guys, I have a Grizzly multi-node setup built according to this guide. The problem is that nova-conductor and nova-compute are not listed when I run nova-manage service list:

root@cloud:~# nova-manage service list
Binary           Host                                 Zone             Status     State Updated_At
nova-cert        cloud                                internal         enabled    :-)   2013-05-06 13:49:52
nova-consoleauth cloud                                internal         enabled    :-)   2013-05-06 13:49:52
nova-scheduler   cloud                                internal         enabled    :-)   2013-05-06 13:49:52

I need the other services to be listed, so can you help me? Also, when I try to launch an instance it goes into the ERROR state, so I checked the scheduler log and see:

root@cloud:~# tail /var/log/nova/nova-scheduler.log
2013-05-06 12:15:37.040 WARNING nova.scheduler.driver [req-642a0a66-9360-487e-b2cd-02ac861f9bf8 df4140890ea14aef8d5abc80d5e0b9d4 f78136edb3024417984d42337ad4bd67] [instance: 96c554c2-67d5-4a28-8899-9a7de18f2286] Setting instance to ERROR state.
2013-05-06 12:17:00.351 WARNING nova.scheduler.driver [req-0032a7ec-4f74-45ff-92cb-267d770ee7f8 df4140890ea14aef8d5abc80d5e0b9d4 f78136edb3024417984d42337ad4bd67] [instance: 082d513d-ebb7-4b74-a45a-4a58a8d7a7ef] Setting instance to ERROR state.
2013-05-06 12:19:19.502 WARNING nova.scheduler.driver [req-481c9664-3eb2-4ed6-8b7d-bbd6a26ec0da df4140890ea14aef8d5abc80d5e0b9d4 f78136edb3024417984d42337ad4bd67] [instance: b9fbbf43-99f6-40dd-a505-9c995f9500df] Setting instance to ERROR state.
2013-05-06 14:07:50.776 WARNING nova.scheduler.driver [req-9ec0386e-6d90-4fe8-85ba-a0823bd716fc df4140890ea14aef8d5abc80d5e0b9d4 f78136edb3024417984d42337ad4bd67] [instance: c1589e39-f7a2-4d36-9669-1957b821d1c6] Setting instance to ERROR state.
2013-05-06 14:28:48.564 WARNING nova.scheduler.driver [req-4587b13b-dc76-41c1-b779-9c6caa207285 df4140890ea14aef8d5abc80d5e0b9d4 f78136edb3024417984d42337ad4bd67] [instance: 96a8be77-6bd2-417d-a32e-1e3c1dc83fe1] Setting instance to ERROR state.

More debugging info:

nova-conductor.log

nova-compute.log

Any help? Please.


Comments

Are there any errors or warnings in the /var/log/nova/nova-conductor.log and nova-compute.log files?

briancline (2013-05-06 09:23:16 -0600)

Yes there are... see here please, nova-conductor.log: http://pastebin.com/prrpkABU | nova-compute.log: http://pastebin.com/Uja2c94s

henrik16 (2013-05-06 09:42:34 -0600)

Is your message queue server up?

Alen Komljen (2013-05-06 16:05:07 -0600)

Yes, it is up, and all the other services are running too. I don't know what is wrong :s

henrik16 (2013-05-06 16:08:40 -0600)

One more question: is your compute node on a different server, with the nova-conductor and nova-compute services running on it?

Alen Komljen (2013-05-06 16:12:45 -0600)

2 answers


answered 2013-05-07 02:50:53 -0600

updated 2013-05-07 02:53:11 -0600

Test MQ connectivity from both the compute node and the controller node:

telnet 192.168.0.1 5672

and

telnet localhost 5672

From the logs I can see that the rabbit host in nova.conf differs between the compute node and the controller node. That shouldn't be a problem in itself (nova-conductor is running on localhost), but try setting the same host, the external IP, on both:

rabbit_host=192.168.0.1

Also check the rabbit password; the default for the user guest is "guest":

rabbit_password=guest
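
If telnet is not installed on a node, the same reachability check can be sketched in Python. This is only a minimal sketch and not part of the original answer: the function name port_open is made up for illustration, and it assumes RabbitMQ is listening on port 5672, as in the telnet commands above.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling port_open("192.168.0.1", 5672) on each node should return True if the broker is reachable from that node. A True result only shows that the port accepts TCP connections; it does not verify the rabbit user or password.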

Comments

root@c01:~# telnet 192.168.0.1 5672
Trying 192.168.0.1...
Connected to 192.168.0.1.
Escape character is '^]'.
Connection closed by foreign host.
root@c01:~# telnet localhost 5672
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

(It happens on both machines)

henrik16 (2013-05-07 03:43:17 -0600)

root@cloud:~# telnet localhost 5672
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.
root@cloud:~# telnet 192.168.0.3 5672
Trying 192.168.0.3...
telnet: Unable to connect to remote host: Connection refused

(also with localhost)

henrik16 (2013-05-07 03:43:50 -0600)

I noticed that my rabbit password was wrong, so I changed it and installed nova-conductor on the compute node. Now I can see nova-compute and nova-conductor when I run nova-manage service list, but only on the compute node. What is wrong, since I can't see the same services on both machines? Thanks.

henrik16 (2013-05-07 03:49:06 -0600)

Where is your rabbit deployed, on which IP address? Try to telnet to that address from both nodes.

Alen Komljen (2013-05-07 03:49:36 -0600)

My rabbit host is on the controller (cloud) node... I get connection refused on both :S

henrik16 (2013-05-07 03:51:01 -0600)

answered 2013-05-07 10:25:16 -0600

armando-migliaccio

The conductor, like the other nova services such as compute and scheduler, needs a message queue (like RabbitMQ) to communicate with the other services. If the conductor and compute services in your deployment do not come up correctly but your scheduler does, this most likely means that the former services do not know how to reach the message queue service. This can come down to two reasons:

1) there is no network connectivity between the MQ and the service
2) the configuration (of the failing service) is wrong, e.g. it does not point to the right queue node, has wrong credentials, etc.

So I suggest the following:

  • Find the configuration of your scheduler and make a note of the details for the message queue.
  • Use the same details to configure the services that fail. Remember that if the host of the message queue is 'localhost', you must use the IP or FQDN of the node the queue runs on when configuring remote services like the conductor or compute.
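
As a rough illustration of the comparison suggested above, the following sketch extracts the rabbit_* options from two nova.conf files and reports where they differ. It is a sketch under assumptions, not an official tool: the function names rabbit_settings and diff_settings are invented here, and it assumes the options live under a [DEFAULT] section header, as in a Grizzly-style nova.conf.

```python
import configparser


def rabbit_settings(conf_text):
    """Return the rabbit_* options from the [DEFAULT] section of a nova.conf."""
    # strict=False tolerates repeated options; interpolation=None keeps any
    # '%' characters in values (e.g. in passwords) from being interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return {key: value for key, value in parser.defaults().items()
            if key.startswith("rabbit_")}


def diff_settings(reference, other):
    """Map each differing option to its (reference, other) pair of values."""
    keys = sorted(set(reference) | set(other))
    return {key: (reference.get(key), other.get(key))
            for key in keys
            if reference.get(key) != other.get(key)}
```

For example, read /etc/nova/nova.conf from the node where the scheduler works and from the failing node, pass each file's text to rabbit_settings, and hand the two results to diff_settings. An empty result means the rabbit_* options match; a pair such as ("localhost", "192.168.0.1") for rabbit_host points at the 'localhost' case described above.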

Comments

I can ssh between the service node and the MQ server node, but I can't telnet to port 5672 like Alen mentioned... Can you give me the name of the scheduler config file? I have no idea. Thanks!

henrik16 (2013-05-07 10:34:07 -0600)

It's in /etc/nova/nova.conf

armando-migliaccio (2013-05-07 10:37:07 -0600)

Thanks Armando, I've checked both files and they are OK (on the compute node and the cloud node). I can see all services enabled in nova-manage service list on the cloud node but not on the compute node... maybe this will be solved with time, like Alen said. My problem now is accessing my instances.

henrik16 (2013-05-07 10:47:20 -0600)

If your nova-compute still isn't showing in the service list, there might be something wrong; this might be due to a misconfiguration related to your hypervisor of choice. Feel free to follow up, though in the context of another question! I believe that's more appropriate.

armando-migliaccio (2013-05-07 10:50:55 -0600)

It is showing... but not on the compute node. I have checked the configuration files too and everything looks OK. The logs for nova-conductor and nova-compute are OK, like we have seen above. The problem now is that when I run ip netns I can't see anything... my instance is running and booted properly.

henrik16 (2013-05-07 11:03:57 -0600)


Stats

Asked: 2013-05-06 08:54:56 -0600

Seen: 5,184 times

Last updated: May 07 '13