nova-compute won't start anymore

Hi all, my openstack-nova-compute.service does not start and I don't know why. I'm following the official installation guide and I'm working on the compute node. libvirt is fine: active (running).

When I try to start it manually, the console hangs for about one minute and then this message appears:

[root@compute1 ~]# systemctl restart openstack-nova-compute.service
Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.

journalctl -xn does not say anything useful:

Nov 11 22:31:21 compute1 systemd[1]: openstack-nova-compute.service operation timed out. Terminating.
Nov 11 22:31:21 compute1 systemd[1]: Failed to start OpenStack Nova Compute Server.
-- Subject: Unit openstack-nova-compute.service has failed

Connectivity to RabbitMQ is OK, but the nova-compute process continuously connects to and disconnects from the RabbitMQ server:

=WARNING REPORT==== 11-Nov-2014::22:34:21 === closing AMQP connection <0.1103.0> (10.0.0.33:56973 -> 10.0.0.12:5672): connection_closed_abruptly

=INFO REPORT==== 11-Nov-2014::22:34:22 === accepting AMQP connection <0.1132.0> (10.0.0.33:56975 -> 10.0.0.12:5672)

=INFO REPORT==== 11-Nov-2014::22:34:22 === accepting AMQP connection <0.1142.0> (10.0.0.33:56976 -> 10.0.0.12:5672)

The intermittent connections come from 10.0.0.33, the compute node; 10.0.0.12 is the controller node running RabbitMQ. I also have a pcap file in which I can see that 10.0.0.33 resets (RST) the connection, but I can't tell why because I don't know AMQP in enough detail.
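For reading the AMQP payload in such a capture, a minimal sketch follows, assuming the connection speaks AMQP 0-9-1: every frame starts with a 7-byte header (1-byte type, 2-byte channel, 4-byte payload size, all big-endian). The names `parse_frame_header` and `FrameHeader` are hypothetical helpers, not part of nova or any AMQP library.

```python
import struct
from collections import namedtuple

# Frame type codes defined by AMQP 0-9-1.
FRAME_TYPES = {1: "method", 2: "header", 3: "body", 8: "heartbeat"}

# Hypothetical container for the decoded header fields.
FrameHeader = namedtuple("FrameHeader", "type_name channel size")


def parse_frame_header(data):
    """Decode the 7-byte header that starts every AMQP 0-9-1 frame."""
    if len(data) < 7:
        raise ValueError("need at least 7 bytes, got %d" % len(data))
    # '>BHI' = big-endian unsigned byte, unsigned short, unsigned int.
    frame_type, channel, size = struct.unpack(">BHI", data[:7])
    return FrameHeader(FRAME_TYPES.get(frame_type, "unknown"), channel, size)


if __name__ == "__main__":
    # A heartbeat frame: type 8, channel 0, empty payload, frame-end 0xCE.
    print(parse_frame_header(b"\x08\x00\x00\x00\x00\x00\x00\xce"))
```

Feeding it the first bytes of each TCP payload from the capture shows which kind of frame was last exchanged before the RST; it does not by itself explain why the client closed the connection.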

The logs on the compute node are not of any help; they always show the lines below (verbose = true in nova.conf):

2014-11-11 22:37:22.702 1095 INFO nova.virt.driver [-] Loading compute driver 'libvirt.LibvirtDriver'
2014-11-11 22:37:22.708 1095 INFO nova.openstack.common.periodic_task [-] Skipping periodic task _periodic_update_dns because its interval is negative
2014-11-11 22:37:22.756 1095 INFO oslo.messaging._drivers.impl_rabbit [req-cefd617b-0772-4da4-8770-f3bf3195f88d ] Connecting to AMQP server on controller:5672
2014-11-11 22:37:22.775 1095 INFO oslo.messaging._drivers.impl_rabbit [req-cefd617b-0772-4da4-8770-f3bf3195f88d ] Connected to AMQP server on controller:5672
2014-11-11 22:37:22.778 1095 INFO oslo.messaging._drivers.impl_rabbit [req-cefd617b-0772-4da4-8770-f3bf3195f88d ] Connecting to AMQP server on controller:5672
2014-11-11 22:37:22.790 1095 INFO oslo.messaging._drivers.impl_rabbit [req-cefd617b-0772-4da4-8770-f3bf3195f88d ] Connected to AMQP server on controller:5672

There are no other errors. I also dropped and rebuilt the nova database in MySQL, so the services table is now empty. Could it be a hostname issue? I ask because that table contained wrong or non-existent hostnames. I changed the hostname of the compute node once, but /etc/hosts is now correct and points to the right IP addresses.
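A quick way to double-check the name resolution described above is sketched below. It assumes the host names `controller` and `compute1` seen in the logs and the shell prompt; `check_names` is a hypothetical helper, and the resolver is passed in as a parameter only so the function can be exercised without touching the network.

```python
import socket


def check_names(names, resolve=socket.gethostbyname):
    """Map each host name to its resolved IPv4 address, or None on failure."""
    result = {}
    for name in names:
        try:
            result[name] = resolve(name)
        except (socket.error, OSError):
            # Lookup failed: the name is missing from /etc/hosts and DNS.
            result[name] = None
    return result


if __name__ == "__main__":
    # The local host name plus the two node names used in this setup.
    names = [socket.gethostname(), "controller", "compute1"]
    for name, ip in sorted(check_names(names).items()):
        print("%-20s %s" % (name, ip or "DOES NOT RESOLVE"))
```

Running it on both nodes and comparing the output with /etc/hosts shows whether the name the compute node reports for itself resolves to the expected address.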

Please point me in the right direction.

Edit: sorry, I forgot to mention that the nova services on the controller nodes are all active, and the nova.conf files have been checked many times and are fine on both nodes.

UPDATE:

Starting it from the command line with python --debug behaves as usual:

2014-11-14 07:54:47.287 1125 DEBUG nova.servicegroup.api [-] ServiceGroup driver defined as an instance of db __new__ /usr/lib/python2.7/site-packages/nova/servicegroup/api.py:65
2014-11-14 07:54:47.415 1125 INFO nova.virt.driver [-] Loading compute driver 'libvirt.LibvirtDriver'
2014-11-14 07:54:47.421 1125 INFO nova.openstack.common.periodic_task [-] Skipping periodic task _periodic_update_dns because its interval is negative
2014-11-14 07:54:47.470 1125 INFO oslo.messaging._drivers.impl_rabbit [req-5074168d-4800-4ac9-a96e-a30c1ef28146 ] Connecting to AMQP server on controller:5672
2014-11-14 07:54:47.489 1125 INFO oslo.messaging._drivers.impl_rabbit [req-5074168d-4800-4ac9-a96e-a30c1ef28146 ] Connected to AMQP server on controller:5672
2014-11-14 07:54:47.492 1125 INFO oslo.messaging._drivers.impl_rabbit [req-5074168d-4800-4ac9-a96e-a30c1ef28146 ] Connecting to AMQP server on controller:5672
2014-11-14 07:54:47.504 1125 INFO oslo.messaging._drivers.impl_rabbit [req-5074168d-4800-4ac9-a96e-a30c1ef28146 ] Connected to AMQP server on controller:5672

This happens immediately and continues until I press Ctrl+C, at which point I get some TRACE lines:

2014-11-14 07:58:01.814 1125 CRITICAL nova [req-5074168d-4800-4ac9-a96e-a30c1ef28146 None] KeyboardInterrupt
2014-11-14 07:58:01.814 1125 TRACE nova Traceback (most recent call last):
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/bin/nova-compute", line 10, in <module>
2014-11-14 07:58:01.814 1125 TRACE nova     sys.exit(main())
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/cmd/compute.py", line 72, in main
2014-11-14 07:58:01.814 1125 TRACE nova     db_allowed=CONF.conductor.use_local)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 275, in create
2014-11-14 07:58:01.814 1125 TRACE nova     db_allowed=db_allowed)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 157, in __init__
2014-11-14 07:58:01.814 1125 TRACE nova     self.conductor_api.wait_until_ready(context.get_admin_context())
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/conductor/api.py", line 313, in wait_until_ready
2014-11-14 07:58:01.814 1125 TRACE nova     timeout=timeout)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/baserpc.py", line 62, in ping
2014-11-14 07:58:01.814 1125 TRACE nova     return cctxt.call(context, 'ping', arg=arg_p)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 152, in call
2014-11-14 07:58:01.814 1125 TRACE nova     retry=self.retry)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send
2014-11-14 07:58:01.814 1125 TRACE nova     timeout=timeout, retry=retry)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 408, in send
2014-11-14 07:58:01.814 1125 TRACE nova     retry=retry)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 394, in _send
2014-11-14 07:58:01.814 1125 TRACE nova     retry=retry)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqp.py", line 145, in __exit__
2014-11-14 07:58:01.814 1125 TRACE nova     self._done()
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqp.py", line 134, in _done
2014-11-14 07:58:01.814 1125 TRACE nova     self.connection.reset()
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 690, in reset
2014-11-14 07:58:01.814 1125 TRACE nova     self.channel.close()
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 170, in close
2014-11-14 07:58:01.814 1125 TRACE nova     (20, 41), # Channel.close_ok
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 73, in wait
2014-11-14 07:58:01.814 1125 TRACE nova     self.channel_id, allowed_methods)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 220, in _wait_method
2014-11-14 07:58:01.814 1125 TRACE nova     self.method_reader.read_method()
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 192, in read_method
2014-11-14 07:58:01.814 1125 TRACE nova     self._next_method()
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 113, in _next_method
2014-11-14 07:58:01.814 1125 TRACE nova     frame_type, channel, payload = read_frame()
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 163, in read_frame
2014-11-14 07:58:01.814 1125 TRACE nova     frame_type, channel, size = unpack('>BHI', read(7, True))
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 278, in _read
2014-11-14 07:58:01.814 1125 TRACE nova     s = recv(131072)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 309, in recv
2014-11-14 07:58:01.814 1125 TRACE nova     timeout_exc=socket.timeout("timed out"))
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 186, in _trampoline
2014-11-14 07:58:01.814 1125 TRACE nova     mark_as_closed=self._mark_as_closed)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/eventlet/hubs/__init__.py", line 159, in trampoline
2014-11-14 07:58:01.814 1125 TRACE nova     return hub.switch()
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 293, in switch
2014-11-14 07:58:01.814 1125 TRACE nova     return self.greenlet.switch()
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 345, in run
2014-11-14 07:58:01.814 1125 TRACE nova     self.wait(sleep_time)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/eventlet/hubs/poll.py", line 85, in wait
2014-11-14 07:58:01.814 1125 TRACE nova     presult = self.do_poll(seconds)
2014-11-14 07:58:01.814 1125 TRACE nova   File "/usr/lib/python2.7/site-packages/eventlet/hubs/epolls.py", line 62, in do_poll
2014-11-14 07:58:01.814 1125 TRACE nova     return self.poll.poll(seconds)
2014-11-14 07:58:01.814 1125 TRACE nova KeyboardInterrupt

But those are not of any help, at least at my current level of knowledge. Do you see anything strange in that TRACE? Thanks.
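One way to make such a TRACE easier to read is to reduce it to its call chain. The sketch below is a rough illustration: `call_chain` and `last_frame_in` are hypothetical helpers, and `SAMPLE` is a four-frame excerpt copied from the trace above. Applied to the full trace, the innermost frame inside nova itself is `ping` in baserpc.py, reached from `wait_until_ready`, which suggests the process was busy in that conductor RPC call when Ctrl+C arrived.

```python
import re

# Matches one traceback frame: File "<path>", line <n>, in <function>
FRAME_RE = re.compile(
    r'File\s+"(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')

# Excerpt of the TRACE output above, as it appears when pasted on one line.
SAMPLE = (
    'TRACE nova File "/usr/lib/python2.7/site-packages/nova/service.py", '
    'line 157, in __init__ '
    'TRACE nova File "/usr/lib/python2.7/site-packages/nova/conductor/api.py", '
    'line 313, in wait_until_ready '
    'TRACE nova File "/usr/lib/python2.7/site-packages/nova/baserpc.py", '
    'line 62, in ping '
    'TRACE nova File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", '
    'line 152, in call'
)


def call_chain(trace_text):
    """Return (path, line, function) for every frame, outermost first."""
    return [(m.group("path"), int(m.group("line")), m.group("func"))
            for m in FRAME_RE.finditer(trace_text)]


def last_frame_in(trace_text, package):
    """Return the innermost frame whose path contains `package`, or None."""
    frames = [f for f in call_chain(trace_text) if package in f[0]]
    return frames[-1] if frames else None


if __name__ == "__main__":
    for path, line, func in call_chain(SAMPLE):
        print("%s:%d in %s" % (path, line, func))
    print("innermost nova frame:", last_frame_in(SAMPLE, "/nova/"))
```

The regular expression tolerates a line break between `File` and the quoted path, since pasted traces are often wrapped at that point.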
