Problems launching instances

asked 2011-05-16 17:11:24 -0500

diego-lalo gravatar image

Hi everyone, last week I managed to configure a two node OpenStack cloud and it was working fine but today is not working!!! I don't know what is the problem, I restarted the CloudController, I restarted the services but nothing worked, here is the output of nova-compute log:

2011-05-16 17:24:19,823 DEBUG nova.rpc [-] received {u'_context_request_id': u'2S-U0RPA6VPLVL73BB07', u'_context_read_deleted': False, u'args': {u'instance_id': 18, u'injected_files': None, u'availability_zone': None}, u'_context_is_admin': True, u'_context_timestamp': u'2011-05-16T16:24:19Z', u'_context_user': u'myuser', u'method': u'run_instance', u'_context_project': u'proj1', u'_context_remote_address': u'192.168.1.30'} from (pid=1048) _receive /usr/lib/pymodules/python2.6/nova/rpc.py:177 2011-05-16 17:24:19,824 DEBUG nova.rpc [-] unpacked context: {'timestamp': u'2011-05-16T16:24:19Z', 'remote_address': u'192.168.1.30', 'project': u'proj1', 'is_admin': True, 'user': u'myuser', 'request_id': u'2S-U0RPA6VPLVL73BB07', 'read_deleted': False} from (pid=1048) _unpack_context /usr/lib/pymodules/python2.6/nova/rpc.py:350 2011-05-16 17:24:19,895 AUDIT nova.compute.manager [2S-U0RPA6VPLVL73BB07 myuser proj1] instance 18: starting... 2011-05-16 17:24:20,042 DEBUG nova.rpc [-] Making asynchronous call on network.openstack-in-a-box-1rev85 ... from (pid=1048) call /usr/lib/pymodules/python2.6/nova/rpc.py:370 2011-05-16 17:24:20,043 DEBUG nova.rpc [-] MSG_ID is 1d4c30b82a514def8264d5a37a449879 from (pid=1048) call /usr/lib/pymodules/python2.6/nova/rpc.py:373 2011-05-16 17:24:23,646 INFO nova.compute.manager [-] Found instance 'instance-00000012' in DB but no VM. State=0, so setting state to shutoff. 2011-05-16 17:24:23,647 INFO nova.compute.manager [-] DB/VM state mismatch. Changing state from '0' to '5' 2011-05-16 17:25:23,856 INFO nova.compute.manager [-] Found instance 'instance-00000012' in DB but no VM. State=5, so setting state to shutoff. 2011-05-16 17:26:23,925 INFO nova.compute.manager [-] Found instance 'instance-00000012' in DB but no VM. State=5, so setting state to shutoff.

From what I'm uderstanding reading the log the spawning process is interrupted after making asynchronous call on network, nova computer manager waits for 3 seconds and because there is no response to the network call it changes the state of the VM to shutoff. Here is the nova.conf of the cloud controller:

--dhcpbridge_flagfile=/etc/nova/nova.conf --dhcpbridge=/usr/bin/nova-dhcpbridge --logdir=/var/log/nova --state_path=/var/lib/nova --lock_path=/var/lock/nova --verbose --s3_host=192.168.1.234 --rabbit_host=192.168.1.234 --cc_host=192.168.1.234 --ec2_url=http://192.168.1.234:8773/services/Cloud --fixed_range=10.0.0.0/12 --network_size=2048 --network_manager=nova.network.manager.FlatDHCPManager --flat_interface=eth0 --routing_source_ip=192.168.1.234 --FAKE_subdomain=ec2 --image_service=nova.image.glance.GlanceImageService --glance_host=192.168.1.234 --glance_port=9292 --sql_connection=mysql://root:nova@192.168.1.234/nova

And here is the output of ifconfig on the cloud controller:

eth0 Link encap:Ethernet HWaddr 00:24:8c:2e:50:88
inet addr:192.168.1.234 Bcast:192.168.1.255 Mask:255.255 ...
(more)

edit retag flag offensive close merge delete

8 answers

Sort by ยป oldest newest most voted
0

answered 2011-05-16 17:57:40 -0500

diego-lalo gravatar image

I restarted rabbitmq and then all of nova services, I also checked again the message queues and they are all empty, I also noticed something strange at the start of the queues list:

{error_logger,{{2011,5,16},{18,55,20}},std_error,"File operation error: eacces. Target: /etc/hosts. Function: get_file. Process: inet_db."}

=ERROR REPORT==== 16-May-2011::18:55:20 === File operation error: eacces. Target: /etc/hosts. Function: get_file. Process: inet_db.

Is this a problem?

Cheers Diego

edit flag offensive delete link more
0

answered 2011-05-16 17:24:42 -0500

vishvananda gravatar image

Are your services still running? It sounds like nova-network may be down.

Vish

On May 16, 2011, at 10:15 AM, Diego Lalo wrote:

New question #157772 on OpenStack Compute (nova): https://answers.launchpad.net/nova/+q...

Hi everyone, last week I managed to configure a two node OpenStack cloud and it was working fine but today is not working!!! I don't know what is the problem, I restarted the CloudController, I restarted the services but nothing worked, here is the output of nova-compute log:

2011-05-16 17:24:19,823 DEBUG nova.rpc [-] received {u'_context_request_id': u'2S-U0RPA6VPLVL73BB07', u'_context_read_deleted': False, u'args': {u'instance_id': 18, u'injected_files': None, u'availability_zone': None}, u'_context_is_admin': True, u'_context_timestamp': u'2011-05-16T16:24:19Z', u'_context_user': u'myuser', u'method': u'run_instance', u'_context_project': u'proj1', u'_context_remote_address': u'192.168.1.30'} from (pid=1048) _receive /usr/lib/pymodules/python2.6/nova/rpc.py:177 2011-05-16 17:24:19,824 DEBUG nova.rpc [-] unpacked context: {'timestamp': u'2011-05-16T16:24:19Z', 'remote_address': u'192.168.1.30', 'project': u'proj1', 'is_admin': True, 'user': u'myuser', 'request_id': u'2S-U0RPA6VPLVL73BB07', 'read_deleted': False} from (pid=1048) _unpack_context /usr/lib/pymodules/python2.6/nova/rpc.py:350 2011-05-16 17:24:19,895 AUDIT nova.compute.manager [2S-U0RPA6VPLVL73BB07 myuser proj1] instance 18: starting... 2011-05-16 17:24:20,042 DEBUG nova.rpc [-] Making asynchronous call on network.openstack-in-a-box-1rev85 ... from (pid=1048) call /usr/lib/pymodules/python2.6/nova/rpc.py:370 2011-05-16 17:24:20,043 DEBUG nova.rpc [-] MSG_ID is 1d4c30b82a514def8264d5a37a449879 from (pid=1048) call /usr/lib/pymodules/python2.6/nova/rpc.py:373 2011-05-16 17:24:23,646 INFO nova.compute.manager [-] Found instance 'instance-00000012' in DB but no VM. State=0, so setting state to shutoff. 2011-05-16 17:24:23,647 INFO nova.compute.manager [-] DB/VM state mismatch. Changing state from '0' to '5' 2011-05-16 17:25:23,856 INFO nova.compute.manager [-] Found instance 'instance-00000012' in DB but no VM. State=5, so setting state to shutoff. 2011-05-16 17:26:23,925 INFO nova.compute.manager [-] Found instance 'instance-00000012' in DB but no VM. State=5, so setting state to shutoff.

From what I'm uderstanding reading the log the spawning process is interrupted after making asynchronous call on network, nova computer manager waits for 3 seconds and because there is no response to the network call it changes the state of the VM to shutoff. Here is the nova.conf of the cloud controller:

--dhcpbridge_flagfile=/etc/nova/nova.conf --dhcpbridge=/usr/bin/nova-dhcpbridge --logdir=/var/log/nova --state_path=/var/lib/nova --lock_path=/var/lock/nova --verbose --s3_host=192.168.1.234 --rabbit_host=192.168.1.234 --cc_host=192.168.1.234 --ec2_url=http://192.168.1.234:8773/services/Cloud --fixed_range=10.0.0.0/12 --network_size=2048 --network_manager=nova.network.manager.FlatDHCPManager --flat_interface=eth0 --routing_source_ip=192.168.1.234 --FAKE_subdomain=ec2 --image_service=nova.image.glance.GlanceImageService --glance_host=192.168.1.234 --glance_port=9292 --sql_connection=mysql://root:nova@192.168.1.234 ...

(more)
edit flag offensive delete link more
0

answered 2011-05-16 17:29:20 -0500

diego-lalo gravatar image

root@openstack:~# ps ax | grep nova 1023 ? Ss 0:00 su -c nova-api --flagfile=/etc/nova/nova.conf nova 1038 ? S 0:04 /usr/bin/python /usr/bin/nova-api --flagfile=/etc/nova/nova.conf 1040 ? Ss 0:00 su -c nova-compute --flagfile=/etc/nova/nova.conf nova 1048 ? S 2:11 /usr/bin/python /usr/bin/nova-compute --flagfile=/etc/nova/nova.conf 1057 ? S 0:00 /usr/bin/python /usr/bin/nova-objectstore --uid 110 --gid 65534 --pidfile /var/run/nova/nova-objectstore.pid --flagfile=/etc/nova/nova.conf --nodaemon --logfile=/var/log/nova/nova-objectstore.log 1842 ? Ss 0:00 su -c nova-scheduler --flagfile=/etc/nova/nova.conf nova 1843 ? S 1:44 /usr/bin/python /usr/bin/nova-scheduler --flagfile=/etc/nova/nova.conf 3148 ? Ss 0:00 su -c nova-network --flagfile=/etc/nova/nova.conf nova 3149 ? S 0:30 /usr/bin/python /usr/bin/nova-network --flagfile=/etc/nova/nova.conf 3521 pts/0 S+ 0:00 grep --color=auto nova

edit flag offensive delete link more
0

answered 2011-05-16 17:33:55 -0500

vishvananda gravatar image

Try: rabbitmqctl list_queues

If you see any reporting a number other than 0, then something isn't pulling messages properly. You might want to check the logs of that component to see if it is reporting errors and restart that component.

Vish

On May 16, 2011, at 10:31 AM, Diego Lalo wrote:

Question #157772 on OpenStack Compute (nova) changed: https://answers.launchpad.net/nova/+q...

Status: Answered => Open

Diego Lalo is still having a problem: root@openstack:~# ps ax | grep nova 1023 ? Ss 0:00 su -c nova-api --flagfile=/etc/nova/nova.conf nova 1038 ? S 0:04 /usr/bin/python /usr/bin/nova-api --flagfile=/etc/nova/nova.conf 1040 ? Ss 0:00 su -c nova-compute --flagfile=/etc/nova/nova.conf nova 1048 ? S 2:11 /usr/bin/python /usr/bin/nova-compute --flagfile=/etc/nova/nova.conf 1057 ? S 0:00 /usr/bin/python /usr/bin/nova-objectstore --uid 110 --gid 65534 --pidfile /var/run/nova/nova-objectstore.pid --flagfile=/etc/nova/nova.conf --nodaemon --logfile=/var/log/nova/nova-objectstore.log 1842 ? Ss 0:00 su -c nova-scheduler --flagfile=/etc/nova/nova.conf nova 1843 ? S 1:44 /usr/bin/python /usr/bin/nova-scheduler --flagfile=/etc/nova/nova.conf 3148 ? Ss 0:00 su -c nova-network --flagfile=/etc/nova/nova.conf nova 3149 ? S 0:30 /usr/bin/python /usr/bin/nova-network --flagfile=/etc/nova/nova.conf 3521 pts/0 S+ 0:00 grep --color=auto nova


You received this question notification because you are a member of Nova Core, which is an answer contact for OpenStack Compute (nova).

edit flag offensive delete link more
0

answered 2011-05-16 17:41:27 -0500

diego-lalo gravatar image

{error_logger,{{2011,5,16},{18,37,53}},std_error,"File operation error: eacces. Target: /etc/hosts. Function: get_file. Process: inet_db."}

=ERROR REPORT==== 16-May-2011::18:37:53 === File operation error: eacces. Target: /etc/hosts. Function: get_file. Process: inet_db. Listing queues ... 1d4c30b82a514def8264d5a37a449879 0 compute.openstack 0 b7380f15d08c4c50ac4ad3915b4f62a5 0 scheduler.openstack 0 compute_fanout_946ec41b9ae04b51a8daf46ddd89b922 0 network.openstack 0 network_fanout_ec4d522fda714aa2885cd132e1e35695 0 compute_fanout_f3dee487234648f89beb0f5071cd9593 0 scheduler_fanout_5a81125db31e49edac3d268052d1abfb 0 090f6b86a38348cfb4890d4914b57d7e 0 network_fanout_81b19b55ac7b4ae884c1107fc3272924 0 compute 0 scheduler 0 network 0 network_fanout_3bc4d7b2b41e4dcab9c47cd7fc8e8b36 0 798f654b5cfe4e46abec4f52038c0ac8 0 compute.eucalyptus 0 ...done.

This is the output of rabbit@openstack.log

=WARNING REPORT==== 16-May-2011::18:39:39 === exception on TCP connection <0.7872.0> from 192.168.1.234:44306 connection_closed_abruptly

=INFO REPORT==== 16-May-2011::18:39:39 === closing TCP connection <0.7872.0> from 192.168.1.234:44306

=INFO REPORT==== 16-May-2011::18:39:39 === accepted TCP connection on 0.0.0.0:5672 from 192.168.1.234:53720

=INFO REPORT==== 16-May-2011::18:39:39 === starting TCP connection <0.23529.0> from 192.168.1.234:53720

=WARNING REPORT==== 16-May-2011::18:39:39 === exception on TCP connection <0.23529.0> from 192.168.1.234:53720 connection_closed_abruptly

=INFO REPORT==== 16-May-2011::18:39:39 === closing TCP connection <0.23529.0> from 192.168.1.234:53720

=INFO REPORT==== 16-May-2011::18:39:39 === accepted TCP connection on 0.0.0.0:5672 from 192.168.1.234:53721

=INFO REPORT==== 16-May-2011::18:39:39 === starting TCP connection <0.23535.0> from 192.168.1.234:53721

=INFO REPORT==== 16-May-2011::18:39:40 === accepted TCP connection on 0.0.0.0:5672 from 192.168.1.50:50550

=INFO REPORT==== 16-May-2011::18:39:40 === starting TCP connection <0.23544.0> from 192.168.1.50:50550

=INFO REPORT==== 16-May-2011::18:39:40 === accepted TCP connection on 0.0.0.0:5672 from 192.168.1.50:50551

=INFO REPORT==== 16-May-2011::18:39:40 === starting TCP connection <0.23551.0> from 192.168.1.50:50551

=WARNING REPORT==== 16-May-2011::18:39:42 === exception on TCP connection <0.23535.0> from 192.168.1.234:53721 connection_closed_abruptly

=INFO REPORT==== 16-May-2011::18:39:42 === closing TCP connection <0.23535.0> from 192.168.1.234:53721

edit flag offensive delete link more
0

answered 2011-05-16 17:48:21 -0500

vishvananda gravatar image

something looks a little odd there. I might try restarting rabbit and restarting all of the services.

On May 16, 2011, at 10:46 AM, Diego Lalo wrote:

Question #157772 on OpenStack Compute (nova) changed: https://answers.launchpad.net/nova/+q...

Status: Answered => Open

Diego Lalo is still having a problem: {error_logger,{{2011,5,16},{18,37,53}},std_error,"File operation error: eacces. Target: /etc/hosts. Function: get_file. Process: inet_db."}

=ERROR REPORT==== 16-May-2011::18:37:53 === File operation error: eacces. Target: /etc/hosts. Function: get_file. Process: inet_db. Listing queues ... 1d4c30b82a514def8264d5a37a449879 0 compute.openstack 0 b7380f15d08c4c50ac4ad3915b4f62a5 0 scheduler.openstack 0 compute_fanout_946ec41b9ae04b51a8daf46ddd89b922 0 network.openstack 0 network_fanout_ec4d522fda714aa2885cd132e1e35695 0 compute_fanout_f3dee487234648f89beb0f5071cd9593 0 scheduler_fanout_5a81125db31e49edac3d268052d1abfb 0 090f6b86a38348cfb4890d4914b57d7e 0 network_fanout_81b19b55ac7b4ae884c1107fc3272924 0 compute 0 scheduler 0 network 0 network_fanout_3bc4d7b2b41e4dcab9c47cd7fc8e8b36 0 798f654b5cfe4e46abec4f52038c0ac8 0 compute.eucalyptus 0 ...done.

This is the output of rabbit@openstack.log

=WARNING REPORT==== 16-May-2011::18:39:39 === exception on TCP connection <0.7872.0> from 192.168.1.234:44306 connection_closed_abruptly

=INFO REPORT==== 16-May-2011::18:39:39 === closing TCP connection <0.7872.0> from 192.168.1.234:44306

=INFO REPORT==== 16-May-2011::18:39:39 === accepted TCP connection on 0.0.0.0:5672 from 192.168.1.234:53720

=INFO REPORT==== 16-May-2011::18:39:39 === starting TCP connection <0.23529.0> from 192.168.1.234:53720

=WARNING REPORT==== 16-May-2011::18:39:39 === exception on TCP connection <0.23529.0> from 192.168.1.234:53720 connection_closed_abruptly

=INFO REPORT==== 16-May-2011::18:39:39 === closing TCP connection <0.23529.0> from 192.168.1.234:53720

=INFO REPORT==== 16-May-2011::18:39:39 === accepted TCP connection on 0.0.0.0:5672 from 192.168.1.234:53721

=INFO REPORT==== 16-May-2011::18:39:39 === starting TCP connection <0.23535.0> from 192.168.1.234:53721

=INFO REPORT==== 16-May-2011::18:39:40 === accepted TCP connection on 0.0.0.0:5672 from 192.168.1.50:50550

=INFO REPORT==== 16-May-2011::18:39:40 === starting TCP connection <0.23544.0> from 192.168.1.50:50550

=INFO REPORT==== 16-May-2011::18:39:40 === accepted TCP connection on 0.0.0.0:5672 from 192.168.1.50:50551

=INFO REPORT==== 16-May-2011::18:39:40 === starting TCP connection <0.23551.0> from 192.168.1.50:50551

=WARNING REPORT==== 16-May-2011::18:39:42 === exception on TCP connection <0.23535.0> from 192.168.1.234:53721 connection_closed_abruptly

=INFO REPORT==== 16-May-2011::18:39:42 === closing TCP connection <0.23535.0> from 192.168.1.234:53721


You received this question notification because you are a member of Nova Core, which is an answer contact for OpenStack Compute (nova).

edit flag offensive delete link more
0

answered 2011-05-16 18:04:20 -0500

vishvananda gravatar image

Not sure, I've never seen that message before.

Vish

On May 16, 2011, at 11:01 AM, Diego Lalo wrote:

Question #157772 on OpenStack Compute (nova) changed: https://answers.launchpad.net/nova/+q...

Status: Answered => Open

Diego Lalo is still having a problem: I restarted rabbitmq and then all of nova services, I also checked again the message queues and they are all empty, I also noticed something strange at the start of the queues list:

{error_logger,{{2011,5,16},{18,55,20}},std_error,"File operation error: eacces. Target: /etc/hosts. Function: get_file. Process: inet_db."}

=ERROR REPORT==== 16-May-2011::18:55:20 === File operation error: eacces. Target: /etc/hosts. Function: get_file. Process: inet_db.

Is this a problem?

Cheers Diego


You received this question notification because you are a member of Nova Core, which is an answer contact for OpenStack Compute (nova).

edit flag offensive delete link more
0

answered 2011-05-16 20:16:36 -0500

diego-lalo gravatar image

Vish thank you very much for your help!! I finally discovered what was the problem, on Friday before I left the office I changed the hostname of the cloud controller and I suspect that triggered the problem. I had to reinstall the cloud controller because I changed the hostname to the previous name but I couldn't make the services work so I had no choice. I noticed the hostname change when I saw the services table from nova DB I saw several nova-compute services corresponding to the old hostname and to the new one. So everyone that reads this message remember to NOT CHANGE THE HOSTNAME!!!!

Cheers Diego

edit flag offensive delete link more

Get to know Ask OpenStack

Resources for moderators

Question Tools

1 follower

Stats

Asked: 2011-05-16 17:11:24 -0500

Seen: 84 times

Last updated: May 16 '11