Ask Your Question
0

Fail to run instance on RHEL6.2

asked 2012-09-05 09:48:34 -0500

genggjh gravatar image

I am trying to setup a production env, here is my topology: controller node: mysqld qpidd Horizon (httpd) openstack-nova-api openstack-nova-objectstore openstack-nova-scheduler openstack-nova-cert identity node: openstack-keystone storage node: openstack-glance-registry openstack-glance-api openstack-nova-volume 2 compute nodes: openstack-nova-compute openstack-nova-network openstack-nova-metadata-api

Here is my nova.conf on the compute node:

[DEFAULT] verbose = True osapi_host=172.16.1.20 #rabbit_host=172.16.1.20 qpid_hostname=172.16.1.20 ec2_host=172.16.1.20 image_service=nova.image.glance.GlanceImageService glance_api_servers=172.16.1.28:9292 logdir = /var/log/nova state_path = /var/lib/nova lock_path = /var/lib/nova/tmp dhcpbridge = /usr/bin/nova-dhcpbridge dhcpbridge_flagfile = /etc/nova/nova.conf force_dhcp_release = True injected_network_template = /usr/share/nova/interfaces.template libvirt_xml_template = /usr/share/nova/libvirt.xml.template libvirt_nonblocking = True vpn_client_template = /usr/share/nova/client.ovpn.template credentials_template = /usr/share/nova/novarc.template network_manager = nova.network.manager.FlatDHCPManager iscsi_helper = tgtadm sql_connection = mysql://nova:nova@172.16.1.20/nova iscsi_ip_prefix = 172.16.1.20 connection_type = libvirt firewall_driver = nova.virt.libvirt.firewall.IptablesFirewallDriver rpc_backend = nova.rpc.impl_qpid root_helper = sudo nova-rootwrap auth_strategy = keystone flat_interface = eth0 flat_network_bridge = br100 public_interface = eth1 multi_host = True fixed_range=10.0.0.0/8

VOLUMES

volume_group = nova-volumes volume_name_template = volume-%08x iscsi_helper = tgtadm osapi_volume_listen = 172.16.1.28 libvirt_inject_partition = -1

Here is the output from my glance server:

glance index

ID Name Disk Format Container Format Size


3241935f-d10e-4cbc-ab18-91ab1b28fef0 f16-jeos qcow2 ovf 104258723

I got following error in the compute.log when I try to run instance with the command: nova boot mytest-gjh --flavor 2 --key_name mykey --image 3241935f-d10e-4cbc-ab18-91ab1b28fef0

2012-09-05 17:30:55 DEBUG nova.utils [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] Running cmd (subprocess): mkdir -p /var/lib/nova/instances/instance-00000007/ from (pid=27944) execute /usr/lib/python2.6/site-packages/nova/utils.py:220 2012-09-05 17:30:55 INFO nova.virt.libvirt.connection [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] [instance: 55d36f58-83ec-4fcd-83c9-b8ca54d6a878] Creating image 2012-09-05 17:30:55 DEBUG nova.virt.libvirt.connection [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] block_device_list [] from (pid=27944) _volume_in_mapping /usr/lib/python2.6/site-packages/nova/virt/libvirt/connection.py:1418 2012-09-05 17:30:55 DEBUG nova.utils [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] Running cmd (subprocess): mkdir -p /var/lib/nova/instances/_base from (pid=27944) execute /usr/lib/python2.6/site-packages/nova/utils.py:220 2012-09-05 17:30:55 DEBUG nova.utils [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] Attempting to grab semaphore "478158b1252149ce64d2174ee496caf8bd5f73bf" for method "call_if_not_exists"... from (pid=27944) inner /usr/lib/python2.6/site-packages/nova/utils.py:928 2012-09-05 17:30:55 DEBUG nova.utils [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] Got semaphore "478158b1252149ce64d2174ee496caf8bd5f73bf" for method "call_if_not_exists"... from (pid=27944) inner /usr/lib/python2.6/site-packages/nova/utils.py:932 2012-09-05 17:30:56 DEBUG nova.utils [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] Running cmd (subprocess): env LC_ALL=C LANG=C qemu-img info /var/lib/nova/instances/_base/478158b1252149ce64d2174ee496caf8bd5f73bf.part from (pid=27944) execute /usr/lib/python2.6/site-packages/nova/utils.py:220 2012-09-05 17:30:56 DEBUG nova.virt.images [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] 3241935f-d10e-4cbc-ab18-91ab1b28fef0 was qcow2, converting to raw from (pid=27944) fetch_to_raw /usr/lib/python2.6/site-packages/nova/virt/images.py:96 2012-09-05 17:30:56 DEBUG nova.utils [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] Running cmd (subprocess ...

(more)
edit retag flag offensive close merge delete

11 answers

Sort by ยป oldest newest most voted
0

answered 2012-09-06 03:13:48 -0500

genggjh gravatar image

Very thanks, after I registered the new image, the instance can be launched. Bug #1046643 is opened.

edit flag offensive delete link more
0

answered 2012-09-05 09:51:21 -0500

genggjh gravatar image

My version info:

nova-manage version

2012.1.1 (2012.1.1-LOCALBRANCH:LOCALREVISION

rpm -qa |grep glance

python-glance-2012.1.1-1.el6.noarch openstack-glance-2012.1.1-1.el6.noarch

edit flag offensive delete link more
0

answered 2012-09-05 10:04:02 -0500

Note those packages remove the .part file on any error so that stale files are not left around. Perhaps your system ran out of space? I can't tell though, because you seem to have left out the "Exit code:" and "Stderr:" portions of the trace.

edit flag offensive delete link more
0

answered 2012-09-05 10:27:07 -0500

genggjh gravatar image

Here are the rc and stderr:

2012-09-05 17:30:59 TRACE nova.utils Exit code: 1 2012-09-05 17:30:59 TRACE nova.utils Stdout: '' 2012-09-05 17:30:59 TRACE nova.utils Stderr: 'qemu-img: error while reading sector 1052672: Input/output error\n' 2012-09-05 17:30:59 TRACE nova.utils 2012-09-05 17:30:59 ERROR nova.rpc.amqp [req-31714935-3df3-41af-a95c-fe9e8bd60d04 4ccef92427bc437a8b4f1197155f5ccc 5843e8eec3ec496ead2e35881a15ee56] Exception during message handling

And df output: [root@gn03idp035 nova]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 49G 1.9G 44G 5% / tmpfs 56G 0 56G 0% /dev/shm /dev/sda3 1.1T 200M 990G 1% /iaas There are still 44G available under /.

BTW, can I specify the .part files save to another location?

edit flag offensive delete link more
0

answered 2012-09-05 11:11:51 -0500

So it's not a space issue. qemu-img is getting EIO. This could be due to any of:

  • disk failing
  • qemu-img bug: http://lists.gnu.org/archive/html/qemu-devel/2011-10/msg00222.html (http://lists.gnu.org/archive/html/qem...)
  • corrupted qcow2 image in glance
  • corrupted qcow2 image transfer from glance

That last possibility is eerily similar to: https://bugzilla.redhat.com/show_bug.cgi?id=832449 (https://bugzilla.redhat.com/show_bug....)

How reproducible is this for you?

If possible could you verify the image is OK in glance using something like: qemu-img check /var/lib/glance/images/$the_image

If that's Ok then would it be possible to add debugging to the "qemu-img convert" line (99) in: /usr/lib/python2.6/site-packages/nova/virt/images.py and restarting the openstack-nova-compute service? Something along the lines of comment 16 in the bug above, i.e.

try: utils.execute('qemu-img', 'convert', '-O', 'raw', path_tmp, staged) except: glance_path = os.path.join('/var/lib/glance/images', image_href.split('/')[-1]) out, err = utils.execute('cmp', path_tmp, glance_path) LOG.error("comparing fetched image %s to master %s: %s" % (path_tmp, glance_path, out))

thanks!

edit flag offensive delete link more
0

answered 2012-09-05 11:12:15 -0500

eglynn gravatar image

And I can not find /var/lib/nova/instances/_base/478158b1252149ce64d2174ee496caf8bd5f73bf.part this file on my compute node, does that mean my image is not downloaded from the glance server?

No, I think this is because the nova image code is written to remove the part file if the image conversion fails.

It would be useful to leave the .part file around for diagnostic purposes, to see whether it was truncated or corrupted or simply empty.

You could acheive this by temporarily commenting out the line:

os.unlink(path_tmp)

that closely follows line 99 in /usr/lib/python2.6/site-packages/nova/virt/images.py and then restarting the nova-compute service.

Can you try that and report back?

edit flag offensive delete link more
0

answered 2012-09-05 11:22:11 -0500

eglynn gravatar image

Ignore my reference to the unlink call, that was in a previous version of the code.

The part file removal is now handled by utils.remove_path_on_error.

edit flag offensive delete link more
0

answered 2012-09-05 12:00:25 -0500

genggjh gravatar image

I got error when check the image:

[root@gn02idp028 iaas]# qemu-img check /iaas/glance/images/092fe525-f647-49a8-88f7-3ae82537e55c Warning: cluster offset=0x6370000 is after the end of the image file, can't properly check refcounts. Warning: cluster offset=0x6370000 is after the end of the image file, can't properly check refcounts. ..... Warning: cluster offset=0xcba0000 is after the end of the image file, can't properly check refcounts.

64 internal errors have occurred during the check.

An error has occurred during the check: Success The check is not complete and may have missed error.

Very thanks for your support, I will download the image and try again.

It would be better if this problem can be caught when I try to add image through glance.

Anyway, I will update this question once I get the new image.

edit flag offensive delete link more
0

answered 2012-09-05 12:19:19 -0500

eglynn gravatar image

It would be better if this problem can be caught when I try to add image through glance.

Fair point, but as things currently stand, glance is completely agnostic about image content ... it doesn't perform any kind of validity checks, and instead assumes the error will emerge if and when the image is used.

edit flag offensive delete link more
0

answered 2012-09-05 12:29:12 -0500

genggjh gravatar image

Right, at least we can check the image before covert it in the nova-compute.

edit flag offensive delete link more

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Get to know Ask OpenStack

Resources for moderators

Question Tools

1 follower

Stats

Asked: 2012-09-05 09:48:34 -0500

Seen: 98 times

Last updated: Sep 06 '12