Baremetal provisioning failing due to tgtd: conn_close (101)

asked 2013-04-18 13:13:32 -0600

pragadeeswaran

Hi,

When provisioning a physical server from OpenStack with nova-baremetal, I received the error below:

daemon.err tgtd: conn_close (101) connection closed, 0x7c6f88 1

(error seen in the server console).

The physical server is able to get a DHCP IP address and the required images are transferred. The error appears when the server reaches the 'request boot server to deploy image' step. tcpdump indicates that the image is being transferred.

Screenshot of failure @ http://i46.tinypic.com/34gkk9j.jpg

Setup details:
Operating System: Ubuntu 12.04.2 LTS
Single node
tgt: 1.0.17
Quantum: Internal network for VMs + external network for physical servers
Database: psql (PostgreSQL) 9.1.9
Deploy images: built using the steps available at https://wiki.openstack.org/wiki/Baremetal

Installed packages:
ii  glance  1:2013.1-0ubuntu1~cloud0  OpenStack Image Registry and Delivery Service - Daemons
ii  glance-api  1:2013.1-0ubuntu1~cloud0  OpenStack Image Registry and Delivery Service - API
ii  glance-common  1:2013.1-0ubuntu1~cloud0  OpenStack Image Registry and Delivery Service - Common
ii  glance-registry  1:2013.1-0ubuntu1~cloud0  OpenStack Image Registry and Delivery Service - Registry
ii  keystone  1:2013.1-0ubuntu1~cloud0  OpenStack identity service - Daemons
ii  nova-api  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - API frontend
ii  nova-cert  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - certificate management
ii  nova-common  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - common files
ii  nova-compute  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - compute node
ii  nova-compute-kvm  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - compute node (KVM)
ii  nova-conductor  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - conductor service
ii  nova-consoleauth  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - Console Authenticator
ii  nova-doc  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - documentation
ii  nova-novncproxy  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - NoVNC proxy
ii  nova-scheduler  1:2013.1-0ubuntu1~cloud0  OpenStack Compute - virtual machine scheduler
ii  python-cinderclient  1:1.0.3-0ubuntu1~cloud0  python bindings to the OpenStack Volume API
ii  python-django-horizon  1:2013.1-0ubuntu2~cloud0  Django module providing web based interaction with OpenStack
ii  python-glance  1:2013.1-0ubuntu1~cloud0  OpenStack Image Registry and Delivery Service - Python library
ii  python-keystone  1:2013.1-0ubuntu1~cloud0  OpenStack identity service - Python library
ii  python-keystoneclient  1:0.2.3-0ubuntu1~cloud0  Client library for OpenStack Identity API
ii  python-nova  1:2013.1-0ubuntu1~cloud0  OpenStack Compute Python libraries
ii  python-novaclient  1:2.13.0-0ubuntu1~cloud0  client library for OpenStack Compute API
ii  python-oslo.config  1:1.1.0-0ubuntu1~cloud0  OpenStack Oslo Configuration API
ii  nova-baremetal  1:2013.1-0ubuntu1~cloud0  Openstack Compute - baremetal virt

nova-baremetal.log:
2013-04-18 15:09:44.655 1627 INFO nova.virt.baremetal.deploy_helper [-] start deployment for node 1, params {'swap_mb': 1, 'iqn': 'iqn-9c4e7076-41e6-4047-aa96-98469904776a', 'image_path': u'/var/lib/nova/instances/instance-00000006/disk', 'address': '192.168.0.101', 'pxe_config_path': u'/tftpboot/9c4e7076-41e6-4047-aa96-98469904776a/config', 'port': '3260', 'lun': '1', 'root_mb': 40960}
2013-04-18 15:09:52.104 ERROR nova.virt.baremetal.deploy_helper [req-428e1684-f142-4653-a43a-0fb4d61e263d None None] deployment to node 1 failed

The same image works with DevStack, so I don't think the problem is with the images being used.

Regards, Satya

8 answers

answered 2013-06-06 04:36:03 -0600

pragadeeswaran

The issue was with the way I built the deploy ramdisk image. I had run

/diskimage-builder/bin/ramdisk-image-create deploy -k uname -r -o deploy-ramdisk

instead of

/diskimage-builder/bin/ramdisk-image-create deploy -k 3.5.0-23-generic -o deploy-ramdisk

I don't know why the former command didn't work (or was I being stupid?).
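One possible explanation, offered only as an assumption since nobody in this thread confirms it: if the first command was run exactly as written, with no backticks or $( ) around uname -r, then -k would have received the literal word "uname" and -r would have been parsed as a separate option. (It is equally possible that backticks were typed and the forum formatting dropped them, in which case this does not apply.) The sketch below uses a hypothetical stand-in function, show_kernel_arg, rather than the real ramdisk-image-create, purely to show how the shell hands over the arguments in each form.

```shell
#!/bin/sh
# Hypothetical stand-in (not part of diskimage-builder): prints the value
# that follows -k, the way a simple option parser would see it.
show_kernel_arg() {
    while [ "$#" -gt 0 ]; do
        if [ "$1" = "-k" ]; then
            printf '%s\n' "$2"
            return 0
        fi
        shift
    done
    return 1
}

# Without command substitution, -k gets the literal word "uname".
literal=$(show_kernel_arg deploy -k uname -r -o deploy-ramdisk)

# With command substitution, -k gets the running kernel release.
substituted=$(show_kernel_arg deploy -k "$(uname -r)" -o deploy-ramdisk)

echo "literal:     $literal"
echo "substituted: $substituted"
```

Passing an explicit version, as in the second command above, also avoids depending on whichever kernel the build host happens to be running.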

The node has 72GB of local storage.

I don't know whether the above solved the issue or whether emo94545's suggestion helped, as both changes were made. Thanks, emo94545 and Devananda.

answered 2013-06-05 18:48:59 -0600

Does the baremetal node being deployed to have sufficient local storage space? The flavor specification indicates a root volume size of 40GB, which baremetal-deploy-helper is attempting to create. This error could be the result of the flavor definition not matching the actual hardware specs.

For reference, this error comes from here: https://github.com/openstack/nova/blob/master/nova/cmd/baremetal_deploy_helper.py#L83
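To sanity-check the storage question raised in this answer, the arithmetic can be sketched in shell. The root_mb and swap_mb values are the ones from the "start deployment" log line in the question; disk_gb is a placeholder (72 is the figure the asker reports elsewhere in this thread) and should be replaced with the node's actual disk size. The check is deliberately rough and ignores partition table overhead.

```shell
#!/bin/sh
# Values from the "start deployment" log line in the question.
root_mb=40960
swap_mb=1

# Placeholder (assumption): the node's local disk size in GB.
disk_gb=72

needed_mb=$((root_mb + swap_mb))
available_mb=$((disk_gb * 1024))

if [ "$available_mb" -ge "$needed_mb" ]; then
    result="fits"
else
    result="too small"
fi
echo "need ${needed_mb} MB, have ${available_mb} MB: ${result}"
```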

answered 2013-06-05 09:27:08 -0600

pragadeeswaran

nova-baremetal-deploy-helper.log:
2013-06-05 14:50:47.809 DEBUG nova.utils [req-75fe51d2-ff92-440d-96ee-029a6296188b None None] Result was 0 execute /usr/lib/python2.7/dist-packages/nova/utils.py:232
2013-06-05 14:50:50.810 DEBUG nova.utils [req-75fe51d2-ff92-440d-96ee-029a6296188b None None] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf fdisk /dev/disk/by-path/ip-192.168.124.151:3260-iscsi-iqn-cb516846-ca8e-4014-a668-be484b88144e-lun-1 execute /usr/lib/python2.7/dist-packages/nova/utils.py:208
2013-06-05 14:50:50.881 DEBUG nova.utils [req-75fe51d2-ff92-440d-96ee-029a6296188b None None] Result was 0 execute /usr/lib/python2.7/dist-packages/nova/utils.py:232
2013-06-05 14:50:53.882 DEBUG nova.utils [req-75fe51d2-ff92-440d-96ee-029a6296188b None None] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf fdisk /dev/disk/by-path/ip-192.168.124.151:3260-iscsi-iqn-cb516846-ca8e-4014-a668-be484b88144e-lun-1 execute /usr/lib/python2.7/dist-packages/nova/utils.py:208
2013-06-05 14:50:53.950 DEBUG nova.utils [req-75fe51d2-ff92-440d-96ee-029a6296188b None None] Result was 1 execute /usr/lib/python2.7/dist-packages/nova/utils.py:232
2013-06-05 14:50:53.951 DEBUG nova.utils [req-75fe51d2-ff92-440d-96ee-029a6296188b None None] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf iscsiadm -m node -p 192.168.124.151:3260 -T iqn-cb516846-ca8e-4014-a668-be484b88144e --logout execute /usr/lib/python2.7/dist-packages/nova/utils.py:208
2013-06-05 14:50:54.513 DEBUG nova.utils [req-75fe51d2-ff92-440d-96ee-029a6296188b None None] Result was 0 execute /usr/lib/python2.7/dist-packages/nova/utils.py:232
2013-06-05 14:50:54.514 ERROR nova.virt.baremetal.deploy_helper [req-75fe51d2-ff92-440d-96ee-029a6296188b None None] deployment to node 1 failed

answered 2013-06-05 09:21:45 -0600

pragadeeswaran

Devananda: I don't see any 'Deleting orphan compute node' message. I searched for 'orphan' in the nova-compute log and found nothing, even though logging is set to debug.

The error message seen in nova-scheduler.log

nova-scheduler.log 2013-06-05 14:32:09.725 15248 DEBUG nova.openstack.common.rpc.amqp [-] received {u'_context_roles': [u'admin', u'_member_'], u'_context_request_id': u'req-eb27bc47-809c-4d97-a4f9-4d27c73dcdc1', u'_context_quota_class': None, u'_context_project_name': u'bay3', u'_context_service_catalog': [{u'endpoints_links': [], u'endpoints': [{u'adminURL': u'http://10.100.10.28:8776/v1/639ad43328ef4c0185cc1d22bd55242b', u'region': u'CSALab', u'publicURL': u'http://10.100.10.28:8776/v1/639ad43328ef4c0185cc1d22bd55242b', u'internalURL': u'http://192.168.124.28:8776/v1/639ad43328ef4c0185cc1d22bd55242b', u'id': u'84e4b5032c9a4bbebd56af2b0d491017'}], u'type': u'volume', u'name': u'cinder'}], u'_context_user_name': u'admin', u'_context_auth_token': '<sanitized>', u'args': {u'request_spec': {u'block_device_mapping': [], u'image': {u'status': u'active', u'name': u'bm_image', u'deleted': False, u'container_format': u'bare', u'created_at': u'2013-06-03T14:20:25.150173', u'disk_format': u'qcow2', u'updated_at': u'2013-06-03T14:20:33.231722', u'properties': {u'kernel_id': u'4b9fabfa-4ee0-42bc-9092-d280cbd7a43f', u'ramdisk_id': u'9f413b3e-77a5-4f1a-b11d-3551585f62a6'}, u'min_disk': 0, u'min_ram': 0, u'checksum': u'9f3ee9af781eb2063aa78b3a12668259', u'owner': u'639ad43328ef4c0185cc1d22bd55242b', u'is_public': True, u'deleted_at': None, u'id': u'e094f3ed-6210-48b2-8a22-b3dd9b4da8c5', u'size': 938541056}, u'instance_type': {u'memory_mb': 1024, u'root_gb': 40, u'deleted_at': None, u'name': u'bm.tiny', u'deleted': 0, u'created_at': u'2013-06-05T04:51:00.025062', u'ephemeral_gb': 0, u'updated_at': None, u'disabled': False, u'vcpus': 2, u'extra_specs': {u'cpu_arch': u'x86_64', u'baremetal:deploy_kernel_id': u'dd76ebdb-13cb-47df-bf41-8a2989940c63', u'baremetal:deploy_ramdisk_id': u'8dfee7fe-4c84-4339-9d62-7c5d296e819e'}, u'swap': 0, u'rxtx_factor': 1.0, u'is_public': True, u'flavorid': u'18a35897-272d-4c41-921a-60b982b5a21b', u'vcpu_weight': None, u'id': 7}, u'instance_properties': 
{u'vm_state': u'building', u'availability_zone': None, u'launch_time': u'2013-06-05T08:59:15Z', u'ramdisk_id': u'9f413b3e-77a5-4f1a-b11d-3551585f62a6', u'instance_type_id': 7, u'user_data': None, u'vm_mode': None, u'reservation_id': u'r-6v3lg41l', u'system_metadata': {u'image_kernel_id': u'4b9fabfa-4ee0-42bc-9092-d280cbd7a43f', u'instance_type_memory_mb': 1024, u'instance_type_swap': 0, u'instance_type_vcpu_weight': None, u'instance_type_root_gb': 40, u'instance_type_id': 7, u'image_ramdisk_id': u'9f413b3e-77a5-4f1a-b11d-3551585f62a6', u'instance_type_name': u'bm.tiny', u'instance_type_ephemeral_gb': 0, u'instance_type_rxtx_factor': 1.0, u'instance_type_flavorid': u'18a35897-272d-4c41-921a-60b982b5a21b', u'instance_type_vcpus': 2, u'image_base_image_ref': u'e094f3ed-6210-48b2-8a22-b3dd9b4da8c5'}, u'user_id': u'1f87e402ca544c468d922e6118729491', u'display_description': u's4', u'key_data': u'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCy//OMY4Xy1UIt10lvAFTAAq5uvG0sKotiOTcDiT3vjGufAuMaIlj+uJIbMzokOInPYoCi8sM73MvQZIfGr9psfwleqvar9Xi/schXlYxYJIVfjdw/oQDhU0fXu7JxUmMtqBEl5iz+JR4VmSRqvsk0oOUr/AgSjf3pkT9jLrJPMwWaeVEdfN4j+gi5OHq6tA0Ih7Gbd71uzlXko8lA5kDiIvp4mceme/t0io3dSUiYcWedq5TCdYcQrO2zxeZIPl3phzkuI0t3oEbKNfBirGuo1DJMcG1pdbmQNGKNfbNRlOdGjYJkBuFG3OG2dZSthYrRirYuCxYnmE1mkUYGXV1d Generated by Nova\n', u'power_state': 0, u'progress': 0, u'project_id': u'639ad43328ef4c0185cc1d22bd55242b', u'config_drive': u'', u'ephemeral_gb': 0, u'access_ip_v6': None, u'access_ip_v4': None, u'kernel_id': u'4b9fabfa-4ee0-42bc-9092-d280cbd7a43f', u'key_name': u'myKeyPair', u'display_name': u's4', u'config_drive_id': u'', u'architecture': None, u'root_gb': 40, u'locked': False, u'launch_index': 0, u'memory_mb': 1024, u'vcpus': 2, u'image_ref': u'e094f3ed-6210-48b2-8a22-b3dd9b4da8c5', u'root_device_name': None, u'auto_disk_config': None, u'os_type': None, u'metadata': {}}, u'security_group': [u'default'], u'instance_uuids': [u'10d4767a-4f54-447b-a42d-7ce9bde22199']}, u'is_first_time': True, 
u'filter_properties': {u'config_options': {}, u'limits': {u'memory_mb': 3072.0}, u'request_spec': {u'block_device_mapping': [], u'image': {u'status': u'active', u'name': u'bm_image', u'deleted': False, u'container_format': u'bare', u'created_at': u'2013-06-03T14:20:25.150173 ...

answered 2013-05-24 23:12:31 -0600

A patch was merged on May 10th that addressed an issue with baremetal-deploy-helper's rootwrap filters: https://review.openstack.org/#/c/28783/2

emo94545: does the above patch include the fix you mentioned? If not, would you mind opening a new bug report?

Thanks, D

answered 2013-05-17 09:11:40 -0600

This question was expired because it remained in the 'Needs information' state without activity for the last 15 days.

answered 2013-05-02 00:24:26 -0600

Hi,

Please take a look at the nova-compute log. If, after a deployment starts, you see "Deleting orphan compute node", followed by an error from the PXE deployment driver, then it is a known bug (1174952). If you see a different error, please post it here. Thanks!
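A quick way to run this check is to grep the log for the message. The sketch below rests on an assumption: the default path /var/log/nova/nova-compute.log is a guess at a typical packaged install, and the LOG variable is a hypothetical override for setups that log elsewhere.

```shell
#!/bin/sh
# Assumed default log location; override by setting LOG before running.
LOG="${LOG:-/var/log/nova/nova-compute.log}"

# Returns 0 if the message tied to the known bug appears in the given file.
has_orphan_message() {
    grep -q "Deleting orphan compute node" "$1" 2>/dev/null
}

if has_orphan_message "$LOG"; then
    echo "found 'Deleting orphan compute node' in $LOG"
else
    echo "message not found in $LOG"
fi
```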

answered 2013-05-23 23:09:28 -0600

I had exactly the same issue and resolved it by adding the following line to /etc/nova/rootwrap.d/compute.filters, under the fdisk command filter:

blkid: CommandFilter, /sbin/blkid, root

Restart the baremetal service and it will work :)
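To confirm the entry is in place after editing, a check along these lines could be used. This is only a sketch: it builds a sample file in a temporary location instead of reading /etc/nova/rootwrap.d/compute.filters, and the [Filters] header and the fdisk line are assumptions about what the surrounding file contains. Only the blkid line is taken from this answer. The helper name has_blkid_filter is hypothetical.

```shell
#!/bin/sh
# Build a sample filters file in a temp location (nothing under /etc is modified).
sample=$(mktemp)
cat > "$sample" <<'EOF'
[Filters]
# Assumed existing entry for fdisk.
fdisk: CommandFilter, /sbin/fdisk, root
# Entry added per this answer.
blkid: CommandFilter, /sbin/blkid, root
EOF

# Returns 0 if the given filters file has a blkid CommandFilter entry.
has_blkid_filter() {
    grep -Eq '^blkid:[[:space:]]*CommandFilter' "$1"
}

if has_blkid_filter "$sample"; then
    echo "blkid filter present"
else
    echo "blkid filter missing"
fi
```

To check a real installation, the same function can be pointed at the actual filters file instead of the sample.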
