Multi-node install: problem spawning an instance

asked 2011-05-16 14:05:15 -0600

raphael-g

Hi,

I tried to run nova on multiple nodes:

I've heard about the refactoring of the objectstore; the existing S3ImageService in Bexar is totally different from the one in Cactus. It is now impossible to run the objectstore on a different node than the one running the compute service (am I wrong?)

However, if we are using Glance, we should not have this problem (running the image storage and the compute services on two different nodes), should we?

So I tried to use Glance as my image service: I use the 'file' storage as my default_storage.

I downloaded the image http://smoser.brickies.net/ubuntu/ttylinux-uec/ttylinux-uec-amd64-12.1_2.6.35-22_1.tar.gz
How can I register this image with Glance so that it is available to all my compute nodes?

Indeed, I tried several things, but it seems that I am still failing at the spawning step on my compute node.

Here are the two options that I tried:

1. #tar xzvf ttylinux-uec-amd64-12.1_2.6.35-22_1.tar.gz
   #glance add name="image-test" is_public=True location=file:///root/workdir/venv/firstbranch/ttylinux-uec-amd64-12.1_2.6.35-22_1.img disk_format=ami container_format=ami
   #glance add name="image-test-kernel" is_public=True location=file:///root/workdir/venv/firstbranch/ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz disk_format=aki container_format=aki
   #glance add name="image-test-rd" is_public=True location=file:///root/workdir/venv/firstbranch/ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz disk_format=ari container_format=ari

2. #glance add name="image-test2" is_public=True location=file:///root/workdir/venv/firstbranch/ttylinux-uec-amd64-12.1_2.6.35-22_1.tar.gz

Here is what I obtain when running euca-describe-images or glance details:

#glance details

URI: http://0.0.0.0/images/6
Id: 6
Public: Yes
Name: image-test
Size: 0
Location: file:///root/workdir/venv/firstbranch/ttylinux-uec-amd64-12.1_2.6.35-22_1.img
Disk format: ami
Container format: ami

URI: http://0.0.0.0/images/7
Id: 7
Public: Yes
Name: image-test2
Size: 0
Location: file:///root/workdir/venv/firstbranch/ttylinux-uec-amd64-12.1_2.6.35-22_1.tar.gz
Disk format: raw
Container format: ovf

URI: http://0.0.0.0/images/8
Id: 8
Public: Yes
Name: image-test-kernel
Size: 0
Location: file:///root/workdir/venv/firstbranch/ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz
Disk format: aki
Container format: aki

URI: http://0.0.0.0/images/9
Id: 9
Public: Yes
Name: image-test-rd
Size: 0
Location: file:///root/workdir/venv/firstbranch/ttylinux-uec-amd64-12.1_2.6.35-22_1-initrd
Disk format: ari
Container format: ari

#euca-describe-images
IMAGE ami-00000006 None (image-test) available public machine
IMAGE ami-00000007 None (image-test2) available public machine
IMAGE aki-00000008 None (image-test-kernel) available public kernel
IMAGE ari-00000009 None (image-test-rd) available public ramdisk

I tried to run an instance in the following ways:

#euca-run-instances ami-00000007 -t m1.small (the one which I registered as a raw disk from the tar.gz)
#euca-run-instances ami-00000006 --kernel aki-00000008 --ramdisk ari-00000009 -t m1.small (the one which I registered as a 3-part machine with the ami, aki, ari)

Each time I got something like:

#euca-describe-instances
INSTANCE i-00000015 ami-00000006 172.24.0.8 172.24.0.8 shutdown None (project ... (more)


5 answers


answered 2011-05-16 16:49:37 -0600

vishvananda

On May 16, 2011, at 7:06 AM, Raphael.G wrote:

New question #157748 on OpenStack Compute (nova): https://answers.launchpad.net/nova/+q...

Hi,

I tried to run nova on multiple nodes:

I've heard about the refactoring of the objectstore; the existing S3ImageService in Bexar is totally different from the one in Cactus. It is now impossible to run the objectstore on a different node than the one running the compute service (am I wrong?)

This is incorrect. In Bexar, objectstore had to run on the same node as the api. Now objectstore only handles s3 duties, so it can run anywhere.

However, if we are using Glance, we should not have this problem (running the image storage and the compute services on two different nodes), should we?

So I tried to use Glance as my image service: I use the 'file' storage as my default_storage.

I downloaded the image http://smoser.brickies.net/ubuntu/tty...
How can I register this image with Glance so that it is available to all my compute nodes?

Indeed, I tried several things, but it seems that I am still failing at the spawning step on my compute node.

It is possible to get method 1 to work, but you will have to manually specify which kernel and ramdisk to use. There is an easier way to register images using nova-manage (from the firstbranch dir):

nova-manage image all_register /ttylinux-uec-amd64-12.1_2.6.35-22_1.img ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz ttylinux-uec-amd64-12.1_2.6.35-22_1-initrd <project_id>

You can also use the uec commands to upload the gz. In all of these cases you have to make sure nova is set up to use glance with the following flags: --image_service=nova.image.glance.GlanceImageService --glance_host=<ip of host where glance is running>
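For reference, a minimal sketch of what those two flags could look like in the flag file of each nova service; the file path and the IP address below are placeholders, not taken from this setup:

# e.g. /etc/nova/nova.conf on every nova node
--image_service=nova.image.glance.GlanceImageService
--glance_host=192.168.1.10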


answered 2011-05-16 17:33:29 -0600

raphael-g

Thanks for your answer Vish.

When you say that the objectstore can be run from anywhere, you mean provided that we use Glance, don't you? The default image service 'LocalStorageService' cannot be run from anywhere, can it (is it made for deployment on one node)?

We have stepped forward a bit:

  1. We set the image_service flag to nova.image.s3.S3ImageService for the objectstore service and kept it at nova.image.glance.GlanceImageService for all the other services. (If we set GlanceImageService for all nodes, the uec-publish-tarball procedure fails.)

  2. We ran the uec-publish-tarball procedure and it seemed to work:

#uec-publish-tarball ttylinux-uec-amd64-12.1_2.6.35-22_1.tar.gz firstbucket
Mon May 16 19:06:44 CEST 2011: ====== extracting image ======
kernel : ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz
ramdisk: ttylinux-uec-amd64-12.1_2.6.35-22_1-initrd
image : ttylinux-uec-amd64-12.1_2.6.35-22_1.img
Mon May 16 19:06:45 CEST 2011: ====== bundle/upload kernel ======
Mon May 16 19:06:46 CEST 2011: ====== bundle/upload ramdisk ======
Mon May 16 19:06:47 CEST 2011: ====== bundle/upload image ======
Mon May 16 19:06:51 CEST 2011: ====== done ======
emi="ami-0000000f"; eri="ari-0000000e"; eki="aki-0000000d";

Now, when calling run-instances, a kernel and ramdisk have been fetched to the compute node (no null size anymore):

#cd $instances_path/instance-0xxx
#ll
total 18M
-rw-r----- 1 root root    0 2011-05-16 19:13 console.log
-rw-r--r-- 1 root root 8.0M 2011-05-16 19:13 disk
-rw-r--r-- 1 root root 8.0M 2011-05-16 19:13 disk.local
-rw-r--r-- 1 root root 4.3M 2011-05-16 19:13 kernel
-rw-r--r-- 1 root root 2.0K 2011-05-16 19:13 libvirt.xml
-rw-r--r-- 1 root root 5.7M 2011-05-16 19:13 ramdisk

But we still have a problem spawning the instance. It looks like it is not only an image registry problem after all... The logs are still the ones reported in the previous post.

Any idea about what it could be?


answered 2011-05-16 17:39:42 -0600

vishvananda

On May 16, 2011, at 10:35 AM, Raphael.G wrote:

Question #157748 on OpenStack Compute (nova) changed: https://answers.launchpad.net/nova/+q...


Raphael.G is still having a problem: Thanks for your answer Vish.

When you say that the objectstore can be run from anywhere, you mean provided that we use Glance, don't you? The default image service 'LocalStorageService' cannot be run from anywhere, can it (is it made for deployment on one node)?

The objectstore can run from anywhere regardless. If you have more than one node, you will need to use glance. (You can also share the images dir to all nodes via nfs using Local service, but that is not recommended)
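Purely as an illustration of that NFS option, and assuming the local image service keeps its images under something like /var/lib/nova/images (which may not match your images_path): an /etc/exports entry on the node holding the images such as

/var/lib/nova/images 192.168.1.0/24(ro,sync,no_subtree_check)

plus, on each compute node,

#mount -t nfs imagehost:/var/lib/nova/images /var/lib/nova/images

would share the directory, but as said above this is not the recommended setup.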

We have stepped forward a bit:

  1. We set the image_service flag to nova.image.s3.S3ImageService for the objectstore service and kept it at nova.image.glance.GlanceImageService for all the other services. (If we set GlanceImageService for all nodes, the uec-publish-tarball procedure fails.)

Hmm, that should not be the case. You should never set s3 image service.

  2. We ran the uec-publish-tarball procedure and it seemed to work:

#uec-publish-tarball ttylinux-uec-amd64-12.1_2.6.35-22_1.tar.gz firstbucket
Mon May 16 19:06:44 CEST 2011: ====== extracting image ======
kernel : ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz
ramdisk: ttylinux-uec-amd64-12.1_2.6.35-22_1-initrd
image : ttylinux-uec-amd64-12.1_2.6.35-22_1.img
Mon May 16 19:06:45 CEST 2011: ====== bundle/upload kernel ======
Mon May 16 19:06:46 CEST 2011: ====== bundle/upload ramdisk ======
Mon May 16 19:06:47 CEST 2011: ====== bundle/upload image ======
Mon May 16 19:06:51 CEST 2011: ====== done ======
emi="ami-0000000f"; eri="ari-0000000e"; eki="aki-0000000d";

Now, when calling run-instances, a kernel and ramdisk have been fetched to the compute node (no null size anymore):

#cd $instances_path/instance-0xxx
#ll
total 18M
-rw-r----- 1 root root    0 2011-05-16 19:13 console.log
-rw-r--r-- 1 root root 8.0M 2011-05-16 19:13 disk
-rw-r--r-- 1 root root 8.0M 2011-05-16 19:13 disk.local
-rw-r--r-- 1 root root 4.3M 2011-05-16 19:13 kernel
-rw-r--r-- 1 root root 2.0K 2011-05-16 19:13 libvirt.xml
-rw-r--r-- 1 root root 5.7M 2011-05-16 19:13 ramdisk

But we still have a problem spawning the instance. It looks like it is not only an image registry problem after all... The logs are still the ones reported in the previous post.

Any idea about what it could be?

There may be a cached image on the compute node that is breaking things. Terminate the instance and clean out /var/lib/nova/instances/_base.
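Concretely, that would be something like the following, with the instance id from the earlier euca-describe-instances output used only as an example:

#euca-terminate-instances i-00000015
#rm -rf /var/lib/nova/instances/_base/*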



answered 2011-05-17 09:12:44 -0600

raphael-g

Ok

Thanks for everything.


1) Concerning uec-publish-tarball

You are right: we tried going back to the original GlanceImageService flag for the objectstore, and uec-publish-tarball worked too.

So unfortunately I don't know the real reason why uec-publish-tarball kept getting stuck at the bundle/upload kernel step before... (did we change something that I forgot to notice?) Previously we got the following error:

Mon May 16 11:27:52 CEST 2011: ====== extracting image ======
kernel : ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz
ramdisk: ttylinux-uec-amd64-12.1_2.6.35-22_1-initrd
image : ttylinux-uec-amd64-12.1_2.6.35-22_1.img
Mon May 16 11:27:52 CEST 2011: ====== bundle/upload kernel ======

failed to register ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz.manifest.xml
failed: euca-register first-bucket/ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz.manifest.xml
UnknownError: An unknown error has occurred. Please try your request again.
failed to upload kernel

There is still a weird thing though:

glance details and euca-describe-images show quite similar results (images available):

IMAGE aki-00000013 buck/ttylinux-uec-amd64-12.1_2.6.35-22_1-vmlinuz.manifest.xml available public x86_64 kernel
IMAGE ari-00000014 buck/ttylinux-uec-amd64-12.1_2.6.35-22_1-initrd.manifest.xml available public x86_64 ramdisk
IMAGE ami-00000015 buck/ttylinux-uec-amd64-12.1_2.6.35-22_1.img.manifest.xml available public x86_64 machine aki-00000013 ari-00000014

But when we try with the nova client, we obtain the following result:

+----+------+--------+
| ID | Name | Status |
+----+------+--------+
| 19 | None | QUEUED |
| 20 | None | QUEUED |
| 21 | None | QUEUED |
+----+------+--------+

The name 'None' seems normal (with glance details, the images have no name either), but the status 'queued' seems weird.


2) Concerning the spawning failure

As you told us, we removed the contents of $instances_path/_base. We still obtained the same error. However, we tried with another (virtual) compute node and the spawning step succeeded, so we must have a setup problem on the first (physical) compute node. We'll investigate this further.

Thanks


answered 2011-05-17 14:21:46 -0600

raphael-g

Thanks. The problem had actually already been reported:

https://bugs.launchpad.net/nova/+bug/655217

The issue was due to the fact that I had put the nova sources under /root. I moved them somewhere less restricted and it worked.
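For anyone hitting the same thing: /root is normally mode 700, so the user that libvirt/qemu runs as presumably cannot traverse into it to read the instance files. A rough way to check and work around it, where /opt/workdir is only an example target and any configured paths would need updating afterwards:

#ls -ld /root
#mv /root/workdir /opt/workdir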
