Multi-Zone finally working on ESSEX but can't "nova list" (KeyError: 'uuid') + doubts

asked 2012-01-25 20:57:06 -0500

alejandro-comisario

Hey guys, we finally got multi-zone working on Essex with KVM as the hypervisor (trunk version 2012.1-dev (2012.1-LOCALBRANCH:LOCALREVISION)). We have one parent zone and one child zone (named "zoneE"), all integrated with Keystone (nova, glance), and we are able to spawn instances across both zones! After a couple of days of work, configuration and debugging to get here, we have one doubt and one ERROR. Let's hope you can help us with both.

THE DOUBT: This is definitely because we don't clearly understand the whole concept, but why does the "nova.scheduler.least_cost.weighted_sum" function choose the host with LESS RAM as the preferred host? It seems to pick a single host until it is fully depleted, and only then the other one. That doesn't sound like a good default for balancing instances across the zones, so, since we didn't know how to change that behavior, we just added "reverse=True" to the sorted() calls, and the problem was "fixed". Is there a way (maybe through nova.conf) to control the sort order? Can anyone shed some light on how to do it?

THE ERROR / BUG: As soon as you successfully spawn an instance from the parent zone into the child zone (the instance is spawned and running on the child zone), "nova list" (or a curl to servers/detail) against the parent zone just BREAKS. It appears that when the parent zone calls "servers/detail" on the child zone, the JSON returned by the child has the "uuid" value in the "id" field, so nova-api.log shows a KeyError on 'uuid' that breaks the call. You can check the pastebin link below, or keep reading and look at the nova-api.log stack trace here.
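The failure mode described above can be reproduced in isolation. This is only a minimal sketch: the payload, the placeholder UUID and the helper name server_uuid are hypothetical and are not taken from nova or novaclient. It assumes the child zone returns a server dict whose "id" holds the UUID and which has no "uuid" key at all.

```python
# Hypothetical payload and helper, for illustration only.
# This is NOT the actual nova / novaclient code path.

def server_uuid(server):
    """Return the instance UUID from a server dict.

    Prefers an explicit 'uuid' key and falls back to 'id' when the
    UUID has been placed there instead (the situation described above).
    """
    if "uuid" in server:
        return server["uuid"]
    return server["id"]


# Assumed shape of one entry returned by the child zone: the UUID sits
# in 'id' and there is no 'uuid' key. The UUID value is a placeholder.
child_server = {
    "id": "00000000-0000-0000-0000-000000000001",
    "name": "vm-in-zoneE",
}

# A direct lookup, as a parent zone expecting 'uuid' might do, fails.
try:
    child_server["uuid"]
    failed = False
    missing_key = None
except KeyError as exc:
    failed = True
    missing_key = exc.args[0]

# The defensive lookup tolerates both shapes.
recovered = server_uuid(child_server)
```

A fallback like this only hides the mismatch on the parent side; whether the child's response format or the parent's expectation is the real bug cannot be decided from the log excerpt alone.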

Waiting for some feedback ! Cheers! Alejandro.

PASTEBIN LINK:

http://pastebin.com/VGt8A3HX

ENTIRE NOVA-API STACK TRACE:

2012-01-25 14:42:15,793 DEBUG nova [-] HTTP PERF: 0.10457 seconds to GET 172.16.146.139:35357 /v2.0/tokens/8de5e217-9615-443c-8b7e-09cabf4b4af8) from (pid=7488) getresponse /usr/local/lib/python2.6/dist-packages/keystone-1.0-py2.6.egg/keystone/common/bufferedhttp.py:99
2012-01-25 14:42:15,894 DEBUG nova [-] HTTP PERF: 0.10093 seconds to GET 172.16.146.139:35357 /v2.0/tokens/8de5e217-9615-443c-8b7e-09cabf4b4af8) from (pid=7488) getresponse /usr/local/lib/python2.6/dist-packages/keystone-1.0-py2.6.egg/keystone/common/bufferedhttp.py:99
2012-01-25 14:42:15,897 DEBUG routes.middleware [2d3d5746-6718-4d2f-bb7d-3587cff052db dsmadmin 9574193127d1439fb09d1c0ef3eb26d5] Matched GET /9574193127d1439fb09d1c0ef3eb26d5/servers/detail from (pid=7488) __call__ /usr/lib/pymodules/python2.6/routes/middleware.py:100
2012-01-25 14:42:15,897 DEBUG routes.middleware [2d3d5746-6718-4d2f-bb7d-3587cff052db dsmadmin 9574193127d1439fb09d1c0ef3eb26d5] Route path: '/:(project_id)/servers/:(id)', defaults: {'action': u'process', 'controller': } from (pid=7488) __call__ /usr/lib/pymodules/python2.6/routes/middleware.py:102
2012-01-25 14:42:15,898 DEBUG routes.middleware [2d3d5746-6718-4d2f-bb7d-3587cff052db dsmadmin 9574193127d1439fb09d1c0ef3eb26d5] Match dict: {'action': u'process', 'controller': , 'project_id': u'9574193127d1439fb09d1c0ef3eb26d5', 'id': u'detail'} from (pid=7488) __call__ /usr/lib/pymodules/python2.6/routes/middleware ...


7 answers

answered 2012-01-26 15:20:05 -0500

Zones are going through some radical changes currently.

Specifically, we're planning to use direct Rabbit-to-Rabbit communication between trusted Zones to avoid the complication of changes to the OS API, Keystone and novaclient.

To the user deploying Nova, not much will change: there may be a new service to deploy (a Zones service), but that would be all. To a developer, the code in the OS API will be greatly simplified, and the Distributed Scheduler will be able to focus on single-zone scheduling (vs. doing both zone and host scheduling as it does today).

We'll have more details soon, but we aren't planning on introducing the new stuff until we have a working replacement in place. The default Essex scheduler will largely stay the same, and the filters/weight functions will still carry forward, so any investments there won't be lost.

Stay tuned, we're hoping to get all this in a new blueprint soon.

Hope it helps, Sandy


answered 2012-01-26 12:50:20 -0500

alejandro-comisario

Sandy, Vish!

Thanks for the replies! Let me get to the relevant points.

#1 I totally agree with you guys: the policy for spawning instances may be very specific to each company's strategy. But since you can switch from "Fill First" to "Spread First" just by adding "reverse=True" in "nova.scheduler.least_cost.weighted_sum" and "nova.scheduler.distributed_scheduler._schedule", maybe an option to control it would be a harmless addition. (We are going to have a lot of zones across datacenters, and many different departments are going to create many instances to load-balance their applications, so we really prefer Spread First to ensure high availability of the pools.)

#2 As we are going to test essex-3, I would like to know whether the zones code from Chris Behrens is going to be added in final Essex / milestone 4, so we can keep testing other features, or whether you prefer that we file this as a bug to be fixed, since the code that broke may not see major changes.

Kindest regards !

answered 2012-01-26 15:29:51 -0500

alejandro-comisario

Perfect! We are going to stop looking for bugs then, and wait for the new zones implementation to be committed!

Thanks Sandy.

answered 2012-04-23 14:50:37 -0500

johngarbutt

Zones worked with Diablo, to some extent.

But the Zone code was pulled from the final version of Essex.

Zones are being replaced by cells in Folsom.

Take a look at the summit material: http://comstud.com/FolsomCells.pdf and http://etherpad.openstack.org/FolsomComputeCells

I hope that helps.

answered 2012-01-25 21:03:19 -0500

Nice.

The Less Ram thing is to enforce Fill First (vs Spread First). But obviously each company is going to have their own strategies for how they want it to behave.
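To make the two strategies concrete, here is a minimal sketch of how the sort direction alone switches between them. It is not nova's least_cost code: the Host type, rank_hosts, place and the RAM figures are all hypothetical, and free RAM is assumed to be the only cost that is weighed.

```python
# Hypothetical illustration only; this is not nova's least_cost module.
from typing import NamedTuple


class Host(NamedTuple):
    name: str
    free_ram_mb: int


def rank_hosts(hosts, spread_first=False):
    """Order candidate hosts, preferred host first.

    Fill First: the host with the least free RAM comes first, so one
    host is packed until it is full before the next one is used.
    Spread First: the host with the most free RAM comes first, which is
    the effect of passing reverse=True to sorted().
    """
    return sorted(hosts, key=lambda h: h.free_ram_mb, reverse=spread_first)


def place(hosts, instance_ram_mb, count, spread_first=False):
    """Place `count` identical instances; return the chosen host names."""
    free = {h.name: h.free_ram_mb for h in hosts}
    chosen = []
    for _ in range(count):
        # Only hosts that can still fit the instance are candidates.
        candidates = [
            Host(name, ram) for name, ram in free.items()
            if ram >= instance_ram_mb
        ]
        if not candidates:
            break
        best = rank_hosts(candidates, spread_first)[0]
        free[best.name] -= instance_ram_mb
        chosen.append(best.name)
    return chosen


hosts = [Host("node-a", 4096), Host("node-b", 5120)]

fill = place(hosts, 2048, 3)                       # node-a, node-a, node-b
spread = place(hosts, 2048, 3, spread_first=True)  # node-b, node-a, node-b
```

With Fill First, node-a is used until it cannot fit another instance; with Spread First the placements alternate as free RAM changes, which matches the behaviour reported after adding reverse=True.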

-S


answered 2012-01-25 21:11:14 -0500

vishvananda

Nice work!

The second definitely seems like a bug. The first I suspect is a bug as well, but I will let Sandy comment on that one. FYI, the zones stuff is being reworked by Chris Behrens to allow for more efficient calls. He's doing his best not to break the existing zones code, but this will hopefully be much easier once his code arrives.

Vish

answered 2012-04-20 05:35:53 -0500

Hi,

I'm trying to set up a multi-zone nova deployment but have some doubts about it. I've already worked with a multi-node (multiple computes) setup and was quite successful with it. May I know which configuration needs to be done or changed for a multi-zone setup?

I'm trying to build the multi-zone setup on the following topology:
- 2 nova zones (all components of nova)
- 1 parent node (API + SCHEDULER + MYSQL + RABBITMQ)

I'm not sure how I can add the child zones (the 2 nova zones) to my parent zone.

Appreciate any help on this...!

