# Multi-Zone finally working on Essex but can't "nova list" (KeyError: 'uuid') + doubts

Hey guys, we finally got multi-zone working on Essex with KVM as the hypervisor (trunk version 2012.1-dev (2012.1-LOCALBRANCH:LOCALREVISION)). We have one parent zone and one child zone (named "zoneE"), all integrated with Keystone (nova, glance), and we are able to spawn instances across both zones! After a couple of days of work, configuration and debugging to get this going, we have one doubt and one ERROR. Let's hope you can help us with both of them.

THE DOUBT: This is definitely because we don't clearly understand the whole concept, but why does the "nova.scheduler.least_cost.weighted_sum" function choose the host with the LEAST RAM as the preferred host? It seems to keep choosing one host until it is fully depleted, and only then chooses the other one! That doesn't sound like a good default for balancing instances across the zones. Since we didn't know how to change that behavior, we just added "reverse=True" to the sorted calls, so the problem is "fixed". Is there a way (maybe through nova.conf) to control the sort order? Can anyone shed some light on how to do it?

THE ERROR / BUG: As soon as you successfully spawn an instance from the parent zone into the child zone (the instance is spawned and running on the child zone), "nova list" (or a curl to servers/detail) against the parent zone just BREAKS. It appears that when the parent zone calls "servers/detail" on the child zone, the JSON returned by the child has the "uuid" value placed in the "id" field, so nova-api.log shows a KeyError on "uuid" that breaks the call. You can check the pastebin link below, or keep reading and look at the nova-api.log stack trace here.
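The failure described above can be illustrated with a minimal sketch. It assumes, without having checked the nova source, that the parent indexes each child server record by a "uuid" key. The payload, the UUID value and the helper names `strict_uuid` and `tolerant_uuid` are all hypothetical and only show the shape of the problem and one defensive way to read the record.

```python
# Hypothetical child-zone server record: the UUID arrives in the "id" field
# and there is no separate "uuid" key (made-up values, not real nova output).
child_server = {"id": "00000000-0000-0000-0000-000000000001", "name": "vm1"}


def strict_uuid(server):
    """Read the UUID the way the failing code path is assumed to."""
    return server["uuid"]


def tolerant_uuid(server):
    """Accept either layout: prefer "uuid", fall back to "id"."""
    if "uuid" in server:
        return server["uuid"]
    return server["id"]


try:
    strict_uuid(child_server)
    error = None
except KeyError as exc:
    # str() of a KeyError is the quoted missing key.
    error = "KeyError: %s" % exc

print(error)
print(tolerant_uuid(child_server))
```

The fallback only hides the mismatch on the reading side. Whether the parent or the child should change is a question for the nova developers.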

Waiting for some feedback ! Cheers! Alejandro.

http://pastebin.com/VGt8A3HX

## ENTIRE NOVA-API STACK TRACE:

2012-01-25 14:42:15,793 DEBUG nova [-] HTTP PERF: 0.10457 seconds to GET 172.16.146.139:35357 /v2.0/tokens/8de5e217-9615-443c-8b7e-09cabf4b4af8) from (pid=7488) getresponse /usr/local/lib/python2.6/dist-packages/keystone-1.0-py2.6.egg/keystone/common/bufferedhttp.py:99

2012-01-25 14:42:15,894 DEBUG nova [-] HTTP PERF: 0.10093 seconds to GET 172.16.146.139:35357 /v2.0/tokens/8de5e217-9615-443c-8b7e-09cabf4b4af8) from (pid=7488) getresponse /usr/local/lib/python2.6/dist-packages/keystone-1.0-py2.6.egg/keystone/common/bufferedhttp.py:99

2012-01-25 14:42:15,897 DEBUG routes.middleware [2d3d5746-6718-4d2f-bb7d-3587cff052db dsmadmin 9574193127d1439fb09d1c0ef3eb26d5] Matched GET /9574193127d1439fb09d1c0ef3eb26d5/servers/detail from (pid=7488) __call__ /usr/lib/pymodules/python2.6/routes/middleware.py:100

2012-01-25 14:42:15,897 DEBUG routes.middleware [2d3d5746-6718-4d2f-bb7d-3587cff052db dsmadmin 9574193127d1439fb09d1c0ef3eb26d5] Route path: '/:(project_id)/servers/:(id)', defaults: {'action': u'process', 'controller': } from (pid=7488) __call__ /usr/lib/pymodules/python2.6/routes/middleware.py:102

2012-01-25 14:42:15,898 DEBUG routes.middleware [2d3d5746-6718-4d2f-bb7d-3587cff052db dsmadmin 9574193127d1439fb09d1c0ef3eb26d5] Match dict: {'action': u'process', 'controller': , 'project_id': u'9574193127d1439fb09d1c0ef3eb26d5', 'id': u'detail'} from (pid=7488) __call__ /usr/lib/pymodules/python2.6/routes/middleware ...


Zones is going through some radical changes currently.

Specifically, we're planning to use direct Rabbit-to-Rabbit communication between trusted Zones to avoid the complication of changes to OS API, Keystone and novaclient.

To the user deploying Nova, not much will change. There may be a new service to deploy (a Zones service), but that would be all. To a developer, the code in OS API will be greatly simplified, and the Distributed Scheduler will be able to focus on single-zone scheduling (vs. doing both zone and host scheduling as it does today).

We'll have more details soon, but we aren't planning on introducing the new stuff until we have a working replacement in place. The default Essex Scheduler will largely stay the same, and the filters/weight functions will still carry forward, so any investments there won't be lost.

Stay tuned, we're hoping to get all this in a new blueprint soon.

Hope it helps, Sandy

Alejandro Comisario replied (Thursday, January 26, 2012 8:50 AM, Question #185840, status: Answered => Open):

Sandy, Vish!

Thanks for the replies! Let me get to the relevant points.

#1 I totally agree with you guys: the policy for spawning instances may be very specific to each company's strategy. But since you can switch from "Fill First" to "Spread First" just by adding "reverse=True" in "nova.scheduler.least_cost.weighted_sum" and "nova.scheduler.distributed_scheduler._schedule", maybe it would be a harmless option to expose. We are going to have a lot of zones across datacenters, and many different departments are going to create many instances to load-balance their applications, so we really prefer Spread First to ensure high availability of the pools.

#2 As we are going to test essex-3, I would like to know whether the zones code from Chris Behrens is going to be included in the final Essex / milestone 4, so we can keep testing other features, or whether you would prefer us to file this as a bug to be fixed, since the code that broke may not undergo major changes.

Kindest regards !


Perfect! We are going to stop looking for bugs then, and wait for the zones implementation to be committed!

Thanks Sandy.


Zones worked with Diablo, to some extent.

But the Zone code was pulled from the final version of Essex.

Zones are being replaced by cells in Folsom.

I hope that helps.


Nice.

The "less RAM" preference is there to enforce Fill First (vs. Spread First). But obviously each company is going to have its own strategy for how it wants this to behave.
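The difference between the two policies can be sketched with a toy model. This is not the nova scheduler code: the host names, RAM figures and the functions `pick_host` and `place_instances` are hypothetical, and free RAM stands in for the weighted cost. The only point is how flipping the sort order (the "reverse=True" change discussed in the question) turns Fill First into Spread First.

```python
def pick_host(free_ram_mb, spread_first=False):
    """Pick a host from a {name: free RAM} mapping.

    Ascending order selects the host with the least free RAM (Fill First);
    reverse=True selects the host with the most free RAM (Spread First).
    """
    ranked = sorted(free_ram_mb.items(), key=lambda item: item[1],
                    reverse=spread_first)
    return ranked[0][0]


def place_instances(free_ram_mb, instance_ram_mb, count, spread_first=False):
    """Place up to `count` instances and return the chosen hosts in order."""
    remaining = dict(free_ram_mb)
    placements = []
    for _ in range(count):
        # Only hosts that can still fit the instance are candidates.
        candidates = {host: ram for host, ram in remaining.items()
                      if ram >= instance_ram_mb}
        if not candidates:
            break
        host = pick_host(candidates, spread_first)
        remaining[host] -= instance_ram_mb
        placements.append(host)
    return placements


hosts = {"hostA": 4000, "hostB": 4100}  # hypothetical free RAM in MB
fill = place_instances(hosts, 1000, 4)
spread = place_instances(hosts, 1000, 4, spread_first=True)
print(fill)    # ['hostA', 'hostA', 'hostA', 'hostA']
print(spread)  # ['hostB', 'hostA', 'hostB', 'hostA']
```

With Fill First, one host is exhausted before the other is touched, which matches the behavior reported in the question. With the reversed sort, placements alternate between the hosts.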

-S

Replying to the original question from Alejandro Comisario, sent Wednesday, January 25, 2012 5:01 PM.


Nice work!

The second definitely seems like a bug. The first I suspect is a bug as well, but I will let Sandy comment on that one. FYI, the zones stuff is being reworked by Chris Behrens to allow for more efficient calls. He's doing his best not to break the existing zones code, but this will hopefully be much easier once his code arrives.

Vish


Hi,

I'm trying to set up a multi-zone nova deployment but have some doubts about it. I've already worked with a multi-node (multiple computes) setup and was quite successful with it. May I know what configuration needs to be done or changed for a multi-zone setup?

I'm trying to set up multi-zone on the following topology:

- 2 nova zones (all components of nova)
- 1 parent node (API + scheduler + MySQL + RabbitMQ)

I'm not sure how I can add the child zones (the 2 nova zones) to my parent zone.

Appreciate any help on this!
