heartbeats for compute nodes?

asked 2011-06-09 20:18:58 -0500

fred-yang

nova-compute creates a "nova-compute" entry in the cloud controller's DB services table, and then periodically updates that row's updated_at and report_count fields.

Is there any way for the CC to know that a compute node, or the nova-compute service on it, has been restarted? Is there any event-notification method, rather than continually polling the nova DB to monitor updated_at and report_count?
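For context, the staleness check a poller would do against updated_at is simple. Here's a minimal sketch (the function name, and the 60-second threshold, are assumptions for illustration, not nova's exact code):

```python
from datetime import datetime, timedelta

# A service is considered "up" if its last heartbeat (updated_at) is
# newer than some threshold. 60 seconds is an assumed default here.
SERVICE_DOWN_TIME = 60

def service_is_up(updated_at, now=None):
    """Return True if the last heartbeat is recent enough."""
    now = now or datetime.utcnow()
    return (now - updated_at) <= timedelta(seconds=SERVICE_DOWN_TIME)

now = datetime.utcnow()
print(service_is_up(now - timedelta(seconds=10), now))   # recent heartbeat
print(service_is_up(now - timedelta(seconds=300), now))  # stale heartbeat
```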

Thanks -Fred


14 answers

0

answered 2011-07-12 16:10:46 -0500

fred-yang

According to the Eventlet docs, green threads use a cooperative yield model, so it should be safe. Thanks for shedding light on this.

0

answered 2011-07-11 22:31:06 -0500

Hey Fred,

Correct, my understanding is that because we're using Eventlet (a Reactor pattern), we shouldn't run into concurrency issues, since the service is essentially single-threaded. GreenPool is an Eventlet-aware mechanism, so it should be safe too.

Downside is it won't take advantage of multi-core/processors, but we can fire up more than one service.
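The single-threaded argument can be illustrated without Eventlet itself. In this toy cooperative scheduler (plain Python generators; all names are made up for the example), tasks only switch at explicit yield points, so a read-modify-write that contains no yield can never be interleaved by another task:

```python
# Toy cooperative "reactor" illustrating why Eventlet-style green threads
# avoid data races: control only transfers at explicit yield points, so
# code between yields runs atomically with respect to other tasks.
service_states = {'host1': 0}

def updater():
    for _ in range(3):
        # Read-modify-write with no yield inside: cannot be interleaved.
        service_states['host1'] += 1
        yield  # the only point where another task may run

def reader(seen):
    for _ in range(3):
        seen.append(service_states['host1'])
        yield

def run(tasks):
    # Round-robin scheduler: exactly one task runs at any moment.
    while tasks:
        task = tasks.pop(0)
        try:
            next(task)
            tasks.append(task)
        except StopIteration:
            pass

seen = []
run([updater(), reader(seen)])
print(seen)  # [1, 2, 3] - reader always observes a complete update
```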

-S

0

answered 2011-07-11 21:25:10 -0500

fred-yang

Sandy,

Continuing the ZoneManager.service_states[] locking question on this same thread.

ZoneManager.update_service_capabilities() updates service_states from each compute node's periodic_tasks, while JsonFilter.filter_hosts() also loops through service_states[] to filter each host. There is no data locking while accessing service_states[] from both paths; is access implicitly serialized through AMQP, since nova-scheduler executes both the ZoneManager and filter_hosts? Am I reading it correctly?

nova-scheduler also runs SchedulerManager.ping() periodically through a greenthread. If we derive from SchedulerManager and have ping() update service_states[] periodically on the same scheduler node, is any data locking needed, and what would be the best locking method?

Thanks, -Fred

0

answered 2011-06-14 11:00:43 -0500

np Fred, happy to help ... don't hesitate to ping me if you run into issues.

0

answered 2011-06-14 00:56:28 -0500

fred-yang

Thanks Sandy Walsh, that solved my question.

0

answered 2011-06-13 16:53:51 -0500

  1. Yup, sounds correct.

  2. I would make a new nova.scheduler.api method for this call rather than adjusting get_zone_capabilities, but you have the right idea.

Look forward to seeing the ML RFC!

0

answered 2011-06-10 19:17:08 -0500

fred-yang

You may have addressed 2 issues for me :-) though my usage model may be beyond the scope of this question.

  1. Adding started_time (or service_boot_time) to the last_capabilities posted to the scheduler, which can be used to check whether a node's service was restarted, or by Host_filter drivers to exclude newly booted compute nodes, if needed, during zone-aware scheduling.

  2. A derivation of the scheduler.api.get_zone_capabilities() query can be used to build a trusted-hosts database through a service daemon, refreshing a host's trust state when the host reboots. This trusted computing pool RFC will be posted to the OpenStack mailing list for comment soon.

Thanks, -Fred

0

answered 2011-06-10 17:12:08 -0500

The member variable in ZoneManager you're interested in is:

ZoneManager.service_states = {} # { <host> : { <service> : { cap k : v }}}

Look at nova.scheduler.api.get_zone_capabilities() to see how to call the Scheduler to query the ZoneManager (and the related nova.scheduler.manager.get_zone_capabilities() for the server-side counterpart)

That will give you the means to query for the info.
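To make the nested-dict shape above concrete, here's a tiny sketch; the host names, capability keys, and the helper function are all made up for illustration:

```python
# Shape of ZoneManager.service_states, per the comment above:
# { <host> : { <service> : { cap key : value } } }
service_states = {
    'compute1': {'compute': {'free_ram_mb': 4096,
                             'started_datetime': '2011-06-10T00:00:00'}},
    'compute2': {'compute': {'free_ram_mb': 1024,
                             'started_datetime': '2011-06-11T12:00:00'}},
}

def hosts_started_after(states, cutoff):
    """Hosts whose compute service reported a start time after cutoff.

    ISO-8601 timestamps compare correctly as strings.
    """
    return sorted(host for host, svcs in states.items()
                  if svcs['compute']['started_datetime'] > cutoff)

print(hosts_started_after(service_states, '2011-06-11T00:00:00'))
# ['compute2']
```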

To add the ability for the Service to add a field about when the service was restarted last look at:

nova.manager.SchedulerDependentManager

I'd put a self.started_datetime (or something) in __init__() and initialize it to the UTC time. Tack it into the self.last_capabilities member variable in update_service_capabilities() and it'll get sent to the Schedulers on every update.
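A toy sketch of that suggestion (this is not nova's actual class, just the idea: set the timestamp once at construction, then fold it into the capabilities dict on every update):

```python
from datetime import datetime

# Illustrative stand-in for nova.manager.SchedulerDependentManager;
# attribute and method names here are assumptions, not nova's exact code.
class SchedulerDependentManager:
    def __init__(self):
        # Set once, at service boot: schedulers can compare this value
        # across updates to detect a restart.
        self.started_datetime = datetime.utcnow()
        self.last_capabilities = None

    def update_service_capabilities(self, capabilities):
        # Tack the boot time onto whatever the service already reports.
        capabilities['started_datetime'] = self.started_datetime.isoformat()
        self.last_capabilities = capabilities

mgr = SchedulerDependentManager()
mgr.update_service_capabilities({'free_ram_mb': 2048})
print('started_datetime' in mgr.last_capabilities)  # True
```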

Let me know if that makes sense (I think I'm answering the right question? :)

-S

0

answered 2011-06-10 16:50:47 -0500

fred-yang

Sandy,

So the check can be derived from ZoneManager.ping -> scheduler.hosts_up -> service_is_up for the zone.

Thanks, -Fred

0

answered 2011-06-10 00:00:22 -0500

You'd have to put a special query in the scheduler.driver/ZoneManager for that; additionally, in nova.manager you may need to add a one-time flag recording when a service was last booted.

Both would be pretty straightforward (and handy) additions.



Stats

Asked: 2011-06-09 20:18:58 -0500

Seen: 327 times

Last updated: Jul 12 '11