
Quiescing a server prior to scale-down

asked 2018-01-29 07:11:40 -0500

rajeshd

updated 2018-01-31 12:09:50 -0500

zaneb

Hi, we are using OpenStack Ocata and calling the scale-down policies to remove nodes based on thresholds. Ideally the oldest nodes should be removed, but we have seen that it sometimes removes what looks like a random node from the pool. Our problem is that before a node is removed, we need to safely take it out of the load balancer (F5) and ship its logs to external storage, and only then remove the node. However, at times random nodes are deleted.


2 answers


answered 2018-01-31 12:01:41 -0500

zaneb

updated 2018-01-31 12:18:04 -0500

The policy for deciding which node to delete is as follows:

  • If any members are in a failed state, those will be removed (if there are more of them than the number you want to scale down by, the difference will be made up by new nodes).
  • The member with the oldest created_time in the database is removed first. (You can determine this by getting the resource details from the API.)
  • In the event of a tie (creation times have only 1s of resolution), the member with the lexicographically smallest name is removed first (i.e. member 'aa' will be removed before member 'ab').
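To make the ordering concrete, here is a minimal Python sketch of the three rules above. It is not Heat's actual code: the `Member` class and the `members_to_remove` function are hypothetical names made up for illustration, and `created_time` is modelled as whole seconds to mirror the 1s resolution.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Member:
    """Hypothetical stand-in for a scaling group member."""
    name: str
    created_time: int  # whole seconds, mirroring the 1s resolution
    failed: bool = False


def members_to_remove(members: List[Member], scale_down_by: int) -> List[Member]:
    """Return the members that would go first under the rules above."""
    by_age_then_name = lambda m: (m.created_time, m.name)
    # Rule 1: every failed member is removed, even if that exceeds the
    # requested count (the shortfall is made up by new nodes).
    failed = sorted((m for m in members if m.failed), key=by_age_then_name)
    # Rules 2 and 3: oldest created_time first, ties broken by smallest name.
    healthy = sorted((m for m in members if not m.failed), key=by_age_then_name)
    remaining = max(0, scale_down_by - len(failed))
    return failed + healthy[:remaining]


if __name__ == "__main__":
    group = [
        Member("ab", 100),
        Member("aa", 100),
        Member("zz", 50),
        Member("bad", 200, failed=True),
    ]
    print([m.name for m in members_to_remove(group, 2)])
```

With the sample group, scaling down by 2 picks the failed member and then the oldest healthy one; the tie between 'aa' and 'ab' only matters once a third member has to go.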

In theory this should allow you to anticipate which node will be removed first, since it is entirely deterministic. You can also effectively choose a member to remove by forcing it into a failed state using the openstack stack resource mark unhealthy command. However, there are better ways to accomplish what you want.

If you use the Neutron LoadBalancer API from within your Heat templates to manage the load balancer, then removing the server from the load balancer pool will be handled automatically as part of the scale down. In a similar way, if you include a SoftwareDeployment resource that runs on the DELETE action in the scaled unit of your autoscaling group, then that deployment will automatically occur on the server just before the server is deleted, so you can use it to export the logs from the server.

Another option is to add user hooks to resources in a stack (an Autoscaling group is just a nested stack), which provide you with a notification that an action is about to take place and pause progress of the stack update until you acknowledge it. So in this case you might pass an environment file containing something like:

                hooks: pre-delete

and the update would pause before deleting any member of the scaling group. You can then ship the logs off the affected server (and update the load balancer if you're managing it externally to Heat) before acknowledging the hook with the openstack stack hook clear --pre-delete command, after which the update will continue and delete the server.
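Only the hooks: pre-delete line of that environment file is shown above, so the surrounding structure here is an assumption rather than the original file. As a rough sketch, Heat environment files place hooks under resource_registry, then resources, then the resource name, and a wildcard can stand in for the members of the group. The group name `my_asg` is hypothetical. The snippet builds that structure and prints it as JSON, which YAML parsers generally accept.

```python
import json
from typing import Any, Dict


def pre_delete_environment(group_name: str) -> Dict[str, Any]:
    """Build an environment that sets a pre-delete hook on every member
    of the named scaling group.

    Assumption: the group is addressed by its resource name in the parent
    template and "*" matches each member inside the nested stack.
    """
    return {
        "resource_registry": {
            "resources": {
                group_name: {
                    "*": {"hooks": "pre-delete"},
                },
            },
        },
    }


if __name__ == "__main__":
    # "my_asg" is a hypothetical resource name; use the one in your template.
    print(json.dumps(pre_delete_environment("my_asg"), indent=2))
```

The output can be saved to a file and passed with -e on the next stack update. If each group member is itself a nested stack, the hook may need to sit one level deeper.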

You can get notifications when Heat reaches the hook points by polling with the openstack stack hook poll command, by specifying a Zaqar queue to which all notifications about the stack are sent, or both. If you send notifications to Zaqar, you can also set up a Zaqar subscription that, for example, calls a webhook with the notification data whenever a notification arrives on the queue. The webhook URL can be your own service that takes care of shipping the logs, or it can be a Mistral workflow. An advantage of a Mistral workflow is that it already contains the credentials it needs to clear the hook once everything is complete, so you don't need to provide them to your log-shipping service - a webhook call to that service might still be one of ...
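For the polling route, a small driver script might look something like the sketch below. It shells out to the two openstack stack hook commands mentioned above. Everything else is an assumption: the function names are made up, the JSON column names ("Resource Name", "Stack Name") may differ in your client version, and `quiesce` is a placeholder for whatever drains the F5 pool and ships the logs.

```python
import json
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_cli(argv: List[str]) -> str:
    """Run a CLI command and return its stdout (raises on failure)."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def poll_command(stack: str, nested_depth: int = 2) -> List[str]:
    # -f json is the generic formatter option of the openstack client.
    return ["openstack", "stack", "hook", "poll",
            "--nested-depth", str(nested_depth), "-f", "json", stack]


def clear_command(stack: str, resource: str) -> List[str]:
    return ["openstack", "stack", "hook", "clear", "--pre-delete", stack, resource]


def handle_pending_hooks(stack: str,
                         quiesce: Callable[[str], None],
                         run: Runner = run_cli,
                         resource_key: str = "Resource Name",  # assumed column
                         stack_key: str = "Stack Name") -> List[str]:  # assumed column
    """Poll once, quiesce each resource with a pending hook, then clear it."""
    rows = json.loads(run(poll_command(stack)) or "[]")
    cleared = []
    for row in rows:
        resource = row[resource_key]
        # Fall back to the top-level stack if no owning stack is reported.
        owner = row.get(stack_key, stack)
        quiesce(resource)  # e.g. remove from the LB pool, ship logs
        run(clear_command(owner, resource))
        cleared.append(resource)
    return cleared
```

Passing `run` as a parameter keeps the logic testable without a cloud. Note that the poll may also report other hook types, and clearing with --pre-delete only acknowledges pre-delete hooks. In practice you would call `handle_pending_hooks` in a loop with a sleep, or trigger it from the Zaqar webhook instead.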


answered 2018-01-31 22:53:02 -0500

rajeshd

Thanks for your input, zaneb. Let me try this, and I will get back with the status.

