Heat stack with unreachable resource stuck in failed state

asked 2016-10-05 01:48:38 -0500

Fenuks

I have the following problem (I'm running OpenStack Mitaka (13.2.0) on Ubuntu 14.04):

  1. I launched a stack via Heat that created an instance and used OS::Heat::SoftwareConfig + OS::Heat::SoftwareDeployment to set up nginx on another instance, the Gateway.
  2. The SoftwareDeployment was configured with all actions (Create, Update, Delete, Suspend, Resume), as I needed it to reconfigure the Gateway accordingly on any action.
  3. At some point the Gateway was irrecoverably broken: a stack update unexpectedly attempted to recreate a port and failed to attach the new one while deleting the old one.
  4. Now any action on the stack from step 1 ends in a failed state, because it triggers the corresponding SoftwareDeployment action, which cannot reach the server and fails after a timeout.
  5. According to the documentation, updates to "server" cause replacement, so Heat would have to delete the old resource and create a new one, but the delete fails after a timeout and the stack is stuck.
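
For illustration, the setup described above might look roughly like the template built below. This is a minimal sketch, not my actual template: the resource names gateway_config and gateway_deployment, the placeholder script, and the gateway_server_id argument are all hypothetical. It builds the template as a Python dict and prints it as JSON, on the assumption that Heat accepts JSON because it is valid YAML.

```python
import json

# All five lifecycle actions the deployment was configured to react to.
ALL_ACTIONS = ["CREATE", "UPDATE", "DELETE", "SUSPEND", "RESUME"]


def build_template(gateway_server_id):
    """Return a HOT-style template as a plain dict.

    Resource names and the script body are hypothetical placeholders.
    """
    return {
        "heat_template_version": "2016-04-08",
        "resources": {
            "gateway_config": {
                "type": "OS::Heat::SoftwareConfig",
                "properties": {
                    "group": "script",
                    # Placeholder script; the real one reconfigures nginx.
                    "config": "#!/bin/sh\necho reconfigure nginx here\n",
                },
            },
            "gateway_deployment": {
                "type": "OS::Heat::SoftwareDeployment",
                "properties": {
                    "config": {"get_resource": "gateway_config"},
                    # Changing "server" forces replacement of the deployment.
                    "server": gateway_server_id,
                    "actions": list(ALL_ACTIONS),
                },
            },
        },
    }


if __name__ == "__main__":
    # The server ID here is a made-up example value.
    print(json.dumps(build_template("OLD-GATEWAY-SERVER-ID"), indent=2))
```

Because DELETE is among the actions, removing or replacing gateway_deployment makes Heat wait for the unreachable server to signal, which is where the timeout comes from.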

I can't delete the whole stack yet, as it's in use (and I'm not sure deletion would work), so I created a new gateway and set it up manually. Now I'd like to either remove the SoftwareDeployment that references the old gateway or make it reference the new one.

I've tried the following already:

  1. Marking the resource unhealthy: Heat tries to "update" it with the existing values and fails after a timeout.
  2. Setting action='DELETE' and status='COMPLETE' directly in the database (Heat.resource): on stack update, Heat tries to "create" the deleted resource with the existing values before replacing it with the new one (bug?) and again fails after a timeout.

I don't know the logic behind this, so I'm not sure what else I can do.

Are there any other ways I can try to fix this?

1 answer

answered 2016-10-07 07:50:46 -0500

zaneb

The good news is that this is fixed in Newton by this patch. There's no Launchpad bug associated with it for some reason, but if you raise one then we could consider backporting it to Mitaka.

The easiest way to work around the problem for now is probably to start an update and manually signal success to the software deployment yourself with the openstack stack resource signal command, taking the place of the server that is supposed to be doing the signalling but is gone.
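
As a rough sketch of that workaround, the helper below builds the signal payload and the command line. The stack name my-stack and resource name gateway_deployment are hypothetical, and the payload keys (deploy_status_code, deploy_stdout, deploy_stderr) are what I would expect a software deployment to read from a signal, so check them against your Heat version. The signal only has an effect while the deployment is in progress, so send it after starting the update.

```python
import json
import subprocess


def build_signal_payload(stdout="", stderr=""):
    """Return JSON mimicking a successful deployment signal.

    Assumption: a zero deploy_status_code is treated as success.
    """
    return json.dumps(
        {
            "deploy_status_code": 0,
            "deploy_stdout": stdout,
            "deploy_stderr": stderr,
        },
        sort_keys=True,
    )


def build_signal_command(stack, resource, payload):
    """Return argv for 'openstack stack resource signal'."""
    return [
        "openstack", "stack", "resource", "signal",
        "--data", payload,
        stack, resource,
    ]


def signal_success(stack, resource):
    """Send the signal; needs the openstack CLI and credentials."""
    cmd = build_signal_command(stack, resource, build_signal_payload())
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical names; print the command instead of running it.
    print(build_signal_command(
        "my-stack", "gateway_deployment", build_signal_payload()))
```

Call signal_success("my-stack", "gateway_deployment") with your real names once the update is waiting on the deployment.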



Thank you, reported with reference to this question. Will try your workaround and mark your answer.

Fenuks (2016-10-07 08:21:45 -0500)
