HA configuration for nova-compute nodes

asked 2014-10-14

Arumon

Hi All,

I am looking for documentation on HA configuration for nova-compute nodes. My requirement is that whenever a compute node fails and goes down completely, the VMs that were running on that node become available on another node. How can we achieve this?

Regards, Arumon

Closed for the following reason: the question is answered, right answer was accepted
close date 2015-02-25 05:06:38.120448

answered 2014-10-14

bishoy

You can use the evacuate call in nova and script it, but you should at least use shared storage so that you can simply move your instances to a compute node other than the one that failed. However, this is not the service-level HA that we are looking for (service auto-start and failover, heartbeat, monitoring).

For that, take a look at Bright Computing; they have made a very nice solution for OpenStack clusters. You deploy it with a management tool and assign roles to the bare-metal nodes, then their node installer takes over the nodes and deploys them. They have a failover mechanism, heartbeat services, auto-start, and their own CLI management tool for HPC, Hadoop and OpenStack. It's a really nice environment for production systems.
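To make the first suggestion (scripting the evacuate call) a bit more concrete, here is a minimal Python sketch that only builds one `nova evacuate` command per instance on a failed host. The helper name `plan_evacuation`, the host names and the instance names are made up for illustration. The `--on-shared-storage` flag is what nova clients of this era accepted, as far as I know; newer clients dropped it, so check `nova help evacuate` on your release. Nothing is executed here, the commands are only printed.

```python
def plan_evacuation(instances, failed_host, target_host, shared_storage=True):
    """Return one `nova evacuate` argv list per instance on failed_host.

    `instances` is a list of (instance_id, host) pairs. How that list is
    obtained (for example from an admin instance listing) is left out.
    """
    commands = []
    for instance_id, host in instances:
        if host != failed_host:
            continue  # instance lives on a healthy host, nothing to do
        argv = ["nova", "evacuate"]
        if shared_storage:
            # Assumption: the client in use still accepts this flag.
            argv.append("--on-shared-storage")
        argv += [instance_id, target_host]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = [("vm-a", "compute1"), ("vm-b", "compute2"), ("vm-c", "compute1")]
    for argv in plan_evacuation(inventory, "compute1", "compute2"):
        print(" ".join(argv))
```

Keeping the planning step separate from execution lets you review the command list before anything touches the cluster; a real script would pass each list to something like `subprocess.check_call`.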

Could you please explain more about the two methods you mentioned? First: with nova and a script, do we get live migration when our instances fail, or is this only a solution for cold migration? Second: is brightcomputing open source and free? Where can I find configuration documents?

p.bagherpour ( 2017-08-02 )

I don't think it's the case anymore. This is an old question from 2014. As for BrightComputing, it's a paid cluster management product.

bishoy ( 2017-12-19 )

answered 2014-10-14

Sergei Hanus

As of now, I haven't found any built-in, automatic way to do that.

There's a method (an API call), evacuate, which lets you manually cold-migrate all VMs from one host to another. But it requires some external entity to call it (some kind of script or management system). And, as far as I know, Icehouse has a particular problem with it: machines can be evacuated only when they were shut down correctly. That is definitely not the case when a node fails; all VMs will be in an error state. I read a blueprint which suggested that this behavior would hopefully be fixed in Juno, but I haven't had a chance to verify this change.
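The "external entity" does not have to be large. As a rough sketch, it could be a small monitor that polls each compute host and only declares it failed after several consecutive missed checks, so that one lost heartbeat does not trigger an evacuation. The class name `HostMonitor`, the threshold of 3 and the host name are assumptions for illustration; how the up/down result is obtained (ping, service status, IPMI) is deliberately left out.

```python
class HostMonitor:
    """Declare a host failed after `threshold` consecutive missed checks."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.misses = {}     # host -> number of consecutive failed checks
        self.failed = set()  # hosts already reported as failed

    def record(self, host, is_up):
        """Feed one check result; return True once when the host fails."""
        if is_up:
            # Host answered: reset its counter and clear the failed mark.
            self.misses[host] = 0
            self.failed.discard(host)
            return False
        self.misses[host] = self.misses.get(host, 0) + 1
        if self.misses[host] >= self.threshold and host not in self.failed:
            self.failed.add(host)
            return True
        return False


if __name__ == "__main__":
    monitor = HostMonitor(threshold=3)
    # Hypothetical check results for compute1: up once, then down repeatedly.
    for is_up in [True, False, False, False, False]:
        if monitor.record("compute1", is_up):
            print("compute1 considered failed; trigger evacuation here")
```

Before evacuating, make sure the failed node is really powered off (fenced). Otherwise the same VM may end up running twice against the same shared disk.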

