1

Best practice to achieve High Availability and scalability for Neutron

asked 2014-01-09 11:37:21 -0500

anonymous user

I'm preparing for an OpenStack cloud in a production system. I've achieved HA and load balancing for most of the components to a certain degree.

I am finding it hard to understand how best to implement HA for Neutron. What is the best practice on Havana for running multiple l3-agents and dhcp-agents and clustering them together?


Comments

Hi, I've tried to apply your patch, but it doesn't work: it reports "Hunk #1 FAILED". Maybe I'm using the wrong version. I have the RDO Havana distro with Neutron VLANs. Any hints? Thanks.

pgsousa (2014-02-05 06:12:15 -0500)

See my answer.

Li Ma (2014-02-06 09:00:16 -0500)

3 answers

4

answered 2014-01-09 11:46:42 -0500

jaypipes

The Neutron L3 agent is the only OpenStack service that is not stateless, and therefore you cannot use traditional load-balancing across a set of identical nodes.

That said, there's nothing wrong with running multiple L3 nodes, with routers for different tenants hosted on different L3 agents. We do this successfully in our deployment using a custom Neutron scheduler that my colleague Alan Meadows wrote, plus a Python script (also written by Alan) that runs from cron, looks for failed L3 agents and, if it finds one, moves the routers from the failed node to a working one.

The advantage of this over something like Pacemaker is that you spread the L3 agent workload across many nodes -- accomplishing a sort of poor-man's load balancing/sharding for L3 agent requests.
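To make the failover idea concrete, here is a minimal sketch of the decision logic only. It is not Alan's script and it makes no Neutron API calls: the function name, the agent names and the input dictionaries are all hypothetical, and a real tool would have to fetch agent status and router placement from Neutron and then actually reschedule the routers.

```python
from typing import Dict, List, Tuple


def plan_router_failover(
    routers_by_agent: Dict[str, List[str]],
    alive: Dict[str, bool],
) -> List[Tuple[str, str, str]]:
    """Return (router_id, from_agent, to_agent) moves for routers on dead agents.

    Hypothetical helper: both inputs are plain dicts standing in for data that
    a real tool would query from Neutron.
    """
    # Number of routers currently hosted on each live agent.
    load = {
        agent: len(routers)
        for agent, routers in routers_by_agent.items()
        if alive.get(agent, False)
    }

    moves: List[Tuple[str, str, str]] = []
    for agent in sorted(routers_by_agent):
        if alive.get(agent, False):
            continue  # healthy agent, nothing to move
        for router_id in routers_by_agent[agent]:
            if not load:
                raise RuntimeError("no live L3 agent left to take over routers")
            # Send the router to the live agent hosting the fewest routers;
            # ties are broken by agent name so the plan is deterministic.
            target = min(load, key=lambda a: (load[a], a))
            moves.append((router_id, agent, target))
            load[target] += 1
    return moves


if __name__ == "__main__":
    placement = {"net1": ["r1", "r2"], "net2": ["r3"], "net3": []}
    status = {"net1": False, "net2": True, "net3": True}
    for move in plan_router_failover(placement, status):
        print(move)
```

Because each orphaned router goes to whichever live agent is least loaded at that moment, a failed node's routers get spread out rather than piled onto a single survivor.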

To summarize, if your installation is running Grizzly or Havana and you don't want to use Pacemaker (so that you can spread L3 agent load across multiple nodes):

  1. Apply this patch to Neutron: https://gist.github.com/jaypipes/8135839
  2. Set router_scheduler_driver in neutron.conf to neutron.scheduler.l3_agent_scheduler.LeastUtilizedScheduler. Reference: https://github.com/stackforge/cookbook-openstack-network/blob/master/templates/default/neutron.conf.erb#L232
  3. Put this script into a cron job: https://github.com/stackforge/cookbook-openstack-network/blob/master/files/default/neutron-ha-tool.py

If you're on Icehouse, simply do:

  1. Set router_scheduler_driver in neutron.conf to neutron.scheduler.l3_agent_scheduler.LeastRoutersScheduler. Reference: https://github.com/openstack/neutron/blob/master/etc/neutron.conf#L223
  2. Put this script into a cron job: https://github.com/stackforge/cookbook-openstack-network/blob/master/files/default/neutron-ha-tool.py
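For intuition about what a "least routers" style scheduler does, here is a small illustrative sketch. It is not Neutron's implementation: the function names and the in-memory counts are assumptions made for the example, and the real driver works against the Neutron database.

```python
from typing import Dict, Iterable, Optional


def pick_least_loaded_agent(router_counts: Dict[str, int]) -> Optional[str]:
    """Return the agent hosting the fewest routers, or None if there are no agents."""
    if not router_counts:
        return None
    # Ties are broken by agent name so the choice is deterministic.
    return min(router_counts, key=lambda agent: (router_counts[agent], agent))


def schedule_routers(
    router_ids: Iterable[str], router_counts: Dict[str, int]
) -> Dict[str, str]:
    """Place each new router on whichever agent is least loaded at that moment."""
    counts = dict(router_counts)  # work on a copy, leave the caller's dict alone
    placement: Dict[str, str] = {}
    for router_id in router_ids:
        agent = pick_least_loaded_agent(counts)
        if agent is None:
            raise RuntimeError("no L3 agent available")
        placement[router_id] = agent
        counts[agent] += 1
    return placement


if __name__ == "__main__":
    # Agent "a" already hosts two routers, agent "b" hosts none.
    print(schedule_routers(["r1", "r2", "r3"], {"a": 2, "b": 0}))
```

New routers flow to the emptier agent until the counts even out, which is the sharding effect described above.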

Things may change, so also check the following sections of the High Availability documentation:

  • Network Controller Cluster Stack
    • Highly available Neutron L3 Agent
    • Highly available Neutron DHCP Agent
    • Highly available Neutron Metadata Agent
    • Manage network resources

Comments

1

nova.conf should be neutron.conf above :)

darragh-oreilly (2014-02-05 09:49:57 -0500)

Does that cron job run on the network node or the controller?

andrewklau (2014-03-29 20:48:57 -0500)

This is great, but it would still result in at least 1 minute of downtime if you lose a network node (cron won't run more frequently than that).

marcantonio (2014-06-02 14:55:13 -0500)
2

answered 2014-02-06 08:59:41 -0500

Li Ma

updated 2014-02-07 00:25:01 -0500

The patch cannot be applied directly to RDO Havana; I tried and it failed. I've reworked the patch so that it fits the Havana version more cleanly. Here's the link: https://gist.github.com/li-ma/8726625

Just put all the code into neutron/scheduler/l3_agent_scheduler.py

Then set router_scheduler_driver in /etc/neutron/neutron.conf

It has been carefully tested on an RDO Havana deployment.
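As a quick sanity check after editing the file, something like the sketch below can confirm the option is set. It rests on assumptions: that the option sits in the [DEFAULT] section, and that the class path shown (borrowed from the first answer) matches what your patched l3_agent_scheduler.py defines. The helper name is hypothetical, and the sample text stands in for the contents of /etc/neutron/neutron.conf.

```python
import configparser

# Stand-in for the real file contents. The class path is an assumption taken
# from the first answer; use whatever class your patched scheduler defines.
SAMPLE_NEUTRON_CONF = """
[DEFAULT]
router_scheduler_driver = neutron.scheduler.l3_agent_scheduler.LeastUtilizedScheduler
"""


def get_router_scheduler_driver(conf_text: str) -> str:
    """Return the configured router_scheduler_driver, or '' if it is not set."""
    # Interpolation is disabled so '%' characters elsewhere in the file are
    # read literally instead of being treated as configparser placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.get("DEFAULT", "router_scheduler_driver", fallback="")


if __name__ == "__main__":
    print(get_router_scheduler_driver(SAMPLE_NEUTRON_CONF))
```

To check a live node, read the file into a string first and pass it to the function. Remember that neutron-server has to be restarted before a changed scheduler driver takes effect.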

1

answered 2015-11-13 06:50:45 -0500

Christian Zunker

The neutron-ha-tool.py script has been removed from the repository mentioned above in this commit: https://github.com/openstack/cookbook-openstack-network/commit/64fd769eb539e17312d4cfb314ff4eb0f3b5d542

However, OpenStack has since implemented an HA solution for the L3 agent. See the docs for more details: http://docs.openstack.org/ha-guide/networking-ha-l3.html

If you are using an older OpenStack version without the L3 HA feature, an actively maintained version of the script is available in the openstack-ansible repo: https://github.com/openstack/openstack-ansible/blob/master/playbooks/roles/os_neutron/templates/neutron-ha-tool.py.j2



4 followers

Stats

Asked: 2014-01-09 11:37:21 -0500

Seen: 2,256 times

Last updated: Feb 07 '14