
Fail scheduling network (Grizzly, Quantum, gre)

asked 2013-06-24 13:20:21 -0500 by Sam W.

updated 2013-06-26 13:50:42 -0500

Hi everyone,

I am getting this warning (which seems more like an error) when trying to launch instances:

/var/log/quantum/server.log:2013-06-24 09:34:03  WARNING [quantum.db.agentschedulers_db] Fail scheduling network {'status': u'ACTIVE', 'subnets': [u'19f681a4-99b0-4e45-85eb-6d08aa77cedb'], 'name': u'demo-net', 'provider:physical_network': None, 'admin_state_up': True, 'tenant_id': u'c9ec52b4506b4d789180af84b78da5b1', 'provider:network_type': u'gre', 'router:external': False, 'shared': False, 'id': u'7a753d4b-b4dc-423a-98c2-79a1cbeb3d15', 'provider:segmentation_id': 2L}

The instance launches (on a different node) and seems to work correctly. I have gone over my quantum configuration on both nodes, and all the IPs and hostnames seem to be set correctly (I can post configs if that would help).

Here is my setup: I have three "all-in-one" nodes (test1, test2, test3) that each run all of the services (except the l3-agent, which only runs on one node at a time). The APIs, novnc-proxy, and mysql (actually a Percona/Galera cluster) are load balanced with haproxy. All services have their APIs bound to the internal IP, and haproxy is bound to a VIP that pacemaker can move between nodes.
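For illustration, here is a minimal sketch of the haproxy stanza pattern I am using for the quantum API (the VIP and node addresses below are placeholders, not my real config):

    # quantum-server API load balanced across the three nodes
    # (10.0.0.100 is the pacemaker-managed VIP; node IPs are placeholders)
    listen quantum-api
        bind 10.0.0.100:9696
        balance roundrobin
        option tcpka
        server test1 10.0.0.11:9696 check
        server test2 10.0.0.12:9696 check
        server test3 10.0.0.13:9696 check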

In the example above, a request comes into haproxy on test2. quantum-server on test2 spits out the warning above, then test1 launches the instance and seems to work fine (without any warnings or errors). Oddly, test3 never even seems to try to launch any instances. Here is a list of my nova services and quantum agents:

root@test1:~# quantum agent-list
+--------------------------------------+--------------------+-------+-------+----------------+
| id                                   | agent_type         | host  | alive | admin_state_up |
+--------------------------------------+--------------------+-------+-------+----------------+
| 0dfc68a8-7321-4d6c-a266-a4ffef5f9a33 | DHCP agent         | test3 | :-)   | True           |
| 2ca7d965-c292-48a3-805c-11afebf18e20 | DHCP agent         | test1 | :-)   | True           |
| 2f2c3259-4f63-4e7e-9552-fbc2ed69281e | L3 agent           | test2 | :-)   | True           |
| 5a76fbee-47d0-4dc1-af50-24dfb6113400 | Open vSwitch agent | test1 | :-)   | True           |
| 7c6c4058-c9c2-4774-8924-ab6ba54266b3 | DHCP agent         | test2 | :-)   | True           |
| 7d01c7b2-1102-4249-85a0-7afcd9421884 | Open vSwitch agent | test2 | :-)   | True           |
| bde82424-b5ff-41b7-9d7e-35bf805cfae8 | Open vSwitch agent | test3 | :-)   | True           |
| dcb122ab-8c17-4f60-9b3a-41abfdf036c3 | L3 agent           | test1 | xxx   | True           |
+--------------------------------------+--------------------+-------+-------+----------------+


root@test1:~# nova-manage service list                                                                                                                                                        
Binary           Host                                 Zone             Status     State Updated_At
nova-cert        test1                                internal         enabled    :-)   2013-06-24 14:26:11
nova-conductor   test1                                internal         enabled    :-)   2013-06-24 14:26:12
nova-consoleauth test1                                internal         enabled    :-)   2013-06-24 14:26:07
nova-scheduler   test1                                internal         enabled    :-)   2013-06-24 14:26:11
nova-compute     test1                                nova             enabled    :-)   2013-06-24 14:26:08
nova-cert        test2                                internal         enabled    :-)   2013-06-24 14:26:06
nova-conductor   test2                                internal         enabled    :-)   2013-06-24 14:26:11
nova-consoleauth test2                                internal         enabled    :-)   2013-06-24 14:26:12
nova-scheduler   test2                                internal         enabled    :-)   2013-06-24 14:26:12
nova-compute     test2                                nova             enabled    :-)   2013-06-24 14:26:07
nova-cert        test3                                internal         enabled    :-)   2013-06-24 14:26:11
nova-consoleauth test3                                internal         enabled    :-)   2013-06-24 14:26:14
nova-scheduler   test3                                internal         enabled    :-)   2013-06-24 14:26:11
nova-conductor   test3                                internal         enabled    :-)   2013-06-24 14:26:12
nova-compute     test3                                nova             enabled    :-)   2013-06-24 14:26:08

Any ideas what might be happening? Again, I can post additional logs, configs, or whatever else might help me get past this problem.

Thanks, Sam

----------------- edit -----------------

So I stopped all the services, dropped the databases, recreated them, made sure all of my services were logging (openvswitch didn't seem to be logging the first time), wiped out all of the rabbitmq queues and exchanges, and rebooted all nodes... I still get the problem.

I noticed the first VM launches on test1 and does not emit the "Fail scheduling network" warning. All subsequent launches ...


4 answers


answered 2013-08-28 06:21:04 -0500 by Moss

updated 2013-08-28 06:23:23 -0500

Hello Sam,

When a new port is created (launching a new instance creates a new port in most cases), schedule_network in agentschedulers_db is called. Since the network may already be hosted (an instance has already been launched on that network), the scheduler in dhcp_agent_scheduler simply returns None, which causes agentschedulers_db.schedule_network to treat it as a problem and log a warning message.

It is actually a bug (https://bugs.launchpad.net/neutron/+bug/1192786), because it treats a normal situation the same as a real problem.
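To make that code path concrete, here is a simplified sketch paraphrasing the Grizzly-era logic (not the exact Quantum source; the helpers that raise NotImplementedError stand in for the real database queries):

    import logging

    LOG = logging.getLogger(__name__)


    class ChanceScheduler(object):
        # Stands in for quantum.scheduler.dhcp_agent_scheduler.ChanceScheduler.

        def schedule(self, plugin, context, network):
            hosted = plugin.get_dhcp_agents_hosting_networks(
                context, [network['id']])
            if hosted:
                # The network already has a DHCP agent, so there is
                # nothing to do; None signals "no new agent chosen".
                return None
            # Otherwise pick an active DHCP agent and bind it
            # (placeholder for the real selection logic).
            raise NotImplementedError


    class AgentSchedulerDbMixin(object):
        # Stands in for quantum.db.agentschedulers_db.AgentSchedulerDbMixin.

        network_scheduler = ChanceScheduler()

        def get_dhcp_agents_hosting_networks(self, context, network_ids):
            # The real mixin queries the DHCP agent bindings table;
            # stubbed here to keep the sketch self-contained.
            raise NotImplementedError

        def schedule_network(self, context, created_network):
            if self.network_scheduler:
                chosen_agent = self.network_scheduler.schedule(
                    self, context, created_network)
                if not chosen_agent:
                    # This is the bug: None can simply mean "already
                    # hosted", but it is logged as a scheduling failure.
                    LOG.warning('Fail scheduling network %s', created_network)
                return chosen_agent

So when the network is already served by a DHCP agent on another node, the None return is normal, the warning is harmless noise, and the instance still launches, which matches what Sam is seeing.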

BR, Peter


answered 2013-07-28 21:57:43 -0500

Can you paste the output of "df -h" on your controller and network node?


answered 2013-07-19 10:29:27 -0500 by fm255005

Same here. That is the only error I am seeing in the logs.


answered 2013-07-01 10:34:40 -0500 by arindamchoudhury

I am having the same problem.
