Magnum deploys cluster VMs on controller node instead of compute node

Hi, I'm having trouble deploying a Kubernetes cluster on my two-node setup. I have one controller node and one compute node. The controller node has one NIC; the compute node has two, one of which is used for the flat provider network. Spinning up regular KVM VMs works well: they get scheduled onto the compute node and their networking is fine.

When I try to deploy a Kubernetes cluster via Magnum, it fails at the networking step. The neutron logs on the controller show:

ERROR neutron.plugins.ml2.managers [req-36b007f4-8da0-42b5-9b25-3c4928b32d12 415d47fb69d94d8b8b964d92cdb60e56 db07ee3987614128bc3c93527c04d1a9 - default default] Failed to bind port ed644ee8-76b0-4ff1-80e3-30ff2c7d21c6 on host openstack-control for vnic_type normal using segments [{'network_id': '07a06eaf-223e-4c53-9a69-4164b5d1b3d8', 'segmentation_id': None, 'physical_network': u'provider', 'id': '90260147-33df-4e08-a953-10a1aa5f6834', 'network_type': u'flat'}]

The nova and heat logs show similar errors, which is expected, because the provider network port doesn't exist on the controller node; it only exists on the compute node. But I can't figure out how to get Magnum to deploy the cluster there.
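For what it's worth, I believe the failed binding can be inspected directly on the port itself. This is just a diagnostic sketch using the port ID from the error above; I haven't captured its output here:

```shell
# Inspect the port that failed to bind (ID taken from the neutron error above).
# A binding_vif_type of "binding_failed" would confirm neutron could not bind it
# on openstack-control, where nothing serves the flat provider network.
openstack port show ed644ee8-76b0-4ff1-80e3-30ff2c7d21c6 \
    -c binding_host_id -c binding_vif_type -c status
```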

I also tried to disable the nova-compute service on the controller node:

[root@openstack-control ~(keystone_admin)]# openstack compute service set --disable --disable-reason testing openstack-control nova-compute

...but then nova just reports:

ERROR nova.conductor.manager [req-01fedb41-3102-4eb6-b88a-ca99f3926037 51e54f8ec02f41e3903e0f40ced8a6db da668bc102784935b284613d29827e05 - default default] Failed to schedule instances: NoValidHost_Remote: No valid host was found.
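In case this is a placement/capacity problem rather than a networking one, I assume the compute node's availability to the scheduler can be checked with something like the following (a sketch; I haven't pasted the output here):

```shell
# Confirm the scheduler knows about openstack-compute1 and it has free capacity
openstack hypervisor list
openstack hypervisor show openstack-compute1

# Confirm its nova-compute service is enabled and up
openstack compute service list --service nova-compute
```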

I create the cluster template and the cluster as follows:

openstack coe cluster template create clustertemplate2 \
--coe kubernetes \
--server-type vm \
--image coreos_production_openstack \
--external-network provider \
--network-driver calico \
--volume-driver cinder \
--docker-volume-size 1 \
--docker-storage-driver cinder \
--dns-nameserver 8.8.8.8 \
--flavor m1.tiny \
--registry-enabled \
--labels cloud_provider_enabled=true

openstack coe cluster create \
--cluster-template clustertemplate2 \
--docker-volume-size 3 \
--keypair keypair1 \
--master-count 1 \
--node-count 1 \
--master-flavor m1.small \
--flavor m1.small \
--fixed-network provider  \
--fixed-subnet provider   \
--floating-ip-enabled \
cluster
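One workaround I've been wondering about (untested, and the aggregate and zone names below are made up): if the compute node were placed in its own availability zone, and the installed Magnum version supports the `availability_zone` label, the cluster could perhaps be pinned there:

```shell
# Hypothetical: put the compute node in a dedicated availability zone
openstack aggregate create --zone az-compute agg-compute
openstack aggregate add host agg-compute openstack-compute1
```

…and then add `--labels availability_zone=az-compute` to the cluster create command. I'd still prefer to understand the underlying problem, though.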

Here are some service outputs:

[root@openstack-control ~(keystone_admin)]# openstack coe service list
+----+------+------------------+-------+----------+-----------------+---------------------------+---------------------------+
| id | host | binary           | state | disabled | disabled_reason | created_at                | updated_at                |
+----+------+------------------+-------+----------+-----------------+---------------------------+---------------------------+
|  1 | None | magnum-conductor | up    | False    | None            | 2020-04-06T18:50:09+00:00 | 2020-04-09T10:42:42+00:00 |
+----+------+------------------+-------+----------+-----------------+---------------------------+---------------------------+
[root@openstack-control ~(keystone_admin)]# openstack network agent list
+--------------------------------------+--------------------+--------------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type         | Host               | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+--------------------+--------------------+-------------------+-------+-------+---------------------------+
| 2b24e18b-e325-413d-9071-99204d794e68 | Linux bridge agent | openstack-compute1 | None              | :-)   | UP    | neutron-linuxbridge-agent |
| 35df3f47-f8d2-4393-844f-ddf51392e192 | DHCP agent         | openstack-compute1 | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 7ab7a882-b855-45e8-a73f-66da41f8827f | Metadata agent     | openstack-compute1 | None              | :-)   | UP    | neutron-metadata-agent    |
| b1f2bb45-64fe-467c-82df-1d229a700465 | DHCP agent         | openstack-control  | nova              | :-)   | UP    | neutron-dhcp-agent        |
| db9e3ee7-ec45-4122-93bc-b94f520d0f2c | Metadata agent     | openstack-control  | None              | :-)   | UP    | neutron-metadata-agent    |
+--------------------------------------+--------------------+--------------------+-------------------+-------+-------+---------------------------+
[root@openstack-control ~(keystone_admin)]# openstack compute service list
+----+----------------+--------------------+----------+---------+-------+----------------------------+
| ID | Binary         | Host               | Zone     | Status  | State | Updated At                 |
+----+----------------+--------------------+----------+---------+-------+----------------------------+
|  2 | nova-conductor | openstack-control  | internal | enabled | up    | 2020-04-09T10:43:45.000000 |
|  3 | nova-scheduler | openstack-control  | internal | enabled | up    | 2020-04-09T10:43:46.000000 |
|  4 | nova-compute   | openstack-control  | nova     | enabled | up    | 2020-04-09T10:43:43.000000 |
|  5 | nova-compute   | openstack-compute1 | nova     | enabled | up    | 2020-04-09T10:43:47.000000 |
+----+----------------+--------------------+----------+---------+-------+----------------------------+

I'm obviously missing something that prevents the cluster VMs from being scheduled on the compute node, but I have no idea what. What do I need to configure so the cluster VMs are deployed there? Any help would be much appreciated.