OpenStack "Stein": Kubernetes cluster stuck in "CREATE_IN_PROGRESS"

asked 2019-08-02 06:54:23 -0500

hoover

updated 2019-08-02 07:00:35 -0500

Hi folks,

I've set up a single OpenStack "Stein" node on rather beefy hardware (HPE DL380 G7, 128 GB RAM, 24 cores, 1.5 TB RAID-6) in order to experiment with Terraform and Kubernetes on OpenStack.

I initially used RDO packstack on CentOS 7 to set up the default services, and everything runs rather well (creating instances, volumes, floating IPs, network access, simple Terraform setups etc.).

In order to be able to install the kubernetes stuff, I installed the following components manually following the openstack installation guide:

o heat

o magnum

All the tests from the install guide work fine. I've also enabled debugging in heat and magnum to track down possible errors.

When I try to deploy the kubernetes-cluster example however, the cluster never gets past the "CREATE_IN_PROGRESS" state.

$ openstack coe cluster show kubernetes-cluster
+---------------------+------------------------------------------------------------+
| Field               | Value                                                      |
+---------------------+------------------------------------------------------------+
| status              | CREATE_IN_PROGRESS                                         |
| cluster_template_id | 3b421e0d-eb37-49d3-9b88-862ae98455d6                       |
| node_addresses      | []                                                         |
| uuid                | 04964ffd-d7cb-4e4b-bb60-fbbfdc12405d                       |
| stack_id            | f249f89b-cfe2-4c55-9055-da16938b7a2d                       |
| status_reason       | None                                                       |
| created_at          | 2019-08-02T10:32:02+00:00                                  |
| updated_at          | 2019-08-02T10:32:14+00:00                                  |
| coe_version         | None                                                       |
| labels              | {}                                                         |
| faults              |                                                            |
| keypair             | hoover                                                     |
| api_address         | None                                                       |
| master_addresses    | []                                                         |
| create_timeout      | 60                                                         |
| node_count          | 1                                                          |
| discovery_url       | https://discovery.etcd.io/aXXXXXXXXX   #obscured intentionally :-) |
| master_count        | 1                                                          |
| container_version   | None                                                       |
| name                | kubernetes-cluster                                         |
| master_flavor_id    | m1.small                                                   |
| flavor_id           | m1.small                                                   |
+---------------------+------------------------------------------------------------+
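For reference, `create_timeout` in the output above is given in minutes as far as I understand it, so a cluster that has been running for about half an hour has not hit the 60 minute limit yet and would not be marked as failed so far. A minimal sketch for checking how far into that window the cluster is, assuming GNU `date` is available (the helper name `elapsed_minutes` is made up for illustration):

```shell
# Minutes elapsed between two ISO 8601 timestamps.
# Assumption: GNU date (uses the -d option to parse the timestamps).
elapsed_minutes() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) / 60 ))
}

# Possible usage against the cluster shown above (not run here):
#   created=$(openstack coe cluster show kubernetes-cluster -f value -c created_at)
#   elapsed_minutes "$created" "$(date -u +%Y-%m-%dT%H:%M:%S+00:00)"
```

If the result is still below `create_timeout`, the stack is probably just waiting rather than failed.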

Here's the output of heat according to the magnum debugging guide:

$ heat stack-list -n
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+----------------------------------------------------------------------------------------------------------+--------------------+----------------------+--------------+--------------------------------------+----------------------------------+
| id                                   | stack_name                                                                                               | stack_status       | creation_time        | updated_time | parent                               | project                          |
+--------------------------------------+----------------------------------------------------------------------------------------------------------+--------------------+----------------------+--------------+--------------------------------------+----------------------------------+
| f249f89b-cfe2-4c55-9055-da16938b7a2d | kubernetes-cluster-e57ftcjbezrq                                                                          | CREATE_IN_PROGRESS | 2019-08-02T10:32:13Z | None         | None                                 | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 629bcc3c-969c-4121-aa5b-ca46e49fdc9b | kubernetes-cluster-e57ftcjbezrq-network-zl2j6nxuyifu                                                     | CREATE_COMPLETE    | 2019-08-02T10:32:17Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 57cd9384-93cc-4e71-a5a8-46de6bbf3162 | kubernetes-cluster-e57ftcjbezrq-network-zl2j6nxuyifu-network_switch-uderfpvdkra5                         | CREATE_COMPLETE    | 2019-08-02T10:32:24Z | None         | 629bcc3c-969c-4121-aa5b-ca46e49fdc9b | 9d7efd87bb6e45acb73e007fcf3cd373 |
| ce78b3d0-664f-4120-b9a0-2cec21dad554 | kubernetes-cluster-e57ftcjbezrq-api_lb-gce5rvqsulqb                                                      | CREATE_COMPLETE    | 2019-08-02T10:32:29Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| e87e8db0-227d-4522-b56b-2143dcbec14f | kubernetes-cluster-e57ftcjbezrq-etcd_lb-imruey6tptl7                                                     | CREATE_COMPLETE    | 2019-08-02T10:32:29Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 951f578f-57d7-4309-a089-b62e2a215306 | kubernetes-cluster-e57ftcjbezrq-kube_masters-zvqibk5nm6iy                                                | CREATE_COMPLETE    | 2019-08-02T10:32:31Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 4f5e07db-267e-46ff-a1d3-327fe2f24c5a | kubernetes-cluster-e57ftcjbezrq-kube_masters-zvqibk5nm6iy-0-3etsojghnycz                                 | CREATE_COMPLETE    | 2019-08-02T10:32:35Z | None         | 951f578f-57d7-4309-a089-b62e2a215306 | 9d7efd87bb6e45acb73e007fcf3cd373 |
| ace07cb3-2623-4b39-8a00-61b7cafafb09 | kubernetes-cluster-e57ftcjbezrq-kube_masters-zvqibk5nm6iy-0-3etsojghnycz-api_address_switch-byooeo3ne6jb | CREATE_COMPLETE    | 2019-08-02T10:32:48Z | None         | 4f5e07db-267e-46ff-a1d3-327fe2f24c5a | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 09973ef5-364e-4e7e-9d29-f06877ed34b3 | kubernetes-cluster-e57ftcjbezrq-api_address_lb_switch-ioysjhsf255d                                       | CREATE_COMPLETE    | 2019-08-02T10:33:29Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| afb5ea9d-26e3-4340-bd3e-74f73314fb87 | kubernetes-cluster-e57ftcjbezrq-etcd_address_lb_switch-yblh6xdha4lu                                      | CREATE_COMPLETE    | 2019-08-02T10:33:30Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 91b9e9cb-84d3-406b-be5a-bc58895b9c10 | kubernetes-cluster-e57ftcjbezrq-kube_minions-vnpphr3wksvg                                                | CREATE_IN_PROGRESS | 2019-08-02T10:33:31Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| e81bc460-6508-471f-9fed-ea4da35e0b16 | kubernetes-cluster-e57ftcjbezrq-api_address_floating_switch-bwd663eqgwen                                 | CREATE_COMPLETE    | 2019-08-02T10:33:31Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 66a156bd-5730-4c24-a12b-2656e5af069c | kubernetes-cluster-e57ftcjbezrq-kube_minions-vnpphr3wksvg-0-idwti3hdczqn                                 | CREATE_IN_PROGRESS | 2019-08-02T10:33:34Z | None         | 91b9e9cb-84d3-406b-be5a-bc58895b9c10 | 9d7efd87bb6e45acb73e007fcf3cd373 |
+--------------------------------------+----------------------------------------------------------------------------------------------------------+--------------------+----------------------+--------------+--------------------------------------+----------------------------------+

As you can see, three components are "IN_PROGRESS" after over half an hour of runtime:

$ heat stack-list -n | grep PROG
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
| f249f89b-cfe2-4c55-9055-da16938b7a2d | kubernetes-cluster-e57ftcjbezrq                                                                          | CREATE_IN_PROGRESS | 2019-08-02T10:32:13Z | None         | None                                 | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 91b9e9cb-84d3-406b-be5a-bc58895b9c10 | kubernetes-cluster-e57ftcjbezrq-kube_minions-vnpphr3wksvg                                                | CREATE_IN_PROGRESS | 2019-08-02T10:33:31Z | None         | f249f89b-cfe2-4c55-9055-da16938b7a2d | 9d7efd87bb6e45acb73e007fcf3cd373 |
| 66a156bd-5730-4c24-a12b-2656e5af069c | kubernetes-cluster-e57ftcjbezrq-kube_minions-vnpphr3wksvg-0-idwti3hdczqn                                 | CREATE_IN_PROGRESS | 2019-08-02T10:33:34Z | None         | 91b9e9cb-84d3-406b-be5a-bc58895b9c10 | 9d7efd87bb6e45acb73e007fcf3cd373 |
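Since a parent stack stays "IN_PROGRESS" as long as any of its nested stacks does, the interesting entry is the innermost one. A rough sketch that picks it out of the table, assuming the column layout printed by `heat stack-list -n` above (id, stack_name, stack_status, creation_time, updated_time, parent, project); the function name `stuck_leaf_stacks` is made up for illustration:

```shell
# Print the in-progress stacks that are not the parent of another
# in-progress stack, i.e. the innermost stacks still being created.
# Assumption: input is the ASCII table from "heat stack-list -n",
# with the parent stack id in the sixth column.
stuck_leaf_stacks() {
  awk -F'|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    $4 ~ /IN_PROGRESS/ {
      id = trim($2)
      name[id] = trim($3)
      has_child[trim($7)] = 1   # remember that the parent has a pending child
      order[++n] = id
    }
    END {
      for (i = 1; i <= n; i++)
        if (!(order[i] in has_child))
          print order[i], name[order[i]]
    }
  '
}

# Possible usage (not run here):
#   heat stack-list -n | stuck_leaf_stacks
```

For the listing above this should leave only the `kube_minions-...-0-idwti3hdczqn` stack, whose resources can then be inspected with `openstack stack resource list <stack id>`.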

Nothing is listed as "FAILED", so I assume that the "minions" part is stuck somewhere, but I have no idea where it hangs or where to look for further clues.

Checking the console log in Horizon for the minion VM in question, I see the following messages:

kubernetes-cluster-e57ftcjbezrq-minion-0 login: [   48.024560] random: crng init done
[  107.674316] device-mapper: thin: Data device (dm-2) discard unsupported: Disabling discard passdown.
[  188.642262] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready

I can access the master node via ssh just fine (username "fedora"). The master node has no load whatsoever, so I assume (again :-)) that its installation has finished successfully.

I'd ...
