During stack deployment of a k8s master node, installation eventually times out and cluster create fails

asked 2018-07-09 11:21:13 -0500

garypen


I'm running OpenStack Queens. When I try to deploy a k8s cluster using the latest fedora-atomic image (28-20180625.1), cluster creation fails with a timeout.

Examining log files on the master node, I note that:

2018-07-09 15:30:04,400 - util.py[DEBUG]: Running command ['/var/lib/cloud/instance/scripts/part-013'] with allowed return codes [0] (shell=False, capture=False)
2018-07-09 15:31:01,403 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-013 [5]
2018-07-09 15:31:01,451 - util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-013 [5]
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 802, in runparts
    subp(prefix + [exe_path], capture=False)
  File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 1858, in subp
ProcessExecutionError: Unexpected error while running command.
Command: ['/var/lib/cloud/instance/scripts/part-013']
Exit code: 5
Reason: -
Stdout: -
Stderr: -
2018-07-09 15:31:01,492 - util.py[DEBUG]: Running command ['/var/lib/cloud/instance/scripts/part-014'] with allowed return codes [0] (shell=False, capture=False)
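
For a long log, the failing script and its exit code can be picked out programmatically. This is a minimal sketch, assuming the log is at /var/log/cloud-init.log and that failure lines follow the WARNING format shown above; `failed_scripts` and `failed_scripts_in_file` are hypothetical helper names, not part of cloud-init.

```python
import re

# Matches the WARNING lines cloud-init writes when a user-data script fails,
# e.g. "<date> <time> - util.py[WARNING]: Failed running <script> [<code>]".
# The DEBUG duplicate of the same message is deliberately not matched.
FAILED_RE = re.compile(
    r"^(?P<ts>\S+ \S+) - util\.py\[WARNING\]: "
    r"Failed running (?P<script>\S+) \[(?P<code>-?\d+)\]"
)


def failed_scripts(log_lines):
    """Return (timestamp, script path, exit code) for each failed script."""
    failures = []
    for line in log_lines:
        match = FAILED_RE.match(line)
        if match:
            failures.append(
                (match.group("ts"), match.group("script"), int(match.group("code")))
            )
    return failures


def failed_scripts_in_file(path="/var/log/cloud-init.log"):
    """Scan a log file; the default path is an assumption, adjust as needed."""
    with open(path) as log:
        return failed_scripts(log)
```

Calling `failed_scripts_in_file()` as root on the master node should list each failed part script with its exit code, which narrows down which script to rerun by hand.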

part-013 is a very simple script, which runs:


. /etc/sysconfig/heat-params

set -ux

atomic install \
--storage ostree \
--system \
--system-package no \
--set REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt \
--name heat-container-agent \

systemctl start heat-container-agent

I tried running the atomic install command manually on the master node (as root) and added the --debug flag. This results in:

[root@dwa3-k8s-cluster-uj3kfq4qehvz-master-0 log]# atomic --debug install --storage ostree --system --system-package no --set REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt --name heat-container-agent docker.io/openstackmagnum/heat-container-agent:rawhide
Namespace(_class=<class 'Atomic.install.Install'>, args=[], assumeyes=False, debug=True, display=False, func='install', ignore=False, image='docker.io/openstackmagnum/heat-container-agent:rawhide', name='heat-container-agent', opt1=None, opt2=None, opt3=None, profile=False, remote=None, runtime=None, setvalues=['REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt'], storage='ostree', system=True, system_package='no', user=False)

Traceback (most recent call last):
  File "/bin/atomic", line 185, in <module>
  File "/usr/lib/python2.7/site-packages/Atomic/install.py", line 118, in install
    remote_image_obj = be.make_remote_image(self.args.image)
  File "/usr/lib/python2.7/site-packages/Atomic/backends/_ostree.py", line 159, in make_remote_image
  File "/usr/lib/python2.7/site-packages/Atomic/objects/image.py", line 157, in populate_remote_inspect_info
    remote_inspect_info = self.remote_inspect()
  File "/usr/lib/python2.7/site-packages/Atomic/objects/image.py", line 174, in remote_inspect
    inspect_info = ri.inspect()
  File "/usr/lib/python2.7/site-packages/Atomic/discovery.py", line 41, in inspect
    inspect_data = util.skopeo_inspect("docker://{}".format(self.fqdn), return_json=True)
  File "/usr/lib/python2.7/site-packages/Atomic/util.py", line 348, in skopeo_inspect
    raise ValueError(error)

At this point, I'm out of ideas. Any suggestions?


1 answer


answered 2018-07-12 03:24:48 -0500

garypen

I'll answer my own question: the problem was caused by DNS. The private network didn't have a default DNS name server, so the created node used a default for the private network (odd; I would have thought it would use the name server specified in the cluster template, but ...). Anyway, I'm now stuck on a different problem with kube master creation. It looks similar to https://ask.openstack.org/en/question/113963/hi-everyone-ive-tried-magnum-in-queen-version-with-centos-and-the-probelm-is-kube-matser-in-stucked-in-create-in-progress-state/ except I'm using Fedora.
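
For anyone hitting the same timeout, a quick way to rule DNS in or out on the master node is to look at which name servers the node was given and whether the registry host resolves at all. This is a rough sketch, not an official Magnum or atomic tool: `parse_nameservers` and `can_resolve` are hypothetical helper names, /etc/resolv.conf is assumed to be where the node's resolver configuration lives, and docker.io is taken from the image reference in the question (the registry endpoint actually contacted may be a different host).

```python
import socket


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (which start with '#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def can_resolve(hostname):
    """True if the system resolver returns at least one address for hostname."""
    try:
        return bool(socket.getaddrinfo(hostname, 443))
    except socket.gaierror:
        return False
```

On the node, something like `parse_nameservers(open("/etc/resolv.conf").read())` shows which servers are in use, and `can_resolve("docker.io")` returning False would point at name resolution rather than at the atomic command itself.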
