OpenStack Rocky keypair issues

asked 2019-04-24 03:43:14 -0500

JeffBannister

I have installed OpenStack Rocky using Packstack. I cannot connect to instances using keypairs: the instances refuse the connection when I authenticate with the key. I have a similar setup with Ocata and it works fine. I've tried a number of ssh clients and the problem is the same. Could someone provide some guidance as to where I should look to solve this? I've tried verbose mode on ssh in Linux and looked at Wireshark traces, but I can't see anything that would suggest what's going wrong. TIA

I've noticed that the cirros instance created in Rocky has no .ssh directory, so there is no authorized_keys file. There is one in Ocata.

JeffBannister ( 2019-04-24 03:53:41 -0500 )

One of the steps in the documentation:

$ . demo-openrc
$ ssh-keygen -q -N ""
$ openstack keypair create --public-key ~/.ssh/ mykey
tjoen ( 2019-04-24 05:08:03 -0500 )

You launched the instance with the --key-name parameter? If yes, one reason for not getting the key is a problem with the metadata API. Use openstack network agent list and systemctl status *neutron* to see if the metadata agent is healthy.

My Rocky Packstack has no keypair problems.

Bernd Bausch ( 2019-04-24 05:35:35 -0500 )

You can also check instances’ console logs for errors regarding metadata and/or keys.
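
As a rough sketch of that console-log check, the helper below filters a log read from stdin for lines that usually point to metadata or SSH key trouble. The function name and the pattern list are assumptions made for illustration, not an exhaustive set of error messages.

```shell
#!/bin/sh
# Filter a console log (read from stdin) for lines that often indicate
# metadata or SSH key problems. The patterns are illustrative assumptions.
metadata_errors() {
    grep -iE 'instance-id|meta-?data|request failed|generating key|authorized_keys'
}

# Usage against a running cloud ("myinstance" is a placeholder name):
#   openstack console log show myinstance | metadata_errors
```

An empty result does not prove that metadata works; it only means none of the listed patterns appeared in the log.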

Bernd Bausch ( 2019-04-24 05:36:18 -0500 )

Thanks Bernd, I looked at the logs for cirros and for an ubuntu instance I created. There are errors around sshd - this is for cirros:

Starting dropbear sshd: failed to get instance-id of datasource
WARN: generating key of type ecdsa failed!
JeffBannister ( 2019-04-24 20:16:54 -0500 )

There's also a fail on the route:

WARN: failed: route add -net "" gw ""
cirros-ds 'net' up at 13.08
failed 1/20: up 13.49. request failed
failed 2/20: up 16.37. request failed

The instance is online

JeffBannister ( 2019-04-24 20:19:04 -0500 )

Your instance doesn't get access to metadata (failed 1/20: up 13.49. request failed, failed to get instance-id). I don't know why the route command fails.

Does the instance have network connectivity at all, i.e. did DHCP succeed? If yes, something is wrong with the metadata agent, I'd guess.

Bernd Bausch ( 2019-04-24 21:43:31 -0500 )

Yes it does get an IP address

$ ip route
default via dev eth0 via dev eth0 dev eth0  src 
PING ( 56 data bytes
64 bytes from seq=0 ttl=120 time=34.511 ms
JeffBannister ( 2019-04-24 22:24:59 -0500 )

Will try to troubleshoot the metadata agent. Thanks.

JeffBannister ( 2019-04-24 22:25:30 -0500 )

I guess the route command fails because the default route is already configured. cloud-init may not bother checking this; what counts is the result.

Yes, your focus should be on the metadata agent or Nova's metadata service configuration.

Bernd Bausch ( 2019-04-24 22:33:35 -0500 )

I'm tracing between the VM/Instance and the meta-data server and I can see that the instance sends:

GET /2009-04-04/meta-data/instance-id HTTP/1.1\r\n

But it's going to 169.254.169.254!

JeffBannister ( 2019-04-25 00:09:03 -0500 )

That's correct. 169.254.169.254 is the address of the metadata API. Yes, metadata is originally an AWS concept that Nova copied.

The question is: how is the address NAT'ed, and is anything receiving the request?

Bernd Bausch ( 2019-04-25 01:09:06 -0500 )

Yes sorry, I realised this after some googling. I'm still getting the fail. Any suggestions how/where to check the NAT mapping?

JeffBannister ( 2019-04-25 02:24:21 -0500 )

answered 2019-04-26 00:26:03 -0500

JeffBannister

updated 2019-04-26 00:27:06 -0500

Checked the router and the NAT table is fine - same as yours. My setup is that I have 3 nodes: controller, compute & network. I did a Wireshark trace on the network node as follows:

  • capture on a mirror port on br-int, and capture on the interface between the network node and the controller
  • on the instance I ran curl, which gives a 500 Internal Error

I can't attach the Wireshark trace file but I've attached a screen grab of the output (image). After the HTTP GET to 169, there's a set of AMQP messages between the network node and the controller which I'm assuming is for the meta-data - they don't seem to be in error. But then 169 returns a 500 error.

So I've looked at an older setup that works (Ocata) and I can see the network node issuing the GET to the controller, but my Rocky network node doesn't.

JeffBannister ( 2019-04-26 01:13:55 -0500 )

Ahh, my network node is going to for the meta-data even though metadata_agent.ini says nova_metadata_ip= I can see the error in the metadata log file that says it's trying port 8775. But I don't know why it's doing that.

JeffBannister ( 2019-04-26 01:36:02 -0500 )

Fixed! Since Queens, metadata_agent.ini must refer to the metadata server with nova_metadata_host=<IP address or hostname> rather than nova_metadata_ip (the change is on GitHub).
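
A quick way to spot this on another node is to check which of the two option names the file actually sets. The sketch below assumes a POSIX shell; the function name metadata_option is made up for this example, and /etc/neutron/metadata_agent.ini and the neutron-metadata-agent service name are the usual Packstack defaults, which may differ on other installs.

```shell
#!/bin/sh
# Report which metadata-server option a metadata_agent.ini file sets.
# Prints "host", "ip" (the old option name), "both", or "none".
# Commented-out lines are ignored.
metadata_option() {
    conf=$1
    has_host=no
    has_ip=no
    grep -Eq '^[[:space:]]*nova_metadata_host[[:space:]]*=' "$conf" && has_host=yes
    grep -Eq '^[[:space:]]*nova_metadata_ip[[:space:]]*=' "$conf" && has_ip=yes
    case "$has_host$has_ip" in
        yesyes) echo both ;;
        yesno)  echo host ;;
        noyes)  echo ip ;;
        *)      echo none ;;
    esac
}

# Usage on the network node (path is an assumption, see above):
#   metadata_option /etc/neutron/metadata_agent.ini
# After editing the file, restart the agent, for example:
#   sudo systemctl restart neutron-metadata-agent
```

If it prints "ip" or "none", the agent is not being given the metadata server's address under the option name it now expects.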

JeffBannister ( 2019-04-26 03:05:47 -0500 )

answered 2019-04-25 05:09:28 -0500

updated 2019-04-25 05:13:19 -0500

Here is how the metadata traffic is NAT’ed. I enter the network namespace for a router that is connected to the external network and print the netfilter nat table:

[stack@rocky ~(alice)]$ openstack router list
| ID                                   | Name            | Status | State | Distributed | HA   | Project                          |
| aad9670b-2eaa-4134-b350-c1acc6e9ac65 | saturn-router-2 | ACTIVE | UP    | None        | None | 6376262d8f524f368ba4fe14d683d5eb |
| f39ad645-0f7b-4296-9d3c-79cdf393553a | saturn-router   | ACTIVE | UP    | None        | None | 6376262d8f524f368ba4fe14d683d5eb |
[stack@rocky ~(alice)]$ sudo ip netns exec qrouter-f39ad645-0f7b-4296-9d3c-79cdf393553a /bin/bash
[root@rocky stack(alice)]# ip r
default via dev qg-9115266f-e1 dev qg-9115266f-e1 proto kernel scope link src dev qr-32f354fe-e9 proto kernel scope link src 
[root@rocky stack(alice)]# iptables -t nat -S
-N neutron-l3-agent-OUTPUT
-N neutron-l3-agent-POSTROUTING
-N neutron-l3-agent-PREROUTING
-N neutron-l3-agent-float-snat
-N neutron-l3-agent-snat
-N neutron-postrouting-bottom
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d -j DNAT --to-destination
-A neutron-l3-agent-POSTROUTING ! -i qg-9115266f-e1 ! -o qg-9115266f-e1 -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697

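The last line takes care of metadata traffic: requests to the metadata address on TCP port 80 that arrive on one of the router's qr- interfaces are redirected to local port 9697 inside the namespace.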
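
To repeat this check on another router, a small filter can pull the redirect port out of the iptables listing, after which you can see whether anything listens on it. This is only a sketch: the function name metadata_redirect_port is invented here, and <router-id> is a placeholder for your own router's ID.

```shell
#!/bin/sh
# Read `iptables -t nat -S` output on stdin and print the target port of
# every REDIRECT rule in the neutron-l3-agent-PREROUTING chain.
metadata_redirect_port() {
    sed -n 's/^-A neutron-l3-agent-PREROUTING .* -j REDIRECT --to-ports \([0-9][0-9]*\).*$/\1/p'
}

# Usage on the network node (replace <router-id> with a real router ID):
#   sudo ip netns exec qrouter-<router-id> iptables -t nat -S | metadata_redirect_port
# Then check for a listener on the reported port in the same namespace:
#   sudo ip netns exec qrouter-<router-id> ss -lnt
```

If the rule exists and something is listening on that port, the NAT side is probably fine and the problem is more likely further along, between the metadata agent and Nova.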

