
cloud-init fails only in a particular tenant

asked 2015-04-28 10:33:10 -0500

kjtanaka

updated 2015-04-29 12:05:34 -0500

We have a fairly basic OpenStack Juno deployment with Neutron, built by following the "OpenStack Installation Guide for Ubuntu 14.04". It has worked well, but recently cloud-init started failing to inject SSH keys, and nova console-log <server> shows the errors below. Stranger still, the problem occurs in only one particular tenant. Is it a quota-setting issue, or something else? Any help or comments would be appreciated.

2015-04-28 14:18:07,546 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [50/120s]: request error [(<urllib3.connectionpool.HTTPConnectionPool object at 0x7fdf0af35450>, 'Connection to 169.254.169.254 timed out. (connect timeout=50.0)')]
2015-04-28 14:18:58,600 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [101/120s]: request error [(<urllib3.connectionpool.HTTPConnectionPool object at 0x7fdf0af355d0>, 'Connection to 169.254.169.254 timed out. (connect timeout=50.0)')]
2015-04-28 14:19:16,621 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [119/120s]: request error [(<urllib3.connectionpool.HTTPConnectionPool object at 0x7fdf0af350d0>, 'Connection to 169.254.169.254 timed out. (connect timeout=17.0)')]
2015-04-28 14:19:17,623 - DataSourceEc2.py[CRITICAL]: Giving up on md from ['http://169.254.169.254/2009-04-04/meta-data/instance-id'] after 120 seconds
2015-04-28 14:20:07,675 - url_helper.py[WARNING]: Calling 'http://10.23.0.3//latest/meta-data/instance-id' failed [50/120s]: request error [(<urllib3.connectionpool.HTTPConnectionPool object at 0x7fdf0af35990>, 'Connection to 10.23.0.3 timed out. (connect timeout=50.0)')]
2015-04-28 14:20:58,728 - url_helper.py[WARNING]: Calling 'http://10.23.0.3//latest/meta-data/instance-id' failed [101/120s]: request error [(<urllib3.connectionpool.HTTPConnectionPool object at 0x7fdf0af35210>, 'Connection to 10.23.0.3 timed out. (connect timeout=50.0)')]
2015-04-28 14:21:16,749 - url_helper.py[WARNING]: Calling 'http://10.23.0.3//latest/meta-data/instance-id' failed [119/120s]: request error [(<urllib3.connectionpool.HTTPConnectionPool object at 0x7fdf0af351d0>, 'Connection to 10.23.0.3 timed out. (connect timeout=17.0)')]
2015-04-28 14:21:17,751 - DataSourceCloudStack.py[CRITICAL]: Giving up on waiting for the metadata from ['http://10.23.0.3//latest/meta-data/instance-id'] after 120 seconds
Apr 28 14:21:58 ubuntu pollinate[713]: ERROR: Network communication failed [28]\n14:21:18.264414 * Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0  0     0    0     0
   0     0      0      0 --:--:--  0:00:02 --:--:--     014:21:21.778354 * Resolving timed out after 3513 milliseconds
  0     0    0     0    0     0      0      0 --:--:--  0:00:40 --:--:--     0
14:21:58.303965 * Closing connection 0
curl: (28) Resolving timed out after 3513 milliseconds
2015-04-28 14:21:58,364 - util.py[WARNING]: Running seed_random (<module 'cloudinit.config.cc_seed_random' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_seed_random.pyc'>) failed

Comments

Have you checked whether the problem is the image rather than the tenant? Can you have that tenant test with a CirrOS test image? Also, when you reboot other instances, can they reach your metadata server without any issues?

omar-munoz ( 2015-04-28 13:52:56 -0500 )
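The reachability check suggested in the comment above could be run from inside an instance with something like the following. This is only a minimal sketch, assuming Python 3 is available in the guest; the helper name can_connect is made up for illustration and is not part of cloud-init or any OpenStack tool. It only tests whether a TCP connection opens, not whether the metadata service returns valid data.

```python
import socket


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it does not send an HTTP request.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect("169.254.169.254") in an instance from the failing tenant and again in one from a working tenant would show whether the timeout seen by cloud-init can be reproduced by hand.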

Thanks for the comment. This issue happens only in this particular tenant. The image is shared across all tenants, and everything works if I switch to another tenant.

kjtanaka ( 2015-04-28 14:20:55 -0500 )

Also, the network is shared. Other than tenant id, everything is the same.

kjtanaka ( 2015-04-28 14:32:26 -0500 )

Well, if you want to check your quotas, try 'nova hypervisor-stats' to confirm that your compute nodes have room. The following commands just show the quotas, so you can see whether they are being enforced. I believe '-1' means unlimited.

nova quota-show --tenant $UUIDofTenant

neutron quota-show --tenant-id $UUIDofTenant

omar-munoz ( 2015-04-29 10:39:31 -0500 )

Thanks again, Omar. But I'm not sure the issue is related to quota settings. I have actually increased the quotas in both nova and neutron, but nothing changed.

kjtanaka ( 2015-04-29 12:33:08 -0500 )

1 answer


answered 2016-11-16 08:47:33 -0500

mvazquezc

updated 2016-11-16 08:48:06 -0500

I don't know if you're still facing this issue, but I ran into a similar situation, and in the end it was related to security groups. Could you check whether instances in this particular tenant have HTTP access to 169.254.169.254 allowed in any of their security groups?
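To make that check concrete, here is a rough sketch of how a tenant's security group rules could be tested for outbound HTTP access to the metadata address. It assumes the rules have already been fetched (for example from the Neutron API) as dicts using the usual rule attributes direction, ethertype, protocol, port_range_min, port_range_max, remote_ip_prefix and remote_group_id. The function names are made up for illustration, and the sketch only inspects rule data; it says nothing about how the metadata request is routed once it leaves the instance.

```python
import ipaddress

METADATA_IP = ipaddress.ip_address("169.254.169.254")


def rule_allows_metadata(rule, port=80):
    """Return True if a single egress rule permits TCP to the metadata IP on `port`.

    Hypothetical helper: `rule` is a dict shaped like a Neutron security group rule.
    """
    if rule.get("direction") != "egress":
        return False
    if rule.get("ethertype", "IPv4") != "IPv4":
        return False
    # A missing protocol is treated as "any protocol".
    if rule.get("protocol") not in (None, "tcp", "6", 6):
        return False
    # A rule scoped to another security group cannot match the metadata IP.
    if rule.get("remote_group_id") is not None:
        return False
    low, high = rule.get("port_range_min"), rule.get("port_range_max")
    if low is not None and high is not None and not (low <= port <= high):
        return False
    prefix = rule.get("remote_ip_prefix")
    if prefix is None:
        return True  # no prefix means any destination
    return METADATA_IP in ipaddress.ip_network(prefix, strict=False)


def group_allows_metadata(rules, port=80):
    """Return True if any rule in the list permits the metadata request."""
    return any(rule_allows_metadata(rule, port) for rule in rules)
```

Running group_allows_metadata over the rules of the failing tenant's security groups and over those of a working tenant would show whether the two differ in the way described above.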

Hope it helps!
