Why does Metadata Agent keep falling over?

I've been running Icehouse neutron-metadata-agent (with ML2 OVS and network namespaces) since I upgraded from Havana in August.

Within the last few days it's started periodically failing. Restarting the metadata-agent usually fixes it; sometimes I also need to restart the dhcp-agent, but I'm not 100% sure that is required. The service then works sometimes for hours, sometimes for minutes, before going back into the failed state.

The neutron-ns-metadataproxy-<uuid>.log shows this for every request while it's in the failed state:

2014-11-15 12:32:57.316 30563 INFO neutron.wsgi [-] (30563) accepted ('10.10.161.17', 40973)

2014-11-15 12:32:57.317 30563 DEBUG neutron.agent.metadata.namespace_proxy [-] Request: GET /latest/meta-data/ HTTP/1.0
Accept: */*
Content-Type: text/plain
Host: 169.254.169.254 __call__ /usr/lib/python2.7/dist-packages/neutron/agent/metadata/namespace_proxy.py:68
2014-11-15 12:32:57.318 30563 ERROR neutron.agent.metadata.namespace_proxy [-] Unexpected error.
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy Traceback (most recent call last):
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/namespace_proxy.py", line 74, in __call__
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy     req.body)
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/namespace_proxy.py", line 105, in _proxy_request
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy     connection_type=UnixDomainHTTPConnection)
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy   File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1569, in request
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy     (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy   File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1316, in _request
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy     (response, content) = self._conn_request(conn, request_uri, method, body, headers)
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy   File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1251, in _conn_request
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy     conn.connect()
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/namespace_proxy.py", line 48, in connect
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy     self.sock.connect(cfg.CONF.metadata_proxy_socket)
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy   File "/usr/lib/python2.7/dist-packages/eventlet/greenio.py", line 192, in connect
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy     while not socket_connect(fd, address):
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy   File "/usr/lib/python2.7/dist-packages/eventlet/greenio.py", line 39, in socket_connect
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy     raise socket.error(err, errno.errorcode[err])
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy error: [Errno 111] ECONNREFUSED
2014-11-15 12:32:57.318 30563 TRACE neutron.agent.metadata.namespace_proxy

It's unclear to me what is refusing connections. Since I'm not restarting the nova-api service that handles the actual request, it seems unlikely that nova-api is the issue, even though it's the most likely thing the metadata-agent would be connecting to.
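Reading the traceback, the refused connection is to the Unix domain socket the per-namespace proxy uses to reach the metadata agent (cfg.CONF.metadata_proxy_socket), so the failure looks local to the agent, before anything reaches nova-api. A minimal probe like this reproduces the check; the socket path here is the Icehouse default, so substitute whatever your metadata_agent.ini sets:

# Minimal probe of the metadata agent's Unix socket. The path is an
# assumption (the Icehouse default); adjust to your configured
# metadata_proxy_socket value.
import socket

SOCKET_PATH = '/var/lib/neutron/metadata_proxy'

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
try:
    sock.connect(SOCKET_PATH)
    print('metadata agent accepted the connection')
except socket.error as err:
    # errno 111 (ECONNREFUSED) here matches the proxy traceback above
    print('connect failed: %s' % err)
finally:
    sock.close()

When the agent is in the failed state I'd expect this to fail with the same errno 111, and to connect cleanly right after a restart.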

I do think metadata requests have increased; it seems to be between 1-2 per second now, but that doesn't seem like a genuinely high request rate for a web service. Server load is low (90-95% idle), and the ulimit for file descriptors is set at 8k while its open files never go much above 1k (counting the entries in /proc/<pid>/fd/).
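For reference, I'm counting descriptors with something like this (the PID is just an example; point it at the actual neutron-metadata-agent process):

# Count open file descriptors for a process, equivalent to
# ls /proc/<pid>/fd | wc -l
import os

pid = 30563  # example PID; use the neutron-metadata-agent's real PID
print(len(os.listdir('/proc/%d/fd' % pid)))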

Does anyone have a clue what's going on here and, better yet, how to fix it?
