trove guest instance starts normally, but trove-taskmanager still reports Error state

asked 2016-08-23 08:32:57 -0600

generalkalbasa

Good day, community! I have a multi-node OpenStack Mitaka deployment with a secure management network and an external network for instances, as in the OpenStack reference install guide.

The controller has access to both networks. trove-taskmanager uses RabbitMQ with rabbit_host set to CONTROLLER_MANAGEMENT_HOSTNAME, while the guest agent uses rabbit_host set to CONTROLLER_EXTERNAL_IP.

I have manually built an Ubuntu 14.04 guest image for Trove with MySQL 5.6. My trove-guestagent.conf looks like this (the rabbit settings are in both the oslo and default sections, just in case):

verbose = True
debug = True
rabbit_userid = openstack
rabbit_password = PASS
nova_proxy_admin_user = admin
nova_proxy_admin_pass = ADMIN_PASS
nova_proxy_admin_tenant_name = service
trove_auth_url = http://CONTROLLER_MANAGEMENT_HOSTNAME:35357/v3
rpc_backend = rabbit
log_dir = /var/log/trove/
log_file = trove-guestagent.log
datastore_manager = mysql
datastore_registry_ext = mysql:trove.guestagent.datastore.mysql.manager.Manager

rabbit_userid = openstack
rabbit_password = PASS

trove_auth_url is set to the management host (though the instance has no access to that network), but according to the logs the guest agent doesn't even try to connect to that URL, so I don't think it is related to the issue.

When I start the instance, the guest agent starts normally, connects to RabbitMQ, receives all the tasks it needs to do, and finishes them without any error. I end up with a fully functional MySQL database, accessible using the user/pass I entered during creation. After that, the guest agent repeats the following messages in the logs:

2016-08-23 12:53:44.685 1620 DEBUG oslo_service.periodic_task [-] Running periodic task Manager.update_status run_periodic_tasks /usr/local/lib/python2.7/dist-packages/oslo_service/
2016-08-23 12:53:44.686 1620 DEBUG trove.guestagent.datastore.manager [-] Update status called. update_status /usr/lib/python2.7/dist-packages/trove/guestagent/datastore/
2016-08-23 12:53:44.687 1620 DEBUG trove.guestagent.datastore.service [-] Determining status of DB server. update /usr/lib/python2.7/dist-packages/trove/guestagent/datastore/
2016-08-23 12:53:44.688 1620 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): /usr/bin/mysqladmin ping execute /usr/local/lib/python2.7/dist-packages/oslo_concurrency/
2016-08-23 12:53:44.737 1620 DEBUG oslo_concurrency.processutils [-] CMD "/usr/bin/mysqladmin ping" returned: 0 in 0.049s execute /usr/local/lib/python2.7/dist-packages/oslo_concurrency/
2016-08-23 12:53:44.741 1620 INFO trove.guestagent.datastore.mysql_common.service [-] MySQL Service Status is RUNNING.
2016-08-23 12:53:44.742 1620 DEBUG trove.guestagent.datastore.service [-] Casting set_status message to conductor (status is 'running'). set_status /usr/lib/python2.7/dist-packages/trove/guestagent/datastore/
2016-08-23 12:53:44.745 1620 DEBUG trove.conductor.api [-] Making async call to cast heartbeat for instance: 7a54fad1-b87f-484b-8ed0-a413f68cedad heartbeat /usr/lib/python2.7/dist-packages/trove/conductor/
2016-08-23 12:53:44.747 1620 DEBUG oslo_messaging._drivers.amqpdriver [-] CAST unique_id: b496930502f5448f8b9864b892d3162c exchange 'openstack' topic 'trove-conductor' _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/
2016-08-23 12:53:44.752 1620 DEBUG trove.guestagent.datastore.service [-] Successfully cast set_status. set_status /usr/lib/python2.7/dist-packages/trove/guestagent/datastore/

Could you please go through this thread? Let us know the outcome.

Please also check

sunnyarora ( 2016-08-25 21:39:13 -0600 )

I have checked those threads: the first one is about a different issue (stuck in build), and the second one does describe a similar issue but unfortunately has no solution.

generalkalbasa ( 2016-08-26 05:25:46 -0600 )

Most Trove issues are related to guest images not working properly, but mine does work (according to the logs and the ability to access MySQL). It somehow just does not report the "success" back to trove-taskmanager over the MQ.

generalkalbasa ( 2016-08-26 05:28:44 -0600 )

Look in your Trove Conductor log. Was the message from the guest successfully received and processed on the conductor? If you are running Mitaka, one issue from around this time is caused by a mismatch between the versions of oslo.context on the guest and the controller.

amrith ( 2016-08-26 07:20:12 -0600 )

The Trove conductor log from that day shows only the regular INFO and DEBUG messages from when it started at service restart, long before the test instance was launched. There are no messages with timestamps close to the testing time.

generalkalbasa ( 2016-08-26 11:42:05 -0600 )

1 answer

answered 2017-02-04 13:54:52 -0600

AlexZ

You need to add control_exchange = trove in your trove-guestagent.conf.
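For illustration, the change might look roughly like the fragment below. This is only a sketch: it assumes the option belongs in the [DEFAULT] section, which is where oslo.messaging normally reads control_exchange, and the rpc_backend line is simply carried over from the config in the question. The guest log in the question shows the heartbeat being cast to exchange 'openstack' (the oslo.messaging default), which is consistent with the option being unset on the guest while the conductor listens on a different exchange.

```ini
[DEFAULT]
# Existing setting, copied from the question's trove-guestagent.conf
rpc_backend = rabbit

# Added line: send guest casts (set_status, heartbeat) to the 'trove'
# exchange instead of the oslo.messaging default of 'openstack'.
# Placement in [DEFAULT] is an assumption; adjust if your file differs.
control_exchange = trove
```

The guest agent has to be restarted (or the instance rebuilt from an image containing the updated file) before the new value takes effect. If the fix works, the CAST line in the guest log should show exchange 'trove', and the conductor log should start showing activity around the guest's heartbeat timestamps.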

