Problems with nova-compute on controller-0 during an Ocata minor update

asked 2018-07-30 04:53:49 -0500

Hello,

We are running a minor update on our Ocata environment (3 compute nodes + 3 controllers). The update fails with a timeout when we run `openstack overcloud update stack -i overcloud`.

Using `curl -s -u admin:<passwd> "http://192.168.24.10:1993/;csv" | egrep -vi "(frontend|backend)" | awk -F',' '{ print $1" "$2" "$18 }' | grep DOWN` we found that several services are down, most of them on controller-0:

aodh overcloud-controller-0.internalapi.localdomain DOWN
ceilometer overcloud-controller-0.internalapi.localdomain DOWN
gnocchi overcloud-controller-0.internalapi.localdomain DOWN
horizon overcloud-controller-0.internalapi.localdomain DOWN
keystone_admin overcloud-controller-0.ctlplane.localdomain DOWN
keystone_public overcloud-controller-0.internalapi.localdomain DOWN
nova_placement overcloud-controller-0.internalapi.localdomain DOWN
panko overcloud-controller-0.internalapi.localdomain DOWN
panko overcloud-controller-1.internalapi.localdomain DOWN
panko overcloud-controller-2.internalapi.localdomain DOWN
redis overcloud-controller-1.internalapi.localdomain DOWN
redis overcloud-controller-2.internalapi.localdomain DOWN
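For anyone who wants to repeat this check without the shell pipeline, here is a minimal Python sketch of the same filter. It assumes the HAProxy stats CSV has already been fetched as text (for example with the curl command above); the function name `down_servers` is made up for illustration. One small difference from the pipeline: it tests only the status field for `DOWN`, whereas `grep DOWN` matches anywhere in the printed line.

```python
import csv
import io

STATUS_FIELD = 17  # zero-based index of the column awk prints as $18


def down_servers(csv_text):
    """Return (proxy, server, status) tuples for rows whose status contains DOWN.

    Hypothetical helper: mirrors the egrep/awk/grep pipeline from the question.
    """
    rows = []
    for fields in csv.reader(io.StringIO(csv_text)):
        if len(fields) <= STATUS_FIELD:
            continue  # blank or short line, nothing to report
        line = ",".join(fields).lower()
        if "frontend" in line or "backend" in line:
            continue  # same effect as: egrep -vi "(frontend|backend)"
        proxy, server, status = fields[0], fields[1], fields[STATUS_FIELD]
        if "DOWN" in status:
            rows.append((proxy, server, status))
    return rows
```

Calling `down_servers(text)` on the stats output should give one tuple per line of the list above.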

When investigating this, I can see that nova-compute and other services are restarting every 3 seconds:

```
[heat-admin@overcloud-controller-0 ~]$ sudo systemctl status openstack-nova-compute
● openstack-nova-compute.service - OpenStack Nova Compute Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2018-07-30 09:49:37 UTC; 691ms ago
 Main PID: 731938 (nova-compute)
    Tasks: 1
   Memory: 91.6M
   CGroup: /system.slice/openstack-nova-compute.service
           └─731938 /usr/bin/python2 /usr/bin/nova-compute

Jul 30 09:49:33 overcloud-controller-0 systemd[1]: Starting OpenStack Nova Compute Server...
Jul 30 09:49:37 overcloud-controller-0 systemd[1]: Started OpenStack Nova Compute Server.
```
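One rough way to confirm the restart loop is to take two `systemctl status` samples a few seconds apart and compare the Main PID. The sketch below assumes the status output is captured as plain text; `main_pid` and `was_restarted` are hypothetical helper names, not part of any OpenStack tool.

```python
import re


def main_pid(status_text):
    """Return the Main PID reported in `systemctl status` output, or None."""
    match = re.search(r"Main PID:\s*(\d+)", status_text)
    return int(match.group(1)) if match else None


def was_restarted(earlier_status, later_status):
    """True when both samples report a Main PID and the two PIDs differ."""
    before = main_pid(earlier_status)
    after = main_pid(later_status)
    return before is not None and after is not None and before != after
```

A changed PID between two samples means the service was restarted in between, which matches the very short uptime (`691ms ago`) in the output above.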

When we check `sudo tail -f /var/log/nova/nova-compute.log`, this is what we see:

2018-07-30 09:28:21.365 427967 ERROR oslo_service.service [req-116658f0-66c9-45f0-a63e-d9070b6cee37 - - - - -] Error starting thread.: InternalServerError: (pymysql.err.InternalError) (1054, u"Unknown column 'nodes.version' in 'field list'") [SQL: u'SELECT nodes.created_at AS nodes_created_at, nodes.updated_at AS nodes_updated_at, nodes.version AS nodes_version, nodes.id AS nodes_id, nodes.uuid AS nodes_uuid, nodes.instance_uuid AS nodes_instance_uuid, nodes.name AS nodes_name, nodes.chassis_id AS nodes_chassis_id, nodes.power_state AS nodes_power_state, nodes.target_power_state AS nodes_target_power_state, nodes.provision_state AS nodes_provision_state, nodes.target_provision_state AS nodes_target_provision_state, nodes.provision_updated_at AS nodes_provision_updated_at, nodes.last_error AS nodes_last_error, nodes.instance_info AS nodes_instance_info, nodes.properties AS nodes_properties, nodes.driver AS nodes_driver, nodes.driver_info AS nodes_driver_info, nodes.driver_internal_info AS nodes_driver_internal_info, nodes.clean_step AS nodes_clean_step, nodes.resource_class AS nodes_resource_class, nodes.raid_config AS nodes_raid_config, nodes.target_raid_config AS nodes_target_raid_config, nodes.reservation AS nodes_reservation, nodes.conductor_affinity AS nodes_conductor_affinity, nodes.maintenance AS nodes_maintenance, nodes.maintenance_reason AS nodes_maintenance_reason, nodes.console_enabled AS nodes_console_enabled, nodes.inspection_finished_at AS nodes_inspection_finished_at, nodes.inspection_started_at AS nodes_inspection_started_at, nodes.extra AS nodes_extra, nodes.boot_interface AS nodes_boot_interface, nodes.console_interface AS nodes_console_interface, nodes.deploy_interface AS nodes_deploy_interface, nodes.inspect_interface AS nodes_inspect_interface, nodes.management_interface AS nodes_management_interface, nodes.network_interface AS nodes_network_interface, nodes.raid_interface AS nodes_raid_interface, nodes.storage_interface AS nodes_storage_interface, nodes.power_interface AS nodes_power_interface, nodes.vendor_interface AS nodes_vendor_interface, node_tags_1.created_at AS node_tags_1_created_at, node_tags_1.updated_at AS node_tags_1_updated_at, node_tags_1.version AS node_tags_1_version, node_tags_1.node_id AS node_tags_1_node_id, node_tags_1.tag AS node_tags_1_tag \nFROM nodes LEFT OUTER JOIN node_tags AS node_tags_1 ON node_tags_1.node_id = nodes.id \nWHERE nodes.instance_uuid = %(instance_uuid_1)s'] [parameters: {u'instance_uuid_1': u'739eefe4-e0d1-442e-805f-24490539cb6e'}] (HTTP 500)
2018-07-30 09:28:21.365 427967 ERROR oslo_service.service Traceback (most recent call last):
2018-07-30 09:28:21.365 427967 ERROR oslo_service.service ...
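Because the SQL text buries the actual error, a small parser can pull out just the database error code and the offending column. This is only a sketch that assumes the `(code, u"message")` shape seen in the line above; `parse_db_error` is a hypothetical name, and other error formats are not handled.

```python
import re

# Matches the "(1054, u"...")" part of a pymysql error as it appears in the log.
DB_ERROR_RE = re.compile(r'\((\d+), u?"([^"]*)"\)')
UNKNOWN_COLUMN_RE = re.compile(r"Unknown column '([^']+)'")


def parse_db_error(log_line):
    """Return a dict with the error code, message and unknown column, or None."""
    match = DB_ERROR_RE.search(log_line)
    if match is None:
        return None
    message = match.group(2)
    column = UNKNOWN_COLUMN_RE.search(message)
    return {
        "code": int(match.group(1)),
        "message": message,
        "column": column.group(1) if column else None,
    }
```

Run against the first log line above, this reports code 1054 and the column `nodes.version`, which is the part of the error that matters for the question.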
