Using ::WaitCondition as rolling_update mechanism on ::ResourceGroup

asked 2017-06-06 10:34:59 -0500

ivens.zambrano

When trying to use OS::Heat::ResourceGroup to handle rolling upgrades, we are facing the following issue:

The ResourceGroup defines an update_policy property:

  test:
    type: OS::Heat::ResourceGroup
    update_policy:
      rolling_update:
        max_batch_size: 1
        min_in_service: 2
    properties:
      count: 8
      resource_def:
        type: test_detail.yaml

and inside test_detail.yaml we have:

  interface1:
    type: OS::Neutron::Port
    [...]

  wait_condition:
    type: OS::Heat::WaitCondition
    [...]

  wait_handle:
    type: OS::Heat::WaitConditionHandle
    [...]

  vm:
    type: OS::Nova::Server
    properties:
      networks:
      - port: { get_resource: interface1 }
      user_data_format: RAW
      user_data: { get_resource: user_data }
    [...]

  user_data:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config:
          str_replace:
             params:
               wc_notify: { get_attr: [wait_handle, curl_cli] }
             template: |
               #cloud-config
               merge_how: 'list(append)+dict(recurse_array,no_replace)+str()'
               write_files:
                 - path: /run/cloud-init/phonehome.sh
                   owner: root:root
                   permissions: '0777'
                   content: |
                      #!/bin/bash -x
                      wc_notify --data-binary '{"status": "SUCCESS"}'
               runcmd:
                 - /run/cloud-init/phonehome.sh
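For reference, the elided wait resources above typically pair a WaitConditionHandle with a WaitCondition that waits for signals from the handle's curl_cli. A minimal sketch (the count and the 600-second timeout are assumptions, not values from the original template):

```yaml
  wait_handle:
    type: OS::Heat::WaitConditionHandle

  wait_condition:
    type: OS::Heat::WaitCondition
    properties:
      handle: { get_resource: wait_handle }
      count: 1        # number of SUCCESS signals expected
      timeout: 600    # assumed value; tune to the expected boot time
```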

We want to use the wait condition notifications as the main driver for the rolling_update policy, but the only setting we have found that keeps all the VMs from being updated at the same time is "pause_time" in the "rolling_update" definition. The problem with this approach is that "pause_time" is not deterministic with respect to the server's status: we should not move on to the next instance in the ResourceGroup based on time alone; we need to be sure the instance is ready and providing the service before moving to the next one.
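For contrast, the pause_time workaround described above would be added to the ResourceGroup like this (a sketch; the 300-second value is an assumption, and it only inserts a fixed delay between batches rather than waiting on any readiness signal):

```yaml
  test:
    type: OS::Heat::ResourceGroup
    update_policy:
      rolling_update:
        max_batch_size: 1
        min_in_service: 2
        pause_time: 300   # assumed value; a fixed wait, not tied to server status
    properties:
      count: 8
      resource_def:
        type: test_detail.yaml
```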

Is there a way to achieve this, perhaps using a different resource type?

Thanks
