
Heat template autoscaling error

asked 2014-08-07 03:45:47 -0500

stipi

updated 2014-08-07 04:28:17 -0500

Hi!

I've been trying to create an autoscaling heat template based on the autoscaling wordpress template. My code looks like this:

heat_template_version: 2013-05-23

description: Autoscaling Test Template

parameters:
  image:
    type: string
    description: Image used to boot a server
  subnet_id:
    type: string
    description: subnet on which the load balancer will be located
  key:
    type: string
    description: keypair for authentication
  flavor:
    type: string
    description: flavor of the servers

resources:
  server_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 3
      resource:
        type: "https://github.com/openstack/heat-templates/blob/master/hot/lb_server.yaml"
        properties:
            flavor: {get_param: flavor}
            image: {get_param: image}
            key_name: {get_param: key}
            pool_id: {get_resource: pool}
            metadata: {"metering.stack": {get_param: "OS::stack_id"}}
            user_data: ""
  server_scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: server_group}
      cooldown: 60
      scaling_adjustment: 1
  server_scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: server_group}
      cooldown: 60
      scaling_adjustment: -1
  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-up if the average CPU > 50% for 1 minute
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
      alarm_actions:
        - {get_attr: [server_scaleup_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: gt
  cpu_alarm_low:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-down if the average CPU < 15% for 10 minutes
      meter_name: cpu_util
      statistic: avg
      period: 600
      evaluation_periods: 1
      threshold: 15
      alarm_actions:
        - {get_attr: [server_scaledown_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: lt
  monitor:
    type: OS::Neutron::HealthMonitor
    properties:
      type: TCP
      delay: 3
      max_retries: 5
      timeout: 5
  pool:
    type: OS::Neutron::Pool
    properties:
      protocol: HTTP
      monitors: [{get_resource: monitor}]
      subnet_id: {get_param: subnet_id}
      lb_method: ROUND_ROBIN
      vip:
        protocol_port: 80
  lb:
    type: OS::Neutron::LoadBalancer
    properties:
      protocol_port: 80
      pool_id: {get_resource: pool}

When I try to run the stack I get the following error in heat-engine.log:

2014-08-07 09:59:35.424 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating HealthMonitor "monitor"
2014-08-07 09:59:35.426 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating Pool "pool"
2014-08-07 09:59:35.428 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating AutoScalingResourceGroup "server_group"
2014-08-07 09:59:35.430 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating LoadBalancer "lb"
2014-08-07 09:59:35.430 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating AutoScalingPolicy "server_scaledown_policy"
2014-08-07 09:59:35.430 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating CeilometerAlarm "cpu_alarm_low"
2014-08-07 09:59:35.431 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating AutoScalingPolicy "server_scaleup_policy"
2014-08-07 09:59:35.432 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating CeilometerAlarm "cpu_alarm_high"
2014-08-07 09:59:35.796 6526 INFO heat.engine.resource [-] creating HealthMonitor "monitor" Stack "mystack4" [5fb09f73-084b-47c2-868f-7028214a4497]
2014-08-07 09:59:35.950 6526 WARNING heat.common.keystoneclient [-] stack_user_domain ID not set in heat.conf falling back to using default
2014-08-07 09:59:35.958 6526 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): controller
2014-08-07 09:59:36.698 6526 INFO heat.engine.resource [-] creating Pool "pool" Stack "mystack4" [5fb09f73-084b-47c2-868f-7028214a4497]
2014-08-07 09:59:41.052 6526 INFO heat.engine.resource [-] creating AutoScalingResourceGroup "server_group" Stack "mystack4" [5fb09f73-084b-47c2-868f-7028214a4497]
2014-08-07 09:59:41.178 6526 INFO heat.engine.environment [-] Registering OS::Heat::ScaledResource -> AWS::EC2::Instance ...
(more)

Comments

Are you running this in 'admin' tenant or with a user without admin privileges?

Do you have the 'heat_stack_user' role created? Check using:

keystone role-list

Check the project membership of the user you create the stack with.
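For example, with the Keystone v2 CLI of that release ('demo' below is just a placeholder user and tenant name):

# confirm the role exists, and create it if it does not
keystone role-list
keystone role-create --name heat_stack_user

# see which roles the stack-creating user has in its project
keystone user-role-list --user demo --tenant demo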

AndyHardwick ( 2014-08-07 05:13:53 -0500 )

I have the heat_stack_user role created, and the demo user I used has this role, but still no luck. With the admin user the stack creation finishes; however, the scale-up doesn't happen when I stress the CPU of the VM. I see the alarm in the Ceilometer logs, but no new instance is started.

stipi ( 2014-08-08 03:48:35 -0500 )

For the demo user: the heat_stack_user role is meant for the stack users that Heat itself creates inside the project (tenant). Remove it from your demo user; hopefully that should solve it.

For the scaling: when the alarm is raised, Ceilometer calls the webhook created by the scaling policy. This is handled by the heat-api-cfn service, which listens on port 8000.

What errors do you see in the heat logs?
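A quick way to check that path (log locations below assume the default packaged layout) is something like:

# confirm the CFN API is listening on port 8000
netstat -tlnp | grep 8000

# watch the CFN API and engine logs while the alarm fires
tail -f /var/log/heat/heat-api-cfn.log /var/log/heat/heat-engine.log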

AndyHardwick ( 2014-08-08 04:07:14 -0500 )

The scaling happened at last. The problem is that one new instance was started 15 minutes after I began stressing the first VM, and another one was started 10 minutes later. Why is there so much delay? The alarm arrived after ~1 minute in the Ceilometer logs. (This is still happening in the admin tenant.)

stipi ( 2014-08-08 04:15:01 -0500 )

Oh, I guess the problem is that the Ceilometer sampling interval for cpu_util is set to 10 minutes on the compute nodes. I will try to decrease that and see if that works.

stipi ( 2014-08-08 05:28:01 -0500 )

2 answers


answered 2014-08-07 04:10:41 -0500

Hi,

I suggest changing the URL to:

"https://raw.githubusercontent.com/openstack/heat-templates/master/hot/lb_server.yaml".

Your link is to the blob view, which is marked-up HTML rather than the raw template.
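In the server_group resource that means pointing the nested template at the raw file (the properties block stays exactly as it is):

      resource:
        type: "https://raw.githubusercontent.com/openstack/heat-templates/master/hot/lb_server.yaml"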


Comments

Hi Andy! Thanks for your reply; it seems to have fixed the first error, but produced another one, which I added to the question via an edit.

stipi ( 2014-08-07 04:30:01 -0500 )

answered 2014-08-11 02:53:21 -0500

stipi

updated 2014-08-11 04:13:33 -0500

So I figured it out at last. Here is my autoscaling template that works:

heat_template_version: 2013-05-23

description: Autoscaling Test Template

parameters:
  image:
    type: string
    description: Image used to boot a server
  subnet_id:
    type: string
    description: subnet on which the load balancer will be located
  network_id:
    type: string
    description: network on which the instances will be located
  key:
    type: string
    description: keypair for authentication
  flavor:
    type: string
    description: flavor of the servers

resources:
  server_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 3
      resource:
        type: "http://10.0.0.11:80/lb_server.yaml"
        properties:
            flavor: {get_param: flavor}
            image: {get_param: image}
            key_name: {get_param: key}
            pool_id: {get_resource: pool}
            metadata: {"metering.stack": {get_param: "OS::stack_id"}}
            network_id: {get_param: network_id}
            user_data: 
              str_replace:
                template: |
                  #!/bin/bash -v

                  apt-get -y install apache2
                params:
                  dummy: ""
  server_scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: server_group}
      cooldown: 30
      scaling_adjustment: 1
  server_scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: server_group}
      cooldown: 30
      scaling_adjustment: -1
  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-up if the average CPU > 60% for 10 seconds
      meter_name: cpu_util
      statistic: avg
      period: 10
      evaluation_periods: 1
      threshold: 60
      alarm_actions:
        - {get_attr: [server_scaleup_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: gt
  cpu_alarm_low:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-down if the average CPU < 40% for 10 seconds
      meter_name: cpu_util
      statistic: avg
      period: 10
      evaluation_periods: 1
      threshold: 40
      alarm_actions:
        - {get_attr: [server_scaledown_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: lt
  monitor:
    type: OS::Neutron::HealthMonitor
    properties:
      type: PING
      delay: 3
      max_retries: 5
      timeout: 5
  pool:
    type: OS::Neutron::Pool
    properties:
      protocol: TCP
      monitors: [{get_resource: monitor}]
      subnet_id: {get_param: subnet_id}
      lb_method: ROUND_ROBIN
      vip:
        protocol_port: 80
        subnet: {get_param: subnet_id}
  lb:
    type: OS::Neutron::LoadBalancer
    properties:
      protocol_port: 80
      pool_id: {get_resource: pool}

And lb_server.yaml:

heat_template_version: 2013-05-23
description: A load-balancer server
parameters:
  image:
    type: string
    description: Image used for servers
  key_name:
    type: string
    description: SSH key to connect to the servers
  flavor:
    type: string
    description: flavor used by the servers
  pool_id:
    type: string
    description: Pool to contact
  network_id:
    type: string
    description: network for the instances
  user_data:
    type: string
    description: Server user_data
  metadata:
    type: json
resources:
  server:
    type: OS::Nova::Server
    properties:
      flavor: {get_param: flavor}
      image: {get_param: image}
      key_name: {get_param: key_name}
      metadata: {get_param: metadata}
      networks:
          - network: {get_param: network_id}
      user_data: {get_param: user_data}
      user_data_format: RAW
  member:
    type: OS::Neutron::PoolMember
    properties:
      pool_id: {get_param: pool_id}
      address: {get_attr: [server, first_address]}
      protocol_port: 80

Now, for this to work you have to set the metering interval to at most 10 seconds in /etc/ceilometer/pipeline.yaml. If you set a longer interval you will end up with alarms stuck in the insufficient data state forever. Note that such a short sampling interval requires more space in your database as well!
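
The interval to change is the one on the source that collects the cpu meter (cpu_util is derived from it by the cpu_sink transformer). A sketch of the relevant part of /etc/ceilometer/pipeline.yaml, assuming the default source and sink names; restart the Ceilometer compute agent on each compute node afterwards:

sources:
    - name: cpu_source
      interval: 10          # default is 600 seconds
      meters:
          - "cpu"
      sinks:
          - cpu_sink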


Comments

I also ran this template as an admin tenant user; I couldn't figure out how to run it as a demo tenant user.

stipi ( 2014-08-11 02:57:23 -0500 )
