Heat template autoscaling error
Hi!
I've been trying to create an autoscaling Heat template based on the autoscaling WordPress template. My template looks like this:
heat_template_version: 2013-05-23
description: Autoscaling Test Template
parameters:
  image:
    type: string
    description: Image use to boot a server
  subnet_id:
    type: string
    description: subnet on which the load balancer will be located
  key:
    type: string
    description: keypair for authentication
  flavor:
    type: string
    description: flavor of the servers
resources:
  server_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 3
      resource:
        type: "https://github.com/openstack/heat-templates/blob/master/hot/lb_server.yaml"
        properties:
          flavor: {get_param: flavor}
          image: {get_param: image}
          key_name: {get_param: key}
          pool_id: {get_resource: pool}
          metadata: {"metering.stack": {get_param: "OS::stack_id"}}
          user_data: ""
  server_scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: server_group}
      cooldown: 60
      scaling_adjustment: 1
  server_scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: server_group}
      cooldown: 60
      scaling_adjustment: -1
  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-up if the average CPU > 50% for 1 minute
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
      alarm_actions:
        - {get_attr: [server_scaleup_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: gt
  cpu_alarm_low:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-down if the average CPU < 15% for 10 minutes
      meter_name: cpu_util
      statistic: avg
      period: 600
      evaluation_periods: 1
      threshold: 15
      alarm_actions:
        - {get_attr: [server_scaledown_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: lt
  monitor:
    type: OS::Neutron::HealthMonitor
    properties:
      type: TCP
      delay: 3
      max_retries: 5
      timeout: 5
  pool:
    type: OS::Neutron::Pool
    properties:
      protocol: HTTP
      monitors: [{get_resource: monitor}]
      subnet_id: {get_param: subnet_id}
      lb_method: ROUND_ROBIN
      vip:
        protocol_port: 80
  lb:
    type: OS::Neutron::LoadBalancer
    properties:
      protocol_port: 80
      pool_id: {get_resource: pool}
When I try to create the stack, I get the following output in heat-engine.log:
2014-08-07 09:59:35.424 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating HealthMonitor "monitor"
2014-08-07 09:59:35.426 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating Pool "pool"
2014-08-07 09:59:35.428 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating AutoScalingResourceGroup "server_group"
2014-08-07 09:59:35.430 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating LoadBalancer "lb"
2014-08-07 09:59:35.430 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating AutoScalingPolicy "server_scaledown_policy"
2014-08-07 09:59:35.430 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating CeilometerAlarm "cpu_alarm_low"
2014-08-07 09:59:35.431 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating AutoScalingPolicy "server_scaleup_policy"
2014-08-07 09:59:35.432 6526 INFO heat.engine.resource [req-dd0d5a79-0fcb-40b2-8a3a-4dcf88a39634 None] Validating CeilometerAlarm "cpu_alarm_high"
2014-08-07 09:59:35.796 6526 INFO heat.engine.resource [-] creating HealthMonitor "monitor" Stack "mystack4" [5fb09f73-084b-47c2-868f-7028214a4497]
2014-08-07 09:59:35.950 6526 WARNING heat.common.keystoneclient [-] stack_user_domain ID not set in heat.conf falling back to using default
2014-08-07 09:59:35.958 6526 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): controller
2014-08-07 09:59:36.698 6526 INFO heat.engine.resource [-] creating Pool "pool" Stack "mystack4" [5fb09f73-084b-47c2-868f-7028214a4497]
2014-08-07 09:59:41.052 6526 INFO heat.engine.resource [-] creating AutoScalingResourceGroup "server_group" Stack "mystack4" [5fb09f73-084b-47c2-868f-7028214a4497]
2014-08-07 09:59:41.178 6526 INFO heat.engine.environment [-] Registering OS::Heat::ScaledResource -> AWS::EC2::Instance ...
Are you running this in the 'admin' tenant, or as a user without admin privileges?
Have you created the 'heat_stack_user' role?
check using:
Check the project membership of the user you create the stack with.
I have the heat_stack_user role created, and the demo user I used has this role, but still no luck. With the admin user the stack creation finishes, but the scale-up doesn't happen when I stress the CPU of the VM. I see the alarm in the Ceilometer logs, but no new instance is started.
For the demo user: the heat_stack_user role is used when creating the scale users in the project (tenant). Remove it from your demo user; hopefully that solves it.
For scaling: when the alarm is raised, Ceilometer calls the webhook created by the scaling policy. The heat-cfn-api service, which listens on port 8000, handles this call.
What errors do you see in the heat logs?
The scaling happened at last. The problem is that one new instance was started 15 minutes after I began stressing the first VM, and another one was started 10 minutes later. Why is there so much delay? The alarm arrived after ~1 minute in the Ceilometer logs. (This still happens in the admin tenant.)
Oh, I guess the problem is that the Ceilometer sampling interval for cpu_util is set to 10 minutes on the compute nodes. I will try to decrease that and see if that works.
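One way to test the webhook path on its own, without waiting for a Ceilometer alarm, is to expose each policy's alarm_url as a stack output and call it by hand. The fragment below is only a sketch: it assumes the resource names from the template above, and the output names scale_up_url and scale_down_url are illustrative, not required by Heat.

```yaml
# Sketch: append to the template at the top level, next to "resources:".
# Output names are illustrative; the policy names come from the template above.
outputs:
  scale_up_url:
    description: Webhook URL that triggers the scale-up policy
    value: {get_attr: [server_scaleup_policy, alarm_url]}
  scale_down_url:
    description: Webhook URL that triggers the scale-down policy
    value: {get_attr: [server_scaledown_policy, alarm_url]}
```

Once the stack is created, the URLs should appear in the stack outputs (for example via heat stack-show). Sending an empty HTTP POST to scale_up_url should add one instance if heat-cfn-api is reachable on port 8000; if it does not, the problem is probably on the Heat side and not in Ceilometer.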
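For reference, the sampling interval mentioned above is normally set in /etc/ceilometer/pipeline.yaml on each compute node. The fragment below is a sketch that assumes the sources/sinks layout of that file; the names cpu_source and cpu_sink are what I recall from the stock file and may differ in your installation, and older releases use a different layout.

```yaml
# Sketch of the CPU source in /etc/ceilometer/pipeline.yaml (layout assumed).
# Only "interval" is changed; keep the rest of your file as it is.
sources:
    - name: cpu_source
      interval: 60        # seconds between samples; the default is 600
      meters:
          - "cpu"
      sinks:
          - cpu_sink
```

The Ceilometer compute agent has to be restarted for the change to take effect. Since cpu_util is derived from the rate of change of the cpu meter, it likely needs two samples before the first value appears, so even with a 60-second alarm period the reaction time is bounded by this interval.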