Unable to scale up using heat

asked 2017-02-02 05:22:22 -0600

DarkKnight

updated 2017-02-02 08:33:54 -0600

I am testing Heat auto scaling. I have attached all the templates. The template with my scale-up policy is below:

====

heat_template_version: 2014-10-16
description: Example auto scale group, policy and alarm
resources:
  scaleup_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      cooldown: 60
      desired_capacity: 1
      max_size: 3
      min_size: 1
      resource:
        type: OS::Nova::Server::Rhel7

  scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { get_resource: scaleup_group }
      cooldown: 60
      scaling_adjustment: 1

  scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { get_resource: scaleup_group }
      cooldown: 60
      scaling_adjustment: -1

  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 5
      evaluation_periods: 1
      threshold: 30
      alarm_actions:
        - {get_attr: [scaleup_policy, alarm_url]}
      comparison_operator: gt

  cpu_alarm_low:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 5
      evaluation_periods: 1
      threshold: 5
      alarm_actions:
        - {get_attr: [scaledown_policy, alarm_url]}
      comparison_operator: lt

=============================

I then created the stack using the following command:

heat stack-create simple -f example.yaml -e environment.yaml

================

I ran some load-generation scripts on my VM to drive its cpu_util up, but the alarm state is always stuck on insufficient_data, even though cpu_util samples are being collected:

[root@controller heat]# ceilometer sample-list --query resource=c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 --meter=cpu_util
+--------------------------------------+----------+-------+-----------------+------+----------------------------+
| Resource ID                          | Name     | Type  | Volume          | Unit | Timestamp                  |
+--------------------------------------+----------+-------+-----------------+------+----------------------------+
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.008200372   | %    | 2017-02-02T11:15:41.958000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 99.9962297323   | %    | 2017-02-02T11:14:41.973000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.044306526   | %    | 2017-02-02T11:13:41.950000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.175740919   | %    | 2017-02-02T11:12:41.957000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.343388673   | %    | 2017-02-02T11:11:41.963000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.198711322   | %    | 2017-02-02T11:10:41.949000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.176487547   | %    | 2017-02-02T11:09:41.988000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.173883123   | %    | 2017-02-02T11:08:41.964000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 99.9941500248   | %    | 2017-02-02T11:07:41.978000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.003349514   | %    | 2017-02-02T11:06:42.114000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 99.9840375484   | %    | 2017-02-02T11:05:41.956000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.022073549   | %    | 2017-02-02T11:04:41.947000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.002962927   | %    | 2017-02-02T11:03:41.970000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 99.9613351755   | %    | 2017-02-02T11:02:41.962000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.048090749   | %    | 2017-02-02T11:01:41.949000 |
| c10a3207-b7a7-46c0-b2d6-a56462f3c2c4 | cpu_util | gauge | 100.030257486   | %    | 2017-02-02T11:00:41.948000 |

===================

When I run openstack alarm show <id>, it gives me this:

[root@controller heat]# openstack alarm show simple-cpu_alarm_high-rpl2gsksjdsn
WARNING: openstackclient.common.utils is deprecated and will be removed after Jun 2017. Please use osc_lib.utils
+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| Field                     | Value                                                                                                                               |
+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| alarm_actions             | [u'http://controller:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3A5949f9bc4502425ba41d29184ac9713d%3Astacks%2Fsimple%2F5a6e0ed6-1a71  |
|                           | -4b1d-bfdf-8a173abe2b9a%2Fresources%2Fscaleup_policy?Timestamp=2017-02-02T10%3A44%3A34Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=9 |
|                           | 21f02a8183b45469d58f324b4734d81&SignatureVersion=2&Signature=c8CygvYt2yOMtuF7JFRbl7zjlrnTxPOjdlWdpCVEHo0%3D']                       |
| alarm_id                  | 0fe1d06e-744f-4185-a3cc-3d9f2b26a7ea                                                                                                |
| comparison_operator       | gt                                                                                                                                  |
| description               | Alarm when cpu_util is gt a avg of 30.0 over 5 seconds                                                                              |
| enabled                   | True                                                                                                                                |
| evaluation_periods        | 1                                                                                                                                   |
| exclude_outliers          | False                                                                                                                               |
| insufficient_data_actions | None                                                                                                                                |
| meter_name                | cpu_util                                                                                                                            |
| name                      | simple-cpu_alarm_high-rpl2gsksjdsn                                                                                                  |
| ok_actions                | None                                                                                                                                |
| period                    | 5                                                                                                                                   |
| project_id                | 5949f9bc4502425ba41d29184ac9713d                                                                                                    |
| query                     |                                                                                                                                     |
| repeat_actions            | True                                                                                                                                |
| severity                  | low                                                                                                                                 |
| state                     | insufficient data                                                                                                                   |
| state_timestamp           | 2017-02-02T10:45:27.649693                                                                                                          |
| statistic                 | avg                                                                                                                                 |
| threshold                 | 30.0                                                                                                                                |
| time_constraints          | []                                                                                                                                  |
| timestamp                 | 2017-02-02T10:45:27.649693                                                                                                          |
| type                      | threshold                                                                                                                           |
| user_id                   | e53846934ec04076b6c83ccb4e5e2bd6                                                                                                    |
+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------+

Can someone suggest what is going wrong, and why the scaling policy is not being triggered?

There is nothing in the logs (ceilometer/aodh) for this alarm ID.

[root@controller heat]# grep -irl "0fe1d06e-744f-4185-a3cc-3d9f2b26a7ea" /var/log/ceilometer/*
[root@controller ...

1 answer


answered 2017-02-02 05:39:55 -0600

The alarm period should be a multiple of the sample interval (http://docs.openstack.org/admin-guide...). Adjust and try again.
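
As a rough illustration of that advice: the cpu_util samples listed in the question are about 60 seconds apart (11:14:41, 11:15:41, ...), while the alarm uses a period of 5 seconds. A minimal sketch of the adjusted alarm, assuming the 60-second spacing reflects the interval configured in Ceilometer's pipeline on this deployment, might look like this:

```yaml
  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      # Assumption: samples arrive every 60 seconds, as the sample-list
      # output in the question suggests. Use that interval or a multiple of it.
      period: 60
      evaluation_periods: 1
      threshold: 30
      alarm_actions:
        - {get_attr: [scaleup_policy, alarm_url]}
      comparison_operator: gt
```

The same change would apply to cpu_alarm_low. If the pipeline interval on your system differs, pick a period that is a multiple of the actual value.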


Comments

I adjusted the interval but am still unable to scale up. I have edited the question.

DarkKnight (2017-02-02 08:30:11 -0600)

[root@controller heat]# grep "51b60304-8637-4f8a-a288-0a08bc199fc3" /var/log/aodh/evaluator.log
2017-02-02 17:06:33.029 2648 INFO aodh.evaluator [-] alarm 51b60304-8637-4f8a-a288-0a08bc199fc3 transitioning to ok because Transition to ok due to 1 samples inside threshold, most recent: 20.5171997571

DarkKnight (2017-02-02 08:47:27 -0600)

I am confused: when ceilometer reports 99 and 100 as cpu_util, why does aodh/evaluator.log show the value "20.5171997571"?

DarkKnight (2017-02-02 08:48:39 -0600)

It is also interesting to note that this command displays output:

 ceilometer sample-list --query resource=fab8a0f1-ffd0-41bd-b538-fdf3cba8e54f --meter=cpu_util

but the next one does not:

  ceilometer statistics -m cpu_util -q 'metadata.user_metadata.stack=fb82cc59-f6cd-4b8d-b260-6bd10e751
DarkKnight (2017-02-02 08:54:12 -0600)
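
One possible explanation for the 20.5 value, offered as a guess rather than a confirmed diagnosis: the alarm shown in the question has an empty query field, so the evaluator may be averaging cpu_util over every instance the project can see rather than only the loaded one. The empty result from the metadata.user_metadata.stack statistics query would also fit with the servers not being tagged with the stack ID. The sketch below follows the pattern used in the stock Heat autoscaling example (metering.stack on the server, matching_metadata on the alarm). It assumes the template mapped to OS::Nova::Server::Rhel7 accepts a metadata parameter and passes it through to the server, which the thread does not show.

```yaml
resources:
  scaleup_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      cooldown: 60
      desired_capacity: 1
      max_size: 3
      min_size: 1
      resource:
        type: OS::Nova::Server::Rhel7
        properties:
          # Assumption: the nested template declares a "metadata" parameter
          # and forwards it to the Nova server's metadata property.
          metadata: {"metering.stack": {get_param: "OS::stack_id"}}

  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      # Assumption: 60 matches the sample spacing seen in the question.
      period: 60
      evaluation_periods: 1
      threshold: 30
      alarm_actions:
        - {get_attr: [scaleup_policy, alarm_url]}
      # Restrict the average to samples from servers tagged with this stack.
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: gt
```

If this is the cause, the statistics query filtered on metadata.user_metadata.stack should start returning rows once the servers are recreated with the metadata in place.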
