
Ceilometer's alarms on network.*.bytes don't trigger state change

asked 2014-11-20 04:29:53 -0600

mc__

updated 2014-11-20 04:52:28 -0600

At the moment I'm trying to set up autoscaling in a stack via a HOT template. The standard example uses cpu_util, which works.
Instead, I'm trying to use a combination of CPU and network load. So, as you can guess, I need network metering (and an AlarmCombination).

The stack is created and set up successfully. But although I stress my VMs with iperf, the alarm-history of my network alarms shows no state changes at all. A look at the database showed me that the samples are received correctly, including values above the threshold. The logs tell me that Ceilometer tries to evaluate the period, but I don't know how to interpret what follows: is the threshold breach detected or not? I can't tell from the logs. In the end, every (dependent) alarm stays in the "insufficient data" state.
Does anyone have a clue why the alarm state is not updated?
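
For orientation, the three states a threshold alarm can report can be modelled roughly as below. This is a simplified sketch, not Ceilometer's actual evaluator: the function name evaluate_alarm is hypothetical, the values are illustrative, and it assumes a single evaluation period whose aggregated average is either available or missing.

```shell
#!/usr/bin/env bash
# Simplified model of a threshold alarm decision (NOT Ceilometer's code).
# Usage: evaluate_alarm AVG OP THRESHOLD
#   AVG may be empty, meaning the statistics query returned no datapoints.
evaluate_alarm() {
  local avg="$1" op="$2" threshold="$3"
  if [ -z "$avg" ]; then
    # Nothing to compare against the threshold.
    echo "insufficient data"
    return
  fi
  # awk handles the floating-point comparison that bash arithmetic cannot.
  awk -v a="$avg" -v t="$threshold" -v op="$op" 'BEGIN {
    if (op == "gt") breached = (a > t)
    else if (op == "lt") breached = (a < t)
    else { print "unsupported operator"; exit 2 }
    print (breached ? "alarm" : "ok")
  }'
}

evaluate_alarm "" gt 2097152        # no datapoints matched the alarm query
evaluate_alarm 3000000 gt 2097152   # above 2 MB/s (2 * 1024 * 1024 B/s)
evaluate_alarm 0.53 lt 10           # low-CPU alarm
```

The point of the model is that an alarm whose query matches no samples never reaches the comparison at all, so high traffic alone is not enough to change its state.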

Attached are some logs from the relevant time period:


OpenStack Icehouse
ceilometer --version 1.0.9
heat --version 0.2.8
qemu-io version 2.0.0
libvirtd (libvirt) 1.2.2
polling interval in pipeline.yml: 60 s for network and cpu


2 answers


answered 2014-11-20 08:13:43 -0600

mc__
+---------------------------+----------------------------------------------------------------------+
| Property                  | Value                                                                |
+---------------------------+----------------------------------------------------------------------+
| alarm_actions             | []                                                                   |
| alarm_id                  | 3432c243-e500-4ee9-8bfe-19d00c75648f                                 |
| comparison_operator       | lt                                                                   |
| description               | Triggers if the average CPU < 10% for 2 minutes                      |
| enabled                   | True                                                                 |
| evaluation_periods        | 1                                                                    |
| exclude_outliers          | False                                                                |
| insufficient_data_actions | []                                                                   |
| meter_name                | cpu_util                                                             |
| name                      | TestStack_with_CPU_and_NW_stress-cpu_alarm_low-q2qgc7hpzqpx          |
| ok_actions                | []                                                                   |
| period                    | 120                                                                  |
| project_id                | 9036a06bb9f648beb8b4d4592e693735                                     |
| query                     | metadata.user_metadata.stack == 7f16b737-824c-46c1-977c-b053c27f0068 |
| repeat_actions            | True                                                                 |
| state                     | alarm                                                                |
| statistic                 | avg                                                                  |
| threshold                 | 10.0                                                                 |
| type                      | threshold                                                            |
| user_id                   | 578d0bd463944738833c54396835f7bb                                     |
+---------------------------+----------------------------------------------------------------------+
+---------------------------+------------------------------------------------------------------------+
| Property                  | Value                                                                  |
+---------------------------+------------------------------------------------------------------------+
| alarm_actions             | []                                                                     |
| alarm_id                  | 57abffe1-b1c1-46a8-8061-fc7cbb3ff978                                   |
| comparison_operator       | gt                                                                     |
| description               | Triggers, if the incoming average network traffic > 2MB/s for 1 minute |
| enabled                   | True                                                                   |
| evaluation_periods        | 1                                                                      |
| exclude_outliers          | False                                                                  |
| insufficient_data_actions | []                                                                     |
| meter_name                | network.incoming.bytes.rate                                            |
| name                      | TestStack_with_CPU_and_NW_stress-nw_IN_traffic_alarm_high-ulfwuki7if3w |
| ok_actions                | []                                                                     |
| period                    | 60                                                                     |
| project_id                | 9036a06bb9f648beb8b4d4592e693735                                       |
| query                     | metadata.user_metadata.stack == 7f16b737-824c-46c1-977c-b053c27f0068   |
| repeat_actions            | True                                                                   |
| state                     | insufficient data                                                      |
| statistic                 | avg                                                                    |
| threshold                 | 2097152.0                                                              |
| type                      | threshold                                                              |
| user_id                   | 578d0bd463944738833c54396835f7bb                                       |
+---------------------------+------------------------------------------------------------------------+

samples for network.incoming.bytes.rate
+-----------------------------------------------------------------------+-----------------------------+-------+---------------+------+---------------------+
| Resource ID                                                           | Name                        | Type  | Volume        | Unit | Timestamp           |
+-----------------------------------------------------------------------+-----------------------------+-------+---------------+------+---------------------+
| instance-00000061-d20f7f35-dda5-480a-97a4-cf8f7f41e439-tap6e92f333-6c | network.incoming.bytes.rate | gauge | 18.3770491803 | B/s  | 2014-11-20T14:02:25 |
| instance-00000061-d20f7f35-dda5-480a-97a4-cf8f7f41e439-tap6e92f333-6c | network.incoming.bytes.rate | gauge | 24.4406779661 | B/s  | 2014-11-20T14:01:24 |
| instance-00000061-d20f7f35-dda5-480a-97a4-cf8f7f41e439-tap6e92f333-6c | network.incoming.bytes.rate | gauge | 132.233333333 | B/s  | 2014-11-20T14:00:25 |
| instance-00000061-d20f7f35-dda5-480a-97a4-cf8f7f41e439-tap6e92f333-6c | network.incoming.bytes.rate | gauge | 591.393442623 | B/s  | 2014-11-20T13:59:25 |
| instance-00000061-d20f7f35-dda5-480a-97a4-cf8f7f41e439-tap6e92f333-6c | network.incoming.bytes.rate | gauge | 727.083333333 | B/s  | 2014-11-20T13:58:24 |
+-----------------------------------------------------------------------+-----------------------------+-------+---------------+------+---------------------+
statistics for network.incoming.bytes.rate
+--------+---------------------+---------------------+-------+---------------+---------------+---------------+---------------+----------+---------------------+---------------------+
| Period | Period Start        | Period End          | Count | Min           | Max           | Sum           | Avg           | Duration | Duration Start      | Duration End        |
+--------+---------------------+---------------------+-------+---------------+---------------+---------------+---------------+----------+---------------------+---------------------+
| 0      | 2014-11-20T13:58:24 | 2014-11-20T14:02:25 | 5     | 18.3770491803 | 727.083333333 | 1493.52783644 | 298.705567287 | 241.0    | 2014-11-20T13:58:24 | 2014-11-20T14:02:25 |
+--------+---------------------+---------------------+-------+---------------+---------------+---------------+---------------+----------+---------------------+---------------------+
constrained statistics for network.incoming.bytes.rate
+--------+--------------+------------+-------+-----+-----+-----+-----+----------+----------------+--------------+
| Period | Period Start | Period End | Count | Min | Max | Sum | Avg | Duration | Duration Start | Duration End |
+--------+--------------+------------+-------+-----+-----+-----+-----+----------+----------------+--------------+
+--------+--------------+------------+-------+-----+-----+-----+-----+----------+----------------+--------------+
samples for cpu_util
+--------------------------------------+----------+-------+----------------+------+---------------------+
| Resource ID                          | Name     | Type  | Volume         | Unit | Timestamp           |
+--------------------------------------+----------+-------+----------------+------+---------------------+
| d20f7f35-dda5-480a-97a4-cf8f7f41e439 | cpu_util | gauge | 0.475409836066 | %    | 2014-11-20T14:03:25 |
| d20f7f35-dda5-480a-97a4-cf8f7f41e439 | cpu_util | gauge | 0.5            | %    | 2014-11-20T14:02:24 |
| d20f7f35-dda5-480a-97a4-cf8f7f41e439 | cpu_util | gauge | 0.516666666667 | %    | 2014-11-20T14:01:24 |
| d20f7f35-dda5-480a-97a4-cf8f7f41e439 | cpu_util | gauge | 0.525423728814 | %    | 2014-11-20T14:00:24 |
| d20f7f35-dda5-480a-97a4-cf8f7f41e439 | cpu_util | gauge | 0.55737704918  | %    | 2014-11-20T13:59:25 |
| d20f7f35-dda5-480a-97a4-cf8f7f41e439 | cpu_util | gauge | 0.616666666667 | %    | 2014-11-20T13:58:24 |
+--------------------------------------+----------+-------+----------------+------+---------------------+
statistics for cpu_util
+--------+---------------------+---------------------+-------+----------------+----------------+---------------+----------------+----------+---------------------+---------------------+
| Period | Period Start        | Period End          | Count | Min            | Max            | Sum           | Avg            | Duration | Duration Start      | Duration End        |
+--------+---------------------+---------------------+-------+----------------+----------------+---------------+----------------+----------+---------------------+---------------------+
| 0      | 2014-11-20T13:58:24 | 2014-11-20T14:03:25 | 6     | 0.475409836066 | 0.616666666667 | 3.19154394739 | 0.531923991232 | 301.0    | 2014-11-20T13:58:24 | 2014-11-20T14:03:25 |
+--------+---------------------+---------------------+-------+----------------+----------------+---------------+----------------+----------+---------------------+---------------------+
constrained statistics for cpu_util
+--------+---------------------+---------------------+-------+----------------+----------------+---------------+----------------+----------+---------------------+---------------------+
| Period | Period Start        | Period End          | Count | Min            | Max            | Sum           | Avg            | Duration | Duration Start      | Duration End        |
+--------+---------------------+---------------------+-------+----------------+----------------+---------------+----------------+----------+---------------------+---------------------+
| 0      | 2014-11-20T13:58:24 | 2014-11-20T14:03:25 | 6     | 0.475409836066 | 0.616666666667 | 3.19154394739 | 0.531923991232 | 301.0    | 2014-11-20T13:58:24 | 2014-11-20T14:03:25 |
+--------+---------------------+---------------------+-------+----------------+----------------+---------------+----------------+----------+---------------------+---------------------+

In case the output I left out here is also relevant, I pasted it at: http://pastebin.com/DGRq3hu7
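
In the output above, the constrained statistics for network.incoming.bytes.rate came back with a header but no rows, while the same constraint on cpu_util returned one row. A quick way to spot such an empty result in a script is to count the data rows of the table. The helper below is a minimal sketch; the name count_data_rows is hypothetical, and it assumes the ASCII table layout shown above (border lines starting with "+", one header line starting with "|").

```shell
#!/usr/bin/env bash
# Count the data rows of an ASCII table read from stdin.
# Border lines start with "+"; the first "|" line is treated as the header.
count_data_rows() {
  awk '/^[|]/ { rows++ } END { print (rows > 1 ? rows - 1 : 0) }'
}

# Hypothetical usage against a live deployment:
#   ceilometer statistics -m network.incoming.bytes.rate -q "..." | count_data_rows

# Demo with a header-only table, like the empty constrained statistics above.
printf '%s\n' \
  '+--------+-----+' \
  '| Period | Avg |' \
  '+--------+-----+' \
  '+--------+-----+' | count_data_rows
```

A result of 0 for the constrained query, alongside a non-zero result for the unconstrained one, means the samples exist but do not match the alarm's query.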


answered 2014-11-20 06:21:11 -0600

Thanks for the detailed logs/diagnostics.

Can you provide a little more:

$ for a in 3432c243-e500-4ee9-8bfe-19d00c75648f 57abffe1-b1c1-46a8-8061-fc7cbb3ff978
  do
    ceilometer alarm-show -a "$a"
  done

# UTC timestamp of five minutes ago, used as the lower bound for the queries below
$ FIVE_MINS_AGO=$(date -u +"%Y-%m-%dT%H:%M:%SZ" -d "5 minutes ago")
$ for m in network.incoming.bytes.rate cpu_util
  do
    echo "samples for $m"
    ceilometer --debug sample-list -m "$m" -q "timestamp>=$FIVE_MINS_AGO"
    echo "statistics for $m"
    ceilometer --debug statistics -m "$m" -q "timestamp>=$FIVE_MINS_AGO"
    echo "constrained statistics for $m"
    ceilometer --debug statistics -m "$m" -q "timestamp>=$FIVE_MINS_AGO;metadata.user_metadata.stack=7f16b737-824c-46c1-977c-b053c27f0068"
  done

Comments

I've gotten to the bottom of this issue. It's a Ceilometer bug:

https://bugs.launchpad.net/ceilometer...

Targeting a fix at kilo-1.

eglynn@redhat.com (2014-11-20 12:11:57 -0600)
