Ceilometer Alarm insufficient Data

asked 2016-05-17 08:17:12 -0500

AB239

updated 2016-05-25 01:57:27 -0500

Update #2:

Since the ceilometer-agent-compute service was failing to start, I raised a bug on Launchpad and learned that they no longer maintain packaging for Kilo Ceilometer.

To experiment, I copied the files under /etc/ceilometer (except ceilometer.conf) from the controller to the same location on the compute node. To my surprise, the service started, and at least I can now see some Ceilometer-related logs on the compute node.

But the primary issue still persists. I will post an update if something works out.


Update #1:

My openrc file for ceilometer access is:

export OS_PROJECT_NAME=<project-name>
export OS_TENANT_NAME=<tenant-name>
export OS_USERNAME=<username>
export OS_PASSWORD=<password>
export OS_AUTH_URL=http://controller:35357/v2.0
export OS_IMAGE_API_VERSION=2

Outputs of commands:

root@kilo-controller:# ceilometer sample-list -m cpu_util --limit 5
+-------------+------+------+--------+------+-----------+
| Resource ID | Name | Type | Volume | Unit | Timestamp |
+-------------+------+------+--------+------+-----------+
+-------------+------+------+--------+------+-----------+

root@kilo-controller:# ceilometer statistics -m cpu_util -q 'metadata.user_metadata.stack=a4d4497d-793b-4a55-acdb-9b941f5dc4a8'
+--------+--------------+------------+-----+-----+-----+-----+-------+----------+----------------+--------------+
| Period | Period Start | Period End | Max | Min | Avg | Sum | Count | Duration | Duration Start | Duration End |
+--------+--------------+------------+-----+-----+-----+-----+-------+----------+----------------+--------------+

[ORIGINAL QUESTION]

I am sure this has been asked before, but I could not find a proper solution anywhere on the internet.

In our recent installation of Ceilometer on the Kilo release of OpenStack, we have come across a problem. Whenever we create a new alarm (either from Heat or through the Ceilometer API from a terminal), the alarm state goes to "insufficient data". We plan to use Ceilometer for auto-scaling, where the alarm triggers a scale_up/scale_down policy that takes action accordingly. Hence we need to monitor CPU consumption.

root@kilo-controller:# ceilometer alarm-list
+--------------------------------------+------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+
| Alarm ID                             | Name                               | State             | Severity | Enabled | Continuous | Alarm condition                | Time constraints |
+--------------------------------------+------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+
| a4d1d3f0-c025-443c-a499-f9d5dded9bc4 | simple-cpu_alarm_high-mfqdjvja57g3 | insufficient data | low      | True    | True       | cpu_util > 10.0 during 1 x 30s | None             |
+--------------------------------------+------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+

These are the questions I am seeking help with from the community:

1) Is there a way to see which data is insufficient for each alarm? If I have missed a parameter, is there a way to check that?

2) I have debugged a bit on my own: I found a few files under the /var/log/ceilometer directory and have DEBUG logging enabled, but I still can't see any relevant log entry in any of the files.

3) After a bit of research I found that many people have faced this problem, and that a possible cause is that evaluation_timer in ceilometer.conf is smaller than the alarm timer. I could not see any such parameter in my ceilometer.conf file, so I explicitly added one in the [alarm] section (adding the section as well) and restarted all services. The issue was NOT resolved even after this.

4) Lastly, I decreased the same timer in the /etc/ceilometer/pipeline.yaml file, but this didn't solve the issue either.

Please let me know if there is anything else I am missing here. The Heat snippet for the alarm is below:

  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-up if the average CPU > 10% for 1 minute
      meter_name: cpu_util
      statistic: avg
      period: 30
      enabled: True
      evaluation_periods: 1
      threshold: 10
      alarm_actions:
        - {get_attr: [scaleup_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param ...

1 answer


answered 2016-05-18 02:15:37 -0500

yprokule

@AB239, "insufficient data" means that there is not enough data to decide the alarm's state.

In your example, the alarm transitions between states based on the value of the cpu_util meter. So the first step is to check whether you have this meter:

ceilometer sample-list -m cpu_util --limit 5

Second, the Heat template filters samples using matching_metadata, which means the resource behind the cpu_util meter must have the metadata field user_metadata.stack: <os::stack_id>. Check this by running:

ceilometer resource-show <ResourceId>
ceilometer statistics -m cpu_util -q 'metadata.user_metadata.stack=<StackID>'
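
To make the two checks above concrete, here is a rough Python sketch of how a threshold alarm like the one in the question could end up in "insufficient data". It is a simplified model for illustration, not Ceilometer's actual evaluator: the function name evaluate_alarm, the sample dictionary layout, and the use of plain numbers as timestamps are all assumptions.

```python
from statistics import mean


def evaluate_alarm(samples, stack_id, now, period=30,
                   evaluation_periods=1, threshold=10.0):
    """Return 'alarm', 'ok' or 'insufficient data' (simplified model)."""
    # Keep only samples whose metadata carries the expected stack ID,
    # mirroring the matching_metadata filter in the Heat template.
    matching = [
        s for s in samples
        if s["metadata"].get("user_metadata.stack") == stack_id
    ]

    # Average the samples that fall into each evaluation period.
    window_start = now - period * evaluation_periods
    averages = []
    for i in range(evaluation_periods):
        start = window_start + i * period
        end = start + period
        bucket = [s["volume"] for s in matching
                  if start <= s["timestamp"] < end]
        if bucket:
            averages.append(mean(bucket))

    # A period without any matching sample leaves the state undecidable.
    if len(averages) < evaluation_periods:
        return "insufficient data"
    return "alarm" if all(a > threshold for a in averages) else "ok"


# Hypothetical sample; timestamps are seconds on an arbitrary clock.
samples = [
    {"timestamp": 980, "volume": 50.0,
     "metadata": {"user_metadata.stack": "stack-a"}},
]
print(evaluate_alarm(samples, "stack-a", now=1000))  # alarm
print(evaluate_alarm(samples, "stack-b", now=1000))  # insufficient data
print(evaluate_alarm([], "stack-a", now=1000))       # insufficient data
```

In this model, both an empty sample list and samples that carry the wrong stack metadata give the same result, which is why it helps to run the checks in that order.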

Regarding the interval: its name is evaluation_interval, and according to its description:

# Period of evaluation cycle, should be >= than configured
# pipeline interval for collection of underlying metrics.
# (integer value)
# Deprecated group/name - [alarm]/threshold_evaluation_interval
evaluation_interval=60

And if your cpu_util meter is collected every 10 minutes, it doesn't make sense to have an evaluation period of 60 seconds in the alarm.
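
As a rough illustration of the interval reasoning, the Python sketch below reads evaluation_interval from an INI-style config text and checks whether one alarm period is long enough to contain at least one sample. The helper names, the sample config text, and the example numbers are illustrative assumptions. Real ceilometer.conf files are parsed by oslo.config, which configparser only approximates.

```python
import configparser

# Hypothetical config text; only the [alarm] option comes from the thread.
SAMPLE_CONF = """\
[DEFAULT]
debug = True

[alarm]
evaluation_interval = 60
"""


def evaluation_interval(conf_text, default=60):
    """Return [alarm]/evaluation_interval, or `default` if it is not set."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getint("alarm", "evaluation_interval", fallback=default)


def period_can_hold_sample(pipeline_interval, alarm_period):
    """True if a single alarm period spans at least one collection cycle."""
    return alarm_period >= pipeline_interval


print(evaluation_interval(SAMPLE_CONF))
# Meter collected every 10 minutes (600 s), alarm period of 60 s.
print(period_can_hold_sample(pipeline_interval=600, alarm_period=60))
```

With a 600-second collection interval, a 60-second (or 30-second) alarm period will usually contain no sample at all, so either the pipeline interval has to come down or the alarm period has to go up.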

Finally, enable debug/verbose mode in ceilometer.conf, restart the ceilometer* services, and inspect the logs.

Regards, Yurii


Comments

Hello Yurii, thanks for the detailed reply. Much appreciated. I have updated the original question under the subheading 'Update #1'. None of the commands I executed returned any data. I can see default alarms like vcpus, etc., but not mine.

AB239 (2016-05-19 01:15:55 -0500)

I'd recommend creating a regular VM (without a Heat template) and checking whether the cpu_util meter is present. To list all meters for a VM, run:

ceilometer meter-list -q 'resource_id=VM_ID'

Then, if it is, create a stack and check the meter list for the created VM. If the meter is not present, check compute.log.

yprokule (2016-05-19 01:23:59 -0500)

Hi. I tried launching a VM manually to see whether there is anything in meter-list for the corresponding resource ID. Unfortunately I didn't get any result from that. I am getting this error in ceilometer-agent-central.log: TRACE ceilometer.agent.base TypeError: utcnow() takes no arguments (1 given)

AB239 (2016-05-19 02:15:15 -0500)

@AB239 - here we go, the problem is that ceilometer-compute is failing to gather data. Check this review - https://review.openstack.org/#/c/292274/

yprokule (2016-05-19 23:26:42 -0500)

@yprokule: I have applied the patch (though I couldn't see 'True' in either of the files, on both controller and compute). I created a new VM and a new alarm attached to it, but the result is still the same :(

AB239 (2016-05-20 01:02:02 -0500)
