
ceph heartbeat

asked 2014-06-14 04:17:06 -0500

updated 2014-06-14 04:19:44 -0500

[root@controller my-cluster]# ceph -s
cluster fcf53ba1-52ed-4551-b903-7ce1bccfa82e
 health HEALTH_WARN 192 pgs stale; 192 pgs stuck stale; 192 pgs stuck unclean
 monmap e1: 1 mons at {controller=10.20.0.11:6789/0}, election epoch 1, quorum 0 controller
 osdmap e82: 15 osds: 4 up, 2 in
  pgmap v3037: 448 pgs, 5 pools, 197 MB data, 60 objects
        10313 MB used, 4828 GB / 4838 GB avail
             192 stale+active+remapped
             256 active+clean




[root@controller my-cluster]# ceph osd tree
# id    weight  type name   up/down reweight
-1  4.73    root default
-2  3.64        host network
13  3.64            osd.13  up  1   
-3  1.09        host compute1
14  1.09            osd.14  up  1   
0   0   osd.0   down    0   
1   0   osd.1   down    0   
2   0   osd.2   down    0   
3   0   osd.3   down    0   
4   0   osd.4   down    0   
5   0   osd.5   up  0   
6   0   osd.6   down    0   
7   0   osd.7   up  0   
8   0   osd.8   down    0   
9   0   osd.9   down    0   
10  0   osd.10  down    0   
11  0   osd.11  down    0   
12  0   osd.12  down    0

I deleted osd.0 through osd.12 following the procedure at http://ceph.com/docs/master/rados/operations/add-or-rm-osds/, but as shown above they still appear in the tree, and osd.5 and osd.7 are still reported as up. How can I completely remove them and clean this up? Also, running the command rbd ls gives no response.
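For reference, the removal procedure in that document is roughly the sequence below (a sketch for a single OSD, using osd.0 as an example; the command to stop the daemon depends on the init system in use):

ceph osd out 0
# stop the daemon on the host where osd.0 runs, e.g.:
#   sudo /etc/init.d/ceph stop osd.0    (or: service ceph stop osd.0)
ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm 0
# finally, remove the osd.0 section from ceph.conf if one exists

Here is my log: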

    2014-06-14 17:01:10.522967 7f42d4792700 -1 osd.13 76 heartbeat_check: no reply from osd.5 ever on either front or back, first ping sent 2014-06-14 16:13:37.606610 (cutoff 2014-06-14 17:00:50.522966)
2014-06-14 17:01:10.522990 7f42d4792700 -1 osd.13 76 heartbeat_check: no reply from osd.7 ever on either front or back, first ping sent 2014-06-14 16:13:37.606610 (cutoff 2014-06-14 17:00:50.522966)
2014-06-14 17:01:11.523253 7f42d4792700 -1 osd.13 76 heartbeat_check: no reply from osd.5 ever on either front or back, first ping sent 2014-06-14 16:13:37.606610 (cutoff 2014-06-14 17:00:51.523252)
2014-06-14 17:01:11.523270 7f42d4792700 -1 osd.13 76 heartbeat_check: no reply from osd.7 ever on either front or back, first ping sent 2014-06-14 16:13:37.606610 (cutoff 2014-06-14 17:00:51.523252)
2014-06-14 17:01:11.842527 7f42c1c0e700 -1 osd.13 76 heartbeat_check: no reply from osd.5 ever on either front or back, first ping sent 2014-06-14 16:13:37.606610 (cutoff 2014-06-14 17:00:51.842526)
2014-06-14 17:01:11.842547 7f42c1c0e700 -1 osd.13 76 heartbeat_check: no reply from osd.7 ever on either front or back, first ping sent 2014-06-14 16:13:37.606610 (cutoff 2014-06-14 17:00:51.842526)
2014-06-14 17:01:12.523530 7f42d4792700 -1 osd.13 76 heartbeat_check: no reply from osd.5 ever on either front or back, first ping sent 2014-06-14 16:13:37.606610 ...

1 answer


answered 2014-07-05 06:01:39 -0500

dachary

osd.5 and osd.7 are up:

...
5   0   osd.5   up  0   
6   0   osd.6   down    0   
7   0   osd.7   up  0   
...

which means they communicate successfully with the monitors. However, they are not reachable from osd.13, which probably means that the hosts running osd.5 and osd.7 cannot be reached from the host running osd.13. Unless you need to use osd.5 and osd.7, the simplest solution is to mark them down with:

ceph osd down osd.5
ceph osd down osd.7
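
If osd.5 and osd.7 (and the leftover osd.0 through osd.12 entries) are meant to be removed from the cluster entirely rather than just marked down, the same out / stop daemon / crush remove / auth del / osd rm sequence from the add-or-rm-osds document should apply to them as well; afterwards ceph osd tree should no longer list them.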
