help with failing Mirantis fuel 9.2 deployment

Hi All,

I'm currently trying to deploy a basic 4-node PoC with Mirantis Fuel 9.1. All hardware is identical and the network connectivity check passes on the Fuel dashboard. I have one controller node and three compute/OSD nodes. Cinder, Glance, Nova and Swift are all backed by Ceph with an object replication factor of 3.

The controller node finishes deployment without issue, but all of the compute/OSD nodes fail because they time out running Ceph commands. On the controller node, if I run "ceph -s", I get the following output:

cluster a7f64266-0894-4f1e-a635-d0aeaca0e993
  health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
  monmap e1: 1 mons at {node1=192.168.0.1:6789/0}, election epoch 1, quorum 0 node1
  osdmap e1: 0 osds: 0 up, 0 in
  pgmap v2: 192 pgs, 3 pools, 0 bytes data, 0 objects
     0 kB used, 0 kB / 0 kB avail
     192 creating
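
For reference, here are the follow-up commands I was planning to run to gather more detail (assuming the standard Ceph CLI is available; node names and paths are just the defaults from my setup):

  # Show detailed health info and which PGs are stuck
  ceph health detail

  # Confirm there really are no OSDs registered in the CRUSH map
  ceph osd tree
  ceph osd stat

  # On one of the compute/OSD nodes, check whether any ceph-osd
  # daemons were actually started by the deployment
  ps aux | grep [c]eph-osd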

The Ceph log file on the controller node also contains the following:

mon.node-14@0(leader).auth v33 caught error when trying to handle auth request, probably malformed request

Running any Ceph command on any of the OSD nodes hangs for a while and then prints the following continuously:

monclient: hunting for new mon
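
From some searching, this looks like it could be an auth/keyring or monitor connectivity problem, so these are the checks I intend to run on the OSD nodes next (the monitor IP below is just the one from my "ceph -s" output, and ntp is whatever Fuel deployed):

  # Verify the monitor address in ceph.conf matches the one reported by "ceph -s"
  grep mon /etc/ceph/ceph.conf

  # Compare ceph.conf and the admin keyring against the controller's copies
  md5sum /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring

  # Confirm the monitor port is reachable from this node
  nc -zv 192.168.0.1 6789

  # Rule out clock skew between nodes (cephx auth is sensitive to it)
  ntpq -p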

Any help with how to start debugging this would be greatly appreciated.

Thanks.
