Mirantis Fuel 7.0 Deploying Additional Ceph OSD nodes

I used Fuel 7.0 to deploy an OpenStack environment, and everything is working great. Ceph is used for the backend storage.

Here is my situation: I have three really nice HP DL380G9 servers packed with 10x 6TB HDDs (for Ceph OSDs) and a couple of SSDs (for Ceph journaling). These disks are attached to a P440 (4G) RAID controller in JBOD mode. The servers also have two 300GB disks in the rear for the OS; those two drives are attached to a separate RAID controller inside the server, distinct from the P440 in JBOD mode.

Long story short, the rear drives (for the OS) are RAID 1, and the rest of the disks (for Ceph OSDs and journaling) are presented in JBOD (HBA) mode to the OS. If I manually install Ubuntu 14.04.x LTS, it sees all of the drives just fine. I even used these servers to manually install a test Ceph cluster, so I know for sure they work with Ceph and Ubuntu 14.04.x LTS; now it's time to wipe them and let Fuel take control.
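
For reference, here's roughly what I run on a manually installed box to confirm all of the drives are visible (nothing fancy, just sanity checks):

lsblk -d -o NAME,SIZE,MODEL,TYPE             # one line per physical disk; all ten 6TB HDDs and the SSDs appear
fdisk -l 2>/dev/null | grep '^Disk /dev/'    # cross-check the device list and sizes

Both commands agree on a manual install, which is why I'm confident the hardware side is fine.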

I could not get these servers to work during the initial Fuel deployment, or I would have used them right off the bat.

Instead, I had to grab a couple of Dell R610s to act as Ceph servers (they only have one OSD each) in order to stand up the deployment successfully. Now that the deployment is online, I'm trying to use Fuel to add my nice, dense DL380G9 servers as additional Ceph OSD nodes.

Fuel discovers them fine, brings them in, and shows all of the disks (in the "Nodes" tab, via the "Configure Disks" button). I set the logical drive presented by the rear RAID 1 drives as "base system", all of the 6TB drives as "ceph", and the 400GB SSDs as "ceph journals". The Fuel web UI sees all of the drives correctly.
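
If it helps with debugging, the disk layout Fuel recorded for the node can also be dumped from the Fuel master. I'm writing this from memory, so treat the exact flags as approximate; node ID 9 is just an example:

fuel node --node-id 9 --disk --download      # should write the node's disk config to node_9/disks.yaml
cat node_9/disks.yaml                        # look for the os / ceph / cephjournal volume roles (names from memory)

That lets me confirm the role assignments outside of the web UI.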

Then I redeploy the changes to the environment, and Fuel reports success. It did succeed, but only halfway: for some reason, all of the drives except the rear OS drives disappear. If I log into the new DL380G9 node from the Fuel master and run "lsblk" or "fdisk -l", the drives no longer show up, so the node has no OSDs but still gets connected to the Ceph cluster (per "ceph -s"). Something about the way Fuel loads Ubuntu is making the rest of the drives disappear. I know it's not an Ubuntu problem, because I've used these servers with Ubuntu many times, so it must be Fuel.
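
Concretely, here's what I check on the new node after the deployment finishes (as root, over SSH from the Fuel master); the ceph commands are just the sanity checks I'm using:

lsblk                                        # only the RAID 1 OS volume is listed
fdisk -l 2>/dev/null | grep '^Disk /dev/'    # same story: no 6TB HDDs, no SSDs
ceph -s                                      # the node is connected to the cluster
ceph osd tree                                # but no OSDs were ever created on the new host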

Searching dmesg turned up this, which is peculiar:

# dmesg | grep scsi

[    3.523496] scsi0 : hpsa
[    3.534261] scsi 0:3:0:0: RAID              HP       H240ar           1.18 PQ: 0 ANSI: 5
[    3.536219] scsi 0:0:0:0: Direct-Access     HP       LOGICAL VOLUME   1.18 PQ: 0 ANSI: 5
[    3.538429] scsi 0:3:0:0: Attached scsi generic sg0 type 12
[    3.540648] sd 0:0:0:0: Attached scsi generic sg1 type 0
[    3.586512] scsi1 : hpsa
[    3.605695] scsi 1:3:0:0: RAID              HP       P440             2.52 PQ: 0 ANSI: 5
[    3.607638] scsi 1:3:0:0: Attached scsi generic sg2 type 12

The P440 is verified to be in HBA (JBOD) mode, so I have no idea why it shows up as RAID under dmesg. (I would expect RAID for the H240ar / LOGICAL VOLUME entries, since those are the rear RAID 1 drives for the OS.)
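
For what it's worth, here's how I've been double-checking the controller mode and the driver from a booted OS. hpssacli is HP's separate CLI utility (not installed by default), and the exact output wording may vary by firmware, so treat this as a rough sketch:

modinfo hpsa | grep -i '^version'            # which hpsa driver version the running kernel uses
hpssacli ctrl all show detail | grep -i hba  # on my manual installs this reports HBA mode enabled for the P440

Could the hpsa driver in the image Fuel deploys be too old to expose physical drives behind a P440 in HBA mode? That's my best guess, but I'd welcome confirmation.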

Can anyone shed some light on what is going on here? Why is Fuel making my disks disappear? I really need to get these storage nodes strapped into the cloud environment.

Thanks for your time!