Mirantis Fuel 7.0 Deploying Additional Ceph OSD nodes

asked 2015-10-21 16:15:29 -0600

kclev

I used Fuel 7.0 to deploy an OpenStack environment, and everything is working great. Ceph is used for the backend storage.

Here is my situation: I have three really nice HP DL380G9 servers packed with 10x 6TB HDDs (for Ceph OSDs) and a couple of SSDs (for Ceph journaling). These disks are attached to a P440 (4G) RAID controller in JBOD mode. The servers also have two 300GB disks in the rear for the OS (those two drives are attached to a second RAID controller inside the server, separate from the P440).

Long story short, the rear drives (for the OS) are RAID 1, and the rest of the disks (for Ceph OSDs and journals) are presented in JBOD (HBA) mode to the OS. If I manually install Ubuntu 14.04.x LTS, it sees all of the drives just fine. I even used these servers to manually install a test Ceph cluster, so I know for sure they work with Ceph and Ubuntu 14.04.x LTS; now it's time to wipe them and let Fuel take control.

I could not get these servers to work during the initial Fuel deployment or I would have used them right off the bat.

Instead, I had to grab a couple of Dell R610s to act as Ceph servers (they only have one OSD each) in order to stand up the deployment successfully. Now that the deployment is online, I'm trying to use Fuel to add my nice, dense DL380G9 servers as additional Ceph OSD nodes.

Fuel discovers them fine, brings them in, and shows all of the disks (Nodes tab → "Configure Disks"). I set the logical drive presented by the RAID 1 rear drives as "Base System", all of the 6TB drives as "Ceph", and the 400GB SSDs as "Ceph Journal". The Fuel web UI sees all of the drives correctly.

Then I deploy the changes to the environment, and Fuel reports "success". It did succeed, but only halfway: for some reason, every drive except the rear OS drives disappears. If I log into the new DL380G9 node from the Fuel master and run "lsblk" or "fdisk -l", the drives no longer show up, so the node has no OSDs but still gets joined to the Ceph cluster (visible in "ceph -s"). Something in the way Fuel loads Ubuntu is making the rest of the drives disappear (I know it's not an Ubuntu problem, because I've used these servers with Ubuntu many times, so it must be Fuel).
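For reference, a minimal sketch of the check described above. On the node itself you would run `lsblk -d -n -o NAME,SIZE` directly; the sample output below is an assumption matching the broken state (one RAID 1 logical volume, device name hypothetical):

```shell
#!/bin/sh
# Count the block devices the OS can actually see. A healthy DL380G9 node
# here should show the rear OS volume plus 10 HDDs and the journal SSDs;
# after Fuel provisioning, only the RAID 1 logical volume was left.
lsblk_out="sda 279.4G"    # sample of `lsblk -d -n -o NAME,SIZE` (assumption)
disk_count=$(printf '%s\n' "$lsblk_out" | wc -l)
echo "visible disks: $disk_count"
if [ "$disk_count" -lt 11 ]; then
    echo "WARNING: Ceph OSD/journal disks are missing"
fi
```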

Searching dmesg turned up this, which is peculiar:

# dmesg | grep scsi

[    3.523496] scsi0 : hpsa
[    3.534261] scsi 0:3:0:0: RAID              HP       H240ar           1.18 PQ: 0 ANSI: 5
[    3.536219] scsi 0:0:0:0: Direct-Access     HP       LOGICAL VOLUME   1.18 PQ: 0 ANSI: 5
[    3.538429] scsi 0:3:0:0: Attached scsi ...
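Note that only the controller and the OS logical volume are attached; none of the JBOD disks behind the P440 appear. A quick way to see which hpsa driver the provisioned kernel loaded is `modinfo hpsa` or `cat /sys/module/hpsa/version`. The sketch below just compares that version string against a newer one (the sample value and the "fixed" version are assumptions for illustration):

```shell
#!/bin/sh
# On the node you would do: hpsa_ver=$(cat /sys/module/hpsa/version)
hpsa_ver="3.4.0"    # sample value (assumption: stock generic-kernel driver)
want="3.4.8"        # hypothetical newer driver version to compare against
# Compare dotted versions: GNU `sort -V` puts the smaller version first.
oldest=$(printf '%s\n%s\n' "$hpsa_ver" "$want" | sort -V | head -n1)
if [ "$oldest" = "$hpsa_ver" ] && [ "$hpsa_ver" != "$want" ]; then
    echo "hpsa $hpsa_ver is older than $want - P440 disks may not attach"
fi
```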

1 answer


answered 2015-11-07 10:37:52 -0600

kclev

updated 2015-11-07 10:40:10 -0600

This turned out to be an issue between the build of Ubuntu that Fuel installs during provisioning and the P440 RAID controller inside the HP DL380G9 server. The hpsa.ko driver used for the P440 controller was out of date in the generic kernel, so the Mirantis team was kind enough to update the driver for us (from hpsa 3.4.0 to 3.4.8) and recompile it against their generic kernel, which gives us a temporary fix.

A bug has been filed for this issue, so a permanent fix should arrive soon: https://bugs.launchpad.net/fuel/+bug/1513535
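Once the patched driver is in place, the new node's OSDs should appear in `ceph osd tree`. A minimal sketch of that check, counting "up" OSDs on one host from sample output (the host and OSD names are assumptions, not from my cluster):

```shell
#!/bin/sh
# On the Fuel master / a monitor you would do: osd_tree=$(ceph osd tree)
# Sample fragment of `ceph osd tree` output (hypothetical host "node-7"):
osd_tree="-2 host node-7
0 osd.0 up
1 osd.1 up"
# Count the lines that end in " up" - i.e. OSDs that registered and came up.
up_count=$(printf '%s\n' "$osd_tree" | grep -c ' up$')
echo "OSDs up on node-7: $up_count"
```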

