Swift cluster performance query

asked 2014-05-05 19:32:55 -0500

nmap911

updated 2014-07-17 15:28:02 -0500

smaffulli

Hi All

I've recently set up an object storage cluster that consists of:

  1. 4 Nodes running all swift related services (object, container, account, proxy)
  2. 2 Nodes running Pound load balancer services

I've set up 4 zones across these 4 nodes (one zone per node) and have my replica count set to 3.

My 4 nodes have identical hardware specs:

  1. Supermicro X9SRi-F mainboards - http://www.supermicro.com/products/motherboard/Xeon/C600/X9SRi-F.cfm
  2. Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz - http://ark.intel.com/products/64588/intel-xeon-processor-e5-2609-10m-cache-2_40-ghz-6_40-gts-intel-qpi
  3. 32GB DDR3 RAM
  4. 10 x 1TB Western Digital Black drives ( for object services - JBODs )
  5. 2 x 120GB Virtex SSD drives ( for container services - JBODs)
  6. 2 x 60GB Virtex SSD drives ( for account services - JBODs)
  7. 1 gigabit NIC
  8. SMC2108 RAID card - http://www.supermicro.com/products/accessories/addon/AOC-USAS2LP-H8iR.cfm with 512MB cache + BBU ( writeback, read ahead )

They are all connected to the same physical switch.

I ran some benchmark tests using real-life data (emails, photos, videos, documents, etc.). The cluster performs at an acceptable speed while being benchmarked, but I've noticed something very concerning after the benchmark tests finish.

When I hit 10,000 containers with 17,000 objects on my test account, my 4 nodes would idle at a 5-minute load average of 2.0 and every couple of minutes ramp up to 4.0 - 6.0, with all 4 cores running at maximum capacity. The services eating up all of the CPU on the nodes (as seen through htop) are:

  1. swift-container-replicator (averages at 80% CPU usage)
  2. swift-container-sync (averages at 80% CPU usage)
  3. swift-container-server (averages at 60% CPU usage)
  4. swift-container-server (averages at 60% CPU usage)
  5. swift-object-replicator (averages at 25% CPU usage)
  6. swift-object-server (averages at 25% CPU usage)

The sysstat tools tell me there is next to 0 iowait on my SSDs and HDDs, so the CPU isn't waiting for disk. I'm using 2GB of my 32GB of RAM, so there is no memory contention, and my network adapters are running well below a troubling rate.
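To put the load averages above in context, it can help to divide them by the core count: a value above 1.0 per core means more runnable work than the CPUs can serve. The following is a minimal sketch using the figures from this post (4 cores per node); the helper name `load_per_core` is made up for illustration.

```python
import os


def load_per_core(load_average, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


# Figures quoted in the question, for a node with 4 cores.
for label, load in [("idle", 2.0), ("peak (low)", 4.0), ("peak (high)", 6.0)]:
    print(f"{label}: {load_per_core(load, 4):.2f} per core")

# On a live Unix node the same check can be done with the standard library.
try:
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"current 5-minute load: {load_per_core(five_min, cores):.2f} per core")
except (AttributeError, OSError):
    # os.getloadavg() is unavailable on some platforms.
    pass
```

With these numbers the idle state is about half a core's worth of work per core, while the peaks sit at or above full saturation.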

So I ran my benchmark tests again and am now sitting at 17,000 containers with 27,000 objects and a collective size of +/- 56GB of data. My 4 nodes now idle at a 5-minute load average of 4.0 and peak at 6.0 to 8.0 when the replicator + sync services start - my CPUs are literally running at 100% all the time.
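A rough way to see why the container services scale with the container count: each Swift container is stored as its own SQLite database, and every replica is a separate copy on some node. The sketch below estimates how many container databases each node holds, assuming the ring spreads replicas evenly across nodes (an assumption, not something measured on this cluster). The function name is hypothetical.

```python
def container_dbs_per_node(containers, replicas, nodes):
    """Estimate container databases per node, assuming an even spread."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return containers * replicas / nodes


# Cluster from this post: 3 replicas across 4 nodes.
first_run = container_dbs_per_node(10_000, replicas=3, nodes=4)
second_run = container_dbs_per_node(17_000, replicas=3, nodes=4)

print(f"after first benchmark:  {first_run:.0f} container DBs per node")
print(f"after second benchmark: {second_run:.0f} container DBs per node")
```

Under that assumption each node goes from about 7,500 to about 12,750 container databases between the two runs, and as far as I understand the replicator has to look at each local database on every pass.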

Finally, my question for you guys:

Is this normal, expected behavior? Are the container services supposed to be this heavy on CPU resources? Or are there configuration changes that need to be made on my side to mitigate these effects?
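For anyone looking at the configuration side, the settings that control how often these daemons run live in container-server.conf, in the `[container-replicator]` and `[container-sync]` sections (`interval` and `concurrency`). The snippet below is only a sketch: the values are illustrative placeholders, not recommendations, and the Python wrapper is just a convenient way to show and read back the fragment.

```python
import configparser

# Illustrative fragment of container-server.conf; the values are assumptions
# chosen for the example, not tuned settings for this cluster.
SAMPLE_CONF = """
[container-replicator]
# seconds to wait between replication passes
interval = 300
# number of replication workers to spawn
concurrency = 2

[container-sync]
# seconds to wait between sync passes
interval = 600
"""


def read_daemon_settings(text):
    """Parse the INI text and return {section: {option: int_value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        section: {key: int(value) for key, value in parser[section].items()}
        for section in parser.sections()
    }


settings = read_daemon_settings(SAMPLE_CONF)
for section, options in settings.items():
    print(section, options)
```

Longer intervals and lower concurrency trade CPU usage for slower convergence between replicas, so any change like this is worth testing against your own consistency requirements.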

1 answer

answered 2014-05-06 07:39:21 -0500

nmap911

I've found the answer to my questions - https://julien.danjou.info/blog/2012/openstack-swift-consistency-analysis

Stats

Asked: 2014-05-05 19:32:55 -0500

Seen: 277 times

Last updated: May 06 '14