
I've not heard of anyone attempting to run Swift as an LRU using the object-expirer feature.

I'm not sure how well this will work, but I don't want to discourage you!

Fast-POST is definitely needed to even attempt something like this, but it's not the default behavior anymore, so I wanted to make sure you'd gotten at least that far. You can check that it's working by uploading a really big object (2-5 GB) and making a POST to it: if the response comes back in less than 5 seconds, you've validated that fast-POST is configured correctly (a POST that has to copy the whole object would take much longer).
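If you'd rather script that check than eyeball it, here is a minimal sketch using only the Python standard library. The URL, token, and timestamp in the usage line are placeholders I made up, and the 5 second threshold is just the rule of thumb from above, not anything Swift defines.

```python
import time
import urllib.request


def timed_post(url, token, delete_at, opener=urllib.request.urlopen):
    """POST a new X-Delete-At to an existing object; return elapsed seconds."""
    req = urllib.request.Request(
        url,
        method="POST",
        headers={"X-Auth-Token": token, "X-Delete-At": str(delete_at)},
    )
    start = time.monotonic()
    with opener(req) as resp:
        resp.read()  # wait for the full response before stopping the clock
    return time.monotonic() - start


def looks_like_fast_post(elapsed, threshold=5.0):
    """Rule of thumb only: a POST on a multi-GB object that returns this
    quickly was almost certainly not copying the object."""
    return elapsed < threshold


# Hypothetical usage (placeholder URL and token):
# elapsed = timed_post("http://localhost:8080/v1/AUTH_test/c/big", "TOKEN", 1400000000)
# print(elapsed, looks_like_fast_post(elapsed))
```

The `opener` argument is only there so the function can be exercised without a live cluster.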

The .expiring_objects containers are being used like a queue. Every POST to update X-Delete-At will delete the old container entry for the delete marker and add a new one. From your description it sounds like this is happening for every request - that probably explains why your container servers are so busy - there are probably a lot of records in those DBs. I think the expirer also makes some unneeded backend object delete requests when trying to clean out the delete markers - I think that was a bug report once - not sure.
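To put a rough number on that churn, here is a toy model of the queue behavior described above. It is not Swift code, and the class and method names are made up; it just counts one container update for removing the old entry and one for adding the new one.

```python
class ExpirerQueueModel:
    """Toy model: one queue entry per object, replaced on every POST."""

    def __init__(self):
        self.entries = {}           # object path -> current X-Delete-At
        self.container_updates = 0  # updates sent to the queue containers

    def post_delete_at(self, obj, delete_at):
        if obj in self.entries:
            # the old delete marker entry has to be removed first
            self.container_updates += 1
        # then the new entry is added
        self.entries[obj] = delete_at
        self.container_updates += 1


model = ExpirerQueueModel()
model.post_delete_at("AUTH_test/c/o", 1400000000)      # initial expiry
for bump in range(1, 4):                               # three later "touches"
    model.post_delete_at("AUTH_test/c/o", 1400000000 + bump * 60)
print(len(model.entries), model.container_updates)
```

In this model every refresh after the first costs two container updates while the number of live entries stays at one, which is the pattern that would keep the container servers busy.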

More workers is probably OK; lower background process concurrency might reduce load but may cost you in the consistency window. On the object servers specifically you can also try threadpools.

But the expiring objects containers are probably causing the biggest load - I'd imagine there's quite a bit of contention in those DBs. Look for "Lock" or "Timeout" in the container server and replicator logs. Are the container servers on SSDs? Are they on the same servers as the objects?
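A quick way to tally those two strings is plain substring matching, sketched below. The log path in the usage line is a placeholder; use wherever your container server and replicator actually log.

```python
from collections import Counter


def count_contention_hints(lines, needles=("Lock", "Timeout")):
    """Count how many lines mention each needle (case-sensitive substring)."""
    counts = Counter()
    for line in lines:
        for needle in needles:
            if needle in line:
                counts[needle] += 1
    return counts


def scan_log(path):
    """Run the count over a log file, tolerating odd bytes in the file."""
    with open(path, errors="replace") as handle:
        return count_contention_hints(handle)


# Hypothetical usage (placeholder path):
# print(scan_log("/var/log/swift/container.log"))
```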

I like to get a single proxy process going without auth in the pipeline so I can poke at the system account directly with 'curl http://localhost:9000/v1/.expiring_objects -I'. A few HEAD requests could be very informative. You can also find the DBs with swift-get-nodes /etc/swift/account.ring.gz .expiring_objects and then run sqlite3 -line <HASH>.db "select * from account_stat" to see the uncommitted stats anyway (messing with the .pending files is a bit more trouble).

One idea I had: you may want to turn your "expiring_objects_container_divisor" down to something crazy like 3600 or even 600!?
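To see what that would buy you, here is a small sketch of how the divisor spreads entries across queue containers. It assumes the simple scheme where the container name is the X-Delete-At timestamp rounded down to a multiple of the divisor (newer Swift releases also add a per-object offset, which this ignores), and the function names and start timestamp are just for illustration.

```python
def expirer_container(x_delete_at, divisor=86400):
    """Round an X-Delete-At timestamp down to a multiple of the divisor."""
    return str(int(x_delete_at) // int(divisor) * int(divisor))


def containers_for(timestamps, divisor):
    """Distinct queue container names that a set of expirations lands in."""
    return sorted({expirer_container(t, divisor) for t in timestamps})


# One day of expirations, one per minute, from an arbitrary example timestamp.
start = 1399939200
stamps = [start + 60 * i for i in range(24 * 60)]
for divisor in (86400, 3600, 600):
    print(divisor, len(containers_for(stamps, divisor)))
```

A smaller divisor means more, smaller containers, so the same number of updates is spread over more DBs instead of piling onto one per day.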