It's hard to pin down the exact reason for what you're seeing based on the info you've given so far. However, a brief overview of the write path and how the container listing gets updated may shine some light on a few areas that could produce this behavior.
When an object write request (PUT or DELETE) is received by the proxy server, the proxy sends new requests to the proper object servers. These new, backend requests also include a directive for the object server to update a particular container server. Like objects, container information is also replicated in the cluster. So if you've got 3x replicas for objects and 3x replicas for containers, each object server receiving an object write request will also get a directive to update a unique replica of the container.
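To make that fan-out concrete, here's a rough sketch (in Python, not Swift's actual proxy code) of the pairing: each backend object PUT carries headers telling that object server which single container replica it should update. The node lists and partition number below are made up for illustration; the `X-Container-Host` / `X-Container-Partition` / `X-Container-Device` header names are the ones Swift's proxy uses for this, but treat the rest as a sketch.

```
# Illustrative only: pair each object-server request with one container
# replica to update, the way the proxy does when it builds backend PUTs.

object_nodes = [  # hypothetical primary nodes for the object's partition
    {"ip": "10.0.0.1", "port": 6200, "device": "sdb1"},
    {"ip": "10.0.0.2", "port": 6200, "device": "sdb1"},
    {"ip": "10.0.0.3", "port": 6200, "device": "sdb1"},
]
container_nodes = [  # hypothetical primary nodes for the container's partition
    {"ip": "10.0.1.1", "port": 6201, "device": "sdc1"},
    {"ip": "10.0.1.2", "port": 6201, "device": "sdc1"},
    {"ip": "10.0.1.3", "port": 6201, "device": "sdc1"},
]
container_partition = 12345  # hypothetical container partition number

for obj_node, cont_node in zip(object_nodes, container_nodes):
    # Each backend object PUT tells that object server which single
    # container replica it is responsible for updating.
    headers = {
        "X-Container-Host": f"{cont_node['ip']}:{cont_node['port']}",
        "X-Container-Partition": str(container_partition),
        "X-Container-Device": cont_node["device"],
    }
    print(f"PUT to {obj_node['ip']}:{obj_node['port']}/{obj_node['device']} "
          f"with container update headers: {headers}")
```

The upshot is that each of the three container replicas gets its update from a different object server, so a single failed update only leaves one container replica behind.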
After the object server has flushed the object write to disk successfully, it attempts the container update. If it fails for any reason, then an `async_pending` is created and a background daemon (the `object-updater`) processes it later. So your first thing to look at is how many `async_pendings` are in the system (use `swift-recon` to help get this info) and make sure the `swift-object-updater` is running on all the nodes. It could be that the container update failed initially and the update is queued in `async_pendings` and hasn't been processed.
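If you want to eyeball a single node directly instead of (or in addition to) `swift-recon --async`, a rough sketch like this counts the queued updates on disk. It assumes the default `/srv/node` devices root and the `async_pending*` directory naming of a stock install; adjust for your deployment.

```
# Rough sketch: count queued container updates (async pendings) on one
# object node by walking the on-disk queue directories.

import glob
import os

DEVICES_ROOT = "/srv/node"  # assumption: default `devices` setting

total = 0
for device in sorted(os.listdir(DEVICES_ROOT)):
    # async_pending for policy 0, async_pending-N for other storage policies
    for queue_dir in glob.glob(os.path.join(DEVICES_ROOT, device, "async_pending*")):
        count = sum(len(files) for _, _, files in os.walk(queue_dir))
        total += count
        print(f"{device}: {count} pending container updates in {queue_dir}")

print(f"total async pendings on this node: {total}")
```

If that total keeps growing even with `swift-object-updater` running everywhere, the container servers themselves (or their drives) are worth a closer look.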
However, if that isn't the issue, then perhaps the container updates were initially successful and all `async_pendings` have been processed in the cluster. In that case, maybe you've updated the container ring and `container-replication` hasn't been able to reconcile the various partial replicas in the system yet. Or maybe the container ring isn't the same on every node, which has the same effect. Check that the rings are the same everywhere (again, `swift-recon` can help) and that the replicator daemons are all running.
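`swift-recon --md5` will compare ring checksums across the cluster for you; if you'd rather check by hand, a quick sketch like this (assuming the default `/etc/swift` location) run on each node will expose any mismatched rings.

```
# Rough sketch: hash the ring files on this node so you can compare the
# digests across nodes; they should be identical everywhere.

import glob
import hashlib

for ring_path in sorted(glob.glob("/etc/swift/*.ring.gz")):
    with open(ring_path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    # Run this on every node (or collect the output centrally) and make
    # sure the digests agree for object.ring.gz, container.ring.gz, etc.
    print(f"{ring_path}: {digest}")
```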
Getting into less likely possibilities: perhaps the container drives have been filling up and load has been shed to other drives. Replication can't keep up in that case (because there's no space on the primary locations), and the listing request may be querying stale replicas, giving the strange results you see. Check drive fullness, add capacity as necessary, and ensure the replicator processes are running. I strongly doubt this is the issue in your cluster, though, because you're able to write objects and get updated listings.
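Still, if you want to rule it out quickly, a small sketch like this (again assuming the default `/srv/node` layout) shows per-drive fullness on a node; `swift-recon -d` reports the same thing cluster-wide.

```
# Rough sketch: report how full each Swift drive is on this node.

import os
import shutil

DEVICES_ROOT = "/srv/node"  # assumption: default `devices` setting

for device in sorted(os.listdir(DEVICES_ROOT)):
    mount = os.path.join(DEVICES_ROOT, device)
    if not os.path.isdir(mount):
        continue
    usage = shutil.disk_usage(mount)
    pct_used = 100.0 * usage.used / usage.total
    print(f"{mount}: {pct_used:.1f}% used "
          f"({usage.free / 2**30:.1f} GiB free of {usage.total / 2**30:.1f} GiB)")
```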
All of the above assumes the issue is simply that the listings are out of date. Swift is designed to remain available even when there are failures in the cluster, and a side effect is that it can sometimes serve stale results like you're seeing. It should reconcile itself, though; the troubleshooting above is to find where that reconciliation is getting stuck.
Good luck!