A lost object is not being removed from the container listing

asked 2020-06-10 04:51:14 -0500

Hett

updated 2020-06-10 05:19:52 -0500

We have a small cluster of 9 nodes. We noticed that some objects are no longer available (Swift returns 404).

One of these objects, for example, is still listed in its container:

# swift -U ${user} -K ${key} -A http://10.10.2.23:8080/auth/v1.0 list 52b98808f3680  --lh | grep 52b9993db6aff
718M 2013-12-24 14:25:02 52b9993e3e31b/52b9993db6aff

But a stat on the same object reports that it does not exist:

# swift -U ${user} -K ${key} -A http://10.10.2.23:8080/auth/v1.0 stat 52b98808f3680 52b9993e3e31b/52b9993db6aff
Object 52b98808f3680/52b9993e3e31b/52b9993db6aff not found
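
To see how many listed objects are affected, the list-then-stat check above can be repeated for every name in the container. The following is only a minimal sketch: `find_missing_objects` and `swift_object_exists` are hypothetical helper names, and the second helper assumes that python-swiftclient is installed and that `conn` is a `swiftclient.client.Connection`.

```python
from typing import Callable, Iterable, List


def find_missing_objects(names: Iterable[str],
                         object_exists: Callable[[str], bool]) -> List[str]:
    """Return the listed names for which object_exists() reports False."""
    return [name for name in names if not object_exists(name)]


def swift_object_exists(conn, container: str, name: str) -> bool:
    """HEAD one object; treat a 404 as "listed but missing".

    Assumption: conn is a swiftclient.client.Connection.
    """
    # Imported lazily so the pure helper above works without swiftclient.
    from swiftclient.exceptions import ClientException
    try:
        conn.head_object(container, name)
        return True
    except ClientException as err:
        if err.http_status == 404:
            return False
        raise  # any other error is a real failure, not a missing object
```

With a real connection, the names could come from `conn.get_container(container, full_listing=True)`, which returns the headers and a list of dicts with a `name` key. Each name would then be checked with `lambda n: swift_object_exists(conn, container, n)`.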

The cluster has no replication in progress, and the rings are identical on all nodes:

# swift-recon --md5 --validate-servers
===============================================================================
--> Starting reconnaissance on 9 hosts (object)
===============================================================================
[2020-06-10 09:43:58] Validating server type 'object' on 9 hosts...
9/9 hosts ok, 0 error[s] while checking hosts.
===============================================================================
[2020-06-10 09:43:58] Checking ring md5sums
9/9 hosts matched, 0 error[s] while checking hosts.
===============================================================================
[2020-06-10 09:43:58] Checking swift.conf md5sum
9/9 hosts matched, 0 error[s] while checking hosts.
===============================================================================

I looked up where the object should be placed:

# swift-get-nodes /etc/swift/object.ring.gz AUTH_system 52b98808f3680 52b9993e3e31b/52b9993db6aff

Account     AUTH_system
Container   52b98808f3680
Object      52b9993e3e31b/52b9993db6aff


Partition   153446
Hash        95d996c9c18d517b2c38ec2f728ac4ac

Server:Port Device  10.10.2.23:6000 dev14
Server:Port Device  10.10.2.27:6000 dev31
Server:Port Device  10.10.2.25:6000 dev27    [Handoff]
Server:Port Device  10.10.2.26:6000 dev17    [Handoff]
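
The partition and hash printed above are enough to work out where the object's files would sit on each device, which helps when checking the disks by hand for a `.data` file or a `.ts` tombstone. This is a sketch under stated assumptions, not output from the cluster: the part power of 18 is inferred from the hash and partition shown above, and `/srv/node` is assumed as the devices root (Swift's default), which may differ in this cluster. Both function names are hypothetical.

```python
def partition_from_hash(hash_hex: str, part_power: int) -> int:
    """Take the top part_power bits of the first 4 bytes of the hash."""
    return int(hash_hex[:8], 16) >> (32 - part_power)


def object_dir(devices_root: str, device: str,
               partition: int, hash_hex: str) -> str:
    """Build <root>/<device>/objects/<partition>/<suffix>/<hash>."""
    suffix = hash_hex[-3:]  # the suffix directory is the last 3 hex chars
    return f"{devices_root}/{device}/objects/{partition}/{suffix}/{hash_hex}"


HASH = "95d996c9c18d517b2c38ec2f728ac4ac"  # from swift-get-nodes above
PART_POWER = 18        # assumption: inferred, not read from the ring
DEVICES_ROOT = "/srv/node"  # assumption: default devices path

partition = partition_from_hash(HASH, PART_POWER)
print(partition)  # 153446, matching the swift-get-nodes output

# Primary devices reported by swift-get-nodes for this object.
for device in ("dev14", "dev31"):
    print(object_dir(DEVICES_ROOT, device, partition, HASH))
```

If those directories are missing or empty on both primary devices and on the handoffs, no replica of the object's data remains on disk.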

I also ran the object-auditor manually on the devices listed above, and it did not find any problems.

Maybe I am misunderstanding something: which process is supposed to remove the object from the container listing, given that the object no longer exists?

Sorry for my English.


Comments

I suspect that the object was lost when both disks holding it failed (we use 2 replicas).

Hett ( 2020-06-10 05:25:51 -0500 )