I would like to deploy a Swift cluster with five zones and three replicas. Three zones will be the primary zones, and two will be hand-off zones. Based on the above, I have the following questions:

1) Is there any way to designate two zones (i.e., zone 4 and zone 5) as hand-off zones only?

2) When calculating the part_power for the ring size, do we have to include the number of hard disks on the hand-off zone nodes?

3) If we can designate two zones as hand-off zones, do we have to set their storage capacity to be about the same as that of each primary zone?


Swift does not support "handoff zones", as you put it. Partitions are distributed among all the devices in the ring, usually according to the devices' weights, but sometimes ignoring weights in favor of maximal dispersion. However, with 5 similarly-sized zones and 3 replicas, the maximal-dispersion requirement won't come into play, and the replicas will end up divided evenly.

See also http://docs.openstack.org/developer/swift/overview_ring.html#building-the-ring for lots of details.
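On question 2: since partitions are spread over every device in the ring, the drives in all zones count toward the ring size. The Swift deployment guide suggests, as a starting point, taking the maximum number of drives the cluster will ever hold, multiplying by 100, and rounding up to the nearest power of two. A minimal sketch of that rule of thumb follows; the function name and the drive counts are made up for illustration.

```python
def suggest_part_power(max_drives, partitions_per_drive=100):
    """Return the smallest part power p with 2**p >= max_drives * partitions_per_drive.

    Rule-of-thumb helper only; 100 partitions per drive is a starting
    point, not a hard requirement.
    """
    if max_drives < 1 or partitions_per_drive < 1:
        raise ValueError("max_drives and partitions_per_drive must be >= 1")
    target = max_drives * partitions_per_drive
    # (target - 1).bit_length() is ceil(log2(target)) in exact integer math.
    return (target - 1).bit_length()


# Hypothetical cluster: 5 zones x 4 servers x 12 drives, counting ALL zones.
part_power = suggest_part_power(5 * 4 * 12)
print(part_power, 2 ** part_power)
```

The part power is passed to swift-ring-builder when the ring is created, so it is worth sizing it for the largest cluster you expect rather than for the initial deployment.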

Thanks for your reply. By "handoff zones" I mean that I would like the devices in those zones to serve only as handoff devices. I read the official OpenStack guide for Swift, and it recommends five zones. What is the benefit of five zones? Thanks.

There is a blueprint about configurable handoff constraints, covering zone, IP, and disk. See blueprint swift-configurable-handoff-constraints.
The blueprint proposes loading a ring_handoff constraint from /etc/swift/swift.conf, with three possible values:
zone: handoff nodes are in different zones from the primary replicas and from other handoff nodes. This is the default Swift behavior.
ip: handoff nodes have different IP addresses than primary nodes, but may be in the same zone as primary nodes or other handoffs. This usually implies that the handoff nodes are on a different server.
disk: handoff nodes are on different disk drives than primary nodes, but may be on the same server in the same zone as primary nodes or other handoff nodes.
The ip and disk constraints will generate as many handoff nodes as there are zones.
Example /etc/swift/swift.conf:
ring_handoff = ip
I am not sure whether these are implemented.

Every device will be used both as a primary location for some partitions and as a handoff location for others. Zones are separate failure domains; if you have more zones, then the number of partitions impacted by a failure goes down. Make zones that match your setup.
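To illustrate the point that every zone plays both roles, here is a toy model. It assumes a simple round-robin placement over equally sized zones, which is NOT how Swift's ring builder actually assigns partitions; the function name is made up for illustration.

```python
from collections import Counter


def toy_roles(zones, replicas, partitions):
    """Count, per zone, the partitions it is primary for and those it could take as a handoff.

    Toy round-robin placement only, not Swift's ring-builder algorithm.
    """
    primary, handoff = Counter(), Counter()
    for part in range(partitions):
        # Primary zones for this partition: `replicas` consecutive zones.
        primaries = {(part + i) % zones for i in range(replicas)}
        for zone in range(zones):
            if zone in primaries:
                primary[zone] += 1
            else:
                # Any zone not holding a primary copy is a handoff candidate.
                handoff[zone] += 1
    return primary, handoff


primary, handoff = toy_roles(zones=5, replicas=3, partitions=1000)
for zone in range(5):
    print(f"zone {zone + 1}: primary for {primary[zone]}, "
          f"handoff candidate for {handoff[zone]}")
```

In this model no zone ends up as handoff-only: each of the five zones is a primary location for some partitions and a handoff candidate for the rest.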

Thanks, that helps. One more question: if we have more zones (say, five) than replicas (normally three), then no zone has a full set of the objects uploaded to the cluster, am I right? Whereas if we have only three zones and three replicas, then zone 1, zone 2, and zone 3 each have a full set of the uploaded data.

Yes, that's correct.
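The arithmetic behind this can be sketched as follows, assuming equally weighted zones and at most one replica of a partition per zone whenever there are enough zones. The function name is made up for illustration.

```python
from fractions import Fraction


def zone_data_share(zones, replicas):
    """Fraction of all objects that have a replica in any single zone.

    Assumes equal-weight zones; with fewer zones than replicas every
    zone holds at least one copy of everything, so the share caps at 1.
    """
    if zones < 1 or replicas < 1:
        raise ValueError("zones and replicas must be >= 1")
    return min(Fraction(replicas, zones), Fraction(1))


for zones in (3, 5):
    print(f"{zones} zones, 3 replicas: each zone holds "
          f"{zone_data_share(zones, 3)} of the objects")
```

The same fraction is the share of partitions that lose a replica when one whole zone fails, which is why adding zones reduces the impact of a single failure.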

