
  1. I have a web application running as a front-end for the Swift cluster, and there are some existing users. My plan is to download all of each user's data with curl, do a fresh multinode installation, recreate the existing users, and re-upload the data. Is that the right way to proceed, or can you suggest another approach?

I think learning how to do a smooth ring rebalance and capacity adjustment would be a good exercise - it's sort of a big deal. If you have headroom, maybe leave things where they are and spin up some VMs to practice?
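A rough sketch of the kind of practice run I mean, assuming a throwaway builder file and made-up IPs/ports/devices (adjust everything for your VMs):

```
# Build a throwaway ring to practice with: part power 16, 3 replicas,
# min_part_hours of 1 (values and addresses are only examples).
swift-ring-builder test.builder create 16 3 1
swift-ring-builder test.builder add r1z1-10.0.0.1:6000/sdb 100
swift-ring-builder test.builder add r1z2-10.0.0.2:6000/sdb 100
swift-ring-builder test.builder add r1z3-10.0.0.3:6000/sdb 100
swift-ring-builder test.builder rebalance

# Then practice a capacity adjustment: add another device, rebalance again,
# and watch how many partitions get reassigned.
swift-ring-builder test.builder add r1z1-10.0.0.4:6000/sdb 100
swift-ring-builder test.builder rebalance
swift-ring-builder test.builder    # prints devices and current balance
```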

  1. I have 3 servers now (4x3TB + 4x3TB + 1x2TB) and am planning a multinode Swift production cluster. Can you suggest the number of zones, the partition power, and the replication count for this?

Region and zone are physical things, just like drives and servers; there's no longer any reason to make them up. If you have link(s) between some node(s) that are slower or faster than others, that's a region. A zone is a failure domain: a TOR switch, redundant power/battery backup, a different rack/location in the datacenter/colo.
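Regions and zones end up in the ring as part of each device entry when you add it to the builder; a minimal sketch with made-up addresses, where r1/r2 are two sites with a slower link between them and the z's are racks/failure domains:

```
# Each device is added with a region (r) and zone (z) - here two regions
# joined by a slower link, with zones as racks inside each (example values).
swift-ring-builder object.builder add r1z1-192.168.1.10:6000/sdb1 100
swift-ring-builder object.builder add r1z2-192.168.1.11:6000/sdb1 100
swift-ring-builder object.builder add r2z1-192.168.2.10:6000/sdb1 100
```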

Part power is hard; you have to shoot for the middle and think ahead (see below).

IMHO replica count is meant to be 3. I've heard of people getting by with 2, and now that we have an adjustable replica count, maybe the first time you have a downed node and start throwing 404s or 500s, or lose data because of multiple correlated disk failures, you can consider adding capacity and shoring up your replica count? You'd really have to ask people running with 1 or 2 replicas whether that's working out for them.
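For what it's worth, the adjustable replica count lives in the ring builder; a minimal sketch, assuming an object.builder that already exists (the file name and values are just examples):

```
# Replica count is set when the ring is created, but it can now be adjusted
# later and pushed out with a rebalance - e.g. shoring 2 replicas up to 3
# after adding capacity.
swift-ring-builder object.builder set_replicas 3
swift-ring-builder object.builder rebalance
```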

  1. Can you tell me how to decide the partition power and replication count according to the available storage space?

If you plan on running those nodes at capacity without adding more nodes/drives (say > 70% disk full), I sure would hate to see your replication cycle times with a part power greater than 17. I think 16 may even be reasonable, but then if down the road you've got more than 600 drives in maybe 25-30 nodes (let's say 5TB drives by then, that's ~1PB usable), you may be fighting balance issues as each drive has fewer than ~100 partitions on it :\ Maybe limiting your max object size to something like 2 GB might help protect you, or maybe if you've got metrics you can apply back pressure from disk capacity over a rebalance loop until you get unbalanced partitions spread around (hard when there's only ~100 of them per drive - that actually probably wouldn't work :), so... maybe Swift will have an adjustable part power by then!?
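If you want to put rough numbers on it, the back-of-envelope arithmetic is just partitions (times replicas) divided by drives; a quick bash sketch using the figures from this thread (illustrations, not recommendations):

```
# Partition-replicas per drive ~= (2^part_power * replicas) / drive_count
# Today: 9 drives, part power 16, 3 replicas
echo $(( (2**16 * 3) / 9 ))      # thousands per drive - balance is easy
# Down the road: the same ring spread over ~600 drives
echo $(( (2**16 * 3) / 600 ))    # only a few hundred - balance gets coarse
```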

Too big is a problem now, too small is a problem later. I personally save hard problems for future me. You could get by with as much as 18 (a very respectable part power IMHO) if you plan on growing rapidly, and I don't think there's any reason to limit yourself to less than 16 with 9 drives.
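To make that concrete, creating the builders for a part power in that 16-18 range might look something like this (part power, replica count, and min_part_hours are all example values you'd pick yourself):

```
# create <part_power> <replicas> <min_part_hours>
swift-ring-builder account.builder   create 16 3 24
swift-ring-builder container.builder create 16 3 24
swift-ring-builder object.builder    create 16 3 24
```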

  1. Can we increase or decrease the partition power of an existing Swift cluster's ring files while there is data in the cluster? If so, will it create replicas only for new incoming data, or for the existing data as well?

Moving from one part power to another is currently not an "adjustment", it's a migration - and it's not a process that is built into Swift today. Duplicating subsets of data during the migration would probably be needed to ensure seamless availability. Not trivial.
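Since there's nothing built in, any sketch is hand-waving, but a manual migration basically means copying objects out of the old cluster into a new one built with the new part power. A bare-bones illustration with curl (placeholder URLs/token, no listing pagination past 10k objects, no error handling, breaks on object names with whitespace, and nothing to deal with writes that land mid-copy):

```
# Hand-waving only: copy one container from the old cluster to the new one.
# OLD, NEW and TOKEN are placeholders; separate clusters likely need separate tokens.
OLD=https://old-proxy/v1/AUTH_user1
NEW=https://new-proxy/v1/AUTH_user1
TOKEN="<auth-token>"

# A GET on a container returns a plain-text listing of object names.
for obj in $(curl -s -H "X-Auth-Token: $TOKEN" "$OLD/mycontainer"); do
    curl -s -H "X-Auth-Token: $TOKEN" "$OLD/mycontainer/$obj" -o /tmp/obj
    curl -s -H "X-Auth-Token: $TOKEN" -T /tmp/obj "$NEW/mycontainer/$obj"
done
```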

  1. What kinds of challenges could I run into down the road with this kind of multinode production cluster, so that I can design for a scalable cluster?

Load balancing is a PITA! Swift can make the storage part easy to scale and takes failures in stride if you've got a sound understanding of how the bits go together. Good luck!