
Limit bandwidth usage of the replicas synchronisation

asked 2013-04-24 08:02:32 -0500 by choisy-alex

Hello, I have been working with Swift at my company for a couple of months now, trying to build an overall understanding of Object Storage and the way it works. I have set up an experimental environment composed of 4 storage nodes, divided into 4 zones, plus 1 proxy node. I use the Keystone identity service for authentication against the environment, and I configured 3 replicas.

I wanted to watch what happens when I unmount one of the storage nodes, with 2 GB of data already saved on my Swift cluster. As far as I can tell, when one zone goes down, the replicator recreates the third replica in another zone. This is normal, but what bothers me a bit is that the replicator does it as fast as it can, saturating my network and badly slowing down every other request made to the Swift cluster.

So I was wondering: is there a way to configure the Swift cluster to limit the bandwidth used by the replicator and rsync? I have looked for this but unfortunately found nothing. I know rsync has an option to limit the bandwidth used for synchronisation, so is there a way to enable this option for Swift?
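For reference, the rsync option I mean is --bwlimit, which caps the transfer rate in KB/s; run by hand it would look something like this (the paths and the rsync module name are just placeholders):

rsync --bwlimit=1000 --recursive /srv/node/sda/objects/ storage-node-2::object/sda/objects/

What I am missing is a way to make Swift pass that option to rsync without patching anything.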

Thank you for reading. Have a nice day.


6 answers

answered 2013-05-09 16:22:30 -0500 by choisy-alex

Thanks Brian Cline, that solved my question.

answered 2013-04-24 12:18:08 -0500 by choisy-alex

In the end, I tried adding an argument to the rsync invocation in the Python code. It seems to work quite well, but I am still looking for a way to do this without modifying the Python code.

For those who may be interested, I changed obj/replicator.py (~ L. 320):

args = [
    'rsync',
    '--bwlimit=1000',  # added to limit the bandwidth rate used by rsync
    '--recursive',
    '--whole-file',
    '--human-readable',
    '--xattrs',
    '--itemize-changes',
    '--ignore-existing',
    '--timeout=%s' % self.rsync_io_timeout,
    '--contimeout=%s' % self.rsync_io_timeout,
]
node_ip = rsync_ip(node['ip'])

and common/db_replicator.py (~ L. 175):

popen_args = [
    'rsync',
    '--bwlimit=1000',  # added to limit the bandwidth rate
    '--quiet',
    '--no-motd',
    '--timeout=%s' % int(math.ceil(self.node_timeout)),
    '--contimeout=%s' % int(math.ceil(self.conn_timeout)),
]

After that, I spread these modifications to all my nodes.
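Note that the replicators have to be restarted on each node before the new arguments are picked up; with the standard swift-init tooling that would be something like:

swift-init object-replicator restart
swift-init container-replicator restart
swift-init account-replicator restart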

answered 2013-05-01 20:37:47 -0500 by briancline

As I understand the issue, there's not a lot you can do in a non-crude way here, other than modifying the code. That being said, two of the easier (crude) options available to you are:

1) Use iptables to limit the packets/second rate (see the limit and hashlimit module documentation in iptables). Something like this on your storage nodes may do the trick:

iptables -A OUTPUT -p tcp --dport 6000 -m state --state RELATED,ESTABLISHED -m limit --limit 50/second --limit-burst 100 -j ACCEPT

This puts in place a rule that accepts up to 50 packets/sec on established connections to the object servers, after an initial burst of 100 packets; note that for it to actually throttle anything, packets over the limit must be dropped by a later rule or by the chain's default policy. The 50 and 100 here are sample values -- you'll have to determine what packets/sec rate is acceptable between object servers in your environment, and what the burst threshold should be, but you get the idea.
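If a single global limit is too coarse, the hashlimit module mentioned above can keep a separate counter per destination; a rough sketch, with the name and the numbers as placeholders:

iptables -A OUTPUT -p tcp --dport 6000 -m hashlimit --hashlimit-name swift-repl --hashlimit-mode dstip --hashlimit-above 50/second --hashlimit-burst 100 -j DROP

That drops traffic toward any single object server once it exceeds 50 packets/sec, instead of limiting all replication traffic as one pool.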

2) Use tc (traffic control) to limit the actual throughput rate.

This is a bit more involved, but you'd essentially use iptables with -j MARK and --set-xmark to mark outbound packets on port 6000 with a specific mark, then set up tc rules that tell the kernel the maximum throughput rate for the interface and the maximum rate acceptable for packets carrying that mark. You can optionally get fancier with the tc classes, making them adaptive based on current usage in other classes, and so forth.
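A minimal sketch of that setup, where eth0, the mark value of 10, and all the rates are placeholders you'd adapt to your network:

# mark outbound replication traffic to the object servers (port 6000)
iptables -t mangle -A OUTPUT -p tcp --dport 6000 -j MARK --set-xmark 10

# HTB root qdisc: unmarked traffic falls into class 1:20, marked into 1:10
tc qdisc add dev eth0 root handle 1: htb default 20
tc class add dev eth0 parent 1: classid 1:1 htb rate 1gbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 200mbit ceil 400mbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 800mbit ceil 1gbit
tc filter add dev eth0 parent 1: protocol ip handle 10 fw flowid 1:10

Here marked packets (replication) land in class 1:10 and are capped at 400mbit, while everything else uses the default class 1:20.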

Here are a few good links on tc if you wish to go this route:

http://lartc.org/howto/
http://www.cyberciti.biz/faq/linux-tr...
http://shearer.org/Linux_Shaping_Temp...

answered 2013-05-02 12:51:10 -0500 by choisy-alex

Thank you for the answer. I am going to look at these two solutions; traffic control in particular looks like it may be really helpful. There is one last thing that bothers me a bit.

While watching the packets going through my network, I am not able to really differentiate the packets caused by a PUT request (uploading an object via the API) from the packets caused by a drive failure (unmounting 1 device, for example).

So when 1 disk fails, the replicator starts doing its job and copies the missing data to another zone. During that time, I can't really use the Swift API for anything else, like listing the objects on an account. And I feel that if I limit the rsync bandwidth, the problem will stay the same, only last longer. Have you already seen something like that?

I remember reading something a few days ago about configuring the number of nodes that are used directly when there is a PUT request against the cluster. I can't remember exactly what it was, so do you have an idea? I was wondering whether having enough nodes to cover both roles could solve my problem: if 1 node handles the PUT request, there are 3 replicas, and 1 node fails, maybe having 1+3+1=5 nodes separated into 5 zones could help? This is just an idea that I'll keep working on.

Anyway, thank you very much for the answer, and sorry for the trouble. Have a nice day.

answered 2013-05-07 22:13:20 -0500 by briancline

If I understand correctly, it sounds like you want to limit the throughput rate of only the replication between storage nodes, and not the data from proxy nodes.

If so, you can use tc with multiple classes, placing proxy nodes in one class and storage nodes in a different class with a lower priority and its own throughput limits. Depending on how you construct your tc ruleset, proxy-to-storage traffic can then be prioritized over storage-to-storage traffic (replication, in this case); a sketch follows below. Alternatively, you could put the aforementioned rate-limiting rules in place only for storage node IPs, so that traffic to and from the proxies is never limited.
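As a rough sketch of the first idea -- the subnets and rates here are assumptions about your addressing, not anything Swift mandates:

# mark only storage-to-storage traffic, assuming storage nodes sit in 10.0.1.0/24
iptables -t mangle -A OUTPUT -p tcp -d 10.0.1.0/24 --dport 6000 -j MARK --set-xmark 10

# replication gets a capped, lower-priority class; the default class
# (which proxy traffic falls into) keeps the higher priority
tc qdisc add dev eth0 root handle 1: htb default 20
tc class add dev eth0 parent 1: classid 1:1 htb rate 1gbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 100mbit ceil 300mbit prio 1
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 900mbit ceil 1gbit prio 0
tc filter add dev eth0 parent 1: protocol ip handle 10 fw flowid 1:10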

The configuration value you're thinking of is probably the replica count -- when you set up your rings, you specify how many replicas are created. Each replica has a throughput cost, as you mention; however, you'll want to weigh whether reducing the redundancy of the data stored in the cluster is worth the bandwidth saved persisting it. Personally, I'd recommend keeping the replica count at 3 (or more) and limiting or deprioritizing the throughput between storage nodes separately.
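For illustration, the replica count is fixed when the ring is built; with a builder created like this (the part power of 18 and min-part-hours of 1 are placeholder values), Swift will maintain 3 copies of every object:

swift-ring-builder object.builder create 18 3 1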

Hope this helps!

answered 2013-05-09 16:22:16 -0500 by choisy-alex

Thank you for your answer. You understood the problem correctly. I am going to look at these solutions, which seem really good and should let me control the throughput rate the way I want. Thank you again for your help.

