Thanks for the zero_byte_files_per_second pointer; I had missed that. With it set to 1 (which effectively turns it off), I still get 4 client-visible errors per day, each one a failed GET.

We were just using the cfq scheduler. I have switched the disks to the deadline scheduler and will report back on what happens.
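
For anyone who wants to check the same thing on their own storage nodes, here is a rough sketch of how the active I/O scheduler can be read and changed through sysfs. It assumes a Linux kernel that exposes /sys/block/<device>/queue/scheduler with the active scheduler shown in square brackets (for example "noop deadline [cfq]"). The device name "sdb" in the usage note and the helper function names are placeholders of mine, not anything from our actual setup.

```python
import re
from pathlib import Path


def parse_active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def scheduler_path(device):
    """Build the sysfs path for a block device name such as 'sdb' (placeholder)."""
    return Path("/sys/block") / device / "queue" / "scheduler"


def read_active_scheduler(device):
    """Read the currently active scheduler for the given block device."""
    return parse_active_scheduler(scheduler_path(device).read_text())


def set_scheduler(device, name):
    """Select a scheduler by writing its name to sysfs.

    Needs root, and the change does not persist across reboots.
    """
    scheduler_path(device).write_text(name)
```

Run as root, something like set_scheduler("sdb", "deadline") followed by read_active_scheduler("sdb") should confirm the switch. To make it stick across reboots it would still need to go into the boot configuration or an init script.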

I'm still a bit concerned that so much tuning is required for what should be a light load on the overall cluster.