If you introduce a bit of randomness into the retry timing (say, multiply the delay by a random factor between 1.8 and 2.2 instead of a straight doubling), that thundering herd will spread itself out and be much easier to recover from.
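A minimal sketch of this jittered backoff, assuming a hypothetical `retry_with_jitter` helper (the function name, parameters, and factor range are illustrative, not from any particular library):

```python
import random
import time

def retry_with_jitter(operation, max_attempts=5, base_delay=1.0):
    """Retry `operation`, growing the delay by a random factor in
    [1.8, 2.2] each attempt instead of a straight doubling, so that
    clients that started in sync drift apart over successive retries."""
    delay = base_delay
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the last failure to the caller.
            if attempt == max_attempts - 1:
                raise
            time.sleep(delay)
            delay *= random.uniform(1.8, 2.2)
```

Because each client draws its own multipliers, two clients that fail at the same moment quickly end up retrying at different times.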
There's a nasty form of this where the site is offline for a while and then all the clients rush their requests in as soon as it comes back online. Because every client's retry is synchronized on the site's recovery time, the coordinated wave of requests overloads the site and can knock it right back over.