If you’ve never had to handle authorization in a particular area, it might have seemed safe, when the code was originally written, to assume that any 4xx error should be retried rather than enumerating every status code individually.
And even if you do retry, exponential backoff has been the standard for a long time (and is mentioned in the Twitter API documentation as a good way to handle 429 responses).
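For what it's worth, here's a minimal sketch of that approach, assuming a fetch-based client; the function name, retry cap, and delays are made up for illustration, and only 429s and 5xx are treated as retryable:

```typescript
// Minimal sketch: retry 429s (and 5xx) with exponential backoff plus jitter,
// and treat every other 4xx as a non-retryable client error.
async function fetchWithBackoff(
  url: string,
  init?: RequestInit,
  maxRetries = 5,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);

    const retryable = res.status === 429 || res.status >= 500;
    if (!retryable || attempt >= maxRetries) return res;

    // Backoff doubles each attempt (1s, 2s, 4s, ...), capped at 30s,
    // with jitter so a fleet of clients doesn't retry in lockstep.
    const delayMs =
      Math.min(30_000, 2 ** attempt * 1_000) * (0.5 + Math.random() / 2);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```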
Though it’s hard to know for sure what really went down; it could be a number of things, including a lack of subject matter experts (Elon recently admitted to laying off some people they shouldn’t have).
Unless the home feed being down is simply a side effect: the service that fetches tweets being DDoS'd by other views in the app making numerous unauthenticated calls.
But I was also thinking about this earlier today. These days everybody is so quick to say "the software is easy, it's the community that's hard" - I've even said it myself a few times in the past few weeks - but I think that might be overstated.
Building good software is hard. Keeping it good is even harder. What does the codebase look like for Twitter's front-end at this point?
How many frameworks has the base functionality been ported through? How many quick pivots from product - adding features, adjusting things - have squashed the ability to address technical debt, or even to keep unit and regression testing functioning?
The fact that this (1) made it to production and (2) was not noticed and rolled back immediately (like, in under 30 minutes) is extremely concerning (and obviously very embarrassing). If I had private data of ANY kind stored on Twitter (like DMs I don't want getting out - a messaging system that rich, famous, and powerful people have been using like email for over a decade), at this point I would be trying to get that data removed however I could, or accept that there's a strong possibility of a huge data breach in which all of it gets leaked.
On that note, the 10 requests/second figure in the post is also negligible for the same reason: only requests that hit backend servers matter.
That would give the server side more control over the retry logic (when the client properly interprets the header). I'm surprised Elon hasn't implemented this himself.
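Assuming the header being referred to is Retry-After (Twitter's API has historically also exposed its own x-rate-limit-reset header), a client-side sketch of deferring to the server's hint could look like this; the fallback delay is arbitrary:

```typescript
// Prefer the server's Retry-After hint when present; otherwise fall back
// to a fixed delay. Retry-After can be a number of seconds or an HTTP date.
function retryDelayMs(res: Response, fallbackMs = 60_000): number {
  const retryAfter = res.headers.get("Retry-After");
  if (retryAfter !== null && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    if (!Number.isNaN(seconds)) return seconds * 1_000;
    const date = Date.parse(retryAfter);
    if (!Number.isNaN(date)) return Math.max(0, date - Date.now());
  }
  return fallbackMs;
}
```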
If you use a listener (useEffect in React) to load data, it will start the request, track that it's loading with a boolean, and then store the payload. That passes unit tests and QA.
If the listener doesn't check for an error before starting the API request again, you get an infinite loop: the loading flag goes off while the payload is still null, so it just starts the request again.
It's sloppy code, but it's an unintentional side effect.
It’s actually exactly the type of problem declarative UI libraries like React were supposed to prevent, yet here we are 8 years later.
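As a contrived reconstruction of that failure mode (not Twitter's actual code), a data-loading hook along these lines would loop forever on a 429 if the marked guard were missing:

```tsx
import { useEffect, useState } from "react";

// Hypothetical hook illustrating the bug described above. Without the
// error guard, a failed request leaves data === null and loading === false,
// so the effect immediately fires again - an infinite retry loop against
// an already rate-limited endpoint.
function useTimeline(url: string) {
  const [data, setData] = useState<unknown | null>(null);
  const [error, setError] = useState<Error | null>(null);
  const [loading, setLoading] = useState(false);

  useEffect(() => {
    if (data !== null || loading) return;
    if (error !== null) return; // <- the guard the sloppy version forgets

    setLoading(true);
    fetch(url)
      .then((res) =>
        res.ok ? res.json() : Promise.reject(new Error(`HTTP ${res.status}`)),
      )
      .then(setData)
      .catch(setError)
      .finally(() => setLoading(false));
  }, [url, data, loading, error]);

  return { data, error, loading };
}
```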
Personally I'd just cache HTTP 429 responses for 1 minute, but you could also implement rate-limiting inside the load balancer with an in-memory KV store or bloom filter if you wanted to.
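A toy version of the in-memory approach, with the window size and limit picked arbitrarily for illustration (a fixed one-minute window per client IP, not anything Twitter actually runs):

```typescript
// Fixed-window rate limiter keyed by client IP, held entirely in memory.
const WINDOW_MS = 60_000;
const LIMIT = 600; // roughly 10 requests/second averaged over the window

type Window = { start: number; count: number };
const windows = new Map<string, Window>();

function allowRequest(clientIp: string, now = Date.now()): boolean {
  const w = windows.get(clientIp);
  if (!w || now - w.start >= WINDOW_MS) {
    windows.set(clientIp, { start: now, count: 1 });
    return true;
  }
  w.count += 1;
  return w.count <= LIMIT; // over the limit: answer 429 before touching the backend
}
```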
Perhaps the context you're missing is that all large sites use ECMP routing and consistent hashing to ensure that requests from the same IP hit the same load balancer. Twitter only has ~238 million daily active users. 10 requests/second on keepalive TCP+TLS connections can be handled by a couple of nginx servers. The linked "Full-stack Drupal developer" has no idea how any of this works, and it's kinda sad how most people in this thread took his post at face value.
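To make the routing point concrete, here's a toy sketch (plain hash-mod rather than true consistent hashing, and obviously not Twitter's setup): hashing the client IP onto a fixed set of balancers means each IP's in-memory counters only need to live in one place. Real deployments use consistent hashing (Maglev-style) so that adding or removing a balancer doesn't reshuffle every IP.

```typescript
import { createHash } from "node:crypto";

// Hypothetical pool of load balancers; every request from a given IP
// deterministically lands on the same one, so its rate-limit state is local.
const loadBalancers = ["lb-1", "lb-2", "lb-3", "lb-4"];

function pickLoadBalancer(clientIp: string): string {
  const digest = createHash("md5").update(clientIp).digest();
  return loadBalancers[digest.readUInt32BE(0) % loadBalancers.length];
}

console.log(pickLoadBalancer("203.0.113.7")); // always the same balancer for this IP
```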