zlacker

[return to "Twitter Is DDOSing Itself"]
1. aeyes+KF 2023-07-01 22:09:35
>>ZacnyL+(OP)
This bug is very unlikely to be the cause. The server-side rate limiter is cheap, and the frontend bug only gets triggered once the rate limit is already active.

I have seen similar bugs in the systems I oversee because network libraries love to retry requests without sane limits by default, but I never saw them make our rate limiters sweat. It's slightly more annoying when they hit an API which actually does some expensive work before returning an error, but that's why we have rate limits on all public endpoints.
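The client-side fix for that failure mode is the bounded retry those library defaults lack: back off exponentially and eventually give up. A minimal sketch (the function name, the `base` knob, and the bare-status-code convention are mine for illustration, not any particular library's API):

```python
import random
import time

def fetch_with_backoff(do_request, max_retries=5, base=1.0):
    """Retry on HTTP 429 with capped exponential backoff and jitter,
    instead of hammering the server in a tight loop."""
    for attempt in range(max_retries):
        status = do_request()
        if status != 429:               # not rate-limited: done
            return status
        # back off: base, 2*base, 4*base, ... capped at 30s, plus jitter
        time.sleep(min(base * 2 ** attempt, 30) + random.random() * base)
    return 429                          # give up instead of retrying forever
```

The jitter matters: without it, a fleet of clients that got rate-limited at the same moment retries in lockstep and recreates the spike.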

I'd also guess that the web app accounts for the smallest share of Twitter's traffic, and the native apps probably don't have this bug.

2. evan_+jK 2023-07-01 22:43:31
>>aeyes+KF
I don’t think it’s necessarily saying the self-inflicted DDoS has caused a technical issue that’s forced them to shut down access. I think it’s possible that shutting down anonymous access caused the DDoS, which led to giant spikes in some metric, which led them (Elon) to conclude that there was an uptick in scraping, so they imposed the 600 tweets/day limit to punish scrapers.

Seems like either my quota reset or they changed the policy because I’m able to access the site again.

3. pschue+q31 2023-07-02 01:36:15
>>evan_+jK
This. I'd bet substantial amounts of money that the evil scraper idea is the result of a) another issue + b) paranoia + c) Musk thinking he understands better than anybody else.
4. berkle+v91 2023-07-02 02:42:35
>>pschue+q31
Dismissing scrapers out of hand is a really ignorant take. LLMs are built on petabytes of conversational training data. Scraping is how OpenAI trained GPT, and it’s how all their copycats are trying to do the same.

Elon can be a monumental asshat, and he can be self-DDOS’ing, and can be accurate about scraping at the same time. It’s why every single social media platform is heading toward becoming a walled garden.

5. pschue+oa1 2023-07-02 02:49:35
>>berkle+v91
I'm not denying that scrapers exist, I'm just highly suspicious of this explanation given that: a) he's proven time and time again how willing he is to say shit just to get attention b) he doesn't seem to understand software very well c) if shit was imploding for reasons related to decisions he made, this is precisely the kind of blame externalization I would expect.
6. evan_+7b1 2023-07-02 02:57:50
>>pschue+oa1
Yeah, scrapers have always existed, and while their traffic is undoubtedly higher than it has been in the past, it can't possibly amount to a significant share of the total traffic hitting the site.

A real scraper would be stopped by a rate limit set to, like, 100 tweets/minute. 600 tweets/day is a completely pointless, punitive limit.
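The arithmetic behind that comparison makes the point starkly: even a limit meant to stop scrapers would allow hundreds of times more reading than the limit Twitter actually imposed.

```python
# What a 100 tweets/minute cap allows over a day, vs. the 600/day cap:
per_minute_cap = 100
daily_yield_under_minute_cap = per_minute_cap * 60 * 24   # tweets/day
per_day_cap = 600
ratio = daily_yield_under_minute_cap // per_day_cap
print(daily_yield_under_minute_cap, ratio)  # 144000 240
```

A per-minute cap throttles machines while staying invisible to humans; a 600/day cap is 240x tighter and mostly bites human readers.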

7. berkle+Cp2 2023-07-02 15:41:45
>>evan_+7b1
> A real scraper would be stopped by a rate limit

I'm guessing you've never played an offensive or defensive role in scraping because what you've described is in no way a problem for a serious scraping effort. I agree the rate limits are stupid. They fuck over users, they stop amateur scrapers, and do nothing whatsoever to impede professional scraping.

If you want to stop most scraping, employ device attestation techniques and TLS fingerprinting.
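For the TLS side, the best-known technique is a JA3-style hash over the ClientHello. A minimal sketch of the idea (the field values in the test below are illustrative decimal code points, not a real capture):

```python
import hashlib

def ja3_fingerprint(tls_version, ciphers, extensions, curves, point_formats):
    """JA3-style TLS fingerprint: five ClientHello fields, each
    dash-joined, are comma-separated and MD5-hashed. A server compares
    the hash against known values for Chrome, Safari, okhttp,
    python-requests, etc."""
    ja3_str = ",".join([
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ])
    return hashlib.md5(ja3_str.encode()).hexdigest()
```

Any client that doesn't carefully mimic a browser's cipher and extension ordering produces a hash no real browser ever sends, which is exactly what the impersonation libraries discussed below exist to fake.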

8. costco+yF2 2023-07-02 17:23:10
>>berkle+Cp2
But then you have to contend with this: https://github.com/bogdanfinn/tls-client... Just used this to bypass a Cloudflare check!

I've never scraped Twitter, but Elon said there was a large scraping operation from Oracle IPs. He could substantially raise the cost of scraping by just banning datacenter IPs, and something like p0f would probably help too.

I pay for static residential proxies (basically servers running squid that somehow have IPs belonging to consumer ISPs). With TCP fingerprinting these would be detected as Linux, exposing my Windows or iPhone user-agents as inconsistent, but I've never encountered a site that checks this. Then again, maybe sites do check silently and I just don't notice because I don't otherwise meet the bot threshold.
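That squid-on-Linux vs. Windows-UA mismatch is cheap to check server-side once a passive fingerprinter has produced an OS guess. A hypothetical sketch of the consistency check (heuristic and illustrative, not p0f's actual logic):

```python
def os_consistent(fingerprint_os, user_agent):
    """Cross-check the OS inferred from passive TCP/IP fingerprinting
    (p0f-style) against the OS the HTTP User-Agent claims. A Linux TCP
    stack presenting a Windows or iPhone UA is the giveaway."""
    ua = user_agent.lower()
    if "windows" in ua:
        claimed = "windows"
    elif "iphone" in ua or "ipad" in ua:
        claimed = "ios"
    elif "android" in ua or "linux" in ua:
        claimed = "linux"    # Android's stack fingerprints as Linux
    elif "mac os" in ua:
        claimed = "mac"
    else:
        return True          # nothing to contradict
    return claimed == fingerprint_os.lower()
```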
9. berkle+1T2 2023-07-02 18:39:36
>>costco+yF2
for sure, using a custom TLS library like uTLS helps -- you need to inject that GREASE cipher selection. I suspect private residential proxies are out of budget for many outfits, or the IP pool is too small and simple per-IP rate limiting kicks in. Who do you use, if you're willing to share? I've not been happy with the, uhh, questionable ethics of Luminati/BrightData in the past.
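The GREASE values in question are the reserved code points from RFC 8701: sixteen values of the form 0x0a0a, 0x1a1a, ..., 0xfafa, where both bytes are `(n << 4) | 0x0a`. Chrome sprinkles one into its cipher and extension lists, so a faithful impersonation has to inject one too (uTLS's Chrome presets handle this). A quick sketch of generating them:

```python
import random

# RFC 8701 GREASE values: both bytes equal (n << 4) | 0x0a,
# so multiplying the byte by 0x0101 repeats it -> 0x0a0a ... 0xfafa.
GREASE_VALUES = [((n << 4) | 0x0A) * 0x0101 for n in range(16)]

def grease_cipher():
    """Pick a random GREASE value to prepend to a cipher suite list."""
    return random.choice(GREASE_VALUES)
```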

There are definitely more and more sites doing TLS/TCP/etc fingerprinting or device attestation for mobile APIs, but it's still pretty rare. I mean Twitter is trying to limit requests by IP, so definitely amateur hour over there.

10. costco+T23 2023-07-02 19:44:38
>>berkle+1T2
I use https://www.pingproxies.com/isp which is like $3/IP/month with unlimited bandwidth (I assume if you used a ridiculous amount they might charge you). Luminati pricing is extortionate; I have no idea how anyone doing anything at scale can afford $10/GB. I haven't investigated whether Twitter's limits are per account or per IP.
11. berkle+X33 2023-07-02 19:52:36
>>costco+T23
Seriously. I don't even consider a provider if they want to charge for bandwidth. I'm doing about 50 TB/mo atm.
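Putting the thread's numbers together shows why per-GB pricing is a non-starter at this volume (the 100-IP pool size below is an assumed figure for illustration; the other numbers are from the exchange above):

```python
# 50 TB/month at $10/GB vs. a flat per-IP plan at $3/IP/month.
gb_per_month = 50 * 1000            # 50 TB, decimal units
per_gb_bill = gb_per_month * 10     # $10/GB -> dollars per month
per_ip_bill = 100 * 3               # assumed pool of 100 static IPs
print(per_gb_bill, per_ip_bill)     # 500000 300
```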