zlacker

[return to "Cloudflare outage on December 5, 2025"]
1. mixedb+Xv1 2025-12-05 22:54:13
>>meetpa+(OP)
This is an architectural problem. The Lua bug, the longer global outage last week, and a long list of earlier such outages only uncover the problem with the architecture underneath. The original, distributed, decentralized web architecture, with heterogeneous endpoints managed by a myriad of organisations, is much more resistant to this kind of global outage. Homogeneous systems like Cloudflare will continue to cause global outages. Rust won't help; people will always make mistakes, in Rust too. A robust architecture addresses this by not allowing a single mistake to bring down a myriad of unrelated services at once.
◧◩
2. tobyjs+KD1 2025-12-05 23:51:16
>>mixedb+Xv1
I’m not sure I share this sentiment.

First, let’s set aside the separate question of whether monopolies are bad. They are not good but that’s not the issue here.

As to architecture:

Cloudflare has had some outages recently. However, what’s their uptime over the longer term? If an individual site took on the infra challenges themselves, would they achieve better? I don’t think so.

But there’s a more interesting argument in favour of the status quo.

Assuming Cloudflare’s uptime is above average, outages that affect everything at once are actually better for the average internet user.

It might not be intuitive but think about it.

How many Internet services does someone depend on to accomplish something such as their work over a given hour? Maybe 10 directly, and another 100 indirectly? (Make up your own answer, but it’s probably quite a few).

If everything goes offline for one hour per year at the same time, then a person is blocked and unproductive for an hour per year.

On the other hand, if each service experiences the same hour per year of downtime but at different times, then the person is likely to be blocked for closer to 100 hours per year.
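A rough sketch of the arithmetic behind these two scenarios, under assumptions that are not in the comment itself: a year is 8,760 hours, the person is blocked whenever any one of their services is down, and in the staggered case each service fails independently of the others. The function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; leap years ignored (assumption)


def blocked_hours_correlated(downtime_hours: float) -> float:
    """All services go down together, so the person loses exactly that window."""
    return downtime_hours


def blocked_hours_independent(n_services: int, downtime_hours: float) -> float:
    """Expected blocked hours when each service fails independently.

    Assumes the person is blocked if at least one service is down.
    """
    p_down = downtime_hours / HOURS_PER_YEAR      # chance one service is down in a given hour
    p_any_down = 1 - (1 - p_down) ** n_services   # chance at least one of them is down
    return HOURS_PER_YEAR * p_any_down


if __name__ == "__main__":
    print(blocked_hours_correlated(1))
    print(round(blocked_hours_independent(100, 1), 2))
```

With 100 services at one hour each, the independent case comes out just under 100 blocked hours, because a few of the outages overlap by chance.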

It’s not really a bad end-user experience that every service uses Cloudflare. It’s more a question of why Cloudflare’s stability seems to be going downhill.

And that’s a fair question. Because if their reliability is below average, then the value prop evaporates.

◧◩◪
3. ccakes+9Q1 2025-12-06 01:38:59
>>tobyjs+KD1
> If an individual site took on the infra challenges themselves, would they achieve better? I don’t think so.

The point is that it doesn’t matter. A single site going down has a very small chance of impacting a large number of users. Cloudflare going down breaks an appreciable portion of the internet.

If Jim’s Big Blog only maintains 95% uptime, most people won’t care. If BofA were at 95%... actually, same. Most of the world aren’t BofA customers.

If Cloudflare is at 99.95%, then the world suffers.

◧◩◪◨
4. esrauc+qM2 2025-12-06 13:49:16
>>ccakes+9Q1
I'm not sure I follow the argument. If literally every individual site had an uncorrelated 99% uptime, that's still less available than a centralized 99.9% uptime. The "entire Internet" is much less available in the former setup.

It's like saying that Chipotle having an X% chance of tainted food is worse than local burrito places having a 2*X% chance of tainted food. That's true through the lens that each individual event affects more people, but if you removed that Chipotle and replaced it with all local places, the total amount of illness would still be strictly higher; it would just be tons of small events that are harder to write news articles about.
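One way to put numbers on this comparison, as a sketch only: the 99% and 99.9% figures come from the comment, while the site count of 100, the 8,760-hour year, and the function names are assumptions made for illustration. Expected total downtime depends only on availability, not on whether outages are correlated; correlation changes how often everything is up at the same moment.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; leap years ignored (assumption)


def expected_site_hours_down(n_sites: int, availability: float) -> float:
    """Expected total site-hours of downtime per year.

    By linearity of expectation this is the same whether or not
    the outages are correlated.
    """
    return n_sites * (1 - availability) * HOURS_PER_YEAR


def prob_all_up_independent(n_sites: int, availability: float) -> float:
    """Chance that every site is up at a given moment, failures independent."""
    return availability ** n_sites


def prob_all_up_centralized(availability: float) -> float:
    """Chance that every site is up when they all share one provider."""
    return availability


if __name__ == "__main__":
    n = 100  # hypothetical number of sites
    print(expected_site_hours_down(n, 0.99))    # decentralized, 99% each
    print(expected_site_hours_down(n, 0.999))   # centralized, 99.9%
    print(prob_all_up_independent(n, 0.99))
    print(prob_all_up_centralized(0.999))
```

Under these assumptions the decentralized setup accumulates ten times the total downtime, and all 100 sites are up simultaneously only about 37% of the time, versus 99.9% for the centralized one.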

◧◩◪◨⬒
5. Akrony+Wa3 2025-12-06 17:11:11
>>esrauc+qM2
Also what about individual sites having 99% uptime while behind CF with an uncorrelated uptime of 99.9%?

Just because CF is up doesn't mean the site is.
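A minimal sketch of that combined case, assuming a request succeeds only when both CF and the origin site are up, that their failures are independent, and that CF never serves cached content while the origin is down. The function names and the 8,760-hour year are assumptions for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; leap years ignored (assumption)


def serial_availability(*availabilities: float) -> float:
    """Availability of a chain in which every independent component must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    site_alone = 0.99
    behind_cf = serial_availability(0.99, 0.999)
    print(behind_cf)
    print(downtime_hours_per_year(site_alone))
    print(downtime_hours_per_year(behind_cf))
```

Under these assumptions the site behind CF is available 0.99 × 0.999 = 0.98901 of the time, roughly 96 hours of downtime per year against 87.6 for the origin alone, so the proxy can only subtract from the origin's availability unless caching or failover covers for it.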
