Cloudflare outage on December 5, 2025

>>meetpa+(OP)
Ironically, this time around the issue was in the proxy they're going to phase out (and replace with the Rust one).

I truly believe they're really going to make resilience their #1 priority now, and acknowledging the release process errors that they didn't acknowledge for a while (according to other HN comments) is the first step towards this.

HugOps. Although bad for reputation, I think these incidents will help them shape (and prioritize!) resilience efforts more than ever.

At the same time, I can't think of a company more transparent than Cloudflare when it comes to these kinds of things. I also understand the urgency behind this change: Cloudflare acted (too) fast to mitigate the React vulnerability and this is the result.

Say what you want, but I'd prefer to trust Cloudflare, which admits and acts upon its fuckups, rather than covering them up or downplaying them as some other major cloud providers do.

@eastdakota: ignore the negative comments here; transparency is a very good strategy, and this article shows a good plan to avoid further problems.

>>denysv+E4
> HugOps

This childish nonsense needs to end.

Ops are heavily rewarded because they're supposed to be responsible. If they're not then the associated rewards for it need to stop as well.

>>fidotr+V7
I have never seen an Ops team being rewarded for avoiding incidents (focusing on tech debt reduction); instead they get the opposite - blamed when things go wrong.

I think it's human nature (it's hard to realize something is going well until it breaks), but it still has a very negative psychological effect. I can barely imagine the stress the team is going through right now.

>>denysv+M9
> I have never seen an Ops team being rewarded for avoiding incidents

That's why their salaries are so high.

>>fidotr+Va
Depending on the tech debt, the ops team might just be in "survival mode" and not have the time to fix every single issue.

In this particular case, they seem to be doing two things:

- Phasing out the old proxy (Lua based) which is replaced by FL2 (Rust based, the one that caused the previous incident)
- Reacting to an actively exploited vulnerability in React by deploying WAF rules

and they're doing them in a relatively careful way (test rules) to avoid fuckups, which caused this unknown state, which triggered the issue

>>denysv+Fc
They deliberately ignored an internal tool that started erroring out during that deployment and rolled it out anyway without further investigation.

That's not deserving of sympathy.
