
Cloudflare outage on December 5, 2025
1. paradi+q5 2025-12-05 15:56:37
>>meetpa+(OP)
The deployment pattern from Cloudflare looks insane to me.

I've worked at one of the top fintech firms. Whenever we did a config change or deployment, we were supposed to have a rollback plan ready and to monitor key dashboards for 15-30 minutes.

The dashboards had to be prepared beforehand, covering the systems and key business metrics the deployment would affect, and they had to be reviewed by teammates.

I've never seen a downtime longer than 1 minute while I was there, because you get a spike on the dashboard immediately when something goes wrong.

For the entire system to be down for 10+ minutes due to a bad config change or deployment is just beyond me.
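The apply, watch, roll-back loop described above could be sketched roughly as follows. This is only an illustration, not any company's actual tooling: `deploy_with_watch`, the three callables, the 5% error-rate threshold and the 10-second polling interval are all hypothetical names and values; only the 15-minute watch window comes from the comment.

```python
import time
from typing import Callable


def deploy_with_watch(
    apply_change: Callable[[], None],
    rollback: Callable[[], None],
    read_error_rate: Callable[[], float],
    threshold: float = 0.05,          # assumed alert threshold (hypothetical)
    watch_seconds: float = 15 * 60,   # low end of the 15-30 minute window
    poll_seconds: float = 10.0,       # assumed polling interval (hypothetical)
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Apply a change, poll one metric, and roll back on the first bad reading.

    Returns True if the change survived the watch window, False if it was
    rolled back.
    """
    apply_change()
    elapsed = 0.0
    while elapsed < watch_seconds:
        # A spike on the "dashboard" triggers the prepared rollback at once.
        if read_error_rate() > threshold:
            rollback()
            return False
        sleep(poll_seconds)
        elapsed += poll_seconds
    return True
```

With a 10-second poll, a bad change is reverted within roughly one polling interval of the metric moving, which is in the same spirit as the "under 1 minute" downtime mentioned above. A real setup would watch several metrics and be wary of noisy single readings.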

◧◩
2. theide+Oc 2025-12-05 16:23:58
>>paradi+q5
Same, my time at an F100 ecommerce retailer showed me the same thing. Every change control board justification needed an explicit back-out/restoration plan with the exact steps to be taken, what was being monitored to confirm the change was behaving as planned, contacts for the prominent groups expected to be affected, and emergency numbers/rooms for quick conferences if something did in fact happen.

The process was pretty tight, and from what I can remember there were almost no revenue-affecting outages, because it was such a collaborative effort (even though the board presentation seemed a bit spiky and confrontational at the time, everyone was working together).
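As a rough illustration of the checklist described above, a change request could be modeled as a record that reports which required items are still missing. The class and field names here are hypothetical, chosen to mirror the items listed in the comment; they are not taken from any real change-management system.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    """Hypothetical change-control record; field names are illustrative."""
    summary: str
    backout_steps: list[str] = field(default_factory=list)
    monitored_metrics: list[str] = field(default_factory=list)
    affected_team_contacts: list[str] = field(default_factory=list)
    emergency_bridge: str = ""  # phone number or room for a quick conference

    def missing_items(self) -> list[str]:
        """Return the names of required items that are still empty."""
        checks = {
            "backout_steps": self.backout_steps,
            "monitored_metrics": self.monitored_metrics,
            "affected_team_contacts": self.affected_team_contacts,
            "emergency_bridge": self.emergency_bridge,
        }
        return [name for name, value in checks.items() if not value]
```

A board (or a CI check) could refuse to schedule any request for which `missing_items()` is non-empty, so that the back-out plan exists before the change does.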

◧◩◪
3. prdona+Af 2025-12-05 16:35:30
>>theide+Oc
And you moved at a glacial pace compared to Cloudflare. There are tradeoffs.