zlacker

1. jakub_+(OP) 2025-12-05 17:00:33
The interesting part:

After rolling out a bad ruleset update, they tried a killswitch (rolled out immediately to 100%), which exercised a code path that had never been executed before:

> However, we have never before applied a killswitch to a rule with an action of “execute”. When the killswitch was applied, the code correctly skipped the evaluation of the execute action, and didn’t evaluate the sub-ruleset pointed to by it. However, an error was then encountered while processing the overall results of evaluating the ruleset

> a straightforward error in the code, which had existed undetected for many years

replies(1): >>8cvor6+y
2. 8cvor6+y 2025-12-05 17:03:19
>>jakub_+(OP)
> have never before applied a killswitch to a rule with an action of “execute”

One might think a company on the scale of Cloudflare would have a suite of comprehensive tests to cover various scenarios.

replies(2): >>hnthro+M2 >>robrya+UX1
3. hnthro+M2 2025-12-05 17:12:06
>>8cvor6+y
I kinda think most companies out there are like that. Moving fast is the motto I've heard the most.

They are probably OK with occasional breaks as long as customers don't mind.

4. robrya+UX1 2025-12-06 08:08:31
>>8cvor6+y
Yeah, the example they gave does feel like pretty isolated unit test territory, or at least an integration test on a subset of the system that could be run in isolation.
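
To make that concrete, here is a rough sketch of what such an isolated test might look like. It is not Cloudflare's code: `Rule`, `RuleResult`, `evaluate`, and `count_evaluated` are hypothetical names invented for illustration, and the post-processing step is just a stand-in for "processing the overall results of evaluating the ruleset". The only thing taken from the write-up is the scenario: a killswitch applied to a rule whose action is "execute", so the sub-ruleset is never evaluated and the result has no sub-results attached.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    # Hypothetical rule model; "execute" means "evaluate a sub-ruleset".
    id: str
    action: str
    sub_ruleset: list["Rule"] = field(default_factory=list)
    killswitched: bool = False


@dataclass
class RuleResult:
    rule_id: str
    action: str
    skipped: bool = False
    # Only populated when an "execute" action actually ran.
    sub_results: Optional[list["RuleResult"]] = None


def evaluate(ruleset: list[Rule]) -> list[RuleResult]:
    results = []
    for rule in ruleset:
        if rule.killswitched:
            # Killswitch: skip the action entirely, including any sub-ruleset.
            results.append(RuleResult(rule.id, rule.action, skipped=True))
            continue
        sub = evaluate(rule.sub_ruleset) if rule.action == "execute" else None
        results.append(RuleResult(rule.id, rule.action, sub_results=sub))
    return results


def count_evaluated(results: list[RuleResult]) -> int:
    # Stand-in for the post-processing step. The risky assumption would be
    # "action == 'execute' implies sub_results is present"; a killswitched
    # execute rule breaks it, so skipped results are handled first.
    total = 0
    for r in results:
        if r.skipped:
            continue
        total += 1
        if r.action == "execute":
            total += count_evaluated(r.sub_results or [])
    return total


def test_killswitch_on_execute_rule():
    sub = [Rule("b1", "log"), Rule("b2", "block")]
    ruleset = [Rule("a", "block"),
               Rule("b", "execute", sub_ruleset=sub, killswitched=True)]
    results = evaluate(ruleset)
    # The sub-ruleset must not have been evaluated...
    assert results[1].skipped and results[1].sub_results is None
    # ...and post-processing must still succeed on the partial result.
    assert count_evaluated(results) == 1


test_killswitch_on_execute_rule()
```

The point is less the specific assertion than the shape of the test: every combination of "killswitch on/off" and rule action gets run through both evaluation and result processing, since the failure described was in the second step rather than the first.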