zlacker

[parent] [thread] 7 comments
1. usrnm+(OP)[view] [source] 2025-12-05 16:05:54
Like who? Which large tech company doesn't have outages?
replies(3): >>k8sToG+p1 >>k__+C1 >>nish__+Tc
2. k8sToG+p1[view] [source] 2025-12-05 16:10:40
>>usrnm+(OP)
It's not about outages. It's about the why. Hardware can fail. Bugs can happen. But to continue a rollout despite warning signs and without understanding the cause and impact is on another level. Especially if it is related to the same problem as last time.
replies(1): >>udev40+sc
3. k__+C1[view] [source] 2025-12-05 16:11:28
>>usrnm+(OP)
"tripping on their own feet" == "not rolling back"
◧◩
4. udev40+sc[view] [source] [discussion] 2025-12-05 16:56:14
>>k8sToG+p1
And yet, it's always clownflare breaking everything. Failures are inevitable, which is widely known; therefore we build resilient systems to overcome the inevitable.
replies(1): >>deadba+4j
5. nish__+Tc[view] [source] 2025-12-05 16:58:08
>>usrnm+(OP)
Google does pretty well.
replies(1): >>hanson+3F
◧◩◪
6. deadba+4j[view] [source] [discussion] 2025-12-05 17:23:12
>>udev40+sc
It is healthy for tech companies to have outages, as they will build experience in resolving them. Success breeds complacency.
replies(1): >>wizzwi+H11
◧◩
7. hanson+3F[view] [source] [discussion] 2025-12-05 18:59:50
>>nish__+Tc
Google Docs was down for almost a whole day just a couple of weeks ago.
◧◩◪◨
8. wizzwi+H11[view] [source] [discussion] 2025-12-05 20:48:41
>>deadba+4j
You don't need outages to build experience in resolving them if you identify the conditions that increase the risk of outages. Airlines can develop a lot of experience resolving issues that would lead to plane crashes, without actually crashing any planes.