zlacker

[return to "War story: the hardest bug I ever debugged"]
1. BobbyT+Hp7 2025-03-27 03:27:20
>>jakevo+(OP)
Interesting writeup, but 2 days to debug “the hardest bug ever”, while accurate, seems a bit overdone.

Though abs() returning negative numbers is hilarious... “You had one job…”

To me, the hardest bugs are nearly irreproducible “Heisenbugs” that vanish when instrumentation is added.

I’m not just talking about concurrency issues either…

The kind of bug where a reproduction attempt takes a week, can't be parallelized due to HW constraints, and logging instrumentation makes it go away or fail differently.

2 days is cute though.

2. steveB+t48 2025-03-27 11:54:40
>>BobbyT+Hp7
Yes! I've dealt with complex issues that turned out to be a vendor-swapped-hardware whoopsie, which we spent over a month trying to solve in software before finally figuring it out.

Part of it was the difficulty of pinpointing the actual issue: fullness of drive vs. throughput of writes.

A lot of it was unfortunately organizational politics: the system spanned two teams with different reporting lines that didn't cooperate well and had poor testing practices.

3. voidif+958 2025-03-27 12:01:16
>>steveB+t48
> A lot of it was unfortunately organizational politics

The hardest bugs in my experience are those where your only source of vital information is a third party who is straight-up lying to you.
