“Rust is safe” is not some kind of absolute guarantee of code safety

>>rvz+(OP)
I don't think I buy Linus' high level claim. It is not necessarily better to press on with the wrong answer, in some cases failure actually is an option and might be much better than oops we did it wrong.

This morning I was reading about the analysis of an incident in which a London tube train drove away with open doors. Nobody was harmed, or even in immediate danger, the train had relatively few passengers and in fact they only finally alerted the driver at the next station, classic British politeness (they made videos, took photographs, but they didn't use the emergency call button until the train got to a station)

Anyway, the underlying cause involves systems which were flooded with critical "I'm failing" messages and would just periodically reboot and then press on. The train had been critically faulty for minutes, maybe even days before the incident, but rather than fail, and go out of service, systems kept trying to press on. The safety systems wouldn't have allowed this failed train to drive with its doors open - but the safety critical mistake to disable safety systems and drive the train anyway wouldn't have happened if the initial failure had caused the train to immediately go out of passenger service instead of limping on for who knows how long.

>>tialar+yf
I think it's obvious that Linus is correct here.

For example, say there's a bug in the Linux kernel that would produce a "panic" at midnight Dec 31st 2022... do we accept a billion devices shutting down? In the best case rebooting and resuming a whatever user space program was running?

Despite the bad taste, I think the obvious answer is as Linus says: the Kernel should keep going despite errors.

>>flumpc+Wq
A better analogy would be: Let's say if we have kernel A that contains a bug; we don't know when it will trigger or what it will do. We have another kernel, B, which has the same bug, but while we don't know when it will trigger, we know it will cause the device to halt. Which is the better kernel?

I'd say B is nearly always the better choice, because halting is a known state it's almost always possible to recover from, and going into unknown state may cause you to get hacked or to damage your peripherals. But if we were operating, say, a Mars rover, and shutting down meant we would never be able to boot again, then it'd be better take kernel A and attempt to recover from whatever state we find ourselves in. That's pretty exotic, however.

In the case of an unanticipated error in a software component, we always need input from an external source to correct ourselves. When you're the kernel, that generally means either a human being or a hypervisor has to correct you; better to do so from a halted state than an entirely unknown one. Trying to muddle through despite is super dangerous, and makes your software component into lava in the case of a fault.

>>maxbon+WM
> But if we were operating, say, a Mars rover, and shutting down meant we would never be able to boot again, then it'd be better take kernel A and attempt to recover from whatever state we find ourselves in. That's pretty exotic, however.

That you view it as exotic is partly a lack of imagination on your part; with a little more effort it's possible to identify similar use cases that are much closer to home than Mars.

But that doesn't really matter. What matters is that the Linux kernel needs to support both options, because it's just one component in a larger system and that context outside the kernel is what determines which option is correct for that system.

>>wtalli+AO
> [W]ith a little more effort it's possible to identify similar use cases that are much closer to home than Mars.

If you feel there are some that would add to this conversation, feel free to share them.

zlacker