> Not completing the operation at all, is not really any better than getting the wrong answer, it's only more debuggable.
What Linus is saying is 100% right of course - he is trying to set expectations straight: just because you replace C code, refined by thousands (or whatever huge number) of man-months of effort and corrections, with Rust code, it doesn't mean absolute safety is guaranteed. From his kernel perspective: just as the kernel's C code detects a double free and warns about it, Rust will panic/abort on overflows, allocation failures, and so on. To the kernel that is not safety at all - as he points out, it is only more debuggable.
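To make the overflow point concrete, here's a minimal userspace Rust sketch (plain std, not kernel code) of the difference between the panicking default and explicit checked arithmetic:

```rust
fn main() {
    let a: u32 = u32::MAX;

    // With the default `+`, this overflows: it panics in debug builds
    // and silently wraps in release builds. Neither behavior is "safety"
    // in the sense a kernel needs.
    // let b = a + 1;

    // Checked arithmetic turns the overflow into a value you must handle,
    // which is closer to the style kernel code wants:
    match a.checked_add(1) {
        Some(b) => println!("sum: {b}"),
        None => eprintln!("overflow: refusing to continue with a wrong answer"),
    }
}
```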
He is allowing Rust in the kernel, so he clearly understands that Rust lets you shoot yourself in the foot far less than standard C - he is merely pointing out the reality that, in kernel space or even user space, that does not equate to absolute, total safety. And as the chief kernel maintainer he is well within his rights to set expectations straight, so that tomorrow's kernel Rust programmers write code with this point in mind.
(IOW, as an example, he doesn't want to see Rust patches that ignore kernel realities in favor of Rust's supposedly magical safety guarantees - allocating large chunks of memory, directly or indirectly, can always fail in the kernel and needs to be accounted for even in Rust code.)
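A rough illustration of what "allocation may fail" looks like in Rust terms, using std's `Vec::try_reserve` (the in-kernel allocation APIs differ, but the shape is the same: failure is a `Result`, not a panic; `make_buffer` is an invented helper name):

```rust
use std::collections::TryReserveError;

// Hypothetical helper: build a large buffer without assuming the
// allocation will succeed. `try_reserve` reports failure as a value
// instead of aborting, which is the discipline kernel Rust code needs.
fn make_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve(len)?; // may fail; the caller decides what to do
    buf.resize(len, 0);    // safe now: capacity was already reserved
    Ok(buf)
}

fn main() {
    match make_buffer(64 * 1024 * 1024) {
        Ok(buf) => println!("got {} bytes", buf.len()),
        Err(e) => eprintln!("allocation failed, backing off: {e}"),
    }
}
```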
As I posted upthread, the right way to handle this is to make dynamic errors either throw exceptions or kill the whole task, and to split the critical work into tasks that fail or complete as a whole, almost like transactions. The idea that the kernel should just go on limping along in a f'd up state is bonkers.
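A sketch of that "fail or complete as a whole" shape in plain Rust (the task steps and error variants here are made up for illustration):

```rust
#[derive(Debug)]
enum TaskError {
    BadInput(String),
}

// Each step reports failure as a value; nothing is committed until
// every step has succeeded, so the task fails or completes as a whole.
fn step_prepare() -> Result<Vec<u8>, TaskError> {
    Ok(vec![0u8; 16])
}

fn step_validate(data: &[u8]) -> Result<(), TaskError> {
    if data.is_empty() {
        return Err(TaskError::BadInput("empty input".into()));
    }
    Ok(())
}

fn run_task() -> Result<(), TaskError> {
    let data = step_prepare()?; // abort the whole task on failure
    step_validate(&data)?;      // no partial state has leaked out yet
    // ... commit only here, once every fallible step has passed ...
    Ok(())
}

fn main() {
    match run_task() {
        Ok(()) => println!("task committed"),
        Err(e) => eprintln!("task rolled back cleanly: {e:?}"),
    }
}
```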
I feel like we must have read two different articles. You sound crazy. Didn't read it your way at all.
> Think of that "debugging tools give a huge warning" as being the equivalent of std::panic in standard rust. Yes, the kernel will continue (unless you have panic-on-warn set), because the kernel MUST continue in order for that "report to upstream" to have a chance of happening.
"If the kernel shuts down the world, we don't get the bug report", seems like a pretty good argument. There are two options when you hit a panic in rust code:
* Panic and shut it all down. This prevents any reporting mechanism, like a core dump, from running; you cannot attach a normal debugger to the kernel.
* Catch the panic and proceed, recording that the operation failed and reporting the failure later (see the sketch below).
The kernel is a single program, so it's not as if you could fork it before every Rust call and just discard the child when the call fails.
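In ordinary userspace Rust, the second option is spelled `std::panic::catch_unwind`: the panic is caught at a boundary, turned into a value, and reported, while the program keeps running. (Kernel Rust can't unwind like this, which is exactly why the kernel uses warn-and-continue instead.)

```rust
use std::panic;

fn flaky_operation() -> u32 {
    // Stand-in for Rust code that may panic deep inside.
    panic!("invariant violated");
}

fn main() {
    // Catch the panic at a boundary and convert it into a Result,
    // so we can report the failure and keep the program alive.
    let result = panic::catch_unwind(|| flaky_operation());
    match result {
        Ok(v) => println!("got {v}"),
        Err(_) => eprintln!("operation panicked; logged and continuing"),
    }
    println!("still running after the failure");
}
```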
> In the kernel, "panic and stop" is not an option (it's actively worse than even the wrong answer, since it's really not debugable), so the kernel version of "panic" is "WARN_ON_ONCE()" and continue with the wrong answer.
(edit, and):
> Yes, the kernel will continue (unless you have panic-on-warn set), because the kernel MUST continue in order for that "report to upstream" to have a chance of happening.
Did I read that right? The kernel must continue? Yes, sure, absolutely... but maybe it doesn't need to continue with the next instruction; maybe it can continue in an error handler instead? Is his thinking really that narrow? I hope not.
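For what it's worth, "continue, but in an error handler" can be sketched in Rust along these lines, warning once in the spirit of the kernel's WARN_ON_ONCE and then taking a deliberate fallback path (the function name and clamping policy are invented for the example):

```rust
use std::sync::Once;

static WARN_ONCE: Once = Once::new();

// Invented example: clamp a bad length instead of using it as-is.
// In the spirit of WARN_ON_ONCE, we shout the first time and then
// continue on a well-defined degraded path, not with the wrong answer.
fn checked_len(requested: usize, max: usize) -> usize {
    if requested > max {
        WARN_ONCE.call_once(|| {
            eprintln!("WARN: requested len {requested} > max {max}, clamping");
        });
        return max; // degraded but well-defined behavior
    }
    requested
}

fn main() {
    println!("{}", checked_len(10, 100));   // normal path: 10
    println!("{}", checked_len(5000, 100)); // warns once, returns 100
    println!("{}", checked_len(9000, 100)); // silent this time, returns 100
}
```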
The issue being discussed here is that Rust comes from a perspective of being able to classify errors and automate error handling. In the kernel it doesn't work like that, because we're working under more constraints than in userland - including hardware that doesn't behave the way it was expected to.
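To spell out what "classify errors and automate error handling" means: in Rust the classification lives in the type, and `?` automates the propagation. A toy userspace sketch (the error variants and register behavior are invented; real hardware misbehaving mid-operation doesn't always fit a neat variant like this):

```rust
// Classification: every failure mode is a named variant that callers
// are forced to handle or propagate.
#[derive(Debug)]
#[allow(dead_code)] // Timeout is shown for illustration only
enum DriverError {
    Timeout,
    BadResponse(u8),
}

// Stand-in for real hardware access; pretend register 0x7f misbehaves.
fn read_register(reg: u8) -> Result<u8, DriverError> {
    if reg == 0x7f {
        return Err(DriverError::BadResponse(0xff));
    }
    Ok(reg.wrapping_add(1))
}

// Automation: `?` threads every failure out to the caller without a
// hand-written check at each call site.
fn read_pair() -> Result<(u8, u8), DriverError> {
    Ok((read_register(0x01)?, read_register(0x7f)?))
}

fn main() {
    if let Err(e) = read_pair() {
        eprintln!("driver error: {e:?}"); // e.g. BadResponse(0xff)
    }
}
```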