zlacker

So instead of a power spike, we'd have had a major internet outage across the world, across the entire industry and beyond, probably, if everyone had panicked on oops. The blame really lies with people not monitoring their systems.

As you said, you have the option to reboot on panic, but Linus is absolutely not wrong that this size does not fit all.

What about a medical procedure that WILL kill the patient if interrupted? What about life support in space? Hitting an assert in those kinds of systems is a very bad place to be, but an automatic halt is worse than at least giving the people involved a CHANCE to try and get to a state where it's safe to take the system offline and restart it.

replies(6): >>Tomte+k2 >>ok_dad+13 >>notaco+bu >>acje+UF1 >>stouse+8H1 >>kaba0+cq2

>>mike_h+(OP)
You won't see Linux in those fields. I'm aware of a project by OSADL to qualify Linux for SIL 2, but your examples are way beyond that.

replies(1): >>retard+fN9

>>mike_h+(OP)
> What about a medical procedure that WILL kill the patient if interrupted? What about life support in space? Hitting an assert in those kinds of systems is a very bad place to be, but an automatic halt is worse than at least giving the people involved a CHANCE to try and get to a state where it's safe to take the system offline and restart it.

Kinda a strawman there. That's got to account for, what, 0.0001% of all use of computers, and probably they would never ever use Linux for these applications (I know medical devices DO NOT use Linux).

replies(3): >>gmueck+q4 >>goodpo+Qc >>Kim_Br+lr2

>>ok_dad+13
Do you know absolutely every medical device in existence and do you know how broad the definition of a medical device is (including e.g. the monitor attached to the PC used for displaying X-ray images)?

replies(2): >>jmilli+j5 >>ok_dad+28

>>gmueck+q4

  > including e.g. the monitor attached to the PC used for displaying
  > X-ray images

Somewhat off-topic, but I used to work in a dental office. The monitors used for displaying X-rays were just normal monitors, bought from Amazon or Newegg or whatever big-box store had a clearance sale. Even the X-ray sensors themselves were (IIRC) not regulated devices, you could buy one right now on AliExpress if you wanted to.

replies(1): >>gmueck+0i

>>gmueck+q4
I worked in medical device quality control and so, yes, I know all about the FDA requirements for medical devices and ISO 13485. I can say, with certainty, that base Linux would not be allowed to run in a medical device in the USA. It's software of unknown provenance (SOUP) and would absolutely NOT be used as-is.

replies(6): >>smolde+Od >>gmueck+Hj >>voakba+wl >>sarlal+QR >>cplusp+Oq1 >>Suzura+uA2

>>ok_dad+13
> I know medical devices DO NOT use Linux

Absolutely false.

>>ok_dad+28
Makes me wonder what they run their NAS software with. Or their internal web-hosting, or their networking devices, or any of the other devices they have littered about. I'd swear on the Bible that I've seen a dentist or two running KDE 3 before...

replies(1): >>ok_dad+ap

>>jmilli+j5
That's not the case in the EU. I've worked for an equipment manufacturer for dental clinics. While the monitors were allowed to be off the shelf, the operator (dental clinic) is required to make sure that they work properly and display the image correctly - obey certain brightness and color resolution/calibration standards. Our display software had to refuse to work on an uncalibrated monitor.

replies(1): >>alias_+mL

>>ok_dad+28
Then you should know that the use of SOUP is not so clear cut. It depends on the class of device and more specifically, on the part of the device that the software is used on. I know medical devices running SOUP operating systems like Linux. They went to some length to show that the parts running Linux and the critical functions of the device were sufficiently independent. This isolation is specifically allowed by the standards you quote.

It's even worse on things like car dashboards: some warning lights on dashboards need to be ASIL-D conformant, which is quite strict. However, developing the whole dashboard software stack to that standard is too expensive. So the common solution these days is to have a safe, ASIL-D compliant compositor and a small renderer for the warning lights section of the display while the rendering for all the flashy graphics runs in an isolated VM on standard software with lower safety requirements. It's all done on the same CPU and GPU.

replies(1): >>ok_dad+6p

>>ok_dad+28
That’s an odd thing to claim. I have worked on certified medical devices that run a custom Linux distribution.

Mind you, that experience also severely soured me on the quality of medical software systems, due to poor quality of the software that ran in that distribution. Linux itself was a golden god in comparison to the crap that was layered on top of it.

replies(1): >>ok_dad+rp

>>gmueck+Hj
> They went to some length to show that the parts running Linux and the critical functions of the device were sufficiently independent.

Let's not be too pedantic. You, as an experienced medical device engineer, probably knew that what I meant was that they would never use Linux in the critical parts of a medical device, as the OP had originally argued. Any such device would definitely retain all of its functionality without the part with Linux on it.

The OP was still a major strawman, regardless of my arguments, because the Linux kernel will never be in the critical path of a medical device without a TON of work to harden it from errors and such. Just the fact that Linus' stance is as said would mean that it's not an appropriate kernel for a medical device, because they should always fail with an error and stop under unknown conditions rather than just doing some random crap.

>>smolde+Od
Those aren't medical devices.

>>voakba+wl
I'd like to hear more about that, but I assume it's much like the other poster here that described a Linux system that is a peripheral device attached to the actual medical device that does the medical shit.

replies(2): >>gmueck+AC >>voakba+dD2

>>mike_h+(OP)
> What about a medical procedure that WILL kill the patient if interrupted? What about life support in space?

The proper answer to those is redundancy, not continuing in an unknown and quite likely harmful state.

replies(2): >>yencab+fI >>John23+ZF1

>>ok_dad+rp
It is not a peripheral device if it runs the UI with all the main controls, is it?

replies(1): >>ok_dad+mz1

>>notaco+bu
The leap second bug would have crashed all nodes of a redundant system, at the same time...

replies(1): >>notaco+jL

>>yencab+fI
Perhaps. On the other hand, letting a medical device continue moving an actuator or dispensing a medication when it's known to be in a bad "never happen" state could also be fatal. Ditto for the "life support in space" example. Ditto for anything reliant on position, where the system suddenly realizes it has no idea whether its position is correct. Imagine that e.g. on a warship. Limiting responses to external inputs (including time adjustments) can ameliorate such problems. So can software diversity. Many safety-critical systems require one or both, and other measures as well. Picking one black-swan event while ignoring literally everyday scenarios doesn't seem very helpful. That's especially true when the thing you're advocating is what actually happened and led to its own bad outcomes.

replies(1): >>yencab+7m1

>>gmueck+0i
Interesting, how does your software detect an uncalibrated monitor? Did it come with a calibration device which had to be used to scan the display output to check?

I don't suppose monitors report calibration data back to display adapters do they?

replies(2): >>Gauntl+GM >>gmueck+gO

>>alias_+mL
My guess is they had some heuristic based on EDIDs, which are incredibly easy to spoof.

https://smile.amazon.com/EVanlak-Passthrough-Generrtion-Elim...

replies(1): >>gmueck+uP

>>alias_+mL
I didn't work on that specific software team and it has been a long time since I worked there. But the software came with its custom calibration routine and I believe that the calibration result was stored with model and serial number information from the monitor EDID.
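A minimal sketch of how a check like that could work, assuming the calibration result is keyed by the manufacturer ID, product code, and serial number read from the EDID base block. The function names and the dictionary-based store are hypothetical, not the actual product's design; only the EDID byte layout (fixed 8-byte header, bytes 8-15 for identity, checksum in byte 127) comes from the EDID format itself. As noted elsewhere in the thread, EDIDs are easy to spoof, so this only guards against accidental use of an uncalibrated monitor.

```python
import struct

EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def parse_edid_identity(edid: bytes):
    """Return (manufacturer, product_code, serial) from a 128-byte EDID base block."""
    if len(edid) < 128 or edid[:8] != EDID_HEADER:
        raise ValueError("not an EDID base block")
    # All 128 bytes of the base block must sum to 0 modulo 256.
    if sum(edid[:128]) % 256 != 0:
        raise ValueError("EDID checksum mismatch")
    # Bytes 8-9: three 5-bit letters (1 = 'A'), stored big-endian.
    (mfg,) = struct.unpack(">H", edid[8:10])
    letters = "".join(
        chr(((mfg >> shift) & 0x1F) + ord("A") - 1) for shift in (10, 5, 0)
    )
    # Bytes 10-11: product code, bytes 12-15: serial number, both little-endian.
    product, serial = struct.unpack("<HI", edid[10:16])
    return (letters, product, serial)


def is_calibrated(edid: bytes, calibration_store: dict) -> bool:
    """Hypothetical check: a calibration record must exist for this exact monitor."""
    return parse_edid_identity(edid) in calibration_store


def make_sample_edid(mfg="ABC", product=0x1234, serial=42) -> bytes:
    """Build a synthetic EDID block for demonstration (identity fields only)."""
    block = bytearray(128)
    block[:8] = EDID_HEADER
    code = 0
    for ch in mfg:
        code = (code << 5) | (ord(ch) - ord("A") + 1)
    block[8:10] = struct.pack(">H", code)
    block[10:16] = struct.pack("<HI", product, serial)
    block[127] = (-sum(block[:127])) % 256  # fix up the checksum byte
    return bytes(block)


if __name__ == "__main__":
    store = {("ABC", 0x1234, 42): {"calibrated_on": "hypothetical date"}}
    print(is_calibrated(make_sample_edid(), store))           # calibrated monitor
    print(is_calibrated(make_sample_edid(serial=43), store))  # different unit
```

Swapping in a monitor of the same model but a different serial number fails the lookup, which is the behavior described above: the software refuses to trust a display it has no calibration record for.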

replies(1): >>alias_+sM1

>>Gauntl+GM
Yes, but why would you go to these lengths? The purpose of the whole mechanism is to prevent accidental misdiagnosis based on an incorrectly interpreted X-ray image. This isn't DRM, just a safeguard against incorrect use of equipment.

replies(1): >>Gauntl+Rz1

>>ok_dad+28
Ok, that's good for a U.S.-centric view. Do you know that every medical device manufactured in China for use in China meets the same requirements? Same for India, Russia, etc. The U.S. isn't the world and I'd be surprised if Linux weren't in use in some critical systems around the world that would be shocking for U.S. experts on those types of systems.

>>notaco+jL
Picking medical devices and warships is also quite the cherry picking. Most Linuxes aren't like that. Critical embedded systems tend to have a hard realtime component, and if Linux is on the system it sits under e.g. seL4, or on a different CPU.

At the end of the day, what Linux does is what Linus wants out of it. He's stated, often, that halting the CPU at the exact moment something goes wrong is not the goal. If your goal is to do that, you might not be able to use Linux. If your goal is to put Rust in the Linux kernel, you might have to let go of your goal.

replies(1): >>notaco+ww2

>>ok_dad+28
Surely we can “harden” Linux for this application?

>>gmueck+AC
No, do you have a concrete example of this strawman, though?

Edit: I should also add (probably earlier too) that all my examples are specific to the USA FDA process. I'm sure some other place might not have the same rules.

replies(1): >>gmueck+0M1

>>gmueck+uP
People are cheap and corrupt. The speed bump this presents is real, but minor, in the face of a couple medical shops looking to save $100/pop on a dozen monitors.

I hope it's rare, but I think a persistent nag window ("Your display isn't calibrated and may not be accurate") is probably a better answer than refusing to work altogether, because it will be clear about the source of the problem and less likely to get nailed down.

replies(2): >>gmueck+GM1 >>kaba0+ws2

>>mike_h+(OP)
I remember working on some telecom equipment in the '90s. It had an x86/Unix feature-rich distributed management system. In other words, complicated and expected to fail. The solution was a “watchdog” circuit that the main CPU had to poll every 100ms or so. Three misses and the CPU would get hard rebooted by the watchdog.
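The watchdog behavior described above can be modeled in a few lines. This is only a software simulation under stated assumptions (a 100 ms poll period, a reset after three missed polls, and a CPU that resumes polling after the reset); the class and function names are made up for illustration, and a real watchdog is a separate hardware timer, not code running on the CPU it supervises.

```python
class Watchdog:
    """Software model of a hardware watchdog timer (illustrative only)."""

    def __init__(self, period_ms=100, max_misses=3):
        self.period_ms = period_ms
        self.max_misses = max_misses
        self.last_kick_ms = 0

    def kick(self, now_ms):
        """The supervised CPU calls this to prove it is still alive."""
        self.last_kick_ms = now_ms

    def tick(self, now_ms):
        """Driven by the watchdog's own clock; returns True if it forces a reset."""
        missed = (now_ms - self.last_kick_ms) // self.period_ms
        if missed >= self.max_misses:
            self.last_kick_ms = now_ms  # the rebooted CPU gets a fresh window
            return True
        return False


def simulate(hang_at_ms, total_ms=1000, step_ms=10):
    """Run a CPU that hangs once at hang_at_ms; return the times of forced resets."""
    wd = Watchdog()
    hung = False
    hang_pending = True
    reset_times = []
    for now in range(0, total_ms + 1, step_ms):
        if hang_pending and now >= hang_at_ms:
            hung = True
            hang_pending = False
        if not hung and now % wd.period_ms == 0:
            wd.kick(now)
        if wd.tick(now):
            reset_times.append(now)
            hung = False  # assumption: the hard reboot clears the hang
    return reset_times


if __name__ == "__main__":
    print(simulate(hang_at_ms=250))     # hang after the 200 ms poll
    print(simulate(hang_at_ms=10_000))  # never hangs within the run
```

With a hang at 250 ms, the last successful poll is at 200 ms and the reset fires at 500 ms, once three full periods have passed without a poll. The point of the design is that recovery does not depend on the hung software noticing its own failure.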

This reminds me of two things. First, good system design needs hardware-software codesign. Oxide Computer has identified this, but it was probably much more common before the '90s than after. Second, all things can fail, so a strategy that only hardens one component is fundamentally limited, even flawed. If the component must not fail, you need redundancy and supervision. Joe Armstrong would be my source for a quote if I needed to find one.

Both Rust and Linux have some potential for improvement here, but the best answers may lie in their relation to the greater system rather than within themselves. I’m thinking of WASM and hardware codesign, respectively.

>>notaco+bu
Right. It's clear that many people have not heard of, or considered, Therac-25[1].

[1] https://en.wikipedia.org/wiki/Therac-25

replies(1): >>eesmit+qS1

>>mike_h+(OP)
> What about a medical procedure that WILL kill the patient if interrupted?

Allow me to introduce you to Therac-25: https://en.wikipedia.org/wiki/Therac-25

>>ok_dad+mz1
I can't see how you can make out a strawman in what I said. There are medical devices where the UI is running on a processor separate from the controller in charge of the core device functions. The two are talking to each other and there is no secondary way of interacting with the controller. This lessens the requirements that are put on the part running the UI, but does not eliminate them.

I'm mostly familiar with EU rules, but as far as I know the FDA regulations follow the same idea of tiered requirements based on potential harm done.

replies(1): >>ok_dad+D02

>>gmueck+gO
Thanks, sounds like I need to do some reading about EDIDs; I knew _of_ them but have no real understanding of what they are and what they do.

>>Gauntl+Rz1
You have to draw a line somewhere, I guess. As far as I remember, protections against accidental misuse and foreseeable abuse of a device are required in medical equipment. But malicious circumvention of protections or any kind of active tampering are a whole other category in my opinion.

>>John23+ZF1
Therac-25 removed redundancy. Quoting the Wikipedia article: "Previous models had hardware interlocks to prevent such faults, but the Therac-25 had removed them, depending instead on software checks for safety."

replies(1): >>John23+LWb

>>gmueck+0M1
The UI is one of the most important parts of a machine; look at the Therac-25! The FDA regulations require that a lot of effort go into the human factors, too, and the UI definitely has to be as reliable and as well engineered as the rest of the device.

https://www.fda.gov/medical-devices/human-factors-and-medica...

Honestly, the FDA regulations go too far vs the EU regs. The company I worked for was based in the EU and the products there were so advanced compared to our versions. Ours were all based on an original design from Europe that was approved and then basically didn’t change for 30 years. The European device was fucking cool and had so many features; it was also capable of being carried around rather than rolled. The manufacturing was almost all automated, too, but in the USA it was not at all automated: it was humans assembling parts and then recording it in a computer terminal.

>>mike_h+(OP)
That’s why monitoring, fail-safe power-offs and redundant systems are important. E.g. even on complete failure of a CAT scanner’s higher-level control (which, let’s say, runs on an embedded Linux kernel), the system would safely stop the radiation and power off, without any instructions from an OS. Here, an inconsistent state in the OS is actually more dangerous than stopping in the middle (e.g. the OS gets stuck and the same high-energy radiation is continuously released).

>>ok_dad+13
Industrial control systems (sadly, imho) don't use Linux as often as they could/should, but such systems do have the ability to injure their operators or cause large amounts of damage of course. [1]

The first priority is safety, absolutely and without question. And then the immediate second priority is the fact that time is money. For every minute that the system is not operating, x amount of product is not being produced.

Generally, having the software fully halt on error is both dangerous and time-consuming.

Instead you want to switch to an ERROR and/or EMERGENCY_STOP state, where things like lasers or plasma torches get turned off, motors are stopped, brakes are applied, doors get locked/unlocked (as appropriate/safe), etc. And then you want to report that to the user, and give them tools to diagnose and correct the source of the error and to restart the machine/line [safely!] as quickly as possible.
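As a rough sketch of that idea, assuming a machine with one laser, one motor, and one brake: the state names follow the ones mentioned above, while the `Machine` class, its fields, and the operator-confirmation flag are hypothetical. A real controller would implement the safety-relevant parts in certified hardware or a safety PLC rather than in application code like this.

```python
from enum import Enum, auto


class State(Enum):
    RUNNING = auto()
    ERROR = auto()
    EMERGENCY_STOP = auto()


class Machine:
    """Toy model of a machine that enters a safe state instead of halting."""

    def __init__(self):
        self.state = State.RUNNING
        self.laser_on = True
        self.motor_running = True
        self.brake_applied = False
        self.fault_log = []

    def _make_safe(self):
        # De-energize anything that can hurt people or damage the product.
        self.laser_on = False
        self.motor_running = False
        self.brake_applied = True

    def report_fault(self, description, critical=False):
        """Record the fault and move to a safe state; the software keeps running."""
        self._make_safe()
        self.fault_log.append(description)
        if critical or self.state is State.EMERGENCY_STOP:
            self.state = State.EMERGENCY_STOP  # never downgrade an e-stop
        else:
            self.state = State.ERROR

    def restart(self, operator_confirmed_clear):
        """Resume only after the operator confirms the cause has been corrected."""
        if self.state is State.RUNNING:
            return True
        if not operator_confirmed_clear:
            return False
        self.brake_applied = False
        self.motor_running = True
        self.laser_on = True
        self.state = State.RUNNING
        return True


if __name__ == "__main__":
    m = Machine()
    m.report_fault("guard door opened")
    print(m.state, m.fault_log)
    print(m.restart(operator_confirmed_clear=True), m.state)
```

The fault log survives the restart so the operator can still diagnose what happened, and the process stays alive throughout, which is what makes a quick and safe restart possible.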

In short, error handling and recovery is its own entire thing, and tends to be something that gets tested for separately during commissioning.

[1] PLCs do have the ability to <not stop> and execute code in a real-time manner, but I haven't encountered a lot of PLC programmers who actually exploit these abilities effectively. Basically, for more complex situations you're quickly going to be better off with more general-purpose tools [2], at most handing off critical tasks to PLCs, micro-controllers, motor controllers, etc.

[2] except for that stupid propensity to give-up-and-halt at exactly that moment where it'll cause the most damage.

>>Gauntl+Rz1
Medical devices are insanely expensive (a CT scanner may reach a million dollars), you won’t risk $100 on such a small thing as a screen.

>>yencab+7m1
> Picking medical devices ... is also quite the cherry picking

It wasn't my example. It was mike_hock's, and I was responding in the context they had set.

> Most Linuxes aren't like that.

Your ally picked the medical-device and space-life-support examples. If you think they're invalid because such systems don't use Linux, why did you forgo bringing it up with them and then change course when replying to me? As I said: not helpful.

The point is not specific to Linux, and more Linux systems than you seem to be aware of do adopt the "crash before doing more damage" approach because they have some redundancy. If you're truly interested, I had another whole thread in this discussion explaining one class of such cases in what I feel was a reasonably informative and respectful way while another bad-faith interlocutor threw out little more than one-liners.

>>ok_dad+28
I am an American citizen and a former dialysis patient, now kidney transplant recipient. I have watched in-center dialysis machines reboot during treatment, show the old "Energy Star" BIOS logo, and then boot Linux...

Felt kinda bad until I thought about how well a "Linux literally killed me" headline would do on HN, but then I realized I wouldn't be able to post the article if I actually died. Such is life. Or death? One or the other.

>>ok_dad+rp
These were not peripherals. We are talking devices that would be front line in an emergency room. Terrifying.

>>Tomte+k2
SUSE was used on Mars

https://www.pcmag.com/news/linux-is-now-on-mars-thanks-to-na...

>>eesmit+qS1
Right, that is the point I was making.