"Well, now it's Feb. 1st and I have an iPhone 17 Pro Max to test with and... everything works as expected. So it's pretty safe to say that THAT specific instance of iPhone 16 Pro Max was hardware-defective."
[1] as the author knows (“MLX uses Metal to compile tensor operations for this accelerator. Somewhere in that stack, the computations are going very wrong”) there’s lots of soft- and firmware in-between the code being run and the hardware of the neural engine. The issue might well be somewhere in those.