My iPhone 16 Pro Max produces garbage output when running MLX LLMs

>>rafael+(OP)
Low-level numerical optimizations are often not reproducible. For example: https://www.intel.com/content/dam/develop/external/us/en/doc... (2013)

But it's still surprising that the LLM doesn't work on the iPhone 16 at all. After all, LLMs are known for their tolerance to quantization.

>>rainco+8h
Yes, "floating point accumulation doesn't commute" is a mantra everyone should have in their head, and when I first read this article, I was champing at the bit to dismiss it out of hand for that reason.

But, what got me about this is that:

* every other Apple device delivered the same results

* Apple's own LLM silently failed on this device

To me, that behavior suggests an unexpected failure rather than a fundamental issue; it seems Bad (TM) that Apple would ship devices where their own LLM didn't work.

>>bri3d+Dh
> floating point accumulation doesn't commute

It is commutative (except for NaN). It isn't associative though.
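
A minimal sketch of the distinction, assuming ordinary IEEE 754 double precision (which is what Python's float uses on common platforms). The values 0.1, 0.2 and 0.3 are picked only for illustration:

```python
# Illustrative values only; any triple whose partial sums round differently would do.
a, b, c = 0.1, 0.2, 0.3

# Commutativity: swapping the two operands of a single addition gives the same result.
commutes = (a + b) == (b + a)

# Associativity: regrouping a chain of additions can change the rounding.
left = (a + b) + c   # rounds a + b first
right = a + (b + c)  # rounds b + c first
associates = left == right

print("commutes:", commutes)
print("associates:", associates)
print(left, right)
```

This is why a reduction that sums the same numbers in a different order (different thread counts, vector widths, or tiling) can produce slightly different results without anything being broken.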

>>sva_+Ct
I think it commutes even when one or both inputs are NaN? The output is always NaN.
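
A small check of this, again assuming IEEE 754 doubles. Both operand orders give NaN, but an equality test cannot show that, because NaN compares unequal to everything, including itself. Whether the NaN payload bits match when both inputs are NaN is platform-dependent and is not tested here:

```python
import math

nan = float("nan")

x = nan + 1.0
y = 1.0 + nan

# Both orders produce NaN.
both_nan = math.isnan(x) and math.isnan(y)

# A naive equality check still reports "different", since NaN != NaN.
naive_equal = x == y

print("both NaN:", both_nan)
print("x == y:", naive_equal)
```

So "commutes" holds here in the sense that the result is NaN either way, even though `==` would say otherwise.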
