I know the concept has been around for a while, but I have no idea whether it actually means anything in practice. I assume people are targeting the ones in common devices like Apple's, but what about here?
Even if it worked, though, they're usually so heavily memory-bandwidth bottlenecked that they're near useless for LLM inference: token-by-token generation is bound by how fast the weights can be streamed from memory, and the NPU sits on the same memory bus as everything else, so its extra compute buys almost nothing. CPU wins every time.
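For a rough sense of why, here's a back-of-envelope sketch (illustrative numbers, not measurements from any specific device): during decode, every parameter has to be read from memory once per generated token, so throughput is roughly bandwidth divided by model size in bytes, regardless of which unit does the math.

```python
# Rough decode throughput ceiling for autoregressive generation.
# Assumption: every parameter is read once per token, so tokens/sec
# is capped by (memory bandwidth) / (model size in bytes), no matter
# whether the CPU, GPU, or NPU is doing the multiplies.

def decode_ceiling_tok_s(params_billion: float,
                         bytes_per_param: float,
                         bandwidth_gb_s: float) -> float:
    model_gb = params_billion * bytes_per_param  # GB streamed per token
    return bandwidth_gb_s / model_gb

# Example (hypothetical): 7B model, 4-bit quantized (~0.5 bytes/param),
# 100 GB/s unified memory -> ~28 tokens/s ceiling, shared by every unit.
print(decode_ceiling_tok_s(7, 0.5, 100))
```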