
1. French+(OP) 2026-02-03 21:39:53
> time-to-first-token/token-per-second/memory-used/total-time-of-test

Wouldn't it help with the DDR4 example, though, if we had more "real world" tests?

replies(1): >>bigyab+U1
2. bigyab+U1 2026-02-03 21:49:36
>>French+(OP)
Maybe, but even that set of four metrics misses key performance details like context length and model size/sparsity.
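For concreteness, those four numbers boil down to something like the sketch below (minimal Python, assuming a hypothetical `stream_tokens` generator wrapping whatever local runtime is under test); note that nothing in it records the prompt/context length or which model produced the run:

```python
import time
import resource  # stdlib, Unix-only; ru_maxrss is peak RSS in KB on Linux


def benchmark(stream_tokens, prompt):
    """Time one generation run and report the usual four numbers.

    `stream_tokens` is a hypothetical callable that yields output tokens
    one at a time -- a thin wrapper around your local runtime's streaming API.
    """
    start = time.perf_counter()
    first = None
    n_tokens = 0

    for _ in stream_tokens(prompt):
        if first is None:
            first = time.perf_counter()  # first token arrived
        n_tokens += 1

    end = time.perf_counter()
    decode = end - first if first is not None else 0.0
    return {
        "time_to_first_token_s": (first if first is not None else end) - start,
        "tokens_per_second": n_tokens / decode if decode > 0 else float("nan"),
        "peak_rss_mb": resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024,
        "total_time_s": end - start,
    }
```

Even with all four numbers in hand, you still have to report what context length and which model/quantization produced them before the result means anything, which is the gap I'm pointing at.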

The bigger takeaway (IMO) is that there will never really be local hardware that scales the way Claude or ChatGPT does. I love local AI, but it runs up against the fundamental limits of on-device compute.
