zlacker

[parent] [thread] 5 comments
1. vessen+(OP)[view] [source] 2026-02-03 16:16:46
I'm thinking the next step would be to include this as a 'junior dev' and let Opus farm simple stuff out to it. It could be local, but if it's on Cerebras, it could be really fast.
replies(1): >>ttoino+C
2. ttoino+C[view] [source] 2026-02-03 16:19:31
>>vessen+(OP)
Cerebras already has GLM 4.7 in the code plans
replies(1): >>vessen+i1
3. vessen+i1[view] [source] [discussion] 2026-02-03 16:22:21
>>ttoino+C
Yep. But this is like 10x faster; 3B active parameters.
replies(1): >>ttoino+34
4. ttoino+34[view] [source] [discussion] 2026-02-03 16:33:20
>>vessen+i1
Cerebras is already at 200-800 tps; do you need it even faster?
replies(1): >>overfe+jh
5. overfe+jh[view] [source] [discussion] 2026-02-03 17:28:01
>>ttoino+34
Yes! I don't try to read agent tokens as they're generated, so if code generation drops from 1 minute to 6 seconds, I'll be delighted. I'll even take 10s -> 1s speedups. Considering how often I've seen agents spin their wheels trying different approaches, faster is always better, at least until models can 1-shot solutions without the repeated "No, wait..." / "Actually..." thinking loops.
replies(1): >>pqtyw+E91
6. pqtyw+E91[view] [source] [discussion] 2026-02-03 21:13:41
>>overfe+jh
> until models can 1-shot solutions without the repeated "No, wait..." / "Actually..." thinking loops

That would imply they'd have to be actually smarter than humans, not just faster, and able to scale infinitely. IMHO that's still very far away.
