zlacker

[parent] [thread] 2 comments
1. chirag+(OP)[view] [source] 2026-01-26 04:41:10
How many tokens are you burning daily?
replies(2): >>gls2ro+K1 >>storys+7p
2. gls2ro+K1[view] [source] 2026-01-26 05:01:05
>>chirag+(OP)
Not the OP, but for scanning and tagging/summarization you can run a local LLM; the accuracy should be good enough for that use case.
3. storys+7p[view] [source] 2026-01-26 09:26:53
>>chirag+(OP)
The real cost driver with agents seems to be repetitive context transmission, since you re-send the whole history at every step. I found I had to implement tiered model routing or prompt caching just to make the unit economics work.
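A minimal sketch of the two levers mentioned above. The model tiers, per-token prices, and the 90% discount on cached prefix tokens are illustrative assumptions, not real provider pricing:

```python
# Hypothetical pricing table (USD per 1K input tokens) -- assumed values.
PRICE_PER_1K = {"small": 0.0002, "large": 0.01}

def route(task_tokens: int, needs_reasoning: bool) -> str:
    """Tiered routing: pick the cheapest tier that can handle the task."""
    return "large" if needs_reasoning or task_tokens > 8000 else "small"

def step_cost(history_tokens: int, new_tokens: int, tier: str,
              cached_prefix: int = 0) -> float:
    """Cost of one agent step. Re-sent history dominates; cached prefix
    tokens are assumed to bill at 10% of the normal input rate."""
    billable = ((history_tokens - cached_prefix)
                + cached_prefix * 0.1
                + new_tokens)
    return billable / 1000 * PRICE_PER_1K[tier]

# 10 agent steps, history growing by 5K tokens per step, 500 new tokens each.
steps = range(0, 50_000, 5_000)
no_cache = sum(step_cost(h, 500, "large") for h in steps)
with_cache = sum(step_cost(h, 500, "large", cached_prefix=h) for h in steps)
print(f"no cache: ${no_cache:.2f}, with cache: ${with_cache:.2f}")
```

Under these made-up numbers, caching the full history prefix cuts the 10-step cost by roughly an order of magnitude, which is why the history re-send pattern is what makes or breaks agent unit economics.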