zlacker

[parent] [thread] 2 comments
1. chirag+(OP)[view] [source] 2026-01-26 04:41:10
How many tokens are you burning daily?
replies(2): >>gls2ro+K1 >>storys+7p
2. gls2ro+K1[view] [source] 2026-01-26 05:01:05
>>chirag+(OP)
Not the OP, but for scanning and tagging/summarization you can run a local LLM; the accuracy should be good enough for that use case.
3. storys+7p[view] [source] 2026-01-26 09:26:53
>>chirag+(OP)
The real cost driver with agents seems to be repetitive context transmission, since you re-send the whole history at every step. I found I had to implement tiered model routing or prompt caching just to make the unit economics work.
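A minimal sketch of the two levers mentioned above. The model tiers, per-token prices, and the 90% discount on cached prefix tokens are illustrative assumptions, not real provider pricing:

```python
# Hypothetical pricing table (USD per 1K input tokens) -- assumed values.
PRICE_PER_1K = {"small": 0.0002, "large": 0.01}

def route(task_tokens: int, needs_reasoning: bool) -> str:
    """Tiered routing: pick the cheapest tier that can handle the task."""
    return "large" if needs_reasoning or task_tokens > 8000 else "small"

def step_cost(history_tokens: int, new_tokens: int, tier: str,
              cached_prefix: int = 0) -> float:
    """Cost of one agent step. Re-sent history dominates; cached prefix
    tokens are assumed to bill at 10% of the normal input rate."""
    billable = ((history_tokens - cached_prefix)
                + cached_prefix * 0.1
                + new_tokens)
    return billable / 1000 * PRICE_PER_1K[tier]

# 10 agent steps, history growing by 5K tokens per step, 500 new tokens each.
steps = range(0, 50_000, 5_000)
no_cache = sum(step_cost(h, 500, "large") for h in steps)
with_cache = sum(step_cost(h, 500, "large", cached_prefix=h) for h in steps)
print(f"no cache: ${no_cache:.2f}, with cache: ${with_cache:.2f}")
```

Under these made-up numbers, caching the full history prefix cuts the 10-step cost by roughly an order of magnitude, which is why the history re-send pattern is what makes or breaks agent unit economics.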