There are better solutions on the market if you're looking for in-depth observability for LLM inference.
For example, Requesty (requesty.ai) provides detailed analytics, cost breakdowns, and request logs. You can also set spend limits, create routing policies, or restrict usage to a subset of models that do not retain data.
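Since Requesty acts as an OpenAI-compatible router, you can typically point the standard OpenAI SDK at it and get logging and analytics without changing application code. The sketch below illustrates this; the base URL, environment variable name, and model identifier are assumptions for illustration, so check Requesty's docs for the exact values.

```python
# Minimal sketch: routing chat completions through Requesty so requests
# show up in its analytics and logs. Assumes an OpenAI-compatible
# endpoint; base URL, env var, and model name below are illustrative.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://router.requesty.ai/v1",   # assumed router endpoint
    api_key=os.environ["REQUESTY_API_KEY"],     # hypothetical env var name
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",                 # illustrative model identifier
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Because the integration point is just the base URL, spend limits and routing policies configured in the Requesty dashboard apply to this traffic without further code changes.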