tosh (OP)
One thing that most people don't realize is that (full-parameter) finetuned models are costly to serve unless you run them in batched mode. This means that unless the request rate is high and consistent, it is cheaper to use prompting with GPT-3.5. For example, at a batch size of 1, Mistral 7B is more expensive to run than GPT-4[1].
[1]: https://docs.mystic.ai/docs/mistral-ai-7b-vllm-fast-inferenc...
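To make the intuition concrete, here is a back-of-envelope sketch of how batching changes per-token cost for a self-hosted model versus a per-token API. Every number (GPU hourly price, throughput, API price, the batching-efficiency exponent) is an illustrative assumption, not a measurement:

```python
# Back-of-envelope: self-hosted 7B model vs. a per-token API.
# All constants below are illustrative assumptions, not benchmarks.

GPU_COST_PER_HOUR = 1.20          # assumed on-demand GPU price, USD/hour
TOKENS_PER_SEC_BATCH1 = 40        # assumed decode throughput at batch size 1
API_COST_PER_1K_TOKENS = 0.002    # assumed GPT-3.5-class output pricing, USD

def self_hosted_cost_per_1k(batch_size: int, scaling: float = 0.8) -> float:
    """Cost per 1K generated tokens when serving `batch_size` requests at once.

    Assumes aggregate throughput scales sub-linearly with batch size
    (scaling < 1 models the efficiency loss as batches grow).
    """
    throughput = TOKENS_PER_SEC_BATCH1 * batch_size ** scaling  # tokens/sec
    tokens_per_hour = throughput * 3600
    return GPU_COST_PER_HOUR / tokens_per_hour * 1000

for b in (1, 8, 64):
    print(f"batch={b:>2}: ${self_hosted_cost_per_1k(b):.4f}/1K tokens "
          f"(API: ${API_COST_PER_1K_TOKENS:.4f}/1K)")
```

Under these assumptions the self-hosted GPU is several times more expensive per token at batch size 1, and only undercuts the API once the batch (i.e. sustained request rate) is large enough to keep the GPU busy.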