Sufficiently accurate responses can be fed into downstream systems and cleaned up there. Even code responses can benefit from this: constrain output tokens to the grammar of the target language, or iterate until the code compiles successfully.
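To make the compile-and-retry idea concrete, here's a minimal sketch in Python. Everything here is an assumption rather than any particular library's API: `generate_code` is a hypothetical stand-in for whatever model client you use, and the loop simply feeds syntax errors back into the prompt until the output parses.

```python
import textwrap


def generate_code(prompt: str) -> str:
    """Hypothetical LLM call; swap in your actual client here."""
    raise NotImplementedError


def code_with_retries(prompt: str, max_attempts: int = 3) -> str:
    """Regenerate until the model's output parses as valid Python."""
    for _ in range(max_attempts):
        source = generate_code(prompt)
        try:
            # compile() checks syntax without executing the code.
            compile(source, "<llm-output>", "exec")
            return source
        except SyntaxError as err:
            # Feed the error back so the next attempt can correct it.
            prompt = textwrap.dedent(f"""\
                The previous code failed to compile with: {err}
                Fix it and return only the corrected code.

                {source}""")
    raise RuntimeError(f"no compilable code after {max_attempts} attempts")
```

The same loop generalizes beyond syntax checks: swap `compile()` for a real compiler invocation or a test run, and the error message becomes the feedback signal.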
And for a decent number of LLM-enabled use cases, the functionality these models unlock is genuinely novel. When you're going from 0 to 1, people will just be amazed that the product exists.