Open source models are actually _better_ at structured outputs because you can constrain them with tools like JSONFormer that interact with the internals of the model (https://www.reddit.com/r/LocalLLaMA/comments/17a4zlf/reliabl...). The structured outputs can follow arbitrary grammars, not just JSON (https://github.com/outlines-dev/outlines#using-context-free-...).
Yes, but you should also instruct the model to follow that specific pattern in its answer, or else the accuracy of the response degrades even though it's following your grammar/pattern/whatever.
For example, if you use Llama-2-7b for classification (three categories, "Positive", "Negative", "Neutral"), you might write a grammar like this:
```
root ::= "{" ws "sentiment:" ws sentiment "}"
sentiment ::= ("Positive" | "Neutral" | "Negative" )
ws ::= [ \t\n]*
```
But if the model doesn't know it has to generate this schema, the accuracy of classifications drops because it's trying to say other things (e.g., "As an AI language model...") which then get suppressed and "converted" to the grammar.
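To make that concrete, here is a rough sketch of what "telling the model about the schema" could look like. The names `build_prompt`, `parse_sentiment`, and `OUTPUT_RE` are made up for illustration, and the prompt wording is just one guess at something reasonable, not a tested recipe. The regex is a hand-written equivalent of the grammar above, useful for sanity-checking outputs afterwards.

```python
import re

LABELS = ("Positive", "Neutral", "Negative")


def build_prompt(text):
    """Spell out the output format in the prompt, so the grammar only has to
    enforce what the model is already trying to produce.
    (Hypothetical wording; adjust for your model's prompt template.)"""
    return (
        "Classify the sentiment of the text below.\n"
        "Answer with exactly one object of the form {sentiment: LABEL}, "
        "where LABEL is one of: " + ", ".join(LABELS) + ".\n"
        "Do not add any other words.\n\n"
        "Text: " + text + "\n"
        "Answer: "
    )


# Regex mirroring the grammar: "{" ws "sentiment:" ws sentiment "}"
OUTPUT_RE = re.compile(
    r"\{[ \t\n]*sentiment:[ \t\n]*(" + "|".join(LABELS) + r")\}"
)


def parse_sentiment(output):
    """Return the label if the output matches the grammar, else None."""
    match = OUTPUT_RE.fullmatch(output)
    return match.group(1) if match else None
```

You would pass `build_prompt(...)` as the prompt and the grammar as the sampling constraint, so the two agree on the format instead of fighting each other.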
Without an error response as an option, the model is forced to always produce a success response, even if it's gibberish that you likely won’t catch.
I’ve tested this with Mixtral, and it seems capable of deciding between the normal response and error response based on the validity of the data passed in with the request. I’m sure it can still generate gibberish in the required success response format, but I never actually saw it do that in my limited testing, and it is much less likely when the model has an escape hatch.
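As a sketch of what an escape hatch might look like with the sentiment example from earlier in the thread: the grammar gets a second top-level branch for errors, and the caller checks which branch came back. The `error` branch, its reason codes, and the `parse_response` helper are all hypothetical, not something taken from the Mixtral test described above.

```python
import re

# Hypothetical GBNF-style grammar: "error" is the escape hatch the model can
# take when the input does not make sense for the task.
GRAMMAR_WITH_ESCAPE_HATCH = r'''
root ::= success | error
success ::= "{" ws "sentiment:" ws sentiment "}"
error ::= "{" ws "error:" ws reason "}"
sentiment ::= ("Positive" | "Neutral" | "Negative")
reason ::= ("InvalidInput" | "NotApplicable")
ws ::= [ \t\n]*
'''

_WS = r"[ \t\n]*"
_SUCCESS_RE = re.compile(
    r"\{" + _WS + "sentiment:" + _WS + r"(Positive|Neutral|Negative)\}"
)
_ERROR_RE = re.compile(
    r"\{" + _WS + "error:" + _WS + r"(InvalidInput|NotApplicable)\}"
)


def parse_response(text):
    """Return ("ok", label) or ("error", reason) depending on the branch taken."""
    match = _SUCCESS_RE.fullmatch(text)
    if match:
        return ("ok", match.group(1))
    match = _ERROR_RE.fullmatch(text)
    if match:
        return ("error", match.group(1))
    raise ValueError("output does not match either branch of the grammar")
```

The prompt still needs to say when the error branch should be used, for the same reason the success format has to be described to the model.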