zlacker

> Then the model can “choose” which type to respond with, and the grammar will allow either.

Ah I see. So you give the entire "monadic" grammar to the LLM, both as a `grammar` argument and as part of the prompt so it knows the "can't do that" option exists.

I'm aware of the "OR" statements in grammar (my original comment uses that). In my experience though, small models quickly get confused when you add extra layers to the JSON schema.

replies(1): >>coder5+K

>>behnam+(OP)
I wouldn’t provide the grammar itself directly, since I feel like the models probably haven’t seen much of that kind of grammar during training, but just JSON examples of what success and error look like, as well as an explanation of the task. The model will need to generate JSON (at least with the grammar I’ve been providing), so seeing JSON examples seems beneficial.

But, this is all very new stuff, so certainly worth experimenting with all sorts of different approaches.

As far as small models getting confused, I’ve only really tested this with Mixtral, but it’s entirely possible that regular Mistral or other small models would get confused… more things I would like to get around to testing.

replies(1): >>behnam+y2

>>coder5+K
I've tested giving the JSON schema to the model (bigger ones can handle multi-layer schemas) __without__ grammar and it was still able to generate the correct answer. To me it feels more natural than grammar enforcement because the model stays in its "happy place". I then sometimes add the grammar on top to guarantee the desired output structure.

This is obviously not efficient because the model has to process many more tokens at each interaction, and its context window gets full quicker as well. I wonder if others have found better solutions.