zlacker

I've tested giving the JSON schema to the model (bigger ones can handle multi-layer schemas) __without__ grammar and it was still able to generate the correct answer. To me it feels more natural than grammar enforcement because the model stays in its "happy place". I then sometimes add the grammar on top to guarantee the desired output structure.

This is obviously not efficient because the model has to process many more tokens at each interaction, and its context window gets full quicker as well. I wonder if others have found better solutions.