Pushing ChatGPT's Structured Data Support to Its Limits

>>goranm+(OP)
> very few open-source LLMs explicitly claim they intentionally support structured data, but they’re smart enough and they have logically seen enough examples of JSON Schema that with enough system prompt tweaking they should behave.

Open source models are actually _better_ at structured outputs, because you can adapt them with tools like JSONFormer et al. that hook into the model's internals, for example by restricting which tokens can be sampled at each step (https://www.reddit.com/r/LocalLLaMA/comments/17a4zlf/reliabl...). The structured outputs can follow arbitrary grammars, not just JSON (https://github.com/outlines-dev/outlines#using-context-free-...).
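To make "interact with the internals of the model" concrete, here is a minimal toy sketch of token masking, the idea behind grammar-constrained decoding. Everything in it is hypothetical: `VOCAB`, `fake_logits`, and the `ALLOWED` table are stand-ins, not the API of JSONFormer or outlines, and a real library would derive the allowed set from a schema or grammar rather than a hand-written table.

```python
import json
import math

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"a"', ":", "1", "hello", " "]

def fake_logits(prefix):
    # Hypothetical stand-in for a model forward pass.
    # Left alone, this "model" always prefers the token "hello".
    return [0.1, 0.2, 0.3, 0.1, 0.4, 2.0, 0.5]

# Allowed next tokens per step for the tiny language {"a":1}.
ALLOWED = {0: {"{"}, 1: {'"a"'}, 2: {":"}, 3: {"1"}, 4: {"}"}}

def decode(constrain):
    out = []
    for state in range(len(ALLOWED)):
        logits = fake_logits(out)
        if constrain:
            # Mask every token the grammar forbids at this step.
            logits = [
                score if tok in ALLOWED[state] else -math.inf
                for tok, score in zip(VOCAB, logits)
            ]
        # Greedy pick among whatever survived the mask.
        best = max(range(len(VOCAB)), key=logits.__getitem__)
        out.append(VOCAB[best])
    return "".join(out)

unconstrained = decode(constrain=False)
constrained = decode(constrain=True)
print(unconstrained)            # hellohellohellohellohello
print(constrained)              # {"a":1}
print(json.loads(constrained))  # {'a': 1}
```

The mask guarantees the output parses, but it says nothing about whether the content is any good, since the model only chooses among the tokens the grammar leaves open.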

>>twelft+08
That last link is interesting. See https://github.com/outlines-dev/outlines#using-context-free-... specifically:

    # ...
    sequence = generator("Write a formula that returns 5 using only additions and subtractions.")
    # It looks like Mistral is not very good at arithmetics :)
    print(sequence)
    # 1+3-2-4+5-7+8-6+9-6+4-2+3+5-1+1

Sure, that's "correct" per the definition of the grammar, but it's also one of the worst possible ways to get to the number 5 (and it doesn't even get there: the expression sums to 11).
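A quick way to check both halves of that, as a sketch: the regex below is a simplified stand-in for "integers joined by + or -", not the grammar from the outlines README, and the terms are summed by hand rather than passed to `eval()`.

```python
import re

expr = "1+3-2-4+5-7+8-6+9-6+4-2+3+5-1+1"

# Simplified stand-in for the grammar: integers joined by + or -.
is_valid = re.fullmatch(r"\d+(?:[+-]\d+)*", expr) is not None

# Split into signed terms and sum them.
terms = re.findall(r"[+-]?\d+", expr)
value = sum(int(t) for t in terms)

print(is_valid, value, len(terms))  # True 11 16
```

So the string is syntactically valid and the constraint did its job, but the 16-term expression evaluates to 11: a grammar constrains form, not correctness.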
