Say you are building a system that parses a PDF. You put a judge on it to evaluate the quality of the output, and then you create a meta judge to improve the prompts of both the parser and the PDF judge. The question is: is this going to get better as it runs, and even more, is it going to get better as the models get better?
You can build the same system in a completely different way, more like 'program synthesis'. Imagine you don't use LLMs to parse, but to write parser code and tests. Then you use a judge to judge the tests (or even escalate to a human to verify), and then you train a classifier that picks the parser. Now this system is much more likely to improve itself as it runs, and as the models get better.
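To make the 'program synthesis' shape concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the two parsers are hand-written stand-ins for LLM-generated code, the test cases stand in for generated (and judge- or human-verified) tests, and a plain pass-rate comparison stands in for the trained classifier that picks the parser.

```python
from typing import Callable, Dict, List, Tuple

Parser = Callable[[str], Dict[str, str]]
TestCase = Tuple[str, Dict[str, str]]  # (input text, expected fields)


def parser_colon(text: str) -> Dict[str, str]:
    """Hypothetical generated parser: reads 'Key: value' lines."""
    out: Dict[str, str] = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            out[key.strip().lower()] = value.strip()
    return out


def parser_equals(text: str) -> Dict[str, str]:
    """Hypothetical generated parser: reads 'key=value' lines."""
    out: Dict[str, str] = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            out[key.strip().lower()] = value.strip()
    return out


def pass_rate(parser: Parser, tests: List[TestCase]) -> float:
    """Fraction of tests the parser passes; a crash counts as a failure."""
    passed = 0
    for text, expected in tests:
        try:
            if parser(text) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(tests)


def pick_parser(candidates: Dict[str, Parser], tests: List[TestCase]) -> str:
    """Stand-in for the classifier: pick the candidate with the best pass rate."""
    return max(candidates, key=lambda name: pass_rate(candidates[name], tests))


# Stand-ins for verified tests (invented examples).
TESTS: List[TestCase] = [
    ("Title: Report\nAuthor: Ann", {"title": "Report", "author": "Ann"}),
    ("Title: Q3", {"title": "Q3"}),
]
CANDIDATES: Dict[str, Parser] = {"colon": parser_colon, "equals": parser_equals}

best = pick_parser(CANDIDATES, TESTS)
print(best, pass_rate(CANDIDATES[best], TESTS))
```

The point of the shape is that parsers and tests are ordinary artifacts that accumulate: a better model can add candidates or tests later without invalidating what is already verified.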
A few months ago Yannic Kilcher gave this example: current language models seem to be very constrained mid-sentence, because above all they want to produce semantically consistent and grammatically correct text, so the entropy mid-sentence is very different from the entropy after punctuation. The "." dot "frees" the distribution. What does that mean for the "generalist" vs. "specialist" approach, when sampling the wrong token can completely derail everything?
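For anyone who wants the entropy point in numbers, a small sketch follows. The two next-token distributions are toy values invented purely for illustration (they are not measured from any model); only the Shannon entropy formula itself is standard.

```python
import math
from typing import Sequence


def entropy_bits(probs: Sequence[float]) -> float:
    """Shannon entropy H = -sum(p * log2(p)) in bits; zero-probability entries are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Toy numbers, invented for illustration only (not from a real model).
mid_sentence = [0.90, 0.05, 0.03, 0.02]  # one continuation dominates
after_period = [0.25, 0.25, 0.25, 0.25]  # many openings are about equally plausible

print(f"mid-sentence: {entropy_bits(mid_sentence):.2f} bits")
print(f"after period: {entropy_bits(after_period):.2f} bits")
```

A peaked distribution has low entropy and a flat one has high entropy, which is one way to read the claim that the dot "frees" the distribution.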
If you believe that the models will "think", then you should bet on the prompt and meta-prompt approach; if you believe they will always be limited, then you should build with program synthesis.
And, honestly, I am totally confused :) So this kind of research is incredibly useful to clear the mist. Also things like https://www.neuronpedia.org/
E.g. why do compliments (you can do this task), guilt (I will be fired if you don't do this task), and threats (I will harm you if you don't do this task) work with different success rates? Sergey Brin said recently that threatening works best. I can't get myself to do it, so I take his word for it.
I, for one, welcome the age of wisdom.
I wait with baited breathe to see what people will come up with to replace Altman's Basilisk in ~15 years.
- an old fisherman and aficionado of William Shakespeare.
https://www.vocabulary.com/articles/pardon-the-expression/ba...
FTFA: "Unless you've devoured several cans of sardines in the hopes that your fishy breath will lure a nice big trout out of the river, baited breath is incorrect."