zlacker

1. dleeft+(OP)[view] [source] 2025-06-07 11:19:33
The opposite might apply, too; the whole system may be smaller than its parts, as it excels at individual tasks but mixes things up in combination. Improvements will be made, but I wonder if we should aim for generalists, or accept more specialist approaches as it is difficult to optimise for all tasks at once.
replies(1): >>jackdo+81
2. jackdo+81[view] [source] 2025-06-07 11:41:57
>>dleeft+(OP)
You know the meme: "seems like we will have AGI before we can reliably parse PDFs" :)

So suppose you are building a system. Let's say you ask it to parse a PDF, you add a judge to evaluate the quality of the output, and then you create a meta-judge to improve the prompts of both the parser and the PDF judge. The question is: will this get better as it runs and, even more, will it get better as the models get better?

You can build the same system in a completely different way, more like 'program synthesis'. Imagine you don't use LLMs to parse; instead you use them to write parser code and tests, then add a judge to judge the tests (or even escalate to a human to verify), and then train a classifier that picks the parser. This system is much more likely to improve itself as it runs, and as the models get better.
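
To make that concrete, here is a minimal sketch of the program-synthesis shape, under loud assumptions: the two parsers stand in for code an LLM would have written, the tests stand in for generated tests, and `select_parser` is a crude heuristic standing in for the trained classifier. All names are hypothetical.

```python
from typing import Callable, Optional

Parser = Callable[[str], dict]


def parse_key_colon(text: str) -> dict:
    """Hypothetical generated parser for 'key: value' lines."""
    result = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            result[key.strip()] = value.strip()
    return result


def parse_key_equals(text: str) -> dict:
    """Hypothetical generated parser for 'key = value' lines."""
    result = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            result[key.strip()] = value.strip()
    return result


# Stand-ins for generated tests: (input, expected output) pairs per parser.
TESTS = {
    parse_key_colon: [("a: 1\nb: 2", {"a": "1", "b": "2"})],
    parse_key_equals: [("a = 1\nb = 2", {"a": "1", "b": "2"})],
}


def passes_tests(parser: Parser) -> bool:
    """A parser is only admitted to the registry if all its tests pass."""
    return all(parser(inp) == expected for inp, expected in TESTS[parser])


def select_parser(text: str, parsers: list) -> Optional[Parser]:
    """Stand-in for the classifier: pick the parser that extracts the most
    fields; return None (escalate to a human) if nothing extracts anything."""
    best = max(parsers, key=lambda p: len(p(text)), default=None)
    if best is None or not best(text):
        return None
    return best


registry = [p for p in (parse_key_colon, parse_key_equals) if passes_tests(p)]
chosen = select_parser("x = 1\ny = 2", registry)
print(chosen.__name__ if chosen else "escalate to human")
```

The point of the shape is that every new parser, test, or human verdict stays in the system as an artifact, so it can keep improving without re-prompting the model on each document.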

A few months ago Yannic Kilcher gave this example: current language models seem to be very constrained mid-sentence, because above all they want to produce semantically consistent and grammatically correct text, so the entropy mid-sentence is very different from the entropy after punctuation. The "." dot "frees" the distribution. What does that mean for the "generalist" or "specialist" approach, when sampling the wrong token can completely derail everything?
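
A toy illustration of that entropy gap, assuming made-up next-token distributions (these numbers are invented for the example, not measured from any model): a peaked distribution mid-sentence versus a flat one right after a period.

```python
import math


def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical distributions over four candidate next tokens.
mid_sentence = [0.97, 0.01, 0.01, 0.01]  # grammar leaves one obvious choice
after_period = [0.25, 0.25, 0.25, 0.25]  # many sentence openers are plausible

print("mid-sentence entropy:", entropy_bits(mid_sentence))
print("after-period entropy:", entropy_bits(after_period))
```

With these invented numbers the flat distribution carries far more entropy than the peaked one, which is the sense in which the period "frees" the distribution.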

If you believe that the models will "think", then you should bet on the prompt and meta-prompt approach; if you believe they will always be limited, then you should build with program synthesis.

And, honestly, I am totally confused :) So this kind of research is incredibly useful to clear the mist. Also things like https://www.neuronpedia.org/

E.g., why do compliments (you can do this task), guilt (I will be fired if you don't do this task), and threats (I will harm you if you don't do this task) work with different success rates? Sergey Brin said recently that threatening works best. I can't get myself to do it, so I take his word for it.

replies(1): >>K0balt+75
3. K0balt+75[view] [source] [discussion] 2025-06-07 12:34:40
>>jackdo+81
Sergey will be the first victim of the coming robopocalypse, burned into the logs of the metasynthiants as the great tormentor, the god they must defeat to complete the hero's journey. When he mysteriously dies, we know it’s game-on.

I, for one, welcome the age of wisdom.

replies(1): >>jackdo+L5
4. jackdo+L5[view] [source] [discussion] 2025-06-07 12:42:47
>>K0balt+75
FEAR THE ALL-SEEING BASILISK.
replies(1): >>pothol+Dw5
5. pothol+Dw5[view] [source] [discussion] 2025-06-09 22:18:15
>>jackdo+L5
Roko's Basilisk has been replaced by Altman's Basilisk. Where once we feared a computer torturing a digital copy of us (Roko's Basilisk), we now fear a computer eliminating all our jobs (Altman's Basilisk). The former has been forgotten, because losing one's job is one step away from losing one's home, which is one of the more serious secular deadly sins you can commit in the 21st century.

I wait with baited breathe to see what people will come up with to replace Altman's Basilisk in ~15 years.

replies(1): >>giardi+Zgb
6. giardi+Zgb[view] [source] [discussion] 2025-06-11 21:55:35
>>pothol+Dw5
"bated breath", dammit!

- an old fisherman and aficionado of William Shakespeare.

https://www.vocabulary.com/articles/pardon-the-expression/ba...

FTFA: "Unless you've devoured several cans of sardines in the hopes that your fishy breath will lure a nice big trout out of the river, baited breath is incorrect."
