zlacker

[return to "OpenAI departures: Why can’t former employees talk?"]
1. thorum+Bu[view] [source] 2024-05-17 23:10:57
>>fnbr+(OP)
Extra respect is due to Jan Leike, then:

https://x.com/janleike/status/1791498174659715494

2. adamta+dH[view] [source] 2024-05-18 01:28:01
>>thorum+Bu
Reading that thread is really interesting to me. I can see how far we’ve come in just a couple of years. But I still can’t grasp how we’ll achieve AGI within any reasonable amount of time. It just seems like we’re missing some really critical… something…

Idk. Folks much smarter than I am seem worried, so maybe I should be too, but it just seems like such a long shot.

3. killer+Hk1[view] [source] 2024-05-18 11:58:56
>>adamta+dH
I have a theory about why people end up with wildly different estimates...

Given that the model is probabilistic and does many things in parallel, its output can be understood as a mixture, e.g. 30% trash, 60% rehashed training material, 10% reasoning.

People probe the model in different ways, see different results, and draw different conclusions.
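
To make that concrete, here's a toy sketch of what I mean (my own made-up simulation, just using the rough 30/60/10 split above as mixture weights; the function names and sample sizes are arbitrary): probing the same model with a handful of prompts can leave different people with very different impressions.

    # Toy illustration (not from the thread, made-up numbers): treat each model
    # response as a draw from a fixed mixture, and see how different "probes"
    # (small samples) suggest different pictures of the same model.
    import random

    MIXTURE = [("trash", 0.30), ("rehash", 0.60), ("reasoning", 0.10)]

    def sample_response(rng: random.Random) -> str:
        """Draw one response category according to the assumed mixture weights."""
        r = rng.random()
        cumulative = 0.0
        for category, weight in MIXTURE:
            cumulative += weight
            if r < cumulative:
                return category
        return MIXTURE[-1][0]

    def probe(n_prompts: int, seed: int) -> dict:
        """Simulate one person's probe: n_prompts responses, tallied by category."""
        rng = random.Random(seed)
        counts = {category: 0 for category, _ in MIXTURE}
        for _ in range(n_prompts):
            counts[sample_response(rng)] += 1
        return counts

    # Three people probing the *same* model with 10 prompts each can walk away
    # with very different tallies of how often it "reasons".
    for seed in (1, 2, 3):
        print(f"probe #{seed}: {probe(10, seed)}")

With only ~10 prompts per probe the sampled proportions swing a lot, which is basically the point: small, cherry-picked probes of the same underlying mixture support wildly different narratives.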

E.g. somebody who assumes an AI should have impeccable logic will find "trash" content (e.g. an incorrectly retrieved memory) and will declare that the whole AI thing is overhyped bullshit.

Other people might call the model a "stochastic parrot", as they recognize that it basically just interpolates between parts of the training material.

Finally, people who want to probe reasoning capabilities might find them among the trash. E.g. people found that LLMs can evaluate non-trivial Python code as long as the code prints intermediate results as it goes: https://x.com/GrantSlatton/status/1600388425651453953
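
To be clear, I don't remember the exact snippet from that tweet, but it was something in this spirit: a small program that prints every intermediate value, so the model can externalize its working state step by step instead of having to simulate the whole loop in one shot. (Hypothetical example, not the one from the link.)

    # Hypothetical example of the kind of snippet meant here (not the one from
    # the linked tweet). Asking an LLM to "execute" this and predict the output
    # works much better because every iteration prints its intermediate state.

    def collatz_steps(n: int) -> int:
        """Count Collatz steps to reach 1, printing each intermediate value."""
        steps = 0
        while n != 1:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
            print(f"step {steps}: n = {n}")
        return steps

    print("total steps:", collatz_steps(7))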

I interpret "feel the AGI" (an Ilya Sutskever slogan, now repeated by Jan Leike) as a focus on these capabilities rather than on the mistakes the model makes. E.g. if we go from 0.1% reasoning to 1% reasoning, that's a 10x gain in capabilities, while to an outsider it might still look like "it's 99% trash".

In any case, I'd rather trust the intuition of people like Ilya Sutskever and Jan Leike. They aren't trying to sell anything, and overhyping the tech is not in their interest.

Regarding "missing something really critical": it's obvious that human learning is much more efficient than NN learning, so there's some algorithm people are missing. But is it really required for AGI?

And regarding "it cannot reason": I've seen LLMs do rather complex stuff that is almost certainly not in the training set. What is that, if not reasoning? It's hard to take "it cannot reason" seriously from people
