Lol ok. Still human level. And GPT-4 is way above average in most tasks.
>Computers have been doing arithmetic at well above "average human level" since they were first invented.
Cool. That's what the "general" in AGI is about. GPT-4 is very general.
>The premise of AGI isn't that it can do something better than people, it's that it can do everything at least as well.
As well as what kind of people? Experts? That was not the premise of AGI when the term was coined, or for a long time afterwards. Goalposts have shifted (as they often do in this field) so that's what the term seems to mean now, but AGI meant artificial and generally intelligent, a bar that has already been passed.
There's no difference between your definition of AGI, which is supposed to surpass experts in every field, and superintelligence.
It has access to a lot of information that most humans don't have memorized. It's a better search engine than most humans. And it can format that information into natural language.
But can it drive a car? If given an incentive to not confabulate and the knowledge that its statements are being verified, can it achieve that as consistently as the median human?
If you start by giving it a simple instruction with stark consequences for not following it, can it continue to register the importance of that instruction even after you give it a lot more text to read?
> As well as what kind of people? Experts?
Experts are just ordinary people with specific information. You're giving the specific information to the AI, aren't you? It's in the training data.
> There's no difference between your definition of AGI, which is supposed to surpass experts in every field, and superintelligence.
That's because there is no difference between them. Superintelligence is achievable just by making general intelligence faster. If you have AGI and can make it go faster by throwing more compute hardware at it, then you have superintelligence.
It's not just about knowledge.
Lots of papers showing strong reasoning across various reasoning types. A couple of papers demonstrating the development of world models too.
>It's a better search engine than most humans. And it can format that information into natural language.
Not how this works. They aren't search engines, and their performance parity with people isn't limited to knowledge tasks alone.
>But can it drive a car? If given an incentive to not confabulate and the knowledge that its statements are being verified, can it achieve that as consistently as the median human?
Can a blind man drive a car? A man with no hands?
>If you start by giving it a simple instruction with stark consequences for not following it, can it continue to register the importance of that instruction even after you give it a lot more text to read?
Lol yes
>Experts are just ordinary people with specific information. You're giving the specific information to the AI, aren't you? It's in the training data.
No. Experts are people with above-average aptitude in a given domain. It's not just about knowledge. Many people try and fail to become experts in any given domain.
> That's because there is no difference between them. Superintelligence is achievable just by making general intelligence faster.
That's not how intelligence works. Dumb thinking sped up is just more dumb thinking but faster.
Actual reasoning, or reconstruction of existing texts containing similar reasoning?
> Not how this works. They aren't search engines, and their performance parity with people isn't limited to knowledge tasks alone.
It kind of is how this works, and most of its ability to beat average humans comes from knowledge tasks.
> Can a blind man drive a car? A man with no hands?
Lack of access to cameras or vehicle controls isn't why it can't drive a car.
> Lol yes
The existence of numerous ChatGPT jailbreaks is evidence to the contrary.
> No. Experts are people with above-average aptitude in a given domain. It's not just about knowledge. Many people try and fail to become experts in any given domain.
Many people are of below average intelligence, or give up when something is hard but not impossible.
> That's not how intelligence works. Dumb thinking sped up is just more dumb thinking but faster.
If you have one machine that will make one attempt to solve a problem a day and succeeds 90% of the time and another that will make a billion attempts to solve a problem a second and succeeds 10% of the time, which one has solved more problems by the end of the week?
Average thinking sped up is above average.
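To put rough numbers on that (a back-of-envelope sketch; the machines, the success rates, and the one-attempt-per-problem assumption are all hypothetical, taken from the example above):

```python
# Expected problems solved in a week by the two hypothetical machines,
# assuming independent attempts and one attempt per fresh problem.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800

slow_attempts = 7                                 # one attempt per day
slow_solved = slow_attempts * 0.90                # 90% success -> ~6.3 solved

fast_attempts = 1_000_000_000 * SECONDS_PER_WEEK  # a billion attempts/second
fast_solved = fast_attempts * 0.10                # 10% success -> ~6.0e13 solved

print(f"slow machine: ~{slow_solved:.1f} problems solved")
print(f"fast machine: ~{fast_solved:.1e} problems solved")
```

At those rates the fast machine ends the week roughly thirteen orders of magnitude ahead, which is the point of the analogy.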
The papers were linked in another comment. Three of them don't even involve testing on an existing dataset. So yeah, actual.
For the world model papers:
https://arxiv.org/abs/2210.13382
https://arxiv.org/abs/2305.11169
>Lack of access to cameras or vehicle controls isn't why it can't drive a car.
It would be best to wait until what you say can be evaluated. That is your hunch, not fact.
>The existence of numerous ChatGPT jailbreaks is evidence to the contrary.
No, it's not. People fall for social engineering and do what you ask. If you think people can't be easily derailed, boy do I have a bridge to sell you.
>Many people are of below average intelligence, or give up when something is hard but not impossible.
Ok. Doesn't help your point. And many above-average people don't reach expert level either. If you want to rationalize all that as "gave up when it wasn't impossible", go ahead lol, but reality paints a very different picture.
>If you have one machine that will make one attempt to solve a problem a day and succeeds 90% of the time and another that will make a billion attempts to solve a problem a second and succeeds 10% of the time, which one has solved more problems by the end of the week?
"Problems" aren't made equal. Practically speaking, it's very unlikely the billion per second thinker is solving any of the caliber of problems the one attempt per day is solving. Solving more "problems" does not make you a super intelligence.
For anyone following along, they are in my sibling comment. Linked papers here[0]. The exact same conversation is happening there, but sourced.
> Three of them don't even involve testing on an existing dataset
Specifically, I address this claim and bring strong evidence for why you should doubt it, especially this specific wording. The short version is that when you scrape the entire internet for your training data, you have a lot of overlap, and you can't confidently call these evaluations "zero-shot." All experiments performed in the linked works use datasets that are not significantly different from data found in the training set. For those that are "hand written," see my complaints (linked) about HumanEval.
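For concreteness, contamination analyses typically run an n-gram overlap check between eval items and the training corpus. A minimal sketch of that idea (the 13-gram window, the stand-in corpora, and the function names are my own illustration, not any paper's exact procedure):

```python
# Minimal n-gram contamination check: flag an eval item if any of its
# n-grams also occurs verbatim in the training corpus.
def ngrams(text: str, n: int = 13) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(eval_item: str, train_ngrams: set, n: int = 13) -> bool:
    return bool(ngrams(eval_item, n) & train_ngrams)

# Hypothetical usage with stand-in documents:
train_docs = ["some scraped web page text repeated over and over " * 3]
train_ngrams = set().union(*(ngrams(d) for d in train_docs))
eval_items = ["a held-out benchmark question"]
print([x for x in eval_items if is_contaminated(x, train_ngrams)])
```

With web-scale training sets this kind of overlap is very hard to rule out, which is why "zero-shot" claims deserve scrutiny.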
LLMs aren't even the right kind of thing to drive a car. We have AIs that attempt to drive cars and have access to cameras and vehicle controls, and they still crash into stationary objects.
> No, it's not. People fall for social engineering and do what you ask. If you think people can't be easily derailed, boy do I have a bridge to sell you.
Social engineering works because most human interactions aren't malicious and the default expectation is that any given one won't be.
That's a different thing than if you explicitly point out that this text in particular is confirmed malicious and you must not heed it, and then it immediately proceeds to do it anyway.
And yes, you can always find that one guy, but that's this:
> Many people are of below average intelligence
It has to beat the median because if you go much below it, there are people with brain damage. Scoring equal to someone impaired or disinclined to make a minimal effort isn't a passing grade.
> "Problems" aren't made equal. Practically speaking, it's very unlikely the billion per second thinker is solving any of the caliber of problems the one attempt per day is solving.
The speed is unrelated to the difficulty. You get from one a day to a billion a second by running it on a thousand supercomputers instead of a single dated laptop.
So the percentages are for problems of equal difficulty.
This is infinite monkeys on infinite typewriters. Except that we don't actually have infinite monkeys or infinite typewriters, so an AI which is sufficiently terrible can't be made great by any feasible amount of compute resources. Whereas one which is kind of mediocre and fails 90% of the time, or even 99.9% of the time, can be made up for in practice with brute force.
But there are still problems that ChatGPT can't even solve 0.1% of the time.
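To make the brute-force arithmetic concrete (my numbers; assuming independent retries on a single fixed problem): the chance of at least one success in n attempts is 1 - (1 - p)^n, which rescues a 0.1% solver but never a 0% one.

```python
# P(at least one success in n independent attempts) = 1 - (1 - p)^n.
def p_at_least_one(p: float, n: int) -> float:
    return 1.0 - (1.0 - p) ** n

print(p_at_least_one(0.001, 10_000))  # ~0.99995: mediocre + brute force wins
print(p_at_least_one(0.0, 10**12))    # 0.0: no number of attempts helps
```

Which is exactly why the problems with an effective 0% success rate are the interesting ones.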