zlacker

I was hopeful for a private-industry approach to AI safety, but it looks unlikely now, and due to the slow pace of state investment in public AI R&D, all approaches to AI safety look unlikely now.

Safety research on toy models will continue to provide developments, but the industry expectation appears to be that emergent properties put a low ceiling on what can be learned about safety without researching cutting-edge models.

Altman touted the governance structure of OpenAI as a mechanism for ensuring the organisation's prioritisation of safety, but the reports of internal reallocation away from safety towards keeping ChatGPT running under load concern me. Now that the board has demonstrated that it was technically capable but insufficiently powerful to keep these interests in line, it seems unclear how any safety-oriented organisation, including Anthropic, could avoid the accelerationist influence of funders.

replies(4): >>throwu+u2 >>abra0+2f >>mymuse+Uv >>sgt101+E21

>>MattHe+(OP)
Easy, don’t be incompetent and don’t abuse your power for personal gain. People aren’t as dumb as you think they are and they will see right through that bullshit and quit rather than follow idiot tyrants.

>>MattHe+(OP)
More effort spent on early commercialization like keeping ChatGPT running might mean less effort on cutting edge capabilities. Altman was never an AI safety person, so my personal hope is that Anthropic avoids this by having higher quality leadership.

>>MattHe+(OP)
I would like to know the model that isn’t a “toy model”.

>>MattHe+(OP)
There are no emergent properties, just a linear increase in knowledge that can be retrieved.

- It can't plan

- It can't do arithmetic

- It can't reason

- It can approximately retrieve knowledge with a natural language query (there are some issues with this, but it's very good)

- It can encode data into natural languages and other modalities

I'm not worried about it, I am worried about how badly people have misunderstood what it can do and then attempted to use it for things that matter.

But I'm not surprised.

replies(3): >>Davidz+sy1 >>zucker+HV1 >>quickt+B33

>>sgt101+E21
This is incorrect. For example, the ability to translate between languages is emergent. Also, GPT-4 can do arithmetic better than the average person, especially considering that it arrives at the result basically via intuition rather than an algorithm. Btw, just as an aside, the newer models can also write code to do certain tasks, like arithmetic.

replies(2): >>sgt101+WG3 >>james-+DC6

>>sgt101+E21
What is your definition of reasoning? In my mind, GPT-4 has some nascent reasoning abilities.

>>sgt101+E21
I don't think AI safetyists are worried about any model they have created so far. But if we are able to go from letter-soup "ooh look that almost seems like a sentence, SOTA!" to GPT-4 in 20 years, where will we go in the next 20? And at what point do they become powerful? Let alone all the crazy ways people are trying to augment them with RAG and function calls, get them to run on less computing power, and so on.

Also, being better than humans at everything is not a prerequisite for danger. Probably a scary moment is when it could look at a C (or Rust, C++, whatever) codebase, find an exploit, and then use that exploit as a worm, especially if it can do that on everyday hardware rather than top-end GPUs (either because the algorithms are made more efficient, or because every iPhone has a tensor unit).

>>Davidz+sy1
Language translation is due to the huge corpus of translations that it's trained on. Google Translate has been doing this for years. People don't apply softmax to their arithmetic. Again, code generation is approximate retrieval; it can't generate anything outside of its training distribution.
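To make the softmax point concrete, here is a minimal sketch. The candidate tokens and their scores are made up for illustration and do not come from any real model. The idea is only that a language model ends in a probability distribution over next tokens, so the "answer" to a sum is the most likely continuation rather than the result of a carry algorithm.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate next tokens and scores for the prompt "17 + 25 = ".
candidates = ["42", "41", "43", "32"]
logits = [4.0, 2.5, 2.0, 1.0]

probs = softmax(logits)

# Greedy decoding: pick the highest-probability token.
best = candidates[probs.index(max(probs))]

for token, p in zip(candidates, probs):
    print(f"{token}: {p:.3f}")
print("greedy pick:", best)
```

Wrong answers keep non-zero probability here, which is the contrast with doing the sum algorithmically.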

>>Davidz+sy1
Not necessarily; much smaller models like T5 which in some ways introduced instructions (not RLHF yet) did have to include specific instructions for useful translation - of similar format to those you find in large scale web translation data, but this is coincidental: you can finetune it with whatever instruction word you want to indicate translation - the point is, a much smaller model can translate.

The base non-RLHF GPT models could do translation if you prefixed the input with the target language and a semicolon, but they are only consistent above a certain number of parameters. GPT-2 didn't always get it right and of course had general issues with continuity. However, you could always do some parts of translation with older transformer models like BERT, especially multilingual ones.
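A rough sketch of the two prompting styles mentioned above, just to show that both are plain string prefixes. The `translate English to German:` form follows the task-prefix convention T5 was trained with. The base-model format is an assumption: the function name, the default separator and the layout are hypothetical, since the exact prompt isn't given here.

```python
def t5_style_prompt(text, source="English", target="German"):
    """Instruction-style task prefix in the T5 convention."""
    return f"translate {source} to {target}: {text}"


def base_model_prompt(text, target="German", separator=";"):
    """Hypothetical base-model format: target language, a separator, then the text.

    The separator and layout are assumptions for illustration only.
    """
    return f"{target}{separator} {text}"


sentence = "The house is small."
print(t5_style_prompt(sentence))
print(base_model_prompt(sentence))
```

Nothing in either prompt tells the model how to translate; it only marks which behaviour from the training data to continue with.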

Larger models across different from-base training runs show that they become more effective at translation at certain points, but I think this is about the capacity to store information, not emergence per se (if you understand my difference here). You've probably noticed, and it has always seemed to me, that 4B, 6B and 9B are the largest rough parameter sizes with 2020-style training setups at which you see the most general "appearance" of some useful behaviours that you could "glean" from web and book data that doesn't include instructions, while consistency seems to remain the domain of larger models or mixed expert models and lots of RLHF training/tricks. The easiest way to see this is to compare GPT-2 large, GPT-J and GPT-20B and see how well they perform at different tasks. However, the fact that it's about size in these GPTs, and yet smaller models (T5 instruction tuned / multilingual BERT) can perform at the same level on some tasks, implies that it is about what the model is focusing its learning on for the training task at hand, and controllable, rather than being innate at a certain parameter size. Language translations just do make up a lot of the data. I don't think it would emerge if you removed all cases of translation / multi-language inputs/outputs, definitely not at the same parameter size, even if you had the same overall proportion of languages in the training corpus, if that makes sense? It just seems too much an artefact of the corpus aligning with the task.

Likewise for code - GPT-4 generated code is not like arithmetic in the sense people might mean it for code (e.g. branching instructions / abstract syntax tree) - it's a fundamentally local, textual form of generation, which is why it can happily add illegal imports etc. to diffs (perhaps one day training will resolve this) - it doesn't have the AST or compiler, or much consistent behaviour, to imply it deeply understands, as it writes the code, what could occur.
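As an illustration of the "illegal imports" point, here is a minimal sketch of the kind of structural check an AST-aware tool can do and that plain text generation does not do by itself. The helper name `unresolved_imports` and the sample snippet are hypothetical. It only checks absolute imports against modules installed in the current environment.

```python
import ast
import importlib.util


def unresolved_imports(source):
    """Return top-level module names imported by `source` that cannot be found."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            top = name.split(".")[0]  # only resolve the top-level package
            if importlib.util.find_spec(top) is None:
                missing.append(top)
    return missing


# Hypothetical model output: plausible-looking text with one import that doesn't exist.
generated = "import json\nimport totally_made_up_pkg\nfrom os import path\n"
print(unresolved_imports(generated))
```

The snippet is fine as text, and only the parse-and-resolve step notices that one module isn't there.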

However, if recent reports about arithmetic being an area of improvement are true, I am very excited, as a lot of what I wrote above will have to be reconceptualised... and that is the most exciting scenario...