zlacker

[parent] [thread] 1 comments
1. Medium+(OP)[view] [source] 2023-11-18 06:47:43
> We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as "Uriah Hawthorne is the composer of 'Abyssal Melodies'" and showing that they fail to correctly answer "Who composed 'Abyssal Melodies?'". The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation.

This just proves that the LLMs available to them, with the training and augmentation methods they employed, aren't able to generalize. This doesn't prove that it is impossible for future LLMs or novel training and augmentation techniques will be unable to generalize.

replies(1): >>visarg+W4
2. visarg+W4[view] [source] 2023-11-18 07:34:48
>>Medium+(OP)
No, if you read this article it shows there were some issues with the way they tested.

> The claim that GPT-4 can’t make B to A generalizations is false. And not what the authors were claiming. They were talking about these kinds of generalizations from pre and post training.

> When you divide data into prompt and completion pairs and the completions never reference the prompts or even hint at it, you’ve successfully trained a prompt completion A is B model but not one that will readily go from B is A. LLMs trained on “A is B” fail to learn “B is A” when the training date is split into prompt and completion pairs

Simple fix - put prompt and completion together, don't do gradients just for the completion, but also for the prompt. Or just make sure the model trains on data going in both directions by augmenting it pre-training.

https://andrewmayne.com/2023/11/14/is-the-reversal-curse-rea...

[go to top]