zlacker

1. Me1000+(OP) 2023-12-21 01:57:23
It trained the model on a lot of data to write code instead (probably sandwiched between some special tokens like [run-python]). The LLM runner then takes the code, runs it in a sandbox, feeds the output back into the prompt, and lets GPT continue generating. TL;DR: it trained the model to write code for math problems instead of trying to solve them itself.
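
A minimal sketch of that loop, assuming a [run-python]/[/run-python] token pair and a toy subprocess "sandbox" (the token names, runner, and sandboxing details here are illustrative, not the actual implementation):

    import re
    import subprocess

    RUN_OPEN, RUN_CLOSE = "[run-python]", "[/run-python]"

    def generate(prompt: str) -> str:
        """Stand-in for the model call; a real runner would invoke the LLM here."""
        raise NotImplementedError

    def run_sandboxed(code: str) -> str:
        # Toy "sandbox": a separate interpreter with a timeout. A real runner
        # would isolate this much more aggressively.
        proc = subprocess.run(
            ["python3", "-c", code], capture_output=True, text=True, timeout=5
        )
        return proc.stdout + proc.stderr

    def solve(question: str, max_rounds: int = 4) -> str:
        prompt = question
        for _ in range(max_rounds):
            completion = generate(prompt)
            match = re.search(
                re.escape(RUN_OPEN) + r"(.*?)" + re.escape(RUN_CLOSE),
                completion, re.DOTALL,
            )
            if match is None:  # no code block: the model answered directly
                return completion
            output = run_sandboxed(match.group(1))
            # Feed the execution result back in and let the model keep going.
            prompt += completion + "\n[output]\n" + output + "\n[/output]\n"
        return prompt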
replies(1): >>averev+no
2. averev+no 2023-12-21 06:35:24
>>Me1000+(OP)
It also has some training on problem decomposition. Many smaller models fail before they ever write the code; they fail at parsing the question.

You can ask them to serialize a problem in Prolog and see exactly where their understanding breaks; this is Open Hermes 2.5: https://pastebin.com/raw/kr62Hybq
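
A rough way to run that kind of probe yourself, assuming the model is served behind a local OpenAI-compatible endpoint (the base_url, model name, and example problem below are placeholders, not taken from the linked paste):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    PROBLEM = (
        "Alice has twice as many apples as Bob. Together they have 18 apples. "
        "How many apples does Bob have?"
    )

    prompt = (
        "Serialize the following problem as Prolog facts and a query. "
        "Do not solve it.\n\n" + PROBLEM
    )

    resp = client.chat.completions.create(
        model="openhermes-2.5-mistral-7b",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(resp.choices[0].message.content)  # inspect where the parsing breaks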
