zlacker

[return to "LLMs cannot find reasoning errors, but can correct them"]
1. ilaksh+Hm[view] [source] 2023-11-20 21:01:43
>>koie+(OP)
I was just testing Bard with some very simple coding exercises and it did well.

I noticed that they automatically create at least three other draft responses.

I assume that this is a technique that allows them to try multiple times and then select the best one.

Just mentioning it because it seems like another example of not strictly "zero-shot"-ing a response, which seems important for getting good results with these models.

I'm guessing they use batching for this. I wonder if it might become more common to run multiple inference subtasks for the same main task inside of a batch, for purposes of self-correcting agent swarms or something. The outputs from step 1 are reviewed by the group in step 2, then they try again in step 3.

I guess that only applies for a small department where there is frequently just one person using it at a time, so there's spare batch capacity for the extra drafts.
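
To make the batching idea concrete, here's a rough sketch of the general technique (just an illustration, not how Bard actually does it; the model name and the "pick the best" heuristic are placeholders), using Hugging Face sampling to get several drafts from one batched call:

    # Sketch only: batch-sample N drafts for one prompt, then "review" them.
    # gpt2 and the length heuristic are placeholders, not anything Bard uses.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Write a Python function that reverses a string."
    inputs = tokenizer(prompt, return_tensors="pt")

    # One batched generate() call produces three independent drafts.
    outputs = model.generate(
        **inputs,
        do_sample=True,
        num_return_sequences=3,
        max_new_tokens=128,
        pad_token_id=tokenizer.eos_token_id,
    )
    drafts = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

    # "Review" step: could be a reward model or a second LLM pass that
    # critiques the drafts; here it's just a dumb heuristic.
    best = max(drafts, key=len)
    print(best)

The step 1/2/3 loop above would just wrap this in a critique-then-regenerate loop instead of a one-shot selection.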

2. erhaet+VA[view] [source] 2023-11-20 22:05:57
>>ilaksh+Hm
I don't like this. It forces me to read 2 responses instead of 1 so that I can help train their LLM. ChatGPT and Bard already have regenerate buttons if I don't like their response; it doesn't need to be that in my face.
3. moritz+dD[view] [source] 2023-11-20 22:19:08
>>erhaet+VA
I think there is an argument that it would be beneficial for this to be common, despite the cognitive burden.

It forces you to remind yourself of the stochastic nature of the model and RLHF; maybe the data even helps to improve the latter.

I liked this trait of Bard from the start and hope they keep it.

It provides a sense of agency and reminds you not to anthropomorphize the transformer chatbot too much.
