>>koie+(OP)
I wonder if separate LLMs can find each other's logical mistakes. If I ask llama to find the logical mistake in Yi's output, would that work better than llama finding a mistake in llama's own output?
A logical mistake might stem from a blind spot inherent to that model, a blind spot that might not be present in all models.
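Rough sketch of what I mean, assuming both models sit behind local OpenAI-compatible servers (the ports, the question, and the critique prompt are all made up for illustration):
```python
import requests

YI_URL = "http://localhost:8001/v1/chat/completions"     # assumed Yi endpoint
LLAMA_URL = "http://localhost:8002/v1/chat/completions"  # assumed llama endpoint

def chat(url: str, prompt: str) -> str:
    """One-shot chat completion against a local OpenAI-compatible server."""
    resp = requests.post(url, json={
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "temperature": 0.7,
    })
    return resp.json()["choices"][0]["message"]["content"]

question = ("A bat and a ball cost $1.10 in total; the bat costs $1.00 more "
            "than the ball. How much does the ball cost?")

# Yi answers, llama only critiques. llama never sees its own reasoning,
# so its blind spots don't have to line up with Yi's.
yi_answer = chat(YI_URL, question)
critique = chat(LLAMA_URL,
    f"Question: {question}\n"
    f"Another model answered: {yi_answer}\n"
    "Point out any logical mistake in that answer, or say 'no mistake found'.")

print(critique)
```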
>>EricMa+gk
Parsing is faster than generating, so having a small model produce the whole output and then having Goliath produce only a single-token "good/bad" evaluation would be faster than having Goliath produce everything. This would be the extreme, ad hoc, iterative version of speculative decoding, which is already a thing and would probably give the best compromise.
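Something like this, as a rough sketch (the model repos, the judging prompt, and the resampling loop are placeholders, not anything tested): the small model pays the per-token generation cost, and the big model spends a single forward pass comparing its logits for "good" vs "bad".
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

draft_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder small drafter
judge_name = "alpindale/goliath-120b"              # placeholder big judge

draft_tok = AutoTokenizer.from_pretrained(draft_name)
draft_lm = AutoModelForCausalLM.from_pretrained(
    draft_name, torch_dtype=torch.float16, device_map="auto")

judge_tok = AutoTokenizer.from_pretrained(judge_name)
judge_lm = AutoModelForCausalLM.from_pretrained(
    judge_name, torch_dtype=torch.float16, device_map="auto")

def draft_answer(question: str, max_new_tokens: int = 256) -> str:
    """Small model does the expensive part: generating the whole answer."""
    inputs = draft_tok(question, return_tensors="pt").to(draft_lm.device)
    out = draft_lm.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True)
    return draft_tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

def judge_answer(question: str, answer: str) -> bool:
    """Big model does one cheap forward pass and emits a single-token verdict."""
    prompt = (
        f"Question: {question}\n"
        f"Proposed answer: {answer}\n"
        "Is the proposed answer logically correct? Reply with one word, good or bad:"
    )
    inputs = judge_tok(prompt, return_tensors="pt").to(judge_lm.device)
    with torch.no_grad():
        logits = judge_lm(**inputs).logits[0, -1]  # next-token distribution
    good_id = judge_tok(" good", add_special_tokens=False).input_ids[-1]
    bad_id = judge_tok(" bad", add_special_tokens=False).input_ids[-1]
    return logits[good_id] > logits[bad_id]

question = "If all squares are rectangles and this shape is a square, is it a rectangle?"
for _ in range(4):  # resample drafts until the judge accepts one
    answer = draft_answer(question)
    if judge_answer(question, answer):
        print(answer)
        break
```
The resample-until-accepted loop is the crude, per-response version of what speculative decoding does per token, where the big model verifies the small model's draft tokens in parallel instead of judging a finished answer.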