Your post shows how the model can correct a reasoning error. That is different from finding an error when it isn't pointed out, which is why the title of this post is "LLMs cannot find reasoning errors, but can correct them". Your use of the phrasing "find the logical error" doesn't contradict the title.
> The conclusion "My car is a fruit" is not logically valid. This is an example of the fallacy of the undistributed middle. The logic goes as follows:
> 1. All apples are red. (Premise)
> 2. All apples are fruit. (Premise)
> 3. My car is red. (Premise)
> 4. Therefore, my car is a fruit. (Conclusion)
> The fallacy arises because the premises do not establish a shared property between "red things" and "fruit" in a way that would include the car. Just because both apples and the car share the property of being red, it does not mean they share all properties of apples, such as being a fruit.
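For what it's worth, here is the structure in plain predicate-logic notation (my own sketch, not part of ChatGPT's answer), which makes the gap easier to see:

```latex
% A minimal sketch (my formalization, not from the quoted answer) of why the
% syllogism is invalid: the premises are satisfiable while the conclusion fails.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
\[
\begin{aligned}
&\forall x\,\bigl(\mathrm{Apple}(x)\rightarrow \mathrm{Red}(x)\bigr)   && \text{all apples are red}\\
&\forall x\,\bigl(\mathrm{Apple}(x)\rightarrow \mathrm{Fruit}(x)\bigr) && \text{all apples are fruit}\\
&\mathrm{Red}(c)                                                       && \text{my car $c$ is red}\\
&\mathrm{Fruit}(c)                                                     && \text{claimed conclusion; does not follow}
\end{aligned}
\]
% Counterexample: interpret Apple = Fruit = \{a\} and Red = \{a, c\}. Every
% premise is true in that model, yet Fruit(c) is false, which is exactly the
% fallacy of the undistributed middle.
\end{document}
```

Nothing in the premises ties "red" back to "fruit" except through "apple", and the car is never shown to be an apple.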
I Googled that exact phrase and got solutions. A logic problem that can be solved by a search engine isn't a valid example: the LLM knows it is a logic puzzle just from how you phrased it, the same way Google knows it is a logic puzzle.
And no, making tiny alterations to it until you no longer get any Google hits isn't proof that ChatGPT can do logic; it is proof that ChatGPT can parse general structure and find patterns better than a search engine can. You need logic problems that can't easily be mapped onto standard problems for which there are tons of examples in the wild.
It can't "reason things through", it just builds logic-like patterns based on the distillation of the work of other minds which did reason -- which works about 80% of the time, but when it fails it can't retrace its steps.
Even a really "stupid" human (c'est moi) can be made to work through and find their errors when given guidance by a patient teacher. In my experience, dialectical guidance actually makes ChatGPT worse.
Could you provide an actual example that can't be Googled verbatim and that would test this properly?
Can you show "the" implementation of "can do logic"?
Is it possible to demonstrate that it can do logic?