Flash answered correctly in ~2 seconds at most. Pro gave a wildly wrong answer after thinking and elaborating for ~5 minutes.
Flash also used to give a wrong answer for the same string, but it has since improved.
Prompt was the same: "Hey, can you decode $BASE64_string?"
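For reference, base64 decoding is deterministic and trivially checkable outside the model, so there's an exact ground truth to compare against. A minimal sketch in Python (the string below is a hypothetical example, not the one I actually tested):

```python
import base64

# Hypothetical example string; substitute the actual $BASE64_string being tested.
encoded = "SGVsbG8sIHdvcmxkIQ=="
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # -> "Hello, world!"
```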
I have no further comments.
Realistically there are many problems that non-reasoning models do better on, especially when the answer can't be reached through a reasoning process, like recalling internal knowledge.
You can try to teach the model the concept of a problem where thinking will likely steer it away from the right answer, but at some point it becomes like the halting problem... how does the model reliably think its way into the realization that a given problem is too complex to be thought through?