Gemini 2.5 has since been superseded by 3.0, which is less likely to need hints. 2.5 was not as strong as the contemporary GPT model, but 3.0 with Pro Thinking mode enabled is up there with the best.
Finally, saying, "Well, they were given some hints" is like me saying, "LOL, big deal, I could drag a Tour peloton up the Col du Galibier if I were on the same drugs Lance was using."
No, in fact I could do no such thing, drugs or no drugs. Similarly, a model that can't legitimately reason will not be able to solve these types of problems, even if given hints.