It may not be as evident now as it was with earlier models. The models will fabricate preconditions needed to output the final answer it "wanted".
I ran into this when using quasi least-to-most style structured output.