Gemini 3 Pro: the frontier of vision AI

>>xnx+(OP)
Since I think it's interesting to highlight the jagged intelligence, I have a simple word search puzzle [0] that Nano Banana Pro still struggles to solve correctly. Gemini 3 Pro with Code Execution is able to one-shot the problem and find the positions of each word (this is super impressive! One year ago it wasn't possible), but Nano Banana Pro fails to highlight the words correctly.

Here's the output from two tests I ran:

1. Asking Nano Banana Pro to solve the word search puzzle directly [1].

2. Asking Nano Banana Pro to highlight each word on the grid, with the position of every word included as part of the prompt [2].

The fact that it gets 2 words correct demonstrates meaningful progress, and it seems like we're really close to having a model that can one-shot this problem soon.

There's actually a bit of nuance required to solve this puzzle correctly which an older Gemini model struggled to do without additional nudging. You have to convert the grid or word list to use matching casing (the grid uses uppercase, the word list uses lowercase), and you need to recognize that "soup mix" needs to have the space removed when doing the search.
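For illustration, here is a minimal sketch of the normalization and search described above. It is not the code Gemini produced; the grid, the word list, and the function names are all made up for the example, and it assumes words can run in any of the eight straight-line directions.

```python
from typing import List, Optional, Tuple

# Eight straight-line directions as (row step, column step).
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

# (start row, start column, row step, column step)
Placement = Tuple[int, int, int, int]

# Hypothetical grid, not the puzzle from the linked image.
EXAMPLE_GRID = [
    "SOUPMIX",
    "AQQQQQB",
    "LQQQQQR",
    "TQQQQQE",
    "QQQQQQA",
    "QQQQQQD",
]


def normalize(text: str) -> str:
    """Uppercase and strip all whitespace, so "soup mix" becomes "SOUPMIX"."""
    return "".join(text.split()).upper()


def _matches(rows: List[str], target: str, r: int, c: int, dr: int, dc: int) -> bool:
    """Check whether target lies along direction (dr, dc) starting at (r, c)."""
    for i, ch in enumerate(target):
        rr, cc = r + i * dr, c + i * dc
        if rr < 0 or rr >= len(rows) or cc < 0 or cc >= len(rows[rr]):
            return False
        if rows[rr][cc] != ch:
            return False
    return True


def find_word(grid: List[str], word: str) -> Optional[Placement]:
    """Return the first placement of word in grid, or None if it is absent."""
    target = normalize(word)
    if not target:
        return None
    # Normalize the grid the same way as the word so casing always matches.
    rows = [normalize(row) for row in grid]
    for r, row in enumerate(rows):
        for c in range(len(row)):
            for dr, dc in DIRECTIONS:
                if _matches(rows, target, r, c, dr, dc):
                    return (r, c, dr, dc)
    return None


if __name__ == "__main__":
    for word in ["soup mix", "salt", "bread", "pizza"]:
        print(word, "->", find_word(EXAMPLE_GRID, word))
```

Skipping the `normalize` step is exactly what trips up the naive version: a lowercase "soup mix" never matches an uppercase grid, and the space never appears in the grid at all.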

[0] https://imgur.com/ekwfHrN

[1] https://imgur.com/1nybezU

[2] https://imgur.com/18mK5i5

>>TheAce+HW
If you're using, for instance, the Gemini web app, the system prompt may be biased toward generating an image as soon as you mention creating one. It may work better to start with a regular chat prompt, make sure you're on Gemini 3 Pro Thinking, and then give it exactly what you usually would. You can tell it to create an image only after it has an answer to the question.

This may even work if you tell it to do all that prior to figuring out what to create for the image.
