zlacker

[return to "Gemini 3 Pro: the frontier of vision AI"]
1. aziis9+Xc1 2025-12-05 22:01:01
>>xnx+(OP)
> Pointing capability: Gemini 3 has the ability to point at specific locations in images by outputting pixel-precise coordinates. Sequences of 2D points can be strung together to perform complex tasks, such as estimating human poses or reflecting trajectories over time

Does anybody know how to prompt the model correctly for these tasks or, even better, can anyone point to some docs? The pictures with the pretty markers are appreciated, but that section is a bit vague and has no references.

2. theman+0A1 2025-12-06 00:47:06
>>aziis9+Xc1
Simon Willison has a good blog post on this: https://simonwillison.net/2024/Aug/26/gemini-bounding-box-vi...
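
For anyone who just wants to draw the markers, here is a rough sketch of the post-processing step. It assumes the convention documented for earlier Gemini models: boxes come back as [ymin, xmin, ymax, xmax] and points as [y, x], both normalized to a 0-1000 scale. I have not confirmed that Gemini 3 uses the same format, so treat the scale and the axis order as assumptions. The helper names are made up for illustration, and no API call is shown.

```python
# Hypothetical helpers: convert model-returned coordinates to pixel coordinates.
# ASSUMPTION: values are normalized to 0-1000 and ordered y-first, as described
# for earlier Gemini models. Verify against the current docs before relying on it.

def box_to_pixels(box, width, height, scale=1000):
    """Map [ymin, xmin, ymax, xmax] on a 0-`scale` grid to (left, top, right, bottom) pixels."""
    ymin, xmin, ymax, xmax = box
    return (
        round(xmin * width / scale),
        round(ymin * height / scale),
        round(xmax * width / scale),
        round(ymax * height / scale),
    )


def point_to_pixels(point, width, height, scale=1000):
    """Map a [y, x] point on a 0-`scale` grid to an (x, y) pixel position."""
    y, x = point
    return (round(x * width / scale), round(y * height / scale))


if __name__ == "__main__":
    # Example values are invented, not real model output.
    print(box_to_pixels([100, 200, 500, 800], width=2000, height=1000))
    print(point_to_pixels([500, 250], width=640, height=480))
```

The output tuples are in the (x, y) order most drawing libraries expect, so they can be passed straight to whatever you use to draw rectangles or dots.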