zlacker

1. jasonj+(OP)[view] [source] 2025-12-05 20:07:51
That is... astronomically different. Is GPT-5.1 downscaling and losing critical information or something? How could it be so different?
replies(3): >>ericd+4c >>zubiau+Wx >>energy+Qz
2. ericd+4c[view] [source] 2025-12-05 21:07:34
>>jasonj+(OP)
I got much better results on smallish UI elements in large screenshots with GPT by slicing the screenshot up manually and feeding the slices in one at a time. I think it does severely lossy downscaling.
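For anyone wanting to try the slicing approach, here is a minimal sketch of how the crop boxes could be computed. The function name `tile_boxes`, the 512 px tile size, and the 64 px overlap are illustrative assumptions, not anything the model or its API prescribes; the overlap is only there so a small UI element sitting on a seam appears whole in at least one slice.

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) crop boxes that cover the image.

    tile and overlap are hypothetical defaults chosen for illustration.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")
    step = tile - overlap

    def starts(length):
        # Start offsets along one axis; the last tile sits flush with the edge.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    for box in tile_boxes(1920, 1080):
        print(box)
```

Each box is in the (left, top, right, bottom) order that Pillow's `Image.crop` accepts, so the slices could be cut with `img.crop(box)` and sent one at a time.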
3. zubiau+Wx[view] [source] 2025-12-05 23:17:33
>>jasonj+(OP)
It has a rather poor max resolution. Higher-resolution images get tiled, up to a point. I think 512 x 512 is the max tile size and 2048 x 2048 the max canvas.
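To put rough numbers on that, here is a small sketch that takes the figures above at face value (512 x 512 tiles, 2048 x 2048 canvas; both are the commenter's recollection, not confirmed limits) and works out how far a screenshot would be shrunk and how many tiles it would occupy. The function name `fit_and_count_tiles` is made up for this example.

```python
import math


def fit_and_count_tiles(width, height, max_canvas=2048, tile=512):
    """Shrink an image to fit the canvas, then count the tiles it spans.

    max_canvas and tile are assumed values taken from the comment above.
    """
    longest = max(width, height)
    if longest <= max_canvas:
        scale, w, h = 1.0, width, height
    else:
        scale = max_canvas / longest
        # Integer arithmetic avoids float rounding surprises in the sizes.
        w = max(1, width * max_canvas // longest)
        h = max(1, height * max_canvas // longest)
    tiles = math.ceil(w / tile) * math.ceil(h / tile)
    return scale, (w, h), tiles


if __name__ == "__main__":
    print(fit_and_count_tiles(3840, 2160))
```

Under these assumptions a 3840 x 2160 screenshot is scaled by about 0.53 to 2048 x 1152 and spans 12 tiles, so text that was 12 px tall ends up around 6 px before the model sees it.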
4. energy+Qz[view] [source] 2025-12-05 23:28:22
>>jasonj+(OP)
This is my default explanation for visual impairments in LLMs: they're trying to compress the image into about 3000 tokens, so you're going to lose a lot in the name of efficiency.
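A back-of-the-envelope sketch of what that budget implies, assuming the roughly 3000 tokens are spread evenly over the image (real encoders may not work that way, and the 3000 figure is the estimate above, not a documented number). The helper name `pixels_per_token` is hypothetical.

```python
import math


def pixels_per_token(width, height, token_budget=3000):
    """Average pixel area per token, and the side of the equivalent square patch.

    Assumes a uniform split of token_budget over the whole image.
    """
    area = width * height / token_budget
    return area, math.sqrt(area)


if __name__ == "__main__":
    area, side = pixels_per_token(1920, 1080)
    print(f"{area:.1f} px per token, about a {side:.0f} px square patch")
```

For a 1920 x 1080 screenshot that comes to roughly 690 pixels per token, i.e. a patch about 26 px on a side, which is in the same range as a small icon or a couple of characters of UI text.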