zlacker

1. vunder+(OP) 2025-12-05 20:55:55
In fact, one of the tests I use as part of GenAI Showdown involves both parts of the puzzle: draw a maze with a clearly defined entrance and exit, along with a dashed line indicating the solution to the maze.

Only one model (gpt-image-1) out of the 18 tested managed to pass the test successfully. Gemini 3.0 Pro got VERY close.

https://genai-showdown.specr.net/#the-labyrinth

replies(1): >>daniel+81
2. daniel+81 2025-12-05 21:01:28
>>vunder+(OP)
super cool! Interesting note about Seedream 4 - do you think awareness of A* actually could improve the outcome? Like I said, I'm no AI expert, so my intuitions are pretty bad, but I'd suspect that image analysis + algorithmic pathfinding don't have much crossover in terms of training capabilities. But I could be wrong!
replies(1): >>vunder+H1
3. vunder+H1 2025-12-05 21:04:44
>>daniel+81
Great question. I do wish we had a bit more insight into the exact background "thinking" that was happening on systems like Seedream.

When you think about posing a task like "solve a visual image of a maze" to something like ChatGPT, there's a good chance it'll spin up a Python VM, threshold the image with something like OpenCV, and run a shortest-path style algorithm to try and solve it.
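
For the curious, here's a rough sketch of what that threshold-then-pathfind approach might look like. It's only an illustration, not what ChatGPT actually runs: the 5x5 `GRAY` grid is a made-up stand-in for a real image, `threshold` and `solve_maze` are hypothetical helper names, and plain breadth-first search stands in for whatever shortest-path algorithm a model would pick. On a real image you'd probably load and binarize with OpenCV (`cv2.imread`, `cv2.threshold`) instead of the list comprehension here.

```python
from collections import deque

def threshold(gray, cutoff=128):
    """Binarize a grayscale grid: True = open corridor (bright), False = wall (dark)."""
    return [[px >= cutoff for px in row] for row in gray]

def solve_maze(open_cells, start, goal):
    """Breadth-first search over 4-connected open cells.

    Returns the shortest path as a list of (row, col) tuples, or None if
    the goal cannot be reached from the start.
    """
    rows, cols = len(open_cells), len(open_cells[0])
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and open_cells[nr][nc] and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# Hypothetical 5x5 "image": 255 = white corridor, 0 = black wall.
GRAY = [
    [255,   0, 255, 255, 255],
    [255,   0, 255,   0, 255],
    [255, 255, 255,   0, 255],
    [  0,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]

# Entrance at the top-left cell, exit at the bottom-left cell.
path = solve_maze(threshold(GRAY), start=(0, 0), goal=(4, 0))
```

The returned `path` is the list of cells you'd then draw as the dashed solution line. BFS is enough here because every step costs the same; A* would only start to matter on much larger mazes.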
