zlacker

1. iamjac+(OP)[view] [source] 2025-12-05 19:25:42
Curious how this will fare when playing Pokemon Red.
replies(3): >>euvin+T >>minima+R2 >>danso+4F
2. euvin+T[view] [source] 2025-12-05 19:29:41
>>iamjac+(OP)
Yeah, the "High frame rate understanding" feature caught my eye; actual real-time analysis of live video feeds seems really cool. Also wondering what they mean by "video reasoning/thinking"?
replies(1): >>skybri+X2
3. minima+R2[view] [source] 2025-12-05 19:39:11
>>iamjac+(OP)
Gemini 3 Pro has been playing Pokemon Crystal (which is significantly harder than Red) in a race against Gemini 2.5 Pro: https://www.twitch.tv/gemini_plays_pokemon

Gemini 3 Pro has been making steady progress (12/16 badges) while Gemini 2.5 Pro is stuck (3/16 badges) despite using double the turns and tokens.

replies(1): >>theLim+Nv
4. skybri+X2[view] [source] [discussion] 2025-12-05 19:39:39
>>euvin+T
I don’t think it’s real time? The videos were likely taken previously.
5. theLim+Nv[view] [source] [discussion] 2025-12-05 22:09:09
>>minima+R2
I think what would be interesting is if it could play the game with vision-only inputs. That would represent a massive leap in multimodal understanding.
6. danso+4F[view] [source] 2025-12-05 23:10:27
>>iamjac+(OP)
> 3. Turning long videos into action: Gemini 3 Pro bridges the gap between video and code. It can extract knowledge from long-form content and immediately translate it into functioning apps or structured code

I'm curious how close these models are to achieving that long-ago mocked claim (by Microsoft, I think?) that AIs could view gameplay video of long-lost games and produce the code to emulate them.
