Maybe they mean the extra VRAM needed for agentic AI? But then the sane thing to say would be that they'll offer more compute for AI.
It's just an unhinged thing for a chip manufacturer to say.
1) Attention scales quadratically with context length, and even the KV cache, which grows linearly per token, gets enormous at agentic context sizes (rough numbers in the first sketch below).

2) "Agentic" AI requires a shittonne of context IME. Like a horrifying amount. Tool definitions alone can add up to thousands upon thousands of tokens, plus schemas and a lot of back-and-forth context use between tool calls. If you just import a moderately complicated OpenAPI/Swagger schema and use it as-is, you will probably run into the hundreds of thousands of tokens within a few tool calls (second sketch).

3) Finally, compute actually isn't the bottleneck for inference; it's memory bandwidth (third sketch).
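Point 1 in numbers: a minimal back-of-envelope sketch, assuming a hypothetical 70B-class model with grouped-query attention (all config values here are assumptions, not measured specs). Even the linear-in-tokens KV cache alone eats VRAM fast at agentic context lengths:

    # Back-of-envelope KV-cache memory for a hypothetical 70B-class model:
    # 80 layers, 8 KV heads (GQA), head_dim 128, fp16 (2 bytes) -- assumptions.
    layers, kv_heads, head_dim, bytes_per_elem = 80, 8, 128, 2

    def kv_cache_gib(context_tokens: int) -> float:
        # Keys and values (the 2x), per layer, per KV head, per token.
        total_bytes = 2 * layers * kv_heads * head_dim * bytes_per_elem * context_tokens
        return total_bytes / 2**30

    for ctx in (8_000, 32_000, 128_000, 1_000_000):
        print(f"{ctx:>9,} tokens -> {kv_cache_gib(ctx):6.1f} GiB of KV cache")

That works out to roughly 39 GiB of cache at 128k tokens and ~300 GiB at a million, before you count the weights themselves.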
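For point 2, here's a rough way to see how fast tool schemas eat tokens, using tiktoken. The tool definition below is a made-up example; a real OpenAPI import is far bigger and you'd multiply by dozens of endpoints:

    # Rough token count for a single tool definition (pip install tiktoken).
    import json
    import tiktoken

    tool = {
        "name": "create_invoice",
        "description": "Create an invoice for a customer order.",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "sku": {"type": "string"},
                            "quantity": {"type": "integer"},
                            "unit_price": {"type": "number"},
                        },
                    },
                },
                "currency": {"type": "string", "enum": ["USD", "EUR", "GBP"]},
            },
            "required": ["customer_id", "line_items"],
        },
    }

    enc = tiktoken.get_encoding("cl100k_base")
    print(f"~{len(enc.encode(json.dumps(tool)))} tokens for one modest tool")

And that's before any of the tool *results* get echoed back into context on every turn.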
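For point 3, the reason decode is bandwidth-bound rather than compute-bound: at low batch sizes, generating each token means streaming essentially all the weights out of VRAM. The numbers below are purely illustrative (roughly current HBM-class hardware), not vendor specs:

    # Ceilings for single-stream decode throughput. All numbers illustrative.
    weights_gb = 140            # ~70B params at fp16
    bandwidth_gb_per_s = 3350   # HBM-class memory bandwidth (illustrative)
    flops_available = 989e12    # dense fp16 FLOPs/s (illustrative)
    flops_per_token = 2 * 70e9  # ~2 FLOPs per parameter per generated token

    tokens_per_s_bandwidth = bandwidth_gb_per_s / weights_gb  # weight-streaming limit
    tokens_per_s_compute = flops_available / flops_per_token  # arithmetic limit

    print(f"bandwidth-bound ceiling: ~{tokens_per_s_bandwidth:.0f} tok/s")
    print(f"compute-bound ceiling:   ~{tokens_per_s_compute:,.0f} tok/s")
    # The compute ceiling is ~300x higher: memory bandwidth is the wall.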
There is a massive opportunity for someone to snipe Nvidia on inference, at least. Inference is becoming pretty 'standardized', at least in the current state of play. If someone can come along with a cheaper GPU with a lot of VRAM and a lot of memory bandwidth, Nvidia's software moat for inference is far thinner than its CUDA moat as a whole. I think AMD are very close to reaching that, FWIW.
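On 'standardized': a minimal sketch of what that looks like in practice. vLLM, llama.cpp's server, and others all expose an OpenAI-compatible HTTP API, so client code like the below doesn't care whose silicon is underneath (the port, endpoint, and model name are assumptions for a local vLLM server):

    # Minimal client against an OpenAI-compatible server, e.g. started with
    # `vllm serve <model>` on localhost:8000. Model name is an assumption.
    import requests

    resp = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "meta-llama/Llama-3.1-8B-Instruct",  # whatever the server loaded
            "messages": [{"role": "user", "content": "Hello"}],
            "max_tokens": 32,
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])

Swap the GPU vendor under the server and this client doesn't change, which is exactly why the inference moat is thinner than the CUDA one.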
I suspect training and R&D will remain more in Nvidia's sphere, but if Intel got its act together there is definitely room for competition here.