The authors don't say anything like this that I can see. Their conclusion specifically identifies these as weaknesses of current frontier models.

Furthermore, we have clearly seen increases in reasoning ability from previous frontier models to current frontier models.

If the authors could show (or did show) that both previous-generation and current-generation frontier models hit a wall at similar complexity, that would be something. AFAIK they do not.