Relevantish: https://arxiv.org/abs/2301.00774
The fact that we can reach that level of sparsity through pruning also suggests we're not doing a very good job of choosing the initial network configuration.
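To make the pruning point concrete, here's a minimal sketch of global magnitude pruning, assuming PyTorch; the target sparsity and the toy MLP are placeholders, not anything from the linked paper.

```python
import torch
import torch.nn as nn

def global_magnitude_prune(model: nn.Module, sparsity: float = 0.9) -> dict:
    """Zero out the smallest-magnitude weights across the whole model.

    Returns a dict of binary masks so the same sparsity pattern can be
    reapplied later or reused as a sparse starting point.
    """
    # Pool all weight magnitudes to find a single global threshold.
    all_weights = torch.cat([
        p.detach().abs().flatten()
        for name, p in model.named_parameters() if name.endswith("weight")
    ])
    k = int(sparsity * all_weights.numel())
    threshold = all_weights.kthvalue(k).values

    masks = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name.endswith("weight"):
                mask = (p.abs() > threshold).float()
                p.mul_(mask)  # zero the pruned weights in place
                masks[name] = mask
    return masks

# Hypothetical usage: prune a small MLP to ~90% sparsity.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
masks = global_magnitude_prune(model, sparsity=0.9)
```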
Coming up with trainable initial settings for sparse networks across different topologies is hard, but given the success we've had pruning pre-trained networks, pre-training followed by pre-pruning might also yield sparse networks with minimally compromised learning capabilities.
If it's possible to pre-train composable network modules, it might also be feasible to define trainable sparse networks with significantly relaxed topological constraints.
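A minimal sketch of the "pre-prune, then train sparse" idea: hold the mask from a pruned (pre-trained) network fixed and train only the surviving weights. The data loader, loss function, and hyperparameters are placeholders; this is an illustration of the concept, not a recipe from the paper above.

```python
import torch

def train_sparse(model, masks, data_loader, loss_fn, epochs=1, lr=1e-3):
    """Train a network while holding a fixed sparsity pattern.

    After each optimizer step, pruned weights (mask == 0) are reset to zero,
    so the topology defined by the masks never changes during training.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for inputs, targets in data_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
            # Re-apply the masks so pruned connections stay pruned.
            with torch.no_grad():
                for name, p in model.named_parameters():
                    if name in masks:
                        p.mul_(masks[name])
    return model
```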
We have all kinds of advances making training cheaper and models smaller and computationally cheaper.
Once that happens, it benefits OAI to throw up walls via legislation.
Maybe progress hasn't peaked yet, but the case can be made that we're not looking at an infinite supply…
Big technical advances, like the models of the last year or so, don't happen without a long tail of significant follow-on improvements, fine-tuning at a minimum.
The number of advances being announced by disparate groups, even individuals, also indicates that improvements will continue at a fast pace.