zlacker

> I don't see why it couldn't just one shot it without all the reasoning.

That reminds me of deep neural networks: in principle a network with a single hidden layer can achieve the same results as a deep one, but that layer would have to be excessively large. Maybe we're reusing the same kind of improvement here, scaling in length instead of width because of our computational limits?
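
A toy way to picture that width-versus-length trade, as a rough sketch only: computing the parity of n bits either "in one shot" with one pattern detector per odd-weight input (a DNF-style single layer, which needs 2^(n-1) detectors), or with n tiny sequential XOR steps. The function names `parity_wide` and `parity_long` are made up for illustration, and this is an analogy, not a claim about how any actual model works.

```python
from itertools import product

def parity_wide(bits):
    """One-shot: one detector per odd-weight input pattern, then OR them.

    Returns (parity, number_of_detectors). Width grows as 2**(n-1).
    """
    n = len(bits)
    detectors = [p for p in product((0, 1), repeat=n) if sum(p) % 2 == 1]
    fired = any(tuple(bits) == p for p in detectors)
    return int(fired), len(detectors)

def parity_long(bits):
    """Step by step: fold XOR over the input, one bit per step.

    Returns (parity, number_of_steps). Length grows as n.
    """
    acc, steps = 0, 0
    for b in bits:
        acc ^= b
        steps += 1
    return acc, steps

for n in (4, 8, 12):
    bits = [1, 0, 1, 1] * (n // 4)  # arbitrary example input
    wide_result, width = parity_wide(bits)
    long_result, length = parity_long(bits)
    assert wide_result == long_result
    print(f"n={n}: one-shot width={width}, sequential steps={length}")
```

Both versions give the same answer; the one-shot version just pays for it with exponentially many detectors, while the sequential one pays with more steps.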