zlacker

50% sparsity is almost certainly already being used given that it is accelerated in current nvidia hardware both at training time, usable dynamically through RigL ("Rigging the Lottery: Making All Tickets Winners" https://arxiv.org/pdf/1911.11134.pdf )--which also addresses your point about initial conditions being locked in-- and at accelerates 50% sparsity at inference time.