It's related to the No Free Lunch Theorems, which basically say that if an algorithm performs well on a certain class of learning, search, or optimization problems, it necessarily pays for that with degraded performance on the set of all remaining problems.
In other words, you always need bias to learn meaningfully. The more bias you have (of the right kind), the faster you learn the problem at hand, and the slower you learn everything else. In neural networks the bias is not just in the weights: there is bias in the choice of the random distribution used to initialize the weights (uniform, Gaussian, etc.), in the network topology, in the learning algorithm, in the activation function, and so on.
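To make that concrete, here is a minimal sketch (PyTorch is just my choice for illustration, and the layer sizes, init scale, and optimizer settings are arbitrary assumptions) showing that every line where you define and train a network is an inductive-bias decision:

```python
import torch
import torch.nn as nn

# Each choice below is an inductive bias: it makes some functions
# easier to learn and all others harder.

model = nn.Sequential(          # bias in the topology: two dense layers
    nn.Linear(784, 128),
    nn.Tanh(),                  # bias in the activation function
    nn.Linear(128, 10),
)

# Bias in the initial weight distribution: Gaussian vs. uniform.
for layer in model:
    if isinstance(layer, nn.Linear):
        nn.init.normal_(layer.weight, mean=0.0, std=0.05)   # Gaussian init
        # nn.init.uniform_(layer.weight, -0.05, 0.05)       # uniform alternative
        nn.init.zeros_(layer.bias)

# Bias in the learning algorithm: SGD and Adam explore weight space differently.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```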
Convolutional neural networks are a good example. They have a very strong bias baked into them (locality and translation invariance through weight sharing), and it works really well.
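A rough sketch of how strong that bias is (again assuming PyTorch and a 28x28 single-channel input, with the channel counts picked only for illustration): a convolutional layer reuses one small kernel everywhere, while a dense layer that produces a comparably sized output has to learn every connection independently.

```python
import torch.nn as nn

# Convolution bakes in locality and translation invariance by sharing
# a small 3x3 kernel across every spatial position.
conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)

# A fully connected layer mapping the same 28x28 input to an output of
# comparable size makes no such assumption about the input's structure.
dense = nn.Linear(28 * 28, 16 * 28 * 28)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(conv))    # 160 parameters: 16 * (3*3*1) weights + 16 biases
print(count(dense))   # ~9.8 million parameters
```

The convolutional layer can only express local, position-independent feature detectors, which is exactly why it learns images so efficiently and would be a poor fit for data without that spatial structure.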