https://static.googleusercontent.com/media/research.google.c... and https://norvig.com/chomsky.html
In short, Norvig concludes there are several conceptual approaches to ML/AI/Stats/Scientific analysis. One is "top down": teach the system some high-level principles that correspond to known general concepts. Another is "bottom up": determine the structure from the data itself and use that to generate general concepts. He observes that while the former is attractive to many, the latter has consistently produced more and better results with less effort.
I've seen this play out over and over. I've concluded that Norvig is right: empirically based probabilistic models are a cheaper, faster way to solve important engineering and scientific problems, even if they are possibly less satisfying intellectually. Cheap approximations are often far better than hard-to-find analytic solutions.
The same pattern explains why bottom-up economic systems, i.e. laissez-faire free markets, flawed as they are, work better than top-down systems like central planning.