zlacker

[return to "A statistical approach to model evaluations"]
1. fnordp+Are[view] [source] 2024-11-29 18:56:21
>>RobinH+(OP)
This does feel a bit like an undergrad introduction to statistical analysis, and it’s surprising anyone felt the need to explain these things. But I also suspect most AI people out there nowadays have limited math skills, so maybe it’s helpful?
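
For reference, the level in question is roughly this (my own toy sketch, not from the paper: the standard normal-approximation confidence interval applied to an eval accuracy, with made-up numbers):

    import math

    def eval_ci(correct, total, z=1.96):
        # Normal-approximation 95% CI for an eval accuracy score,
        # treating questions as i.i.d. Bernoulli trials (CLT).
        p = correct / total
        se = math.sqrt(p * (1 - p) / total)  # standard error of the mean
        return p - z * se, p + z * se

    print(eval_ci(870, 1000))  # 87% on 1000 questions -> about (0.849, 0.891)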
2. godels+5Ke[view] [source] 2024-11-29 21:26:36
>>fnordp+Are
As an ML researcher who started in physics (this seems common among physics/math-turned-ML people, Evan included), I cannot tell you how bad it is... One year at CVPR, when diffusion models hit the scene, I was asking what people's covariance was (I had overestimated the model complexity), and the most common answer I got was "how do I calculate that?" People do not understand things like what "pdf" means. People at top schools! I've been told I'm "gatekeeping" for saying that you should learn math (my line is "you don't need math to build good models, but you do need it to understand why they're wrong"). Not that you need to, but that you should. (I guess this explains why Mission Impossible Language Models won best paper...)

I swear, the big reason models are black boxes is that we _want_ them to be. There's a clear anti-theory sentiment, and the results show. I remember not too long ago Yi Tay (posting as @agihippo; his main account is @YiTayML) said "fuck theorists". I guess it's no surprise DeepMind recently hired him after that "get good" stuff.

Also, I'd like to point out that the author uses "we" but the paper has only one author. So may I suggest adding their cat as a coauthor? [0]

[0] https://en.wikipedia.org/wiki/F._D._C._Willard

3. mturmo+sVe[view] [source] 2024-11-29 23:11:30
>>godels+5Ke
The front matter of Vladimir Vapnik’s book “The Nature of Statistical Learning Theory” (first edition published 1995) has this quote:

*

During the last few years at various computer science conferences, I heard reiteration of the following claim:

“Complex theories do not work; simple algorithms do.”

One of the goals of this book is to show that, at least in the problems of statistical inference, this is not true. I would like to demonstrate that in the area of science a good old principle is valid:

“Nothing is more practical than a good theory.”

*

It’s on page xii of the front matter: https://link.springer.com/content/pdf/bfm:978-1-4757-3264-1/...

Vladimir was a friend during this time, and I think about this quote a lot with regard to ML tinkering.

4. godels+Mqf[view] [source] 2024-11-30 06:05:27
>>mturmo+sVe
I haven't had a chance to read that, but that quote suggests I should (especially considering the author and the editors).

I often refer to "Elephant Fitting" w.r.t. these systems. I suspect you understand this, but I think most people take it to be just about overfitting. The problem isn't really the number of parameters; it's that parameters need to be justified, as Dyson explains here [0]. Vladimir's quote really reminds me of this. Fermi, likewise, was stressing the importance of theory.
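
For anyone who hasn't seen it: the four-parameter elephant was actually carried out by Mayer, Khairy & Howard ("Drawing an elephant with four complex parameters", Am. J. Phys. 78, 648, 2010). A rough numpy rendition of their parametrization (the plotting details are my own sketch):

    import numpy as np
    import matplotlib.pyplot as plt

    # Four complex parameters encode the whole outline; a fifth only places the eye.
    p1, p2, p3, p4 = 50 - 30j, 18 + 8j, 12 - 10j, -14 - 60j
    p5 = 40 + 20j

    def fourier(t, C):
        # Truncated Fourier series: real parts weight cos(k t), imaginary parts sin(k t).
        f = np.zeros_like(t)
        for k, c in enumerate(C):
            f += c.real * np.cos(k * t) + c.imag * np.sin(k * t)
        return f

    t = np.linspace(0, 2 * np.pi, 1000)
    Cx = np.zeros(6, dtype=complex)
    Cy = np.zeros(6, dtype=complex)
    Cx[1], Cx[2], Cx[3], Cx[5] = p1.real * 1j, p2.real * 1j, p3.real, p4.real
    Cy[1], Cy[2], Cy[3] = p4.imag + p1.imag * 1j, p2.imag * 1j, p3.imag * 1j

    x, y = fourier(t, Cx), fourier(t, Cy)
    plt.plot(y, -x)                  # rotate so the elephant stands upright
    plt.plot(p5.imag, p5.imag, 'o')  # the eye, courtesy of the fifth parameter
    plt.axis('equal')
    plt.show()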

I think it is a profound quote, and you were (are?) lucky to have that friendship. I do think abstraction is at the heart of intelligence. François Chollet discusses it a lot, and he's far from alone; it seems well agreed upon in the neuroscience and cognitive science communities. I think this is essential to understand on our path toward developing intelligent systems, because there are plenty of problems to be solved for which there is no algorithmic procedure, no explicit density function: intractable problems, doubly intractable problems, and worse. Maybe we're just too dumb, but it's clear there are plateaus where luck is needed to advance, and I do not believe our current machines are capable of closing such a gap.
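
(To spell out "doubly intractable" for anyone following along; this is the standard definition, not something from the article. Take a posterior

    p(theta | y) ∝ [ q(y; theta) / Z(theta) ] * p(theta),  with  Z(theta) = ∫ q(y'; theta) dy'.

The normalizer Z(theta) is itself an intractable integral that depends on theta, so even a Metropolis–Hastings acceptance ratio contains the unknown ratio Z(theta)/Z(theta'). The usual fix for intractability is itself intractable, hence "doubly".)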

[0] https://www.youtube.com/watch?v=hV41QEKiMlM

5. mturmo+MFl[view] [source] 2024-12-03 07:51:51
>>godels+Mqf
Thank you for the link.

Yeah, what we are doing now in ML isn’t really “engineering” in the best sense of the word. We don’t have theoretical machinery that can predict the performance of an ML design, the way you can for a coding scheme in communications theory. We have a lot of clever architectures and techniques, but not a design capacity.
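
To make the contrast concrete (a toy example of my own, standard Shannon theory rather than anything from this thread): for a binary symmetric channel you can compute, before building anything, exactly which code rates are achievable:

    import math

    def bsc_capacity(p):
        # Shannon capacity of a binary symmetric channel with
        # crossover probability p: C = 1 - H(p), H the binary entropy.
        if p in (0.0, 1.0):
            return 1.0
        return 1 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

    # With 5% bit flips, any code rate below ~0.714 bits/use is achievable
    # with vanishing error probability; any rate above it is provably not.
    # Nothing in ML lets you make that kind of statement about a design.
    print(bsc_capacity(0.05))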
