
[return to "A statistical approach to model evaluations"]

>>RobinH+(OP)
This does feel a bit like an undergrad introduction to statistical analysis, and it's surprising anyone felt the need to explain these things. But I also suspect most AI people out there nowadays have limited math skills, so maybe it’s helpful?

>>fnordp+Are
I don't know what it's fair to suspect, since there are clearly a lot of very smart people working in the field, at least in places.

It is empirically true that none of the industry discourse around leaderboards and benchmarks uses any of the techniques this article discusses.
