zlacker

[return to "Imagen, a text-to-image diffusion model"]
1. mistri+Ac[view] [source] 2022-05-23 22:01:07
>>kevema+(OP)
I was reading a relatively recent machine learning paper from some elite source, and after multiple repetitions of bragging and puffery, in the middle of the paper the charts showed that they had beaten the score of a high-ranking algorithm in their specific domain, moving the best consistent result from roughly 86% accuracy to 88%. My response: they got a lot of attention within their world by beating the previous score, no matter how small the improvement was; it was a "winner take all" competition against other teams close to them; accuracy of less than 90% is of questionable value in a lot of real-world problems; and it took an enormous amount of math and effort for this team to make that small improvement.

What I see is a semi-poverty mindset among very smart people who appear to be treated such that the winners get promoted and everyone else is fired. This sort of ML analysis is useful for massive data sets at scale, where 90% is a lot of accuracy, but not for the small, real-world, human-scale problem sets where each individual result may matter a lot. The years of training these researchers had to go through to participate in this apparently ruthless environment amount to a lottery ticket, if you are in fact in a game where everyone but the winner has to find a new line of work. I think their masters live in Redmond, if I recall correctly.. not looking it up at the moment.

2. gwern+RB[view] [source] 2022-05-24 01:16:58
>>mistri+Ac
What you're missing is that performance on a pretext task like ImageNet top-1 transfers outside ImageNet, and as you go further into the high-score regime, a small % gain can often yield qualitatively better results, because the underlying NN has to solve harder and harder problems, eliciting true solutions rather than a patchwork of heuristics.
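
(A minimal sketch of what "transfers" means here, assuming an ImageNet-pretrained ResNet-50 from torchvision and a hypothetical 10-class downstream problem; the pretrained features are kept frozen and only a small new head is trained:)

    import torch
    import torch.nn as nn
    from torchvision import models

    # Load a backbone pretrained on the ImageNet pretext task.
    backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

    # Freeze the pretrained features; only the new head will be trained.
    for p in backbone.parameters():
        p.requires_grad = False

    # Replace the 1000-way ImageNet classifier with a head for a
    # hypothetical 10-class downstream task.
    backbone.fc = nn.Linear(backbone.fc.in_features, 10)

    optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # One toy training step on a fake batch, just to show the shape of it.
    x = torch.randn(8, 3, 224, 224)   # stand-in for real downstream images
    y = torch.randint(0, 10, (8,))    # stand-in labels
    loss = loss_fn(backbone(x), y)
    loss.backward()
    optimizer.step()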

Nothing in a Transformer's perplexity on next-token prediction tells you that at some point it suddenly becomes able to write flawless literary style parodies. This is why the computer-art people become virtuosos of CLIP variants and get excited by new ones: each one attacks concepts in slightly different ways, and a 'small' benchmark increase may unlock some awesome new visual flourish that the model couldn't get before.
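
(To make the CLIP point concrete, a rough sketch of how the art tools use it, assuming OpenAI's `clip` package; the filename and prompt are hypothetical. An image and a prompt are embedded in the same space, and their similarity is the score a CLIP-guided generator climbs:)

    import torch
    import clip
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    # Hypothetical inputs: a candidate image and the prompt guiding it.
    image = preprocess(Image.open("candidate.png")).unsqueeze(0).to(device)
    text = clip.tokenize(["an oil painting of a fox in a spacesuit"]).to(device)

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text)

    # Cosine similarity is the score CLIP-guided generators maximize;
    # a different CLIP variant ranks images slightly differently, which
    # is why a new checkpoint can unlock a new look.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    score = (image_features @ text_features.T).item()
    print(score)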
