1. dougab+(OP) 2022-05-24 00:25:50
Scaling up improved versions of existing recipes can be done surprisingly fast if you have strong DL infrastructure. Also, GPT-3 was built on top of previous advances such as Google’s BERT. I’m surprised that it took Google so long to answer w/ PaLM, though it seems plausible to me that they wanted a clear enough qualitative advance that people wouldn’t immediately say, “So what.”

You could’ve had the same reaction years ago when Google published GoogLeNet, followed by a series of increasingly powerful Inception models - namely, that Google would wind up owning the DNN space. But it didn’t play out that way, perhaps because Google dragged its feet releasing the models and training code, and by the time it did, simpler and more powerful models like ResNet were available.

Meta’s recent release of the actual OPT LLM weights is probably going to have more impact than PaLM, unless Google can be persuaded to open up that model.

replies(1): >>benree+m3
2. benree+m3 2022-05-24 00:56:18
>>dougab+(OP)
There are a lot of really knowledgeable people on here, but this field is near and dear to my heart, and it’s obvious that you know it well.

I don’t know what “we should grab a coffee or a beer sometime” means in the hyper-global post-C19 era, but I’d love to speak more on this without dragging a whole HN comment thread through it.

Drop me a line if you’re inclined: ben.reesman at gmail
