I don’t necessarily fault OpenAI’s decision to initially train their models without entering into licensing agreements - they probably wouldn’t exist, and the generative AI revolution might never have happened, if they had put the horse before the cart. I do think they should quickly course-correct at this point and accept that they clearly owe something to the creators of the content they are consuming. If they don’t, they are setting themselves up for a bigger loss down the road and leaving the door open for a more established competitor (Google) to do it the right way.
Hacker News consistently has upvoted posts that let users circumvent paywalls. And even when it doesn't, conversations here (and on Twitter, Reddit, etc.) that summarize the articles and quote the relevant bits as soon as the articles are published are much more of a threat to The New York Times than ChatGPT training on articles from months or years ago.
I find irony in the newspaper suing over AI when other news sources (admittedly not NYT) use AI to write their articles. How many other AI scrapers are just ingesting AI-generated content?
That isn't ironic at all. Newspapers have newspaper competitors, and if those competitors can steal content by washing it through an AI, that is a serious problem. If these AI models weren't used to produce news articles and the like, it would be a much smaller issue.