The New York Times is suing OpenAI and Microsoft for copyright infringement

>>ssgodd+(OP)
Even if they win against OpenAI, how would this prevent a Chinese or Russian LLM developer from “stealing” their content and building its own superior LLM that isn’t weakened by regulation like the ones in the United States?

And I say this as someone who is extremely bothered by how easily massive amounts of open content can be vacuumed up into a training set with reckless abandon. There isn’t much you can do other than put everything you create behind some kind of authentication wall, and even then it’s only a matter of time until it leaks anyway.

Pandora’s box is really open. We need to figure out how to live in a world with these systems, because it’s an unwinnable arms race where only bad actors will benefit from everyone else being neutered by regulation, especially with the massive pace of open source innovation in this space.

We’re in a “mutually assured destruction” situation now, but instead of bombs the weapon is information.

>>dissid+B6
This suggests to me that copyright laws are becoming out of date.

The original intent was to provide an incentive for human authors to publish work, but it has become more out of touch since the internet allowed virtually free publishing and copying. I think with the dawn of LLMs, copyright law is now mainly incentivising lawyers.

>>ndsipa+D7
What incentive do people have to publish work if it is going to be consumed primarily by an LLM and spat out without attribution at the people using that LLM?

>>phone8+I8
I would guess monetisation is going to be limited to either subscriptions or advertising, and only if your reputation leads people to especially value your curation of facts, reporting, etc. The big issue with LLMs is the lack of reliability - the output might be accurate or it might be a hallucination.

Personally, I think it would be a lot simpler if the internet were declared a non-copyright zone for sites that aren't paywalled, as there's already a legal grey area: viewing a site invariably involves copying it.

Maybe we'll end up with publishers introducing traps/paper towns, as mapmakers are known to do. That way, if an LLM reproduces the false "fact", it'll be obvious where it got it from.
