A federal judge sides with Anthropic in lawsuit over training AI on books
1. Nobody+fc 2025-06-24 17:29:23
>>moose4+(OP)
One aspect of this ruling [1] that I find concerning: on pages 7 and 11-12, it concedes that the LLM does substantially "memorize" copyrighted works, but rules that this doesn't violate the authors' copyright because Anthropic has server-side filtering to avoid reproducing memorized text. (Alsup compares this to Google Books, which has server-side searchable full-text copies of copyrighted books, but only allows users to access snippets in a non-infringing manner.)

Does this imply that distributing open-weights models such as Llama is copyright infringement, since users can trivially run the model without output filtering to extract the memorized text?

[1]: https://storage.courtlistener.com/recap/gov.uscourts.cand.43...

2. deadba+Gc 2025-06-24 17:32:43
>>Nobody+fc
You can use the copyrighted text for personal purposes.

3. AtlasB+Ee 2025-06-24 17:43:20
>>deadba+Gc
Hey, can I have a fake LLM "trained" on a set of copyrighted works, and then ask it what those works are?

So it totally isn't a warez streaming media server, it's AI?

I'm guessing that since my net worth isn't a billion plus, the answer is no.

4. Anthon+yz 2025-06-24 19:31:04
>>AtlasB+Ee
People have been coming up with convoluted piracy loopholes since the invention of copyright.

If you XOR some data with random numbers, both the result and the random numbers are indistinguishably random, and there is no way to tell which one came out of a random number generator and which one is "derived" from a copyrighted work. But if you XOR them together again, the copyrighted work comes out. So if you have Alice distribute one of the random-looking things and Bob distribute the other one, and then Carol downloads them both and reconstructs the copyrighted work, have you created a scheme to copy whatever you want with no infringement occurring?

Of course not. At the very least, Carol is reproducing an infringing work, and then there are going to be claims of contributory infringement etc. against the others if the scheme has no other purpose than to do this.

Meanwhile, this problem is also boring, because preventing anyone from being the source of infringing works isn't something anybody has been able to do for at least as long as the internet has allowed anyone to set up a server in another jurisdiction.
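
For anyone who wants to see the XOR scheme from the comment above concretely, here is a minimal sketch in Python. The function names `split` and `combine` are made up for illustration, and the use of `secrets.token_bytes` as the random source is an assumption; the comment only says "random numbers". It shows the mechanics only: each share on its own looks like random bytes, and XORing the two shares together gives back the original.

```python
import secrets


def split(data: bytes) -> tuple[bytes, bytes]:
    """Split data into two random-looking shares (hypothetical helper)."""
    # Share 1: random bytes of the same length as the data.
    pad = secrets.token_bytes(len(data))
    # Share 2: the data XORed byte-by-byte with the pad.
    masked = bytes(d ^ p for d, p in zip(data, pad))
    return pad, masked


def combine(share_a: bytes, share_b: bytes) -> bytes:
    """XOR two shares back together to recover the original bytes."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must have the same length")
    return bytes(a ^ b for a, b in zip(share_a, share_b))


if __name__ == "__main__":
    work = b"stand-in text for some copyrighted work"
    alice_share, bob_share = split(work)   # Alice and Bob each distribute one
    recovered = combine(alice_share, bob_share)  # Carol downloads both
    print(recovered == work)
```

Note that the order of the shares doesn't matter, since XOR is commutative, which is why neither share can be singled out as the "derived" one.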
