"Be a property owner and not a renter on the internet"
1. rpcope+9j 2025-01-03 04:07:47
>>dend+(OP)
> Exploiting user-generated content.

You know, if I've noticed anything in the past couple years, it's that even if you self-host your own site, it's still going to get hoovered up and used/exploited by things like AI training bots. I think between everyone's code getting trained on, even if it's AGPLv3 or something similarly restrictive, and generally everything public on the internet getting "trained" and "transformed" to basically launder it via "AI", I can absolutely see why someone rational would want to share a whole lot less, anywhere, in an open fashion, regardless of where it's hosted.

I'd honestly rather see and think more about how to segment communities locally, and go back to the "fragmented" way things once were. It's easier to want to share with other real people than to inadvertently work for free to enrich companies.

◧◩
2. dend+uj 2025-01-03 04:10:51
>>rpcope+9j
Nothing to disagree with in this statement, for sure. If it's on the open internet, it will almost surely be used for AI training, consent be damned. But it feels like even at a rudimentary level, if I post a picture on my site that is then used by a large publisher for ads, I would (at least in theory) have some recourse to pursue the matter and prevent them from using my content.

In contrast, if I uploaded something to a social media site like Instagram, and then Meta "sublicensed" my image to someone else, I wouldn't have much to say there.

Would love someone with actual legal knowledge to chime in here.

◧◩◪
3. chii+0k 2025-01-03 04:15:54
>>dend+uj
> Meta "sublicensed" my image to someone else, I wouldn't have much to say there.

but you agreed to this, when agreeing to the TOS.

> I post a picture on my site that is then used by a large publisher for ads, I would (at least in theory) have some recourse

for which you didn't sign any contract, and therefore it is a violation of copyright.

But the new AI training methods are currently, at least imho, not a violation of copyright - not any more than a human eye viewing it (which you've implicitly given permission to do so, by putting it up on the internet). On the other hand, if you put it behind a gate (no matter how trivial), then you could've at least legally protected yourself.

◧◩◪◨
4. entrop+YD 2025-01-03 07:59:58
>>chii+0k
>But the new AI training methods are currently, at least imho, not a violation of copyright - not any more than a human eye viewing it (which you've implicitly given permission to do so, by putting it up on the internet).

I don't understand how that matters. I thought that the whole idea of copyright and licences was that the holder of the rights can decide what is ok to do with the content and what is not. If the holder of the rights does not agree to a certain kind of use, what else is there to discuss?

It sure does not matter if I think that downloading a torrent is not any more pirating than borrowing media from my friend.

◧◩◪◨⬒
5. chii+SH 2025-01-03 08:47:39
>>entrop+YD
> If the holder of the rights does not agree to a certain kind of use, what else is there to discuss?

the holder of content does not automatically get to prescribe how i would use said content, as long as i comply with the copyrights.

The holder does not get to dictate anything beyond that - for example, i can learn from the content. Or i can berate it. Copyright is not a right that covers every single conceivable use - it is a limited set of uses that have been laid out in the law.

So the current arguments center on the fact that it is unknown if existing copyright covers the use of said works in ML training.

◧◩◪◨⬒⬓
6. TheOth+uO 2025-01-03 09:57:22
>>chii+SH
Copyright means the holder does automatically get to prescribe how content can be copied. That's literally the definition of copyright.

A typical copyright notice for a book says something like (to paraphrase...) "not to be stored, transmitted, or used by or on any electronic device without explicit permission."

That clearly includes use for training, because you can't train without making a copy, even if the copy is subsequently thrown away.

Any argument about this is trying to redefine copyright as the right to extract the semantic or cultural value of a document. In reality the definition is already clear - no copying of a document by any means for any purpose without explicit permission.

This is even implicitly acknowledged in the CC definitions. CC would be meaningless and pointless without it.

◧◩◪◨⬒⬓⬔
7. rpdill+St1 2025-01-03 16:09:51
>>TheOth+uO
> Any argument about this is trying to redefine copyright as the right to extract the semantic or cultural value of a document. In reality the definition is already clear - no copying of a document by any means for any purpose without explicit permission.

I've studied copyright for over 20 years as an amateur, and I used to very much think this way.

And then I started reading court decisions about copyright, and suddenly it became extremely clear that it's a very nuanced discussion about whether or not the document can be copied without explicit permission. There are tons of cases where it's perfectly permissible, even if the copyright holder demands that you request permission.

I've covered this in other posts on Hacker News, but it is still my belief that we will ultimately find AI training to be fair use because it does not materially impact the market for the original work. Perhaps someone could bring a case arguing that it does, but based on my reading of the cases over the past couple of years, courts have yet to see a claim that asserts this in a convincing way.

◧◩◪◨⬒⬓⬔⧯
8. Terr_+Fg2 2025-01-03 21:50:48
>>rpdill+St1
I assume the emphasis there is on training, whereas it's totally possible to infringe by running the model in certain ways later.