You know, if I've noticed anything in the past couple years, it's that even if you self-host your own site, it's still going to get hoovered up and used/exploited by things like AI training bots. I think between everyone's code getting trained on, even if it's AGPLv3 or something similarly restrictive, and generally everything public on the internet getting "trained" and "transformed" to basically launder it via "AI", I can absolutely see why someone rational would want to share a whole lot less, anywhere, in an open fashion, regardless of where it's hosted.
I'd honestly rather see and think more about how to segment communities locally, and go back to the "fragmented" way things once were. It's easier to want to share with other real people than to inadvertently work for free to enrich companies.
In contrast, if I uploaded something to a social media site like Instagram, and then Meta "sublicensed" my image to someone else, I wouldn't have much to say there.
Would love someone with actual legal knowledge to chime in here.
But you agreed to this when you agreed to the TOS.
> I post a picture on my site that is then used by a large publisher for ads, I would (at least in theory) have some recourse
In that case you didn't sign any contract, so it is a violation of copyright.
But the new AI training methods are currently, at least imho, not a violation of copyright - not any more than a human eye viewing it (which you've implicitly given permission for by putting it up on the internet). On the other hand, if you had put it behind a gate (no matter how trivial), then you would at least have legally protected yourself.
I don't understand how that matters. I thought that the whole idea of copyright and licences was that the holder of the rights can decide what is ok to do with the content and what is not. If the holder of the rights does not agree to a certain kind of use, what else is there to discuss?
It sure does not matter if I think that downloading a torrent is no more piracy than borrowing media from a friend.
The holder of content does not automatically get to prescribe how I use said content, as long as I comply with the copyright.
The holder does not get to dictate anything beyond that - for example, I can learn from the content. Or I can berate it. Copyright is not a right that covers every single conceivable use - it is a limited set of uses that have been laid out in the law.
So the current arguments center on the fact that it is unknown if existing copyright covers the use of said works in ML training.
A typical copyright notice for a book says something like (to paraphrase...) "not to be stored, transmitted, or used by or on any electronic device without explicit permission."
That clearly includes use for training, because you can't train without making a copy, even if the copy is subsequently thrown away.
Any argument about this is trying to redefine copyright as the right to extract the semantic or cultural value of a document. In reality the definition is already clear - no copying of a document by any means for any purpose without explicit permission.
This is even implicitly acknowledged in the CC definitions. CC would be meaningless and pointless without it.
A copy for ingestion purposes, such as viewing in a browser, is not the same as a distribution copy that you make when sending it to another person.
> the right to extract the semantic or cultural value of a document.
This right does not belong to the author - in fact, it is not an explicit right granted by the copyright act. Therefore, the extraction of information from a work is not something the author can (or should) control. Otherwise, how would anyone learn from a textbook, music or art?
In the future, when the courts finally decide what the limits of ML training are, maybe it will be a new right granted to authors. But it isn't one atm.