zlacker

[return to "Be a property owner and not a renter on the internet"]
1. rpcope+9j[view] [source] 2025-01-03 04:07:47
>>dend+(OP)
> Exploiting user-generated content.

You know, if I've noticed anything in the past couple years, it's that even if you self-host your own site, it's still going to get hoovered up and used/exploited by things like AI training bots. I think between everyone's code getting trained on, even if it's AGPLv3 or something similarly restrictive, and generally everything public on the internet getting "trained" and "transformed" to basically launder it via "AI", I can absolutely see why someone rational would want to share a whole lot less, anywhere, in an open fashion, regardless of where it's hosted.

I'd honestly rather see and think more about how to segment communities locally, and go back to the "fragmented" way things once were. It's easier to want to share with other real people than to inadvertently work for free to enrich companies.

◧◩
2. dend+uj[view] [source] 2025-01-03 04:10:51
>>rpcope+9j
Nothing to disagree with in this statement, for sure. If it's on the open internet, it will almost surely be used for AI training, consent be damned. But it feels like even at a rudimentary level, if I post a picture on my site that is then used by a large publisher for ads, I would (at least in theory) have some recourse to pursue the matter and prevent them from using my content.

In contrast, if I uploaded something to a social media site like Instagram, and then Meta "sublicensed" my image to someone else, I wouldn't have much to say there.

Would love someone with actual legal knowledge to chime in here.

◧◩◪
3. chii+0k[view] [source] 2025-01-03 04:15:54
>>dend+uj
> Meta "sublicensed" my image to someone else, I wouldn't have much to say there.

But you agreed to this when you accepted the TOS.

> I post a picture on my site that is then used by a large publisher for ads, I would (at least in theory) have some recourse

In that case you didn't sign any contract, so it is a violation of copyright.

But the new AI training methods are currently, at least imho, not a violation of copyright - not any more than a human eye viewing it (which you've implicitly given permission for by putting it up on the internet). On the other hand, if you put it behind a gate (no matter how trivial), then you would at least have legally protected yourself.

◧◩◪◨
4. DrScie+oX[view] [source] 2025-01-03 11:46:33
>>chii+0k
> But the new AI training methods are currently, at least imho, not a violation of copyright - not any more than a human eye viewing it

Interesting comparison, since if a human viewed something, memorized it and reproduced it in a way that is recognisably pretty much the same, wouldn't that still breach copyright?

i.e. in the human case it doesn't matter whether it went through an intermediate neural encoding - what matters is whether the output is sufficiently similar to be deemed a copy.

Surely the same is the case for AI?

◧◩◪◨⬒
5. omnimu+y11[view] [source] 2025-01-03 12:28:59
>>DrScie+oX
This whole "AI learns like a human" idea is a line of thought pushed by AI companies. They try at the same time to humanize AI (it learns like a human would) and dehumanize humans (humans are stochastic parrots anyway). It's a distraction, if not straight-up anti-human.

But you are right that copyright is complex and in the end decided by humans (often in court). Consider how code infringement is not about the code itself but about what it does. If you saw a somewhat original implementation of something and then rewrote it in a different language by yourself, there is a high chance it's still copyright infringement.

On the other hand, with images and art it's even more about cultural context. For example, works of pop artists like Andy Warhol are for sure original works (even though some of it was recently disputed in court, and lost). Nobody considers Andy Warhol's work unoriginal, even if it often looks very similar to whatever it was riffing on, because its essence is different from the original.

Compare that to people prompting directly with the name of the artist they want to replicate. This is direct copyright infringement in both essence and intention, no matter the resulting image. It's also different from a human wanting to replicate some artist's style, because humans can't do it 100% even if they want to. There is still a piece of their "essence". There are many people who try to fake some famous artist's style and sell it as the real thing, and simply can't do it. This is of course copyright infringement because of the intent, but it's more original work than anything coming from LLMs.

◧◩◪◨⬒⬓
6. Kim_Br+9G2[view] [source] 2025-01-04 01:30:53
>>omnimu+y11
> Consider how code infringement is not about the code itself but about what it does. If you saw a somewhat original implementation of something and then rewrote it in a different language by yourself, there is a high chance it's still copyright infringement.

Actually, if you rewrite it in a different language, you're well on your way to making it an independent expression (though beware Structure, Sequence and Organization, unless you're implementing an API: see Google v. Oracle). Copyright protects specific expressions, not functionality.

> Compare that to people prompting directly with the name of the artist they want to replicate. This is direct copyright infringement in both essence and intention, no matter the resulting image.

As far as I'm aware, an artist's style is not something that is protected by law; copyright protects specific works.

If you did want to protect artistic styles, how would you go about legally defining them?

◧◩◪◨⬒⬓⬔
7. omnimu+Cg3[view] [source] 2025-01-04 09:31:28
>>Kim_Br+9G2
The fact that LLMs are generating any images at all is purely thanks to a database of source images that are copyright protected. It's a form of sophisticated automated photobashing. Photobashing is a gray zone but often legal, because of the other artist doing the (often original) work.

When you prompt for a Miyazaki image, that image can only exist thanks to his protected work being in the database (where he doesn't want it to be); otherwise the user wouldn't get the Miyazaki image they wanted.

We will see how that all plays out, but I think if Miyazaki took this to court there would be a solid case on the grounds that the resulting images breach the copyright of the source, are not original works, and are created with bad intent that goes against the protections of the original author.

What seems to be the current direction, at least, is that the resulting images cannot be copyrighted and are automatically in the public domain, making them difficult to use commercially.

◧◩◪◨⬒⬓⬔⧯
8. Kim_Br+fC3[view] [source] 2025-01-04 14:54:58
>>omnimu+Cg3
Actually, while I just said "there is no database", maybe you're working from a very different mental model from mine...

What do you mean by "Database" in this context? What information do you think is being stored, (and how?)

◧◩◪◨⬒⬓⬔⧯▣
9. omnimu+dX3[view] [source] 2025-01-04 17:45:15
>>Kim_Br+fC3
I understand what the model is and how you get to it. I know the training data is not stored. But as far as I understand, the model is closer to an intermediary derived from the training data, like a database index or, like you said, a form of compression.

That's why I tend, on purpose, to call training data + model the database, because to non-programmers it makes more sense. To me there is an intentional sleight of hand in hiding the fact that the only reason LLMs can work as they do now is the source data. The way it's usually marketed, it seems like the model is a program that generalised the principles of drawing from looking at other drawings, and that's why it can draw like Miyazaki when it wants to. Not that it can draw Miyazaki because it preprocessed every Miyazaki drawing, extracted patterns from it, and can mash them with other patterns (from the database).

That's why I intentionally say database, to lead these discussions back to what I see as the core of these technologies.

◧◩◪◨⬒⬓⬔⧯▣▦
10. chii+pP4[view] [source] 2025-01-05 04:06:51
>>omnimu+dX3
What you're describing as a database would be what I call information.