zlacker

GitHub Copilot, with “public code” blocked, emits my copyrighted code
1. kweing+v6 2022-10-16 20:27:21
>>davidg+(OP)
I’ve noticed that people tend to disapprove of AI trained on their profession’s data, but are usually indifferent or positive about other applications of AI.

For example, I know artists who are vehemently against DALL-E, Stable Diffusion, etc. and regard it as stealing, but they view Copilot and GPT-3 as merely useful tools. I also know software devs who are extremely excited about AI art and GPT-3 but are outraged by Copilot.

For myself, I am skeptical of intellectual property in the first place. I say go for it.

◧◩
2. machin+H7 2022-10-16 20:38:34
>>kweing+v6
I'm pretty sure DALL-E was trained only on non-copyrighted material (they say so :| ).

But to be honest, if your code is open source, I'm pretty sure Microsoft doesn't care about the license; they'll just use it because "reasons". Same with Stable Diffusion: they don't give a fuk about the data, and if it's on the internet they'll use it. So it's a topic that will probably be regulated in a few years.

Until then, let's hope they'll get milked (both Microsoft and NovelAI) for illegal content usage, and I srsly hope at least a few lawyers will try milking it asap, especially NovelAI, which illegally uses a lot of copyrighted art in the training data.

◧◩◪
3. msbarn+I8 2022-10-16 20:47:02
>>machin+H7
> I'm pretty sure DALL-E was trained only on non-copyrighted material

Nope. DALL-E generates images with the Getty watermark, so clearly there’s copyrighted material in its training set: https://www.reddit.com/r/dalle2/comments/xdjinf/its_pretty_o...

◧◩◪◨
4. pclmul+5c 2022-10-16 21:15:12
>>msbarn+I8
Lots of people ironically put the Getty watermark on pictures and memes that they make to satirically imply that they are pulling stock photos off the internet with the printscreen function instead of paying for them.

◧◩◪◨⬒
5. msbarn+cf 2022-10-16 21:48:46
>>pclmul+5c
Memes generally would not fall under the category of non-copyrighted material; most of the time they’re extremely copyrighted material just being used without permission. And even a wholly original work that an artist sarcastically puts a Getty watermark on and then licenses under Creative Commons or something would fall into very murky territory – the Getty watermark itself is the intellectual property of Getty. The original image author might plead fair use as satire, but satirical intentions aren’t really a defence available to DALL-E.

So even if we’re assuming these were wholly original works that the author placed under something like a Creative Commons license, the fact that they incorporated an image the author had no rights to would at the very least create a fairly tangled copyright situation, one that any really rigorous evaluation of the copyright status of every image in the training set would tend to reject as not worth the risk of litigation.

But the more likely scenario here is that they did, at best, minimal filtering of the training set for copyrights.
