For example, I know artists who are vehemently against DALL-E, Stable Diffusion, etc. and regard it as stealing, but they view Copilot and GPT-3 as merely useful tools. I also know software devs who are extremely excited about AI art and GPT-3 but are outraged by Copilot.
For myself, I am skeptical of intellectual property in the first place. I say go for it.
Let me be perfectly clear. I'm all for the tech. The capabilities are nice. The thing I'm strongly against is training these models on any data without any consent.
GPT-3 is OK; training it on public material regardless of its license is not.
Copilot is OK; training it on GPL/LGPL-licensed code without consent is not.
DALL-E/MidJourney/Stable Diffusion is OK; training it on images that are neither public domain nor CC0 is not.
"We're doing something amazing, hence we need no permission" is ugly, to put it very lightly.
I've left GitHub because of Copilot. I will leave any photo hosting platform if they hint at anything similar with my photography, period.
Those are effectively cases of cryptomnesia[0]. Part and parcel of learning.
If you don't want broad access to your work, don't upload it to a public repository. It's very simple. Good on you for recognising that you don't agree with how GitHub looks at data in public repos, but it's not their problem.
Disagree. Outputting training data as-is is not cryptomnesia. This is not Copilot's first case. It also reproduced id Software's fast inverse square root function as-is, including its comments, but without its license.
> If you don't want broad access to your work, don't upload it to a public repository. It's very simple.
This is actually both funny and absurd. This is why we have licenses in the first place. If all licenses are moot, then this opens a very big can of worms...
My terms are simple. If you derive, share the derivation under the same license (xGPL). Copilot is deriving from my code. If you use my code as a derivation point, honor the license and mark the derivation with the GPL license. This voids your business case? I don't care. These are my terms.
If any public item can be used without any limitations, Getty Images (or any other stock photo business) is illegal. CC licensing shouldn't exist. GPL is moot. Even the most litigious software companies' cases (Oracle, SCO, Microsoft, Adobe, etc.) are moot. Just don't put it on public servers, eh?
Similarly, music and other fine arts are generally publicly accessible. So, by your argument, copyright on any and every production is also invalid, because it's publicly available.
Why not put your case forward to the attorneys of Disney, WB, Netflix and others? I'm sure they'll provide all their archives for training your video/image AI. Similarly, Microsoft, Adobe, MathWorks, et al. will be thrilled to support your Copilot competitor with their code, because a) any similar code will just be cryptomnesia, and b) the software produced from that code is publicly accessible anyway.
At this point, I haven't even touched on the fact that humans are trained very differently from neural networks.
We are talking ‘de facto’ here, not ‘de jure’. It may be legally problematic, but anything made public once is never going back in the box.