GitHub Copilot, with “public code” blocked, emits my copyrighted code
1. ianbut+ce 2022-10-16 21:38:47
>>davidg+(OP)
I just tested it myself on a random C file I created in the middle of a Rust project I'm working on. It reproduced his full code verbatim from just the function header, so clearly it does regurgitate proprietary code, unlike what some people have said. I do not have his source, so Copilot isn't just using existing context.

I've been finding Copilot really useful, but I'll be pausing it for now, and I'm glad I have only been using it on personal projects and not on anything for work. This crosses the line in my head from legal ambiguity to legal "yeah that's gonna have to stop".

2. shadow+Wf 2022-10-16 21:55:17
>>ianbut+ce
Searching for the function names in his libraries, I'm seeing some 32,000 hits.

I suspect he has a different problem which (thanks to Microsoft) is now a problem he has to care about: his code probably shows up in one or more repos copy-pasted with improper LGPL attribution. There'd be no way for Copilot to know that had happened, and it would have mixed in the code.

(As a side note: understanding why an ML engine outputs a particular result is still an open area of research AFAIK.)

3. enrage+Ji 2022-10-16 22:20:21
>>shadow+Wf
Expanding on that, even if Microsoft sees the error of their ways and retrains Copilot against permissively licensed source or with explicit opt-in, it may still get trained on proprietary code that the old version of Copilot inserted into a permissively licensed project.

You would just have to hope that you can take down every instance of your code and keep it down, all while Copilot keeps making more instances for the next version to train on and plagiarize.
