zlacker

GitHub Copilot, with “public code” blocked, emits my copyrighted code
1. ianbut+ce 2022-10-16 21:38:47
>>davidg+(OP)
I just tested it myself on a random C file I created in the middle of a Rust project I'm working on. It reproduced his full code verbatim from just the function header, so clearly it does regurgitate proprietary code, contrary to what some people have said. I do not have his source, so Copilot isn't just using existing context.

I've been finding Copilot really useful but I'll be pausing it for now, and I'm glad I have only been using it on personal projects and not anything for work. This crosses the line in my head from legal ambiguity to legal "yeah that's gonna have to stop".

2. shadow+Wf 2022-10-16 21:55:17
>>ianbut+ce
Searching for the function names in his libraries, I'm seeing some 32,000 hits.

I suspect he has a different problem which (thanks to Microsoft) is now a problem he has to care about: his code probably shows up in one or more repos copy-pasted with improper LGPL attribution. There'd be no way for Copilot to know that had happened, and it would have mixed in the code.

(As a side note: understanding why an ML engine outputs a particular result is still an open area of research AFAIK.)

3. manhol+Qv1 2022-10-17 11:55:11
>>shadow+Wf
> his code probably shows up in one or more repos copy-pasted with improper LGPL attribution.

Can Copilot prove that and link to the source LGPL code whenever it reproduces more than half a line of code from such a source?

Because without that clear attribution trail, nobody in their right mind would contaminate their codebase with possibly stolen code. Hell, some bad actor might purposefully publish a proprietary base full of stolen LGPL code, and run scanners on other products until they get a Copilot "bite". When that happens and you get sued, good luck finding the original open source code both you and your aggressor derive from.
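
For what it's worth, the kind of attribution trail (or, from the other side, the "scanner") described above could be sketched roughly like this. This is only an illustration under stated assumptions: it matches whole lines after collapsing whitespace, the 20-character minimum is an arbitrary threshold, and `build_index`, `attribute`, and the file and function names in the usage example are hypothetical. Nothing here reflects how Copilot actually works.

```python
import re


def normalize(line: str) -> str:
    """Collapse runs of whitespace so formatting differences do not hide a match."""
    return re.sub(r"\s+", " ", line).strip()


def build_index(sources: dict, min_len: int = 20) -> dict:
    """Map each sufficiently long normalized line to the set of sources containing it.

    `sources` maps a (hypothetical) file name to its full text. Lines shorter
    than `min_len` characters are skipped, since lines like "{" or "return 0;"
    appear everywhere and prove nothing.
    """
    index = {}
    for name, text in sources.items():
        for line in text.splitlines():
            key = normalize(line)
            if len(key) >= min_len:
                index.setdefault(key, set()).add(name)
    return index


def attribute(completion: str, index: dict) -> dict:
    """Return every indexed line found in `completion`, with the sources it came from."""
    hits = {}
    for line in completion.splitlines():
        key = normalize(line)
        if key in index:
            hits[key] = index[key]
    return hits
```

Usage, with made-up inputs:

```python
sources = {"lgpl_lib.c": "int  example_transpose(const mat *A, int values)\n{\n    return 0;\n}\n"}
completion = "int example_transpose(const mat *A,   int values)\n{\n"
print(attribute(completion, build_index(sources)))
```

Exact line matching is the crudest possible choice. It misses renamed identifiers and reordered code, and it cannot tell you which of several matching sources was the original, which is exactly the problem when the same code has been copy-pasted across repos.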
