zlacker

[return to "GitHub Copilot, with “public code” blocked, emits my copyrighted code"]
1. ianbut+ce 2022-10-16 21:38:47
>>davidg+(OP)
I just tested it myself on a random C file I created in the middle of a Rust project I'm working on. It reproduced his full code verbatim from just the function header, so clearly it does regurgitate proprietary code, contrary to what some people have said. I do not have his source, so Copilot isn't just using existing context.

I've been finding Copilot really useful, but I'll be pausing it for now, and I'm glad I have only been using it on personal projects and not anything for work. This crosses the line in my head from legal ambiguity to legal "yeah that's gonna have to stop".

◧◩
2. shadow+Wf 2022-10-16 21:55:17
>>ianbut+ce
Searching for the function names in his libraries, I'm seeing some 32,000 hits.

I suspect he has a different problem which (thanks to Microsoft) is now a problem he has to care about: his code probably shows up in one or more repos copy-pasted with improper LGPL attribution. There'd be no way for Copilot to know that had happened, and it would have mixed in the code.

(As a side note: understanding why an ML engine outputs a particular result is still an open area of research AFAIK.)

◧◩◪
3. andrea+6V 2022-10-17 05:29:37
>>shadow+Wf
"It's too hard" isn't a valid reason for me to not follow laws and/or social norms. This is a predictable result and was predicted by many people; "oops we didn't know" is neither credible nor acceptable.
◧◩◪◨
4. Spivak+zW 2022-10-17 05:50:58
>>andrea+6V
It’s not “oops we didn’t know”; it’s “someone published a project under a permissive license which included this code.”

If your standard is “GitHub should have an oracle to the US court system and predict what the outcome of a lawsuit alleging copyright infringement for a given snippet of code would be,” then it is literally impossible for anyone to use any open source code ever, because it might contain infringing code.

There is no chain of custody for this kind of thing, which is what it would require.

◧◩◪◨⬒
5. vincne+c41 2022-10-17 07:15:35
>>Spivak+zW
This reminds me of my 4-year-old daughter. She often comes home from kindergarten with new toys. When I ask her where she got them, she tells me that a friend gave them to her as a gift. When I dig deeper and ask around, it turns out that the friends who were gifting her things were not the real owners of the gifts. I can see why it could be difficult for children to understand the concept of ownership, and that you should not gift things to others that are not your own.

So in this case Copilot just treats the situation as "someone gifted me this," and does not question if the person gifting was the real owner of the gift.

◧◩◪◨⬒⬓
6. Spivak+Rs2 2022-10-17 16:35:24
>>vincne+c41
> and does not question if the person gifting was the real owner of the gift

Unless you can figure out a method of determining whether someone owns the code that doesn't involve "try suing in court for copyright infringement and see if it sticks," we're kinda stuck. Just because a codebase contains an exact or similar snippet from another codebase doesn't mean that snippet reaches the threshold of copyrightable work. Conversely, just because two code snippets look wildly different doesn't mean it's not infringement, and detecting that automatically is solving the halting problem.

What you'd want in order for software to actually solve this is a chain of custody, which we don't have. If you require everyone to assume that everyone else could be lying or mistaken about infringement, then using any open source project for anything lands you in legal hot water.

In fact, when you upload code to GitHub you grant them a license to do things like "display it," which you can't do if you don't actually own the copyright or have a license. So even before the code is ever slurped into Copilot, the same exact legal situation arises as to whether GitHub is legally allowed to host the code at all. Can you imagine if, when you uploaded code to GitHub, you had to sign a document saying you owned the code and indemnifying Microsoft against any lawsuit alleging infringement? Oh boy, people would not enjoy that.
