GitHub Copilot, with “public code” blocked, emits my copyrighted code

>>davidg+(OP)
I would imagine the root problem here is people taking copyrighted code, pasting it in their project and disregarding the license. To me this seems common, especially when it comes to toy, test and hobby projects.

I don't see how Copilot or similar tools can solve this problem without vetting each project.

>>CapsAd+Jf
That's an entirely plausible explanation, but it doesn't mean that Microsoft has any less of a legal nightmare on their hands.

>>yjftsj+Yg
I'm not really sure what I think about this. How responsible should Microsoft be for someone's badly licensed code on their platform? If they somehow had the ability to ban projects using stolen snippets of code, I don't think I'd dare to host my hobby projects there.

If you can't trust that the code in a project is compatible with the license of the project, then the only option I see is that Copilot cannot exist.

I love free software and whatnot, but I have a feeling this situation would've been quite different if Copilot had been made by the free software community and accidentally trained on some non-free code.

>>CapsAd+Qi
> I'm not really sure what I think about this. How responsible should Microsoft be for someone's badly licensed code on their platform?

That's a serious undersell of the responsibility on the part of Microsoft/GitHub.

It seems as though they did approximately zero work to verify that none of the code was infringing. Things they could have tried but apparently didn't:

1) Ask developers to opt in to Copilot scanning of their repositories, and alongside that have them certify that they hold copyright over all lines of code included in the repository.

2) Use a training dataset of only public repositories listed under applicable pre-identified licensing schemes, from established groups, e.g., BSD-licensed code from the various BSD OSes.

3) Seek out examples from standard libraries in other programming languages with suitable licenses.
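
To make options 1 and 2 concrete, here is a rough sketch of what an opt-in plus license-allowlist filter might look like. Everything in it is an assumption for illustration: the `Repo` fields, the `ALLOWED_LICENSES` set, and the function names are hypothetical, not anything GitHub is known to use. It also still trusts whatever license the repository declares, so it doesn't address copied code sitting in a mislabeled project.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical allowlist of SPDX license identifiers.
# Which licenses actually belong here would be a legal decision.
ALLOWED_LICENSES = {"BSD-2-Clause", "BSD-3-Clause", "MIT", "Apache-2.0"}


@dataclass
class Repo:
    """Hypothetical metadata record for one public repository."""
    name: str
    license_id: Optional[str]  # SPDX identifier the repo declares, if any
    opted_in: bool             # owner agreed to training use (option 1)
    owner_certified: bool      # owner certified they hold copyright (option 1)


def eligible_for_training(repo: Repo) -> bool:
    """True only if the owner opted in, certified, and the license is allowlisted."""
    return (
        repo.opted_in
        and repo.owner_certified
        and repo.license_id in ALLOWED_LICENSES
    )


def filter_training_set(repos: List[Repo]) -> List[Repo]:
    """Keep only the repositories that pass every check."""
    return [r for r in repos if eligible_for_training(r)]
```

A repository with no declared license, or one whose owner never opted in, is simply dropped from the set rather than guessed at.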

It seems like they did nothing and just hoped. I can't see how anyone would try to rely on this thing in a commercial context after it's been shown to do this over and over. The well has been poisoned.
