The vast majority of people who would use a matrix transform function they got from code completion (or from a GitHub or stack overflow search) probably don’t care what the license is. They’ll just paste in the code. To many developers publicly viewable code is in the public domain. Code pilot just shortens the search by a few seconds.
Microsoft should try todo better (I’m not sure how), but the sad fact is that trying to enforce a license on a code fragment is like dropping dollar bills on the sidewalk with a note pinned to them saying “do not buy candy with this dollar”
If CoPilot makes everyone see how ridiculous that is, that's a win in my book.
Of course if someone does manage to set a precedent that including copyrighted works in AI training data without an explicit license to do so, GitHub Copilot would be screwed and at best have to start over with a blank slate if they can't be grandfathered. But this would affect almost all products based on the recent advancements in AI and they're backed by fairly large companies (after all, GitHub is owned by Microsoft and a lot of the other AI stuff traces back to Alphabet and there are a lot of startups funded by huge and influential VC companies). Given the US's history of business-friendly legislation, I doubt we'll see copyright laws being enforced against training data unless someone upsets Disney.
Github can only trust push timestamps.
It’s not a binary all perfectly or nothing at all. The law looks at intent and so doesn’t punish mistakes or errors so long as you aren’t being malicious or reckless or negligent.
> No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider
So the act of hosting copyrighted content is not actually a copyright violation for Github. They're not obligated to preemptively determine who the original copyright owner of some piece of code is, as they're not the judge of that in the first place. Even if you complain that someone stole your code, how is Github supposed to know who's lying? Copyright is a legal issue between the copyright holder and the copyright infringer. So the only thing Github is required to do is to respond to DMCA takedown notices.