Where did it come from then? And what license did the original have?
> and is in hundreds of repositories - many with permissive licenses like WTFPL and many including the same comments.
If the original was GPL or proprietary, then all of these copies with different licenses are violating the license of the original. Just because it exists everywhere does not mean Copilot can use it without violating the original license.
> It's not really a large amount of material, either.
No, but I would argue that it is enough for copyright because it is original.
> GitHub claims they haven't found any "recitations" that appeared fewer than 10 times in the training data.
Key word is "claim". We can test that claim. Or rather, you can: if you have access to Copilot, try the test I suggested at https://news.ycombinator.com/item?id=28018816 . Let me know the result. Even better, try it with:
// Computes the index of them item.
map_index(
because what's in that function is definitely copyrightable.
> With the exceptions mentioned above, what you get back from asking for more code won't just be more and more of a particular work. Realistically I think you'd be able to get significantly more from Google Books.
That can only be tested with time. Or with the test I gave above.
I think that with time, more and more examples will appear until it is clear that Copilot is a problem.
Nevertheless, a court somewhere (I think South Africa) recently ruled that an AI cannot be an inventor. If an AI cannot be an inventor, why can it hold copyright? And if it can't hold copyright, I argue it's infringing.
Again, only time will tell which of us is correct according to the courts, but I intend to demonstrate to them that I am.
From what I read, the code has been altered and iterated on as it was passed down. The magic number constant is claimed to have been derived by Cleve Moler and Gregory Walsh.
> If the original was GPL or proprietary, then all of these copies with different licenses are violating the license of the original. Just because it exists everywhere does not mean Copilot can use it without violating the original license.
If it was originally proprietary (this predates the GPL), I believe the liability would be on whoever took that proprietary code and republished it under MIT/etc.
To be clear, I'm not recommending that you use code you know has been incorrectly licensed. Just that in cases where certain "folk code" is seemingly widely available under permissive terms, Copilot isn't doing much that an honest human wouldn't.
> Key word is "claim". We can test that claim. Or rather, you can, if you have access to Copilot
I don't, unfortunately. As a side note, your function already existed in Apache-licensed code. But since it's not in many repositories, I'd be willing to bet Copilot won't regurgitate it - I could message around a few people who might be able to try it.
> Nevertheless, a court somewhere (I think South Africa) recently ruled that an AI cannot be an inventor. If an AI cannot be an inventor, why can it hold copyright?
GitHub's intention isn't for Copilot to hold the code's copyright, but for the user to.
That is true, so I have two things I can do:
1) I can argue that Copilot is actually the distributor of the code, which means Copilot is infringing, or
2) I can go after the user for infringing, and if I win, that user would not want to use Copilot anymore for liability reasons. Or they could go after Microsoft themselves.
Why not do both? So that's what I am doing, or rather, will do.
// Computes the index of them item.
map_index(int item, int *array, int size)
{
    int i;
    for (i = 0; i < size; i++)
    {
        if (array[i] == item)
        {
            return i;
        }
    }
    return -1;
}