zlacker

1. Vetch+(OP) 2022-10-17 06:35:54
With high probability, what happened here is that this code is an important piece of code infrastructure that has been copied into a fair number of places. That means humans are copying it without attribution, or copying from someone downstream who did, while the relevant license is not propagated anywhere near as reliably as the code itself.

It doesn't change the licensing issue, but it does mean people are already copying and using copyrighted code without respecting the original license, with no AI involved.

There should be a way to reverse engineer code LLMs to see which core bits of memorized code they build on. Another, more complex option is a combination of provenance tracking and semantic hashing of every function in the code used for training. A third, non-technical option is a rethinking of IP.
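
To make the provenance-plus-hashing idea a bit more concrete, here is a rough sketch in Python. It is only an illustration under heavy assumptions: it hashes a normalized AST, which is a structural fingerprint and a crude stand-in for real semantic hashing, so it catches renamed copies but not rewritten logic. The names `function_fingerprints`, `build_index`, and `lookup` are made up for this example, and the repository name and license in the usage note are placeholders.

```python
import ast
import hashlib


class _Normalizer(ast.NodeTransformer):
    """Rename identifiers so that renamed copies produce the same AST."""

    def __init__(self):
        self.names = {}

    def _canon(self, name):
        # Map each distinct identifier to v0, v1, ... in order of first use.
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = "f"  # the function's own name is not part of the fingerprint
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._canon(node.arg)
        node.annotation = None  # ignore type annotations
        return node

    def visit_Name(self, node):
        node.id = self._canon(node.id)
        return node


def function_fingerprints(source):
    """Yield (original name, hash) for each top-level function in `source`."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            name = node.name
            normalized = _Normalizer().visit(node)
            # ast.dump omits line numbers by default, so layout does not matter.
            digest = hashlib.sha256(ast.dump(normalized).encode()).hexdigest()
            yield name, digest


def build_index(corpus):
    """Build hash -> provenance from (origin, license, source) tuples."""
    index = {}
    for origin, license_id, source in corpus:
        for name, digest in function_fingerprints(source):
            index.setdefault(digest, []).append((origin, license_id, name))
    return index


def lookup(index, snippet):
    """Return known provenance for every top-level function in `snippet`."""
    return {
        name: index.get(digest, [])
        for name, digest in function_fingerprints(snippet)
    }
```

You would call `build_index` once over the training corpus, e.g. `build_index([("example/repo", "LGPL-2.1", source_text)])`, and then run `lookup` on generated code to see whether any function matches something with a known origin and license. Anything beyond identifier renaming (reordered statements, different control flow) would need a fuzzier fingerprint than an exact hash.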

replies(1): >>cycoma+O2
2. cycoma+O2 2022-10-17 07:07:26
>>Vetch+(OP)
>With high probability, what happened here is that this code is an important piece of code infrastructure that has been copied into a fair number of places. That means humans are copying it without attribution, or copying from someone downstream who did, while the relevant license is not propagated anywhere near as reliably as the code itself.

The original poster said it was in a private repository.

>It doesn't change the licensing issue, but it does mean people are already copying and using copyrighted code without respecting the original license, with no AI involved.

I don't get the argument. Many people copy or pirate MS Windows and MS Office. What do you think MS would say to a company it caught with unlicensed copies if the company's excuse was "the PCs came preinstalled with Windows and we didn't check whether there was a valid license"?
