It doesn't change the licensing issue, but it does mean people are already copying and using copyrighted code without respecting the original license, with no AI involved.
There should be a way to reverse-engineer code LLMs to see which core bits of memorized code they build on. A second, more complex option is a combination of provenance tracking and semantic hashing on every function in the code used for training. A third, non-technical option is a rethinking of IP.
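To make the provenance-plus-hashing idea a bit more concrete, here is a rough sketch, not a real system. It swaps true semantic hashing for something much weaker: a hash of each function's AST after renaming identifiers, so it only catches copies that differ in naming, comments, docstrings, or formatting. The names `Provenance`, `fingerprint`, `index_source`, and `lookup` are all made up for illustration.

```python
import ast
import copy
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Where a training function came from (hypothetical record layout)."""
    repo: str
    path: str
    license: str


class _Normalizer(ast.NodeTransformer):
    """Rename every identifier to v0, v1, ... in order of first appearance."""

    def __init__(self):
        self.names = {}

    def _alias(self, name):
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_Name(self, node):
        node.id = self._alias(node.id)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        node.annotation = None  # ignore type hints
        return node


def fingerprint(func):
    """Structural hash of one function; a stand-in for semantic hashing."""
    func = _Normalizer().visit(copy.deepcopy(func))
    func.name = "f"  # the function's own name should not matter
    first = func.body[0] if func.body else None
    if (isinstance(first, ast.Expr) and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)):
        func.body = func.body[1:] or [ast.Pass()]  # drop the docstring
    return hashlib.sha256(ast.dump(func).encode()).hexdigest()


def _functions(source):
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            yield node


def index_source(source, prov, index):
    """Record the provenance of every function in one training file."""
    for func in _functions(source):
        index.setdefault(fingerprint(func), []).append(prov)


def lookup(source, index):
    """Map each function name in `source` to matching provenance records."""
    return {f.name: index.get(fingerprint(f), []) for f in _functions(source)}
```

Usage would look something like this: build the index over the training corpus, then check generated code against it.

```python
index = {}
index_source("def add(a, b):\n    total = a + b\n    return total\n",
             Provenance("example/repo", "math.py", "GPL-3.0"), index)
hits = lookup("def plus(x, y):\n    s = x + y\n    return s\n", index)
print(hits["plus"])
```

Anything that restructures the code even slightly (reordered statements, an inlined variable) would slip past this, which is why the real thing would need an actual semantic similarity measure rather than an exact hash.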
The original poster said it was in a private repository.
>It doesn't change the licensing issue, but it does mean people are already copying and using copyrighted code without respecting the original license, with no AI involved.
I don't get the argument. Many people are copying/pirating MS Windows/MS Office. What do you think MS would say to a company caught with unlicensed copies whose excuse was "the PCs came preinstalled with Windows and we didn't check if there was a valid license"?