zlacker

1. tsimio+(OP) 2023-12-27 14:43:41
It should be noted that many licenses contain explicit exemptions that allow copying program data into RAM and into CPU registers. Whether those exemptions are truly necessary is debatable at best, but arguably training a model (especially one you then distribute or give access to) on copyrighted data is vastly different from regular copying into memory and should require explicit licensing.

The fact that the model can reproduce large chunks of the original text verbatim is proof positive that it contains copies of the original text encoded in its weights. If I wrote a program that crawled the NYT site, zipped the contents, retrieved articles based on keyword searches, and made them available online, would you not say I'm infringing their copyright?
