I don’t necessarily fault OpenAI’s decision to initially train their models without entering into licensing agreements - they probably wouldn’t exist, and the generative AI revolution may never have happened, if they had put the horse before the cart. I do think they should quickly course-correct at this point and accept the fact that they clearly owe something to the creators of the content they are consuming. If they don’t, they are setting themselves up for a bigger loss down the road and leaving the door open for a more established competitor (Google) to do it the right way.
This is also related to earlier studies showing that OpenAI’s models have a bad habit of just regurgitating training data verbatim. If your training data is protected IP you didn’t secure the rights for, then that’s a real big problem. Hence this lawsuit. If it succeeds, the floodgates will open.
In what sense are they claiming their generated contents as their own IP?
https://www.zdnet.com/article/who-owns-the-code-if-chatgpts-...
> OpenAI (the company behind ChatGPT) does not claim ownership of generated content. According to their terms of service, "OpenAI hereby assigns to you all its right, title and interest in and to Output."
https://openai.com/policies/terms-of-use
> Ownership of Content. As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output.
Saying they don’t claim the rights over their output while outputting large chunks verbatim is the old YouTube scheme of uploading a movie and saying “no copyright intended”.
The situations aren’t remotely similar, and that much should be obvious. In one instance ChatGPT is reproducing copyrighted work, and in the other Word is taking keyboard input from the user; Word itself isn’t producing anything.
> GPT is just a tool.
I don’t know what point this is supposed to make. It is not “just a tool” in the sense that it has no impact on what gets written.
Which brings us back to the beginning.
> the user who’s asking it to produce copyrighted content.
ChatGPT was trained on copyrighted content. The fact that it CAN reproduce the copyrighted content and the fact that it was trained on it is what the argument is about.