>>ssgodd+(OP)
I'd bet they win, but how do you possibly measure the dollar amount? If you strip out 100% of NYT content from GPT-4, I don't think you'd notice a difference. But if you go domain by domain and continue stripping training data, the model will eventually get worse.
>>mritch+W2
Take the NYT's estimated losses from this "innovation" and multiply by 10^x, where "x" is high enough to make tech companies stop and think before they break laws next time. That would be my approach at least.
>>necrof+N4
The legal argument, which I'm sure you are very well aware of, is that training a model on data, reorganizing it, and then presenting that data as your own is copyright infringement.
>>LargeT+X5
Can you elaborate a bit more? That’s actually just a claim, not a legal argument.
Copyright law allows for transformative uses that add something new, with a further purpose or different character, and do not substitute for the original use of the work. Are LLMs not transformative?