Based on these companies' arguments that copyrighted material is not actually reproduced by these models, and that any seemingly-infringing use is the responsibility of the user of the model rather than those who produced it, anyone could freely generate an infinite number of high-truthiness OpenAI anecdotes, freshly laundered by the inference engine, that couldn't be used against the original authors without OpenAI invalidating their own legal stance with respect to their own models.
The argument about LLMs not being copyright laundromats making sense hinges the scale and non-specificity of training. There's a difference between "LLM reproduced this piece of copyrighted work because it memorized it from being fed literally half the internet", vs. "LLM was intentionally trained to specifically reproduce variants of this particular work". Whatever one's stances on the former case, the latter case would be plain infringing copyrights and admitting to it.
In other words: GPT-4 gets to get away with occasionally spitting out something real verbatim. Llama2-7b-finetune-NYTArticles does not.
Ta-da.
Not to mention: LLMs aren't oracles. Whatever they say will be dismissed as hallucinations if it isn't corroborated by other sources.
Does it really take millions dollars of compute to add additional training data to an existing model?
Plus, we're talking about employees that are leaving / left anyway.
>Not to mention: LLMs aren't oracles. Whatever they say will be dismissed as hallucinations if it isn't corroborated by other sources.
Excellent. That means plausible deniability.
Surely all those horror stories about unethical behavior are just hallucinations, no matter how specific they are.
Absolutely no reason for anyone to take them seriously. Which is why the press will not hesitate to run with that, with appropriate disclaimers, of course.
Seriously, you seem to think that in a world where numbers about death toll in Gaza are taken verbatim from Hamas without being corroborated by other sources, an AI model output will not pass the test of public scrutiny?
Very optimistic of you.