OpenAI departures: Why can’t former employees talk?

>>fnbr+(OP)
The best approach to circumventing the nondisclosure agreement is for the affected employees to get together, write out everything they want to say about OpenAI, train an LLM on that text, and then release it.

Based on these companies' arguments that copyrighted material is not actually reproduced by these models, and that any seemingly-infringing use is the responsibility of the user of the model rather than those who produced it, anyone could freely generate an infinite number of high-truthiness OpenAI anecdotes, freshly laundered by the inference engine, that couldn't be used against the original authors without OpenAI invalidating their own legal stance with respect to their own models.

>>mwigda+OQ
Clever, but no.

The argument that LLMs are not copyright laundromats hinges on the scale and non-specificity of training. There's a difference between "LLM reproduced this piece of copyrighted work because it memorized it from being fed literally half the internet", vs. "LLM was intentionally trained to specifically reproduce variants of this particular work". Whatever one's stance on the former case, the latter case would be plainly infringing copyright, and admitting to it.

In other words: GPT-4 gets to get away with occasionally spitting out something real verbatim. Llama2-7b-finetune-NYTArticles does not.

>>TeMPOr+0T
Seems absurd that the scale being massive somehow makes it better

You would think having a massive scale just means it has infringed even more copyrights, and therefore should be in even more hot water

>>bluefi+kT
My US history teacher taught me something important. He said that if you are going to steal and don't want to get in trouble, steal a whole lot.

>>NewJaz+BT
Copying one person is plagiarism. Copying lots of people is research.

>>Pontif+9Z
True, but if you research lots of sources and still emit significant blocks of verbatim text without attribution, it’s still plagiarism. At least that’s how human authors are judged.
