OpenAI didn’t copy Scarlett Johansson’s voice for ChatGPT, records show

>>richar+(OP)
I was perusing some Simpsons clips this afternoon and came across a story to the effect of "So and so didn't want to play himself, so Dan Castellaneta did the voice." It's a good impression and people didn't seem very upset about that. I am not sure how this is different. (Apparently this particular "impression" predates the Her character, so it's even easier to not be mad about. It's just a coincidence. They weren't even trying to sound like her!)

I read a lot of C&D letters from celebrities here and on Reddit, and a lot of them are in the form of "I am important so I am requesting that you do not take advantage of your legal rights." I am not a fan. (If you don't want someone to track how often you fly your private jet, buy a new one for each trip. That is the legal option that is available to you. But I digress...)

>>jrockw+wL
Surely there’s some kind of difference between “voice impression for a two-line cameo in one episode of an animated sitcom” and “reproducing your voice as the primary interface for a machine that could be used by billions of people and is worth hundreds of billions of dollars.”

Is there a name for this AI fallacy? The one where programmers make an inductive leap like, for example, if a human can read one book to learn something, then it’s ok to scan millions of books into a computer system because it’s just another kind of learning.

>>pavlov+fM
It's not a fallacy. Behind the AI are 180M users inputting their own problems and giving their guidance. Those millions of books only teach language skills; they are not memorized verbatim, except in rare instances of duplicated text in the training set. There is not enough space to store 10 trillion tokens in a model.

And if we wanted to replicate copyrighted text with an LLM, it would still be a bad idea. It's better to just find a copy online: faster, more precise, and usually free. We here often post paywalled articles in the comments; it's so easy to circumvent the paywalls that we don't even blink twice at it.

Using LLMs to infringe is not even the intended purpose, and it only happens when the user makes a special effort to prompt the model with the first paragraph.

What I find offensive is restricting the circulation of ideas under the guise of copyright. In fact, copyright should only protect expression, not the underlying ideas and styles. Those are free to learn, and AIs are just an extension of their human users.
