A voice can be zero-shot encoded into a vector of a few hundred KB: timbre, prosody, lots of characteristics. That's less information than a fingerprint. And more importantly, that's something you can dial in with a few knobs, simply by ear.
It's why your brain can easily hear things in other people's voices. They're not hard signals to reproduce. Some people with flexible vocal ranges can even impersonate others quite easily.
I'm sure most people have gotten, "you sound like X" once or twice. Not unlike the "you look like Y" comments.
Voices really aren't that fingerprint-y.
If we really want to split hairs and argue from biology, who "owns" the voice of a set of identical twins?
Choose a different career, like maybe public-opinion influencer.
Is your voice truly unique out of the 8 billion out there in the world? Nobody could plausibly pass as you?
1. Push forward legislation/regulation/lawsuits/public opinion via whatever method is available, probably unions or other collective power.
2. Embrace the technology. Maybe build a voice model of yourself and sell the license affordably and broadly to those who want the convenience and scalability (as in number of phrases) of voice AI but don't want the mess of wading into unsettled legal territory. Or learn what voice AI is good at and what it is bad at, and find your niche. Survive by being at the cutting edge of this new world, setting the standards and being knowledgeable.
3. Walk away from an industry that is either dying or about to become unrecognizable.
The genie's not going back in the bottle.
But there are still some voices that are strongly associated with just one person in everyone's minds, like David Attenborough's. For example, if I heard Attenborough's voice on my local train announcements and it turned out to be an impersonator, I think I would feel like the company was taking advantage of Attenborough's voice. I.e., they would be using the fact that everyone knows this voice to their advantage, without actually paying Attenborough.
While voices aren't technically that unique, when linked to certain situations or when heard by enough people, they become unique in that context. I'm sure no one will care about Attenborough's voice 100 years from now.
Or hm, maybe AI voice tools will keep his voice alive forever in Planet Earth spinoffs, just like Sinatra has been resurrected for mashups.
As a recent example, in Baldur’s Gate 3, Andrew Wincott voiced Raphael, an NPC antagonist, who to my untrained ear sounded exactly like Charles Dance, and the character model bore more than a passing resemblance to Mr. Dance as well.
It was not a Charles Dance carbon copy but all aspects of the character were strongly aligned with him.
I’m wondering where the line is drawn when it comes to style and the personal aspects of one’s craft.
Some of this is probably part of personal perception.
But, IMO, the value of mud sludge on a table that is indistinguishable from sandwiches is tiny. Fake Chanels are 10^2 to 10^5 times cheaper than the real thing no matter how close they are. Don't listen to people begging you for life to join the counterfeiter ring, they don't make much anyway.
2) Your ability to mine cash off this physical quality depends on the inability of this quality to be reproduced.
3) This quality can now be reproduced.
I would think very hard and very long about staying in this particular business. Personally I think there is still plenty of work left because not everyone is happy with going full sci-fi dystopia, but it will be niche and scrappy.
"I have unique characteristics that make me an excellent programmer. I earn money by tweaking for-loops. Recently, GPT has become able to tweak for-loops better, faster and more cheaply than I can. How can I protect myself in case companies decide to replicate my unique abilities?"
In this case, the actress selected for OpenAI was clearly selected for similarity to SJ. And that by itself would have been okay, because the actress is speaking in her natural voice, and SJ doesn't have a monopoly on voice acting...but OpenAI went further, and had the unknown[1] actress base her inflections, cadence, and mannerisms on SJ's performance in the movie Her. And Altman even tweeted the movie's name to advertise the connection.
The problem is that there is well-settled case law stretching back over several decades that makes this a slam-dunk case for SJ, because it doesn't matter that OpenAI didn't "steal" her voice; they stole her likeness.[2] It wasn't just some unknown actress speaking in her own voice, it was an actress with a voice similar to SJ's, given lines and direction by OpenAI with the clear intent of mimicking SJ's voice performance in one of her more famous roles.
[1] There is a very short list of a few actresses who both sound like SJ and do voice-over work circulating around Hollywood, so a lot of people have a pretty good idea of who it is, but nobody will identify the actress unless she identifies herself, out of solidarity.
[2] Likeness rights are quite strong in the U.S. They're even stronger in Europe.
You might think voice is something you're born with. It's not; it comes partly from your language and your background. So the chance of someone from half a world away sounding just like you purely because of DNA is quite low.
As for Scarlett Johansson, I remember her from the controversy over the live-action Ghost in the Shell movie. Not fondly.
You can also obviously compare voice recordings of similar-sounding but different individuals; they rarely look similar on spectrograms.
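For anyone who wants to poke at the spectrogram point themselves, here is a minimal sketch of the idea in plain Python. It rests on a lot of assumptions: the "voices" are synthetic tones rather than real recordings, `synthetic_voice`, `magnitude_spectrum` and `cosine_similarity` are made-up helper names, and it compares a single magnitude spectrum rather than a full time-frequency spectrogram. It only shows how such a comparison could be set up, not that real voices behave this way.

```python
import cmath
import math


def synthetic_voice(f0_bin, harmonic_weights, n=256):
    """Toy stand-in for a voice: harmonics of a fundamental sitting on DFT bin f0_bin."""
    return [
        sum(w * math.sin(2 * math.pi * f0_bin * (h + 1) * t / n)
            for h, w in enumerate(harmonic_weights))
        for t in range(n)
    ]


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for the lower half of the bins (slow, but dependency-free)."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical data: two takes of "speaker A" (same pitch, slightly different
# harmonic balance) and one take of "speaker B" (different pitch and balance).
speaker_a = synthetic_voice(8, [1.0, 0.5, 0.25])
speaker_a_again = synthetic_voice(8, [1.0, 0.45, 0.3])
speaker_b = synthetic_voice(11, [0.3, 1.0, 0.6])

same = cosine_similarity(magnitude_spectrum(speaker_a),
                         magnitude_spectrum(speaker_a_again))
diff = cosine_similarity(magnitude_spectrum(speaker_a),
                         magnitude_spectrum(speaker_b))
print(f"same speaker: {same:.3f}, different speakers: {diff:.3f}")
```

With real recordings you would reach for an actual STFT over many frames from an audio library instead of a hand-rolled DFT, and the similarity scores would be far less clean than with these toy signals.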
If the tech actually works well enough to stand in for humans, I think we will very quickly come to see recording real humans in fictional pieces as old-fashioned.
Case in point - my wife has two twin brothers. Even though I've been interacting with them for over 10 years now, they sound exactly the same to me. If I close my eyes then there is zero chance I could tell them apart by voice alone. I know, it's an anecdote - but while I'm sure you could tell them apart by some really small thing that they do, to someone who isn't actively looking for those cues they are - for all intents and purposes - the same.
WaPo's reporting states that the individual in charge of the interaction, Jang, modeled it after Hollywood movies, and worked with a film director specifically to accomplish this goal. And the executive responsible for the artistic decisions, CTO Murati, was conveniently not made available for WaPo to interview.
OpenAI has no credibility here, given its extensive history of dissembling as a company. If Her and SJ weren't the driving inspiration for the Sky voice, they would have made Murati available to explicitly refute those claims. Her absence speaks volumes.
And OpenAI dropping Sky immediately speaks even louder. It means that somewhere there is a smoking gun that would destroy them in court. [Edit: it turns out the smoking guns were already public: in addition to the CEO's Her tweet, his co-founder Karpathy explicitly linked the voice product to SJ. Game. Set. Match.]