zlacker

1. Ariel_+(OP)[view] [source] 2024-05-23 00:53:49
It seems increasingly difficult for common people to protect their voices, especially when even Scarlett Johansson can't manage it. As a part-time voice actor with a unique voice, I'm concerned about what I should do if my voice is used without permission and the company denies it. How can I protect myself in such a situation?
replies(8): >>echelo+ev >>trevyn+6w >>MattGa+ww >>solida+xx >>fsloth+fy >>numpad+1z >>huygen+rz >>pfannk+gG
2. echelo+ev[view] [source] 2024-05-23 05:58:09
>>Ariel_+(OP)
I think we'll find voices not that unique.

A voice can be zero-shot encoded into a vector of a few hundred kb, capturing timbre, prosody, and lots of other characteristics. That's less information than a fingerprint. And more importantly, that's something you can dial in with a few knobs simply by listening.

It's why your brain can easily hear things in other people's voices. They're not hard signals to reproduce. Some people with flexible vocal ranges can even impersonate others quite easily.

I'm sure most people have gotten, "you sound like X" once or twice. Not unlike the "you look like Y" comments.

Voices really aren't that fingerprint-y.

If we really want to split hairs and argue from biology, who "owns" the voice of a set of identical twins?

replies(1): >>guitar+0y
3. trevyn+6w[view] [source] 2024-05-23 06:09:04
>>Ariel_+(OP)
>How can I protect myself in such a situation?

Choose a different career, like maybe public-opinion influencer.

4. MattGa+ww[view] [source] 2024-05-23 06:13:37
>>Ariel_+(OP)
> with a unique voice

Is your voice truly unique out of the 8 billion out there in the world? Nobody could plausibly pass as you?

replies(1): >>numpad+pD
5. solida+xx[view] [source] 2024-05-23 06:20:40
>>Ariel_+(OP)
I don't have helpful advice for what you asked (my advice would be to spend the money to get a legal expert's opinion), but if I were a voice actor, I would see three paths:

1. Push forward legislation/regulation/lawsuits/public opinion via whatever method is available, probably unions or other collective power.

2. Embrace the technology. Maybe build a voice model of yourself, and sell the license affordably and broadly to those who want to take advantage of the convenience and scalability (as in number of phrases) of voice AI but don't want the mess of wading into unsettled legal territory. Or learn what voice AI is good at and what it is bad at and find your niche. Survive by being at the cutting edge of this new world, setting the standards and being knowledgeable.

3. Walk away from an industry that is either dying or about to become unrecognizable.

The genie's not going back in the bottle.

replies(1): >>grugag+A41
◧◩
6. guitar+0y[view] [source] [discussion] 2024-05-23 06:25:30
>>echelo+ev
I agree on the technical aspect.

But still, there are some voices that are highly associated with just one person in everyone's minds, like David Attenborough's. For example, if I heard Attenborough speaking my local train announcements, but it was actually an impersonator, I think I would feel like the company was taking advantage of Attenborough's voice. I.e. they would be using the fact that everyone knows this voice to their advantage, without actually paying Attenborough.

While voices aren't technically that unique, when linked to certain situations or when heard by enough people, they become unique in that context. I'm sure no one cares about Attenborough's voice 100 years from now.

Or hm, maybe AI voice tools will keep his voice alive forever in Planet Earth spinoffs, just like Sinatra has been resurrected for mashups.

7. fsloth+fy[view] [source] 2024-05-23 06:28:38
>>Ariel_+(OP)
What are the unique aspects of a sound? A lot of people look and sound stunningly alike.

As a recent example, in Baldur’s Gate 3, Andrew Wincott voiced Raphael, an NPC antagonist, who to my untrained ear sounded exactly like Charles Dance, and the character model bore more than a passing resemblance to Mr. Dance as well.

It was not a Charles Dance carbon copy but all aspects of the character were strongly aligned with him.

I’m wondering where the line between style and the personal aspects of one’s craft is drawn.

Some of this is probably part of personal perception.

replies(3): >>gamblo+CC >>nottor+kF >>numpad+lF
8. numpad+1z[view] [source] 2024-05-23 06:34:55
>>Ariel_+(OP)
There's nothing that can be done technically. A near-perfect voice-changing model can be built from 3-5 minutes of conversation on top of a base model, if all the user wants is a voice indistinguishable from yours.

But, IMO, the value of mud sludge on a table that is indistinguishable from sandwiches is tiny. Fake Chanels are 10^2 to 10^5 times cheaper than the real thing no matter how close they are. Don't listen to people begging you for life to join the counterfeiter ring, they don't make much anyway.

9. huygen+rz[view] [source] 2024-05-23 06:39:05
>>Ariel_+(OP)
1) Your income depends on a physical quality you have.

2) Your ability to mine cash off this physical quality depends on the inability of this quality to be reproduced.

3) This quality can now be reproduced.

I would think very hard and very long about staying in this particular business. Personally I think there is still plenty of work left because not everyone is happy with going full sci-fi dystopia, but it will be niche and scrappy.

"I have unique characteristics that make me an excellent programmer. I earn money by tweaking for-loops. Recently, GPT has become able to tweak for-loops better, faster and more cheaply than I can. How can I protect myself in case companies decide to replicate my unique abilities?"

◧◩
10. gamblo+CC[view] [source] [discussion] 2024-05-23 07:03:47
>>fsloth+fy
Wincott and Dance are both British actors who began their careers on stage, so they have the similar accents, cadences, and vocal mannerisms common to stage actors. For example, both of them speak like Patrick Stewart, another English actor who also began his career on stage. But otherwise they all clearly have very different voices: they differ in timbre and vocal fry, and only one of them (Dance) can sing well, with a surprisingly large vocal range (see his performance as the Phantom in Phantom of the Opera).

In this case, the actress selected for OpenAI was clearly selected for similarity to SJ. And that by itself would have been okay, because the actress is speaking in her natural voice, and SJ doesn't have a monopoly on voice acting...but OpenAI went further, and had the unknown[1] actress base her inflections, cadence, and mannerisms on SJ's performance in the movie Her. And Altman even tweeted the movie's name to advertise the connection.

The problem is that there is well-settled case law stretching back over several decades that makes this a slam-dunk case for SJ, because it doesn't matter that OpenAI didn't "steal" her voice: they stole her likeness.[2] It wasn't just some unknown actress speaking in her own voice, it was an actress with a voice similar to SJ's, given lines and direction by OpenAI with the clear intent of mimicking SJ's voice performance in one of her more famous roles.

[1] There is a very short list of a few actresses who both sound like SJ and do voice-over work circulating around Hollywood, so a lot of people have a pretty good idea of who it is, but nobody will identify the actress unless she identifies herself, out of solidarity.

[2] Likeness rights are quite strong in the U.S. They're even stronger in Europe.

replies(1): >>ars+uE
◧◩
11. numpad+pD[view] [source] [discussion] 2024-05-23 07:11:26
>>MattGa+ww
There are only 330m US Americans. Just having American throat development patterns narrows you down to a group of about 4% of the world's population, and it only goes down from there - e.g. the PNW has only 13m people total, half that by gender, which makes someone from there a member of a group of 0.08% of the world.

You might think voice is something you're born with. It's not; it comes partly from your language and background. So the random chance of someone from half a world away sounding just like you purely because of DNA is quite low.
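
As a rough check of that arithmetic, here is a minimal sketch in Python. It assumes a world population of about 8 billion (the figure used upthread) together with the approximate 330m and 13m figures quoted here; the `share` helper is made up for illustration. With these inputs the US share lands near 4% and the PNW single-gender share near 0.08%.

```python
# Rough population shares; every figure is the approximate one quoted in the thread.
WORLD = 8_000_000_000   # "8 billion" from the parent comment (assumption)
US = 330_000_000        # "330m US Americans"
PNW = 13_000_000        # "13m people total" in the Pacific Northwest


def share(group: float, total: float = WORLD) -> float:
    """Return `group` as a percentage of `total`."""
    return 100.0 * group / total


us_share = share(US)              # roughly 4%
pnw_one_gender = share(PNW / 2)   # roughly 0.08%

print(f"US share of world: {us_share:.2f}%")
print(f"PNW, one gender:   {pnw_one_gender:.3f}%")
```

These are only head counts; they say nothing about how many people within a group actually sound alike.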

replies(1): >>gambit+LI
◧◩◪
12. ars+uE[view] [source] [discussion] 2024-05-23 07:21:30
>>gamblo+CC
Every single thing in your second paragraph is directly contradicted by the article at hand, yet you say them like they are established facts as opposed to things you just made up.
replies(1): >>gamblo+GX1
◧◩
13. nottor+kF[view] [source] [discussion] 2024-05-23 07:28:50
>>fsloth+fy
Who's Charles Dance? :)

As for Scarlett Johansson, I remember her from the Ghost in the Shell live-action movie controversy. Not fondly.

◧◩
14. numpad+lF[view] [source] [discussion] 2024-05-23 07:28:56
>>fsloth+fy
Try loading a random voice file recorded by real (voice) actors into Audacity, switch the view to spectrogram mode, drag down to expand, and compare it to yours. A professionally done voice should look like neatly arranged salmon slices; yours will look like PCIe eye diagrams.

You can also obviously compare multiple voice files recorded by similar-sounding but different individuals; they rarely look similar on spectrograms.
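
For anyone without Audacity handy, the same idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: synthetic harmonic tones stand in for recorded voices, a single frame stands in for a full spectrogram, and the helpers `synth_voice`, `spectrum` and `cosine_similarity` plus every parameter are made up here, not part of any real tool.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, assumed for illustration
N = 256             # samples in the single analysis frame


def synth_voice(f0, harmonic_weights):
    """Toy 'voice': a weighted sum of harmonics of f0 (not real speech)."""
    return [
        sum(w * math.sin(2 * math.pi * f0 * (k + 1) * n / SAMPLE_RATE)
            for k, w in enumerate(harmonic_weights))
        for n in range(N)
    ]


def spectrum(samples):
    """Magnitude spectrum of one Hann-windowed frame via a naive DFT."""
    n = len(samples)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed)))
        for k in range(n // 2)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two magnitude spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Two toy voices with different pitch and harmonic balance.
voice_a = spectrum(synth_voice(125.0, [1.0, 0.5, 0.25]))
voice_b = spectrum(synth_voice(220.0, [1.0, 0.2, 0.6]))

same = cosine_similarity(voice_a, voice_a)
different = cosine_similarity(voice_a, voice_b)
print(f"same voice: {same:.3f}, different voices: {different:.3f}")
```

The only thing this shows is that identical input scores a similarity of 1 while a tone with a different pitch and harmonic mix scores clearly lower; real recordings would need many frames and a proper FFT library.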

replies(1): >>gambit+gI
15. pfannk+gG[view] [source] 2024-05-23 07:36:33
>>Ariel_+(OP)
I don’t think this will be a concern for long. Either the tech isn’t good enough and it lacks emotive nuance to the point where a human is still preferred, or it is good enough and there is no point in basing it on a human actor in the first place vs using an original, wholly fabricated voice or appearance.

If the tech actually works well enough to stand in for humans, I think we will very quickly see recording real humans in fictional pieces as old fashioned.

◧◩◪
16. gambit+gI[view] [source] [discussion] 2024-05-23 07:50:05
>>numpad+lF
Sure, except literally no one actually does this. You listen to a voice and it sounds similar in your head? That's who you picture when you hear it. Unless you're a robot I guess.
replies(1): >>numpad+cL
◧◩◪
17. gambit+LI[view] [source] [discussion] 2024-05-23 07:55:22
>>numpad+pD
You're talking about literally identical voices due to throat development, cultural background, etc. - which is obviously technically true, but I imagine the number of people who sound 99% like you (where a casual listener can't tell the difference) must be quite large.

Case in point - my wife has two twin brothers. Even though I've been interacting with them for over 10 years now, they sound exactly the same to me. If I close my eyes then there is zero chance I could tell them apart by voice alone. I know, it's an anecdote - but while I'm sure you could tell them apart by some really small thing that they do, to someone who isn't actively looking for those cues they are - for all intents and purposes - the same.

◧◩◪◨
18. numpad+cL[view] [source] [discussion] 2024-05-23 08:15:39
>>gambit+gI
The point is that human voices are technically and verifiably unique, tangential or perhaps antithetical to your/average person's perception.
replies(1): >>gambit+jN
◧◩◪◨⬒
19. gambit+jN[view] [source] [discussion] 2024-05-23 08:34:32
>>numpad+cL
I don't see how that's relevant here - court cases about situations like these are decided on the criteria of "if you show this to an average person on the street would they be able to tell the difference" not "if you load this up in a specialized piece of software and look at the spectrogram is there a difference".
replies(1): >>numpad+CO
◧◩◪◨⬒⬓
20. numpad+CO[view] [source] [discussion] 2024-05-23 08:45:02
>>gambit+jN
I think that will be a very clever and useful defense against CCTV footage and DNA analysis reports! Best legal advice ever.
replies(1): >>gambit+0Q
◧◩◪◨⬒⬓⬔
21. gambit+0Q[view] [source] [discussion] 2024-05-23 08:55:44
>>numpad+CO
I don't understand why you are being sarcastic right now. Trademark cases are always decided on "if a person was shown this logo/song/whatever could they mistake it for the trademarked property of another company", not "well if you load it up in Paint you can see that some pixels around the edges are different so it's technically not the same logo your honour!".
replies(1): >>numpad+DQ
◧◩◪◨⬒⬓⬔⧯
22. numpad+DQ[view] [source] [discussion] 2024-05-23 09:03:25
>>gambit+0Q
so... your position is now in favor of SJ? I don't see consistency in your comments other than that the aim is to downplay uniqueness of voice to justify OAI's actions after the fact.
replies(1): >>gambit+DS
◧◩◪◨⬒⬓⬔⧯▣
23. gambit+DS[view] [source] [discussion] 2024-05-23 09:21:24
>>numpad+DQ
No, my position hasn't changed - the average person on the street might think this voice sounds like SJ, but since SJ doesn't own exclusive rights to anyone else in the world sounding like her I don't think she has legal ground to stand on, unless OpenAI pretended it is actually her. But I know for certain that the case will not be decided on spectrograms of the voice.
◧◩
24. grugag+A41[view] [source] [discussion] 2024-05-23 11:07:11
>>solida+xx
This whole industry is built on top of ripped-off content, appropriated from many sources without compensation. A few big lawsuits and things could take an unpredictable turn.
◧◩◪◨
25. gamblo+GX1[view] [source] [discussion] 2024-05-23 16:17:06
>>ars+uE
No, I read the article.

WaPo's reporting states that the individual in charge of the interaction, Jang, modeled it after Hollywood movies, and worked with a film director specifically to accomplish this goal. And the executive responsible for the artistic decisions, CTO Murati, was conveniently not made available for WaPo to interview.

OpenAI has no credibility here, given its extensive history of dissembling as a company. If Her and SJ weren't the driving inspiration for the Sky voice, they would have made Murati available to explicitly refute those claims. Her absence speaks volumes.

And OpenAI dropping Sky immediately speaks even louder. It means that somewhere there is a smoking gun that would destroy them in court. [Edit: it turns out the smoking guns were already public: in addition to the CEO's Her tweet, his co-founder Karpathy explicitly linked the voice product to SJ. Game. Set. Match.]
