zlacker

[parent] [thread] 41 comments
1. bufbup+(OP)[view] [source] 2022-05-23 22:14:32
At the end of the day, if you ask for a nurse, should the model output a male or female by default? If the input text lacks context/nuance, then the model must have some bias to infer the user's intent. This holds true for any image it generates, not just the politically sensitive ones. For example, if I ask for a picture of a person and don't get one with pink hair, is that a shortcoming of the model?

I'd say that bias is only an issue if the model is unable to respond to additional nuance in the input text. For example, if I ask for a "male nurse", it should be able to generate the less likely combination. Same with other races, hair colors, etc. Trying to generate a model that's "free of correlative relationships" is impossible because the model would never have the infinitely pedantic input text to describe the exact output image.

replies(5): >>karpie+h1 >>slg+22 >>sangno+97 >>pshc+ib >>webmav+8T7
2. karpie+h1[view] [source] 2022-05-23 22:22:40
>>bufbup+(OP)
> At the end of the day, if you ask for a nurse, should the model output a male or female by default?

Randomly pick one.

> Trying to generate a model that's "free of correlative relationships" is impossible because the model would never have the infinitely pedantic input text to describe the exact output image.

Sure, and you can never make a medical procedure 100% safe. That doesn't mean you don't try to make them safer. You can trim the obvious low-hanging fruit, though.

replies(3): >>calvin+E2 >>pxmpxm+x3 >>nmfish+7E5
3. slg+22[view] [source] 2022-05-23 22:27:22
>>bufbup+(OP)
This type of bias sounds a lot easier to explain away as a non-issue when we are using "nurse" as the hypothetical prompt. What if the prompt is "criminal", "rapist", or some other negative? Would that change your thought process or would you be okay with the system always returning a person of the same race and gender that statistics indicate is the most likely? Do you see how that could be a problem?
replies(3): >>tines+E3 >>true_r+X5 >>rpmism+M7
◧◩
4. calvin+E2[view] [source] [discussion] 2022-05-23 22:30:48
>>karpie+h1
what if I asked the model to show me a sunday school photograph of baptists in the National Baptist Convention?
replies(1): >>rvnx+46
◧◩
5. pxmpxm+x3[view] [source] [discussion] 2022-05-23 22:37:24
>>karpie+h1
> Randomly pick one.

How does the model back out the "certain people would like to pretend it's a fair coin toss that a randomly selected nurse is male or female" feature?

It won't be in any representative training set, so you're back to fishing for stock photos on Getty rather than generating things.

replies(1): >>shadow+88
◧◩
6. tines+E3[view] [source] [discussion] 2022-05-23 22:38:16
>>slg+22
Not the person you responded to, but I do see how someone could be hurt by that, and I want to avoid hurting people. But is this the level at which we should do it? Could skewing search results, i.e. hiding the bias of the real world, give us the impression that everything is fine and we don't need to do anything to actually help people?

I have a feeling that we need to be real with ourselves and solve problems and not paper over them. I feel like people generally expect search engines to tell them what's really there instead of what people wish were there. And if the engines do that, people can get agitated!

I'd almost say that hurt feelings are a prerequisite for real change, hard though that may be.

These are all really interesting questions brought up by this technology, thanks for your thoughts. Disclaimer, I'm a fucking idiot with no idea what I'm talking about.

replies(2): >>magica+17 >>slg+38
◧◩
7. true_r+X5[view] [source] [discussion] 2022-05-23 22:53:57
>>slg+22
Cultural biases aren't uniform across nations. If a prompt returns Caucasians for nurses and other races for criminals, then most people in my country would not see that as racism, simply because there are not, and never have been, enough Caucasian residents for anyone to create significant race theories about them.

This is a far cry from, say, the USA, where that would instantly trigger a response, since until the 1960s there was widespread race-based segregation.

◧◩◪
8. rvnx+46[view] [source] [discussion] 2022-05-23 22:54:38
>>calvin+E2
The pictures I got from a similar model when asking for a "sunday school photograph of baptists in the National Baptist Convention": https://ibb.co/sHGZwh7
replies(1): >>calvin+q6
◧◩◪◨
9. calvin+q6[view] [source] [discussion] 2022-05-23 22:58:52
>>rvnx+46
and how do we _feel_ about that outcome?
replies(1): >>andyba+S59
◧◩◪
10. magica+17[view] [source] [discussion] 2022-05-23 23:03:41
>>tines+E3
> Could skewing search results, i.e. hiding the bias of the real world

Which real world? The population you sample from is going to make a big difference. Do you expect it to reflect your day to day life in your own city? Own country? The entire world? Results will vary significantly.

replies(2): >>sangno+o7 >>tines+S7
11. sangno+97[view] [source] 2022-05-23 23:04:46
>>bufbup+(OP)
> At the end of the day, if you ask for a nurse, should the model output a male or female by default?

This depends on the application. As an example, it would be a problem if it's used in a CV-screening app that implicitly down-ranks male applicants to nurse positions, resulting in fewer interviews for them.

◧◩◪◨
12. sangno+o7[view] [source] [discussion] 2022-05-23 23:06:41
>>magica+17
For AI, "real world" is likely "the world, as seen by Silicon Valley."
◧◩
13. rpmism+M7[view] [source] [discussion] 2022-05-23 23:09:18
>>slg+22
It's an unfortunate reflection of reality. There are three possible outcomes:

1. The model provides a reflection of reality, as politically inconvenient and hurtful as it may be.

2. The model provides an intentionally obfuscated version with either random traits or non-correlative traits.

3. The model refuses to answer.

Which of these is ideal to you?

replies(1): >>slg+r9
◧◩◪◨
14. tines+S7[view] [source] [discussion] 2022-05-23 23:09:39
>>magica+17
I'd say it doesn't actually matter, as long as the population sampled is made clear to the user.

If I ask for pictures of Japanese people, I'm not shocked when all the results are of Japanese people. If I asked for "criminals in the United States" and all the results are black people, that should concern me, not because the data set is biased but because the real world is biased and we should do something about that. The difference is that I know what set I'm asking for a sample from, and I can react accordingly.

replies(3): >>magica+Lb >>nyolfe+8d >>jfoste+v11
◧◩◪
15. slg+38[view] [source] [discussion] 2022-05-23 23:10:32
>>tines+E3
>Could skewing search results, i.e. hiding the bias of the real world

Your logic seems to rest on this assumption, which I don't think is justified. "Skewing search results" is not the same as "hiding the biases of the real world". Showing the most statistically likely result is not the same as showing the world as it truly is.

A generic nurse is statistically going to be female most of the time. However, a model that returns every nurse as female is not showing the real world as it is. It is exaggerating and reinforcing the bias of the real world. It inherently requires a more advanced model to actually represent the real world. I think it is reasonable for the creators to avoid sharing models known to not be smart enough to avoid exaggerating real world biases.

replies(1): >>roboca+Si
◧◩◪
16. shadow+88[view] [source] [discussion] 2022-05-23 23:11:22
>>pxmpxm+x3
Yep, that's the hard problem. Google is not comfortable releasing the API for this until they have it solved.
replies(1): >>zarzav+Pb
◧◩◪
17. slg+r9[view] [source] [discussion] 2022-05-23 23:20:53
>>rpmism+M7
What makes you think those are the only options? Why can't we have an option where the model returns a range of different outputs based on a prompt?

A model that returns 100% of nurses as female might be statistically more accurate than a model that returns 50% of nurses as female, but it is still not an accurate reflection of the real world. I agree that the model shouldn't return a male nurse 50% of the time. Yet an accurate model needs to be able to occasionally return a male nurse without being directly prompted for a "male nurse". Anything else would also be inaccurate.
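
To sketch what I mean with made-up numbers: instead of always emitting the most likely attribute, the generator (or a layer in front of it) could sample unspecified attributes from an estimated distribution. The split and the helper below are assumptions purely for illustration:

  import random

  # Assumed split, for illustration only: roughly 1 in 8 nurses is male.
  NURSE_GENDER_DIST = {"female": 0.88, "male": 0.12}

  def sample_attribute(dist):
      """Sample a value from a categorical distribution instead of
      always returning the most likely value (the mode)."""
      values, weights = zip(*dist.items())
      return random.choices(values, weights=weights, k=1)[0]

  # Always taking the mode collapses this to 100% female; sampling
  # keeps the occasional male nurse without an explicit prompt.
  samples = [sample_attribute(NURSE_GENDER_DIST) for _ in range(1000)]
  print(samples.count("male") / len(samples))  # ~0.12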

replies(1): >>rpmism+C9
◧◩◪◨
18. rpmism+C9[view] [source] [discussion] 2022-05-23 23:22:06
>>slg+r9
So, the model should have a knowledge of political correctness, and return multiple results if the first choice might reinforce a stereotype?
replies(1): >>slg+sa
◧◩◪◨⬒
19. slg+sa[view] [source] [discussion] 2022-05-23 23:29:24
>>rpmism+C9
I never said anything about political correctness. You implied that you want a model that "provides a reflection of reality". All nurses being female is not "a reflection of reality". It is a distortion of reality because the model doesn't actually understand gender or nurses.
replies(1): >>rpmism+7D
20. pshc+ib[view] [source] 2022-05-23 23:37:19
>>bufbup+(OP)
Perhaps to avoid this issue, future versions of the model would throw an error like “bias leak: please specify a gender for the nurse at character 32”
◧◩◪◨⬒
21. magica+Lb[view] [source] [discussion] 2022-05-23 23:40:56
>>tines+S7
> If I asked for "criminals in the United States" and all the results are black people, that should concern me, not because the data set is biased

Well the results would unquestionably be biased. All results being black people wouldn't reflect reality at all, and hurting feelings to enact change seems like a poor justification for incorrect results.

> I'd say it doesn't actually matter, as long as the population sampled is made clear to the user.

Ok, and let's say I ask for "criminals in Cheyenne, Wyoming" and it doesn't know the answer to that; should it just do its best to answer? Seems risky if people are going to get fired up about it and act on it to get "real change".

That seems like a good parallel to what we're talking about here, since it's very unlikely that crime statistics were fed into this image generating model.

◧◩◪◨
22. zarzav+Pb[view] [source] [discussion] 2022-05-23 23:41:30
>>shadow+88
But why is it a problem? The AI is just a mirror showing us ourselves. That’s a good thing. How does it help anyone to make an AI that presents a fake world so that we can pretend that we live in a world that we actually don’t? Disassociation from reality is more dangerous than bias.
replies(3): >>shadow+Wd >>astran+Yn >>Daishi+8y
◧◩◪◨⬒
23. nyolfe+8d[view] [source] [discussion] 2022-05-23 23:53:02
>>tines+S7
> If I asked for "criminals in the United States" and all the results are black people,

curiously, this search actually only returns white people for me on GIS

◧◩◪◨⬒
24. shadow+Wd[view] [source] [discussion] 2022-05-23 23:59:02
>>zarzav+Pb
> The AI is just a mirror showing us ourselves.

That's one hypothesis.

◧◩◪◨
25. roboca+Si[view] [source] [discussion] 2022-05-24 00:40:43
>>slg+38
> I think it is reasonable for the creators to avoid sharing models known to not be smart enough to avoid exaggerating real world biases.

Every model will have some random biases. Some of those random biases will undesirably exaggerate the real world. Every model will undesirably exaggerate something. Therefore no model should be shared.

Your goal is nice, but impractical?

replies(2): >>slg+em >>barney+LZ
◧◩◪◨⬒
26. slg+em[view] [source] [discussion] 2022-05-24 01:11:18
>>roboca+Si
Fittingly, your comment falls into the same criticism I had of the model. It shows a refusal/inability to engage with the full complexities of the situation.

I said "It is reasonable... to avoid sharing models". That is an acknowledgment that the creators are acting reasonably. It does not imply anything as extreme as "no model should be shared". The only way to get from A to B there is to assume that I think there is only one reasonable response and every other possible reaction is unreasonable. Doesn't that seem like a silly assumption?

replies(1): >>roboca+3X
◧◩◪◨⬒
27. astran+Yn[view] [source] [discussion] 2022-05-24 01:25:45
>>zarzav+Pb
In the days when Sussman was a novice Minsky once came to him as he sat hacking at the PDP-6. "What are you doing?", asked Minsky. "I am training a randomly wired neural net to play Tic-Tac-Toe." "Why is the net wired randomly?", asked Minsky. "I do not want it to have any preconceptions of how to play" Minsky shut his eyes, "Why do you close your eyes?", Sussman asked his teacher. "So that the room will be empty." At that moment, Sussman was enlightened.

The AI doesn’t know what’s common or not. You don’t know if it’s going to be correct unless you’ve tested it. Just assuming whatever it comes out with is right is going to work as well as asking a psychic for your future.

replies(1): >>zarzav+1D
◧◩◪◨⬒
28. Daishi+8y[view] [source] [discussion] 2022-05-24 03:13:19
>>zarzav+Pb
The AI is a mirror of the text and image corpora it was presented, as parsed and sanitized by the team in question.
◧◩◪◨⬒⬓
29. zarzav+1D[view] [source] [discussion] 2022-05-24 04:14:16
>>astran+Yn
The model makes inferences about the world from training data. When it sees more female nurses than male nurses in its training set, it infers that most nurses are female. This is a correct inference.

If they were to weight the training data so that there were an equal number of male and female nurses, then it may well produce male and female nurses with equal probability, but it would also learn an incorrect understanding of the world.

That is quite distinct from weighting the data so that it has a greater correspondence to reality. For example, if Africa is not represented well then weighting training data from Africa more strongly is justifiable.
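
To make that distinction concrete (numbers made up): correcting the data toward an estimated real-world distribution is ordinary importance weighting, which is not the same thing as flattening everything to an even split. A minimal sketch:

  from collections import Counter

  def reweight_to_target(labels, target_freqs):
      """Per-sample weights so the weighted group frequencies match a
      target distribution (an estimate of reality), not a uniform one."""
      observed = Counter(labels)
      n = len(labels)
      return [target_freqs[g] / (observed[g] / n) for g in labels]

  # Toy corpus: 95% female-labelled nurse images vs. a (hypothetical)
  # real-world estimate of 88% / 12%.
  labels = ["female"] * 95 + ["male"] * 5
  weights = reweight_to_target(labels, {"female": 0.88, "male": 0.12})
  # Female samples get weight ~0.93, male samples ~2.4: nudged toward
  # reality, not forced to 50/50.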

The point is, it’s not a good thing for us to intentionally teach AIs a world that is idealized and false.

As these AIs work their way into our lives it is essential that they reproduce the world in all of its grit and imperfections, lest we start to disassociate from reality.

Chinese media (or insert your favorite unfree regime) also presents China as a utopia.

replies(2): >>astran+LD >>shadow+fn1
◧◩◪◨⬒⬓
30. rpmism+7D[view] [source] [discussion] 2022-05-24 04:14:57
>>slg+sa
A majority of nurses are women, therefore a woman would be a reasonable representation of a nurse. Obviously that's not a helpful stereotype, because male nurses exist and face challenges due to not fitting the stereotypes. The model is dumb, and outputs what it's seen. Is that wrong?
replies(1): >>webmav+JV7
◧◩◪◨⬒⬓⬔
31. astran+LD[view] [source] [discussion] 2022-05-24 04:22:33
>>zarzav+1D
> The model makes inferences about the world from training data. When it sees more female nurses than male nurses in its training set, it infers that most nurses are female. This is a correct inference.

No, it is not, because you don't know whether it was shown each of its samples the same number of times, or whether it overweighted some of its samples more than others. There are normal reasons both of these would happen.

◧◩◪◨⬒⬓
32. roboca+3X[view] [source] [discussion] 2022-05-24 07:40:59
>>slg+em

  "When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean — neither more nor less."

  "The question is," said Alice, "whether you can make words mean so many different things."

  "The question is," said Humpty Dumpty, "which is to be master — that's all."
◧◩◪◨⬒
33. barney+LZ[view] [source] [discussion] 2022-05-24 08:09:53
>>roboca+Si
> Your goal is nice, but impractical?

If the only way to do AI is to encode racism etc, then we shouldn't be doing AI at all.

◧◩◪◨⬒
34. jfoste+v11[view] [source] [discussion] 2022-05-24 08:23:42
>>tines+S7
In a way, if the model brings back an image for "criminals in the United States" that isn't based on the statistical reality, isn't it essentially complicit in sweeping a major social issue under the rug?

We may not like what it shows us, but blindfolding ourselves is not the solution to that problem.

replies(1): >>webmav+fU7
◧◩◪◨⬒⬓⬔
35. shadow+fn1[view] [source] [discussion] 2022-05-24 11:45:08
>>zarzav+1D
> As these AIs work their way into our lives it is essential that they reproduce the world in all of its grit and imperfections...

Is it? I'm reminded of the Microsoft Tay experiment, where they attempted to train an AI by letting Twitter users interact with it.

The result was a non-viable mess that nobody liked.

◧◩
36. nmfish+7E5[view] [source] [discussion] 2022-05-25 16:56:52
>>karpie+h1
What about preschool teacher?

I say this because I’ve been visiting a number of childcare centres over the past few days and I still have yet to see a single male teacher.

37. webmav+8T7[view] [source] 2022-05-26 07:53:26
>>bufbup+(OP)
> If the input text lacks context/nuance, then the model must have some bias to infer the user's intent. This holds true for any image it generates, not just the politically sensitive ones. For example, if I ask for a picture of a person and don't get one with pink hair, is that a shortcoming of the model?

You're ignoring that these models are stochastic. If I ask for a nurse and always get an image of a woman in scrubs, then yes, the model exhibits bias. If I get a male nurse half the time, we can say the model is unbiased WRT gender, at least. The same logic applies to CEOs always being old white men, criminals always being Black men, and so on. Stochastic models can output results that, when aggregated, exhibit a distribution from which we can infer bias or the lack thereof.
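
A sketch of what "aggregate and infer" looks like in practice; generate() and classify_gender() below are stand-ins for whatever model and attribute classifier you actually have, not any particular API:

  from collections import Counter

  def estimate_output_bias(generate, classify_gender, prompt="a nurse", n=200):
      """Sample the model repeatedly on one prompt and aggregate the
      attribute of interest; a heavily skewed frequency suggests bias."""
      counts = Counter(classify_gender(generate(prompt)) for _ in range(n))
      return {label: count / n for label, count in counts.items()}

  # {"female": 1.0} -> every sample is a woman in scrubs: biased.
  # {"female": 0.5, "male": 0.5}, or something near a real-world estimate,
  # is what an unbiased (or reality-matching) model would show.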

◧◩◪◨⬒⬓
38. webmav+fU7[view] [source] [discussion] 2022-05-26 08:04:34
>>jfoste+v11
At the very least we should expect that the results not be more biased than reality. Not all criminals are Black. Not all are men. Not all are poor. If the model (which is stochastic) only outputs poor Black men, rather than a distribution that is closer to reality, it is exhibiting bias and it is fair to ask why the data it picked that bias up from is not reflective of reality.
replies(1): >>jfoste+tV7
◧◩◪◨⬒⬓⬔
39. jfoste+tV7[view] [source] [discussion] 2022-05-26 08:18:15
>>webmav+fU7
Yeah, it makes sense for the results to simply reflect reality as closely as possible. No bias in any direction is desirable.
replies(1): >>webmav+uVa
◧◩◪◨⬒⬓⬔
40. webmav+JV7[view] [source] [discussion] 2022-05-26 08:21:45
>>rpmism+7D
It isn't wrong, but we aren't talking about the model somehow magically transcending the data it's seen. We're talking about making sure the data it sees is representative, so the results it outputs are as well.

Given that male nurses exist (and though less common, certainly aren't rare), why has the model apparently seen so few?

There actually is a fairly simple explanation: the images it has seen labelled "nurse" are more likely to come from stock photography sites than from photos of actual nurses, and stock photography is often stereotypical rather than typical.

◧◩◪◨⬒
41. andyba+S59[view] [source] [discussion] 2022-05-26 16:38:46
>>calvin+q6
It's gone now. What was it?
◧◩◪◨⬒⬓⬔⧯
42. webmav+uVa[view] [source] [discussion] 2022-05-27 09:05:19
>>jfoste+tV7
Sarcasm, eh? At least there's no way THAT could be taken the wrong way.
[go to top]