zlacker

This type of bias sounds a lot easier to explain away as a non-issue when we are using "nurse" as the hypothetical prompt. What if the prompt is "criminal", "rapist", or some other negative? Would that change your thought process or would you be okay with the system always returning a person of the same race and gender that statistics indicate is the most likely? Do you see how that could be a problem?

replies(3): >>tines+C1 >>true_r+V3 >>rpmism+K5

>>slg+(OP)
Not the person you responded to, but I do see how someone could be hurt by that, and I want to avoid hurting people. But is this the level at which we should do it? Could skewing search results, i.e. hiding the bias of the real world, give us the impression that everything is fine and we don't need to do anything to actually help people?

I have a feeling that we need to be real with ourselves and solve problems and not paper over them. I feel like people generally expect search engines to tell them what's really there instead of what people wish were there. And if the engines do that, people can get agitated!

I'd almost say that hurt feelings are a prerequisite for real change, hard though that may be.

These are all really interesting questions brought up by this technology, thanks for your thoughts. Disclaimer, I'm a fucking idiot with no idea what I'm talking about.

replies(2): >>magica+Z4 >>slg+16

>>slg+(OP)
Cultural biases aren’t uniform across nations. If a prompt returns Caucasians for nurses and other races for criminals, then most people in my country would not note that as racism, simply because there are not, and never have been, enough Caucasians resident here for anyone to create significant race theories about them.

This is a far cry from, say, the USA, where that would instantly trigger a response, since until the 1960s there was widespread race-based segregation.

>>tines+C1
> Could skewing search results, i.e. hiding the bias of the real world

Which real world? The population you sample from is going to make a big difference. Do you expect it to reflect your day to day life in your own city? Own country? The entire world? Results will vary significantly.

replies(2): >>sangno+m5 >>tines+Q5

>>magica+Z4
For AI, "real world" is likely "the world, as seen by Silicon Valley."

>>slg+(OP)
It's an unfortunate reflection of reality. There are three possible outcomes:

1. The model provides a reflection of reality, as politically inconvenient and hurtful as it may be.

2. The model provides an intentionally obfuscated version with either random traits or non-correlated traits.

3. The model refuses to answer.

Which of these is ideal to you?

replies(1): >>slg+p7

>>magica+Z4
I'd say it doesn't actually matter, as long as the population sampled is made clear to the user.

If I ask for pictures of Japanese people, I'm not shocked when all the results are of Japanese people. If I asked for "criminals in the United States" and all the results are black people, that should concern me, not because the data set is biased but because the real world is biased and we should do something about that. The difference is that I know what set I'm asking for a sample from, and I can react accordingly.

replies(3): >>magica+J9 >>nyolfe+6b >>jfoste+tZ

>>tines+C1
>Could skewing search results, i.e. hiding the bias of the real world

Your logic seems to rest on this assumption, which I don't think is justified. "Skewing search results" is not the same as "hiding the biases of the real world". Showing the most statistically likely result is not the same as showing the world how it truly is.

A generic nurse is statistically going to be female most of the time. However, a model that returns every nurse as female is not showing the real world as it is. It is exaggerating and reinforcing the bias of the real world. It inherently requires a more advanced model to actually represent the real world. I think it is reasonable for the creators to avoid sharing models known to not be smart enough to avoid exaggerating real world biases.
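
To make that distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the 90% figure is made up rather than a real statistic, and `most_likely` and `sample` are hypothetical stand-ins for two ways a generator could pick an output, not how any actual image model works.

```python
import random

# Hypothetical share of nurses who are female. Illustrative only, not a statistic.
P_FEMALE = 0.9


def most_likely(p_female):
    """Always return the single most probable label (the mode)."""
    return "female" if p_female >= 0.5 else "male"


def sample(p_female, rng):
    """Draw a label in proportion to its probability."""
    return "female" if rng.random() < p_female else "male"


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

mode_outputs = [most_likely(P_FEMALE) for _ in range(n)]
sampled_outputs = [sample(P_FEMALE, rng) for _ in range(n)]

mode_share = mode_outputs.count("female") / n
sampled_share = sampled_outputs.count("female") / n

print(f"always most likely: {mode_share:.3f} female")
print(f"sampled:            {sampled_share:.3f} female")
```

Under these assumptions the first strategy outputs a female nurse every single time, turning a 90/10 split into 100/0, while the second stays close to the underlying 90%. That gap is the exaggeration I mean.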

replies(1): >>roboca+Qg

>>rpmism+K5
What makes you think those are the only options? Why can't we have an option that the model returns a range of different outputs based off a prompt?

A model that returns 100% of nurses as female might be statistically more accurate than a model that returns 50% of nurses as female, but it is still not an accurate reflection of the real world. I agree that the model shouldn't return a male nurse 50% of the time. Yet an accurate model needs to be able to occasionally return a male nurse without being directly prompted for a "male nurse". Anything else would also be inaccurate.

replies(1): >>rpmism+A7

>>slg+p7
So, the model should have a knowledge of political correctness, and return multiple results if the first choice might reinforce a stereotype?

replies(1): >>slg+q8

>>rpmism+A7
I never said anything about political correctness. You implied that you want a model that "provides a reflection of reality". All nurses being female is not "a reflection of reality". It is a distortion of reality because the model doesn't actually understand gender or nurses.

replies(1): >>rpmism+5B

>>tines+Q5
> If I asked for "criminals in the United States" and all the results are black people, that should concern me, not because the data set is biased

Well the results would unquestionably be biased. All results being black people wouldn't reflect reality at all, and hurting feelings to enact change seems like a poor justification for incorrect results.

> I'd say it doesn't actually matter, as long as the population sampled is made clear to the user.

Ok, and let's say I ask for "criminals in Cheyenne Wyoming" and it doesn't know the answer to that, should it just do its best to answer? Seems risky if people are going to get fired up about it and act on this to get "real change".

That seems like a good parallel to what we're talking about here, since it's very unlikely that crime statistics were fed into this image generating model.

>>tines+Q5
> If I asked for "criminals in the United States" and all the results are black people,

Curiously, this search actually only returns white people for me on GIS.

>>slg+16
> I think it is reasonable for the creators to avoid sharing models known to not be smart enough to avoid exaggerating real world biases.

Every model will have some random biases. Some of those random biases will undesirably exaggerate the real world. Every model will undesirably exaggerate something. Therefore no model should be shared.

Your goal is nice, but impractical?

replies(2): >>slg+ck >>barney+JX

>>roboca+Qg
Fittingly, your comment falls into the same criticism I had of the model. It shows a refusal/inability to engage with the full complexities of the situation.

I said "It is reasonable... to avoid sharing models". That is an acknowledgment that the creators are acting reasonably. It does not imply anything as extreme as "no model should be shared". The only way to get from A to B there is for you to assume that I think there is only one reasonable response and every other possible reaction is unreasonable. Doesn't that seem like a silly assumption?

replies(1): >>roboca+1V

>>slg+q8
A majority of nurses are women, therefore a woman would be a reasonable representation of a nurse. Obviously that's not a helpful stereotype, because male nurses exist and face challenges due to not fitting the stereotypes. The model is dumb, and outputs what it's seen. Is that wrong?

replies(1): >>webmav+HT7

>>slg+ck

  “When I use a word,’ Humpty Dumpty said in rather a scornful tone, ‘it means just what I choose it to mean — neither more nor less.’

  ’The question is,’ said Alice, ‘whether you can make words mean so many different things.’

  ’The question is,’ said Humpty Dumpty, ‘which is to be master — that’s all.”

>>roboca+Qg
> Your goal is nice, but impractical?

If the only way to do AI is to encode racism etc, then we shouldn't be doing AI at all.

>>tines+Q5
In a way, if the model brings back an image for "criminals in the United States" that isn't based on the statistical reality, isn't it essentially complicit in sweeping a major social issue under the rug?

We may not like what it shows us, but blindfolding ourselves is not the solution to that problem.

replies(1): >>webmav+dS7

>>jfoste+tZ
At the very least we should expect that the results not be more biased than reality. Not all criminals are Black. Not all are men. Not all are poor. If the model (which is stochastic) only outputs poor Black men, rather than a distribution that is closer to reality, it is exhibiting bias and it is fair to ask why the data it picked that bias up from is not reflective of reality.

replies(1): >>jfoste+rT7

>>webmav+dS7
Yeah, it makes sense for the results to simply reflect reality as closely as possible. No bias in any direction is desirable.

replies(1): >>webmav+sTa

>>rpmism+5B
It isn't wrong, but we aren't talking about the model somehow magically transcending the data it's seen. We're talking about making sure the data it sees is representative, so the results it outputs are as well.

Given that male nurses exist (and though less common, certainly aren't rare), why has the model apparently seen so few?

There actually is a fairly simple explanation: because the images it has seen labelled "nurse" are more likely from stock photography sites rather than photos of actual nurses, and stock photography is often stereotypical rather than typical.

>>jfoste+rT7
Sarcasm, eh? At least there's no way THAT could be taken the wrong way.