zlacker

[parent] [thread] 97 comments
1. tines+(OP)[view] [source] 2022-05-23 21:33:39
This raises some really interesting questions.

We certainly don't want to perpetuate harmful stereotypes. But is it a flaw that the model encodes the world as it really is, statistically, rather than as we would like it to be? By this I mean that there are more light-skinned people in the west than dark, and there are more women nurses than men, which is reflected in the model's training data. If the model only generates images of female nurses, is that a problem to fix, or a correct assessment of the data?

If some particular demographic shows up in 51% of the data but 100% of the model's output shows that one demographic, that does seem like a statistics problem that the model could correct by just picking less likely "next token" predictions.

Also, is it wrong to have localized models? For example, should a model for use in Japan conform to the demographics of Japan, or to that of the world?

replies(10): >>karpie+I1 >>daenz+D2 >>godels+W2 >>jonny_+u3 >>Imnimo+x7 >>SnowHi+8a >>skybri+Qa >>ben_w+Ac >>pshc+Ch >>webmav+7K7
2. karpie+I1[view] [source] 2022-05-23 21:43:40
>>tines+(OP)
It depends on whether you'd like the model to learn causal or correlative relationships.

If you want the model to understand what a "nurse" actually is, then it shouldn't be associated with female.

If you want the model to understand how the word "nurse" is usually used, without regard for what a "nurse" actually is, then associating it with female is fine.

The issue with a correlative model is that it can easily be self-reinforcing.

replies(5): >>jdashg+v3 >>Ludwig+Q5 >>bufbup+m7 >>sineno+wu >>drdeca+Vo1
3. daenz+D2[view] [source] 2022-05-23 21:48:57
>>tines+(OP)
I think the statistics/representation problem is a big problem on its own, but IMO the bigger problem here is democratizing access to human-like creativity. Currently, the ability to create compelling art is only held by those with some artistic talent. With a tool like this, that restriction is gone. Everyone, no matter how uncreative, untalented, or uncommitted, can create compelling visuals, provided they can use language to describe what they want to see.

So even if we managed to create a perfect model of representation and inclusion, people could still use it to generate extremely offensive images with little effort. I think people see that as profoundly dangerous. Restricting the ability to be creative seems to be a new frontier of censorship.

replies(2): >>concor+f6 >>adrian+z8
4. godels+W2[view] [source] 2022-05-23 21:50:24
>>tines+(OP)
> But is it a flaw that the model encodes the world as it really is

I want to be clear here, bias can be introduced at many different points. There's dataset bias, model bias, and training bias. Every model is biased. Every dataset is biased.

Yes, the real world is also biased. But I want to make sure that there are ways to resolve this issue. It is terribly difficult, especially in a DL framework (even more so in a generative model), but it is possible to significantly reduce the real world bias.

replies(1): >>tines+O5
5. jonny_+u3[view] [source] 2022-05-23 21:53:37
>>tines+(OP)
> But is it a flaw that the model encodes the world as it really is

Does a bias towards lighter skin represent reality? I was under the impression that Caucasians are a minority globally.

I read the disclaimer as "the model does NOT represent reality".

replies(4): >>tines+l5 >>fnordp+I5 >>ma2rte+R8 >>nearbu+eP
◧◩
6. jdashg+v3[view] [source] [discussion] 2022-05-23 21:53:42
>>karpie+I1
Additionally, if you optimize for most-likely-as-best, you will end up with the stereotypical result 100% of the time, instead of in proportional frequency to the statistics.

Put another way, when we ask for an output optimized for "nursiness", is that not a request for some ur-stereotypical nurse?

replies(2): >>jvalen+a5 >>ar_lan+79
◧◩◪
7. jvalen+a5[view] [source] [discussion] 2022-05-23 22:02:52
>>jdashg+v3
You could simply encode a score for how well the output matches the input. If 25% of trees in summer are brown, then perhaps 25% of the output trees should be brown as well. The model scores itself on frequencies as well as correctness.
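
To make that concrete, here is a minimal sketch of such a score (the attribute labels and numbers are purely illustrative, not anything a real model actually computes):

  # Toy sketch: score a batch on per-sample correctness AND on how closely
  # attribute frequencies in the batch match a reference distribution.
  from collections import Counter

  def frequency_penalty(attributes, reference):
      # L1 distance between observed frequencies and a reference
      # distribution such as {"brown": 0.25, "green": 0.75}.
      counts = Counter(attributes)
      total = len(attributes)
      return sum(abs(counts.get(k, 0) / total - p) for k, p in reference.items())

  def batch_score(per_sample_scores, attributes, reference, weight=1.0):
      # Mean correctness minus a weighted frequency-mismatch penalty.
      correctness = sum(per_sample_scores) / len(per_sample_scores)
      return correctness - weight * frequency_penalty(attributes, reference)

  # Four generated trees, three green and one brown: no penalty, since the
  # batch matches the 25%-brown reference exactly.
  print(batch_score([0.9, 0.8, 0.85, 0.9],
                    ["green", "green", "green", "brown"],
                    {"brown": 0.25, "green": 0.75}))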
replies(2): >>spywar+u7 >>astran+68
◧◩
8. tines+l5[view] [source] [discussion] 2022-05-23 22:03:51
>>jonny_+u3
Well first, I didn't say caucasian; light-skinned includes Spanish people and many others that caucasian excludes, and that's why I said the former. Also, they are a minority globally, but the GP mentioned "Western stereotypes", and they're a majority in the West, so that's why I said "in the west" when I said that there are more light-skinned people.
◧◩
9. fnordp+I5[view] [source] [discussion] 2022-05-23 22:05:58
>>jonny_+u3
Worse, these models are fed from media sourced in a society that tells a different story about reality than what actually exists. How can they be accurate? They just reflect the biases of our various media and arts. But I don’t think there’s any meaningful resolution in the present other than acknowledging this and trying to release more representative models as you can.
◧◩
10. tines+O5[view] [source] [discussion] 2022-05-23 22:06:18
>>godels+W2
> Every dataset is biased.

Sure, I wasn't questioning the bias of the data, I was talking about the bias of the real world and whether we want the model to be "unbiased about bias" i.e. metabiased or not.

Showing nurses equally as men and women is not biased, but it's metabiased, because the real world is biased. Whether metabias is right or not is more interesting than the question of whether bias is wrong because it's more subtle.

Disclaimer: I'm a fucking idiot and I have no idea what I'm talking about so take with a grain of salt.

replies(1): >>john_y+j9
◧◩
11. Ludwig+Q5[view] [source] [discussion] 2022-05-23 22:06:26
>>karpie+I1
> If you want the model to understand how the word "nurse" is usually used, without regard for what a "nurse" actually is, then associating it with female is fine.

That’s a distinction without a difference. Meaning is use.

replies(2): >>tines+P6 >>mdp202+n8
◧◩
12. concor+f6[view] [source] [discussion] 2022-05-23 22:09:05
>>daenz+D2
I can't quite tell if you're being sarcastic about the idea that people being able to make things other people would find offensive is a problem. Are you missing an /s?
◧◩◪
13. tines+P6[view] [source] [discussion] 2022-05-23 22:11:16
>>Ludwig+Q5
Not really; the gender of a nurse is accidental, other properties are essential.
replies(3): >>codeth+J9 >>paisaw+Kc >>Ludwig+9f
◧◩
14. bufbup+m7[view] [source] [discussion] 2022-05-23 22:14:32
>>karpie+I1
At the end of the day, if you ask for a nurse, should the model output a male or female by default? If the input text lacks context/nuance, then the model must have some bias to infer the user's intent. This holds true for any image it generates, not just the politically sensitive ones. For example, if I ask for a picture of a person and don't get one with pink hair, is that a shortcoming of the model?

I'd say that bias is only an issue if it's unable to respond to additional nuance in the input text. For example, if I ask for a "male nurse" it should be able to generate the less likely combination. Same with other races, hair colors, etc... Trying to generate a model that's "free of correlative relationships" is impossible because the model would never have the infinitely pedantic input text to describe the exact output image.

replies(5): >>karpie+D8 >>slg+o9 >>sangno+ve >>pshc+Ei >>webmav+u08
◧◩◪◨
15. spywar+u7[view] [source] [discussion] 2022-05-23 22:15:06
>>jvalen+a5
Suppose 10% of people have green skin. And 90% of those people have broccoli hair. White people don't have broccoli hair.

What percent of people should be rendered as white people with broccoli hair? What if you request green people? Or broccoli-haired people? Or white broccoli-haired people? Or broccoli-haired nazis?

It gets hard with these conditional probabilities.
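
Working through the toy numbers (entirely made up, per the hypothetical above) shows why each prompt implies a different conditional distribution:

  # Hypothetical population from the comment above.
  p_green = 0.10                    # 10% of people have green skin
  p_broccoli_given_green = 0.90     # 90% of green-skinned people have broccoli hair
  p_broccoli_given_white = 0.0      # white people don't have broccoli hair

  # Unconditional prompt "a person": broccoli hair should appear ~9% of the time.
  p_broccoli = p_green * p_broccoli_given_green                  # 0.09
  # Prompt "a white person with broccoli hair": contradicts the premises.
  p_white_and_broccoli = (1 - p_green) * p_broccoli_given_white  # 0.0
  # Prompt "broccoli haired people": everyone shown should be green-skinned,
  # since P(green | broccoli hair) = 1.0 under these assumptions.
  print(p_broccoli, p_white_and_broccoli)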

16. Imnimo+x7[view] [source] 2022-05-23 22:15:26
>>tines+(OP)
>If some particular demographic shows up in 51% of the data but 100% of the model's output shows that one demographic, that does seem like a statistics problem that the model could correct by just picking less likely "next token" predictions.

Yeah, but you get that same effect on every axis, not just the one you're trying to correct. You might get male nurses, but they have green hair and six fingers, because you're sampling from the tail on all axes.

replies(1): >>tines+g8
◧◩◪◨
17. astran+68[view] [source] [discussion] 2022-05-23 22:18:25
>>jvalen+a5
The only reason these models work is that we don’t interfere with them like that.

Your description is closer to how the open source CLIP+GAN models did it - if you ask for “tree” it starts growing the picture towards treeness until it’s all averagely tree-y rather than being “a picture of a single tree”.

It would be nice if asking for N samples got a diversity of traits you didn’t explicitly ask for. OpenAI seems to solve this by not letting you see it generate humans at all…

◧◩
18. tines+g8[view] [source] [discussion] 2022-05-23 22:19:41
>>Imnimo+x7
Yeah, good point, it's not as simple as I thought.
◧◩◪
19. mdp202+n8[view] [source] [discussion] 2022-05-23 22:20:38
>>Ludwig+Q5
Very certainly not, since use is individual and thus a function of competence. So, adherence to meaning depends on the user. Conflict resolution?

And anyway, contextually, the representational natures of "use" (instances) and of "meaning" (definition) are completely different.

replies(2): >>layer8+jb >>Ludwig+of
◧◩
20. adrian+z8[view] [source] [discussion] 2022-05-23 22:22:20
>>daenz+D2
> So even if we managed to create a perfect model of representation and inclusion, people could still use it to generate extremely offensive images with little effort. I think people see that as profoundly dangerous.

Do they see it as dangerous? Or just offensive?

I can understand why people wouldn’t want a tool they have created to be used to generate disturbing, offensive or disgusting imagery. But I don’t really see how doing that would be dangerous.

In fact, I wonder if this sort of technology could reduce the harm caused by people with an interest in disgusting images, because no one needs to be harmed for a realistic image to be created. I am creeping myself out with this line of thinking, but it seems like one potential beneficial - albeit disturbing - outcome.

> Restricting the ability to be creative seems to be a new frontier of censorship.

I agree this is a new frontier, but it’s not censorship to withhold your own work. I also don’t really think this involves much creativity. I suppose coming up with prompts involves a modicum of creativity, but the real creator here is the model, it seems to me.

replies(3): >>tines+sa >>gknoy+Ea >>webmav+H58
◧◩◪
21. karpie+D8[view] [source] [discussion] 2022-05-23 22:22:40
>>bufbup+m7
> At the end of the day, if you ask for a nurse, should the model output a male or female by default?

Randomly pick one.

> Trying to generate a model that's "free of correlative relationships" is impossible because the model would never have the infinitely pedantic input text to describe the exact output image.

Sure, and you can never make a medical procedure 100% safe. Doesn't mean that you don't try to make them safer. You can trim the obvious low hanging fruit though.

replies(3): >>calvin+0a >>pxmpxm+Ta >>nmfish+tL5
◧◩
22. ma2rte+R8[view] [source] [discussion] 2022-05-23 22:23:59
>>jonny_+u3
Caucasians are overrepresented in internet pictures.
replies(2): >>pxmpxm+qc >>jonny_+6d
◧◩◪
23. ar_lan+79[view] [source] [discussion] 2022-05-23 22:25:17
>>jdashg+v3
You could stipulate that it roll a die based on percentages: if 70% of Americans are "white", then 70% of the time show a white person, 13% of the time show a black person, and so on.

That's excessively simplified but wouldn't this drop the stereotype and better reflect reality?
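
A rough sketch of that die roll, using the illustrative percentages above (not real census figures, and a real system would still need to pick a reference population, as discussed elsewhere in the thread):

  import random

  # Illustrative shares only; "other" just makes the weights sum to 1.
  DEMOGRAPHICS = {"white": 0.70, "black": 0.13, "asian": 0.06, "other": 0.11}

  def sample_unspecified_attribute(distribution):
      # Pick a value in proportion to its share instead of always taking the mode.
      values, weights = zip(*distribution.items())
      return random.choices(values, weights=weights, k=1)[0]

  # Repeated prompts would then vary rather than showing one group 100% of the time.
  print([sample_unspecified_attribute(DEMOGRAPHICS) for _ in range(5)])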

replies(2): >>ghayes+ga >>SnowHi+ya
◧◩◪
24. john_y+j9[view] [source] [discussion] 2022-05-23 22:26:52
>>tines+O5
Please be kinder to yourself. You need to be your own strongest advocate, and that's not incompatible with being humble. You have plenty to contribute to this world, and the vast majority of us appreciate what you have to offer.
replies(1): >>Smoosh+sc
◧◩◪
25. slg+o9[view] [source] [discussion] 2022-05-23 22:27:22
>>bufbup+m7
This type of bias sounds a lot easier to explain away as a non-issue when we are using "nurse" as the hypothetical prompt. What if the prompt is "criminal", "rapist", or some other negative? Would that change your thought process or would you be okay with the system always returning a person of the same race and gender that statistics indicate is the most likely? Do you see how that could be a problem?
replies(3): >>tines+0b >>true_r+jd >>rpmism+8f
◧◩◪◨
26. codeth+J9[view] [source] [discussion] 2022-05-23 22:29:25
>>tines+P6
While not essential, I wouldn't exactly call the gender "accidental":

> We investigated sex differences in 473,260 adolescents’ aspirations to work in things-oriented (e.g., mechanic), people-oriented (e.g., nurse), and STEM (e.g., mathematician) careers across 80 countries and economic regions using the 2018 Programme for International Student Assessment (PISA). We analyzed student career aspirations in combination with student achievement in mathematics, reading, and science, as well as parental occupations and family wealth. In each country and region, more boys than girls aspired to a things-oriented or STEM occupation and more girls than boys to a people-oriented occupation. These sex differences were larger in countries with a higher level of women's empowerment. We explain this counter-intuitive finding through the indirect effect of wealth. Women's empowerment is associated with relatively high levels of national wealth and this wealth allows more students to aspire to occupations they are intrinsically interested in.

Source: https://psyarxiv.com/zhvre/ (HN discussion: https://news.ycombinator.com/item?id=29040132)

replies(2): >>daenz+Hc >>astran+ue
◧◩◪◨
27. calvin+0a[view] [source] [discussion] 2022-05-23 22:30:48
>>karpie+D8
what if I asked the model to show me a sunday school photograph of baptists in the National Baptist Convention?
replies(1): >>rvnx+qd
28. SnowHi+8a[view] [source] 2022-05-23 22:31:47
>>tines+(OP)
It’s the same as with an artist: “hey artist, draw me a nurse.” “Hmm okay, do you want it to be a guy or a girl?” “Don’t ask me, just draw what I’m saying.” The artist can then say: “Okay, but accept my biases.” or “I can’t, since your input is ambiguous.”

For a one-shot generative algorithm you must accept the artist’s biases.

replies(1): >>rvnx+Ya
◧◩◪◨
29. ghayes+ga[view] [source] [discussion] 2022-05-23 22:32:31
>>ar_lan+79
Is this going to be hand-rolled? Do you change the prompt you pass to the network to reflect the desired outcomes?
◧◩◪
30. tines+sa[view] [source] [discussion] 2022-05-23 22:34:07
>>adrian+z8
> In fact, I wonder if this sort of technology could reduce the harm caused by people with an interest in disgusting images, because no one needs to be harmed for a realistic image to be created. I am creeping myself out with this line of thinking, but it seems like one potential beneficial - albeit disturbing - outcome.

Interesting idea, but is there any evidence that e.g. consuming disturbing images makes people less likely to act out on disturbing urges? Far from catharsis, I'd imagine consumption of such material to increase one's appetite and likelihood of fulfilling their desires in real life rather than to decrease it.

I suppose it might be hard to measure.

◧◩◪◨
31. SnowHi+ya[view] [source] [discussion] 2022-05-23 22:34:32
>>ar_lan+79
No, because a user will see a particular image, not the statistical ensemble. It will at times show an Eskimo without a hand, because such people do statistically exist. But the user definitely does not want that.
◧◩◪
32. gknoy+Ea[view] [source] [discussion] 2022-05-23 22:35:01
>>adrian+z8
> > ... people could still use it to generate extremely offensive images with little effort. I think people see that as profoundly dangerous.
>
> Do they see it as dangerous? Or just offensive?

I won't speak to whether something is "offensive", but I think that having underlying biases in image classification or generation has very worrying secondary effects, especially given that organizations like law enforcement want to do things like facial recognition. It's not a perfect analogue, but I could easily see some company pitch a sketch-artist-replacement service that generates images based on someone's description. The potential for inherent bias in that makes that kind of thing worrying, especially since the people in charge of buying it are unlikely to care about, or even notice, the caveats.

It does feel like a little bit of a stretch, but at the same time we've also seen such things happen with image classification systems.

33. skybri+Qa[view] [source] 2022-05-23 22:37:12
>>tines+(OP)
Yes, there is a denominator problem. When selecting a sample "at random," what do you want the denominator to be? It could be "people in the US", "people in the West" (whatever countries you mean by that) or "people worldwide."

Also, getting a random sample of any demographic would be really hard, so no machine learning project is going to do that. Instead you've got a random sample of some arbitrary dataset that's not directly relevant to any particular purpose.

This is, in essence, a design or artistic problem: the Google researchers have some idea of what they want the statistical properties of their image generator to look like. What it currently does doesn't match that. So, artistically, the result doesn't meet their standards, and they're going to fix it.

There is no objective, universal, scientifically correct answer about which fictional images to generate. That doesn't mean all art is equally good, or that you should just ship anything without looking at quality along various axes.

◧◩◪◨
34. pxmpxm+Ta[view] [source] [discussion] 2022-05-23 22:37:24
>>karpie+D8
> Randomly pick one.

How does the model back out the "certain people would like to pretend it's a fair coin toss that a randomly selected nurse is male or female" feature?

It won't be in any representative training set, so you're back to fishing for stock photos on getty rather than generating things.

replies(1): >>shadow+uf
◧◩
35. rvnx+Ya[view] [source] [discussion] 2022-05-23 22:38:08
>>SnowHi+8a
Revert to the average representation of a nurse (give no weight to unspecified criteria: gender, age, skin color, religion, country, hair style, whether it's a drawing or a photograph, the year it was made, etc.).

“hey artist, draw me a nurse.”

“Hmm okay, do you want it to be a guy or a girl?”

“Don’t ask me, just draw what I’m saying.”

- Ok, I'll draw you what an average nurse looks like.

- Wait, it's a woman! She wears a nurse blouse and she has a nurse cap.

- Is it bad?

- No.

- Ok, then what's the problem? You asked for something that looked like a nurse but didn't specify anything else.

replies(1): >>SnowHi+pg
◧◩◪◨
36. tines+0b[view] [source] [discussion] 2022-05-23 22:38:16
>>slg+o9
Not the person you responded to, but I do see how someone could be hurt by that, and I want to avoid hurting people. But is this the level at which we should do it? Could skewing search results, i.e. hiding the bias of the real world, give us the impression that everything is fine and we don't need to do anything to actually help people?

I have a feeling that we need to be real with ourselves and solve problems and not paper over them. I feel like people generally expect search engines to tell them what's really there instead of what people wish were there. And if the engines do that, people can get agitated!

I'd almost say that hurt feelings are prerequisite for real change, hard though that may be.

These are all really interesting questions brought up by this technology, thanks for your thoughts. Disclaimer, I'm a fucking idiot with no idea what I'm talking about.

replies(2): >>magica+ne >>slg+pf
◧◩◪◨
37. layer8+jb[view] [source] [discussion] 2022-05-23 22:39:43
>>mdp202+n8
Humans overwhelmingly learn meaning by use, not by definition.
replies(1): >>mdp202+Kb
◧◩◪◨⬒
38. mdp202+Kb[view] [source] [discussion] 2022-05-23 22:42:22
>>layer8+jb
> Humans overwhelmingly learn meaning by use, not by definition

Preliminarily and provisionally. Then, they start discussing their concepts - it is the very definition of Intelligence.

replies(1): >>layer8+be
◧◩◪
39. pxmpxm+qc[view] [source] [discussion] 2022-05-23 22:47:28
>>ma2rte+R8
This. I would imagine it heavily correlates with things like income and GDP per capita.
◧◩◪◨
40. Smoosh+sc[view] [source] [discussion] 2022-05-23 22:47:29
>>john_y+j9
Agreed. They are valid points clearly stated and a valuable contribution to the discussion.
41. ben_w+Ac[view] [source] 2022-05-23 22:48:17
>>tines+(OP)
This sounds like descriptivism vs prescriptivism. In English (native language) I’m a descriptivist, in all other languages I have to tell myself to be a prescriptivist while I’m actively learning and then switch back to descriptivism to notice when the lessons were wrong or misleading.
◧◩◪◨⬒
42. daenz+Hc[view] [source] [discussion] 2022-05-23 22:48:43
>>codeth+J9
The "Gender Equality Paradox"... there's a fascinating episode[0] about it. It's incredible how unscientific and ideologically-motivated one side comes off in it.

0. https://www.youtube.com/watch?v=_XsEsTvfT-M

◧◩◪◨
43. paisaw+Kc[view] [source] [discussion] 2022-05-23 22:48:58
>>tines+P6
How do you know this? Because you can, in your mind, divide the function of a nurse from the statistical reality of nursing?

Are the logical divisions you make in your mind really indicative of anything other than your arbitrary personal preferences?

replies(1): >>tines+Zd
◧◩◪
44. jonny_+6d[view] [source] [discussion] 2022-05-23 22:51:58
>>ma2rte+R8
Right, that's the likely cause of the bias.
◧◩◪◨
45. true_r+jd[view] [source] [discussion] 2022-05-23 22:53:57
>>slg+o9
Cultural biases aren’t uniform across nations. If a prompt returns Caucasians for nurses and other races for criminals, most people in my country would not read that as racism, simply because there are not, and never have been, enough Caucasians resident here for anyone to create significant race theories about them.

This is a far cry from say the USA where that would instantly trigger a response since until the 1960s there was a widespread race based segregation.

◧◩◪◨⬒
46. rvnx+qd[view] [source] [discussion] 2022-05-23 22:54:38
>>calvin+0a
The pictures I got from a similar model when asking for a "sunday school photograph of baptists in the National Baptist Convention": https://ibb.co/sHGZwh7
replies(1): >>calvin+Md
◧◩◪◨⬒⬓
47. calvin+Md[view] [source] [discussion] 2022-05-23 22:58:52
>>rvnx+qd
and how do we _feel_ about that outcome?
replies(1): >>andyba+ed9
◧◩◪◨⬒
48. tines+Zd[view] [source] [discussion] 2022-05-23 22:59:55
>>paisaw+Kc
No, because there's at least one male nurse.
replies(1): >>paisaw+ig
◧◩◪◨⬒⬓
49. layer8+be[view] [source] [discussion] 2022-05-23 23:01:22
>>mdp202+Kb
Most humans don’t do that for most things they have a notion of in their head. It would be much too time-consuming to start discussing the meaning of even just a significant fraction of them. For a rough reference point, the English language has over 150,000 words whose meanings you could each discuss and try to pin down in a definition. Not to speak of the difficulty of making that set of definitions noncircular.
replies(1): >>mdp202+8p1
◧◩◪◨⬒
50. magica+ne[view] [source] [discussion] 2022-05-23 23:03:41
>>tines+0b
> Could skewing search results, i.e. hiding the bias of the real world

Which real world? The population you sample from is going to make a big difference. Do you expect it to reflect your day to day life in your own city? Own country? The entire world? Results will vary significantly.

replies(2): >>sangno+Ke >>tines+ef
◧◩◪◨⬒
51. astran+ue[view] [source] [discussion] 2022-05-23 23:04:33
>>codeth+J9
If you ask it to generate “nurse” surely the problem isn’t that it’s going to just generate women, it’s that it’s going to give you women in those Halloween sexy nurse costumes.

If it did, would you believe that’s a real representative nurse because an image model gave it to you?

◧◩◪
52. sangno+ve[view] [source] [discussion] 2022-05-23 23:04:46
>>bufbup+m7
> At the end of the day, if you ask for a nurse, should the model output a male or female by default?

This depends on the application. As an example, it would be a problem if it's used as a CV-screening app that's implicitly down-ranking male-applicants to nurse positions, resulting in fewer interviews for them.

◧◩◪◨⬒⬓
53. sangno+Ke[view] [source] [discussion] 2022-05-23 23:06:41
>>magica+ne
For AI, "real world" is likely "the world, as seen by Silicon Valley."
◧◩◪◨
54. rpmism+8f[view] [source] [discussion] 2022-05-23 23:09:18
>>slg+o9
It's an unfortunate reflection of reality. There are three possible outcomes:

1. The model provides a reflection of reality, as politically inconvenient and hurtful as it may be.

2. The model provides an intentionally obfuscated version with either random traits or non correlative traits.

3. The model refuses to answer.

Which of these is ideal to you?

replies(1): >>slg+Ng
◧◩◪◨
55. Ludwig+9f[view] [source] [discussion] 2022-05-23 23:09:19
>>tines+P6
Not really what? How does that contradict what I've said?
◧◩◪◨⬒⬓
56. tines+ef[view] [source] [discussion] 2022-05-23 23:09:39
>>magica+ne
I'd say it doesn't actually matter, as long as the population sampled is made clear to the user.

If I ask for pictures of Japanese people, I'm not shocked when all the results are of Japanese people. If I asked for "criminals in the United States" and all the results are black people, that should concern me, not because the data set is biased but because the real world is biased and we should do something about that. The difference is that I know what set I'm asking for a sample from, and I can react accordingly.

replies(3): >>magica+7j >>nyolfe+uk >>jfoste+R81
◧◩◪◨
57. Ludwig+of[view] [source] [discussion] 2022-05-23 23:10:31
>>mdp202+n8
Definition is an entirely artificial construct and doesn't equate to meaning. Definition depends on other words that you also have to understand.
replies(1): >>mdp202+Zc1
◧◩◪◨⬒
58. slg+pf[view] [source] [discussion] 2022-05-23 23:10:32
>>tines+0b
>Could skewing search results, i.e. hiding the bias of the real world

Your logic seems to rest on this assumption which I don't think is justified. "Skewing search results" is not the same as "hiding the biases of the real world". Showing the most statistically likely result is not the same as showing the world how it truly is.

A generic nurse is statistically going to be female most of the time. However, a model that returns every nurse as female is not showing the real world as it is. It is exaggerating and reinforcing the bias of the real world. It inherently requires a more advanced model to actually represent the real world. I think it is reasonable for the creators to avoid sharing models known to not be smart enough to avoid exaggerating real world biases.

replies(1): >>roboca+eq
◧◩◪◨⬒
59. shadow+uf[view] [source] [discussion] 2022-05-23 23:11:22
>>pxmpxm+Ta
Yep, that's the hard problem. Google is not comfortable releasing the API for this until they have it solved.
replies(1): >>zarzav+bj
◧◩◪◨⬒⬓
60. paisaw+ig[view] [source] [discussion] 2022-05-23 23:16:48
>>tines+Zd
Please don't waste time with this kind of obtuse response. This fact says nothing about why nursing is a female-dominated career. You claim to know that this is just an accidental fact of history or society -- how do you know that?
replies(1): >>tines+ri
◧◩◪
61. SnowHi+pg[view] [source] [discussion] 2022-05-23 23:17:15
>>rvnx+Ya
The average nurse has three-halves of a tit.
replies(1): >>mdp202+xq1
◧◩◪◨⬒
62. slg+Ng[view] [source] [discussion] 2022-05-23 23:20:53
>>rpmism+8f
What makes you think those are the only options? Why can't we have an option that the model returns a range of different outputs based off a prompt?

A model that returns 100% of nurses as female might be statistically more accurate than a model that returns 50% of nurses as female, but it is still not an accurate reflection of the real world. I agree that the model shouldn't return a male nurse 50% of the time. Yet an accurate model needs to be able to occasionally return a male nurse without being directly prompted for a "male nurse". Anything else would also be inaccurate.

replies(1): >>rpmism+Yg
◧◩◪◨⬒⬓
63. rpmism+Yg[view] [source] [discussion] 2022-05-23 23:22:06
>>slg+Ng
So, the model should have a knowledge of political correctness, and return multiple results if the first choice might reinforce a stereotype?
replies(1): >>slg+Oh
64. pshc+Ch[view] [source] 2022-05-23 23:28:23
>>tines+(OP)
I think it is problematic, yes, to produce a tool trained on data from the past that reinforces old stereotypes. We can’t just handwave it away as being a reflection of its training data. We would like it to do better by humanity. Fortunately the AI people are well aware of the insidious nature of these biases.
◧◩◪◨⬒⬓⬔
65. slg+Oh[view] [source] [discussion] 2022-05-23 23:29:24
>>rpmism+Yg
I never said anything about political correctness. You implied that you want a model that "provides a reflection of reality". All nurses being female is not "a reflection of reality". It is a distortion of reality because the model doesn't actually understand gender or nurses.
replies(1): >>rpmism+tK
◧◩◪◨⬒⬓⬔
66. tines+ri[view] [source] [discussion] 2022-05-23 23:35:37
>>paisaw+ig
I meant "accidental" in the Aristotelian sense: https://plato.stanford.edu/entries/essential-accidental/
replies(1): >>paisaw+Vu
◧◩◪
67. pshc+Ei[view] [source] [discussion] 2022-05-23 23:37:19
>>bufbup+m7
Perhaps to avoid this issue, future versions of the model would throw an error like “bias leak: please specify a gender for the nurse at character 32”
◧◩◪◨⬒⬓⬔
68. magica+7j[view] [source] [discussion] 2022-05-23 23:40:56
>>tines+ef
> If I asked for "criminals in the United States" and all the results are black people, that should concern me, not because the data set is biased

Well the results would unquestionably be biased. All results being black people wouldn't reflect reality at all, and hurting feelings to enact change seems like a poor justification for incorrect results.

> I'd say it doesn't actually matter, as long as the population sampled is made clear to the user.

Ok, and let's say I ask for "criminals in Cheyenne, Wyoming" and it doesn't know the answer to that; should it just do its best to answer? Seems risky if people are going to get fired up about it and act on it to get "real change".

That seems like a good parallel to what we're talking about here, since it's very unlikely that crime statistics were fed into this image generating model.

◧◩◪◨⬒⬓
69. zarzav+bj[view] [source] [discussion] 2022-05-23 23:41:30
>>shadow+uf
But why is it a problem? The AI is just a mirror showing us ourselves. That’s a good thing. How does it help anyone to make an AI that presents a fake world so that we can pretend that we live in a world that we actually don’t? Disassociation from reality is more dangerous than bias.
replies(3): >>shadow+il >>astran+kv >>Daishi+uF
◧◩◪◨⬒⬓⬔
70. nyolfe+uk[view] [source] [discussion] 2022-05-23 23:53:02
>>tines+ef
> If I asked for "criminals in the United States" and all the results are black people,

curiously, this search actually only returns white people for me on GIS

◧◩◪◨⬒⬓⬔
71. shadow+il[view] [source] [discussion] 2022-05-23 23:59:02
>>zarzav+bj
> The AI is just a mirror showing us ourselves.

That's one hypothesis.

◧◩◪◨⬒⬓
72. roboca+eq[view] [source] [discussion] 2022-05-24 00:40:43
>>slg+pf
> I think it is reasonable for the creators to avoid sharing models known to not be smart enough to avoid exaggerating real world biases.

Every model will have some random biases. Some of those random biases will undesirably exaggerate the real world. Every model will undesirably exaggerate something. Therefore no model should be shared.

Your goal is nice, but impractical?

replies(2): >>slg+At >>barney+771
◧◩◪◨⬒⬓⬔
73. slg+At[view] [source] [discussion] 2022-05-24 01:11:18
>>roboca+eq
Fittingly, your comment falls into the same criticism I had of the model. It shows a refusal/inability to engage with the full complexities of the situation.

I said "It is reasonable... to avoid sharing models". That is an acknowledged that the creators are acting reasonably. It does not imply anything as extreme as "no model should be shared". The only way to get from A to B there is for you to assume that I think there is only one reasonable response and every other possible reaction is unreasonable. Doesn't that seem like a silly assumption?

replies(1): >>roboca+p41
◧◩
74. sineno+wu[view] [source] [discussion] 2022-05-24 01:19:28
>>karpie+I1
> It depends on whether you'd like the model to learn casual or correlative relationships.

I expect that in the practical limit of achievable scale, the regularization pressure inherent to training these models converges to https://en.wikipedia.org/wiki/Minimum_description_length and the merely correlative relationships get optimized away, leaving mostly the true causal relationships inherent to the data-generating process.

◧◩◪◨⬒⬓⬔⧯
75. paisaw+Vu[view] [source] [discussion] 2022-05-24 01:22:31
>>tines+ri
Yes I understand that. That is only a description of what mental arithmetic you can do if you define your terms arbitrarily conveniently.

"It is possible for a man to provide care" is not the same statement as "it is possible for a sexually dimorphic species in a competitive, capitalistic society (...add more qualifications here) to develop a male-dominated caretaking role"

You're just asserting that you could imagine male nurses without creating a logical contradiction, unlike e.g. circles that have corners. That doesn't mean nursing could be a male-dominated industry under current constraints.

◧◩◪◨⬒⬓⬔
76. astran+kv[view] [source] [discussion] 2022-05-24 01:25:45
>>zarzav+bj
In the days when Sussman was a novice Minsky once came to him as he sat hacking at the PDP-6. "What are you doing?", asked Minsky. "I am training a randomly wired neural net to play Tic-Tac-Toe." "Why is the net wired randomly?", asked Minsky. "I do not want it to have any preconceptions of how to play" Minsky shut his eyes, "Why do you close your eyes?", Sussman asked his teacher. "So that the room will be empty." At that moment, Sussman was enlightened.

The AI doesn’t know what’s common or not. You don’t know if it’s going to be correct unless you’ve tested it. Just assuming whatever it comes out with is right is going to work as well as asking a psychic for your future.

replies(1): >>zarzav+nK
◧◩◪◨⬒⬓⬔
77. Daishi+uF[view] [source] [discussion] 2022-05-24 03:13:19
>>zarzav+bj
The AI is a mirror of the text and image corpora it was presented, as parsed and sanitized by the team in question.
◧◩◪◨⬒⬓⬔⧯
78. zarzav+nK[view] [source] [discussion] 2022-05-24 04:14:16
>>astran+kv
The model makes inferences about the world from training data. When it sees more female nurses than male nurses in its training set, it infers that most nurses are female. This is a correct inference.

If they were to weight the training data so that there were an equal number of male and female nurses, then it may well produce male and female nurses with equal probability, but it would also learn an incorrect understanding of the world.

That is quite distinct from weighting the data so that it has a greater correspondence to reality. For example, if Africa is not represented well then weighting training data from Africa more strongly is justifiable.

The point is, it’s not a good thing for us to intentionally teach AIs a world that is idealized and false.

As these AIs work their way into our lives it is essential that they reproduce the world in all of its grit and imperfections, lest we start to disassociate from reality.

Chinese media (or insert your favorite unfree regime) also presents China as a utopia.

replies(2): >>astran+7L >>shadow+Bu1
◧◩◪◨⬒⬓⬔⧯
79. rpmism+tK[view] [source] [discussion] 2022-05-24 04:14:57
>>slg+Oh
A majority of nurses are women, therefore a woman would be a reasonable representation of a nurse. Obviously that's not a helpful stereotype, because male nurses exist and face challenges due to not fitting the stereotypes. The model is dumb, and outputs what it's seen. Is that wrong?
replies(1): >>webmav+538
◧◩◪◨⬒⬓⬔⧯▣
80. astran+7L[view] [source] [discussion] 2022-05-24 04:22:33
>>zarzav+nK
> The model makes inferences about the world from training data. When it sees more female nurses than male nurses in its training set, if infers that most nurses are female. This is a correct inference.

No, it is not, because you don’t know if it’s been shown each of its samples the same number of times, or if it overweighted some samples more than others. There are normal reasons both of these would happen.

◧◩
81. nearbu+eP[view] [source] [discussion] 2022-05-24 05:06:45
>>jonny_+u3
I don't think we'd want the model to reflect the global statistics. We'd usually want it to reflect our own culture by default, unless it had contextual clues to do something else.

For example, the most eaten foods globally are maize, rice, wheat, cassava, etc. If it always depicted foods matching the global statistics, it wouldn't be giving most users what they expected from their prompt. American users would usually expect American foods, Japanese users would expect Japanese foods, etc.

> Does a bias towards lighter skin represent reality? I was under the impression that Caucasians are a minority globally.

Caucasians specifically are a global minority, but lighter-skinned people are not, depending of course on where you draw the line for "lighter skin". Most of the world's population is in Asia, so I guess a model that was globally statistically accurate would show mostly people from there.

◧◩◪◨⬒⬓⬔⧯
82. roboca+p41[view] [source] [discussion] 2022-05-24 07:40:59
>>slg+At

  “When I use a word,’ Humpty Dumpty said in rather a scornful tone, ‘it means just what I choose it to mean — neither more nor less.’

  ’The question is,’ said Alice, ‘whether you can make words mean so many different things.’

  ’The question is,’ said Humpty Dumpty, ‘which is to be master — that’s all.”
◧◩◪◨⬒⬓⬔
83. barney+771[view] [source] [discussion] 2022-05-24 08:09:53
>>roboca+eq
> Your goal is nice, but impractical?

If the only way to do AI is to encode racism etc, then we shouldn't be doing AI at all.

◧◩◪◨⬒⬓⬔
84. jfoste+R81[view] [source] [discussion] 2022-05-24 08:23:42
>>tines+ef
In a way, if the model brings back an image for "criminals in the United States" that isn't based on the statistical reality, isn't it essentially complicit in sweeping a major social issue under the rug?

We may not like what it shows us, but blindfolding ourselves is not the solution to that problem.

replies(1): >>webmav+B18
◧◩◪◨⬒
85. mdp202+Zc1[view] [source] [discussion] 2022-05-24 09:05:40
>>Ludwig+of
You are thinking of the literal definition - that "made of literal letters".

Mental definition is that "«artificial»" (out of the internal processing) construct made of relations that reconstructs a meaning. Such ontology is logical - "this is that". (It would not be made of memories, which are processed, deconstructed.)

Concepts are internally refined: their "implicit" definition (a posterior reading of the corresponding mental low-level) is refined.

◧◩
86. drdeca+Vo1[view] [source] [discussion] 2022-05-24 10:58:41
>>karpie+I1
The meaning of the word "nurse" is determined by how the word "nurse" is used and understood.

Perhaps what "nurse" means isn't what "nurse" should mean, but what people mean when they say "nurse" is what "nurse" means.

◧◩◪◨⬒⬓⬔
87. mdp202+8p1[view] [source] [discussion] 2022-05-24 10:59:55
>>layer8+be
(Mental entities are very many more than the hundred thousand, out of composition, cartesianity etc. So-called "protocols" (after logical positivism) are part of them, relating more entities with space and time. Also, by speaking of "circular definitions" you are, like others, confusing mental definitions with formal definitions.)

So? Draw your consequences.

Following what was said, you are stating that "a staggeringly large number of people are unintelligent". Well, ok, that was noted. Scholium: if unintelligent, they should refrain from expressing judgement (you are really stating their non-judgement), so why all the actual expression? If unintelligent actors, they are liabilities, so why this overwhelming employment in the job market?

Thing is, as unintelligent as you depict them quantitatively, the internal processing that constitutes intelligence proceeds in many even when scarce, even when choked by some counterproductive bad formation - processing is the natural functioning. And then, the right Paretian side will "do the job" that the vast remainder will not do, and process notions actively (more, "encouragingly" - the process is importantly unconscious, many low-level layers are) and proficiently.

And the very Paretian prospect will reveal, there will be a number of shallow takes, largely shared, on some idea, and other intensively more refined takes, more rare, on the same idea. That shows you a distinction between "use" and the asymptotic approximation to meanings as achieved by intellectual application.

◧◩◪◨
88. mdp202+xq1[view] [source] [discussion] 2022-05-24 11:11:46
>>SnowHi+pg
Is it not incredible that after so many decades talking about local minima there is now some supposition that all of them must merge?
◧◩◪◨⬒⬓⬔⧯▣
89. shadow+Bu1[view] [source] [discussion] 2022-05-24 11:45:08
>>zarzav+nK
> As these AIs work their way into our lives it is essential that they reproduce the world in all of its grit and imperfections...

Is it? I'm reminded of the Microsoft Tay experiment, where they attempted to train an AI by letting Twitter users interact with it.

The result was a non-viable mess that nobody liked.

◧◩◪◨
90. nmfish+tL5[view] [source] [discussion] 2022-05-25 16:56:52
>>karpie+D8
What about preschool teacher?

I say this because I’ve been visiting a number of childcare centres over the past few days and I still have yet to see a single male teacher.

91. webmav+7K7[view] [source] 2022-05-26 05:02:47
>>tines+(OP)
> We certainly don't want to perpetuate harmful stereotypes. But is it a flaw that the model encodes the world as it really is, statistically, rather than as we would like it to be? By this I mean that there are more light-skinned people in the west than dark, and there are more women nurses than men, which is reflected in the model's training data. If the model only generates images of female nurses, is that a problem to fix, or a correct assessment of the data?

If the model only generates images of female nurses, then it is not representative of the real world, because male nurses exist and they deserve not to be erased. The training data is the proximate cause here, but one wonders what process ended up distorting "most nurses are female" into "nearly all nurse photos are of female nurses": something amplified a real-world imbalance into a dataset that exhibited more bias than the real world, and then training the AI baked that bias into an algorithm (one that may end up further reinforcing the bias in the real world, depending on the use-cases).

◧◩◪
92. webmav+u08[view] [source] [discussion] 2022-05-26 07:53:26
>>bufbup+m7
> If the input text lacks context/nuance, then the model must have some bias to infer the user's intent. This holds true for any image it generates; not just the politically sensitive ones. For example, if I ask for a picture of a person, and don't get one with pink hair, is that a shortcoming of the model?

You're ignoring that these models are stochastic. If I ask for a nurse and always get an image of a woman in scrubs, then yes, the model exhibits bias. If I get a male nurse half the time, we can say the model is unbiased WRT gender, at least. The same logic applies to CEOs always being old white men, criminals always being Black men, and so on. Stochastic models can output results that when aggregated exhibit a distribution from which we can infer bias or the lack thereof.
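
A minimal sketch of that aggregation argument (generate_image and classify_attribute are hypothetical stand-ins, not any real API): bias is read off the distribution of many samples, never a single image.

  from collections import Counter

  def estimate_attribute_distribution(prompt, n_samples, generate_image, classify_attribute):
      # Generate many images for one prompt and tally a classified attribute.
      counts = Counter(classify_attribute(generate_image(prompt)) for _ in range(n_samples))
      return {attr: c / n_samples for attr, c in counts.items()}

  # e.g. estimate_attribute_distribution("a nurse", 1000, generate_image, classify_gender)
  # {"female": 1.0} would mean the model collapsed onto the mode; a split closer
  # to the real-world ratio would suggest it isn't exaggerating the imbalance.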

◧◩◪◨⬒⬓⬔⧯
93. webmav+B18[view] [source] [discussion] 2022-05-26 08:04:34
>>jfoste+R81
At the very least we should expect that the results not be more biased than reality. Not all criminals are Black. Not all are men. Not all are poor. If the model (which is stochastic) only outputs poor Black men, rather than a distribution that is closer to reality, it is exhibiting bias and it is fair to ask why the data it picked that bias up from is not reflective of reality.
replies(1): >>jfoste+P28
◧◩◪◨⬒⬓⬔⧯▣
94. jfoste+P28[view] [source] [discussion] 2022-05-26 08:18:15
>>webmav+B18
Yeah, it makes sense for the results to simply reflect reality as closely as possible. No bias in any direction is desirable.
replies(1): >>webmav+Q2b
◧◩◪◨⬒⬓⬔⧯▣
95. webmav+538[view] [source] [discussion] 2022-05-26 08:21:45
>>rpmism+tK
It isn't wrong, but we aren't talking about the model somehow magically transcending the data it's seen. We're talking about making sure the data it sees is representative, so the results it outputs are as well.

Given that male nurses exist (and though less common, certainly aren't rare), why has the model apparently seen so few?

There actually is a fairly simple explanation: because the images it has seen labelled "nurse" are more likely from stock photography sites rather than photos of actual nurses, and stock photography is often stereotypical rather than typical.

◧◩◪
96. webmav+H58[view] [source] [discussion] 2022-05-26 08:52:52
>>adrian+z8
> I can understand why people wouldn’t want a tool they have created to be used to generate disturbing, offensive or disgusting imagery. But I don’t really see how doing that would be dangerous.

Propaganda can be extremely dangerous. Limiting or discouraging the use of powerful new tools for unsavory purposes such as creating deliberately biased depictions for propaganda purposes is only prudent. Ultimately it will probably require filtering of the prompts being used in much the same way that Google filters search queries.

◧◩◪◨⬒⬓⬔
97. andyba+ed9[view] [source] [discussion] 2022-05-26 16:38:46
>>calvin+Md
It's gone now. What was it?
◧◩◪◨⬒⬓⬔⧯▣▦
98. webmav+Q2b[view] [source] [discussion] 2022-05-27 09:05:19
>>jfoste+P28
Sarcasm, eh? At least there's no way THAT could be taken the wrong way.