Some of the reasoning:
>Preliminary assessment also suggests Imagen encodes several social biases and stereotypes, including an overall bias towards generating images of people with lighter skin tones and a tendency for images portraying different professions to align with Western gender stereotypes. Finally, even when we focus generations away from people, our preliminary analysis indicates Imagen encodes a range of social and cultural biases when generating images of activities, events, and objects. We aim to make progress on several of these open challenges and limitations in future work.
Really sad that breakthrough technologies are going to be withheld due to our inability to cope with the results.
We certainly don't want to perpetuate harmful stereotypes. But is it a flaw that the model encodes the world as it statistically is, rather than as we would like it to be? By this I mean that there are more light-skinned people than dark-skinned people in the West, and more female nurses than male nurses, and that is reflected in the model's training data. If the model only generates images of female nurses, is that a problem to fix, or a correct assessment of the data?
If some particular demographic shows up in 51% of the data but 100% of the model's output shows that one demographic, that does seem like a statistics problem the model could correct by sometimes picking less likely "next token" predictions instead of always the most likely one.
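As a rough sketch of what I mean (the two-way attribute and the numbers are made up for illustration): if the sampler draws from the predicted distribution instead of always taking the argmax, a 51/49 split in the data stays roughly 51/49 in the output.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical predicted split for some attribute: 51% "A", 49% "B".
    attributes = np.array(["A", "B"])
    probs = np.array([0.51, 0.49])

    # Greedy decoding: always take the most likely attribute -> 100% "A".
    greedy = attributes[np.argmax(probs)]

    # Proportional sampling: draw from the distribution -> roughly 51% "A".
    draws = rng.choice(attributes, size=10_000, p=probs)
    share_a = float(np.mean(draws == "A"))

    print(f"greedy always yields: {greedy}")
    print(f"sampled share of 'A': {share_a:.1%}")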
Also, is it wrong to have localized models? For example, should a model for use in Japan conform to the demographics of Japan, or to that of the world?
If you want the model to understand what a "nurse" actually is, then it shouldn't be associated with female.
If you want the model to understand how the word "nurse" is usually used, without regard for what a "nurse" actually is, then associating it with female is fine.
The issue with a correlative model is that it can easily be self-reinforcing.
I'd say that bias is only an issue if the model is unable to respond to additional nuance in the input text. For example, if I ask for a "male nurse" it should be able to generate the less likely combination. Same with other races, hair colors, etc. Trying to generate a model that's "free of correlative relationships" is impossible because the model would never have the infinitely pedantic input text to describe the exact output image.
Randomly pick one.
> Trying to generate a model that's "free of correlative relationships" is impossible because the model would never have the infinitely pedantic input text to describe the exact output image.
Sure, and you can never make a medical procedure 100% safe. That doesn't mean you don't try to make them safer. You can still trim the obvious low-hanging fruit.
How does the model back out the "certain people would like to pretend it's a fair coin toss that a randomly selected nurse is male or female" feature?
It won't be in any representative training set, so you're back to fishing for stock photos on Getty rather than generating things.
—
The AI doesn’t know what’s common or not. You don’t know if it’s going to be correct unless you’ve tested it. Just assuming whatever it comes out with is right is going to work as well as asking a psychic for your future.
If they were to weight the training data so that there were an equal number of male and female nurses, then the model may well produce male and female nurses with equal probability, but it would also learn an incorrect understanding of the world.
That is quite distinct from weighting the data so that it has a greater correspondence to reality. For example, if Africa is not represented well then weighting training data from Africa more strongly is justifiable.
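A minimal sketch of that kind of reweighting (the region names and all numbers here are invented): weight each group by its real-world share divided by its share of the training set, which upweights underrepresented regions without forcing every attribute to an artificial 50/50.

    # Hypothetical counts of training images per region (invented numbers).
    train_counts = {"north_america": 500_000, "europe": 400_000, "africa": 100_000}

    # Hypothetical real-world shares we want the data to reflect (also invented).
    target_share = {"north_america": 0.35, "europe": 0.30, "africa": 0.35}

    total = sum(train_counts.values())

    # Sample weight = desired share / observed share; >1 upweights, <1 downweights.
    weights = {
        region: target_share[region] / (count / total)
        for region, count in train_counts.items()
    }

    for region, weight in weights.items():
        print(f"{region}: observed {train_counts[region] / total:.0%}, sample weight {weight:.2f}")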
The point is, it’s not a good thing for us to intentionally teach AIs a world that is idealized and false.
As these AIs work their way into our lives it is essential that they reproduce the world in all of its grit and imperfections, lest we start to disassociate from reality.
Chinese media (or insert your favorite unfree regime) also presents China as a utopia.
No, it is not, because you don't know whether the model was shown each of its samples the same number of times, or whether some samples were weighted more heavily than others. There are normal reasons both of these would happen.
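A toy example of both effects (all numbers invented): if some images are duplicated in the crawl or given larger weights, the frequency the model sees is not the frequency of the underlying collection, let alone of the world.

    # Hypothetical: 60 unique images of female nurses, 40 of male nurses,
    # but the female-nurse images happen to appear 3x each in the crawl.
    unique_counts = {"female_nurse": 60, "male_nurse": 40}
    duplication = {"female_nurse": 3, "male_nurse": 1}

    seen = {k: n * duplication[k] for k, n in unique_counts.items()}
    total_unique = sum(unique_counts.values())
    total_seen = sum(seen.values())

    for k in unique_counts:
        print(
            f"{k}: {unique_counts[k] / total_unique:.0%} of unique images, "
            f"{seen[k] / total_seen:.0%} of what the model is shown"
        )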