Or (and this does happen) it "summarizes" the same text differently, depending on whether the author's name suggests a particular ethnicity.
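For what it's worth, this is a claim you can test yourself. Here is a minimal sketch, assuming the `openai` Python client; the model name, prompt wording, and the two example bylines are all placeholders, not part of anyone's actual methodology. It summarizes the same article under two different author names and diffs the outputs.

```python
# Sketch: summarize identical text under two different bylines and compare.
# Model name, prompt, and the byline pair are illustrative assumptions.
import difflib
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ARTICLE = "Paste the article text under test here."

def summarize(author_name: str) -> str:
    """Ask the model for a summary, varying only the stated author name."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model will do
        messages=[{
            "role": "user",
            "content": f"Summarize the following article by {author_name}:\n\n{ARTICLE}",
        }],
        temperature=0,  # reduce sampling noise so differences are easier to attribute to the name
    )
    return resp.choices[0].message.content

# Bylines chosen only to illustrate the experiment; swap in any pair you care about.
summary_a = summarize("Emily Walsh")
summary_b = summarize("Lakisha Washington")

# Print a unified diff of the two summaries. One run proves nothing;
# a systematic difference across many articles would support the claim above.
print("\n".join(difflib.unified_diff(
    summary_a.splitlines(), summary_b.splitlines(),
    fromfile="byline_a", tofile="byline_b", lineterm="",
)))
```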
Maybe in five years they will be useful enough to justify including these features.
With how inexpensive training is getting, it will not be long before we can train our own specialized models to fit our specific needs.
I am pessimistic on that front, since:
1. If LLMs can't detect biases in their own output, why would we expect them to reliably detect them in documents in general?
2. As a general rule, deploying bias/tricks/fallacies/BS is much easier than detecting them and explaining why they're wrong.