If your concern is that the labels themselves could be used to convey a (possibly offensive) message, I think you could just give people a way to hide specific labels and never see them again. Or maybe a way to flag labels as subjective, or just delete the ones that are obvious flamebait.
People love to misuse tools meant for good. On Reddit, I've been on the receiving end of the "Reddit Cares" self-harm notification because of some barely spicy comments.