Fair concerns, but I have trained a Naive Bayes classifier on Twitter data in the past, using a social study of categorised tweets [1] as training data, and got around 85% accuracy. It correctly classified rape threats as abusive, while classifying conversations about rapeseed oil as non-abusive. Considering the small data set and how little entropy there is between samples, I consider that pretty useful.
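A minimal sketch of the kind of classifier described above, not the original code. It is a multinomial Naive Bayes with add-one smoothing, written against the standard library only. The toy samples, the label names and the helpers (tokenize, train, predict_proba) are all made up for illustration; the real training data was the categorised tweets from [1].

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data, NOT the study data from [1].
TOY = [
    ("i will hurt you", "abusive"),
    ("you deserve to be hurt", "abusive"),
    ("i will find you and hurt you", "abusive"),
    ("rapeseed oil is good for frying", "non_abusive"),
    ("i will cook with rapeseed oil tonight", "non_abusive"),
    ("fields of rapeseed look lovely", "non_abusive"),
]

def tokenize(text):
    # Naive whitespace tokenizer; real tweets would need more cleanup.
    return text.lower().split()

def train(samples):
    """Count documents per label and word occurrences per label."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab

def predict_proba(model, text):
    """Return {label: probability} for the given text."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in doc_counts:
        # Log prior for the label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed log likelihood.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise the log scores into probabilities.
    top = max(log_scores.values())
    exps = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exps.values())
    return {label: value / norm for label, value in exps.items()}

model = train(TOY)
print(predict_proba(model, "i will hurt you"))
print(predict_proba(model, "frying with rapeseed oil"))
```

With six samples this proves nothing about accuracy; it only shows why the surrounding words let the classifier tell the two contexts apart.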
I get that no machine learning system is 100% accurate, which is why it should be used as an indicator rather than the deciding factor.
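To make "indicator rather than deciding factor" concrete, here is a rough sketch in which the classifier's score can only queue a message for a human to look at and never blocks anything by itself. The function name triage, the 0.8 threshold and the two outcome strings are assumptions chosen for illustration, not values from any real system.

```python
def triage(p_abusive, review_threshold=0.8):
    """Turn a classifier probability into a suggestion, not a verdict.

    p_abusive: probability in [0, 1] that the message is abusive.
    review_threshold: hypothetical cut-off; would need tuning on real data.
    """
    if not 0.0 <= p_abusive <= 1.0:
        raise ValueError("p_abusive must be between 0 and 1")
    if p_abusive >= review_threshold:
        # A person makes the final call; the score is only a hint.
        return "flag_for_human_review"
    return "no_action"

print(triage(0.95))
print(triage(0.20))
```

Lowering the threshold sends more messages to reviewers and misses less abuse; raising it does the opposite.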
I have had issues with Gmail blocking emails but, as you point out, it was always because of IP reputation, not an overzealous Naive Bayes filter.
[1] https://demos.co.uk/press-release/staggering-scale-of-social...