zlacker

[return to "Moderation is different from censorship"]
1. comex+7y[view] [source] 2022-11-03 07:50:37
>>feross+(OP)
> And it would make the avoid-harassment side happier, since they could set their filters to stronger than the default setting, and see even less harassment than they do now.

I highly doubt it.

I’m pretty sure typical harassment comes in the form of many similar messages by many different users joining a bandwagon. Moderation wouldn’t really be fast enough to stop that; indeed, Twitter’s current moderation scheme isn’t fast enough to stop it. But the current scheme is capable of penalizing people after the fact, particularly the organizer(s) of the bandwagon, and that creates some level of deterrence. An opt-out moderation scheme would be less effective as a deterrent, since the type of political influencers that tend to be involved in these things could likely easily convince their followers to opt out.

That may be a cost worth paying for the sake of free speech. But don’t expect it to make the anti-harassment side happy.

That said, it’s not like that side can only tolerate (what this post terms as) censorship. On the contrary, they seem to like Mastodon and its federated model. I do suspect that approach would not work as well at higher scale - not in a technical sense, but in terms of the ability to set and enforce norms across servers. But that’s total speculation, and I haven’t even used Mastodon myself…

◧◩
2. danger+A31[view] [source] 2022-11-03 12:43:21
>>comex+7y
> Moderation wouldn’t really be fast enough to stop that

Social media companies keep using this excuse for not trying. We can moderate spam in email with a simple naive Bayes classifier, so why don't we just do that with comments? It could classify comments that are part of a bandwagon and flag them automatically, either hiding them or queuing them for human review.

We are able to moderate email, but the concepts we use to do so are never applied to comments. I don't know why; this seems like a solved problem.
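To make it concrete, a minimal sketch of that kind of filter with scikit-learn might look like this (the example comments and labels are just placeholders standing in for a real labelled corpus):

    # Sketch only: same recipe as a classic email spam filter, pointed at comments.
    # The training texts/labels below are toy placeholders, not real data.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    comments = [
        "great write-up, thanks for sharing",
        "interesting point about federation",
        "you should be ashamed, delete your account",
        "everyone go tell this idiot what you think of them",
    ]
    labels = ["ok", "ok", "abusive", "abusive"]

    # Bag-of-words features feeding a multinomial naive Bayes model.
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(comments, labels)

    # Comments predicted "abusive" get flagged, i.e. hidden pending human review.
    incoming = ["go pile on this account, they deserve it", "nice post"]
    for text, label in zip(incoming, model.predict(incoming)):
        print("FLAGGED:" if label == "abusive" else "ok:", text)

The pipeline is basically what an email spam filter does with message bodies; only the training data changes.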

◧◩◪
3. pixl97+6a1[view] [source] 2022-11-03 13:21:22
>>danger+A31
If you're trying to use the spam model as some kind of example of success, I believe you may have already failed.

On SMTP servers I've managed for clients, we typically block anywhere from 80% to 99.999% of messages (yes, 10,000 blocked for every one that gets through). I'd call that MegaModeration if there were such a term.

And if you think email spam is solved, then I don't believe you read HN often, as there is a common complaint of "Gmail is blocking anything I send, and I'm a low-volume non-commercial sender."

In addition, email filtering is extremely slow to react to new methods, generally taking hours depending on the reporting system.

Lastly, you've not thought about the problem much. How are you going to rapidly tell the difference between a fun meme that spreads virally and an attack against an individual? Far more often, you're going to be blocking something that isn't a bad thing.

◧◩◪◨
4. danger+sS1[view] [source] 2022-11-03 16:15:02
>>pixl97+6a1
Fair concerns, but I have trained a naive Bayes classifier on Twitter data in the past, using a social study of categorised tweets [1] as the training set, and got around 85% accuracy. It was able to properly classify rape threats as abusive but conversations about rapeseed oil as non-abusive. Considering the small data set and how little entropy there is between samples, I consider that pretty useful.

I get that no machine learning is 100% perfect, which is why it should be used as an indicator rather than the deciding factor.
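A rough sketch of that kind of experiment, assuming the labelled tweets from [1] are exported as a CSV with "text" and "label" columns (the filename, column names, and the "abusive" label are my assumptions about the layout, not details from the study):

    # Sketch: train on a labelled tweet CSV, report held-out accuracy, and expose
    # a probability score as an indicator rather than an automatic verdict.
    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    df = pd.read_csv("categorised_tweets.csv")  # hypothetical export of the labelled tweets
    X_train, X_test, y_train, y_test = train_test_split(
        df["text"], df["label"], test_size=0.2, random_state=0
    )

    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(X_train, y_train)
    print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

    def abuse_score(tweet: str) -> float:
        # Probability of the "abusive" class (assumes that label exists in the data).
        proba = model.predict_proba([tweet])[0]
        return proba[list(model.classes_).index("abusive")]

    print(abuse_score("conversation about rapeseed oil prices"))

The score is the "indicator" part: it can rank or queue comments for a human reviewer without the model ever getting the final say.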

I have had issues with Gmail blocking emails, but as you point out, it was always because of IP reputation, not overzealous naive Bayes.

[1] https://demos.co.uk/press-release/staggering-scale-of-social...

◧◩◪◨⬒
5. pixl97+V32[view] [source] 2022-11-03 17:01:06
>>danger+sS1
Training classifiers can also go off the rails under adversarial attack. This commonly showed up in our systems when people sent short emails that were more ambiguous. For example, this tends to cause problems when malevolent users adopt dogwhistles, co-opting the language of the group they are attacking. The attacked group commonly becomes the one getting banned/blocked in these cases.