I'm only half joking. Fundamentally, the thread is about a filtering system.
I have done some research on this (unpublished), and I got really good performance predicting Hacker News votes just by counting how many new words (not stopwords, not very-high-frequency words) a comment added to the thread. A few variations on this theme predicted better than word counts or bigram features.
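Roughly, the feature looks like the sketch below (a minimal Python version; the tokenizer, stopword list, and examples are placeholders, not the actual setup I used):

    import re

    # Placeholder stopword list; the real feature would also drop
    # very-high-frequency words learned from the corpus.
    STOPWORDS = {"i", "the", "a", "an", "and", "or", "of",
                 "to", "is", "it", "that", "this", "in"}

    def tokenize(text):
        """Lowercase and split on non-word characters."""
        return [t for t in re.split(r"\W+", text.lower()) if t]

    def novel_word_count(thread_text, comment_text):
        """Count how many distinct non-stopword words the comment adds to the thread."""
        thread_words = set(tokenize(thread_text))
        comment_words = set(tokenize(comment_text)) - STOPWORDS
        return len(comment_words - thread_words)

    # A comment that mostly repeats the thread scores low;
    # one that introduces new terms scores higher.
    thread = "Karma systems reward agreement more than insight."
    print(novel_word_count(thread, "I agree, karma rewards agreement."))                     # 2
    print(novel_word_count(thread, "Compare the metamoderation model that Slashdot uses."))  # 5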
Fundamentally, though, I disagree with machine-learning-based approaches, as they can only _reinforce_ present behavior, and we'd like to shape voting behavior.
(also what if the ML only provided feedback while one is typing the comment?)
Using ML to provide feedback is a bad idea. Most ML techniques latch on to surface features of the text rather than the deeper structure, so it'd just make it really easy for people to reword their mean comments ("this is just stupid" becomes "What an incoherent piece of gobbledygook" or something like that, which might make things funnier, but I doubt it would help).
1) who votes on good comments, 2) who votes on whose comments, 3) who votes a lot / a little.
But mostly (1).
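For (1), here's a crude sketch of what I have in mind (in Python; the names, the "good comment" proxy, and the default weight are made up for illustration, not how HN actually works): weight each vote by how often that voter's past upvotes landed on comments that turned out to be good, then score a comment by the sum of weights rather than the raw count.

    from collections import defaultdict

    def voter_weights(vote_history, good_comment_ids):
        """vote_history: list of (voter_id, comment_id) upvotes.
        good_comment_ids: set of comments judged good after the fact (a proxy).
        Returns each voter's fraction of upvotes that went to good comments."""
        totals = defaultdict(int)
        hits = defaultdict(int)
        for voter, comment in vote_history:
            totals[voter] += 1
            if comment in good_comment_ids:
                hits[voter] += 1
        return {v: hits[v] / totals[v] for v in totals}

    def weighted_score(voters_on_comment, weights, default=0.5):
        """Sum voter weights instead of counting raw votes."""
        return sum(weights.get(v, default) for v in voters_on_comment)

    # Voter 'a' usually upvotes good comments, 'b' mostly doesn't,
    # so a's vote counts for more on a new comment.
    history = [("a", 1), ("a", 2), ("b", 3), ("b", 4), ("b", 1)]
    weights = voter_weights(history, good_comment_ids={1, 2})
    print(weighted_score(["a", "b"], weights))  # 1.0 + 1/3 = ~1.33

You could discount (2) and (3) in the same way, e.g. by down-weighting votes that mostly target one author or that come from someone who votes on everything.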
I take your point about ML being superficial. But if it's being used at all, shouldn't the users be informed about what the robo-brain thinks of them?
Your excellent example of a rewording might fool a lot of humans too (see pg's article another commenter linked to ... Ctrl+F "DH4").