
1. thih9+(OP) 2023-03-18 09:30:49
> In spam detection, we often use heuristics in conjunction with machine learning to identify spammers.

Heuristics can only identify suspected spammers. Not everyone who behaves like a spammer is one; it could be, e.g., a random user with privacy settings on, or someone who hasn't updated their bio in a while and whose links have since rotted.

Even if a group of low-activity accounts stars the same projects, it could be that the account owners just discuss these projects elsewhere.

replies(1): >>GlumWo+o1
2. GlumWo+o1 2023-03-18 09:48:47
>>thih9+(OP)
The article notes this. Like any spam detection method, it has some false positives, but the rate seems very low (less than a percent according to the article). I'm sure an official implementation could take more internal, non-public factors into account, like IP addresses and clustering of account creation times, to make it even more accurate and drastically reduce the number of spam users.
replies(1): >>andrea+i3
3. andrea+i3 2023-03-18 10:12:00
>>GlumWo+o1
The claim I saw in the article is 98% precision, which doesn't actually tell us the predictive value without the base rate, and the base rate seems to be all over the place.
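
To make the base-rate point concrete, here is a rough sketch using Bayes' rule. The sensitivity and specificity values are made up for illustration and are not taken from the article; the only point is that the same detector gives very different predictive values depending on how common spammers are in the population it is run on.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """P(spammer | flagged) via Bayes' rule."""
    true_pos = sensitivity * base_rate                   # spammers that get flagged
    false_pos = (1.0 - specificity) * (1.0 - base_rate)  # legit users that get flagged
    return true_pos / (true_pos + false_pos)


# Hypothetical detector characteristics (assumptions, not from the article).
SENSITIVITY = 0.90  # fraction of real spammers the detector flags
SPECIFICITY = 0.99  # fraction of legitimate users it leaves alone

# Same detector, three assumed base rates of spammers among the accounts checked.
for base_rate in (0.50, 0.10, 0.01):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, base_rate)
    print(f"base rate {base_rate:.0%}: PPV = {ppv:.1%}")
```

With these assumed numbers the predictive value drops from roughly 99% at a 50% base rate to under 50% at a 1% base rate, so a precision figure measured on one sample does not carry over to a population with a different share of spammers.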