I wrote a tiny tool that calculates a "brightness" score for a GitHub repo: the total star count of the people who starred it. It will automatically detect these kinds of scams, assuming the stars come mostly from low-star bot accounts.
https://github.com/Hellisotherpeople/Bright
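Roughly, the idea looks like the sketch below. This is only an illustration and not the code from the repo: the function name `brightness`, the input shape (a mapping from stargazer username to the stars that account's own repos have received), and the sample numbers are all assumptions, and fetching the data from GitHub is left out entirely.

```python
from typing import Mapping


def brightness(stargazer_stars: Mapping[str, int]) -> int:
    """Sum the stars earned by every account that starred the repo.

    `stargazer_stars` is a hypothetical input: username -> total stars
    that user's own repositories have received.
    """
    return sum(stargazer_stars.values())


# Made-up example data, for illustration only.
organic = {"alice": 1200, "bob": 45, "carol": 310}
botted = {"bot1": 0, "bot2": 0, "bot3": 1}

print(brightness(organic))  # 1555
print(brightness(botted))   # 1
```

Both repos have three stars, but the one starred by throwaway accounts scores far lower, which is the signal the score relies on.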
Edit: I love clustering, I really do, but I think that techniques like the one I am using are far superior to unsupervised learning for trying to detect fake accounts in this context.