zlacker

1. w-m+(OP)[view] [source] 2019-08-08 12:29:23
As the article says, "The most admired arguments are made with data, but the origins, veracity, and malleability of those data tend to be ancillary concerns." So let's see what we can do here.

The latest HN comment dump I could find with a quick search was from May 2018 [1].

There were 237,646 comments in total that month, made by 36,358 unique users. 75% of users posted 5 or fewer comments, and the median was 2. The most prolific user wrote 798 comments; dang came in 6th with 425.
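
For anyone who wants to reproduce figures like these, here is a minimal sketch of the kind of tally involved. It assumes the dump has already been reduced to one author name per comment; the function name `comment_stats` and the nearest-rank percentile are my own choices for illustration, not necessarily how the numbers above were computed.

```python
import math
from collections import Counter
from statistics import median


def comment_stats(authors, top_n=10):
    """Summarise comments per user from an iterable of author names.

    `authors` holds one entry per comment (hypothetical input format).
    """
    per_user = Counter(authors)
    if not per_user:
        raise ValueError("no comments to summarise")

    # Comment counts per user, ascending, for the percentile lookups.
    counts = sorted(per_user.values())

    # Nearest-rank 75th percentile: the smallest count such that at
    # least 75% of users posted that many comments or fewer.
    p75 = counts[math.ceil(0.75 * len(counts)) - 1]

    return {
        "total_comments": sum(counts),
        "unique_users": len(counts),
        "median": median(counts),
        "p75": p75,
        "top": per_user.most_common(top_n),  # [(author, n_comments), ...]
    }
```

Looking up a particular user's rank (say, dang's) is then a matter of finding their position in the `top` list with a large enough `top_n`.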

Even if the numbers have doubled since then, there is no need to moderate millions of users, as only (well) tens of thousands of them are actively participating.

[1] https://www.reddit.com/r/datasets/comments/6v685o/complete_h...

replies(1): >>danso+h4
2. danso+h4[view] [source] 2019-08-08 13:08:35
>>w-m+(OP)
The "official" dataset on BigQuery seems to have been last updated in October 2018:

https://news.ycombinator.com/item?id=19304326

replies(1): >>minima+It
3. minima+It[view] [source] [discussion] 2019-08-08 15:59:10
>>danso+h4
That page was last updated in October 2018, but the dataset itself (`bigquery-public-data.hacker_news.full`) is up to date and continuously updated.
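
If you want per-user comment counts for a given month straight from that table, something like the sketch below should be close. It only builds the query text; the column names (`by`, `type`, `timestamp`) are my assumption about the table's schema, and the helper name `monthly_comment_counts_sql` is made up for illustration, so check the schema before running it.

```python
def monthly_comment_counts_sql(year, month,
                               table="bigquery-public-data.hacker_news.full"):
    """Build a BigQuery standard-SQL query counting comments per author.

    Column names (`by`, `type`, `timestamp`) are assumed, not verified here.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")

    # Half-open range [start, end): first day of the month up to the
    # first day of the following month.
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    start = f"{year:04d}-{month:02d}-01"
    end = f"{next_year:04d}-{next_month:02d}-01"

    # `by` is a reserved word in BigQuery, hence the backticks.
    return (
        "SELECT `by` AS author, COUNT(*) AS n_comments\n"
        f"FROM `{table}`\n"
        "WHERE type = 'comment'\n"
        "  AND `by` IS NOT NULL\n"
        f"  AND timestamp >= TIMESTAMP('{start}')\n"
        f"  AND timestamp < TIMESTAMP('{end}')\n"
        "GROUP BY author\n"
        "ORDER BY n_comments DESC"
    )
```

Calling `monthly_comment_counts_sql(2018, 5)` gives the query text for May 2018, which can be pasted into the BigQuery console.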