Sites classified as "general news" (ordered by frequency in the front-page archive): nytimes.com, bbc.com, bbc.co.uk, theguardian.com, washingtonpost.com, reuters.com, npr.org, cnn.com, slate.com, vice.com, latimes.com, cnet.com, yahoo.com, sfgate.com, cbc.ca, cnbc.com, guardian.co.uk, bits.blogs.nytimes.com, vox.com, salon.com, time.com, nymag.com, telegraph.co.uk, boston.com, newsweek.com, chronicle.com, msn.com, axios.com, news.com.com, propublica.org, independent.co.uk, timesonline.co.uk, mercurynews.com, theglobeandmail.com, pbs.org, theintercept.com, usatoday.com, buzzfeednews.com, spiegel.de, rollingstone.com, thestandard.com, go.com, smh.com.au, cbsnews.com, abc.net.au, nbcnews.com, seattletimes.com, aljazeera.com, bloombergview.com, motherjones.com, firstlook.org, thehill.com, apnews.com, informationweek.com, news.com, thedailybeast.com, huffingtonpost.com, theage.com.au, csmonitor.com, nwsource.com, japantimes.co.jp, thestar.com, bostonglobe.com, dw.com, indiatimes.com, nypost.com, ap.org, chicagotribune.com, sfchronicle.com, dailymail.co.uk, news.com.au, foxnews.com, kqed.org, theatlanticwire.com, scmp.com, texasmonthly.com, wbur.org, yahoo.net, swissinfo.ch, nationalpost.com, spectator.co.uk, sfweekly.com, detroitnews.com, theweek.com, nzherald.co.nz, washingtonexaminer.com, aljazeera.net, cbslocal.com, nltimes.nl, weeklystandard.com, ctvnews.ca, miamiherald.com, nydailynews.com, thetimes.co.uk, dallasnews.com, startribune.com, bostonherald.com, euronews.com, kuow.org, themorningnews.org, upi.com, globalnews.ca, guardiannews.com, theherald.com.au, thesun.co.uk, belfasttelegraph.co.uk, houstonchronicle.com, ibtimes.co.uk, koreaherald.com, metro.co.uk, mirror.co.uk, seattleweekly.com, standard.co.uk, dailyherald.com, huffingtonpost.co.uk, huffingtonpost.com.au, huffpost.com, inquirer.com, ktvu.com, ocweekly.com, sundayherald.com, theweek.co.uk, wpri.com, wtsp.com, americanchronicle.com, annarborchronicle.com, augustachronicle.com, catholicherald.co.uk, 
dukechronicle.com, heraldsun.com.au, katu.com, kdvr.com, kfor.com, ktla.com, myfox8.com, myfoxdc.com, myfoxny.com, news-herald.com, news.google.ca, pressherald.com, thechronicleherald.ca, timesherald.com, wttw.com, wtvr.com, wunc.org, wvgazette.com.
This is based on downloading all archived HN front pages from 2007-02-20 through 2023-06-21 and analysing stories by title, site, votes, comments, and submitter.
(I generally use 2009 as a representative "early year" as HN was sorting things out and evolving rapidly in 2007 & 2008.)
By year:
2007 418
2008 438
2009 407
2010 290
2011 271
2012 222
2013 224
2014 259
2015 329
2016 442
2017 426
2018 476
2019 418
2020 251
2021 194
2022 167
2023 95
This is the first time I've looked at these numbers specifically. I'm noting the substantial fall-off in 2020, which I suspect is paywall-related. Note that data for 2023 are partial.
Sites: bloomberg.com, wsj.com, economist.com, venturebeat.com, businessweek.com, businessinsider.com, fastcompany.com, inc.com, hbr.org, ft.com, alleyinsider.com, forbes.com, fortune.com, nikkei.com, marketwatch.com, xconomy.com, entrepreneur.com, portfolio.com, business2.com, cio.com, bizjournals.com, bloombergquint.com, insidefacebook.com, nasdaq.com, fool.com, financialpost.com, prnewswire.com, adweek.com, morningstar.com, americanbanker.com, businessinsider.com.au, industryweek.com, bankertimes.com, businessinsider.co.za, businessinsider.de, businessinsider.fr, forbesindia.com
As above, these are ordered by overall frequency within the FP archive.
Site list: theatlantic.com, newyorker.com, archive.org, smithsonianmag.com, qz.com, nationalgeographic.com, aeon.co, openculture.com, theconversation.com, might.net, theparisreview.org, vanityfair.com, ted.com, popularmechanics.com, laphamsquarterly.org, buzzfeed.com, fivethirtyeight.com, outsideonline.com, thehustle.co, newrepublic.com, foreignpolicy.com, harpers.org, esquire.com, longreads.com, newstatesman.com, lettersofnote.com, gq.com, thewalrus.ca, cjr.org, strongtowns.org, historytoday.com, variety.com, hyperallergic.com, 1843magazine.com, collectorsweekly.com, theamericanscholar.org, nplusonemag.com, bigthink.com, brainpickings.org, thenation.com, theoutline.com, theinformation.com, washingtonmonthly.com, macleans.ca, redherring.com, thenewatlantis.com, prospectmagazine.co.uk, quoteinvestigator.com, theawl.com, airspacemag.com, calvertjournal.com, canada.com, mensjournal.com, torontolife.com, thecorrespondent.com, thecritic.co.uk, britishmuseum.org, nationalgeographic.co.uk, publishersweekly.com, autoweek.com, folksonomy.org, laweekly.com, menshealth.com, rijksmuseum.nl, metmuseum.org, prospect-magazine.co.uk, wunderground.com, agweek.com, banksy.co.uk, banksyfilm.com, minnesotamonthly.com, openlettersmonthly.com
(Again, by order of frequency in front-page stories.)
This and other percentages are based on 35% of stories being unclassified, that is, coming from sites I've not explicitly tagged. Based on some random sampling of that pool, those are most often blogs or corporate sites. My classification for news, science/academic, and programming sites is generally more comprehensive, as I'm able to leverage regex matches: "edu" and "ac" for academic and GitHub and GitLab domains for programming, for example, as well as US station call-letter patterns such as [KW][A-Z][A-Z][A-Z] for many general news sites.
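For anyone curious what that kind of regex-based tagging might look like, here's a minimal Python sketch. To be clear, these are not my actual patterns: the regexes, the `classify` function name, and the category labels are all hypothetical, written only to illustrate the "edu"/"ac", GitHub/GitLab, and call-letter ideas described above.

```python
import re

# Hypothetical patterns, loosely following the description above.
# Academic: .edu, .edu.xx, or .ac.xx (e.g. mit.edu, unsw.edu.au, cam.ac.uk).
ACADEMIC_RE = re.compile(r"\.(edu(\.[a-z]{2})?|ac\.[a-z]{2})$")
# Programming: github/gitlab under .com or .io, including subdomains.
PROGRAMMING_RE = re.compile(r"(^|\.)(github|gitlab)\.(com|io)$")
# US broadcast call letters: K or W plus three letters, as the whole hostname.
CALL_LETTERS_RE = re.compile(r"^[kw][a-z]{3}\.(com|org)$")


def classify(domain: str) -> str:
    """Return a coarse category for a site, or 'unclassified' if no pattern hits."""
    d = domain.lower()
    if d.startswith("www."):
        d = d[4:]
    if ACADEMIC_RE.search(d):
        return "academic"
    if PROGRAMMING_RE.search(d):
        return "programming"
    if CALL_LETTERS_RE.match(d):
        return "general news"
    return "unclassified"


if __name__ == "__main__":
    for site in ["cam.ac.uk", "foo.github.io", "kqed.org", "example.com"]:
        print(site, "->", classify(site))
```

A call-letter pattern this loose will also catch unrelated four-letter domains that happen to start with K or W, so in practice the matches would need a manual pass before being trusted.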