zlacker

[return to "News Minimalist – Only significant news"]
1. finnjo+G81[view] [source] 2023-05-03 08:58:18
>>t0bia_+(OP)
I am European and see another US news oriented site which ironically is not "minimalist" from my perspective. It is (to me) littered with US domestic concerns. My point is that _minimalist_ is highly subjective and a pretty huge promise from a site.

But kudos for the effort; keeping the news small is a most noble cause.

2. yakhin+I91[view] [source] 2023-05-03 09:04:48
>>finnjo+G81
Agreed. As a Canadian, I also find that too many US-centric stories made the cut today, but often that's not the case. Here's a recent issue covering different topics and not mentioning the US at all: https://newsletter.newsminimalist.com/p/tuesday-april-25-3-m...
3. Michae+CE1[view] [source] 2023-05-03 12:46:25
>>yakhin+I91
Oddly enough, in the 'least significant' section (scores <0.5), it's nearly all Daily Mail articles focused on the UK.

Also, the second article from the bottom seems to be incorrectly categorized: "Regenerative medicine has come a long way, baby"

Which is actually a serious look back at the advancements of the last quarter century, hardly deserving the second-to-last position.

It seems like ChatGPT is ranking them not by actual content significance but by the presumed significance of the headline. (Which would also make sense technically, as ~1200 headlines are about the max context length of GPT-4.)

4. yakhin+lS1[view] [source] 2023-05-03 14:10:05
>>Michae+CE1
Nice catch. Just checked that article — it actually got a rating of 2.8 based on the news content alone, but the source-credibility score of 1/10 brought it down to 0.3.

I don't think that's fair; I think ChatGPT hallucinated that it's a tabloid.

Not sure how to fix this. I don't want to adjust source credibility manually — that would introduce too much bias. My hope is that OpenAI will update ChatGPT with newer data so I can rerun the credibility evaluation.
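If the final rating really is the content rating scaled by the 1-10 credibility score, the numbers in this comment work out. A minimal sketch (the function name and the exact weighting are my assumptions, not the site's actual code):

```python
def combined_score(content_rating: float, credibility: int) -> float:
    # Assumed weighting: scale the 0-10 content rating by credibility/10.
    return round(content_rating * (credibility / 10), 1)

# The article above: rated 2.8 on content, credibility hallucinated as 1/10.
print(combined_score(2.8, 1))   # -> 0.3
# The same article from a fully trusted source would keep its rating.
print(combined_score(2.8, 10))  # -> 2.8
```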

5. Michae+cX1[view] [source] 2023-05-03 14:34:49
>>yakhin+lS1
Assuming an average of 20 tokens per headline (~10-14 words), 1200 headlines would be 24,000 tokens, which is already near the limit of the API-exclusive GPT-4's 32,768-token window, and way beyond the 8,192-token window of the ChatGPT version.
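The back-of-envelope math, spelled out:

```python
TOKENS_PER_HEADLINE = 20   # ~10-14 words per headline
HEADLINES = 1200

total = TOKENS_PER_HEADLINE * HEADLINES
print(total)               # -> 24000
print(total <= 32768)      # fits the 32k API-only GPT-4 window -> True
print(total <= 8192)       # fits the ChatGPT window -> False
```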

So it's exceedingly unlikely the actual content, beyond the headline, is processed if you're using the ChatGPT version.

6. yakhin+D22[view] [source] 2023-05-03 15:03:49
>>Michae+cX1
I score each article individually, so there's no need to pack many articles into one context window.

In 99% of cases, a single news article fits within the context.

I drop those that don't fit; the few examples I checked were announcements of lottery numbers (too many tokens) and articles with broken HTML.
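A rough sketch of that per-article filtering — the token estimate and the context limit here are my assumptions, not the actual pipeline:

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return len(text) // 4

CONTEXT_LIMIT = 8192  # assumed per-call budget for one article plus prompt

def scorable(articles: list[str]) -> list[str]:
    # Each surviving article would get its own scoring call; oversized ones
    # (lottery-number dumps, broken HTML) are simply dropped.
    return [a for a in articles if estimate_tokens(a) <= CONTEXT_LIMIT]

print(len(scorable(["a short article", "x" * 100_000])))  # -> 1
```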

7. Michae+fs2[view] [source] 2023-05-03 17:04:06
>>yakhin+D22
The score has to be relative to other articles, or else it's too random to have meaning. ChatGPT doesn't even give consistent scores from session to session for the same article.

And the context-length limit prevents that relation from extending to more than a few articles, if that's your method.

I.e., your method doesn't actually produce a meaningful score that can be ranked in a linear order against the 1,200 other articles.

At most, it would make sense to rank a discrete score in relation to the few other articles in the same context.

Anything beyond that should be placed in 'score ranges' (5 to 7, for example), not given a discrete score.
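Bucketing into ranges instead of discrete scores could be as simple as the following sketch (the bucket width is my choice for illustration, not a proposal from the thread):

```python
def score_range(score: float, width: int = 2) -> str:
    # Collapse a 0-10 score into a coarse bucket, e.g. 6.3 -> "6-8".
    low = int(score // width) * width
    return f"{low}-{min(low + width, 10)}"

print(score_range(6.3))  # -> "6-8"
print(score_range(0.3))  # -> "0-2"
```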

8. yakhin+hT2[view] [source] 2023-05-03 19:17:05
>>Michae+fs2
You are spot on. I use temperature 0, but even then, ChatGPT can be unpredictable.

Sometimes I'm very frustrated by the news that gets to the top. When I try to debug it, it gives me a completely different score.

I considered using ranges instead of discrete scores, but dropped the idea: it makes it too hard to pick the 1-5 articles that should make the newsletter (there are 71 articles in that range right now), and it's hard to display that idea clearly in the UI.

I guess my position right now is: it's not perfect, there are obvious errors (like the one you found above), and improvements are definitely possible.

But I hope some people will find it "good enough" even with these inconsistencies. I also hope that ChatGPT or another LLM will make big progress soon and solve this problem automatically.
