I don't think it's fair, I think ChatGPT hallucinated that it's a tabloid.
Not sure how to fix this. I don't want to adjust source credibility manually, as that would introduce too much bias. My hope is that OpenAI will update ChatGPT with newer data and I can rerun the credibility evaluation.
So it's exceedingly unlikely that the actual content, beyond the headline, is processed if you're using the ChatGPT version.
In 99% of cases a single news article fits within the context.
I drop those that don't fit, since several examples I saw were announcements of lottery numbers (too many tokens) and articles with broken HTML.
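For anyone curious, the drop step could look roughly like the sketch below. It assumes about four characters per token and a hypothetical budget of 4,000 tokens; neither number comes from this thread, and a real tokenizer would give exact counts. The function names are made up for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: assume ~4 characters per token (an assumption, not a real tokenizer)."""
    return (len(text) + 3) // 4  # ceiling division by 4


def fits_context(article: str, budget: int = 4000) -> bool:
    """True if the estimated token count is within the (hypothetical) budget."""
    return estimate_tokens(article) <= budget


def drop_oversized(articles: list[str], budget: int = 4000) -> list[str]:
    """Keep only the articles whose estimated size fits the budget."""
    return [a for a in articles if fits_context(a, budget)]
```

Oversized lottery-number pages or pages bloated by broken markup would simply fall out of the list before scoring.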
And the context length limit prevents that relation from extending to more than a few articles, if that's your method.
I.e., your method doesn't actually produce a meaningful score that can be ranked in some linear order with the 1200 other articles.
At most it would make sense to rank a discrete score in relation to the few other articles it remembers.
Anything beyond that should be placed in 'score ranges', from 5 to 7 for example, rather than given a discrete score.
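To make the 'score ranges' idea concrete, here is a minimal sketch. It assumes integer scores on a 0 to 10 scale and uses hypothetical band boundaries chosen only so that 5 to 7 is one band; nothing here reflects how the site actually scores articles.

```python
from collections import Counter

# Hypothetical bands on an assumed 0-10 integer scale.
BANDS = [(0, 1), (2, 4), (5, 7), (8, 10)]


def to_band(score: int) -> tuple[int, int]:
    """Map a discrete score to the band that contains it."""
    for low, high in BANDS:
        if low <= score <= high:
            return (low, high)
    raise ValueError(f"score {score} is outside the assumed 0-10 scale")


def band_counts(scores: list[int]) -> Counter:
    """Count how many articles land in each band."""
    return Counter(to_band(s) for s in scores)
```

Articles in the same band are treated as equivalent, so no order is claimed between, say, a 5 and a 7.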
Sometimes I'm very frustrated about the news that gets to the top. When I try to debug it, it gives me a completely different score.
I considered using ranges instead of discrete scores, but dropped the idea, as it makes it too hard to find the 1-5 articles that should make it into the newsletter (there are 71 articles in this range right now) and it's hard to display that idea clearly in the UI.
I guess my position right now is — it's not perfect, there are obvious errors (like the one you found above), and improvements are definitely possible.
But I hope that some people will find it "good enough" even with these inconsistencies. I also hope that ChatGPT or another LLM will make big progress soon that solves this problem automatically.
I just realized that, for that particular news article about regenerative medicine, it was my mistake all along. I asked ChatGPT to give unknown sources a score of 1 and completely forgot about it. I think that's what it did.
So far it has marked only 8 sources as unknown out of 1700.
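One way to keep this kind of mix-up visible, sketched under assumptions: store "unknown" as a separate value (`None` here) instead of folding it into a score of 1, so unknown sources can be listed and reviewed apart from sources that really scored low. The function name and data layout are hypothetical, not the actual pipeline.

```python
from typing import Optional


def split_known(
    sources: dict[str, Optional[int]],
) -> tuple[dict[str, int], list[str]]:
    """Separate sources with a real score from those marked unknown (None)."""
    known = {name: score for name, score in sources.items() if score is not None}
    unknown = sorted(name for name, score in sources.items() if score is None)
    return known, unknown
```

With this layout a genuine score of 1 and "never heard of it" can no longer be confused when debugging a single article.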