zlacker

[return to "My AI skeptic friends are all nuts"]
1. a_bono+vq 2025-06-02 23:58:23
>>tablet+(OP)
I find the Konwinski Prize very interesting in this context: 1 million dollars to whoever's open-source LLM solves >90% of a set of novel GitHub issues.

https://www.kaggle.com/competitions/konwinski-prize/

Currently, the #1 spot sits at a score of 0.09, not 0.9. A far cry from being useful. I know that open-source models are not as good as closed-source ones, but still, we're a long way from LLMs being good at code on their own.

And that supports OP's point - these tools aren't AGI, they produce trash that needs evaluation, but they're still useful.

◧◩
2. jachee+ft 2025-06-03 00:22:59
>>a_bono+vq
They’re tab-completion with extra cognitive-load steps.
◧◩◪
3. a_bono+zu 2025-06-03 00:34:13
>>jachee+ft
I mean, if you can solve 9% of GitHub issues automatically, that's a fairly huge load of work you can automate. Then again, you'd have to manually identify which 9% of issues.
◧◩◪◨
4. blibbl+vv 2025-06-03 00:43:55
>>a_bono+zu
"update dependencies"

that would probably cover it, and you don't need "AI" to do that
