zlacker

[return to "Watching AI drive Microsoft employees insane"]
1. margor+72 2025-05-21 11:23:29
>>laiysb+(OP)
The process is so stochastic that it's basically unusable for any large-scale task. What's the plan? To roll the dice until the answer pops up? That might be viable if there were a way to evaluate the output automatically with 100% reliability, but with a human required in the loop it becomes untenable.
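
To put rough numbers on the dice-rolling, here is a minimal sketch. It assumes each attempt succeeds independently with a fixed probability `p` and that a perfect automatic check exists; both are assumptions for illustration, and the function names (`attempts_until_pass`, `expected_attempts`) are made up rather than taken from any real agent framework.

```python
import random


def attempts_until_pass(generate, verify, max_attempts=100):
    """Call generate() until verify() accepts the result.

    Returns (result, attempts_used), or (None, max_attempts) if nothing passed.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = generate()
        if verify(candidate):
            return candidate, attempt
    return None, max_attempts


def expected_attempts(p_success):
    """Mean of a geometric distribution: average tries until the first success."""
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return 1.0 / p_success


if __name__ == "__main__":
    rng = random.Random(0)
    p = 0.2  # assumed per-attempt success rate, purely illustrative

    def generate():
        # Stand-in for one stochastic agent run: True means "correct answer".
        return rng.random() < p

    def verify(ok):
        # Stand-in for a perfect automatic evaluator (hypothetical).
        return ok

    runs = [attempts_until_pass(generate, verify)[1] for _ in range(10_000)]
    print("simulated mean attempts:", sum(runs) / len(runs))
    print("theoretical mean attempts:", expected_attempts(p))
```

Under these assumptions the average number of attempts is 1/p, and every one of those attempts has to be checked. If the check is a human review instead of an automatic `verify`, that is 1/p reviews per task, which is where it stops scaling.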
◧◩
2. eterev+J2 2025-05-21 11:33:07
>>margor+72
The plan is to improve AI agents from their current ~intern level to the level of a good engineer.
◧◩◪
3. ethano+l3 2025-05-21 11:38:38
>>eterev+J2
Seems like that is taking a very long time, on top of some very grandiose promises being made today.
◧◩◪◨
4. infect+K5 2025-05-21 11:58:42
>>ethano+l3
I look back over the past 2-3 years and am pretty amazed at how quickly change and progress have come. The promises are indeed large, but progress has been fast. I'm not defending the promises, but “taking a very long time” does not seem like an accurate description.
◧◩◪◨⬒
5. bakugo+O9 2025-05-21 12:28:49
>>infect+K5
> I look back over the past 2-3 years and am pretty amazed at how quickly change and progress have come.

Now look at the past year specifically, and only at the models themselves, and you'll quickly realize that there's been very little real progress recently. Claude 3.5 Sonnet was released 11 months ago, and the current SOTA models are only marginally better in terms of pure performance on real-world tasks.

The tooling around them has clearly improved a lot, and neat tricks such as reasoning have been introduced to help models tackle more complex problems, but the underlying transformer architecture is already being pushed to its limits and it shows.

Unless some new revolutionary architecture shows up out of nowhere and sets a new standard, I firmly believe that we'll be stuck at the current junior level for a while, regardless of how much Altman & co. insist that AGI is just two more weeks away.
