zlacker

[return to "We gave 5 LLMs $100K to trade stocks for 8 months"]
1. bcrosb+l2 2025-12-04 23:20:57
>>cheese+(OP)
> Grok ended up performing the best while DeepSeek came close to second. Almost all the models had a tech-heavy portfolio which led them to do well. Gemini ended up in last place since it was the only one that had a large portfolio of non-tech stocks.

I'm not an investor or researcher, but this triggers my spidey sense... it seems to imply they aren't measuring what they think they are.

2. tclanc+tk 2025-12-05 01:24:31
>>bcrosb+l2
I mean, run the experiment during a different trend in the market and the results would probably be wildly different. This feels like chartists [1] but lazier.

[1] https://www.investopedia.com/terms/c/chartist.asp

3. refact+ro 2025-12-05 01:59:13
>>tclanc+tk
If you read trading blogs around the time LSTMs came out, you'd have seen all sorts of weird stuff: models predicting the price at t+1 on a very bad train/test split, where the author would usually say "it predicts t+1 with 99% accuracy compared to t", and the predicted graph would be an exact copy of the actual one, shifted by one step.

So eyeballing the graph looks great, almost perfect even, until you realize that in real time the model would've predicted yesterday's high on the day of today's market crash, and you'd have lost everything.

4. blitza+Fj2 2025-12-05 16:20:22
>>refact+ro
If you feed in raw prices, e.g. 280.1, 281.5, 281.9 ..., you are going to get some pretty good-looking results when it comes to predicting the next day's price (t+1) within a margin of +/- a percent or so, simply because prices rarely move more than that from one day to the next.
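
A minimal sketch of the effect described in the last two comments, assuming a synthetic random walk with roughly 1% daily moves. The starting price of 280, the volatility, the seed, and the function names are all illustrative assumptions, not anything taken from the experiment or from a real LSTM. The "model" here is just persistence: it predicts tomorrow's price as today's price, which is the "exact copy with a t+1 offset".

```python
import random

def simulate_prices(n_days=500, start=280.0, daily_vol=0.01, seed=0):
    """Synthetic random-walk prices; hypothetical data, not real quotes."""
    rng = random.Random(seed)
    prices = [start]
    for _ in range(n_days - 1):
        # Each day moves by a random percentage with no predictable drift.
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0, daily_vol)))
    return prices

def mean_abs_pct_error(actual, predicted):
    """Average of |actual - predicted| / actual over all days."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

prices = simulate_prices()
actual = prices[1:]      # prices on days 1..n-1
forecast = prices[:-1]   # "prediction" for day t+1 is just the price on day t

mape = mean_abs_pct_error(actual, forecast)
within_1pct = sum(
    abs(a - p) / a <= 0.01 for a, p in zip(actual, forecast)
) / len(actual)

# The move the forecast implies from today's price is zero on every day,
# so it carries no information about direction.
implied_moves = [f - today for f, today in zip(forecast, prices[:-1])]

print(f"mean abs % error:      {mape:.2%}")
print(f"days within 1%:        {within_1pct:.0%}")
print(f"nonzero implied moves: {sum(m != 0 for m in implied_moves)}")
```

Under these assumptions the average error should come out on the order of a percent, which looks impressive on a chart, while the forecast never says whether the price will go up or down. Any learned model evaluated on raw prices would need to beat this baseline before its accuracy means anything.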