It would almost be more interesting to specifically train the model on half the available market data, then test it on the other half. But here it’s like they added a big free loot box to the game and then said “oh wow the player found really good gear that is better than the rest!”
Edit: from what I casually remember, a hedge fund can beat the market for 2-4 years, but at 10 years and up their chances of beating the market go to very close to zero. Since LLMs have not been around for that long, it is going to be difficult to test this without somehow segmenting the data.
Yes, ideally you’d have a model trained only on data up to some date, say January 1, 2010, and then start running the agents in a simulation where you give them each day’s new data (news, stock prices, etc.) one day at a time.
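A rough sketch of what that replay loop could look like, assuming the history is just a dict keyed by date. `walk_forward`, `decide`, and the toy prices are made up for illustration; the point is only that the agent never sees anything dated after "today".

```python
from datetime import date, timedelta

def walk_forward(history, start, decide):
    """Replay `history` one day at a time, beginning at `start`.

    history: dict mapping date -> that day's observation (hypothetical format)
    decide:  callable that receives only the observations visible so far
    """
    days = sorted(history)
    decisions = {}
    for today in days:
        if today < start:
            continue  # earlier days are background data, not simulated days
        # Only data dated on or before `today` is visible to the agent.
        visible = {d: history[d] for d in days if d <= today}
        decisions[today] = decide(visible)
    return decisions

# Toy data: a made-up price series, not real market data.
first = date(2010, 1, 1)
history = {first + timedelta(days=i): 100 + i for i in range(-5, 5)}

# A trivial stand-in "agent" that just reports the latest date it can see.
decisions = walk_forward(history, first, lambda visible: max(visible))
```

If the loop is leak-free, the latest date the agent can see on each simulated day is that day itself. This does nothing about leakage through the model's own training data, which is why the training cutoff matters.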
I think a potentially better way would be to segment the market up to today but take half or 10% of all the stocks and make only those available to the LLM. Then run the test on the rest. This accounts for rules and external forces changing how markets operate over time. And you can do this over and over picking a different 10% market slice for training data each time.
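As a minimal sketch of the "pick a different 10% each time" part, assuming the universe is just a list of ticker strings (the `ticker_splits` helper and the `STOCKnn` names are placeholders, not a real dataset):

```python
import random

def ticker_splits(tickers, train_fraction=0.1, rounds=3, seed=0):
    """Yield (train, test) ticker lists, drawing a fresh random slice each round."""
    rng = random.Random(seed)  # seeded so the splits are reproducible
    k = max(1, round(len(tickers) * train_fraction))
    for _ in range(rounds):
        train = set(rng.sample(tickers, k))
        test = [t for t in tickers if t not in train]
        yield sorted(train), test

# Placeholder universe of 20 made-up tickers.
tickers = [f"STOCK{i:02d}" for i in range(20)]
splits = list(ticker_splits(tickers))
```

Each round gives the model 2 of the 20 tickers and holds out the other 18. The split is disjoint by ticker only, so it says nothing about whether the held-out stocks move together with the training ones.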
But then your problem is that if you exclude, let’s say, Intel from your training data and AMD from your testing data, then their ups and downs don’t really make sense, since they are direct competitors. If you separate by market segment, then training the model on software tech companies might not actually tell you accurately how it would do for commodities or currency trading. Or maybe I am wrong and trading is trading no matter what you are trading.
My working definition of technical analysis [0]
How is that relevant to what was proposed? If it's trading and training on 2010 data, what relevance do today's market dynamics and regulations have?
Which further begs the question, what's the point of this exercise?
Is it to develop a model that can compete effectively in today's market? If so then yeah, the 2010 trading/training idea probably isn't the best idea for the reasons you've outlined.
Or is it to determine the capacity of an AI to learn and compete effectively within any given arbitrary market/era? If so, then today's dynamics/constraints are irrelevant unless you're explicitly trying to train/trade on today's markets (which isn't what the person you're replying to proposed, but is obviously a valid desire and test case to evaluate in its own right).
Or is it evaluating its ability to identify what those constraints/limitations are and then build strategies based on it? In which case it doesn't matter when you're training/trading so much as your ability to feed it accurate and complete data for that time period be it today, or 15 years ago or whenever, which is no small ask.
One of the worst possible things to do in a competitive market is to trade by some publicly-available formulaic strategy. It’s like announcing your rock-paper-scissors move to your opponent in advance.
Occasionally it's (as far as I can tell) a legitimately new 'wow that's obvious' style thing and I consider prototyping it. :)
> I think a potentially better way would be to segment the market up to today but take half or 10% of all the stocks and make only those available to the LLM.
Autocorrelation is going to bite you in the ass. Those stocks are going to be coupled. Let's take an easy example. Suppose you include Nvidia in the training data and hold out AMD for test. Is there information leakage? Yes. The problem is that each company isn't independent. You have information leakage both in the setting where companies grow together and in zero-sum games (since x + y = 0, if you know x then you know y). But in this example AMD trends with Nvidia. Maybe not as much, but they go in the same direction. They're coupled.
Not to mention that in the specific setting the LLMs were given news and other information.
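To make the coupling point concrete, here's a toy simulation with synthetic returns (not real Nvidia/AMD data). It assumes both stocks are a shared sector factor plus independent noise; the 0.8 loading and the noise levels are arbitrary.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(0)
n_days = 2000

# Assumed model: a common sector factor drives both the "train" and "test" stock.
sector = [rng.gauss(0, 1) for _ in range(n_days)]
train_stock = [s + rng.gauss(0, 0.5) for s in sector]
test_stock = [0.8 * s + rng.gauss(0, 0.5) for s in sector]

# A series with no shared factor, for comparison.
unrelated = [rng.gauss(0, 1) for _ in range(n_days)]

coupled_corr = pearson(train_stock, test_stock)
baseline_corr = pearson(train_stock, unrelated)
```

Under these assumptions the held-out stock is strongly correlated with the training stock, while the unrelated series sits near zero. Anything that learned the training stock's history already knows a lot about the "unseen" one.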
In that case the winning strategy would be to switch hedge funds every 3 years.
Agriculture would have been considered tech 200 years ago.
Not really. Sentiment analysis in social networks has been around for years. It's probably cheaper to buy that analysis and feed it to LLMs than to have LLMs do it.
If they are all giving the LLMs money to invest and the AIs generally buy the same group of stocks, those stocks will go up. As more people attempt the strategy, it infuses fresh capital and, more importantly, signals to the trading firms that there are inflows to these stocks. I think it's probably a reflexive loop at this point.
When you flip a coin, you can easily get all heads for the first 2-4 flips, but over time it will average out to about 50% heads. It doesn’t follow from this that the winning strategy is to change the coin every 3 flips.
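A quick simulation of that, assuming fair coins and treating "three heads in a row" as the hot streak (the 20,000 coins and 6 flips are arbitrary choices):

```python
import random

rng = random.Random(0)

# 20,000 fair coins, 6 flips each; True means heads.
coins = [[rng.random() < 0.5 for _ in range(6)] for _ in range(20000)]

# Coins that "beat the market" on their first three flips.
hot = [c for c in coins if all(c[:3])]
hot_share = len(hot) / len(coins)

# Heads rate of those same coins on flips four to six.
later_heads = sum(sum(c[3:]) for c in hot) / (3 * len(hot))
```

About one coin in eight (0.5³ = 0.125) starts with three heads, and those coins still land heads about half the time afterwards. Picking the coin with the streak, or swapping coins every three flips, changes nothing about the next flip.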
It's just a system of interpreting money flows and trends on a graph.