Right around the time you get your first basic regression or classification model going, it will at least cross your mind. The vast piles of time series data, coupled with the possibility of retiring young, have the irresistible pull of an old treasure map found in your grandfather’s attic. How can you NOT think about it? Can you use machine learning to predict the market?
I had to at least try. Here’s what I did, and what I learned.
There are plenty of small-scale tutorials on the web that are a great place to start. They show you how to pull down the history of a stock, perhaps calculate a few indicators, and feed it all to a regression algorithm to try to predict the next day’s value. Alternatively, they use a classifier to predict whether the stock will rise or fall, without predicting a value.
I had two ideas on where to go from here. First, I wanted to go bigger. I theorized that there might be hidden relationships between some stocks, currencies, and financial indicators that were just too subtle to be found by eye. I figured a machine learning algorithm might be able to pick them out.
Second, I wasn’t going to pick a stock that I wanted to predict. I was going to train models for all of them and see which stocks performed best. The idea was that some companies might be more predictable than others, so I needed to find them.
I started by downloading the histories of most of the stocks in the S&P 500, a bunch of currency value history, and a couple of dozen financial indicators. A Python script took care of converting them into a consistent format, filling in missing values, and dropping time series that didn’t go back to at least the early 2000s. All told, when the dust cleared, I had over a thousand columns in a nice Pandas table with 18 years of data.
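If you’re curious what that preprocessing step roughly looked like, here’s a minimal sketch. The file layout, column names, and cutoff date are illustrative assumptions, not the exact script I used.

```python
# Rough sketch of the data prep (file layout and column names are assumptions):
# load per-symbol CSVs, drop series that don't reach back far enough, then
# align everything on one date index and fill the gaps.
import glob
import pandas as pd

MIN_START = pd.Timestamp("2002-01-01")    # series must start by roughly here

series = {}
for path in glob.glob("raw/*.csv"):
    symbol = path.split("/")[-1].replace(".csv", "")
    df = pd.read_csv(path, parse_dates=["Date"], index_col="Date")
    if df.index.min() > MIN_START:
        continue                          # history too short, drop it
    series[symbol] = df["Close"].rename(symbol)

# Stocks, currencies, and indicators trade on different calendars, so align
# them on a shared index and forward-fill the resulting holes.
data = pd.concat(series.values(), axis=1).sort_index().ffill()
```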
Once that was completed, I used the excellent TA-lib library to calculate a bunch of indicators for each time series over windows of 5, 10, and 30 days. I don’t have a financial background and didn’t know which ones would really add value, so I took the approach of adding a whole lot of them and letting the model sort them out. That made the number of columns explode: by the time the dataset was ready for training, it had over 32,000 columns.
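For what it’s worth, the indicator step can be sketched like this. The specific indicators shown are just examples of the kind of thing TA-lib produces; the real run used many more.

```python
# Sketch of the indicator expansion using TA-lib's Python wrapper. The three
# indicators here are illustrative; the actual run used a much longer list.
import numpy as np
import pandas as pd
import talib

WINDOWS = [5, 10, 30]
indicator_cols = {}

for symbol in data.columns:
    close = data[symbol].to_numpy(dtype=np.float64)
    for w in WINDOWS:
        indicator_cols[f"{symbol}_sma_{w}"] = talib.SMA(close, timeperiod=w)
        indicator_cols[f"{symbol}_rsi_{w}"] = talib.RSI(close, timeperiod=w)
        indicator_cols[f"{symbol}_mom_{w}"] = talib.MOM(close, timeperiod=w)

indicators = pd.DataFrame(indicator_cols, index=data.index)
full = pd.concat([data, indicators], axis=1)   # raw prices plus indicators
```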
I selected XGBoost as my algorithm because of its overall performance, and because it makes it easy to see which features a model is using to make its predictions. I set it up to loop through all the stocks in the dataset, training two models for each. The first was a classifier, which would predict whether the stock would rise or fall the next day. The second was a regression model, which predicted the next day’s close price.
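Here’s roughly what the per-stock model setup could look like. The hyperparameters and the make_targets helper are my own illustrative choices, not the exact configuration I ran.

```python
# Two models per stock (hyperparameters are placeholders): a classifier for
# the up/down call and a regressor for the next day's closing price.
from xgboost import XGBClassifier, XGBRegressor

def make_targets(full, symbol):
    """Next-day targets for one symbol, built from the wide feature table."""
    next_close = full[symbol].shift(-1)
    y_cls = (next_close > full[symbol]).astype(int)   # 1 = rises tomorrow
    return y_cls, next_close

clf = XGBClassifier(n_estimators=500, max_depth=6, learning_rate=0.05)
reg = XGBRegressor(n_estimators=500, max_depth=6, learning_rate=0.05)

# After fitting, clf.feature_importances_ shows which columns the model
# actually leaned on, which is the main reason I liked XGBoost for this.
```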
Each model was trained on 95% of the available data, with the remaining data held out as a validation set to simulate stock data it had never seen. That remaining 5% amounted to about 3 months’ worth of trading data. Any machine learning model will do a great job predicting the data it was trained on — the trick is to make it generalize and perform well on data it has never been exposed to.
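The important detail is that the split is chronological, never shuffled. Continuing the sketch above for a single symbol, it might look something like this (the helper names carried over from the earlier snippets are illustrative):

```python
# Chronological 95/5 split: the last ~5% of rows (roughly three months of
# trading days) is kept back as data the model has never seen.
y_cls, y_reg = make_targets(full, symbol)
X = full.iloc[:-1]                        # last day has no "next close" yet
y_cls, y_reg = y_cls.iloc[:-1], y_reg.iloc[:-1]

split = int(len(X) * 0.95)
X_train, X_val = X.iloc[:split], X.iloc[split:]
y_train, y_val = y_cls.iloc[:split], y_cls.iloc[split:]

clf.fit(X_train, y_train)
val_accuracy = (clf.predict(X_val) == y_val).mean()
```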
For the validation run, a simulated investment of $1000 was made to start. If the stock was predicted to rise, the simulation bought; if the forecast was for a drop, it sold. No effort was made to factor in trading costs, because I wanted to see what the results looked like without that.
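The simulation itself doesn’t need to be fancy. A sketch of the idea, under the same assumptions as the snippets above (and, as in the real run, ignoring trading costs):

```python
# Naive backtest: start with $1000, hold the stock overnight on days the
# classifier predicts a rise, sit in cash on days it predicts a drop.
# Trading costs and slippage are deliberately ignored.
def simulate(clf, X_val, prices, start_cash=1000.0):
    """prices: validation-period closes aligned row-for-row with X_val."""
    cash = start_cash
    preds = clf.predict(X_val)                 # 1 = rise expected, 0 = drop
    for i in range(len(preds) - 1):
        if preds[i] == 1:                      # "buy" and ride tomorrow's move
            cash *= prices[i + 1] / prices[i]
    return cash

final_value = simulate(clf, X_val, full[symbol].iloc[split:-1].to_numpy())
```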
I ran my Jetson TX2 flat out for a month.
What did it find?
As expected, for most stocks the results were poor — accuracy was not much better than a coin toss. There were some, though, that appeared to perform exceptionally well on the validation data. Quite a few doubled or tripled my simulated money in 3 to 6 months, and a couple generated a 20x profit in that time period. It turns out a graph can make your heart rate spike. Who knew?
Remember, this is data that the algorithm has never seen before — that last 5% that was kept out of the training dataset.
Had I found it? Were there some stocks that were subtly tied to market indicators, and could thus be predicted? If so, I could make money off the fluctuations in price.
By the time I had written the code and run it, several months had passed since I downloaded my giant dataset. I updated it to include the most recent trading data and decided to see what the models would have done during that timeframe. They had done great in their validation runs — would they have performed as well had I been trading live with them for the last couple of months? I was getting pretty excited by this point.
The results were puzzling and gloomy. Models that did great during their initial training and validation runs might do OK on later data, but could also fail spectacularly and burn all the seed money. Half the time the simulation would make money, and half the time it would go broke. Sometimes it would be just a few percentage points better than a coin toss, and other times it would be far worse. What had happened? It had looked so promising.
Lessons Learned
It finally dawned on me what I had done.
Results hovering around 50% are exactly what you’d expect if the stock prices were random walks. By letting my program hunt through hundreds of stocks to find ones it did well on, it did stumble across some stocks that it happened to predict well for the validation time frame. However, just a few weeks or months later, during a different slice of the random walk, it failed.
There was no subtle underlying pattern. The model had simply gotten lucky a few times by sheer chance, and I had cherry picked those instances. It was not repeatable.
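If you want to see how easy it is to fool yourself this way, you don’t even need real market data. Here’s a toy simulation (mine, not part of the original project): screen a few hundred coin-flip “stocks”, keep the ones that looked best over a validation window, and then check those same “winners” on fresh data.

```python
# Toy demonstration of the selection bias: 500 pure coin-flip "stocks",
# pick the 10 with the best validation accuracy, then re-test them later.
import numpy as np

rng = np.random.default_rng(0)
n_stocks, val_days, later_days = 500, 60, 60

val_acc = rng.binomial(val_days, 0.5, size=n_stocks) / val_days
later_acc = rng.binomial(later_days, 0.5, size=n_stocks) / later_days
winners = np.argsort(val_acc)[-10:]            # the luckiest ten

print("winners' validation accuracy:", val_acc[winners].mean())    # well above 0.5
print("same stocks on later data:   ", later_acc[winners].mean())  # back near 0.5
```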
Thus, it was driven home — machine learning is not magic. It can’t predict a random sequence, and you have to be very careful of your own biases when training models. Careful validation is critical.
I am sure I will not be the last to fall victim to the call of the old treasure map in the attic, but exercise caution. There are plenty of far less random time series to play with if you are looking to learn. Simulate, validate carefully, and be aware of your own biases.