Can Deep Learning Models Predict Stocks?
June 3, 2019
Transcript
When you hear that 70 percent of the trading volume in the entire US stock market is generated by algorithms, you might think you are missing out on something big. Are we the only fools in the market who still trade the traditional way? If the machines dominate the market today, do we mere mortals even stand a chance against the mighty machines?
Well, a big chunk of that automated 70% is a result of high-frequency trading algorithms trying to predict only milliseconds into the future. And those algorithms usually use very very simple methods, nothing fancier than a chain of hardcoded rules or a simple linear regression model. So yes, if you race against milliseconds, then you do need to be a machine.
But for casual investors like us, that shouldn't really matter. Or should it? Deep learning models can learn much more complex patterns in data. Is it possible to predict longer-term price movements in the market using deep learning? Nobody knows for sure.
Many large financial institutions are hiring data scientists, machine learning engineers, and deep learning experts with hefty salaries. So, does that mean it works for sure? Well, that may give us a sign about the trends in investing strategies, but institutional investing can be different from individual investing. For example, unless you are buying a very-low-volume stock, the shares you buy or sell barely have any impact on the price. But if you are buying and selling in large amounts, how you execute your trades can make a big difference. A machine learning model can help decide how you should split up your sales over time to avoid causing big price movements.
To actually predict the price movements, you can try a lot of things, from very simple things such as training LSTMs or Temporal Convolutional Networks on historical prices to overly complex models such as training Convolutional LSTMs on satellite imagery to predict macroeconomic movements. Any predictive model you may build practically tries to find some inefficiencies in the market. So, in a fully efficient market, none of these models should work. For example, your model can analyze text from various sources, such as financial news websites and social media, to decide whether a particular stock will go up or down. You can do sentiment analysis on text at the character level, or on audio. You can analyze not only what's in an earnings statement but also the way it's announced.
However, the efficient market hypothesis states that the stock prices reflect all available and relevant information immediately. It's likely that any new information that can impact the prices is already incorporated in the price by the time your model gets to parse it from the web. If you think this hypothesis is true, then using any data beyond the prices would be redundant. Then, should we just use nothing but the historical price information to build a predictive model? That saves us a lot of trouble. Technically, that's the opposite of what the hypothesis says. But for now, let's forget about that and talk about what we can do with historical price data.
Trying to forecast the direction of prices by finding patterns in past market data is a form of technical analysis. Day traders do this all the time. They look at charts and name the patterns they see, things like head and shoulders, cup and handle, shooting stars, etc. Neural networks are very good at finding patterns in data. If there really are such patterns, a neural network with enough capacity would easily be able to pick up any patterns that might lead to profit.
Give a neural network a price chart, and then it will fit a function to that chart as closely as possible. The problem is that being able to find patterns in past data doesn't mean that those patterns will generalize and hold in the future as well.
A neural network can even find 'patterns' in completely random data. But what the model actually learns wouldn't be any more useful than a lookup table. A table that holds the price information for the past data while having no predictive power. This is called overfitting, and you can check out my earlier videos to learn more about it. Machine learning models are not the only ones who find patterns in data that don't exist. Humans do too. We see faces in the clouds and inanimate objects. We see patterns in price charts and assume that the prices will regress to the mean. I generated this chart, for example, completely randomly by changing the price by some random percentage at each point. Yet there seems to be some pattern.
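A random chart like the one described above could be produced along these lines. This is only a minimal sketch: the function name, the uniform distribution, and the 2% maximum move per step are assumptions made for illustration, not the exact settings used for the chart in the video.

```python
import random

def random_walk_prices(n_steps=250, start_price=100.0, max_pct_move=0.02, seed=0):
    """Build a price series by applying a random percentage change at each step.

    All parameters are illustrative assumptions: each step moves the price
    by a percentage drawn uniformly from [-max_pct_move, +max_pct_move].
    """
    rng = random.Random(seed)  # seeded so the same "chart" can be regenerated
    prices = [start_price]
    for _ in range(n_steps):
        pct_change = rng.uniform(-max_pct_move, max_pct_move)
        prices.append(prices[-1] * (1.0 + pct_change))
    return prices

prices = random_walk_prices()
print(len(prices), round(min(prices), 2), round(max(prices), 2))
```

If you plot a few of these with different seeds, you will likely spot things that look like trends, support levels, or reversals, even though every step is independent of the ones before it.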
So, I would be cautious about technical analysis. There are so many books and success stories about how some authors got rich by doing technical analysis, but honestly, most of those authors either are getting rich by writing books about how they got rich or just happened to be lucky. Yes, the odds of such luck are not very high, but there's a lot of survivorship bias, because losers hardly ever write books about their failure stories. If you gather 1000 people in a room and ask them to predict coin flips, the odds of at least one person predicting the outcome of 8 flips in a row are over 98%. It takes only a room of people to find a clairvoyant. There are many many more people trading in the market.
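The coin-flip figure is easy to check. Here is a short sketch, assuming fair coins and that every person guesses independently; the function name is just for illustration.

```python
def prob_at_least_one_perfect(n_people=1000, n_flips=8):
    """Chance that at least one of n_people guesses n_flips fair coin flips in a row."""
    p_single = 0.5 ** n_flips                 # one person gets every flip right: 1/256
    p_nobody = (1.0 - p_single) ** n_people   # all n_people miss at least once
    return 1.0 - p_nobody

# Comes out at roughly 0.98 for 1000 people and 8 flips.
print(f"{prob_at_least_one_perfect():.4f}")
```

The same function shows how quickly this grows with the size of the crowd: try it with the number of people who actively trade instead of 1000, and a much longer streak becomes almost certain for somebody.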
Earlier I mentioned that according to the efficient market hypothesis, any publicly available information that might have an impact on the price of an asset has already done it. Past market data is publicly available information. So, any information it might have, such as some patterns, is likely to be already in the price in an efficient market. Not all markets are efficient or rational, though. For example, something as simple as the log-periodic power-law model was able to predict the 2018 Bitcoin bubble eight days before the bubble burst. It doesn't mean that it will predict the next one, or that there'll be another cryptocurrency bubble at all. Still, it's an interesting observation in hindsight.
Ok, let's say you want to experiment with machine learning models to predict the stock market just for fun. Where do you start?
You can start by defining the goal of your model and picking a corresponding loss function. For example, if you want your model to pick the best stock among some options, you can treat this as an n-way classification problem and use softmax cross-entropy as your loss function. If you want your model to give a rating between 0 and 1 to any given stock, then you can use sigmoid cross-entropy.
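To make the two loss functions concrete, here is a plain-Python sketch of both, written for a single example. The function names are made up for illustration, and in practice you would use the versions built into your deep learning framework rather than these.

```python
import math

def softmax_cross_entropy(logits, target_index):
    """Loss for picking one stock out of n: -log(softmax(logits)[target_index])."""
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]

def sigmoid_cross_entropy(logit, label):
    """Loss for a single 0-to-1 rating; label is the target, between 0 and 1."""
    # Stable form of -label*log(sigmoid(z)) - (1-label)*log(1-sigmoid(z)).
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# Hypothetical scores for three candidate stocks, where the second one was the best pick.
print(softmax_cross_entropy([0.2, 1.5, -0.3], target_index=1))
# Hypothetical score for one stock whose target rating is 1.
print(sigmoid_cross_entropy(0.8, label=1.0))
```

Both losses get smaller as the model puts a higher score on the right answer, which is all the optimizer needs.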
Next, you can go on to design your model architecture. It doesn't have to be a sophisticated model. You can stack one or two layers of LSTMs or gated recurrent units. You can even use temporal convolutional networks, which are simple and easy to train. You can check out my earlier video on recurrent models to learn more.
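To give a feel for how simple a temporal convolutional network is at its core, here is a sketch of its main building block, a causal dilated convolution, in plain Python. The kernels below are hand-picked placeholders rather than learned weights, the function names are hypothetical, and a real TCN would add things like multiple channels and residual connections on top of this.

```python
def causal_dilated_conv1d(series, kernel, dilation=1):
    """output[t] = sum over k of kernel[k] * series[t - k * dilation].

    Only the current and past values are used, so the output at time t
    never depends on the future. Values before the start count as zero.
    """
    out = []
    for t in range(len(series)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                total += weight * series[idx]
        out.append(total)
    return out

def tiny_tcn(series, kernels, dilations=(1, 2, 4)):
    """Stack causal convolutions with growing dilation, with a ReLU after each layer."""
    hidden = list(series)
    for kernel, dilation in zip(kernels, dilations):
        hidden = [max(0.0, v) for v in causal_dilated_conv1d(hidden, kernel, dilation)]
    return hidden

# Hypothetical daily returns and placeholder two-tap kernels for three layers.
returns = [0.01, -0.02, 0.015, 0.0, 0.03, -0.01, 0.005, 0.02]
features = tiny_tcn(returns, kernels=[[0.5, 0.5], [1.0, -1.0], [0.7, 0.3]])
print(features)
```

With two-tap kernels and dilations of 1, 2, and 4, the last output looks back over eight time steps, which is how these models cover a long history with very few layers.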
What about hardware? GPUs are one of the first things that come to mind when the topic is machine learning. In this particular case, you probably won't need any special hardware. If you already have a GPU, then use it. Otherwise, unless you plan to use some extraordinarily large volume of data, any model that is too big to train on a CPU will wildly overfit. So, whatever hardware you already have will probably be enough to get started. If you want to speed up your experiments using GPUs, you can always use cloud services and pay only for what you use.
One of the problems we have with using stock market data in deep models is that we don't have enough data to train a large model without overfitting. To reduce the risk of overfitting, you can do all sorts of data preprocessing and augmentation. For example, you can add a small amount of noise to your data, pick a random subset of stocks for each time interval at every epoch, and generate new samples by taking random linear combinations of existing stocks. Those samples should essentially behave like randomly managed mutual funds or ETFs.
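The three augmentation ideas above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the function names, the choice to work with daily returns rather than raw prices, the Gaussian noise, and the use of positive weights that sum to one for the combinations.

```python
import random

def add_noise(returns, noise_std=0.001, rng=random):
    """Add a small amount of Gaussian noise to each return."""
    return [r + rng.gauss(0.0, noise_std) for r in returns]

def random_subset(stock_returns, k, rng=random):
    """Pick k stocks at random, e.g. a fresh subset at every epoch."""
    tickers = rng.sample(sorted(stock_returns), k)
    return {t: stock_returns[t] for t in tickers}

def synthetic_fund(stock_returns, rng=random):
    """Blend existing stocks with random weights into a made-up 'fund'."""
    tickers = sorted(stock_returns)
    raw = [rng.random() for _ in tickers]
    total = sum(raw)
    weights = [w / total for w in raw]  # normalize so the weights sum to 1
    n_days = len(stock_returns[tickers[0]])
    return [
        sum(w * stock_returns[t][day] for w, t in zip(weights, tickers))
        for day in range(n_days)
    ]

# Hypothetical daily returns for three made-up tickers.
data = {
    "AAA": [0.01, -0.02, 0.03],
    "BBB": [0.00, 0.01, -0.01],
    "CCC": [0.02, 0.02, 0.02],
}
rng = random.Random(0)
print(add_noise(data["AAA"], rng=rng))
print(sorted(random_subset(data, 2, rng=rng)))
print(synthetic_fund(data, rng=rng))
```

Drawing new noise, a new subset, and new weights at every epoch means the model rarely sees the exact same sample twice, which makes it harder to memorize the training data.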
If you want to use data other than just price and volume information, you can look into Google Trends to see how much people are searching for particular keywords. However, any signal you might get from there is likely to be lagged.
Probably none of these will work. So, I wouldn't expect too much. But give it a try anyway. You'll learn a lot while you try, even if it doesn't work. If you find a trick that no one has found yet and it does work, have fun, and enjoy the profits.
Anything I say here is clearly not investment advice. I don't have a financial background. I'm a computer scientist who played a little bit with financial data in my spare time just for fun. I do have a decent amount of experience in machine learning, but my area of expertise is its applications on imaging data.
Alright, that's all for today. I hope you liked it. Subscribe for more videos, while keeping in mind that this is not the type of video that I usually make. And as always, thanks for watching, stay tuned, and see you next time.