New breakthroughs in AI make the headlines every day. Far from the buzz of customer-facing businesses, the wide adoption and powerful applications of Machine Learning in Finance are less well known. In fact, there are few domains with as much historical, clean and structured data as the financial industry, making it one of those predestined use cases where 'learning machines' made an early mark with tremendous success that still continues.
About three years ago, I got involved in developing Machine Learning (ML) models for price predictions and algorithmic trading in Energy markets, specifically for the European market of Carbon emission certificates. In this article, I want to share some of the learnings, approaches and insights which I have found relevant in all my ML projects since. Rather than on technical detail, my focus here is on the general considerations behind modelling choices, which are rarely discussed in the classical academic textbooks or online tutorials on new techniques. Using the example of algorithmic trading, I present some 'tricks of the trade' which you might find useful when applying Machine Learning to real-life contexts in the vast world beyond synthetic examples, as a lonely seeker or with your team of fellow data scientists.
The market of European Carbon Emission Certificates (EU ETS) was established in 2005 in the wake of the Kyoto Protocol as a major pillar of EU climate policy, regulating about half of Europe's anthropogenic CO2 emissions by a scheme of 'cap and trade'. This mechanism provides national governments with control over the total amount of greenhouse gas emissions ('cap') while conceding the efficient allocation of emission rights to market forces ('trade'). The basic idea is to put a price on pollution: each industrial installation covered by the scheme has to monitor and report its exact quantity of greenhouse gas emissions to the authorities and then offset the respective amount (measured in tons) by handing in allowances. These 'rights to pollution' are auctioned off or given for free to the industrial players, and can then be traded over-the-counter or on a central market at prices flexibly set by demand and supply. With the overall supply of yearly permits constrained by the reduction targets of the environmental policy, some polluters are compelled to opt for measures reducing their pollution ('abatement'), eg by installing additional filters in their chimneys. Those polluters with marginal abatement costs lower than the current market price of permits (eg because their specific filter requirements are cheap) can then sell their excess pollution allowances on the market for a profit, to polluters facing higher marginal abatement costs. In a perfectly efficient emissions trading market, the equilibrium price of permits would settle at the marginal abatement cost of the final unit of abatement required to meet the overall reduction target set by the cap on the supply of permits.
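To make the equilibrium argument concrete, here is a minimal sketch with made-up numbers. The abatement options, their costs and the required abatement are purely illustrative assumptions, and `equilibrium_price` is a hypothetical helper. Abatement options are sorted by marginal cost and taken up cheapest first until the required reduction is met; the cost of the last option used is the permit price in this idealised market.

```python
from typing import List, Tuple


def equilibrium_price(abatement_options: List[Tuple[float, float]],
                      required_abatement: float) -> float:
    """Marginal cost (EUR/ton) of the last unit of abatement needed.

    abatement_options: (marginal_cost_eur_per_ton, available_tons) pairs.
    required_abatement: tons that must be abated to stay under the cap.
    """
    remaining = required_abatement
    # In an efficient market the cheapest abatement is used first.
    for cost, tons in sorted(abatement_options):
        remaining -= tons
        if remaining <= 0:
            return cost
    raise ValueError("cap cannot be met with the available abatement options")


# Hypothetical market: three polluters with different abatement costs.
options = [(30.0, 40.0), (10.0, 50.0), (20.0, 30.0)]
price = equilibrium_price(options, required_abatement=70.0)
print(price)  # 20.0
```

In this toy market the polluters with costs of 10 and 20 EUR/ton abate, while the polluter facing 30 EUR/ton prefers to buy permits at 20 EUR/ton.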
Given the uncertainty about the actual industry-specific abatement costs, this instrument lets governments control the total amount of emissions, while the actual price of emission permits fluctuates according to demand-side market forces, namely
· market expectations about future policy changes
· the size of upcoming permit auctions, price and cover ratio of ongoing auctions (see Figure 1)
· speculation of market participants
· banking behaviour (permits issued in one year are valid for all years in the same policy phase)
· the price relations of other energy commodities.
To illustrate the last point, suppose the price of natural gas per calorific unit drops below the price of Brent oil. Power producers and utilities would switch over to this less carbon-intensive fuel, thus lowering the demand for carbon allowances. Accordingly, the price of allowances would drop as well in those periods (see Figure 2).
A comprehensive model needs to reflect all these factors. While we can safely assume that patterns observed in the abundant historical market data carry over into the present and will continue into the future (this is actually the sine qua non, the indispensable assumption for any analytical modelling), it is obvious that this setting is too complex for any approach trying to model the market based on generic beliefs, fundamental relations or state space concepts from Econophysics.
So this is really a use case to unleash the power of Machine Learning. How to leverage it?
Here is a typical workflow for a trading system using supervised learning:
1. Get the data in place
Good sources for financial time series are the API of the exchange you want to trade on, or the APIs of AlphaVantage or Quandl. The scale of the data should be at least as fine as the scale you want to model and ultimately predict. What is your forecast horizon? Longer-term horizons will require additional input factors like market publications, policy outlooks, sentiment analysis of Twitter revelations etc. If you are in for the game of short-term or even high-frequency trading based on pure market signals from tick data, you might want to include rolling averages of various lengths to provide your model with historical context and trends, especially if your learning algorithm does not have explicit memory cells as in Recurrent Neural Networks or LSTMs. All common indicators used in technical analysis (eg RSI, ADX, Bollinger Bands, MACD) are based on some sort of moving average of some quantity (price, trading volume). Even if you don't believe in simplistic trading rules, including them will help the model to reflect the trading behaviour of a majority of market participants. Your computational capacity might be a limiting factor, especially in a context where your ML model will be up against the hard-coded, fast and single-purpose algorithms of market makers or arbitrage seekers. Deploying dedicated cloud servers or ML platforms like H2O and TensorFlow allows you to spread computation over various servers. Clean the data (how do you interpolate gaps?), chart it, play with it - do you already spot trading opportunities, trends, anomalies?
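As a rough illustration of the rolling-average features mentioned above, the following sketch builds trailing moving averages of several lengths for a short price series. The window lengths, the feature names (`ma_3`, `ma_5`) and the prices are assumptions chosen for the example, and the code sticks to the standard library so it runs anywhere; with pandas, `Series.rolling(window).mean()` does the same job.

```python
from collections import deque
from typing import Dict, List, Optional, Sequence


def rolling_mean(prices: Sequence[float], window: int) -> List[Optional[float]]:
    """Trailing moving average; None until a full window of history exists."""
    out: List[Optional[float]] = []
    buf: deque = deque(maxlen=window)
    total = 0.0
    for p in prices:
        if len(buf) == window:
            total -= buf[0]  # oldest value is about to drop out of the window
        buf.append(p)
        total += p
        out.append(total / window if len(buf) == window else None)
    return out


def build_features(prices: Sequence[float],
                   windows: Sequence[int] = (3, 5)) -> List[Dict[str, Optional[float]]]:
    """One feature row per time step: the price plus one moving average per window."""
    columns = {f"ma_{w}": rolling_mean(prices, w) for w in windows}
    rows = []
    for i, p in enumerate(prices):
        row: Dict[str, Optional[float]] = {"price": p}
        for name, values in columns.items():
            row[name] = values[i]
        rows.append(row)
    return rows


prices = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
features = build_features(prices)
for row in features:
    print(row)
```

Rows without enough history carry `None`; whether to drop them or fill them is a modelling choice left open here.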
2. Supervised Model Training
Split your data into complementary sets for training, validation (for parameter tuning, feature selection etc) and testing. This is actually more complex than it sounds: ideally, the test set should be as 'similar' as possible to the present 'state of the market', and both validation and test set should follow the same distribution. Otherwise you might waste effort tuning the model parameters on the validation set only to find that the model generalizes poorly to the test set. Following the concept of 'market regimes' (ie extended periods where a specific combination of commodities dominates the price dynamics of your target instrument), it might be worthwhile to first have an unsupervised clustering algorithm discover defining correlations in the data and then evaluate model performance on data in the validation and test set belonging to the same clusters (see Figure 3 - in this project, clustering increased predictive performance by 8%).
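One simple way to keep the test set close to the present state of the market is to split chronologically rather than at random, so that the most recent observations end up in the test set. This is an assumption of the sketch below, not something the workflow prescribes, and the 70/15/15 proportions and the helper name `chronological_split` are arbitrary.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(samples: Sequence[T],
                        val_fraction: float = 0.15,
                        test_fraction: float = 0.15
                        ) -> Tuple[List[T], List[T], List[T]]:
    """Split time-ordered samples into train, validation and test sets.

    The oldest data goes to training, the most recent data to testing,
    and nothing is shuffled, so no future information leaks backwards.
    """
    n = len(samples)
    n_val = round(n * val_fraction)
    n_test = round(n * test_fraction)
    n_train = n - n_val - n_test
    train = list(samples[:n_train])
    val = list(samples[n_train:n_train + n_val])
    test = list(samples[n_train + n_val:])
    return train, val, test


# Stand-in for 100 time-ordered observations (index 99 is the most recent).
samples = list(range(100))
train, val, test = chronological_split(samples)
print(len(train), len(val), len(test))  # 70 15 15
```

A regime-aware evaluation as described above would additionally tag each sample with a cluster label and compare performance within matching clusters.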
Early on, decide on and establish a single-number evaluation metric. Chasing too many different metrics will only lead to confusion. In the context of algorithmic trading, a suitable measure is 'Profit and Loss' (PnL) as it weights classification precision (price up/down) with the actual size of the swing ('relevance'), and it fits with the metrics you may consider for your Trading Policy. Observe the model performance on the training and validation sets. If the error on the training set, ie 'model bias', is high, you may need to allow for more model parameters (eg by adding more layers/neurons in a deep learning model). If the model generalizes poorly ('the model is overfitting to the training set'), that is, the performance difference between validation and training set ('model variance') is high, you may need to add more data to the training set, reduce the number of features to the most relevant ones, add regularization (eg L2, L1 or dropout) or use early stopping (in the gradient descent optimization). Closely examining the cases where the model went wrong will help to identify any potential and avoidable model bias, see Figure 4.
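The difference between counting correct calls and weighting them by the size of the swing can be seen in a small example. The price moves and the two sets of predictions below are invented for illustration, and trading one unit per prediction with no costs is a simplifying assumption.

```python
from typing import Sequence


def directional_accuracy(predicted: Sequence[int], moves: Sequence[float]) -> float:
    """Share of periods where the predicted direction (+1/-1) matched the move."""
    hits = sum(1 for d, m in zip(predicted, moves) if d * m > 0)
    return hits / len(moves)


def profit_and_loss(predicted: Sequence[int], moves: Sequence[float]) -> float:
    """PnL of trading one unit in the predicted direction each period."""
    return sum(d * m for d, m in zip(predicted, moves))


# Hypothetical price changes: three small moves and one large drop.
moves = [0.10, 0.20, -0.05, -1.00]

model_a = [+1, +1, -1, +1]  # right on the small moves, wrong on the big one
model_b = [-1, +1, +1, -1]  # wrong twice, but right on the big one

for name, preds in (("A", model_a), ("B", model_b)):
    acc = directional_accuracy(preds, moves)
    pnl = profit_and_loss(preds, moves)
    print(f"model {name}: accuracy={acc:.2f}, PnL={pnl:.2f}")
```

Model A is right 75% of the time yet loses money, while model B is right only half of the time and ends up with a profit, which is why PnL is the more telling single number here.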
Establish your target performance: for market forecasts, a classification precision of 75% is actually quite good - it is 25 percentage points, or 50% in relative terms, better than random guessing (50% precision). This baseline is very different to other ML applications like object or speech recognition which operate in a closed environment where the factors affecting the modelling target can be clearly identified (the RGB channels of image pixels, the wave frequencies of sound samples).
3. Trading Policy
Define your trading policy, ie a set of rules defining the concrete trading implications of the model outputs. For example, depending on a threshold for the model confidence of a given prediction, what position do you place on the market, what position size, and for how long do you hold a position in the given state of the market? A policy usually comes with some more free parameters which need to be optimized (next step). In the context of supervised learning discussed here, this is a fairly manual process based on backtesting and grid search (some shortcomings are outlined below).
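A policy of this kind can be as small as a few lines. The sketch below is one hypothetical rule set, not a recommendation: the class name `ThresholdPolicy`, the linear scaling of position size with confidence and the default parameter values are all assumptions made for illustration. The two fields are exactly the kind of free parameters that the next step would tune.

```python
from dataclasses import dataclass


@dataclass
class ThresholdPolicy:
    """Maps a predicted probability of a price increase to a position."""
    confidence_threshold: float = 0.6  # minimum confidence needed to trade
    max_contracts: int = 10            # position size at full confidence

    def __post_init__(self) -> None:
        if not 0.5 <= self.confidence_threshold < 1.0:
            raise ValueError("confidence_threshold must be in [0.5, 1.0)")

    def target_position(self, prob_up: float) -> int:
        """Signed number of contracts: >0 long, <0 short, 0 stay flat."""
        confidence = max(prob_up, 1.0 - prob_up)
        if confidence < self.confidence_threshold:
            return 0
        # Scale the size linearly between the threshold and full confidence.
        scale = (confidence - self.confidence_threshold) / (1.0 - self.confidence_threshold)
        size = max(1, round(scale * self.max_contracts))
        return size if prob_up >= 0.5 else -size


policy = ThresholdPolicy(confidence_threshold=0.6, max_contracts=10)
for prob_up in (0.55, 0.8, 0.2, 1.0):
    print(prob_up, policy.target_position(prob_up))
```

Holding periods, stop-loss levels and similar rules would add further parameters in the same way.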
4. Backtesting & Optimization
Now it gets down to the numbers - how well does your trading system, ie the interplay of prediction models and a given trading policy, perform on a hold-out set of historical market data? Here the test set used in step 2 (model training) can become the validation set for tuning the parameters of the policy. Genetic algorithms allow you to explore the policy space, starting from a first generation of say 100 randomly chosen sets of policy parameters, iteratively eliminating the 80 worst performers and making the 20 survivors produce 4 offspring each. Or you can employ a grid search in the multidimensional parameter space: starting from some plausible values for the parameters of the policy, what is the best-performing setting you can achieve by varying the parameter values one-by-one? Your performance metric here is the one you finally aim to optimize in your trading strategy, eg the PnL or some derived quantity like Return on Investment, Sharpe ratio (the return per unit of volatility risk), Value at Risk, the beta etc, see Figure 5.
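The genetic search with 100 candidates, 20 survivors and 4 offspring each can be sketched as follows. Everything specific here is an assumption: the two policy parameters and their bounds, the Gaussian mutation, and above all `toy_backtest`, which is a stand-in objective with a known optimum rather than a real backtest.

```python
import random
from typing import Callable, Sequence, Tuple

Params = Tuple[float, ...]


def evolve(fitness: Callable[[Params], float],
           bounds: Sequence[Tuple[float, float]],
           generations: int = 30,
           population_size: int = 100,
           survivors: int = 20,
           offspring_per_survivor: int = 4,
           mutation_scale: float = 0.05,
           seed: int = 0) -> Params:
    """Keep the best `survivors`, refill the population with mutated copies."""
    rng = random.Random(seed)

    def random_individual() -> Params:
        return tuple(rng.uniform(lo, hi) for lo, hi in bounds)

    def mutate(parent: Params) -> Params:
        child = []
        for value, (lo, hi) in zip(parent, bounds):
            value += rng.gauss(0.0, mutation_scale * (hi - lo))
            child.append(min(hi, max(lo, value)))  # stay inside the bounds
        return tuple(child)

    population = [random_individual() for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[:survivors]  # the 80 worst are eliminated
        children = [mutate(p) for p in elite for _ in range(offspring_per_survivor)]
        population = elite + children   # 20 + 20 * 4 = 100 again
    return max(population, key=fitness)


def toy_backtest(params: Params) -> float:
    """Placeholder for a backtest score; peaks at threshold 0.7, holding period 5."""
    threshold, holding_period = params
    return -((threshold - 0.7) ** 2) - ((holding_period - 5.0) / 10.0) ** 2


# Hypothetical search space: confidence threshold and holding period in days.
best = evolve(toy_backtest, bounds=[(0.5, 1.0), (1.0, 20.0)])
print(best)
```

In a real setting `toy_backtest` would be replaced by a function that runs the policy over the validation data and returns the chosen metric, eg the PnL or the Sharpe ratio.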
A good measure to prevent overfitting the parameters to the validation set is a cross-validation with a 'walk-forward test' (WFT) verifying the robustness of your approach: optimize the policy parameters on a validation segment, test them forward in time on data following the validation segment, shift the validation segment forward to include that test data, repeat. The basic assumption here is that the recent past is a better gauge for the future than the more distant past.
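The walk-forward procedure boils down to generating pairs of index windows. The sketch below assumes a sliding validation window of fixed length that moves forward by one test segment at a time; an expanding window would be an equally valid reading, and the function name and segment sizes are illustrative.

```python
from typing import Iterator, Tuple


def walk_forward_windows(n_samples: int,
                         validation_size: int,
                         test_size: int) -> Iterator[Tuple[range, range]]:
    """Yield (validation_indices, test_indices) pairs moving forward in time."""
    start = 0
    while start + validation_size + test_size <= n_samples:
        validation = range(start, start + validation_size)
        test = range(start + validation_size, start + validation_size + test_size)
        yield validation, test
        # Shift forward so the next validation segment includes this test data.
        start += test_size


windows = list(walk_forward_windows(n_samples=10, validation_size=4, test_size=2))
for validation, test in windows:
    print(list(validation), "->", list(test))
# [0, 1, 2, 3] -> [4, 5]
# [2, 3, 4, 5] -> [6, 7]
# [4, 5, 6, 7] -> [8, 9]
```

For each pair, the policy parameters are tuned on the validation indices and scored on the test indices; the spread of the test scores indicates how robust the approach is.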
5. Simulation & Live Trading
Before your strategy goes live, freeze all system parameters and test in real-time as if actually placing your orders according to the outputs of your trading algorithm. This important step is called paper trading and is the crucial litmus test for the validity of your approach. You might notice here that in your historical data you have actually used values which are not really available at a given time (a look-ahead bias), eg when calculating moving averages. If your strategy still looks promising, congratulations - it's time to go live! While you might start by placing your orders manually, do not underestimate the administrative and technical effort it takes to integrate your strategy with the API of your exchange.
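This kind of leak can also be checked for before paper trading. The sketch below assumes that the decision for period t is taken before the price of period t is known, so a feature for t may only use prices with index below t. The helper `depends_on_future` and both moving-average variants are illustrative names, not part of any library.

```python
from typing import Callable, List, Optional, Sequence

Feature = Callable[[Sequence[float]], List[Optional[float]]]


def lagged_moving_average(prices: Sequence[float], window: int) -> List[Optional[float]]:
    """Mean of the `window` prices strictly before t: safe to use at time t."""
    return [sum(prices[t - window:t]) / window if t >= window else None
            for t in range(len(prices))]


def leaky_moving_average(prices: Sequence[float], window: int) -> List[Optional[float]]:
    """Mean that includes the price at t itself, which is not yet known at t."""
    return [sum(prices[t - window + 1:t + 1]) / window if t >= window - 1 else None
            for t in range(len(prices))]


def depends_on_future(feature: Feature, prices: Sequence[float], t: int) -> bool:
    """True if the feature value at t changes when prices from t onwards change."""
    tampered = list(prices[:t]) + [p + 1000.0 for p in prices[t:]]
    return feature(prices)[t] != feature(tampered)[t]


prices = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9]
print(depends_on_future(lambda p: lagged_moving_average(p, 3), prices, t=4))  # False
print(depends_on_future(lambda p: leaky_moving_average(p, 3), prices, t=4))   # True
```

The same tampering test can be run against any feature function in the pipeline.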
The typical workflow presented here has some severe shortcomings:
For derivative contracts, like futures on an underlying, historical data usually reports the open and close price of a day or a chosen time interval, and a settle price which is roughly the average price of the contracts in all deals realized in the interval. But this is unlikely to be the price at which your buy or sell order gets filled, which depends on the dynamics of order books with different volumes at various bid/ask price levels. So your model predictions from step 2 refer to a theoretical price but likely not to the price you will place your bets on. A more detailed modelling approach would need to take into account the actual structure and dynamics of order books.
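How far the realized price can drift from a quoted one becomes visible when a market order 'walks the book'. The snapshot of ask levels below is invented, and the function ignores fees, latency and the book changing while the order executes; it is only meant to show the mechanics.

```python
from typing import List, Tuple


def average_fill_price(ask_levels: List[Tuple[float, int]], quantity: int) -> float:
    """Volume-weighted price paid by a market buy order of `quantity` contracts.

    ask_levels: (price, available_volume) pairs of a static order book snapshot.
    """
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    remaining = quantity
    cost = 0.0
    # The order consumes the cheapest offers first, then moves up the book.
    for price, volume in sorted(ask_levels):
        take = min(remaining, volume)
        cost += take * price
        remaining -= take
        if remaining == 0:
            return cost / quantity
    raise ValueError("not enough liquidity in the book")


# Hypothetical ask side of the book: (price in EUR, contracts offered).
asks = [(25.10, 5), (25.12, 10), (25.20, 20)]
print(round(average_fill_price(asks, 5), 4))   # 25.1
print(round(average_fill_price(asks, 12), 4))  # 25.1117
```

The larger order pays more per contract than the best ask suggests, and the gap grows with order size and thinner books.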
Developing the policy (step 3) is not part of the Machine Learning based modelling but a manual process guided by intuition, experience or just simple heuristics. For example, you would place a buy order ('go long') when the model predicts a price increase. But how many contracts do you buy? What confidence threshold do you use? How long do you hold your position in the face of adverse market conditions?
Feedback comes late: you need to undergo steps 1-3 before you get a first indication about the performance of your strategy. Parameters of the prediction model and the policy are optimized independently even if model and policy actually interact closely. Exploring the space of policy parameters in this framework is done via inefficient numerical optimization, not with the powerful gradient optimization of your predictive Machine Learning model.
The framework of Reinforcement Learning integrates steps 2 and 3 above, modelling trading as the interaction of an agent (trader) with the environment (market, order books) to optimize a reward (eg return) by its actions (placing orders). While still at an early stage, recent research indicates that this is a route worth exploring; further studies need to be done.
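The agent-environment loop itself is easy to write down, even though the learning part is not. The sketch below only shows the interaction: `ToyMarketEnv` is a hypothetical environment that replays a fixed price series, the reward is the one-step PnL of holding one contract, and the policies are hard-coded functions. No learning algorithm is implemented here.

```python
from typing import Callable, Sequence, Tuple


class ToyMarketEnv:
    """Replays a price series; the observation is simply the current price."""

    def __init__(self, prices: Sequence[float]) -> None:
        if len(prices) < 2:
            raise ValueError("need at least two prices")
        self.prices = list(prices)
        self.t = 0

    def reset(self) -> float:
        self.t = 0
        return self.prices[0]

    def step(self, action: int) -> Tuple[float, float, bool]:
        """action: -1 short, 0 flat, +1 long. Returns (observation, reward, done)."""
        price_change = self.prices[self.t + 1] - self.prices[self.t]
        reward = action * price_change  # PnL of the position over one step
        self.t += 1
        done = self.t == len(self.prices) - 1
        return self.prices[self.t], reward, done


def run_episode(env: ToyMarketEnv, policy: Callable[[float], int]) -> float:
    """Let the policy act until the data runs out; return the total reward."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


env = ToyMarketEnv([10.0, 11.0, 10.0, 12.0])
print(run_episode(env, lambda price: +1))  # always long: 2.0
print(run_episode(env, lambda price: -1))  # always short: -2.0
```

A Reinforcement Learning method would replace the fixed policies with one that is updated from the observed rewards, so that prediction and trading rules are optimized jointly.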
S. Smith, Environmental Economics (Oxford University Press 2011) provides a great introduction to the history and implications of market approaches to environmental policies.
Denny Britz' blog post gives more detail on the mechanics of order books and the prospects of Reinforcement Learning approaches in Algorithmic Trading.
The author: A passionate data scientist, I have worked as the tech lead for startups across the globe and implemented real-life AI solutions for the last four years. Contact me at email@example.com.
The article was originally published here