Traditional asset pricing literature has yielded numerous anomaly variables for predicting stock returns, but real-world outcomes often disappoint. Many of these predictors work best in small-cap stocks, and their profitability tends to decline over time, particularly in the United States. As market efficiency improves, exploiting these anomalies becomes harder. The fusion of machine learning with finance research offers promise. Machine learning can handle extensive data, identify reliable predictors, and model complex relationships. The question is whether this promise translates into more accurate stock return predictions.
In a recent study, Cakici, Fieberg, Metko, and Zaremba employ various machine learning models to investigate return predictability using 153 stock characteristics from the U.S. market from 1972 to 2020. Their goals are twofold. First, they reassess traditional anomaly-based strategies and examine their limitations, including their focus on small firms, short-term effectiveness, and declining profitability over time. Machine learning models are expected to handle the multidimensional nature of equity returns more effectively than traditional methods. Second, the authors evaluate the robustness of machine learning strategies and assess whether they can overcome the shortcomings of anomaly signals.
Findings, summarized in Exhibit 1, reveal that machine learning models perform well but vary significantly across three key dimensions: forecast horizon, firm size, and time. Monthly models outperform annual ones, and their effectiveness is more pronounced in micro- and small-cap stocks than in larger firms. Furthermore, as market efficiency improves over time, the benefits of machine learning diminish. When the authors consider these dimensions collectively, machine learning strategies excel in specific intersections, such as monthly forecasts, early data, and small-cap stocks. However, their performance becomes less impressive in more realistic settings, particularly for yearly forecasts in large firms, which represent the majority of the U.S. capital markets.
In summary, while machine learning strategies hold promise, their performance is inconsistent across forecast horizon, firm size, and time: it diminishes over time and remains attractive only among small-cap stocks.
Authors: Nusret Cakici, Christian Fieberg, Daniel Metko, and Adam Zaremba
Title: Predicting Returns with Machine Learning Across Horizons, Firm Size, and Time
Researchers and practitioners hope that machine learning strategies will deliver better performance than traditional methods. But do they? This study documents that stock return predictability with machine learning depends critically on three dimensions: forecast horizon, firm size, and time. It works well for short-term returns, small firms, and early historical data; however, it disappoints in opposite cases. Consequently, annual return forecasts have failed to produce substantial economic gains within most of the U.S. market in the last two decades. These findings challenge the practical utility of predicting returns with machine learning models.
As always, we present several interesting figures and tables:
Notable quotations from the academic research paper:
“The machine learning strategies prove remarkably successful in various intersections of monthly forecasts, early security data, and small- and micro-cap stocks. However, when we consider a more realistic setting, the performance becomes nothing but disappointing. For example, for the yearly forecasts during the past two decades, the annualized abnormal returns in big firms— representing 90% of the U.S. capital markets—varied between -3.66% and -0.36%. While the mean returns frequently remain sufficiently positive to be harvested by financial institutions, the associated six-factor model alphas fail to pass any statistical significance threshold. In other words, the machine learning models no longer add value beyond the popular factor strategies; the combination of large-cap stocks, annual forecasts, and recent data proved detrimental to their alphas. To put it differently, in the vast majority of the U.S. stock market, there is no evidence of any reliable economic gains over the past 20 years.
Our sample comprises all NYSE, AMEX, and NASDAQ stocks. The study period runs from January 1972 to December 2020. All market data comes from CRSP; furthermore, the corresponding accounting data is obtained from Compustat. We discard all companies with a market capitalization below five million U.S. dollars. We calculate all firm characteristics and returns at monthly intervals. Lastly, we winsorize the returns at 0.1% and 99.9% each month in order to eliminate potential miscalculation errors. Our sample contains, on average, 5,289 unique companies per month with a market capitalization of 2.7 billion U.S. dollars. All the data is expressed in U.S. dollars, and the corresponding risk-free rate is represented by a one-month U.S. Treasury bill rate from French (2022).
The machine learning models work, but their performance varies substantially across three dimensions: forecast horizon, firm size, and time. First, the machine learning models produce considerably higher economic gains for short-term return predictions than long-term ones. The annualized alphas on strategies based on monthly predictions are roughly twice as high as yearly ones. Nonetheless, monthly rebalancing may be substantially more costly as well. The differences between the two approaches stem from the contribution of different stock characteristics over both short and long horizons. The monthly forecasts extract information largely from trading frictions. The yearly predictions, in turn, emphasize more fundamental data: valuations, profitability, and investment.
Combining all three dimensions does not build an optimistic picture of machine learning strategies. The interactions between return horizon, firm size, and time further undermined the implementability of machine learning strategies. For example, when yearly return forecasts are considered, no significant abnormal returns have been recorded in the big-firm segment during the period of 2001 to 2020. To put it simply, our machine learning strategies failed to produce any alpha over the past 20 years in the stocks representing 90% of the U.S. market.”
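The sample filters described in the quoted passage (dropping firms below five million U.S. dollars in market capitalization and winsorizing returns at 0.1% and 99.9% each month) can be illustrated with a minimal Python sketch. This is not the authors' code. The row layout, the function names `percentile` and `build_sample`, the linear-interpolation percentile rule, and the choice to apply the size filter before winsorizing are all assumptions made for illustration; the paper does not specify them here.

```python
from collections import defaultdict


def percentile(sorted_vals, q):
    """Percentile of an ascending list via linear interpolation (q in [0, 1]).

    The interpolation rule is an assumption; the source does not state one.
    """
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def build_sample(rows, min_cap=5.0, lower=0.001, upper=0.999):
    """Filter by size, then winsorize returns within each month.

    rows: list of dicts with hypothetical keys 'month', 'ret', and 'mktcap'
    (market capitalization in millions of U.S. dollars).
    Returns new dicts; the input rows are left unchanged.
    """
    # Discard firms with market capitalization below the threshold.
    kept = [dict(r) for r in rows if r["mktcap"] >= min_cap]

    # Group the surviving observations by month.
    by_month = defaultdict(list)
    for r in kept:
        by_month[r["month"]].append(r)

    # Clip each month's returns to that month's lower/upper percentiles.
    for month_rows in by_month.values():
        rets = sorted(r["ret"] for r in month_rows)
        lo_cut = percentile(rets, lower)
        hi_cut = percentile(rets, upper)
        for r in month_rows:
            r["ret"] = min(max(r["ret"], lo_cut), hi_cut)
    return kept
```

Winsorizing within each month, rather than over the pooled sample, keeps the cutoffs from being dominated by a few unusually volatile months, which is consistent with the "each month" wording in the quoted passage.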