Decreasing Returns of Machine Learning Strategies

Traditional asset pricing literature has yielded numerous anomaly variables for predicting stock returns, but real-world outcomes often disappoint. Many of these predictors work best in small-cap stocks, and their profitability tends to decline over time, particularly in the United States. As market efficiency improves, exploiting these anomalies becomes harder. The fusion of machine learning with finance research offers promise: machine learning can handle extensive data, identify reliable predictors, and model complex relationships. The question is whether this promise translates into more accurate stock return predictions.

In a recent study, Cakici, Fieberg, Metko, and Zaremba employ various machine learning models to investigate return predictability using 153 stock characteristics from the U.S. market between 1972 and 2020. Their goals are twofold. First, they reassess traditional anomaly-based strategies and examine their limitations, including their focus on small firms, short-term effectiveness, and declining profitability over time. Machine learning models are expected to handle the multidimensional nature of equity returns more effectively than traditional methods. Second, the authors evaluate the robustness of machine learning strategies and assess whether they can overcome the shortcomings of anomaly signals.

The findings, summarized in Exhibit 1, reveal that machine learning models perform well, but their performance varies significantly across three key dimensions: forecast horizon, firm size, and time. Monthly models outperform annual ones, and their effectiveness is more pronounced in micro- and small-cap stocks than in larger firms. Furthermore, as market efficiency improves over time, the benefits of machine learning diminish. When the authors consider these dimensions jointly, machine learning strategies excel in specific intersections, such as monthly forecasts, early data, and small-cap stocks. However, their performance becomes less impressive in more realistic settings, particularly for yearly forecasts in large firms, which represent the majority of the U.S. capital markets.

In summary, while machine learning strategies hold promise, their performance is inconsistent across these dimensions, diminishes over time, and remains attractive only among small-cap stocks.
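To make the three-way breakdown more concrete, here is a minimal sketch of how long-short strategy returns could be tabulated by firm size and sub-period. It is not the authors' procedure: the tuple layout, the group labels, the function name `annualized_mean_by_bucket`, and the simple times-12 annualization are assumptions made for illustration, and the numbers are made up. Only the 2001 split point comes from the study's "past two decades" window.

```python
from collections import defaultdict
from statistics import mean

RECENT_START_YEAR = 2001  # start of the 2001 to 2020 window discussed in the study


def annualized_mean_by_bucket(observations):
    """Average monthly long-short returns per (size group, sub-period).

    `observations` is an iterable of (year, size_group, monthly_return)
    tuples; this layout and the group labels are hypothetical.
    Monthly means are annualized by simple multiplication by 12.
    """
    buckets = defaultdict(list)
    for year, size_group, monthly_return in observations:
        period = "recent" if year >= RECENT_START_YEAR else "early"
        buckets[(size_group, period)].append(monthly_return)
    return {key: 12 * mean(values) for key, values in buckets.items()}


# Made-up numbers, used purely to show the shape of the output.
toy = [
    (1985, "small", 0.020), (1985, "small", 0.010),
    (2010, "small", 0.006), (2010, "small", 0.004),
    (1985, "big", 0.005), (1985, "big", 0.003),
    (2010, "big", 0.001), (2010, "big", -0.001),
]
table = annualized_mean_by_bucket(toy)
for (size_group, period), value in sorted(table.items()):
    print(f"{size_group:5s} {period:6s} {value:+.2%}")
```

Running the same tabulation separately for monthly and yearly forecast strategies would add the third dimension, the forecast horizon.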

Authors: Nusret Cakici, Christian Fieberg, Daniel Metko, and Adam Zaremba

Title: Predicting Returns with Machine Learning Across Horizons, Firm Size, and Time



Abstract: Researchers and practitioners hope that machine learning strategies will deliver better performance than traditional methods. But do they? This study documents that stock return predictability with machine learning depends critically on three dimensions: forecast horizon, firm size, and time. It works well for short-term returns, small firms, and early historical data; however, it disappoints in opposite cases. Consequently, annual return forecasts have failed to produce substantial economic gains within most of the U.S. market in the last two decades. These findings challenge the practical utility of predicting returns with machine learning models.

As always, we present several interesting figures and tables:

Notable quotations from the academic research paper:

“The machine learning strategies prove remarkably successful in various intersections of monthly forecasts, early security data, and small- and micro-cap stocks. However, when we consider a more realistic setting, the performance becomes nothing but disappointing. For example, for the yearly forecasts during the past two decades, the annualized abnormal returns in big firms—representing 90% of the U.S. capital markets—varied between -3.66% and -0.36%. While the mean returns frequently remain sufficiently positive to be harvested by financial institutions, the associated six-factor model alphas fail to pass any statistical significance threshold. In other words, the machine learning models no longer add value beyond the popular factor strategies; the combination of large-cap stocks, annual forecasts, and recent data proved detrimental to their alphas. To put it differently, in the vast majority of the U.S. stock market, there is no evidence of any reliable economic gains over the past 20 years.

Our sample comprises all NYSE, AMEX, and NASDAQ stocks. The study period runs from January 1972 to December 2020. All market data comes from CRSP; furthermore, the corresponding accounting data is obtained from Compustat. We discard all companies with a market capitalization below five million U.S. dollars. We calculate all firm characteristics and returns at monthly intervals. Lastly, we winsorize the returns at 0.1% and 99.9% each month in order to eliminate potential miscalculation errors. Our sample contains, on average, 5,289 unique companies per month with a market capitalization of 2.7 billion U.S. dollars. All the data is expressed in U.S. dollars, and the corresponding risk-free rate is represented by a one-month U.S. Treasury bill rate from French (2022).

The machine learning models work, but their performance varies substantially across three dimensions: forecast horizon, firm size, and time. First, the machine learning models produce considerably higher economic gains for short-term return predictions than long-term ones. The annualized alphas on strategies based on monthly predictions are roughly twice as high as yearly ones. Nonetheless, monthly rebalancing may be substantially more costly as well. The differences between the two approaches stem from the contribution of different stock characteristics over both short and long horizons. The monthly forecasts extract information largely from trading frictions. The yearly predictions, in turn, emphasize more fundamental data: valuations, profitability, and investment.

Combining all three dimensions does not build an optimistic picture of machine learning strategies. The interactions between return horizon, firm size, and time further undermined the implementability of machine learning strategies. For example, when yearly return forecasts are considered, no significant abnormal returns have been recorded in the big-firm segment during the period of 2001 to 2020. To put it simply, our machine learning strategies failed to produce any alpha over the past 20 years in the stocks representing 90% of the U.S. market.”
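
The sample filters mentioned in the quotations (dropping firms with a market capitalization below five million U.S. dollars and winsorizing returns at 0.1% and 99.9% each month) might be sketched as follows. This is only an illustration, not the authors' code: the row layout, the field names, the helpers `quantile` and `prepare_sample`, and the linear-interpolation quantile rule are all assumptions.

```python
from collections import defaultdict

MIN_MARKET_CAP_USD = 5_000_000   # size filter quoted from the paper
LOWER_Q, UPPER_Q = 0.001, 0.999  # winsorization bounds quoted from the paper


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def prepare_sample(rows):
    """Drop tiny firms, then winsorize returns within each month.

    `rows` is a list of dicts with the hypothetical keys
    'month', 'ticker', 'market_cap', and 'ret'.
    The input rows are left unchanged; new dicts are returned.
    """
    by_month = defaultdict(list)
    for row in rows:
        if row["market_cap"] >= MIN_MARKET_CAP_USD:
            by_month[row["month"]].append(row)

    cleaned = []
    for month in sorted(by_month):
        month_rows = by_month[month]
        returns = sorted(r["ret"] for r in month_rows)
        low = quantile(returns, LOWER_Q)
        high = quantile(returns, UPPER_Q)
        for r in month_rows:
            # Clip each return into the [low, high] band for its month.
            clipped = min(max(r["ret"], low), high)
            cleaned.append({**r, "ret": clipped})
    return cleaned
```

Calling `prepare_sample(rows)` on a list of stock-month records returns the filtered records with extreme returns pulled in to the monthly cutoffs. Because the bounds are recomputed within each month, an outlier in one month does not affect the cutoffs of another.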
