Out-of-sample Dataset Before the “Sample”: Pervasive Anomalies Before 1926

30.November 2021

Data are the key to systematic investing and trading strategies. Hypothesis testing, risk and return evaluation, correlations, and factor loadings all rely on past data and backtests. As the pace of publication in finance has increased, critiques of quantitative strategies have emerged. Strategies seem to decay in alpha, post-publication returns tend to be lower, and many strategies become insignificant once rigorously tested (in-sample or out-of-sample). Moreover, some might appear profitable purely by chance, a consequence of repeatedly examining the same dataset, such as CRSP stocks after 1963.

Is there any way to overcome these limitations? Partially: designing machine learning strategies around separate training, validation, and testing sets might help. Perhaps the most crucial part of such a scheme is the use of a purely out-of-sample dataset. In this regard, the novel research by Baltussen et al. (2021) provides several valuable findings for the most recognized factors. The authors constructed a database of U.S. stocks, including dividends and market caps, for 1488 major stocks from 1866 to 1926. The sample can be described as the pre-CRSP period: independent, pre-publication, “out-of-sample” data that can serve as a strong test of the factors used today.
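The training, validation, and testing scheme mentioned above can be sketched as a simple chronological split. This is a minimal illustration, not the procedure used by Baltussen et al.; the function name `chronological_split`, the yearly dates, and the cutoff dates are all hypothetical. The only point it shows is that the three sets must not overlap in time, so that the test set stays untouched until the end.

```python
from datetime import date


def chronological_split(dates, train_end, valid_end):
    """Split observation dates into train/validation/test sets by cutoff date.

    Every training date is <= train_end, every validation date lies in
    (train_end, valid_end], and every test date is > valid_end.
    """
    train = [d for d in dates if d <= train_end]
    valid = [d for d in dates if train_end < d <= valid_end]
    test = [d for d in dates if d > valid_end]
    return train, valid, test


# Hypothetical year-end observations, 1990 through 2020 (assumed for illustration).
observations = [date(year, 12, 31) for year in range(1990, 2021)]

train, valid, test = chronological_split(
    observations,
    train_end=date(2009, 12, 31),  # assumed cutoff
    valid_end=date(2014, 12, 31),  # assumed cutoff
)

print(len(train), len(valid), len(test))
```

A pre-sample dataset such as the 1866 to 1926 period plays the same role as the test set here, with the added benefit that it was never available to the researchers who originally documented the factors.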

Factor Exposures of Thematic Indices

31.August 2021

Numerous new businesses are emerging in areas such as autonomous traffic, clean energy, and biotechnology. These new companies look promising, and at the very least the technology behind them seems to be the future. This trend is also supported by the most prominent index providers, S&P and MSCI, both of which have created numerous thematic indexes connected to these hot industries. ETFs have quickly followed, so these thematic indexes can be easily tracked. However, popularity alone does not guarantee a good investment, and these indexes deserve a closer look. The novel research paper by Blitz (2021) provides a vital insight. The findings are interesting: thematic investors bet against quantitative investors or, more precisely, against the most common factors known from asset pricing models.

Embedded Leverage in High Beta Funds and Management Fees

4.June 2020

Risk-seeking investors want higher returns at any cost. If they are constrained and unable to use leverage on their own, they will look for other ways to increase their performance. A recent academic paper by Hitzemann, Sokolinski, and Tai suggests that such risk-seeking investors will search for a high-beta fund that gives them the desired embedded leverage, even when that fund charges higher-than-average fees. The resulting net alpha of those high-beta funds is negative, and this effect can explain a significant part of the underperformance of the overall mutual fund industry. A logical question follows: as hedge funds have even higher fees than mutual funds, what is embedded in them that constrained clients normally can’t access? Higher leverage and access to option-like return distributions? Maybe…

Authors: Hitzemann, Sokolinski, Tai

Title: Paying for Beta: Embedded Leverage and Asset Management Fees

Did Automated Trading Resurrect the CAPM?

28.February 2020

Once upon a time, there was everybody’s favourite finance tool in town, the Capital Asset Pricing Model, which was liked and used by nearly everyone. But a few decades ago, it went out of fashion. Easier access to cheap financial databases allowed many researchers to dig deeper into the data, and they uncovered a tremendous amount of evidence for market anomalies inconsistent with the CAPM. A new research paper by Park and Wang shows that the CAPM may not be completely useless. The rise of automated trading causes individual stocks’ returns to align more closely with the market. Intraday correlation in the equity market is rising, and so is the fraction of firms’ returns that is explained by market returns …

Authors: Park, Wang

Title: Did Trading Bots Resurrect the CAPM?
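
The "fraction of firms’ returns that is explained by market returns" in the teaser above is commonly measured as the R-squared of a regression of a stock's returns on the market's returns. The sketch below shows that calculation in plain Python. It is an illustration only, not the methodology of Park and Wang: the function name `market_model` and the return series are hypothetical, and the risk-free rate is ignored for simplicity.

```python
def market_model(stock, market):
    """Ordinary least squares of stock returns on market returns.

    Returns (alpha, beta, r_squared), where r_squared is the share of the
    stock's return variance explained by the market.
    """
    n = len(stock)
    if n != len(market) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_s = sum(stock) / n
    mean_m = sum(market) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(stock, market))
    var_m = sum((m - mean_m) ** 2 for m in market)
    var_s = sum((s - mean_s) ** 2 for s in stock)
    if var_m == 0 or var_s == 0:
        raise ValueError("both series must have non-zero variance")
    beta = cov / var_m
    alpha = mean_s - beta * mean_m
    r_squared = (cov * cov) / (var_m * var_s)
    return alpha, beta, r_squared


# Hypothetical return series (assumed for illustration only).
market = [0.010, -0.020, 0.015, 0.003, -0.007]
stock_tracking = [0.001 + 2.0 * m for m in market]     # moves in lockstep with the market
stock_noisy = [0.012, -0.015, 0.030, -0.010, 0.002]    # has its own idiosyncratic moves

alpha_t, beta_t, r2_t = market_model(stock_tracking, market)
alpha_n, beta_n, r2_n = market_model(stock_noisy, market)
print(round(beta_t, 4), round(r2_t, 4), round(r2_n, 4))
```

A stock whose returns line up perfectly with the market has an R-squared of 1, while idiosyncratic moves pull it below 1. A rising average R-squared across firms is one way to read the teaser's claim that returns are aligning more closely with the market.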

Two Versions of CAPM

19.July 2019

This week's analysis of a selected financial research paper contains more text and no pictures, but we still think it's worth reading …

Authors: Siddiqi

Title: CAPM: A Tale of Two Versions



Given that categorization is the core of cognition, we argue that investors do not view firms in isolation. Rather, they view them within a framework of categories that represent prior knowledge. This involves sorting a given firm into a category and using categorization-induced inferences to form earnings and discount-rate expectations. If earnings-aspect is categorization-relevant, then earnings estimates are refined, whereas discount-rates are confounded with the category-exemplar. The opposite happens when discount-rates are categorization relevant. Earnings-focused approach such as DCF, generally used by institutional investors, leads to a version of CAPM in which the relationship between average excess return and stock beta is flat (possibly negative). Value effect and size premium (controlling for quality) arise in this version. Discount-rate focused approach such as multiples or comparables valuation, typically used by individual investors, leads to a second version in which the relationship is strongly positive with growth stocks doing better. The two-version CAPM accounts for several recent empirical findings including fundamentally different intraday vs overnight behavior, as well as behavior on macroeconomic announcement days. Momentum is expected to be an overnight phenomenon, which is consistent with empirical findings. We argue that, perhaps, our best shot at observing classical CAPM in its full glory is a laboratory experiment with subjects who have difficulty categorizing (such as in autism spectrum disorders).

Notable quotations from the academic research paper:

"Consider the following two empirical observations:

Firstly, stock prices behave very differently with respect to their sensitivity to market risk (beta) at specific times. Typically, average excess return and beta relationship is flatter than expected. It could even be negative. However, during specific times, this relationship is strongly positive, such as on days when macroeconomic announcements are made or during the night.

Secondly, a hue, which is halfway between yellow and orange, is seen as yellow on a banana and orange on a carrot. In this article, we argue that the two observations are driven by the same underlying mechanism.

The second observation is an example of the implications of categorization for color calibration. In this article, we argue that the first observation is also due to categorization, which gives rise to two versions of CAPM. In one version, the relationship between expected return and stock beta is flatter than expected or could even be negative, whereas in the second version, this relationship is strongly positive.

Categorization is the mental operation by which brain classifies objects and events. We do not experience the world as a series of unique events. Rather, we make sense of our experiences within a framework of categories that represent prior knowledge. That is, new information is only understood in the context of prior knowledge.

Here, in accord with cognitive science literature, we present a view of categorization that has both an upside as well as a downside, and apply this nuanced perspective to the capital asset pricing model (CAPM). If categorization is fundamental to how our brains make sense of information, then investor behavior, like any other domain of human behaviour, should also be viewed through this lens. This means that the traditional view that each firm is viewed in isolation needs to be altered. When an investor considers a firm, she views it within a framework of categories that represent prior knowledge. This involves sorting a given firm into a category based on attributes that are deemed categorization-relevant. Categorization-induced inferences help refine such attributes while confounding categorization-irrelevant attributes with the category-exemplar.

Valuation requires estimating earnings (cash-flows) potential and estimating discount-rates. Even among firms that sell similar products (same sector) some may have more similar earnings potential, whereas other may have more similar discount-rates. The former type may include firms with similar earnings-related fundamentals but very different levels of debt ratio and equity betas. Also, their multiples (generally related to inverse of the discount-rate) such as P/E, EV/Sales or EV/EBITDA could be very different. The latter type may include firms with similar debt ratios and equity betas or similar P/E and EV/EBITDA but quite different earnings or cash-flows fundamentals.

We argue that, an earnings-focused approach, such as discounted cash-flows (DCF), tends to categorize the former type of firms together, whereas, the relative valuation approach (RV) based on multiples such as P/E or EV/EBITDA tends to categorize the latter types of firms together. In other words, the choice of a valuation approach introduces a bias in how firms are categorized.

In this paper, we take discounted cash-flows (DCF) as the prototype of an earnings-potential focused approach, and valuation by multiples or relative valuation (RV) as the prototype discount-rate focused approach.

We show that when earnings aspect is categorization-relevant (as in DCF analysis), a version of CAPM is obtained, which displays a flatter or even negative relationship between stock beta and expected excess returns. Betting-against-beta anomaly is observed along with the value effect, as well as the size premium after controlling for quality (consistent with the findings in Asness et al 2018). We argue that this is the default version which typically prevails. While categorizing firms, if investors are focused on the discount rate aspect (as in RV analysis), then the discount-rates are refined whereas earnings estimates are confounded with the category-exemplar. A second version of CAPM arises. In this version, there is a strong positive relationship between beta and expected excess return.

One way to make sense of the co-existence of two versions is to classify investors as either earnings-focused or discount rate-focused. If earnings-focused investors dominate, then the first version is observed. If the discount-rate-focused investors dominate, then the second version is observed. Note, that earnings-focused approach (such as DCF) is typically employed by large institutional investors, whereas RV approach is associated with individual investors (and with sell-side equity analysts who publish research reports for individual investors).

If institutional investors are earnings-focused and individual investors are discount rate-focused, then the trading behavior of each type can be observed to make specific predictions:

1) Institutional investors typically avoid trading at the open and prefer to trade in the afternoon near the market close. The objective is to time the trade when the market is most liquid to avoid any adverse price impact. This means that trade at open is dominated by individual investors. So, one expects to see the relationship between stock beta and average return to be strongly positive (second version) overnight and flat or even negative (first version) intraday.

2) Institutional traders typically trade in the right direction prior to macroeconomic announcement days (suggesting superior information) with institutional trading volume falling sharply on macro-announcement days. As trade on such days is dominated by individual investors, one expects to see a strongly positive relationship (second version) on macro-announcement days.

3) The first version generally dominates intraday due to institutional investors being dominant. As the corresponding CAPM version comes with size and value effects, the prediction is that size and value are primarily intraday phenomena.

4) We show that, all else equal, discount rate-focused investors have higher willingness-to-pay than earnings-focused investors. If discount rate-focused investors dominate trade at open, whereas earnings-focused investors are active intraday, then one expects prices to typically rise overnight from close-to-open and fall intraday between open-to-close.

5) If momentum traders, who buy past winners and short past losers, are primarily individual investors, then one expects momentum to be an overnight phenomenon observed between close-to-open. This is because individual traders dominate trade at or near open."
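
The predictions above all rest on splitting a daily return into an overnight (close-to-open) part and an intraday (open-to-close) part. A minimal sketch of that decomposition follows. The function name `split_daily_return` and the prices are hypothetical, and dividends and stock splits are ignored, which a real study would have to handle.

```python
def split_daily_return(prev_close, open_price, close):
    """Decompose a close-to-close return into overnight and intraday parts.

    overnight: previous close -> today's open
    intraday:  today's open   -> today's close
    The two parts compound to the full close-to-close return.
    """
    overnight = open_price / prev_close - 1.0
    intraday = close / open_price - 1.0
    return overnight, intraday


# Hypothetical prices: the stock gaps up at the open, then drifts down during the day.
overnight, intraday = split_daily_return(prev_close=100.0, open_price=101.0, close=100.5)
total = (1.0 + overnight) * (1.0 + intraday) - 1.0

print(overnight, intraday, total)
```

In this made-up example the price rises overnight and falls intraday, which is the pattern prediction 4) expects to see on average.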


Two Centuries of Global Factor Premiums

7.March 2019

Related to all major factor strategies (trend, momentum, value, carry, seasonality and low beta/volatility):

Authors: Baltussen, Swinkels, van Vliet

Title: Global Factor Premiums



We examine 24 global factor premiums across the main asset classes via replication and new-sample evidence spanning more than 200 years of data. Replication yields ambiguous evidence within a unified testing framework with methods that account for p-hacking. The new-sample evidence reveals that the large majority of global factors are strongly present under conservative p-hacking perspectives, with limited out-of-sample decay of the premiums. Further, utilizing our deep sample, we find global factor premiums to be not driven by market, downside, or macroeconomic risks. These results reveal strong global factor premiums that present a challenge to asset pricing theories.

Notable quotations from the academic research paper:

"In this paper we study global factors premiums over a long and wide sample spanning the recent 217 years across equity index (but not single securities), bond, currency, and commodity markets.

The first objective of this study is to robustly and rigorously examine these global factor premiums from the perspective of ‘p-hacking’.

We take as our starting point the main global return factors published in the Journal of Finance and the Journal of Financial Economics during the period 2012-2018: time-series momentum (henceforth ‘trend’), cross-sectional momentum (henceforth ‘momentum’), value, carry, return seasonality and betting-against-beta (henceforth ‘BAB’). We examine these global factors in four major asset classes: equity indices, government bonds, commodities and currencies, hence resulting in a total of 24 global return factors.

We work from the idea that these published factor premiums could be influenced by p-hacking and that an extended sample period is useful for falsification or verification tests. Figure 1, Panel A summarizes the main results of these studies.

Global factor strategies

Shown are the reported Sharpe ratios in previous publications, as well as the 5% significance cutoff in the grey-colored dashed line. In general, the studies show evidence on the global factor premiums, with 14 of the 22 factors (return seasonality is not tested in bonds and currencies) displaying significant Sharpe ratios at the conventional 5% significance level.

Global factor strategies 1981-2011

Further, most of the studies have differences in, amongst others, testing methodologies, investment universes and sample periods, choices that introduce degrees of freedom to the researcher. To mitigate the impact of such degrees of freedom, we reexamine the global return factors using uniform choices on testing methodology and investment universe over their average sample period (1981-2011). Figure 1, Panel B shows the results of this replicating exercise. We find that Sharpe ratios are marginally lower, with 12 of the 24 factor premiums being significant at the conventional 5% level.

Global factor strategies 1981-2011

The second objective of this study is to provide rigorous new sample evidence on the global return factors. To this end, we construct a deep, largely uncovered historical global database on the global return factors in the four major asset classes. This data consists of pre-sample data spanning the period 1800-1980, supplemented with post-sample data from 2012-2016, such that we have an extensive new sample to conduct further analyses. If the global return factors were unintentionally the result of p-hacking, we would expect them to disappear for this new sample period.

Our new sample findings reveal consistent and ubiquitous evidence for the large majority of global return factors. Figure 1, Panel C summarizes our main findings by depicting the historical Sharpe ratios in the new sample period. In terms of economic significance, the Sharpe ratios are substantial, with an average of 0.41. Remarkably, in contrast to most out-of-sample studies we see very limited ‘out-of-sample’ decay of factor premiums.

In terms of statistical significance and p-hacking perspectives, 19 of the 24 t-values are above 3.0, 19 Bayesian p-values are below 5%, and the break-even prior odds generally need to be above 9,999 to have less than 5% probability that the null hypothesis is true."
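
The link between the Sharpe ratios and the t-values quoted above can be illustrated with a textbook approximation: for roughly independent returns, the t-statistic of the mean return equals the annualized Sharpe ratio times the square root of the sample length in years. The sketch below is not the testing framework of the paper (which also uses Bayesian p-values); the function names and the monthly return series are hypothetical, and only the 0.41 average Sharpe ratio and the 3.0 threshold come from the quotation.

```python
import math
import statistics


def sharpe_and_t(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio and t-statistic of the mean excess return.

    Assumes returns are independent and identically distributed, which is a
    simplification; real return series often violate it.
    """
    n = len(excess_returns)
    mean = statistics.fmean(excess_returns)
    stdev = statistics.stdev(excess_returns)  # sample standard deviation
    t_stat = mean / (stdev / math.sqrt(n))
    sharpe = mean / stdev * math.sqrt(periods_per_year)
    return sharpe, t_stat


def years_needed(sharpe_annual, t_threshold):
    """Years of data needed for a given annualized Sharpe ratio to reach a
    given t-statistic, from t = Sharpe * sqrt(years)."""
    return (t_threshold / sharpe_annual) ** 2


# Hypothetical 24 months of excess returns (assumed for illustration only).
monthly = [0.012, -0.004, 0.007, 0.001, -0.009, 0.015] * 4
sharpe, t_stat = sharpe_and_t(monthly)

# With the 0.41 average Sharpe ratio and the 3.0 threshold quoted above.
required = years_needed(0.41, 3.0)
print(round(sharpe, 3), round(t_stat, 3), round(required, 1))
```

Under this approximation, a Sharpe ratio of 0.41 needs roughly 54 years of data to clear a t-value of 3.0, which gives some intuition for why a sample spanning about two centuries matters for tests that are conservative about p-hacking.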

