Less is More? Reducing Biases and Overfitting in Machine Learning Return Predictions
Machine learning models have been used successfully to predict the cross-section of stock returns from lagged stock characteristics. The analyzed paper challenges the conventional wisdom that more training data leads to better machine learning models for stock return prediction. Instead, it demonstrates that training a separate model for each market capitalization group can yield superior results for both stock-level return predictions and long-short portfolios. The paper also shows the impact of model regularization and highlights the importance of careful model design choices.
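To make the contrast between a pooled model and group-specific models concrete, the following is a minimal sketch on synthetic data, not the paper's actual setup. Everything in it is an assumption made for illustration: the two size groups (`"small"`, `"large"`), the two stand-in characteristics, the closed-form ridge regression as the regularized learner, and all function names. The synthetic returns are constructed so that the characteristic-return relationship differs by group, so the comparison only illustrates the mechanism and is not evidence for the paper's empirical finding.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression without intercept: (X'X + alpha*I) b = X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def fit_group_models(X, y, groups, alpha=1.0):
    """Fit one ridge model per size group; returns {group_label: coefficients}."""
    return {g: fit_ridge(X[groups == g], y[groups == g], alpha)
            for g in np.unique(groups)}


def predict_by_group(models, X, groups):
    """Predict each observation with the model of its own size group.

    Assumes every label in `groups` has a fitted model in `models`.
    """
    pred = np.empty(len(X))
    for g, beta in models.items():
        mask = groups == g
        pred[mask] = X[mask] @ beta
    return pred


def make_synthetic_panel(rng, n_per_group=2000):
    """Hypothetical data: two lagged characteristics whose effect on the
    next-period return has opposite signs in small and large caps."""
    true_betas = {"small": np.array([-0.5, 0.3]),
                  "large": np.array([0.5, -0.2])}
    X_parts, y_parts, g_parts = [], [], []
    for label, beta in true_betas.items():
        X = rng.standard_normal((n_per_group, 2))
        y = X @ beta + 0.1 * rng.standard_normal(n_per_group)
        X_parts.append(X)
        y_parts.append(y)
        g_parts.append(np.full(n_per_group, label))
    return np.vstack(X_parts), np.concatenate(y_parts), np.concatenate(g_parts)


rng = np.random.default_rng(0)
X_train, y_train, g_train = make_synthetic_panel(rng)
X_test, y_test, g_test = make_synthetic_panel(rng)

# Pooled model: one set of coefficients for all stocks.
pooled_beta = fit_ridge(X_train, y_train)
mse_pooled = np.mean((y_test - X_test @ pooled_beta) ** 2)

# Group-specific models: one set of coefficients per size group.
group_models = fit_group_models(X_train, y_train, g_train)
mse_group = np.mean((y_test - predict_by_group(group_models, X_test, g_test)) ** 2)

print(f"out-of-sample MSE, pooled model:          {mse_pooled:.4f}")
print(f"out-of-sample MSE, group-specific models: {mse_group:.4f}")
```

The pooled fit uses twice as many observations as either group model, yet it has to average over two opposing relationships, which is the sense in which less training data can be more. The penalty `alpha` is the regularization knob in this sketch; in practice it would be tuned on a validation sample rather than fixed.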