1st Edition

Causal Inference and Machine Learning in Economics, Social, and Health Sciences

By Mutlu Yuksel, Yigit Aydede
Copyright 2026
838 Pages, 34 Color & 17 B/W Illustrations
Published by Chapman & Hall

Causal Inference and Machine Learning in Economics, Social, and Health Sciences bridges the gap between modern machine learning methods and the applied needs of economists, public health researchers, and social scientists. Designed with students and practitioners in mind, the book introduces machine learning through the lens of causal inference, offering a rigorous yet accessible roadmap for using data to answer real-world policy questions.

The book combines econometric and machine learning methods such as penalized regressions, random forests, boosting, and double machine learning with the most up-to-date estimation methods for addressing selection on observables (e.g., matching, AIPW) and selection on unobservables (e.g., instrumental variables, difference-in-differences, synthetic control). Readers learn how to estimate treatment effects, uncover heterogeneity, and work with high-dimensional data, while gaining clarity on assumptions, trade-offs, and limitations. The book also covers advanced and often underrepresented topics such as time series forecasting with machine learning methods, neural networks and deep learning, and core optimization algorithms like gradient descent. Each method is introduced with intuition, formal treatment, and applied examples from economics, health, labor, and development studies. Throughout, the book places special emphasis on transparency, identification, and interpretability.

Beyond introducing models, it provides step-by-step guidance from raw data to estimation, showing not just what works, but how and why, both methodologically and computationally. Unlike many texts that rely on pre-built software or assume deep technical knowledge, this book builds from foundational concepts such as estimation, error decomposition, and bias-variance trade-offs, then progresses to advanced machine learning approaches. Simulation-based pedagogy helps readers visualize model behavior under known conditions, enabling researchers and students alike to see how statistical tools perform across diverse empirical settings.

A distinctive feature of the book is its focus on when and how to use predictive versus causal models. Rather than treating them as separate tasks, it shows how each can inform the other. Practical insights, diagnostics, and examples guide readers in selecting appropriate tools based on research goals and data characteristics.

With its clear style, practical code in R, and integrated approach to prediction and causality, this book is an essential resource for applied researchers, students, and anyone using data to inform policy and decision-making.

KEY FEATURES

  • Integrates causal inference with the latest econometric and machine learning methods to address real-world policy questions in economics, health, and the social sciences.
  • Offers clear, detailed explanations and intuitive guidance—even for foundational concepts often overlooked in other sources—to build theoretical understanding and link econometric principles to application.
  • Designed for applied researchers, students, and practitioners with limited technical background, with step-by-step instruction that starts from raw data and basic code and explains how both the methods and the underlying code work.
  • Provides practical guidance on when and how to use predictive vs. causal models, highlighting their trade-offs and pitfalls to avoid, supported by real-world examples and simulation-based demonstrations.

1. Introduction

2. From Data to Causality

3. Learning Systems

4. Error

5. Bias-Variance Trade-off

6. Overfitting

7. Parametric Estimation - Basics

8. Nonparametric Estimations - Basics

9. Hyperparameter Tuning

10. Classification

11. Model Selection and Sparsity

12. Penalized Regression Methods

13. Classification and Regression Trees (CART)

14. Ensemble Learning and Random Forest

15. Boosting

16. Counterfactual Framework

17. Randomized Controlled Trials

18. Selection on Observables

19. Double Machine Learning

20. Matching Methods

21. Inverse Weighting and Doubly Robust Estimation

22. Selection on Unobservables and DML-IV

23. Heterogeneous Treatment Effects

24. Causal Trees and Forests

25. Meta Learners for Treatment Effects

26. Difference in Differences and DML-DiD

27. Synthetic DiD and Regression Discontinuity

28. Time Series Forecasting

29. Direct Forecasting with Random Forests

30. Neural Networks & Deep Learning

31. Matrix Decomposition and Applications

32. Optimization Algorithms - Basics

Biography

Mutlu Yuksel is a Professor of Economics at Dalhousie University, Canada, and an applied microeconomist whose research spans labor, health, and development. His recent work applies machine learning and high-dimensional data to complex policy questions. He has received teaching awards and co-founded the ML Portal to support research and training in social and health policy.

Yigit Aydede is the Sobey Professor of Economics at Saint Mary’s University, Canada, and an applied economist working at the intersection of econometrics, machine learning, and artificial intelligence (AI). He teaches data analytics and serves as Faculty in Residence at the Sobey School of Business and as an Affiliate Scientist at Nova Scotia Health. Aydede is also the co-founder of Novastorms.ai and the ML Portal, both focused on data-driven public policy and health research.