Senshu University, School of Economics: Special Lecture in English, Spring and Fall Semesters
Causal Inference using Python and Machine Learning 1&2
This course helps students understand applied causal inference and develop the skills needed to plan and execute their own empirical projects in economics. A key challenge of empirical research in a data-rich environment is formulating insightful questions and conducting robust empirical analysis. To address this, we will examine numerous examples and engage in substantial data analysis ourselves. Topics of the first-semester course include randomized controlled trials, regression and matching techniques. Topics of the second-semester course include instrumental variables, difference-in-differences, and regression discontinuity designs. Throughout the course, we will also explore key contributions from the emerging econometric literature that integrates machine learning with causal inference—referred to as causal machine learning.
By the end of the course, you will have gained a practical familiarity with the tools of causal machine learning, proficiency in data handling using Python, and a solid understanding of the models and methods of applied causal inference.
Topics:
- Identification and the potential outcomes framework
- Review of statistics
- Python
- Markdown
- Analysis and interpretation of randomized trials
- Regression basics – conditional expectation function
- Regression basics – casual reg vs causal reg
- Regression basics – identification
- Regression basics – estimation
- Using multivariate regression – omitted variable bias
- Using multivariate regression – selection on observables
- Heteroskedasticity and clustered standard errors
- Directed acyclic graphs (DAG) and variable selection
- Matchmaker
- Inverse probability weighting and doubly robust estimator
- Instrumental variables – selection on unobservables
- Instrumental variables – heterogeneous effects
- Instrumental variables – local average treatment effect
- Panel data model – fixed effects and time effects
- Panel data model – interactive effects
- Difference-in-differences (DID) method, DID with multiple time periods
- Synthetic control method
- Regression discontinuity design – Sharp RD and Fuzzy RD
- Least Absolute Shrinkage and Selection Operator (LASSO)
- Mostly dangerous big data – Double selection method
- Trees and causal trees
- Causal forests
- Targeting
- Final Exam (presentations, day 1)
- Final Exam (presentations, day 2)
Textbooks:
- Mastering 'Metrics: The Path from Cause to Effect, Angrist and Pischke
References:
- Causal Inference for the Brave and True, Matheus Facure
- Applied Causal Inference Powered by ML and AI, Chernozhukov et al. (2024)
Videos:
- Ceteris Paribus: Public vs. Private University
- Randomized Trials: The Ideal Weapon
- How to Read Economics Research Papers: Randomized Controlled Trials (RCTs)
- Selection Bias, Regression, Matching: Will You Make More Going to a Private University?
- Introduction to Instrumental Variables
- Difference in differences (DD)
- Regression discontinuity designs (RD)
- Mastering Mostly Harmless Econometrics (Alberto Abadie, Joshua Angrist, and Christopher Walters) – 2020 AEA Continuing Education Webcasts
Grading:
- Spring semester: 3 assignments (75%), attendance and class participation (25%)
- Fall semester: 2 assignments (50%) and a term project presentation (50%)
