Causal Inference using Python and Machine Learning

Senshu University, School of Economics: Special Lecture in English (spring and fall semesters)

Causal Inference using Python and Machine Learning 1&2

This course helps students understand applied causal inference and develop the skills needed to plan and execute their own empirical projects in economics. A key challenge of empirical research in a data-rich environment is formulating insightful questions and conducting robust empirical analysis. To address this, we will examine numerous examples and engage in substantial data analysis ourselves. Topics of the first-semester course include randomized controlled trials, regression, and matching techniques. Topics of the second-semester course include instrumental variables, difference-in-differences, and regression discontinuity designs. Throughout the course, we will also explore key contributions from the emerging econometric literature that integrates machine learning with causal inference, known as causal machine learning.

By the end of the course, you will have gained a practical familiarity with the tools of causal machine learning, proficiency in data handling using Python, and a solid understanding of the models and methods of applied causal inference.

Topics:

  1. Identification and potential outcome framework
  2. Review of statistics
  3. Python
  4. Markdown
  5. Analysis and interpretation of randomized trials
  6. Regression basics – conditional expectation function
  7. Regression basics – casual regression vs. causal regression
  8. Regression basics – identification
  9. Regression basics – estimation
  10. Using multivariate regression – omitted variable bias
  11. Using multivariate regression – selection on observables
  12. Heteroskedasticity and clustered standard errors
  13. Directed acyclic graphs (DAG) and variable selection
  14. Matchmaker  
  15. Inverse probability weighting and doubly robust estimator
  16. Instrumental variables – selection on unobservables
  17. Instrumental variables – heterogeneous effects
  18. Instrumental variables – local average treatment effect
  19. Panel data model – fixed effects and time effects
  20. Panel data model – interactive effects
  21. Difference-in-differences method – DID with multiple time periods
  22. Synthetic control method
  23. Regression discontinuity design – Sharp RD and Fuzzy RD
  24. Least Absolute Shrinkage and Selection Operator (LASSO)
  25. Mostly dangerous big data – Double selection method
  26. Trees and causal trees
  27. Causal forests
  28. Targeting
  29. Final Exam (presentations, day 1)
  30. Final Exam (presentations, day 2)

Textbooks:

References:

Videos:

Grading:

  • Spring semester (前期): 3 assignments (75%), attendance and class participation (25%)
  • Fall semester (後期): 2 assignments (50%) and a term project presentation (50%)