Applied Statistics for Policy Analysis
Professor: Sturm
Units: 1.0
Elective Course (fulfills a Research, Analysis, and Design Empirical Analysis distribution requirement)
Econometrics, biostatistics, psychometrics, sociometrics, and machine learning all start from the same principles and share many tools. Econometrics emphasizes causal inference with observational data, which is a natural fit for policy analysis, where experimental opportunities are limited and yet the goal is to predict changes in a system. Biostatistics (but not epidemiology) deals with experimental design and analysis, psychometrics with measurement and deriving conceptual ideas, and machine learning with prediction. All fields change over time and are prone to fads, but there is a deep and stable foundation that is shared by all of them. As you are in an interdisciplinary program, it is important that you see the relationship between similar approaches across disciplines.
This course covers tools and concepts used across fields at an intermediate level, between the basics of the first-year sequence and more advanced quantitative courses. At a minimum, you will learn how to avoid some very common and yet embarrassing mistakes.
Students should have taken the econometrics/statistics sequence before taking this course; it is not a first course in statistics. The mathematical level is lower than in a graduate econometrics or statistics course, and we do not state or prove theorems. That does not make the concepts easy. We work through the concepts rather than just teaching you which buttons to press, as a programming course might.
Linear models are fundamental tools, always a good first approach, and sometimes all you need. But real data are less cooperative: outcomes cluster on some values, are top-coded (e.g. income reported in tax returns), or are qualitative in nature, and relationships are often inherently nonlinear. We cover nonlinear statistical models and the basic theory behind them, tools for choosing between different models, and issues with transformations (common approaches, such as taking logs, can cause unexpected problems). The emphasis is on applications: we build models that use several tools simultaneously (e.g. the multi-part models popular in health economics) and combine them into simulation models to predict the effects of policy changes.
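As one illustration of how taking logs can cause unexpected problems, consider retransformation bias. The course description does not name this specific issue, so treat the choice of example, the simulated data, and all variable names below as assumptions. The sketch fits a linear model to log(y), then compares two ways of predicting y on its original scale: naively exponentiating the fitted values, and rescaling them by Duan's smearing factor (the mean of the exponentiated residuals).

```python
import numpy as np

# Simulated data (an assumption for illustration, not course material):
# log(y) = 1 + 0.5 * x + e, with e ~ Normal(0, 1)
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
log_y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=n)
y = np.exp(log_y)

# Ordinary least squares on the log scale
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, log_y, rcond=None)
resid = log_y - X @ beta

# Naive prediction: exponentiate the fitted log values.
# This targets exp(E[log y | x]), not E[y | x], so it is too low on average.
naive = np.exp(X @ beta)

# Duan's smearing estimator: rescale by the mean exponentiated residual
smearing_factor = np.mean(np.exp(resid))
corrected = naive * smearing_factor

print(f"mean of observed y:        {y.mean():.3f}")
print(f"mean of naive prediction:  {naive.mean():.3f}")
print(f"mean of smeared prediction:{corrected.mean():.3f}")
print(f"smearing factor:           {smearing_factor:.3f}")
```

With normal errors of variance 1, the naive prediction understates the mean by a factor of about exp(0.5), which matters when the policy question concerns totals or averages in original units, such as dollars spent. The smearing factor does not rely on normality, but it does assume the errors have the same distribution at every value of x.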