The Center for Causal Inference supports RAND researchers—and their clients—by applying methodological and statistical rigor to sometimes confounding questions. Here are some of our primary methods for inferring causality, with some examples and links.
Randomized Controlled Trials
Randomized controlled trials (RCTs) aim to reduce certain sources of bias when testing the effectiveness of a new treatment or policy action by randomly allocating subjects to two or more groups, treating them differently, and then comparing them with respect to a measured response.
RCTs may be the gold-standard methodology for determining causal effects, but they are rarely available when examining policy problems. And even when an RCT is available, it might not always demonstrate cause and effect or explain why something worked (or didn’t).
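As a rough illustration of the random-allocation-then-compare logic described above, the following Python sketch simulates a two-arm trial and estimates the effect as a difference in mean responses. The sample size, the Gaussian baseline response, the true effect of 2.0, and the function names are all hypothetical choices made for this example, not taken from any RAND study.

```python
import random
import statistics


def simulate_rct(n=2000, true_effect=2.0, seed=0):
    """Randomly allocate half of n hypothetical subjects to treatment.

    Returns (treated_outcomes, control_outcomes) from a simulated response.
    """
    rng = random.Random(seed)
    ids = list(range(n))
    rng.shuffle(ids)                      # random allocation to groups
    treated_ids = set(ids[: n // 2])

    treated, control = [], []
    for i in range(n):
        baseline = rng.gauss(10.0, 1.0)   # response without treatment
        if i in treated_ids:
            treated.append(baseline + true_effect)
        else:
            control.append(baseline)
    return treated, control


def difference_in_means(treated, control):
    """Estimate the average treatment effect as a difference in means."""
    return statistics.mean(treated) - statistics.mean(control)


treated, control = simulate_rct()
print(round(difference_in_means(treated, control), 2))
```

Because assignment is random, the two groups differ only by chance before treatment, so the estimate should land close to the simulated effect.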
-
The Study of Restorative Practices was a 5-year, cluster-randomized controlled trial (RCT) of the Restorative Practices Intervention in 14 middle schools in Maine to assess whether the intervention affects both positive developmental outcomes and problem behaviors; it was the first RCT of its kind.
-
Researchers created a video game to recalibrate how trauma triage physicians determine whether a patient's injuries appear typical. They then conducted a randomized controlled trial to compare the effect of this game with that of another educational program on physicians' triage decisions.
More research using Randomized Controlled Trials
Propensity Scores
Propensity scores help researchers balance study groups and draw causal conclusions from observational studies. A propensity score is the probability that, based on certain characteristics, a study participant would be assigned to a specific treatment group. We help our RAND colleagues appropriately weight, match, and apply propensity scores to their observational studies.
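One common way to apply propensity scores is inverse probability weighting. The sketch below is a toy example under strong assumptions: a single binary characteristic `x`, a propensity score estimated as the treated share within each value of `x`, and made-up data in which the true effect is 2 in both strata. All names and numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (x, treated, outcome). Units with x == 1 have higher
# outcomes and are more likely to be treated, which confounds a naive comparison.
ROWS = [
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 7.0), (1, 1, 7.0), (1, 1, 7.0), (1, 0, 5.0),
]


def estimate_propensity(rows):
    """Propensity score per value of x: share of units that were treated."""
    counts = defaultdict(lambda: [0, 0])   # x -> [n_treated, n_total]
    for x, t, _ in rows:
        counts[x][0] += t
        counts[x][1] += 1
    return {x: n_treated / n_total for x, (n_treated, n_total) in counts.items()}


def naive_difference(rows):
    """Unadjusted difference in mean outcomes, treated minus untreated."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_ate(rows):
    """Average treatment effect using normalized inverse probability weights."""
    score = estimate_propensity(rows)
    num_t = den_t = num_c = den_c = 0.0
    for x, t, y in rows:
        if t == 1:
            w = 1.0 / score[x]
            num_t += w * y
            den_t += w
        else:
            w = 1.0 / (1.0 - score[x])
            num_c += w * y
            den_c += w
    return num_t / den_t - num_c / den_c


print(naive_difference(ROWS), round(ipw_ate(ROWS), 6))
```

In this toy data the naive comparison overstates the effect, while the weighted comparison recovers the effect built into the data. Real applications typically estimate the score with a regression or machine-learning model over many characteristics.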
-
Propensity scores are commonly employed in observational study settings where the goal is to estimate average treatment effects. The paper introduces a flexible propensity score modelling approach, where the probability of treatment is modelled through a Gaussian process framework.
-
Among three state-level tobacco policies (cigarette taxation, tobacco control spending, and smoke-free air laws), a difference-in-differences analysis with generalized propensity scores found that only taxation significantly reduced smoking among the general adult population.
More research using Propensity Scores
Synthetic Control
Synthetic control is similar to propensity scores in the sense that it helps researchers balance the study groups and thus draw causal conclusions from observational studies. It involves the construction of a weighted combination of groups used as controls, to which the treatment group is compared. This method is applicable even in the case of only one treatment observation, a scenario not covered by propensity score methods.
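To make the "weighted combination of groups" concrete, here is a deliberately small Python sketch. It assumes exactly three hypothetical control groups and finds non-negative weights that sum to one by a coarse grid search, minimizing squared pre-intervention differences. Practical implementations use a constrained optimizer and also match on covariates; the data and function names here are invented for illustration.

```python
# Hypothetical outcome series for three control groups (pre and post periods).
DONORS_PRE = [[10, 12, 14, 16], [20, 18, 16, 14], [5, 5, 6, 6]]
DONORS_POST = [[18, 20], [12, 10], [7, 7]]

# Hypothetical single treated unit.
TREATED_PRE = [15, 15, 15, 15]
TREATED_POST = [18, 18]


def combine(weights, donors, t):
    """Weighted combination of donor outcomes at time index t."""
    return sum(w * d[t] for w, d in zip(weights, donors))


def fit_weights(treated_pre, donors_pre, steps=20):
    """Grid-search weights on the simplex for exactly three donors (toy only)."""
    best_w, best_loss = None, float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i / steps, j / steps, (steps - i - j) / steps)
            loss = sum(
                (y - combine(w, donors_pre, t)) ** 2
                for t, y in enumerate(treated_pre)
            )
            if loss < best_loss:
                best_w, best_loss = w, loss
    return best_w


def estimated_effects(weights, treated_post, donors_post):
    """Treated outcome minus synthetic control outcome in each post period."""
    return [
        y - combine(weights, donors_post, t)
        for t, y in enumerate(treated_post)
    ]


weights = fit_weights(TREATED_PRE, DONORS_PRE)
print(weights, estimated_effects(weights, TREATED_POST, DONORS_POST))
```

The gap between the treated unit and its synthetic counterpart after the intervention is the estimated effect, which is why the approach still works with a single treated observation.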
-
To evaluate a drug market intervention, researchers used a synthetic control model to reduce the bias introduced by models that use non-equivalent comparison groups. The research also demonstrated the method and its versatility for evaluating programs retrospectively.
-
Results of a study implementing cohort and cross-section estimators using "synthetic" control groups (combinations of unaffected districts that are reweighted to closely resemble the treatment unit in the pre-intervention period) indicate that the policy was mostly ineffective at reducing the prevalence of overweight or obesity.
Matching Estimators
Matching estimators evaluate the effects of a treatment intervention by comparing outcomes for treated persons to those of similar persons in a comparison group. Treatment may represent, for example, participation in a training program, where the outcome is earnings or employment after the program intervention. Matching estimators are used with propensity score, synthetic control and other methods.
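A minimal sketch of one matching estimator, assuming a single matching characteristic and one nearest neighbor per treated person (matched with replacement). In the spirit of the training-program example above, `x` could stand for prior earnings and `y` for earnings after the program; the data and names are hypothetical.

```python
# Hypothetical (x, y) pairs: x is the matching characteristic, y the outcome.
TREATED = [(1.0, 5.0), (2.0, 7.0)]
CONTROLS = [(0.9, 3.0), (2.2, 4.0), (5.0, 10.0)]


def nearest_neighbor_att(treated, controls):
    """Average effect on the treated via 1-nearest-neighbor matching on x.

    Each treated person is compared with the most similar comparison-group
    person; comparison persons may be reused (matching with replacement).
    """
    gaps = []
    for x_t, y_t in treated:
        _, y_c = min(controls, key=lambda c: abs(c[0] - x_t))
        gaps.append(y_t - y_c)
    return sum(gaps) / len(gaps)


print(nearest_neighbor_att(TREATED, CONTROLS))
```

With many characteristics, the same idea is usually applied to a one-number summary of similarity, such as a propensity score.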
-
For each of the world's 195 countries, a diagnostic tool developed by RAND for the U.S. military produces an overall security cooperation propensity score. Planners can then compare these scores with available funding and security cooperation priorities.
-
This dissertation employs a large, nationally representative panel dataset and a propensity score matching technique to assess the impact of bully victimization and the success of intervention programs.
Instrumental Variables
Instrumental variables may also be used to estimate causal relationships when RCTs are not feasible. The instrumental variable approach addresses unobserved sources of variability, which makes it the counterpart of the propensity score method for controlling observed variables. We help identify and use instrumental variables to control for confounding and measurement error in observational studies so we can make appropriate causal inferences.
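The simplest instrumental variable estimator, for a binary instrument, is the Wald ratio: the instrument's effect on the outcome divided by its effect on treatment take-up. The Python sketch below uses made-up data in which an unobserved factor raises both take-up and outcomes, the instrument `z` is unrelated to that factor, and the true effect is 2. The data and function names are hypothetical.

```python
# Hypothetical rows: (z, d, y) = (instrument, treatment received, outcome).
ROWS = [
    (1, 1, 5.0), (1, 1, 5.0), (1, 1, 3.0), (1, 0, 1.0),
    (0, 1, 5.0), (0, 0, 3.0), (0, 0, 1.0), (0, 0, 1.0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Unadjusted difference in mean outcomes, treated minus untreated."""
    return mean(y for _, d, y in rows if d == 1) - mean(
        y for _, d, y in rows if d == 0
    )


def wald_iv(rows):
    """Wald estimator for a binary instrument.

    (difference in mean outcome by z) / (difference in mean take-up by z)
    """
    y_gap = mean(y for z, _, y in rows if z == 1) - mean(
        y for z, _, y in rows if z == 0
    )
    d_gap = mean(d for z, d, _ in rows if z == 1) - mean(
        d for z, d, _ in rows if z == 0
    )
    return y_gap / d_gap


print(naive_difference(ROWS), wald_iv(ROWS))
```

The estimate is only as credible as the instrument: it must shift treatment take-up and affect the outcome through no other channel.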
-
Small changes in analytic approach can yield contradictory results, which is demonstrated for antidepressant medication and counseling. With a sufficiently large sample size, instrumental variable estimation provides a possible solution and permits causal inferences under certain conditions.
-
Estimates of the benefits of weather warning systems are sparse, perhaps because there is often no clear counterfactual of how individuals would have fared without a particular warning system. Researchers used conditional variation in the initial broadcast dates of the National Oceanic and Atmospheric Administration's Weather Radio All Hazards (NWR) transmitters to produce both cross-sectional and fixed effects estimates of the causal impact of expanding the NWR transmitter network.
Difference-in-Differences
Difference-in-differences estimation is a statistical technique that is used after the fact to mimic an RCT using observational study data. In this case, we use longitudinal data from two groups to obtain an appropriate counterfactual to estimate a causal effect. We typically use this approach to estimate the effects of a specific intervention—such as passing a law, enacting a policy, or implementing a large-scale program—by comparing the changes in outcomes over time between a population that is affected by the intervention and a population that is not.
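In its simplest two-group, two-period form, the estimate is the change over time in the affected population minus the change over time in the unaffected population. The Python sketch below uses invented records and assumes the two groups would have followed parallel trends without the intervention; names and values are hypothetical.

```python
# Hypothetical records: (group, period, outcome).
RECORDS = [
    ("affected", "pre", 9.0), ("affected", "pre", 11.0),
    ("affected", "post", 14.0), ("affected", "post", 16.0),
    ("unaffected", "pre", 7.0), ("unaffected", "pre", 9.0),
    ("unaffected", "post", 9.0), ("unaffected", "post", 11.0),
]


def cell_mean(records, group, period):
    """Mean outcome for one group in one period."""
    values = [y for g, p, y in records if g == group and p == period]
    return sum(values) / len(values)


def difference_in_differences(records):
    """(affected post - pre) minus (unaffected post - pre)."""
    affected_change = cell_mean(records, "affected", "post") - cell_mean(
        records, "affected", "pre"
    )
    unaffected_change = cell_mean(records, "unaffected", "post") - cell_mean(
        records, "unaffected", "pre"
    )
    return affected_change - unaffected_change


print(difference_in_differences(RECORDS))
```

Subtracting the unaffected group's change removes time trends common to both populations, leaving the part of the change attributable to the intervention under the parallel-trends assumption.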
-
A retrospective difference-in-differences analysis of live births in the State Inpatient Databases from 8 U.S. states in varying years between 2003 and 2014 found that state policies imposing punitive action against pregnant women who use illicit substances are associated with higher rates of infants being born with opioid withdrawal.
-
Opening onsite health clinics to provide comprehensive primary care to teachers and their families can lower a school district's health care costs and decrease teacher absenteeism, according to a difference-in-differences analysis.
Regression-Discontinuity Design
Regression-discontinuity design (RDD) allows us to determine whether a program is effective without requiring us to assign potentially vulnerable individuals to a "no-program" comparison group. Instead, it compares outcomes for individuals who fall just above and just below the eligibility cutoff. In fact, we encourage the use of RDD when we wish to target a program or treatment to those who most need or deserve it, such as students with low test scores or patients in need of an experimental treatment.
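As a sketch of how such a design can be analyzed, the Python example below assumes a sharp cutoff: every hypothetical student scoring below the cutoff receives the program and no one at or above it does. It fits a separate straight line on each side of the cutoff and reads the effect off the gap between the two lines at the cutoff. The data, the built-in effect of 5, the bandwidth, and the function names are all invented for illustration.

```python
# Hypothetical (test score relative to cutoff, later outcome) pairs.
# Students with a score below 0 receive the program; the built-in effect is 5.
POINTS = [
    (s, 50.0 + 2.0 * s + (5.0 if s < 0 else 0.0))
    for s in (-4, -3, -2, -1, 1, 2, 3, 4)
]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in points) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - slope * mx, slope


def sharp_rd_effect(points, cutoff=0.0, bandwidth=5.0):
    """Gap at the cutoff between lines fit below (treated) and above (untreated)."""
    below = [(x - cutoff, y) for x, y in points if cutoff - bandwidth <= x < cutoff]
    above = [(x - cutoff, y) for x, y in points if cutoff <= x <= cutoff + bandwidth]
    intercept_below, _ = fit_line(below)
    intercept_above, _ = fit_line(above)
    return intercept_below - intercept_above


print(sharp_rd_effect(POINTS))
```

The resulting estimate is local: it describes the effect for individuals near the cutoff, not necessarily for everyone the program serves.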
-
A rigorous program evaluation of the Medical Insurance Program for the Poor in the Republic of Georgia looks at costs, usage, and health behaviors under this system. The research design exploits the sharp discontinuities at two regional eligibility thresholds to estimate local average treatment effects.
-
Contrary to popular belief, having a dog or cat in the home does not improve the mental or physical health of children. Researchers used a weighted propensity score regression approach and doubly robust regression analyses to examine the association between living with a dog or cat and health outcomes, while accounting for confounding factors.