population, to building and testing theory (Deaton and Cartwright, 2018). However, designing and
conducting RCTs remains costly to researchers and subjects. While major firms now openly acknowledge
running thousands of experiments on their live platforms at any given point in time, the economic and
social costs borne by users are substantial even if not well understood (Clarke 2016; Goel 2014).
Therefore, it is important to derive from RCT data as much value and knowledge as possible. One way to
economize on the use of RCTs is not only to measure the average treatment effect (ATE), but also to find
more nuanced insights about the underlying theoretical mechanisms and to generalize the inferences to
other contexts.
As is well known, the random assignment of subjects to treatment groups ideally removes
endogenous influences, thereby making it possible to identify an overall population causal effect of the
treatment on an outcome. Assuming the experiment is perfectly executed, random assignment renders
unnecessary the auxiliary variables that would otherwise be needed to identify the causal effect of a
treatment on the outcome (as in a regression model of observational data); they are no longer needed for
identifying and reliably estimating the expected ATE. Hence, in theory, the auxiliary variables (also
known as control variables, subgroups, or covariates) need not be included in regression
models to prevent omitted-variable or endogeneity bias. That covariates are not required in
analyzing randomized experiments is often stated as their strength over observational studies. Yet,
Deaton and Cartwright (2018) point out an important drawback: to the extent that researchers rely on
experiments to avoid thinking about covariates and their underlying causal structure, this reliance can hamper the
cumulative development of science and theory.
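
The identification argument above can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the data-generating process, the coefficients, the sample size, and the function names (simulate_rct, diff_in_means, adjusted_ate) are hypothetical and do not come from any study cited here. Because treatment is assigned independently of the covariate, the unadjusted difference in means and the covariate-adjusted regression coefficient should both land close to the true ATE.

```python
import numpy as np


def simulate_rct(n=20_000, true_ate=2.0, seed=0):
    """Simulate a perfectly executed RCT (all parameter values are assumptions)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)          # auxiliary variable (covariate)
    t = rng.integers(0, 2, size=n)  # random assignment, independent of x
    # Outcome depends on the treatment, the covariate, and noise.
    y = 1.0 + true_ate * t + 1.5 * x + rng.normal(size=n)
    return x, t, y


def diff_in_means(t, y):
    """Unadjusted ATE estimate: mean outcome of treated minus control."""
    return y[t == 1].mean() - y[t == 0].mean()


def adjusted_ate(x, t, y):
    """Covariate-adjusted ATE estimate: OLS of y on an intercept, t, and x."""
    design = np.column_stack([np.ones_like(x), t, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the treatment indicator


if __name__ == "__main__":
    x, t, y = simulate_rct()
    print("difference in means:", diff_in_means(t, y))
    print("covariate-adjusted: ", adjusted_ate(x, t, y))
```

In this sketch the covariate is not needed for identification; including it only absorbs outcome variance, which is one reason adjustment can tighten the estimate without changing what is being estimated.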
Despite the theoretical justification for avoiding the use of covariates for estimating an overall
ATE, in practice researchers often use covariates in experiments (see, e.g., the survey by Montgomery et al.
2018). Often, and especially in large-scale experiments, researchers estimate ATEs for subgroups.
Equivalently, researchers run a regression or ANCOVA model, conditioning on a set of control variables.
This practice is popular in experimental studies in economics, epidemiology, management, political
science, information systems, and other areas where researchers commonly report multiple versions of
models that include different subsets of control variables, along with the overall treatment effect; this is
also known as post-stratification, covariate adjustment, or conditioning on auxiliary variables. This
widely used practice suggests that scholars see value in the information provided by auxiliary variables.
In addition to utilizing auxiliary variables in the analysis (e.g., in a regression model or ANCOVA), some
researchers also include interaction terms between the auxiliary variables and the treatment variable
(Imbens and Rubin, 2015, sections 7.5-7.6). Researchers use auxiliary variables for different reasons: to