
Beyond overall treatment effects: Leveraging covariates in randomized

experiments guided by causal structure

Ali Tafti, College of Business Administration, The University of Illinois at Chicago, Chicago, IL 60607,

atafti@uic.edu

Galit Shmueli, Institute of Service Science, National Tsing Hua University, Hsinchu 30013, Taiwan,

galit.shmueli@iss.nthu.edu.tw

Abstract

Researchers using randomized controlled trials (RCTs) often subgroup or condition on auxiliary variables

that are not the randomized treatment variable. There are many good reasons to condition on auxiliary

variables—also referred to as control variables or covariates— in randomized experiments. In particular,

designing and conducting RCTs is costly to researchers and subjects, and therefore it is important to

derive greater value from RCT data by measuring not just the average treatment effect (ATE), but also

finding more nuanced insights about the underlying theoretical mechanisms and generalizing the

inferences. Unfortunately, there are many confusing and even contradictory guidelines on the use of

subgroups or auxiliary variables in RCTs. We show how researchers can leverage covariates without

biasing their causal inferences, by applying a few simple rules based on Judea Pearl’s causal

diagramming framework. We demonstrate how to create a causal schema, through careful and

deliberate operationalization of auxiliary covariates, in order to analyze the intermediate effects along a

causal chain from the treatment to outcome; and we discuss some other ways to leverage covariates for

theory development and generalization of findings from RCTs. We present a criterion for distinguishing

pre-treatment and post-treatment variables that is based on directed acyclic graphs (DAGs). We provide

a succinct set of guidelines to help readers begin to employ some essential techniques of DAG-based

causal analysis. Finally, we provide a series of short tutorials (with accompanying simulated data and R

scripts) to help readers explore the connections between RCT and observational contexts in causal

diagramming. This commentary aims to raise awareness of the DAG methodology, explain its usefulness

to experimental research, and encourage adoption in the IS community for studies using RCTs as well as

observational data.

Keywords: Randomized experiments, Causal inference, Causal diagram, Directed Acyclic Graph (DAG)

1. Introduction and motivation

Randomized experiments, or randomized controlled trials (RCTs), have long been an important part

of the research and product development process for firms in many industries—most notably the

pharmaceutical industry where randomized clinical trials are one of the requirements to obtain

government approval of new products. RCTs have become popular tools for empirical research in

information systems (IS) and in many other fields, as they are considered to be the gold standard for

accurate causal inferences, and as digital technologies have helped reduce their implementation costs.

RCTs are used for a variety of goals, ranging from simple generalization of an effect to a larger

Electronic copy available at: https://ssrn.com/abstract=3331772

population, to building and testing theory (Deaton and Cartwright, 2018). However, designing and

conducting RCTs remains costly to researchers and subjects. While major firms now openly acknowledge

running thousands of experiments at any given point of time on their live platforms, the economic and

social costs borne by users are substantial even if not well understood (Clarke 2016; Goel 2014).

Therefore, it is important to derive from RCT data as much value and knowledge as possible. One way to

economize the use of RCTs is not just to measure the average treatment effect (ATE), but also to find

more nuanced insights about the underlying theoretical mechanisms and to generalize the inferences to

other contexts.

As is well known, the random assignment of subjects to treatment groups ideally removes

endogenous influences, thereby making it possible to identify an overall population causal effect of the

treatment on an outcome. Assuming the experiment is perfectly executed, auxiliary variables that would

otherwise be needed to identify the causal effect of a treatment on the outcome—as in a regression

model to analyze observational data—are rendered by the random assignment to be unnecessary for

identifying and reliably estimating the expected ATE. Hence, in theory, the auxiliary variables—also

known as control variables, subgroups, or covariates—are not required for inclusion in regression

models to prevent omitted variable biases or endogeneity biases. That covariates are not required in

analyzing randomized experiments is often stated as their strength over observational studies. Yet,

Deaton and Cartwright (2018) point out an important drawback: to the extent that researchers rely on

experiments to avoid thinking about covariates and their underlying causal structure, it can hamper the

cumulative development of science and theory.
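The identification argument above can be made concrete with a small simulation. The following is a minimal sketch, not the authors' tutorial code: the data-generating process, the function name `simulate`, and all numeric values are assumptions chosen for illustration. A covariate X raises the outcome, and in the non-randomized setting it also raises the chance of treatment, so the naive difference in means is confounded; under random assignment the same difference in means lands close to the true ATE without using X at all.

```python
import random
import statistics


def simulate(n, randomized, seed=0):
    """Return the naive treated-minus-control difference in mean outcomes.

    Hypothetical data-generating process (an assumption for illustration):
      X ~ Bernoulli(0.5) is a covariate that raises the outcome by 2.
      The true treatment effect is 1 for every unit.
      If randomized is False, units with X = 1 are more likely to be treated,
      so X confounds the treatment-outcome relationship.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        if randomized:
            p_treat = 0.5                      # coin flip, independent of X
        else:
            p_treat = 0.8 if x == 1 else 0.2   # self-selection driven by X
        t = 1 if rng.random() < p_treat else 0
        y = 1.0 * t + 2.0 * x + rng.gauss(0.0, 1.0)
        (treated if t == 1 else control).append(y)
    return statistics.fmean(treated) - statistics.fmean(control)


TRUE_ATE = 1.0
rct_estimate = simulate(20000, randomized=True)
obs_estimate = simulate(20000, randomized=False)
print(f"true ATE:                          {TRUE_ATE:.2f}")
print(f"RCT difference in means:           {rct_estimate:.2f}")
print(f"observational difference in means: {obs_estimate:.2f}")
```

Under these assumed parameters the observational difference in means converges to about 2.2 rather than 1, because treated units are disproportionately those with X = 1. Recovering the ATE from the observational data would require conditioning on X; in the randomized data it does not.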

Despite the theoretical justification for avoiding the use of covariates for estimating an overall

ATE, in practice researchers often use covariates in experiments (see, e.g., the survey by Montgomery et al.

2018). Often, and especially in large-scale experiments, researchers estimate ATEs for subgroups.

Equivalently, researchers run a regression or ANCOVA model, conditioning on a set of control variables.

This practice is popular in experimental studies in economics, epidemiology, management, political

science, information systems, and other areas where researchers commonly report multiple versions of

models that include different subsets of control variables, along with the overall treatment effect; this is

also known as post-stratification, covariate adjustment, or conditioning on auxiliary variables. This

widely used practice suggests that scholars see value in the information provided by auxiliary variables.

In addition to utilizing auxiliary variables in the analysis (e.g., in a regression model or ANCOVA), some

researchers also include interaction terms between the auxiliary variables and the treatment variable

(Imbens and Rubin, 2015; section 7.5-7.6). Researchers use auxiliary variables for different reasons: to


allow ATEs to be identified for specific subgroups of interest, to help identify moderating (interaction) or

mediating effects that reveal underlying theoretical mechanisms, and, from a statistical point of view, to

improve the precision of the ATE estimator (Kahan et al. 2014). Although collecting and using data on

auxiliary variables is widespread in RCTs, we argue that researchers have not been extracting the full

extent of information that such auxiliary variables could provide, which would have both theoretical and

practical value. Current guidance on the use of auxiliary variables in RCTs is focused on avoiding bias to

the ATE estimator. Not only has prior literature provided confusing and sometimes contradictory

guidelines on avoiding bias to the ATE, but it also fails to provide clear guidance on how to

extract further information using auxiliary variables, beyond the ATE.
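Two of the uses of auxiliary variables listed above, subgroup ATEs and improved precision, can be illustrated with a short simulation. This is a sketch under stated assumptions rather than a reproduction of any cited analysis: the binary pre-treatment covariate, the effect sizes (0.5 and 2.0), the sample size, and the helper name `one_experiment` are all hypothetical. It compares the unadjusted difference in means with a post-stratified estimator that averages subgroup ATEs weighted by subgroup share.

```python
import random
import statistics


def one_experiment(rng, n=400):
    """Simulate one RCT; return (unadjusted ATE, post-stratified ATE, subgroup ATEs)."""
    # groups[x][t] collects outcomes for covariate value x and treatment arm t
    groups = {0: {0: [], 1: []}, 1: {0: [], 1: []}}
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0   # pre-treatment covariate
        t = 1 if rng.random() < 0.5 else 0   # randomized treatment
        effect = 2.0 if x == 1 else 0.5      # assumed heterogeneous effect
        y = effect * t + 3.0 * x + rng.gauss(0.0, 1.0)
        groups[x][t].append(y)

    treated = groups[0][1] + groups[1][1]
    control = groups[0][0] + groups[1][0]
    unadjusted = statistics.fmean(treated) - statistics.fmean(control)

    # ATE within each subgroup, then a weighted average by subgroup share
    subgroup = {
        x: statistics.fmean(groups[x][1]) - statistics.fmean(groups[x][0])
        for x in (0, 1)
    }
    share = {x: (len(groups[x][0]) + len(groups[x][1])) / n for x in (0, 1)}
    post_stratified = sum(share[x] * subgroup[x] for x in (0, 1))
    return unadjusted, post_stratified, subgroup


rng = random.Random(1)
results = [one_experiment(rng) for _ in range(500)]

unadjusted_estimates = [r[0] for r in results]
post_estimates = [r[1] for r in results]
mean_unadjusted = statistics.fmean(unadjusted_estimates)
mean_post = statistics.fmean(post_estimates)
sd_unadjusted = statistics.stdev(unadjusted_estimates)
sd_post = statistics.stdev(post_estimates)
mean_subgroup_0 = statistics.fmean([r[2][0] for r in results])
mean_subgroup_1 = statistics.fmean([r[2][1] for r in results])

print(f"unadjusted:      mean {mean_unadjusted:.2f}, sd {sd_unadjusted:.3f}")
print(f"post-stratified: mean {mean_post:.2f}, sd {sd_post:.3f}")
print(f"subgroup ATEs:   X=0 {mean_subgroup_0:.2f}, X=1 {mean_subgroup_1:.2f}")
```

Both estimators center on the assumed overall ATE of 1.25, but the post-stratified estimator varies less across replications because the covariate explains much of the outcome variance. The subgroup estimates also expose the assumed effect heterogeneity, which the overall ATE alone conceals.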

In this research commentary, we present guidelines for using subgroups and auxiliary variables

in RCTs, based on the causal diagramming framework by Pearl (2009).

This framework provides clear

guidelines on how to leverage the information in auxiliary variables, and we illustrate its usefulness with

an example in the IS context. We do not add anything fundamentally new to Pearl’s methodology, nor

do we provide a comprehensive introduction to it. Rather, our five-fold contribution, which we have not

seen in prior published work, includes the following: 1) We demonstrate how to create a causal schema,

through careful and deliberate operationalization of auxiliary covariates, in order to analyze the

intermediate effects along a causal chain from the treatment to outcome; and we discuss other ways to

leverage covariates for theory development and generalization of findings from RCTs, 2) we present a

criterion for distinguishing pre-treatment and post-treatment variables that is based on directed acyclic

graphs (DAGs), 3) we provide a succinct set of guidelines to help readers begin to employ basic and

essential DAG-based causal analysis, and help orient readers on where to reference other work for

further guidance, 4) we point out issues of particular relevance to online digital experiments, and 5) we

provide a series of short tutorials, and make the accompanying simulated data and R scripts available, to

make our points more concrete to readers who wish to learn to implement some of the techniques. Our

five-fold contribution aims to raise awareness of the DAG methodology, explain its usefulness to

experimental research, and encourage adoption in the IS community for studies using RCTs as well as

observational data. Readers can find a more thorough grounding in the theoretical underpinnings of

causal diagramming, and some exposure to its broad domains of application, in the textbook by Judea

Pearl (2009). Pearl et al. (2016) offers a more basic workbook with useful examples. Pearl and

Mackenzie (2018) is a delightfully readable introductory text that puts the development of Pearl’s

framework in historical context. Morgan and Winship (2015) also provide an effective introduction to

causal diagramming for the social sciences.

Footnote: The first edition of Pearl’s textbook Causality was published in 2000. The methods introduced in that work have

already been adopted extensively in epidemiology and biostatistics and, to a lesser extent, in other social science

fields such as political science, sociology and psychology. Pearl’s work has brought about a paradigmatic shift in

conceptualizing causal questions, and has been formally recognized as a fundamental contribution to the

philosophy of science and to computer science, as evidenced by the highest awards in those fields.

1.1 Confusing and contradictory guidelines in current literature

Among various scientific disciplines where auxiliary variables are used in the analysis of RCTs, there has

been some disagreement and confusion about which auxiliary variables should be included

in the analysis of randomized experiments. For example, based on surveying 50 clinical trials, Pocock et

al. (2002) find inconsistencies regarding the use of covariate-adjusted analyses “perhaps largely because

their rationale and statistical properties are poorly understood.” They also highlight the lack of clear

guidelines on covariate selection. Examining the guidelines provided in a variety of textbooks and

papers, we identified several types of guidelines and conditions:

1. Guidance based on correlation of the auxiliary variable with the outcome variable: The classic

textbook by Cook et al. (2002, p. 306) prescribes including in the ANCOVA of an RCT “variables that

are highly correlated with the outcome whether or not they distinguish between [the treatment]

groups at pretest (Begg, 1990; Maxwell, 1993; Permutt, 1990)”, while also advising against “adding

covariates that do not predict outcome or that are highly correlated with each other”. A related, but

different, approach is the inclusion of covariates shown in the past to correlate with the outcome—

these are called prognostic covariates. For example, Kahan et al. (2014) recommend including

known prognostic covariates for the purpose of increasing power.

2. Guidance based on correlation of the auxiliary variable with the group imbalance: Raab et al. (2000,

p. 330) advocate always conditioning on balanced covariates, while adjusting for unbalanced

covariates only when the sample size is large. Yet, Mutz et al. (2018, p.5) explain why using

significance tests for determining unbalanced covariates is inappropriate (“One should choose

covariates for their anticipated relation to the dependent variable. To alter this choice because of a

balance test is to choose based on a relation with the independent variable”).

3. Guidance based on pre-treatment vs. post-treatment variables: Montgomery et al. (2018) document

a widespread misuse of auxiliary variables in the analysis of RCTs; in particular, they find research

publications routinely adjust for “post-treatment variables” in ways that are known to introduce bias

in causal estimates. We use quotes, because the terms pre- and post-treatment variables are ill-

defined in the papers we reviewed (see Section 2.1.3). Though we agree with Montgomery et al.

(2018) and others, that simply conditioning on “post-treatment variables” is usually better avoided,


we point out that (1) a clear definition of “post-treatment” and “pre-treatment” variables should be

based on causal structure, and (2) some variables labeled “post-treatment variables” can provide

essential insight, if used in the right way.
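The bias mentioned in the third item can be seen in one simple causal structure. The sketch below is an illustration under assumptions, not a scenario taken from the cited papers: the treatment T is randomized and has no effect on the outcome Y, an unobserved factor U drives Y, and a variable M measured after treatment depends on both T and U (so M is a collider on the path T → M ← U → Y). All variable names and coefficients are hypothetical.

```python
import random
import statistics


def diff_in_means(rows, m_value=None):
    """Treated-minus-control mean outcome, optionally restricted to M == m_value."""
    treated, control = [], []
    for t, m, y in rows:
        if m_value is not None and m != m_value:
            continue
        (treated if t == 1 else control).append(y)
    return statistics.fmean(treated) - statistics.fmean(control)


rng = random.Random(2)
rows = []
for _ in range(20000):
    t = 1 if rng.random() < 0.5 else 0   # randomized treatment
    u = rng.gauss(0.0, 1.0)              # unobserved cause of M and Y
    m = 1 if t + u > 0.5 else 0          # post-treatment variable (collider)
    y = 2.0 * u + rng.gauss(0.0, 1.0)    # T has no effect on Y: true ATE = 0
    rows.append((t, m, y))

overall = diff_in_means(rows)
given_m1 = diff_in_means(rows, m_value=1)
given_m0 = diff_in_means(rows, m_value=0)

print(f"no conditioning:      {overall:.2f}")
print(f"conditioning on M=1:  {given_m1:.2f}")
print(f"conditioning on M=0:  {given_m0:.2f}")
```

Without conditioning, the difference in means is close to the true effect of zero. Within either level of M, treated units have systematically lower U than control units, so both conditional comparisons show a spurious negative effect. This covers only one structure; it does not imply that every variable measured after treatment is unusable.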

Based on a review of published experiments as well as the confusing, ambiguous, and sometimes

contradictory guidelines from the statistics literature, it is clear that we need more systematic guidance

on how to leverage the information of auxiliary variables in RCTs. As Hernán et al. (2002) explain,

strategies that rely purely on statistical criteria to determine whether to condition on auxiliary variables

are inadequate. Deaton and Cartwright (2018) argue that we need to pay more attention not just to

obtaining unbiased overall treatment effects, but also to how to use the results of RCTs for theory

building and generalizing results beyond narrowly circumscribed experimental settings: “If trials are to

be useful, we need paths to their use that are as carefully constructed as are the trials themselves.”

(Deaton and Cartwright 2018, p. 10). Deaton and Cartwright (2018) explain that covariates can be useful

to generalize results and enhance theory implications from RCT studies, but their commentary does not

provide methodological guidance on how to do this. Although some econometrics texts offer limited

guidance on how to avoid ‘bad control variables’ (e.g. Angrist and Pischke 2008), Pearl (2015) presents a

series of causal scenarios and ‘toy problems’ in which even those helpful texts and the prevailing

econometric frameworks prove inadequate. Thus, a systematic framework is needed to leverage

covariates beyond estimating an ATE. In this commentary, we aim to provide information to make it

easier for empirical researchers in IS to begin applying Pearl’s (2009) causality framework in RCT studies

in order to properly leverage information from covariates that is useful for theory building and

generalizing experimental results.

1.2 Causal diagrams for bridging theory and practice for RCTs in IS research

Causal diagrams explicitly encode a limited set of relatively precise and narrow theoretical

claims that relate variables to one another. In doing so, Pearl’s framework enables more insight into

theoretical mechanisms by properly identifying subgroup specific effects, causal mediation,

intermediate effects, and other insights of theoretical value that can be leveraged from auxiliary

variables (Pearl 2009). Moreover, the framework offers quantitative methods for generalizing RCT

results, enabling researchers to extend and transport the estimated causal effect from a tightly

circumscribed RCT setting to non-experimental settings (Bareinboim and Pearl 2016).

Despite Pearl’s extensive body of work, there remains a gap between theory and practice: The

causal diagramming framework is still perceived in fields such as econometrics as not amenable to

practical applications (Imbens 2019). Thus far, Pearl’s work is comprised primarily of “toy examples,”
