A Survey on Causal Inference

LIUYI YAO, Alibaba Group

ZHIXUAN CHU and SHENG LI, University of Georgia

YALIANG LI, Alibaba Group

JING GAO, Purdue University

AIDONG ZHANG, University of Virginia

Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and its low budget requirement compared with randomized controlled trials. Spurred by the rapid development of machine learning, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which helps researchers and practitioners explore, evaluate, and apply the causal inference methods.

CCS Concepts: • Computing methodologies → Causal reasoning and diagnostics; Machine learning; • Information systems → Data mining;

Additional Key Words and Phrases: Treatment effect estimation; Representation learning

ACM Reference format:

Liuyi Yao, Zhixuan Chu, Sheng Li, Yaliang Li, Jing Gao, and Aidong Zhang. 2021. A Survey on Causal Inference. ACM Trans. Knowl. Discov. Data 15, 5, Article 74 (May 2021), 46 pages.

https://doi.org/10.1145/3444944

Work done when Liuyi Yao was a Ph.D. student at University at Buffalo.

This work is supported in part by the US National Science Foundation under grants IIS-1747614, IIS-2008208, IIS-1934600, IIS-1938167, and IIS-1955151.

Authors’ addresses: L. Yao, Alibaba Group, 969 West Wen Yi Road, Yu Hang District, Hangzhou, Zhejiang, 311121, China; email: [email protected]; Z. Chu, University of Georgia, 415 Boyd Graduate Studies Research Center, Athens, Georgia, 30602-7404, USA; email: [email protected]; S. Li, University of Georgia, 415 Boyd Graduate Studies Research Center, Athens, Georgia, 30602-7404, USA; email: [email protected]; Y. Li, Alibaba Group, 500 108th Ave NE, Suite 800, Bellevue, Washington, 98004, USA; email: [email protected]; J. Gao, Purdue University, 465 Northwestern Ave., West Lafayette, Indiana, 47907-2035, USA; email: [email protected]; A. Zhang, University of Virginia, 85 Engineer’s Way, Charlottesville, Virginia, 22904; email: [email protected].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from p[email protected].

1556-4681/2021/05-ART74 $15.00

https://doi.org/10.1145/3444944

ACM Transactions on Knowledge Discovery from Data, Vol. 15, No. 5, Article 74. Publication date: May 2021.


1 INTRODUCTION

In everyday language, correlation and causality are commonly used interchangeably, although they have quite different interpretations. Correlation indicates a general relationship: two variables are correlated when they display an increasing or decreasing trend [7]. Causality is also referred to as cause and effect, where the cause is partly responsible for the effect, and the effect is partly dependent on the cause. Causal inference is the process of drawing a conclusion about a causal connection based on the conditions of the occurrence of an effect. The main difference between causal inference and inference of correlation is that the former analyzes the response of the effect variable when the cause is changed [104, 150].

It is well known that “correlation does not imply causation.” For example, a study showed that girls who normally have breakfast weigh less than girls who do not, and thus concluded that having breakfast can help with losing weight. But in fact, these two events may be merely correlated rather than causally related. Perhaps the girls who have breakfast every day have a better lifestyle, such as exercising frequently, sleeping regularly, and eating a healthy diet, which ultimately leads to their lower weight. In this case, having a better lifestyle is the common cause of both having breakfast and lower weight, and thus we can also treat it as a confounder of the causal relationship between having breakfast and lower weight.
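A minimal simulation can make the confounding argument concrete. The sketch below assumes a binary lifestyle variable and invented probabilities and weights (none of these numbers come from the study mentioned above). Breakfast is given no effect on weight by construction, yet the naive comparison still shows a gap, which largely disappears once girls are compared within the same lifestyle group.

```python
import random


def simulate(n=20000, seed=0):
    """Simulate (healthy, breakfast, weight) rows with lifestyle as a confounder."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        healthy = rng.random() < 0.5              # confounder: better lifestyle (assumed binary)
        p_breakfast = 0.8 if healthy else 0.2     # lifestyle influences breakfast habits
        breakfast = rng.random() < p_breakfast
        # Lifestyle influences weight; breakfast itself has NO effect by construction.
        weight = rng.gauss(55.0 if healthy else 60.0, 3.0)
        rows.append((healthy, breakfast, weight))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Mean weight of breakfast eaters minus mean weight of non-eaters."""
    eat = mean(w for _, b, w in rows if b)
    skip = mean(w for _, b, w in rows if not b)
    return eat - skip


def stratified_difference(rows):
    """Same comparison within each lifestyle group, averaged by group size."""
    total, weighted = 0, 0.0
    for level in (True, False):
        stratum = [r for r in rows if r[0] == level]
        weighted += len(stratum) * naive_difference(stratum)
        total += len(stratum)
    return weighted / total


rows = simulate()
naive = naive_difference(rows)
adjusted = stratified_difference(rows)
print(f"naive difference:    {naive:+.2f} kg")
print(f"adjusted difference: {adjusted:+.2f} kg")
```

The within-group comparison is only possible here because the toy data records the lifestyle variable; with an unrecorded confounder the naive gap could not be corrected this way.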

In many cases, it seems obvious that one action can cause another; however, there are also many cases in which we cannot easily tease out and ascertain the relationship. Therefore, learning causality is a dauntingly challenging problem. The most effective way of inferring causality is to conduct a randomized controlled trial, which randomly assigns participants to a treatment group or a control group. Because the study is randomized, the only expected difference between the control and treatment groups is the outcome variable being studied. However, in reality, randomized controlled trials are typically time-consuming and expensive, and thus the study cannot involve many subjects, which may not be representative of the real-world population a treatment/intervention would eventually target. Another issue is that randomized controlled trials only focus on the average over samples, and they do not explain the mechanism or pertain to individual subjects. In addition, ethical issues also need to be considered in most randomized controlled trials, which largely limits their applications. Therefore, instead of randomized controlled trials, observational data is a tempting shortcut. Observational data is obtained by the researcher simply observing the subjects without any interference. That is, the researchers have no control over treatments and subjects, and they just observe the subjects and record data based on their observations. From the observational data, we can find their actions, outcomes, and information about what has occurred, but cannot figure out the mechanism by which they took a specific action. For observational data, the core question is how to get the counterfactual outcome. For example, we want to answer the question “would this patient have had different results if he had received a different medication?” Answering such counterfactual questions is challenging for two reasons [135]: the first is that we only observe the factual outcome and never the counterfactual outcomes that would potentially have happened had a different treatment option been chosen. The second is that treatments are typically not assigned at random in observational data, which may cause the treated population to differ significantly from the general population.

To solve these problems in causal inference from observational data, researchers have developed various frameworks, including the potential outcome framework [127, 149] and the structural causal model (SCM) [102, 105, 107]. The potential outcome framework is also known as the Neyman–Rubin Potential Outcomes or the Rubin Causal Model. In the example mentioned above, a girl would have a particular weight if she had breakfast normally every day, whereas she would have a different weight if she did not. To measure the causal effect of having breakfast normally for a girl, we need to compare the outcomes for the same person under both situations. Obviously, it is impossible to see both potential outcomes at the same time, and one of the potential outcomes is always missing. The potential outcome framework aims to estimate such potential outcomes and then calculate the treatment effect. Therefore, treatment effect estimation is one of the central problems in causal inference under the potential outcome framework. Another influential framework in causal inference is the SCM, which includes the causal graph and the structural equations. The SCM describes the causal mechanisms of a system where a set of variables and the causal relationships among them are modeled by a set of simultaneous structural equations. Another line of learning causality is causal structure learning, whose objective is to reveal the causal relations by generating a causal graph. Representative methods can be divided into three categories: constraint-based models [147], score-based models [31, 114], and functional causal models [62, 176]. Unlike causal effect estimation, causal structure learning addresses a different class of problems, which is outside the scope of this survey; see [148] for more information.

Causal inference has a close relationship with machine learning. In recent years, the rapid growth of machine learning has advanced the development of causal inference. Powerful machine learning methods, such as decision trees, ensemble methods, and deep neural networks, are applied to estimate the potential outcomes more accurately. In addition to improving the outcome estimation model, machine learning methods also provide a new way to handle confounders. Benefiting from recent deep representation learning methods, the confounder variables are adjusted by learning a balanced representation for all covariates, so that, conditioning on the learned representation, the treatment assignment is independent of the confounder variables. In machine learning, the more data the better. However, in causal inference, more data alone is not enough. Having more data only helps to get more precise estimates, but it cannot ensure that these estimates are correct and unbiased. Machine learning methods advance the development of causal inference; meanwhile, causal inference also helps machine learning methods. The simple pursuit of predictive accuracy is insufficient for modern machine learning research, and correctness and interpretability are also targets of machine learning methods. Causal inference is starting to help improve machine learning in areas such as recommender systems and reinforcement learning.

In this article, we provide a comprehensive review of the causal inference methods under the potential outcome framework. We first introduce the basic concepts of the potential outcome framework as well as its three critical assumptions for identifying the causal effect. After that, various causal inference methods that rely on these three assumptions are discussed in detail, including re-weighting methods, stratification methods, matching-based methods, tree-based methods, representation-based methods, multi-task learning based methods, and meta-learning methods. Additionally, causal effect estimation methods that relax the three assumptions are also described to fulfill the needs of different settings. After introducing various causal effect estimation methods, we discuss the real-world applications that the discussed methods have great potential to benefit, with advertising, recommendation, medicine, and reinforcement learning as representative examples.

To the best of our knowledge, this is the first article that provides a comprehensive survey of causal inference methods under the potential outcome framework. There also exist several surveys that discuss one category of causal effect estimation methods, such as the survey of matching-based methods [151], the survey of tree-based and ensemble-based methods [12], and the review of dynamic treatment regimes [28]. For the SCM, readers may refer to the survey [104] or the book [103]. There is also a survey about learning causality from observational data [52] whose content ranges over inferring the causal graph from observational data, the SCM, the potential outcome framework, and their connection to machine learning. Compared with the surveys mentioned above, this survey article mainly focuses on the theoretical background of the potential outcome framework, the representative methods across the statistics domain and the machine learning domain, and how this framework and the machine learning area enhance each other.

To summarize, the contributions of this survey are as follows:

—New taxonomy: We separate various causal inference methods into two major categories based on whether they require the three assumptions of the potential outcome framework. The category requiring the three assumptions is further divided into seven sub-categories based on the way the confounder variables are handled.

—Comprehensive review: We provide a comprehensive survey of the causal inference methods under the potential outcome framework. In each category, detailed descriptions of the representative methods, the connections and comparisons between the mentioned methods, and a general summary are provided.

—Abundant resources: In this survey, we list the state-of-the-art methods, the benchmark datasets, open-source codes, and representative applications.

The rest of the article is organized as follows. In Section 2, the background of the potential outcome framework is introduced, including the basic definitions, the assumptions, and the fundamental problems with their general solutions. In Section 3, the methods under the three assumptions are presented. Then, in Section 4, we discuss the problems that arise when some assumptions are not satisfied, and describe the methods that relax those assumptions. Next, we provide experimental guidelines in Section 5. Afterward, in Section 6, the typical applications of causal inference are illustrated. After that, in Section 7, the future directions and open problems are discussed. Finally, Section 8 summarizes the article.

2 BASICS OF CAUSAL INFERENCE

In this section, we present the background knowledge of causal inference, including the task description, mathematical notions, assumptions, challenges, and general solutions. We also give an illustrative example that will be used throughout this survey.

Generally speaking, the task of causal inference is to estimate how the outcome would change if another treatment had been applied. For example, suppose there are two treatments that can be applied to patients: Medicine A and Medicine B. When Medicine A is applied to the patient cohort of interest, the recovery rate is 70%, while when Medicine B is applied to the same cohort, the recovery rate is 90%. The change in recovery rate is the effect that the treatment (i.e., the medicine in this example) exerts on the recovery rate.
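As a worked instance of this arithmetic, taking Medicine A as the baseline (an assumption; the text does not fix a reference treatment), the effect of giving the same cohort Medicine B instead is the difference between the two recovery rates:

```latex
\[
\text{effect on recovery rate}
  = \underbrace{0.90}_{\text{Medicine B}} - \underbrace{0.70}_{\text{Medicine A}}
  = 0.20 ,
\]
```

that is, an increase of 20 percentage points.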

The above example describes an ideal situation for measuring the treatment effect: applying different treatments to the same cohort. In real-world scenarios, this ideal situation can only be approximated by a randomized experiment, in which the treatment assignment is controlled, such as a completely random assignment. In this way, the group that receives a specific treatment can be viewed as an approximation to the cohort we are interested in.
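The approximation argument can be illustrated with a small simulation. The sketch below is built on assumptions: a hypothetical binary severity trait and invented recovery probabilities, chosen so that the cohort-wide recovery rates are 70% under Medicine A and 90% under Medicine B. Because the medicine is assigned by a coin flip that ignores the patient, each arm resembles the whole cohort, and the difference between the observed arm rates should land close to the cohort-level effect of 0.20.

```python
import random


def randomized_trial(n=20000, seed=1):
    """Return the observed recovery rate per arm under completely random assignment."""
    rng = random.Random(seed)
    recovered = {"A": [], "B": []}
    for _ in range(n):
        severe = rng.random() < 0.5               # hypothetical patient trait
        arm = "A" if rng.random() < 0.5 else "B"  # coin flip, independent of the patient
        # Invented probabilities: averaged over severity they give 0.70 (A) and 0.90 (B).
        if arm == "A":
            p_recover = 0.55 if severe else 0.85
        else:
            p_recover = 0.85 if severe else 0.95
        recovered[arm].append(rng.random() < p_recover)
    return {arm: sum(flags) / len(flags) for arm, flags in recovered.items()}


rates = randomized_trial()
estimated_effect = rates["B"] - rates["A"]
print(f"recovery rate A: {rates['A']:.3f}")
print(f"recovery rate B: {rates['B']:.3f}")
print(f"estimated effect (B - A): {estimated_effect:+.3f}")
```

If the coin flip were replaced by an assignment rule that depends on severity, the two arms would no longer resemble the same cohort and the simple difference of rates would be biased.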

However, performing randomized experiments is expensive, time-consuming, and sometimes even unethical. Therefore, estimating the treatment effect from observational data has attracted growing attention due to the wide availability of observational data. Observational data usually contains a group of individuals who took different treatments, their corresponding outcomes, and possibly more information, but without direct access to the reason/mechanism why they took the specific treatment. Such observational data enable researchers to investigate the fundamental problem of learning the causal effect of a certain treatment without performing randomized experiments. To better introduce various treatment effect estimation methods, the following section introduces several definitions, including unit, treatment, outcome, treatment effect, and other information (pre- and post-treatment variables) provided by observational data.

2.1 Definitions

Here we define the notations under the potential outcome framework [127, 149], which is logically equivalent to another framework, the SCM framework [72]. The foundation of the potential outcome framework is that causality is tied to a treatment (or action, manipulation, intervention) applied to a unit [69]. The treatment effect is obtained by comparing units’ potential outcomes under different treatments. In the following, we first introduce three essential concepts in causal inference: unit, treatment, and outcome.

Definition 1 (Unit). A unit is the atomic research object in the treatment effect study.

A unit can be a physical object, a firm, a patient, an individual person, or a collection of objects or persons, such as a classroom or a market, at a particular time point [69]. Under the potential outcome framework, the atomic research objects at different time points are different units. One unit in the dataset is a sample of the whole population, so in this survey, the terms “sample” and “unit” are used interchangeably.

Definition 2 (Treatment). Treatment refers to the action that is applied to a unit (or that the unit is exposed or subjected to).

Let $W$ ($W \in \{0, 1, 2, \ldots, N\}$) denote the treatment, where $N + 1$ is the total number of possible treatments. In the aforementioned medicine example, Medicine A is a treatment. Most of the literature considers the binary treatment, and in this case, the group of units that receive treatment $W = 1$ is the treated group, and the group of units with $W = 0$ is the control group.

Definition 3 (Potential Outcome). For each unit-treatment pair, the outcome of that treatment when applied to that unit is the potential outcome [69].

The potential outcome of treatment with value $w$ is denoted as $Y(W = w)$.

Definition 4 (Observed Outcome). The observed outcome is the outcome of the treatment that is actually applied.

The observed outcome is also called the factual outcome, and we use $Y^F$ to denote it, where $F$ stands for “factual.” The relation between the potential outcome and the observed outcome is $Y^F = Y(W = w)$, where $w$ is the treatment actually applied.

Definition 5 (Counterfactual Outcome). The counterfactual outcome is the outcome if the unit had taken another treatment.

The counterfactual outcomes are the potential outcomes of the treatments except the one actually taken by the unit. Since a unit can only take one treatment, only one potential outcome can be observed, and the remaining unobserved potential outcomes are the counterfactual outcomes. In the multiple treatment case, let $Y^{CF}(W = w')$ denote the counterfactual outcome of treatment with value $w'$. In the binary treatment case, for notation simplicity, we use $Y^{CF}$ to denote the counterfactual outcome, and $Y^{CF} = Y(W = 1 - w)$, where $w$ is the treatment actually taken by the unit.
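The relation between potential, factual, and counterfactual outcomes in the binary treatment case can be sketched as a small data structure. This is an illustrative toy under stated assumptions: the class name `Unit`, its fields, and the numeric values are hypothetical, and the object stores both potential outcomes only because it is simulated. In real observational data, only the factual outcome Y(W = w) is available and the counterfactual Y(W = 1 − w) has to be estimated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A toy unit with both potential outcomes known (never true for real data)."""
    y0: float  # potential outcome Y(W = 0)
    y1: float  # potential outcome Y(W = 1)
    w: int     # treatment actually taken, 0 or 1

    def potential_outcome(self, w: int) -> float:
        """Return Y(W = w) for a binary treatment value w."""
        if w not in (0, 1):
            raise ValueError("binary treatment expected")
        return self.y1 if w == 1 else self.y0

    @property
    def factual(self) -> float:
        """Observed outcome: the potential outcome of the treatment actually taken."""
        return self.potential_outcome(self.w)

    @property
    def counterfactual(self) -> float:
        """Unobserved outcome: the potential outcome of the other treatment, Y(W = 1 - w)."""
        return self.potential_outcome(1 - self.w)


# Hypothetical unit that took treatment 1.
unit = Unit(y0=60.0, y1=57.5, w=1)
print("factual:", unit.factual)
print("counterfactual:", unit.counterfactual)
```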

In the observational data, besides the chosen treatments and the observed outcomes, the units’ other information is also recorded, and it can be separated into pre-treatment variables and post-treatment variables.
