**Edited by Li Mei | Chen Caixian**

Judea Pearl, the 2011 Turing Award winner and the father of causal science, proposed the famous Pearl Causal Hierarchy (PCH).

He holds that causal inference has three levels. The lowest level is association, which involves prediction rather than causation and concerns only correlations between variables, such as the correlation between a rooster crowing and the sunrise.

The second level is intervention, which involves causation, such as the causal effect of smoking on lung cancer.

The third level, counterfactuals, involves answering questions such as "what might have happened if things had been otherwise?"
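In Pearl's standard notation (spelled out here for reference, not quoted from the article), the three rungs correspond to increasingly expressive queries:

```latex
\begin{align*}
\text{Rung 1 (association):} \quad & P(y \mid x) \\
\text{Rung 2 (intervention):} \quad & P(y \mid \mathrm{do}(x)) \\
\text{Rung 3 (counterfactual):} \quad & P(y_x \mid x', y')
\end{align*}
```

A rung-3 query reads: given that we actually observed $X = x'$ and $Y = y'$, what is the probability that $Y$ would have been $y$ had $X$ been $x$?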

Counterfactuals are a hot topic in current causal inference research, but in the rush of activity, some studies use the term "counterfactual" loosely or even misuse it.

Recently, a causal inference paper that refers to "counterfactuals" many times was criticized by Judea Pearl after being retweeted on Twitter.

Paper address: https://ift.tt/vN7H3ps

The first author of this paper is Professor Michael Jordan of the University of California, Berkeley. The paper studies a constructive algorithm that targets causal inference functionals and approximates the Gateaux derivatives of statistical functionals via finite differences. When the probability distribution is not known a priori and must be estimated from data, the estimated distribution yields empirical Gateaux derivatives, so the authors further examine the relationship between empirical, numerical, and analytical Gateaux derivatives. In a case study of counterfactual mean estimation, they demonstrate the exact relationship between finite differences and analytical Gateaux derivatives.

A company account, @www.ar-tiste.xyz (hereinafter "ar-tiste"), which provides Bayesian network software and services, retweeted the paper and commented that Professor Michael Jordan uses Bayesian networks instead of SCMs to do counterfactuals, implying that the third rung (counterfactuals) can be climbed without SCMs.

SCM stands for the Structural Causal Models proposed by Judea Pearl. An SCM consists of a graphical model representing causal knowledge, counterfactual and interventional logic, and structural equations, and it is often used to answer counterfactual questions.
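To make the mechanics concrete, here is a minimal sketch of the three-step counterfactual procedure an SCM supports (abduction, action, prediction). The toy model with binary variables is hypothetical, not taken from the paper or the Twitter thread:

```python
def counterfactual(x_obs, y_obs, x_cf):
    """Toy SCM with binary variables: X := U_X,  Y := X AND U_Y.

    Given the observed pair (x_obs, y_obs), compute what Y would have
    been had X been x_cf, via abduction-action-prediction.
    """
    # Step 1 -- Abduction: infer the exogenous noise from the evidence.
    if x_obs == 0:
        # With X = 0, Y = 0 regardless of U_Y, so U_Y is not identified.
        raise ValueError("U_Y is not identified from X = 0")
    u_y = y_obs  # With X = 1, the equation Y = X AND U_Y gives U_Y = Y.

    # Step 2 -- Action: replace the equation for X with X := x_cf
    # (this is the do-operator applied to the model).
    x = x_cf

    # Step 3 -- Prediction: propagate through the modified model.
    return x & u_y


# We observed X = 1, Y = 0; had X been 0, Y would still have been 0.
print(counterfactual(1, 0, 0))  # -> 0
```

The abduction step is what makes this a rung-3 computation: the factual evidence pins down the latent noise before the intervention is applied.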

Pearl replied that anyone claiming to compute counterfactuals with Bayesian networks (a Rung-2 tool) should be questioned, citing as evidence pages 35-36 of his book *Causality: Models, Reasoning, and Inference*. The Jordan paper defines its "counterfactual" as E[Y(1)], a Rung-2 quantity, rather than a true Rung-3 counterfactual such as E[Y(1)|Y].
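The gap Pearl is pointing at can be shown with a toy potential-outcomes model (hypothetical, not from the paper): let U be a fair coin and define Y(t) = t XOR U. The interventional mean E[Y(1)] and a counterfactual conditional such as E[Y(1) | T = 0, Y = 1] then give different answers:

```python
# Toy model (hypothetical): U ~ Bernoulli(0.5), potential outcomes Y(t) = t XOR U.

def interventional_mean():
    # Rung 2: E[Y(1)] averages Y(1) over the two equally likely values of U.
    return sum(1 ^ u for u in (0, 1)) / 2

def counterfactual_conditional():
    # Rung 3: E[Y(1) | T = 0, Y = 1].
    # Abduction: T = 0 and Y = 1 imply U = 1, since Y = 0 XOR U = U.
    u = 1
    # Prediction under do(T = 1):
    return 1 ^ u

print(interventional_mean())         # 0.5
print(counterfactual_conditional())  # 0
```

The rung-3 quantity conditions on the factual outcome of the same unit, which is exactly what E[Y(1)] alone cannot express.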

The evidence he cited was an image of those two pages from the book.

A Russian researcher who studies causality also weighed in, noting that counterfactuals involve questions like "how likely would the outcome have been different if the treatment had been different?" By that standard, the paper is not doing counterfactual computation.

ar-tiste responded that he was not claiming the SCM is wrong, but rather that the SCM is a special case: if one takes a functional Taylor series of the full probability distribution of a Bayesian network, the dominant term of the expansion is the SCM. In his view, that is exactly what Jordan's paper does, since Gateaux derivatives are functional derivatives; the paper is not about variational inference (VI).

He went on to argue that "potential outcomes" (PO) are counterfactuals that do not use SCMs, while Pearl and Bareinboim claim that counterfactuals can only be done with SCMs. So either the paper is wrong, or Pearl is.

This statement drew strong opposition from Pearl, who said he never claimed that "counterfactual computation can only be done using SCMs." His position is: "If you want to understand what you are doing, or to defend or test your assumptions, then you need to know that counterfactuals originate from the SCM."

Pearl cited a blog post he wrote back in 2014, "On the First Law of Causal Inference," in which he argued that the modern tools of causal analysis are not new but are organically inherited from the SEM framework, so SEM research can be drawn on to make causal analysis more effective.

Blog address: https://ift.tt/2kOgDGB

SEM stands for "Structural Equation Model," a statistical tool for multivariate data analysis. In causal research, adding structural equations on top of a Bayesian network yields an SCM. Pearl calls the structural definition of counterfactuals the first law of causal inference.
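That first law has a compact statement in Pearl's writing (reproduced here for reference): the counterfactual $Y_x(u)$ is defined as the solution for $Y$ in the submodel $M_x$, obtained from the model $M$ by replacing the equation for $X$ with the constant $x$:

```latex
Y_x(u) \;\triangleq\; Y_{M_x}(u)
```

Every other counterfactual and interventional quantity is then derived from this single structural definition.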

At this point Angela Zhou, one of the paper's authors, finally responded to Pearl: "Yes, this paper focuses only on Rung 2 (interventional effects, interventional means) and makes no claims at all about Rung 3 (counterfactuals)."

However, ar-tiste did not give up after seeing this response: he searched the paper for the word "counterfactual" and found it mentioned 25 times in total, so the phrase "at all," he argued, was not accurate...

At this point another netizen stepped in to explain that in the PO framework there is no distinction between interventional and counterfactual quantities, so even though the word "counterfactual" appears many times in the text, the paper itself may not involve Rung 3.

In ar-tiste's view, this seems to imply that the SCM and PO camps (the two main causal frameworks) define "counterfactual" differently, even though both define Y(0) and Y(1) as counterfactual variables.

Pearl then gave his own view of "counterfactuals": even people doing Rung-1 estimation claim to be working on counterfactuals, because the term sounds more modern and forward-looking. That is why he appeals to people to use the word "counterfactual" only for Rung-3 tasks.

Pearl is clearly very cautious about how the term "counterfactual" is used. When a netizen unfamiliar with the context commented that Jordan's paper was "excellent causal modeling, very 1980s style," Pearl did not hesitate to point out that he saw no trace of the 1980s in it: the paper contains no d-separation and no graphoids, which makes it incompatible with the graphical models of the 1980s.

Causal inference research is indeed very hot right now, and there is plenty of blind trend-chasing, but Pearl believes that the habit of calling everything produced by randomized controlled trials "counterfactual" is a major source of misunderstanding.

In the end, Angela Zhou offered no further explanation, responding only that the revised version of the paper would rename the quantity the "interventional mean" as a clarification.

Pearl, however, stayed rigorous to the end, arguing that even the definition of the "interventional mean" in Example 1 of the paper was unclear.

It seems that even in causal inference, a discipline that looks less "hard" than others, researchers need to maintain sufficient rigor.


This article is reproduced from: https://www.leiphone.com/category/academic/kEwFqgdhPzhQujC0.html

This site is for inclusion only, and the copyright belongs to the original author.