“What must nature, including man, be like in order that science be possible at all? … What must the world be like in order that man may know it?”, the philosopher of science Thomas Kuhn (1962/2012, p. 172) asks in the final paragraph of his book, The Structure of Scientific Revolutions.
A cognitive corollary of Kuhn’s question is, “What must we humans assume the world be like, in order that we may know it?” James Woodward’s work on invariance has opened up several lines of inquiry into an answer to this question. I, for one, have benefitted in my work from his perspective. His key proposal, as I see it, is: we humans, and perhaps any intelligent cognitive system, biological or artificial, that supports adaptive functioning in this world, must assume that there are invariant cause-and-effect relations in the world, amidst things and events that are inexorably changing. If a causal relation changes from the moment one induces it to the moment one applies it, that knowledge would be useless. Causation With A Human Face (CHF) brings together theories and evidence from a truly wide array of disciplines, both descriptive and normative, in support of his proposal. As old as the concept of invariance may be, the interest that Woodward’s work attracted to it, and the insights and findings in his and others’ work on the topic, are new.
Woodward’s proposal shifts the goal of causal-knowledge formulation from the seeking of truth, which is dichotomous and objective, to the attainment of greater generalizability, which is a matter of degree and subjective in the sense of being a constraint imposed by the reasoner’s need to best achieve desired outcomes. This shift does not merely mean that the reasoner pays attention to generalizability. The psychological literature extensively reviewed and closely examined in CHF shows how this subjective goal shapes both the way humans acquire causal knowledge and the way they use that knowledge. As the psychological work covered in Chapters 4 and 7 of CHF shows, this shift of the goal—when incorporated into a computational model in terms of an assumption of the invariance of the causal strength of a candidate cause to influence a target binary outcome (e.g., having an allergic reaction or not) across contexts differing in background causes—leads to causal conclusions about the world that differ from those following from models that omit that assumption. The sameness of causal strength is taken as an indication that the causal mechanism operates the same way. Results from discriminating experiments testing intuitive causal judgments are in accord with the implicit adoption of that assumption. Even preschool children seem to make that assumption. (Inference regarding the causes of a binary outcome best reveals the difference between models that do and do not make the assumption.)
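To make the invariance assumption concrete, here is a minimal sketch in Python. It assumes the noisy-OR formalization associated with the causal power theory (Cheng, 1997), in which a generative candidate cause with strength q raises the probability of a binary outcome from a background rate b to b + q(1 - b). The formula, the function names, and the numbers are illustrative assumptions, not something spelled out in CHF or in this commentary.

```python
def p_effect_with_cause(base_rate: float, power: float) -> float:
    """Noisy-OR (assumed): the outcome occurs if the background causes
    or the candidate cause independently produce it."""
    return base_rate + power * (1.0 - base_rate)


def delta_p(p_with: float, p_without: float) -> float:
    """Raw difference in outcome probability with vs. without the cause."""
    return p_with - p_without


def causal_power(p_with: float, p_without: float) -> float:
    """Generative causal power estimate: delta_p / (1 - p_without)."""
    if p_without >= 1.0:
        raise ValueError("power is undefined when the outcome always occurs")
    return (p_with - p_without) / (1.0 - p_without)


Q = 0.5                      # hypothetical invariant causal strength
contexts = [0.0, 0.2, 0.6]   # hypothetical background rates of the outcome

rows = []
for b in contexts:
    p1 = p_effect_with_cause(b, Q)
    rows.append((b, p1, delta_p(p1, b), causal_power(p1, b)))

for b, p1, dp, q in rows:
    print(f"background={b:.1f}  P(e|c)={p1:.2f}  deltaP={dp:.2f}  power={q:.2f}")
```

Under these assumptions the raw probability difference shrinks as the background rate rises, while the estimated causal power stays at the same value in every context. That is the sense in which sameness of causal strength, rather than sameness of the observed difference, tracks a mechanism that operates the same way.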
Because causes are often not invariant across contexts differing in background causes, assuming invariance across contexts may seem simplistic or wishful. Not so, because candidate causes are not fixed concepts. They are representations of nature that are ours to formulate. When the observed outcome deviates notably from the outcome expected under the invariance assumption, the deviation provides a signal to the reasoner indicating a potential need to revise current empirical knowledge toward greater causal invariance.
The broad set of arguments and evidence brought together in CHF suggests that evolution had a reason for human causal reasoning to be so pervasively motivated and enabled by the concept of invariance. The crucial role played by causal invariance in the construction of generalizable causal knowledge raises some questions: are current normative statistical models rational for the goal of obtaining generalizable causal knowledge? Would they benefit from making the invariance assumption that humans make? Statistical models such as logistic regression, often applied as a default to analyze data involving a binary outcome (e.g., whether a tumor is malignant or not, whether a patient is pregnant or not), do not make the causal invariance assumption. Likewise, would artificial intelligence methods benefit from making that assumption, thus enabling them to revise their representations?
The typical flow of insight on rationality has been from explicit formal analyses to everyday intuitive thinking. Introductory psychology courses teach the many ways intuitive reasoning is riddled with biases. Work on invariance may show that insight can flow in the opposite direction. My guess is that which direction insight flows depends on whether rationality in a cognitive process has been crucial to the survival of the species during evolution. The invariance assumption is both rational and crucial.
I thank Patricia Cheng for her commentary. Her research has long been an inspiration to me. It brings together normative insight in the form of “rational” models of causal reasoning with detailed empirical work, often showing that people conform, perhaps to a surprising degree, to what these models take to be rationally appropriate. As she notes, there is a substantial tendency within psychology to emphasize the biases and fallacies to which human reasoning is subject. It is thus an important corrective to also emphasize the other side of the coin: the extent to which human causal reasoning, and perhaps other forms of reasoning, are rational, in the sense of being well-adapted to goals that we have.
As described in CHF and previous posts, one central theme in Patricia’s research has been the role that invariance assumptions play (and should play) in human causal cognition. One place where the normative role of invariance has real teeth is in disciplines like statistics and AI. Patricia refers to this briefly in her commentary, but readers should know that her claims there are backed up by a substantial body of theoretical and empirical work. One illustration is provided by the standard treatment of the notion of (causal) interaction in statistics. This is typically conceptualized in terms of a failure of additivity: two variables X and Y interact with respect to an effect Z if the function linking X and Y to Z does not take an additive form, that is, is not of the form (1) Z = aX + bY. Additivity, or the absence of interaction, is one kind of invariance assumption: in (1) it corresponds to the idea that, e.g., X has the same effect on Z regardless of the level of Y. Interaction thus corresponds to one kind of failure of invariance. As Patricia notes, when such a failure occurs it is sometimes possible to restore invariance in the relation of interest by changing variables or by adding terms to that relation (e.g., adding a multiplicative term to (1)). This illustrates Patricia’s idea that a failure of invariance can serve as a sort of signal that something needs to be changed.
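The point about additivity can be checked with a few lines of Python. This is only a sketch: the coefficient values and the function names are hypothetical, chosen to show what "X has the same effect on Z regardless of the level of Y" amounts to for the additive form (1), and how a multiplicative term changes that.

```python
def additive(x: float, y: float, a: float = 2.0, b: float = 3.0) -> float:
    """The additive form (1): Z = aX + bY."""
    return a * x + b * y


def with_interaction(x: float, y: float, a: float = 2.0, b: float = 3.0,
                     c: float = 1.5) -> float:
    """Form (1) extended with a multiplicative term: Z = aX + bY + cXY."""
    return a * x + b * y + c * x * y


def effect_of_x(model, y: float, x_from: float = 0.0, x_to: float = 1.0) -> float:
    """Change in Z when X moves from x_from to x_to while Y is held fixed."""
    return model(x_to, y) - model(x_from, y)


y_levels = [0.0, 1.0, 4.0]   # hypothetical levels of Y
additive_effects = [effect_of_x(additive, y) for y in y_levels]
interactive_effects = [effect_of_x(with_interaction, y) for y in y_levels]

print("additive:   ", additive_effects)     # identical at every level of Y
print("interactive:", interactive_effects)  # grows with the level of Y
```

In the second model the effect of X depends on Y, so the simple relation between X and Z is no longer invariant. The extended relation, with its fixed coefficients a, b and c, is invariant again, which is one way of reading the remark that adding a term can restore invariance.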
The conceptualization of interaction in terms of a failure of additivity is arguably sensible when the variables involved are continuous, but as Patricia observes it is clearly defective when the variables are binary. In such cases we need a reconceptualization of what the presence or absence of interaction involves, one that does not rely on considerations of additivity. The general idea that interaction involves a failure of invariance (and non-interaction the satisfaction of an invariance requirement) provides a guide to the formulation of the right sort of conceptualization of interaction for binary variables. In Liljeholm and Cheng, 2007 (see also Cheng, 1997 and, for a somewhat similar argument, Woodward, 1990), Patricia provides such a formulation and then shows that ordinary subjects reason in accord with it. Moreover, they do so successfully, in the sense that such reasoning allows for correct extrapolation to new situations in a way the additivity-based conceptualization does not. As she notes, one clear implication of this work is that the usual understanding of interaction in terms of a failure of additivity is normatively inappropriate when applied to binary variables. This is thus a case in which, to put things in terms of the title of Patricia’s post, statistics should adopt a conceptualization motivated by invariance considerations. In more recent work, Patricia has applied the same general line of reasoning to logistic regression models, which in their simplest form use a logistic function to model causal relations with a binary dependent variable. As she suggests in her commentary, logistic models do not appropriately incorporate invariance assumptions and, despite their widespread use, are consequently a problematic way of representing causal relations involving binary dependent variables.
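One way to see how a logistic model and an invariance-based model can come apart is the following Python sketch. It is not a reconstruction of Patricia’s analysis. It assumes that invariance for a binary outcome is formalized as constant causal power under a noisy-OR rule, P(e|c) = b + q(1 - b), in the spirit of Cheng (1997), and it treats the logistic model as assigning the cause a constant shift in log-odds. All names and numbers are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def noisy_or(base_rate: float, power: float) -> float:
    """Assumed invariance model: constant causal power across contexts."""
    return base_rate + power * (1.0 - base_rate)


def logistic_shift(base_rate: float, beta: float) -> float:
    """Logistic model: the cause adds a constant beta to the log-odds."""
    return inv_logit(logit(base_rate) + beta)


POWER = 0.5                   # hypothetical invariant causal power
base_rates = [0.1, 0.5, 0.9]  # hypothetical background rates

# Log-odds coefficient a logistic model would need in each context
# in order to reproduce the noisy-OR probabilities.
implied_betas = [logit(noisy_or(b, POWER)) - logit(b) for b in base_rates]

# Learn beta where the background rate is 0.5, then extrapolate to 0.1.
beta_learned = implied_betas[1]
logistic_prediction = logistic_shift(0.1, beta_learned)
noisy_or_prediction = noisy_or(0.1, POWER)

print("implied betas:", [round(x, 3) for x in implied_betas])
print("extrapolation to background 0.1:",
      round(logistic_prediction, 3), "vs", round(noisy_or_prediction, 3))
```

Under these assumptions a cause whose power is the same in every context requires a different logistic coefficient in each one, so a coefficient learned in one context gives a different extrapolation from the one the invariance model gives. Which extrapolation is correct is an empirical and normative question of the kind addressed in the work cited above. The sketch only shows that the two models disagree.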
As Patricia also suggests in passing, it is arguable that similar considerations apply to AI and machine learning, at least insofar as these have to do with causal learning. Suppose it is true that what many current algorithms (such as deep learning algorithms) learn are, at bottom, associational or correlational relations. (This will seem a controversial claim to some, but bear with me.) Then, insofar as genuinely causal information needs to incorporate invariance-based considerations, this represents a limitation on the ability of such algorithms to learn and represent causal relationships. It is thus not surprising that (at least as I see it) successful causal inference procedures such as those described in Spirtes et al., 2000 go beyond the representation of associational relations, incorporating additional layers of structure such as those represented in causally interpreted directed graphs.
The suggestion that ordinary causal reasoning may embody assumptions (e.g., regarding invariance) that are not only normatively appropriate but may also have beneficial effects when incorporated into statistics and AI may seem surprising. But at least in the case of AI (and computer science more generally) this is an idea with a substantial history. For example, attempts to develop computer programs capable of high-quality visual recognition have drawn inspiration from our knowledge of how the human visual system works. Empirical results concerning human causal reasoning may have a similarly fruitful impact on causal inference problems in statistics and AI.
Lest this last claim be misunderstood, let me reiterate a point made in CHF and previous posts. The argument is not that certain conceptualizations and patterns of reasoning are normatively appropriate simply because they are what humans do. The normative justification comes from the fact that the conceptualizations etc. are conducive to goals that we have, such as the generalizability of causal knowledge to new situations. However, it may not be obvious which conceptualizations and reasoning patterns are best justified in this way. For example, that a conceptualization of interaction for binary variables in terms of non-additivity is inadequate is not something that has been recognized by many statisticians. In such cases, finding that a reasoning pattern is present in human causal cognition (particularly a reasoning pattern contrary to conventional normative understanding) can lead us to ask whether there is some normative, means/ends justification (perhaps not previously recognized) for its presence. Sometimes the answer to this question is “yes”.
Cheng, P. (1997) “From Covariation to Causation: A Causal Power Theory” Psychological Review 104: 367-405.
Liljeholm, M. and Cheng, P. (2007) “When is a Cause the ‘Same’? Coherent Generalization Across Contexts” Psychological Science 18: 1014-1021.
Spirtes, P., Glymour, C. and Scheines, R. (2000) Causation, Prediction and Search. Second Edition. Cambridge: MIT Press.
Woodward, J. (1990) “Supervenience and Singular Causal Claims” In Knowles, D. (ed.) Explanation and its Limits. Cambridge: Cambridge University Press. 211-46.