5.7 Causal Models
A primitive atom is an atom that is stated as an atomic clause when it is true. A derived atom is one that uses rules to define when it is true. Typically the designer writes axioms for the derived atoms and then expects a user to specify which primitive atoms are true. Thus, the derived atoms will be inferred as necessary from the primitive atoms and other atoms that can be derived.
The designer of an agent must make many decisions when designing a knowledge base for a domain. For example, consider two propositions, a and b, both of which are true. There are many choices of how to write this. A designer could specify both a and b as atomic clauses, treating both as primitive. A designer could have a as primitive and b as derived, stating a as an atomic clause and giving the rule b ←a. Alternatively, the designer could specify the atomic clause b and the rule a ←b, treating b as primitive and a as derived. These representations are logically equivalent; they cannot be distinguished logically. However, they have different effects when the knowledge base is changed. Suppose a were no longer true for some reason. In the first and third representations, b would still be true, and in the second representation b would no longer be true.
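To see the difference concretely, the following is a minimal sketch in Python (our own encoding, not code from the book) that computes the consequences of a propositional definite-clause knowledge base bottom-up, and compares the three representations after the clause for a is retracted:

def consequences(atomic_clauses, rules):
    """All atoms derivable from the atomic clauses and rules.
    rules maps each derived atom to a list of bodies (sets of atoms)."""
    known = set(atomic_clauses)
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            if head not in known and any(body <= known for body in bodies):
                known.add(head)
                changed = True
    return known

# Three logically equivalent ways to state that a and b are both true:
cases = {
    "both primitive":         ({"a", "b"}, {}),
    "a primitive, b derived": ({"a"}, {"b": [{"a"}]}),   # b <- a
    "b primitive, a derived": ({"b"}, {"a": [{"b"}]}),   # a <- b
}
for name, (atoms, rules) in cases.items():
    # retract the clause for a, however a happened to be stated
    atoms2 = atoms - {"a"}
    rules2 = {h: bs for h, bs in rules.items() if h != "a"}
    print(name, "->", sorted(consequences(atoms2, rules2)))
# b survives in the first and third representations, but not in the second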
A causal model, or a model of causality, is a representation of a domain that predicts the results of interventions. An intervention is an action that forces a variable to have a particular value; that is, it changes the value in some way other than manipulating other variables in the model.
To predict the effect of interventions, a causal model represents how a cause implies its effect. When the cause is changed, the effect should change. An evidential model represents a domain in the other direction, from effect to cause. Note that we do not assume that there is "the cause" of an effect; rather, there may be many propositions that together make the effect true.
For example, in the electrical domain, where light l2 is lit when switch s3 is up and wire w3 is live, a causal model contains the rule

lit_l2 ←up_s3 ∧live_w3.

Alternatively, we could specify in the evidential direction:

up_s3 ←lit_l2.
live_w3 ←lit_l2.

These are all statements that are true of the domain.
Suppose that wire w3 was live and someone put switch s3 up; we would expect that l2 would become lit. However, if someone were to make l2 lit by some mechanism outside of the model (and not by flipping the switch), we would not expect the switch to go up as a side effect.
Consider the relationship between two switches, s1 and s2, and a light l1 that is lit whenever both switches are up or both switches are down. This relationship can be written as

lit_l1 ↔(up_s1 ↔up_s2).     (5.33)
This formula is symmetric between the three propositions; it is true if and only if an odd number of the propositions are true. However, in the world, the relationship between these propositions is not symmetric. Suppose all three atoms were true in some state. Putting s1 down does not make s2 go down in order to keep l1 lit. Instead, putting s1 down makes lit_l1 false, while up_s2 remains true. Thus, to predict the result of interventions, we require more than proposition (5.33).
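Because (5.33) is just a parity condition, it can be checked by enumerating all eight truth assignments. The following quick check in Python is our own sketch, not part of the text:

from itertools import product

# Verify that lit_l1 <-> (up_s1 <-> up_s2) holds exactly on the
# assignments where an odd number of the three propositions are true.
for lit_l1, up_s1, up_s2 in product([False, True], repeat=3):
    formula = (lit_l1 == (up_s1 == up_s2))
    odd_parity = (lit_l1 + up_s1 + up_s2) % 2 == 1
    assert formula == odd_parity
print("(5.33) holds on exactly the odd-parity assignments")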
A causal model is
lit_l1 ←up_s1 ∧up_s2.
lit_l1 ←∼up_s1 ∧∼up_s2.
The completion of this is equivalent to proposition (5.33); however, it makes reasonable predictions when one of the values is changed.
An evidential model is
up_s1 ←lit_l1 ∧up_s2.
up_s1 ←∼lit_l1 ∧∼up_s2.
This can be used to answer questions about whether s1 is up based on the position of s2 and whether l1 is lit. Its completion is also equivalent to formula (5.33). However, it does not accurately predict the effect of interventions.
A causal model consists of
- a set of background variables, sometimes called exogenous variables, which are determined by factors outside of the model;
- a set of endogenous variables, which are determined as part of the model; and
- a set of functions, one for each endogenous variable, each of which specifies how the endogenous variable can be determined from other endogenous variables and background variables. The function for a variable X is called the causal mechanism for X. The entire set of functions must have a unique solution for each assignment of values to the background variables.
When the variables are propositions, the function for a proposition can be specified as a set of clauses with the proposition as their head (under the complete knowledge assumption). One way to ensure a unique solution is for the knowledge base to be acyclic.
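As an illustration, the following is a minimal sketch in Python (the encoding and names are ours, not the book's) of a propositional causal model in which each causal mechanism is a set of clauses. The unique solution is computed by evaluating each endogenous atom once all atoms in its bodies have been determined; the loop would not terminate if the clauses were cyclic:

def solve(background, mechanisms):
    """background: dict from background atoms to True/False.
    mechanisms: dict from each endogenous atom to a list of bodies,
    where a body is a list of (atom, required_value) pairs.  Under the
    complete knowledge assumption, an atom is true iff some body holds."""
    values = dict(background)
    while len(values) < len(background) + len(mechanisms):
        for head, bodies in mechanisms.items():
            mentioned = {a for body in bodies for a, _ in body}
            if head not in values and mentioned.issubset(values):
                values[head] = any(all(values[a] == want for a, want in body)
                                   for body in bodies)
    return values

# The causal model for the two-way switch; its completion is formula (5.33):
causal = {"lit_l1": [[("up_s1", True),  ("up_s2", True)],
                     [("up_s1", False), ("up_s2", False)]]}
print(solve({"up_s1": True, "up_s2": True}, causal))
# {'up_s1': True, 'up_s2': True, 'lit_l1': True}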
An intervention is an action to force a variable X to have a particular value v by some mechanism other than changing one of the other variables in the model. The effect of an intervention can be obtained by replacing the causal mechanism for X by X=v. To intervene to force a proposition p to be true involves replacing the clauses for p with the atomic clause p. To intervene to force a proposition p to be false involves removing the clauses for p.
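Continuing the sketch above, intervening can be implemented by replacing the clauses for the forced atom (the function name intervene is our own). Applied to the causal and evidential models of the switch example, forcing s1 down gives the right prediction from the causal model and the wrong one from the evidential model:

def intervene(mechanisms, atom, value):
    """Force atom to value: a single empty body (always satisfied) plays
    the role of the atomic clause; no clauses at all makes the atom false."""
    forced = dict(mechanisms)
    forced[atom] = [[]] if value else []
    return forced

evidential = {"up_s1": [[("lit_l1", True),  ("up_s2", True)],
                        [("lit_l1", False), ("up_s2", False)]]}

# Start from the state where all three atoms are true, and put s1 down.
# In the causal model, up_s1 is a background variable: just force its value.
print(solve({"up_s1": False, "up_s2": True}, causal))
# lit_l1 becomes False and up_s2 stays True, as expected

# In the evidential model, up_s1 is endogenous: replace its mechanism.
print(solve({"lit_l1": True, "up_s2": True},
            intervene(evidential, "up_s1", False)))
# lit_l1 wrongly stays True, so the intervention is mispredicted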
If the values of the background variables are not known, the background variables can be represented by assumables. An observation can then be implemented in two stages:
- abduction to explain the observation in terms of the background variables and
- prediction to see what follows from the explanations.
Intuitively, abduction tells us what the world is like, given the observations. The prediction tells us the consequence of the action, given how the world is.
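Continuing the Python sketch (the function explanations and its encoding are our own), abduction enumerates the assignments to the background variables that are consistent with the observation, and prediction then computes the effect of an intervention under each explanation:

from itertools import product

def explanations(background_vars, mechanisms, observation):
    """Background assignments whose unique solution satisfies the observation."""
    for vals in product([False, True], repeat=len(background_vars)):
        assignment = dict(zip(background_vars, vals))
        world = solve(assignment, mechanisms)
        if all(world[a] == want for a, want in observation.items()):
            yield assignment

# Observe that l1 is lit, then predict the effect of putting s1 down.
for expl in explanations(["up_s1", "up_s2"], causal, {"lit_l1": True}):
    print(expl, "->", solve(dict(expl, up_s1=False), causal))
# {'up_s1': False, 'up_s2': False} -> lit_l1 stays True (both switches down)
# {'up_s1': True, 'up_s2': True}   -> lit_l1 becomes False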