Third edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2023 is now available (including the full text).
4.7 Variable Elimination
Arc consistency simplifies the network by removing values of variables. A complementary method is variable elimination (VE), which simplifies the network by removing variables.
The idea of VE is to remove the variables one by one. When removing a variable X, VE constructs a new constraint on some of the remaining variables reflecting the effects of X on all of the other variables. This new constraint replaces all of the constraints that involve X, forming a reduced network that does not involve X. The new constraint is constructed so that any solution to the reduced CSP can be extended to a solution of the larger CSP that contains X. In addition to creating the new constraint, VE provides a way to construct a solution to the CSP that contains X from a solution to the reduced CSP.
The following algorithm is described using the relational algebra calculations of join and project. When eliminating X, the influence of X on the remaining variables is through the constraint relations that involve X. First, the algorithm collects all of the constraints that involve X. Let the join of all of these relations be the relation rX(X,Y), where Y is the set of other variables in the scope of rX. It then projects rX onto Y; this relation can replace all of the relations that involve X. It now has a reduced CSP that involves one less variable, which it can solve recursively. Once it has a solution for the reduced CSP, it can extend that solution to a solution for the original CSP by joining the solution with rX.
When only one variable is left, it returns those domain elements that are consistent with the constraints on this variable.
A B 1 2 1 3 1 4 2 3 2 4 3 4
B C 1 2 1 3 1 4 2 3 2 4 3 4 =
A B C 1 2 3 1 2 4 1 3 4 2 3 4
To get the relation on A and C induced by B, project this join onto A and C, which gives
A C 1 3 1 4 2 4
This relation on A and C contains all of the information about the constraints on B that affect the solutions of the rest of the network.
The original constraints on B are replaced with the new constraint on A and C. It then solves the rest of the network, which is now simpler because it does not involve the variable B. It must remember the joined relation to construct a solution that involves B from a solution to the reduced network.
Figure 4.5 gives a recursive algorithm for variable elimination, VE_CSP, to find all solutions for a CSP.
2: Inputs
3: V: a set of variables
4: C: a set of constraints on V Output
5: a relation containing all of the consistent variable assignments
6: if (V contains just one element) then
7: return the join of all the relations in C
8: else
9: select variable X to eliminate
10: V'←V \ {X}
11: CX←{T in C: T involves X}
12: let R be the join of all of the constraints in CX
13: let R' be R projected onto the variables other than X
14: S←VE_CSP(V',(C \ CX)∪{R'})
15: return RS
The base case of the recursion occurs when only one variable is left. In this case (line 7), a solution exists if and only if there are rows in the join of the final relations. Note that these relations are all relations on a single variable, and so they are the sets of legal values for this variable. The join of these relations is the intersection of these sets.
In the non-base case, a variable X is selected for elimination (line 9). Which variable is selected does not affect the correctness of the algorithm, but it may affect the efficiency. To eliminate variable X, the algorithm propagates the effect of X onto those variables that X is directly related to. This is achieved by joining all of the relations in which X is involved (line 12) and then projecting X out of the resulting relation (line 13). Thus, a simplified problem (with one less variable) has been created that can be solved recursively. To get the possible values for X, the algorithm joins the solution of the simplified problem with the relation R that defines the effect of X.
If you only wanted to find one solution, instead of returning RS, the algorithm can return one element of the join. No matter which element it returns, that element is guaranteed to be part of a solution. If any value of R in this algorithm contains no tuples, there are no solutions.
The efficiency of the VE algorithm depends on the order in which variables are selected. The intermediate structure - which variables the intermediate relations are over - depends not on the actual content of relations but only on the graph structure of the constraint network. The efficiency of this algorithm can be determined by considering the graphical structure. In general, VE is efficient when the constraint network is sparse. The number of variables in the largest relation returned for a particular variable ordering is called the treewidth of the graph for that variable ordering. The treewidth of a graph is the minimum treewidth for any ordering. The complexity of VE is exponential in treewidth and linear in the number of variables. This can be compared to searching (see Section 4.4), which is exponential in the number of variables.
Finding an elimination ordering that results in the smallest treewidth is NP-hard. However, some good heuristics exist. The two most common are
- min-factor: at each stage, select the variable that results in the smallest relation.
- minimum deficiency or minimum fill: at each stage, select the variable that adds the smallest number of arcs to the remaining constraint network. The deficiency of a variable X is the number of pairs of variables that are in a relationship with X that are not in a relationship with each other. The intuition is that it is okay to remove a variable that results in a large relation as long as it does not make the network more complicated.
The minimum deficiency has usually been found empirically to give a smaller treewidth than min-factor, but it is more difficult to compute.
VE can also be combined with arc consistency; whenever VE removes a variable, arc consistency can be used to further simplify the problem. This approach can result in smaller intermediate tables.