Two methods have been proposed previously for exploiting causal independence to speed up inference in general BNs [26,15]. They both use causal independence to transform the topology of a BN. After the transformation, conventional algorithms such as CTP or VE are used for inference.
We shall illustrate those methods by using the BN in Figure 3(1). Let be the base combination operator of e, let denote the contribution of to e, and let be the contributing factor of to e.
The parent-divorcing method [26] transforms the BN into the one in Figure 3(3). After the transformation, all variables are regular and the new variables and have the same possible values as e. The conditional probabilities of and are given by
The conditional probability of e is given by
for any value of e, of , and of . We shall use PD to refer to the algorithm that first performs the parent-divorcing transformation and then uses VE for inference.
The temporal transformation by Heckerman [15] converts the BN into the one in Figure 3(4). Again all variables are regular after the transformation and the newly introduced variables have the same possible values as e. The conditional probability of is given by
for each value of . For i=2, 3, 4, the conditional probability of ( stands for e) is given by
for each possible value of and of . We shall use TT to refer to the algorithm that first performs the temporal transformation and then uses VE for inference.
The factorization represented by the original BN includes a factor that contain five variables, while factors in the transformed BNs contain no more than three variables. In general, the transformations lead to finer-grain factorizations of joint probabilities. This is why PD and TT can be more efficient than VE .
However, PD and TT are not as efficient as VE1 . We shall provide empirical evidence in support of this claim in the next section. Here we illustrate it by considering calculating . Doing this in Figure 3(3) by VE using the elimination ordering , , , , , and would require numerical multiplications and 18 numerical additions. Doing the same in Figure 3(4) using the elimination ordering , , , , , , would require numerical multiplications and 20 numerical additions. In both cases, more numerical multiplications and additions are performed than VE1 . The differences are more drastic in complex networks, as will be shown in the next section.
Figure: The result of Applying Oleson's method to the BN of Figure 1.
The saving for this example may seem marginal. It may be reasonable to conjecture that, as Oleson's method produces families with three elements, this marginal saving is all that we can hope for; producing factors of two elements rather than cliques of three elements. However, interacting causal variables can make the difference more extreme. For example, if we were to use Oleson's method for the BN of Figure 1, we produce the network of Figure 4. Any triangulation for this network has at least one clique with four or more elements, yet VE1 does not produce a factor with more than two elements.
Note that as far as computing in the networks shown in Figure 3 is concerned, VE1 is more efficient than PD, PD is more efficient than TT, and TT is more efficient than VE . Our experiments show that this is true in general.