GraphCuts UGM Demo
There are no known exact algorithms for decoding/inference/sampling in UGMs that don't in general require a runtime that is exponential in the treewidth of the graph. However, we can sometimes still do these operations efficiently by considering a restricted class of potentials. In this demo, we consider the case of binary variables with attractive potentials. In this scenario, it is still possible to find the optimal decoding in polynomial time, independent of the graph structure, by re-formulating the problem as a minimum graph-cut problem.
By default, the UGM_Decode_GraphCut function searches for the mex wrapper to the maxflow code. If this software is not found on the path, then UGM_Decode_GraphCut uses a simple implementation of the Ford-Fulkerson method. Note that the Ford-Fulkerson implementation is very slow, so it recommended to install the maxflow software for larger graphs.
Binary Image Denoising Problem
We consider a binary image denoising problem, where we are
given a binary image that has been corrupted by independent Gaussian noise, and we want to try
to recover the original image.
For example, the image below is known as the "famous X":
It is a 32 by 32 binary image. If we corrupt it with independent Gaussian noise, it will look
something like this:
Given the noisy X, our task is to assign binary states to the pixels to try and
recover the original X (or whatever other binary image has been corrupted by
noise).
The first approach we might try to solve this problem is to threshold the pixel
values. Decoding by independently thresholding the pixels gives this result:
Independently thresholding the pixels doesn't take into account that
neighboring pixels are likely to take the same value. One way to model this
dependency is with a UGM.
Representing the binary denoising problem with UGM
In our UGM, each node will represent a pixel, and we will place an edge between
adjacent pixels. In particular, a pixel will be connected to the pixels above,
below, to the left, and to the right. This graph structure forms a 2D lattice,
a type of graph structure that has been studied not only for images, but has a long
history in statistical physics
(e.g. the Ising model).
We can make the edgeStruct for a 2D lattice-structured model as follows:
load X.mat % Load the image
X = X + randn(size(X))/2; % Make a noisy version of it
[nRows,nCols] = size(X);
nNodes = nRows*nCols;
nStates = 2;
adj = sparse(nNodes,nNodes);
% Add Down Edges
ind = 1:nNodes;
exclude = sub2ind([nRows nCols],repmat(nRows,[1 nCols]),1:nCols); % No Down edge for last row
ind = setdiff(ind,exclude);
adj(sub2ind([nNodes nNodes],ind,ind+1)) = 1;
% Add Right Edges
ind = 1:nNodes;
exclude = sub2ind([nRows nCols],1:nRows,repmat(nCols,[1 nRows])); % No right edge for last column
ind = setdiff(ind,exclude);
adj(sub2ind([nNodes nNodes],ind,ind+nRows)) = 1;
% Add Up/Left Edges
adj = adj+adj';
edgeStruct = UGM_makeEdgeStruct(adj,nStates);
% Standardize Features
Xstd = UGM_standardizeCols(reshape(X,[1 1 nNodes]),1);
The last line transforms the noisy image intensities so that they have a mean of zero and a standard deviation of 1.
We will use the following node potentials:
nodePot = zeros(nNodes,nStates);
nodePot(:,1) = exp(-1-2.5*Xstd(:));
nodePot(:,2) = 1;
We want to use edge potentials that reflect that neighboring pixels are more likely to have the same label. Further, we expect that they are even more likely to have the same label if the difference in their intensities is small. We incoporate this intuition by using the following Ising-like edge potentials
edgePot = zeros(nStates,nStates,edgeStruct.nEdges);
for e = 1:edgeStruct.nEdges
n1 = edgeStruct.edgeEnds(e,1);
n2 = edgeStruct.edgeEnds(e,2);
pot_same = exp(1.8 + .3*1/(1+abs(Xstd(n1)-Xstd(n2))));
edgePot(:,:,e) = [pot_same 1;1 pot_same];
end
In the node and edge potentials, we have used some carefully selected constants, {-1, -2.5, 1.8, .3}.
These constants were obtained as a maximum pseudolikelihood estimator, subject to the constraint that the edge potentials are attractive. Having attractive potentials is what will allow us to do efficient decoding in the model.
Approximate Decoding
The 2D lattice structure has 1024 nodes, so brute force decoding is not
possible. Further, the loopy structure has a high enough treewidth that it is
impractical to use junction trees (the treewidth of a lattice structure is the minimum between the number of rows and the number of columns). We might therefore
consider an approximate decoding method. One of the simplest approximate
decoding methods (that will be discussed further in the next demo) is the
iterated conditional mode (ICM) algorithm. Beginning from some initial
assignment of states to nodes, the ICM algorithm cycles through the nodes,
greedily maximizing the potential of each node given its neighbors. This
continues until no further local improvements are possible. The disadvantage
of the ICM method is that it may converge to a local maximum that is not the
global maximum. If we take the assignment that maximizes the node potentials to initialize
ICM, it converges to the following approximate decoding:
This looks much better than the independent model, but as we will see next the locally optimal
decoding found by ICM is not as good as the globally optimal decoding.
Graph Cuts for Optimal Decoding in Binary Sub-Modular UGMs
Somewhat surprisingly, it is possible to find the optimal decoding in this
model in polynomial time. This is because the UGM is binary and all its edge
potentials satisfy the inequality ep(1,1)*ep(2,2) >= ep(1,2)*ep(2,1), which we refer to as attractive potentials (since they encourage neighboring nodes to take the same state). These two simple restrictions allow us to formulate finding
the optimal decoding as
a maximum flow
problem (that can solved in polynomial-time), where the optimal decoding is
given by the minimal graph cut in the flow network.
Indeed, we can formulate decoding as a maximum flow problem whenever these two
restrictions are satisfied, no matter what graph structure is used.
We can use UGM to find the optimal decoding in a model satisfying these
restrictions using:
optimalDecoding = UGM_Decode_GraphCut(nodePot,edgePot,edgeStruct);
imagesc(reshape(optimalDecoding,nRows,nCols));
colormap gray
The optimal decoding of the UGM gives the following more satisfactory result:
Notes
As mentioned above, decoding by graph cuts will be substantially faster if the mex wrapper to the maxflow code is found on the path.
Graph cuts were originally proposed for finding the optimal decoding in an image
denoising problem in:
- D.M. Greig, B.T. Porteous, and A.H. Seheult. Exact Maximum A Posteriori Estimation for
Binary Images. Journal of the Royal Statistical Society Series B, 1989.
The formulation of the decoding problem as a maximum flow problem for binary
models under the associative condition is due to:
- V. Kolmogorv and R. Zabih. What energy functions can be minimized via graph
cuts? IEEE Transactions on Pattern Analysis, 2004.
PREVIOUS DEMO
NEXT DEMO
Mark Schmidt > Software > UGM > GraphCuts Demo