Last updated 10 May 2006.
cd CRF2D               % wherever you put the zip file
addpath(genpath(pwd))  % assumes matlab 7
demoSingle
demoMulti

If you have a Mac, you may have to compile the following mex files and put them on your path: BP_GeneralComplex_C.mexmac, BP_General_C.mexmac, MF_GeneralComplex_C.mexmac.
nodeFeatures(:,i,s) = [1, raw(:,r,c,s)]             if expandNode=0
nodeFeatures(:,i,s) = [1, quadratic(raw(:,r,c,s))]  if expandNode=1

where i corresponds to node (r,c) and quadratic(v) is a quadratic expansion of the vector v. We can also create edge features using
edgeFeatures(:,e,s) = [1, |raw(:,r,c) - raw(:,r',c')| ]             if expandEdge=0
edgeFeatures(:,e,s) = [1, quadratic(|raw(:,r,c) - raw(:,r',c')|) ]  if expandEdge=1

where e corresponds to the edge connecting (r,c)-(r',c').
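As a concrete illustration, here is a minimal sketch of how the expandNode=0 / expandEdge=0 features could be assembled for a single case. This is not the toolbox's own code (which builds these arrays via mkNodeFeatures and mkEdgeFeatures, described below); the variable names raw, nRows, nCols and edges are assumptions for this example.

% Illustrative sketch only (not the toolbox implementation).
% Assumes raw is (nRawFeatures x nRows x nCols) for one case, and
% edges is an (nEdges x 4) list of [r c r2 c2] pairs of adjacent pixels.
i = 0;
for c = 1:nCols
  for r = 1:nRows
    i = i + 1;
    nodeFeatures(:,i) = [1; raw(:,r,c)];   % bias term followed by the raw features
  end
end
for e = 1:size(edges,1)
  r = edges(e,1); c = edges(e,2); r2 = edges(e,3); c2 = edges(e,4);
  edgeFeatures(:,e) = [1; abs(raw(:,r,c) - raw(:,r2,c2))];  % bias + absolute difference
end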
The node potentials are given by
nodepot(xi,i) = exp[ xi * w' * nodeFeatures(:,i,s) ]

where xi is the state (+1 or -1) of node i, which is in row r, column c. Similarly, the edge potentials for edge e (connecting nodes i, j) have the form
edgepot(xi,xj,e) = exp[ xi * xj * v' * edgeFeatures(:,e,s) ]
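A minimal sketch of forming these potentials for one case s, given weight vectors w and v, is shown next; the loop structure and the names nNodes and nEdges are illustrative assumptions, not the toolbox's mkPotentials routine.

% Illustrative sketch only. States are coded as +1/-1 as in the text above.
states = [+1 -1];
for i = 1:nNodes
  for a = 1:2
    nodePot(a,i) = exp( states(a) * (w' * nodeFeatures(:,i,s)) );
  end
end
for e = 1:nEdges
  for a = 1:2
    for b = 1:2
      edgePot(a,b,e) = exp( states(a) * states(b) * (v' * edgeFeatures(:,e,s)) );
    end
  end
end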
The gradient is the observed minus the expected features:
gradNode(:,i,s) = label(i,s) * nodeFeatures(:,i,s) - sum_x bel(x,i,s) * x * nodeFeatures(:,i,s)

where bel(x,i,s) = P(X(i)=x|s) is the marginal belief for case s. Similarly,
gradEdge(:,e,s) = label(i,s) * label(j,s) * edgeFeatures(:,e,s) - sum_{x,x'} bel2(x,x',e,s) * x * x' * edgeFeatures(:,e,s)

where bel2(x,x',e,s) = P(X(i)=x, X(j)=x'|s) is the pairwise belief and edge e connects i and j.
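The node part of this gradient can be written out directly. The following is a sketch for one case s under the same +1/-1 coding, assuming bel is indexed so that state index a corresponds to states(a); any names beyond those in the formulas above are assumptions.

% Illustrative sketch of gradNode for one case s, given marginal beliefs bel.
states = [+1 -1];
for i = 1:nNodes
  expected = zeros(size(nodeFeatures,1),1);
  for a = 1:2
    expected = expected + bel(a,i,s) * states(a) * nodeFeatures(:,i,s);
  end
  gradNode(:,i,s) = label(i,s) * nodeFeatures(:,i,s) - expected;
end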
You may also be interested in some matlab/C code for inference in undirected MRFs (with pairwise potentials) by Talya Meltzer.
label = sign(double(imread('X.PNG'))-1);
label = label(:,:,1);
noisy = label + randn(32,32);
imshow([label noisy]); title('Labels Observed');
featureEng = latticeFeatures(0,0);
features = permute(noisy,[4 1 2 3]);
traindata.nodeFeatures = mkNodeFeatures(featureEng,features);
traindata.edgeFeatures = mkEdgeFeatures(featureEng,features);
traindata.nodeLabels = label;
traindata.ncases = 1;
trainNdx = 1:traindata.ncases;

We first initialize our feature engine object. We have specified (0,0) since we want to use our features directly, rather than performing a quadratic basis function expansion. The 'mkNodeFeatures' and 'mkEdgeFeatures' functions put our extracted features in the appropriate format, using the parameters of the feature engine. Location (i,r,c,case) in the 4D array 'features' stores the ith feature for (row r, column c) of image 'case'. The nodeLabels field has the form (r,c,case), and gives the class label for (row r, column c) of image 'case'. Here we have only 1 case. Note that the edge features used by this feature engine are the normalized absolute differences between the raw features at adjacent locations in the lattice.
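As a quick check that the data is laid out as described, we can inspect the array shapes; the dimension orderings noted in the comments follow the text above, while the exact feature counts depend on the feature engine settings.

size(features)                % (feature, row, col, case)
size(traindata.nodeFeatures)  % (nodeFeature, node, case)
size(traindata.edgeFeatures)  % (edgeFeature, edge, case)
size(traindata.nodeLabels)    % (row, col, case)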
nNodeFeatures = size(traindata.nodeFeatures,1);
nEdgeFeatures = size(traindata.edgeFeatures,1);
weights = initWeights(featureEng,nNodeFeatures,nEdgeFeatures);

Finally, we will try doing inference with our initial parameters (using the mean field approximation for inference):
nstates = 2;
infEng = latticeInferMF(nstates);
caseNum = 1;
featureEng = enterEvidence(featureEng, traindata, caseNum); % store data
[nodePot, edgePot] = mkPotentials(featureEng, weights);
[nodeBel, MAPlabels] = infer(infEng, nodePot, edgePot);
imshow(MAPlabels);

Above, we first created a mean field inference engine. We then entered case 1 of our training data into the feature engine. The feature engine subsequently lets us make the node and edge potentials needed to perform inference with the inference engine. The inference engine returns the node beliefs (marginal probabilities) and the MAP labels. Inference with random parameters tends to be a sub-optimal labeling strategy; below are some examples of running this demo:
reg = 1;
maxIter = 10;
options = optimset('Display','iter','Diagnostics','off','GradObj','on',...
    'LargeScale','off','MaxFunEval',maxIter);

We specify the pseudolikelihood objective and gradient function, and set up the associated arguments:
gradFunc = @pseudoLikGradient;
gradArgs = {featureEng,traindata,reg};

And we optimize the objective function:
winit = initWeights(featureEng, nNodeFeatures, nEdgeFeatures);
weights = fminunc(gradFunc, winit, options, trainNdx, gradArgs{:});

We can plot the results as before:
[nodePot, edgePot] = mkPotentials(featureEng, weights);
[nodeBel, MAPlabels] = infer(infEng, nodePot, edgePot);
imshow(MAPlabels);

For this data set, the optimal pseudolikelihood parameters do not prove to be very interesting (except to show that pseudolikelihood tends to over-smooth, in this case to an all-white image):
infEng = latticeInferMF(nstates);
gradFunc = @scrfGradient;
gradArgs = {featureEng, infEng, traindata, reg};
weights = fminunc(gradFunc, winit, options, trainNdx, gradArgs{:});

The results are much better.
infEng = latticeInferBP(nstates);
gradFunc = @scrfGradient;
gradArgs = {featureEng, infEng, traindata, reg};
weights = fminunc(gradFunc, winit, options, trainNdx, gradArgs{:});

Example results of an LBP-trained model with LBP inference are shown below:
imshow(medfilt2(noisy)>0)

This gives poor results in this case.
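To put a number on the comparison, one can count misclassified pixels. This is a hedged sketch, assuming label uses the -1/+1 coding created earlier and converting the median-filter output to the same coding:

% Fraction of mislabeled pixels for the median-filter baseline (illustrative).
medfiltLabels = 2*(medfilt2(noisy) > 0) - 1;   % map to -1/+1
errMedfilt = mean(medfiltLabels(:) ~= label(:))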
reg = 1;
maxIter = 3;
options = optimset('Display','iter','Diagnostics','off','GradObj','on',...
    'LargeScale','off','MaxFunEval',maxIter);
gradFunc = @scrfGradient;
gradArgs = {featureEng, infEng, traindata, reg};
weights = fminunc(gradFunc, winit, options, trainNdx, gradArgs{:});

After processing each of the 50 training images 3 times, the batch training procedure does not make significant progress, given the redundancy in the data:
eta = 0.0001;
batch_size = 1;
anneal = 0;
tau = 0;

Training follows a similar form to the batch scenario:
weights = stochgrad(gradFunc, winit, trainNdx, 'gradArgs', gradArgs, ...
    'maxIter', maxIter, 'eta', eta, 'batch_size', batch_size);

This demo uses only 1 iteration, so training is on-line. Some result images are shown below:
eta0 = 0.0001;
mu = 0.001;
lambda = 0.9;

Training is done using the following:
weights = smd(gradFunc, winit, trainNdx, 'gradArgs', gradArgs, ...
    'maxIter', maxIter, 'eta0', eta0, 'batch_size', batch_size, ...
    'lambda', lambda, 'mu', mu);

For this data, we have a very small number of features, so SMD does not give a substantial improvement. Below are some example results for the on-line training scenario after viewing all training images: