Matlab Code by Mark Schmidt
Summary
This package contains the most recent version of various Matlab codes I wrote during my PhD and postdocs. I would recommend downloading and using this package if you plan on using more than one of my Matlab codes: it includes all the recent bug fixes and efficiency improvements, and in assembling it I have updated my older code to make it compatible with the newer code and with newer versions of Matlab. Further, I typically do not update the individual packages unless I am making a major change (such as the updates of minFunc and UGM).
The particular packages included (from oldest to newest) are:
- minFunc - Function for unconstrained optimization of differentiable real-valued multivariate functions (see the usage sketch after this list).
- lasso - Functions implementing a variety of the methods available to solve 'LASSO' regression (and basis selection) problems.
- blogreg - Functions for MCMC simulation of binary probit/logistic regression posterior distributions over parameters.
- DAGlearn - Functions for structure learning in Gaussian and sigmoid directed acyclic graphical (DAG) models.
- L1precision - Block coordinate descent function for fitting Gaussian graphical models with an L1-norm penalty on the matrix elements.
- L1General - Functions implementing strategies for minimizing unconstrained functions subject to L1 regularization.
- minConf - Functions for optimization of differentiable real-valued multivariate functions with simple constraints.
- minFunc examples - A series of examples showing how to solve problems with minFunc.
- UGM - Functions implementing exact and approximate decoding, inference, sampling, and parameter estimation in discrete undirected graphical models.
- UGMlearn - Code for structure learning in discrete-state undirected graphical models using group L1-regularization.
- crfChain - Learning/inference/decoding/sampling in chain-structured conditional random fields with categorical features.
- PQN - Code for optimization of (costly) differentiable multivariate functions over simple convex sets.
- L1General examples - A series of examples showing how to solve problems with L1General.
- PQN examples - A series of examples showing how to solve problems with PQN.
- thesis - Quasi-Newton methods for non-differentiable optimization, and structure learning in graphical models.
- alphaBeta - Functions for approximate pairwise energy minimization with attractive energies.
- batching - An optimizer that combines an L-BFGS line-search method with a growing batch-size strategy.
- prettyPlot - A wrapper that uses Matlab's plot function to make nicer-looking plots.
- TMP examples - A series of examples showing how to solve problems with TMP.
- QNST examples - A series of examples showing how to solve problems with QNST.
- SAG - Matlab mex files implementing the stochastic average gradient method for L2-regularized logistic regression.
- SAG4CRF - Matlab mex files implementing non-uniform stochastic average gradient for fitting conditional random fields.
- minFunc_extras - Implementations of various trust-region, derivative-free, and coordinate descent methods.
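As a quick illustration of the oldest package, here is a minimal sketch of calling minFunc on the two-dimensional Rosenbrock function (the objective and option values here are illustrative; minFunc works with any function that returns an objective value and gradient):

    % Minimal sketch: minimize the 2D Rosenbrock function with minFunc.
    % The objective must return the function value and gradient at x.
    rosenbrock = @(x) deal(100*(x(2)-x(1)^2)^2 + (1-x(1))^2, ...
        [-400*(x(2)-x(1)^2)*x(1) - 2*(1-x(1)); 200*(x(2)-x(1)^2)]);
    x0 = [-1.2; 1];               % standard starting point
    options = [];
    options.Method = 'lbfgs';     % limited-memory BFGS (the default)
    options.Display = 'iter';
    x = minFunc(rosenbrock, x0, options);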
Examples
Each of the packages includes one or more demos that show how to use the code. Below is a list of all the available demos (the ones listed with a .m suffix consist of a single function that contains a series of demos):
minFunc
- example_minFunc % Runs various limited-memory solvers in minFunc
- example_minFunc_LR % Shows how to use preconditioning and Hessian-vector products in minFunc
- example_derivativeCheck % Shows how to check user-supplied derivatives against numerical approximations
lasso
- example_lasso % Runs various LASSO solvers
- example_lasso_warmStart % Warm-starting active set LASSO solver
blogreg
- example_ProbRegSamp % Probit regression auxiliary variable samplers
- example_LogRegSamp % Logistic regression auxiliary variable samplers
- example_MultLogRegSamp % Multinomial logistic regression auxiliary variable samplers
- example_LogRegRJSamp % Reversible-jump sparse logistic regression auxiliary variable sampler
- example_blogreg % Runs various logistic regression samplers on digits data
DAGlearn
- example_DAGlearn % Runs various DAG structure learning methods
L1precision
- example_L1precision % Runs block coordinate descent for graphical LASSO
L1General
- example_L1General % Runs various solvers for L1-regularized logistic regression
minConf
- example_minConf % Shows how to use various methods in minConf
minFunc_examples.m
- Huber and Student T robust regression
- Robust Regression with Basis Expansion
- Logistic and Probit regression
- L2-regularized logistic regression
- Weighted Logistic regression
- Kernel logistic regression
- Multinomial logistic regression with L2-regularization
- Kernel multinomial logistic regression
- Density estimation with multivariate student T
- Data Visualization with Multi-Dimensional Scaling
- Regression with neural networks
- Classification with Neural Network with multiple hidden layers
- Smooth support vector machine
- Huberized support vector machine
- Smooth support vector regression
- Kernel smooth support vector machine
- Multi-class smooth support vector machine
- Extreme-value regression
- Sparse Gaussian graphical model precision matrix estimation
- Chain-structured conditional random field
- Tree-structured Markov random field with exp(linear) potentials
- Lattice-structured conditional random field
UGM
- example_UGM_QuickStart % A demo that shows how various parts of UGM work (a minimal sketch follows this list)
- example_UGM_Small % Runs UGM on small graph where exact decoding/inference/sampling is possible
- example_UGM_Chain % Runs UGM for decoding/inference/sampling in chain-structured data
- example_UGM_Tree % Runs UGM for decoding/inference/sampling in tree-structured data
- example_UGM_Condition % Runs UGM for conditional decoding/inference/sampling
- example_UGM_Cutset % Runs cutset-conditioning in UGM for exact decoding/inference/sampling
- example_UGM_Junction % Runs junction tree in UGM for exact decoding/inference/sampling
- example_UGM_GraphCuts % Runs graph cuts in UGM for exact decoding in binary sub-modular models
- example_UGM_ICM % Runs local search methods in UGM for approximate decoding
- example_UGM_MCMC % Runs MCMC sampling methods in UGM for approximate sampling
- example_UGM_Variational % Runs variational methods in UGM for approximate decoding/inference/sampling
- example_UGM_Block % Runs block versions of approximate decoding/inference/sampling methods
- example_UGM_AlphaBeta % Alpha-beta swaps in UGM for approximate decoding in pairwise sub-modular models
- example_UGM_Linprog % Runs linear programming relaxation in UGM for approximate decoding
- example_UGM_TrainMRF % Shows how to train MRFs with UGM
- example_UGM_TrainCRF % Shows how to train CRFs with UGM
- example_UGM_TrainHidden % Shows how to train CRFs with missing labels
- example_UGM_OCR % Shows how to train CRFs where different examples have different structures
- example_UGM_TrainApprox % Approximate training of UGMs (pseudo-likelihood, mean field, loopy belief propagation)
- example_UGM_TrainApprox2 % Approximate training of UGMs (score matching, piecewise, min probability flow, product of marginals, block pseudolikelihood, sub-modular constraints)
- example_UGM_TrainSGD % Stochastic gradient training of UGMs
- example_UGM_TrainSGD2 % Stochastic gradient training of UGMs (projected, max-margin, contrastive divergence, stochastic maximum likelihood)
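The UGM demos above are all built from a small set of core functions that take node potentials, edge potentials, and an edge structure. As a rough sketch (the potentials below are made-up numbers for a 2-node, 2-state model; see example_UGM_QuickStart for the real walkthrough):

    % Build a tiny 2-node, 2-state model and run exact decoding/inference.
    nNodes = 2; nStates = 2;
    adj = zeros(nNodes); adj(1,2) = 1; adj = adj + adj';    % one undirected edge
    edgeStruct = UGM_makeEdgeStruct(adj, nStates);
    nodePot = [1 3; 9 1];                                   % nNodes x nStates
    edgePot = repmat([2 1; 1 2], [1 1 edgeStruct.nEdges]);  % nStates x nStates x nEdges
    optimalDecoding = UGM_Decode_Exact(nodePot, edgePot, edgeStruct);
    [nodeBel, edgeBel, logZ] = UGM_Infer_Exact(nodePot, edgePot, edgeStruct);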
L1GeneralGroup
- example_L1GeneralGroup % Computes regularization paths with L1 and group-L1 regularization using SPG
- example_L1GeneralGroupPath % Regularization path for group Lasso with iterative soft-thresholding
UGMlearn
- example_UGMlearn % Runs different approaches to structure learning in conditional random fields
crfChain
- example_crfChain % Training a chain-structured conditional random field with categorical features
L1General_examples.m
- LASSO
- Elastic Net
- Logistic Regression
- Logistic Regression (larger number of variables)
- Lasso Regularization Path
- Huber Robust Regression Regularization Path
- Logistic Regression Regularization Path
- Probit Regression, Smooth SVM, Huberized SVM
- Non-Parametric Logistic Regression with Sparse Prototypes
- Multinomial Logistic Regression
- Compressed Sensing
- Chain-structured conditional random field
- Graphical Lasso
- Markov Random Field Structure Learning
- Neural Network with Sparse Connections
- Deep Network with Sparse Connections
PQN_examples.m
- Linear Regression on the Simplex
- Lasso regression
- Lasso with Complex Variables
- Group-Sparse Linear Regression with Categorical Features
- Group-Sparse Simultaneous Regression
- Group-Sparse Multinomial Logistic Regression
- Group-Sparse Multi-Task Classification
- Low-Rank Multi-Task Classification
- L_1,inf Blockwise-Sparse Graphical Lasso
- L_1,2 Blockwise-Sparse Graphical Lasso
- Linear Regression with the Over-Lasso
- Kernelized dual form of support vector machines
- Smooth (Primal) Support Vector Machine with Multiple Kernel Learning
- Conditional Random Field Feature Selection
- Approximating node marginals in undirected graphical models with variational mean field
- Multi-State Markov Random Field Structure Learning
- Conditional Random Field Structure Learning with Pseudo-Likelihood
thesis
- demo_L1General % Runs various solvers for L1-regularized logistic regression
- demo_L1Generalpath % Computes L1-regularization path for logistic regression with warm-starting and an active set method
- demo_minConf % Runs various solvers for group L1-regularized multi-task logistic regression
- demo_minConfpath % Computes group L1-regularization paths for multi-task logistic regression with warm-starting and an active-set method
- demo_DAGlearnGOrdered % Runs various methods to estimate a Gaussian DAG given a topological ordering
- demo_DAGlearnG % Runs various methods to estimate a Gaussian DAG given interventional data
- demo_DAGlearn2Ordered % Runs various methods to estimate a sigmoid belief network given a topological ordering
- demo_DAGlearn2 % Runs various methods to estimate a sigmoid belief network given interventional data
- demo_UGMlearn % Trains pairwise log-linear models with different parameterizations and regularization types
- demo_UGMlearn2 % Computes regularization path for a pairwise log-linear model
- demo_UGMlearn3 % Trains pairwise log-linear models with different approximations and regularization types
- demo_LLM % Trains log-linear models with different order restrictions and regularization types
- demo_LLM2 % Computes regularization path for a hierarchical log-linear model
- demo_LLM3 % Compares parameterizations and regularization types for fitting hierarchical log-linear model
alphaBeta
- example_alphaBeta_UGMep % Compares various energy minimization methods on a stereo reconstruction task (using the UGMep codes)
- example_alphaBeta_UGM % Compares various energy minimization methods on a stereo reconstruction task (using the UGM codes)
batching
- example_batching % Compares different strategies for parameter estimation with a redundant training set under a severe function evaluation restriction
- example_batchingL2SVM % Compares different strategies for avoiding looking at all training examples on every iteration when training smooth support vector machines.
prettyPlot
- example_prettyPlot % Demo showing basic functionality of prettyPlot, including not placing a marker at all points.
- example_prettyPlot2 % Demo showing how to plot sequences of different lengths, using repeating sequences of lineStyles/markers/colors, and having unspecified upper/lower limits.
- example_prettyPlot3 % Demo showing how to plot upper/lower error lines.
- example_prettyPlot4 % Demo showing how to directly label lines.
TMP_examples.m
- Non-negative Least Squares
- L1-Regularization Least Squares
- Logistic Regression with Bounded Coefficients
- Dual Support Vector Machines (no bias or regularized bias)
- Ordinal Logistic Regression
- Kernel Ordinal Logistic Regression
- Graphical LASSO
- Associative Conditional Random Fields (trained with pseudo-likelihood)
QNST_examples.m
- L1-Regularization
- Group L_{1,2}-Regularization
- Group L_{1,inf}-Regularization
- Combined L1- and Group L1-Regularization
- Nuclear-norm regularization
- Sparse plus low-rank
SAG
- example_SAG % Run all of the basic stochastic methods on a sparse data set.
- example_SAG2 % Run all of the basic methods (plus non-uniform sampling) on a dense data set
SAG4CRF
- example_SAG4CRF % Run/compare all the methods
- example_SAG4CRF_warmStart % Shows how to use warm-starting and the stopping criterion.
minFunc_extras
- example_minFuncTR % Runs various first-order and second-order trust-region optimizers
- example_minFuncDFO % Runs various derivative-free optimization methods
- example_minFuncCD % Runs various coordinate descent methods
Updates
April 2023:
Added minFunc_extras, which contains some old optimization development codes in case anyone finds them useful:
- minFunc_TR: trust-region optimization method. Can solve the trust-region sub-problem using the Cauchy point, binary search with a Schur factorization (default), dogleg, Steihaug CG, L-infinity or L1 norm, or an SPG first-order method. Defaults to expecting the user to provide the Hessian, but can alternatively use BFGS, SR1, L-BFGS, L-SR1, or Hessian-free with Steihaug CG.
- minFunc_DFO: derivative-free optimization methods. Includes trying random directions, numerical differencing, using interpolating models, coordinate search, Hooke-Jeeves, pattern search, conjugate directions, and Nelder-Mead. Allows golden section search or Brent's method for one-dimensional minimization.
- minFunc_CD: coordinate descent methods. Requires that the user define a class implementing certain operations, including an updating operation that allows the code to be efficient for certain problem structures. Includes several ways to set the step size including Armijo, Wolfe, and Newton.
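As a very rough sketch of how minFunc_TR might be called, assuming it follows the minFunc calling convention with the objective also returning the Hessian (this interface is my assumption and should be checked against the code):

    % Hypothetical sketch: toy quadratic returning value, gradient, and Hessian.
    % Assumes minFunc_TR takes (funObj, x0) like minFunc does; not verified.
    funObj = @(x) deal(0.5*(x'*x), x, eye(numel(x)));
    x0 = randn(5,1);
    x = minFunc_TR(funObj, x0);   % see the code for the actual option names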
June 2022:
- Fixed the Hessian-vector product code, which did not work when using complex-step derivatives.
- Fixed potential divide by zero in Hessian-vector code.
March 2022:
Updates to minFunc (these were made at earlier dates, but were not previously mentioned here):
- User can now set initial step length (t0), can avoid problems if initial step length doesn't cause change in function values (suggested by Rainer Heintzmann).
- Added logic to the line-search to deal with illegal gradients but legal function values.
- Extrapolation now checks that sufficient progress is being made (thanks to Rainer Heintzmann).
- Added the alternating BB method of "Accelerating gradient projection methods for l1-constrained signal recovery by steplength selection rules", which is now the default BB method.
- The optimization toolbox linesearch works again (thanks to David Sall).
Updates to minConf (these were made at earlier dates, but were not previously mentioned here):
- Split 'optTol' into 'optTol' and 'progTol' for some of the methods, now uses infinity-norm to test optimality.
- Turned off bbInit in PQN as this is unsound.
- Fixed surrogate objective in PQN.
- Added a Hessian-free version of minConf_TMP.
September 2020:
- crfChain: Added the 64-bit Windows binaries, and a fast loss and testing function.
- UGM: Fixed a mex file indexing issue that arose when doing conditioning operations on Macs.
- SAG4CRF: Added this code for training chain-structured discrete-feature CRFs with SAG.
May 2020:
- Updated mex files for macOS.
- Commented out the call to drawGraph.m (I'm sick of updating my code for each new graphViz version).
- UGM: Modified to avoid in-place scalar updates, fixed sampling dimensions to match in some places, and fixed the graph cuts interface.
- SAG: Updated compile flags so these codes would work with sparse data on newer computers, modified to avoid in-place scalar updates.
- L1General: Modified codes relying on quadprog to not choose a solver, as this was throwing an error.
July 2019:
- L1precision: Commented out the call to drawGraph.m (I'm sick of updating my code for each new graphViz version).
- UGM: Updated the integer programming decoding to use the new interface to the integer programming solver in the optimization toolbox.
June 2019:
- Updated mexAll.m to use -compatibleArrayDims, so that mex files work in more recent versions. Also removed the commands to compile files relying on BLAS, which doesn't seem to work in more-recent versions.
- lasso: Modified codes relying on quadprog to not choose a solver, as this was throwing an error.
September 2013:
- SAG: Added these functions implementing various stochastic methods for L2-regularized logistic regression.
- prettyPlot: Added the ability to place labels directly on lines instead of in a legend box.
- blogreg: A bug in the sampleLambda.m function was fixed. Thanks to Sharon Chiang for pointing this out.
November 2012:
- UGM: Added score matching, piecewise training, block pseudo-likelihood, max-margin, contrastive divergence, and stochastic maximum likelihood.
- QNST: Added this series of demos showing how to use minConf_QNST to solve problems with non-smooth regularizers.
July, 2012:
- UGM: Added demos showing how to train CRFs with hidden variables, and how to train when different training examples have different graph structures.
- TMP examples: Added this series of demos showing how to use minConf_TMP to solve various bound-constrained problems.
April, 2012:
- minFunc: Added the fastDerivativeCheck function, which quickly checks the accuracy of a user-supplied gradient using only 2 or 3 function evaluations. This is preferable to the standard derivativeCheck function, which requires n+1 function evaluations for an n-dimensional problem. For an example, see example_derivativeCheck.
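The underlying idea (a sketch of the general technique, not necessarily the exact implementation used in fastDerivativeCheck) is to compare the analytic directional derivative g'*d to a finite difference along a single random direction d, which costs a constant number of function evaluations regardless of the problem dimension:

    % Sketch: check a gradient along one random direction.
    funObj = @(x) deal(0.5*(x'*x), x);   % toy objective f(x) = 0.5*||x||^2, gradient x
    x = randn(5,1);
    d = randn(5,1);                      % random direction
    [f, g] = funObj(x);
    mu = 1e-6;                           % finite-difference step size
    fPlus = funObj(x + mu*d);            % one extra function evaluation
    fprintf('analytic: %g, numeric: %g\n', g'*d, (fPlus-f)/mu);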
March, 2012:
- prettyPlot: Added this function that helps to make nicer-looking plots. For example, it uses bigger lines/markers/fonts, does not require a marker at every point, allows specifying only a lower/upper limit, allows lines of different lengths, allows (separate) repeating sequences of colors/lineStyles/markers, and allows plotting upper/lower error bars.
- minFunc: Re-wrote the core L-BFGS code to make it faster and more memory-efficient.
- UGM: Added example_UGM_QuickStart, a new demo that demonstrates how the different data structures work and how to call the different functions.
- minFunc examples: Added the extra examples, including robust regression with the Huber and student t losses, density estimation with the multivariate t, data visualization with multi-dimensional scaling, Huberized support vector machines, and extreme-value regression.
- minFunc: Added the option of using central-differencing when using finite-difference derivatives.
- UGM: Fixed a bug in the junction tree sampler.
February, 2012:
- thesis: Added the canonical and spectral binary parameterizations described in the AI/Stats 2012 paper. These parameterizations can model any binary distribution, yet have no redundant parameters. See demo_LLM3.m.
January, 2012:
- example_batchingL2SVM: Added this demo of different strategies for avoiding looking at all training examples on every iteration (without losing the exponential convergence rate) when training smooth support vector machines.
November, 2011:
- UGM: Updated the variant of UGM contained in this package to the 2011 version. This includes a variety of improvements and extensions detailed on the UGM webpage. The most important change in this version is a completely different way to set up parameter estimation, which allows arbitrary parameter tying schemes and makes it easier to have different graph structures for different training examples.
- Various: updated older packages to use the new version of UGM.
August, 2011:
- batching: Added this code for optimization with a large number of training examples, where there is a large amount of redundancy among the training examples.
- alphaBeta: Added this code implementing alpha-expansion beta-shrink moves for MAP estimation in UGMs. Note that these functions use the mex wrapper to the maxflow code.
- thesis: Added the specialized code for log-linear MRFs (LLM2) and demos from my thesis of using group L1-regularization with different norms for structure learning in undirected graphical models. Also added the code for learning hierarchical log-linear models using overlapping group L1-regularization and a hierarchical search heuristic.
- Mex files: Removed the crfChain files from the mexAll command (since these don't compile on many systems). To compile these files, you now need to use mexCrfChain.
March 2011:
- demo_minConfpath: Added this demo of computing the simultaneous logistic regression group L1-regularization path for the group L2-norm and Linf-norm.
- DAGlearnG/DAGlearn2: Added these variants of the DAGlearn code from my thesis.
- Mex files: Added all files for 64-bit Athlon and Intel Mac architectures (mexa64 and mexmaci64). Thanks to Kevin Murphy for these.
- UGM_Decode_GraphCuts: Efficiency improvements, fixed graph construction in cases where the edge potentials are asymmetric (thanks to Mohamed Ghalwash), added ability to use Boykov and Kolmogorov's fast maxflow code via the mex interface available here (if the mex file is found on the path).
- UGM_Decode_AlphaExpansion: Added the truncation trick described here (Section 3.3) for approximate expansions in non-metric models.
Download
To get the complete set of most recent Matlab .m and .mex files, download code.zip.
Note that some of these packages use Matlab's optimization and/or statistics toolbox (they will say that functions like 'quadprog' or 'exprnd' are not found if these toolboxes are not present). To use the UGMep code (and substantially speed up the methods based on graph cuts in UGM), you will need to have the mex wrapper to the maxflow code.
To add all folders in the zip file to the Matlab path, extract the contents of the zip file and then (in Matlab) type 'addpath(genpath(markSchmidt_dir))', where 'markSchmidt_dir' is the location of the extracted directory.
I have included compiled mex files for several operating systems, but if they are not included for your operating system then you will get an error that looks like: ??? Undefined function or method 'UGM_Decode_ExactC'. To compile most of the mex files for your operating system, run mexAll. This function tries to compile all files except those associated with the crfChain code.
To compile the mex files associated with the crfChain code on Linux, run mexCrfChain. In order to compile the crfChain mex files on Windows, you can use MinGW and Gnumex. For compiling on Macs, you need to install Xcode.
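Putting the setup steps together, a typical one-time installation looks like the following (the extraction path is a placeholder for wherever you unzipped the code):

    % One-time setup after extracting code.zip:
    addpath(genpath('/path/to/markSchmidt'));  % add all sub-folders to the path
    mexAll;                                    % compile most of the mex files
    mexCrfChain;                               % crfChain mex files (Linux; see above)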