CPSC 440/550: Advanced Machine Learning – 24W2 (Jan-Apr 2025)

Instructor: Danica Sutherland (she): dsuth@cs.ubc.ca, ICICS X539.
Lecture info: Mondays/Wednesdays, 3:30 - 4:50pm, Swing 122.
Piazza; Gradescope; course recordings and office hour calendar are linked from Piazza.

Previous offerings: 23w2, 22w2 by me, or 21w2, 20w2 by Mark Schmidt. This time will be broadly similar to these, with some changes.

Schedule

First, miscellaneous notes mentioned in the lectures and homeworks (some of which are background material, some are supplements to things we'll see in this course), and assignment submission instructions.

Italicized entries are tentative; in particular, the timing and even number of assignments might change. Textbook acronyms are explained below.

Date | Topic/slides | Supplements
M Jan 6 | Syllabus; Binary density estimation | ML vs. Stats, 3 Cultures of ML; Math for ML, Essence of Linear Algebra; PML1 2.1-2.4
W Jan 8 | Bernoulli MLE and MAP | PML1 4.5, 4.6.2
W Jan 8 | Assignment 1 released: pdf, tex, zip |
M Jan 13 | Multivariate models; generative classifiers | PML1 9.3
W Jan 15 | Categorical data; discriminative models | PML1 2.5, 9.4, 10.2, 13.2
F Jan 17 | Assignment 1 due at 5pm |
W-Sa Jan 15-18 | Quiz 1 |
F Jan 17 | Add/drop deadline |
M Jan 20 | Discriminative models and deep learning | PML1 13, 14
W Jan 22 | Gaussians and Bayesian learning | PML1 2.6, 4.6.7
M Jan 27 | Multivariate Gaussians | PML1 3.2
W Jan 29 | Learning with Gaussians; start empirical Bayes | PML1 3.3, 11.7
W-Sa Jan 29-Feb 1 | Quiz 2 |
M Feb 3 | Finish empirical Bayes; exponential families | PML2 3.7; PML2 2.4
W Feb 5 | Mixtures and EM | PML1 8.7.2 / PML2 6.5; PML2 16.3
M Feb 10 | Finish mixtures/EM |
W Feb 12 | Monte Carlo, Laplace approximation | PML2 11
W-Sa Feb 12-15 | Quiz 3 |
M Feb 17 | No class: Family Day + midterm break |
Tu Feb 18 | Project proposal guidelines released |
W Feb 19 | No class: midterm break |
M Feb 24 | Variational inference, VAEs | PML2 10.1-10.2, 21.1-2; PML1 20.3
W Feb 26 | Transposed convolutions, representation learning | PML1 14.4; PML2 32
M Mar 3 | Markov chains | PML2 2.6
W Mar 5 | Message passing; start MCMC | PML2 9.2; 12.1-12.2
W-Sa Mar 5-8 | Quiz 4 |
F Mar 7 | Withdrawal deadline |
M Mar 10 | Finish MCMC; directed graphical models | PML2 4.2, bonus material on PML2 9
Tu Mar 11 | Assignment 2 belatedly released: pdf, tex, zip |
W Mar 12 | Finish directed models; start undirected graphical models | PML2 4.3-4.4; bonus on PML2 9, 28.5
M Mar 17 | Finish undirected graphical models; deep sequence models: RNNs | PML1 15.2
W Mar 19 | seq2seq and LSTMs | PML1 15.2
W-Sa Mar 19-22 | Quiz 5 |
M Mar 24 | Class cancelled, office hours instead |
Tu Mar 25 | Assignment 2 due at 11:59pm (before late days) |
W Mar 26 | Attention and Transformers | PML1 15.4-15.7; PML2 16.2.7, 16.3.5
W Mar 26 | Assignment 3 released: pdf, tex, zip |
Su Mar 30 | Project proposal due at 11:59pm | guidelines
M Mar 31 | More representation learning |
W Apr 2 | Diffusion models |
W-Sa Apr 2-5 | Quiz 6 |
M Apr 7 | A tour of some things we missed |
Tu Apr 8 | Assignment 3 due at 11:59pm (before late days) |
Tu Apr 15 | Final exam (in person, handwritten) at noon in MCML 360 |
Su Apr 27 | Final project due at 11:59pm | style files, instructions

Overview

This course is intended as a second or third university-level course on machine learning, a field that focuses on using automated data analysis for tasks like pattern recognition and prediction. The class is intended as a continuation of CPSC 340 (also called 540, or previously 532M); it will assume a strong background in math and computer science. Topics will (roughly) include deep learning, generative models, latent-variable models, Markov models, probabilistic graphical models, and Bayesian methods.

Note that the numbers for graduate cross-listings of our machine learning courses changed last year: previously 340 was also called 532M, and 440 was also called 540. Now 340 is also called 540, and 440 is also called 550.

Logistics

The course meets in person in Swing 122. I plan to release recordings, but encourage you to come to class in person if you can.

Grading scheme:

Further details in the syllabus slides.

Registration and Prerequisites

Registration: Graduate and undergraduate students from any department are welcome to take the class. Undergraduate students should enroll in CPSC 440, and graduate students should enroll in CPSC 550. Below are more details on registration for each course: My expectation (no guarantee) is that everyone on both waitlists will probably get in, and we should also have room for auditors. Join the waiting list by January 15th if you want to register.

Starting in the second week of classes, we'll have weekly tutorials run by the TAs. These will do things like go through provided assignment code, review background material, review big concepts, and/or do exercises. You can register for particular tutorial sections if you want to save a seat at a particular time, but note that you do not need to register in a tutorial section.

CPSC 340/540 vs. CPSC 440/550: CPSC 340 and 440 are roughly structured as one full-year course. CPSC 340 (which is sometimes cross-listed as CPSC 540 for graduate students; formerly 532M) covers more data mining methods and the methods that are most widely used in applications of machine learning. CPSC 440 (cross-listed as CPSC 550 for graduate students; formerly 540) focuses on probabilistic methods which appear in more niche applications, as well as various other topics not covered in 340/540. It is strongly recommended that you take CPSC 340/540 first, as it covers the most fundamental ideas as well as the most common and practically useful techniques. In 440/550 it will be assumed that you are basically familiar with all the material in the current offering of CPSC 340/540. Note that online machine learning courses and courses from many other universities may not be an adequate replacement for CPSC 340; they typically have more overlap with our applied machine learning course, CPSC 330. If you're not sure, look at last term's 340 website and see if it all seems familiar.

Prerequisites

Undergraduate students will not be able to take the class without these prerequisites. Graduate students may be asked to show how they satisfy prerequisites.

Resources

Textbook: There is no required textbook for the course, but the textbook with the most extensive coverage of many of the course's topics is Kevin Murphy's Probabilistic Machine Learning series. While the one-volume 2012 version covers most of the material, we'll refer to the very recent two-volume version (2022/2023), PML1 and PML2, both of which have free Creative Commons draft pdfs through those links. I'll try to refer to the relevant sections of both versions as we go, as well as links to various other free online resources.

If you need to refresh your linear algebra or other areas of math, check out Mathematics for Machine Learning (Marc Deisenroth, Aldo Faisal, Cheng Soon Ong; 2020).

Related courses: Besides CPSC 340, there are several 500-level graduate courses in CPSC and STAT that are relevant: check out the graduate courses taught by people on the ML@UBC page and the MILD list. CPSC 422/425/436N, DSCI 430, EECE 360/592, EOSC 510/550, and STAT 305/306/406/460/461 are also all relevant.

Some related courses that have online notes are:

A YouTube playlist covering in detail many of the core topics in the course: