CPSC 532D: Modern Statistical Learning Theory – Fall 2024 (24W1)

Instructor: Danica Sutherland (she): dsuth@cs.ubc.ca, ICICS X563.
TA: TBD.
Lecture info: Mondays/Wednesdays, 13:00 - 14:30, Dempster 201.
Office hours: TBD, hybrid in ICICS X563 + Zoom unless announced otherwise.
We'll use Piazza and Gradescope; links coming closer to the start of term.

Previously offered in 23W1, 22W1, and (with the name 532S) 21W2; this instance will be broadly similar.

This is a course on the mathematical foundations of machine learning. When should we expect ML algorithms to work (or not work), and what kinds of assumptions do we need to make to rigorously prove this?

Schedule

Italicized entries are tentative. The lecture notes (to be linked) are self-contained, but the supplements column also refers to the following books (all available as free pdfs) for more details / other perspectives:

SSBD: Shalev-Shwartz and Ben-David, Understanding Machine Learning: From Theory to Algorithms
MRT: Mohri, Rostamizadeh, and Talwalkar, Foundations of Machine Learning
Bach: Bach, Learning Theory from First Principles
Zhang: Zhang, Mathematical Analysis of Machine Learning Algorithms
Wainwright: Wainwright, High-Dimensional Statistics: A Non-Asymptotic Viewpoint
Date     | Topic                                                | Supplements
M Sep  2 | No class: Labour Day                                 |
W Sep  4 | Course intro, ERM                                    | SSBD 1-2; MRT 2; Bach 2
W Sep  4 | Assignment 1 posted                                  |
M Sep  9 | Uniform convergence with finite classes              | SSBD 2-4; MRT 2
W Sep 11 | Concentration inequalities                           | SSBD B; MRT D; Bach 1.2; Zhang 2; Wainwright 2
M Sep 16 | Assignment 1 due at noon                             |
M Sep 16 | PAC learning; covering numbers                       | SSBD 3; MRT 2; Bach 4.4.4; Zhang 3.4/4/5
M Sep 16 | Drop deadline                                        |
W Sep 18 | Rademacher complexity                                | MRT 3.1; SSBD 26; Bach 4.5; Zhang 6
M Sep 23 |                                                      |
W Sep 25 |                                                      |
M Sep 30 | No class: National Day for Truth and Reconciliation  |
W Oct  2 |                                                      |
M Oct  7 |                                                      |
W Oct  9 |                                                      |
M Oct 14 | No class: Thanksgiving Day                           |
W Oct 16 |                                                      |
M Oct 21 |                                                      |
W Oct 23 |                                                      |
F Oct 25 | Withdrawal deadline                                  |
M Oct 28 |                                                      |
W Oct 30 |                                                      |
M Nov  4 |                                                      |
W Nov  6 |                                                      |
M Nov 11 | No class: midterm break / Remembrance Day            |
W Nov 13 | No class: midterm break                              |
M Nov 18 |                                                      |
W Nov 20 |                                                      |
M Nov 25 |                                                      |
W Nov 27 |                                                      |
M Dec  2 |                                                      |
W Dec  4 |                                                      |
? Dec ?? | Final exam (in person, handwritten); sometime Dec 10-21, likely Dec 16-21 to avoid NeurIPS |

Logistics

The course meets in person in Dempster 201, with possible rare exceptions (e.g. if I get sick but can still teach, I'll move it online). Note that this room does not have a recording setup; I may rig up a janky workaround, and will definitely publish thorough lecture notes (see last year's website for examples), but you should plan on usually coming to class.

Grading scheme: 70% assignments, 30% final.

Overview

Definitely covered: PAC learning, VC dimension, Rademacher complexity, concentration inequalities, margin bounds, stability. We'll also cover most of: PAC-Bayes, analysis of kernel methods, limitations of uniform convergence, analyzing deep nets via neural tangent kernels, provable gaps between kernel methods and deep learning, online learning, feasibility of private learning, and compression-based bounds.
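For a taste of what results in this area look like, here is the classic finite-class generalization bound (a minimal sketch in generic notation, not necessarily the notation the course notes will use): if the hypothesis class $\mathcal{H}$ is finite, the loss takes values in $[0,1]$, and we draw $n$ i.i.d. samples, then Hoeffding's inequality plus a union bound over $\mathcal{H}$ give, with probability at least $1-\delta$,

\[
L(h) \;\le\; \widehat{L}(h) + \sqrt{\frac{\ln\lvert\mathcal{H}\rvert + \ln(1/\delta)}{2n}}
\quad \text{simultaneously for all } h \in \mathcal{H},
\]

where $L(h)$ is the population risk and $\widehat{L}(h)$ the empirical risk on the sample. A two-sided version of the same argument bounds the excess risk of empirical risk minimization by essentially twice this term; much of the course is about getting guarantees like this when $\mathcal{H}$ is infinite.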

Prerequisites

There are no formal prerequisites. TL;DR: if you've done well in CS 340/540 or 440/550, didn't struggle with the probability stuff there, and are also pretty comfortable with proofs, you'll be fine. If not, keep reading.

I will roughly assume the following; if you're missing one of them, you can probably get by, but if you're missing multiple, talk to me about it.

If you have any specific questions about your background, please ask.

Resources

If you need to refresh your linear algebra or other areas of math:

In addition to the books above, some other points of view you might like:

Measure-theoretic probability is not required for this course, but there are a few places, and some related areas, where it can be helpful:

Similar courses: