CS 540 (Machine learning) Fall 2008 (term 1)

Projects

Click here.

Admin

Lectures: TR 11-12.30. Room: Macmillan 154, opposite CS on Main Mall.

Office hours: Fri 1-2pm.

If you cannot register, but you feel you have the required background, please send your student id number to Joyce Poon (poon@cs.ubc.ca). If you are from another UBC department, fill out this form.

Outline

This is a graduate class on machine learning, covering the foundations, such as (Bayesian) statistics and information theory, as well as topics such as supervised learning (classification, regression) and unsupervised learning (clustering, dimensionality reduction). (I will cover graphical models in Stat 521A in Spring 2009; note that CS540 is highly recommended as a pre-requisite for Stat 521A.) Examples of applications in the areas of vision, speech/language and biology will be used throughout.

Pre-requisites

This will be a fast-paced class, so prior exposure to machine learning at the undergraduate level (such as CS340 or Stat 306) is highly desirable. However, the only official pre-requisites are linear algebra, probability theory, multivariate calculus and programming skills (preferably Matlab or R).

If you do not have the pre-requisites, but are still interested in learning about machine learning, I recommend you take CS340, the undergrad version of this class, taught by Nando de Freitas in Fall 2008.

Workload

This class will be quite time-consuming. Attending lectures: 3h. Weekly homeworks: about 6h. Weekly reading: about 6h. Total: about 15h/week.
If you cannot handle this, I recommend you take CS340, the undergrad version of this class.

Textbook

Machine Learning: a probabilistic approach. Students will be able to buy a copy of this book, which I am writing, after Sept 8th, from Copiesmart Centre, 103-5728 U. Blvd, right next to McDonald's in the UBC Village.

If you find typos, please follow the procedure outlined here.

In addition to my book, you may find the following useful:

Pattern Recognition and Machine Learning, Chris Bishop, Springer 2006.
The Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani and Jerome Friedman, Springer 2001.
All of Statistics, Larry Wasserman, Springer 2004.
Information Theory, Inference, and Learning Algorithms, David MacKay, CUP 2003.
Bayesian Computation with R, Jim Albert, Springer 2007.
Pattern Classification (2nd ed.), Duda, Hart, Stork, Wiley 2001.
John Langford's blog
Radford Neal's blog
Andrew Gelman's blog

Grading

Midterm (open-book): 30%, Weekly assignments: 30%, Final project: 40%.

Homeworks

Homeworks are listed below. Numbers refer to exercises in my book. (M) after a homework exercise refers to Matlab. Data and supporting code for the homeworks can be found by downloading PMTK.

Tentative Timetable

Reading material refers to the 7 Sep 08 version. "New" means the midterm (8 Oct 08) version.

L# Date Topic Reading Homework

L1 Tue Sep 9 Intro Ch 1, Matlab tutorial hw1.pdf

L2 Thu Sep 11 Data visualization, probabilistic models, MLE Ch 2 .

L3 Tue Sep 16 Basic concepts New version of ch 2 hw2.pdf prostate.mat (same as in BLT/Data). hw2Sol.pdf

L4 Thu Sep 18 Linear regression 19.2, 19.3, Review ch 38 .

L5 Tue Sep 23 Linear algebra, Ridge regression 19.4, Review ch 38 Hw3.pdf, hw3Sol.pdf

L6 Thu Sep 25 Logistic regression 22.1, 22.2 .

L7 Tue Sep 30 MVN, LDA/QDA 3.2, 4.2 hw4.pdf, naiveBayesExCode.zip, hw4Sol.pdf

L8 Thu Oct 2 Naive Bayes; Beta-Binomial model Ch 4, 9.3 .

L9 Tue Oct 7 Bayesian concept learning; Beta-Binomial; Dirichlet-Multinomial 8.1-8.3, 9.1-9.4 hw5.pdf, NBLRcode.zip

L10 Thu Oct 9 Bayesian parameter estimation for Gaussians, generative classifiers, linear and logistic regression 5.6, 22.1.3, 9.6 .

L11 Tue Oct 14 Decision theory; model selection New ch 5, new ch 6, new 3.3, new 8.6 .

L12 Thu Oct 16 Midterm . .

L13 Tue Oct 21 Feature selection 20.1-20.3, 21.1-21.3 .

L14 Thu Oct 23 L1 regularization . .

L15 Tue Oct 28 Mixture models, EM, non-parametric models 3.3-3.4, 14.1-14.5, 17.1-1.3 HW6

L16 Thu Oct 30 Guest lecture by Matt Brown on applications of non-parametric regression . .

L17 Tue Nov 4 Directed graphical models . Project proposals due

L18 Thu Nov 6 Conditional mixture models, sparse Bayesian learning, EM as bound optimization . .

L19 Tue Nov 11 Remembrance day . .

L20 Thu Nov 13 Kalman filters . .

L21 Tue Nov 18 PCA . .

L22 Thu Nov 20 Markov models . .

L23 Tue Nov 25 HMMs . .

L24 Thu Nov 27 MCMC . .

Final projects: presentation on Thu Dec 4th; written report due Mon Dec 15th.
