Course Syllabus
About
This course primarily covers computational methods for processing and analyzing high-throughput genomic sequencing data, with a focus on developing probabilistic models for single-cell genomics. Topics include statistical inference, probabilistic graphical models, latent variable models, deep latent variable models, and their applications in single-cell genomics.
Lectures
Lectures will be held Monday/Wednesday from 3:00-4:30pm PST. Attendance is expected at all lectures, as course participation forms part of the final mark.
Textbooks
AOS: Larry Wasserman. All of statistics.
MML: Marc Deisenroth, A. Aldo Faisal, and Cheng Soon Ong. Mathematics for machine learning.
PML: Kevin Murphy. Probabilistic machine learning.
ECB: Bruce Alberts, Dennis Bray, Karen Hopkin, Alexander D. Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter. Essential cell biology.
MBC: Bruce Alberts, Dennis Bray, Karen Hopkin, Alexander D. Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter. Molecular biology of the cell.
BSA: Richard Durbin, Sean Eddy, Anders Krogh, Graeme Mitchison. Biological Sequence Analysis.
Lauren M. Sompayrac. How the immune system works.
Homework
Schedule
| Lecture | Date | Topic | Slides | Reading | Paper | Others | Scribe |
|---|---|---|---|---|---|---|---|
| 1 | 07-09-2022 | Introduction: Course logistics; Vector, Norm, Matrix SVD, Single-cell RNA-sequencing | lec01-intro.pdf | MML Ch. 1-4 | | You can use Overleaf to write LaTeX: https://www.overleaf.com/latex/templates Using RStudio for data analysis: | |
| 2 | 12-09-2022 | Cells: Cells, Nucleus, Chromosomes, DNA, RNA, Protein, The Central Dogma; scRNA-seq, Ambient RNA, Droplets, Empty Droplets, Doublets, Cell capture rates; UMI, 3' tag | | ECB Ch. 1, 5, 7 | | | CPSC545-lec02-scribe.pdf |
| 3 | 14-09-2022 | Probability Primer: Random Experiments; Random Variables; CDF, PDF, PMF; Important Continuous and Discrete R.V.s; Conditioning/Independence/Bayes; Expectation, Variance, Covariance; Conditional Expectations | | AOS Ch. 1-3 | | | lec03-probability-primer-scribe.pdf |
| | 19-09-2022 | No class | | | | | |
| 4 | 21-09-2022 | Statistical Inference: Bayesian and Frequentist Inference; Bayesian+MAP+ML; Beta-Binomial Model for Variant Detection; Conjugate Priors | lec04-statistical-inference.pdf | AOS Ch. 6, Ch. 11.1-2 | SNVMix: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2832826/ JointSNVMix: https://pubmed.ncbi.nlm.nih.gov/22285562/ | | |
| 5 | 26-09-2022 | Generalized Linear Models: Linear Regression; Logistic Regression; Multi-class Logistic Regression; Poisson Regression; Negative Binomial Regression; GLM and the Exponential Family | | PML Ch. 11.1-11.2.2, Ch. 12 | | | |
| 6 | 28-09-2022 | Latent Variable Models & Probabilistic Graphical Models: Joint Distributions / Global and Local Latent Variables; Random Variables/Fixed Parameters/Plate Notations; Conditional Independence/D-separation; Markov Blankets | | PML Ch. 3.6 | Data Analysis with Latent Variable Models: http://www.cs.columbia.edu/~blei/papers/Blei2014b.pdf Optional: xseq | | |
| 7 | 03-10-2022 | Finite Mixture Models: Mixture of Binomial; Monte Carlo Integration, Importance Sampling, Rejection Sampling, Gibbs Sampling, MCMC; Bayesian Mixture Models | | PML Book 2 Ch. 11-12 (optional) | SNVMix: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2832826/ | | |
| 8 | 05-10-2022 | Finite Mixture Models II: EM Algorithm for Missing Data Problems; Properties of the EM Algorithm | | | | | |
| 9 | 10-10-2022 | Probabilistic Principal Component Analysis | | | | | |
| 10 | 12-10-2022 | Principal Component Analysis II | | | | | |
| 11 | 17-10-2022 | Variational Autoencoders | | | | | |
| 12 | 19-10-2022 | Variational Autoencoders II | | | | | |
| 13 | 24-10-2022 | Variational Autoencoders III | | | | | |
| 14 | 26-10-2022 | HMM: Viterbi | | | | | |
| 15 | 31-10-2022 | HMM II: EM for Learning Parameters; Sequence Alignment; Profile HMM | | | | | |
| 16 | 02-11-2022 | Topic Model | | | | | |
| 17 | 07-11-2022 | Probabilistic Matrix Factorization | | | | | |
| | 09-11-2022 | Midterm break, no class | | | | | |
| 18 | 14-11-2022 | Graph Neural Networks | | | | | |
| 19 | 16-11-2022 | Diffusion Models | | | | | |
| 20 | 21-11-2022 | Causal Inference | | | | | |
| 21 | 23-11-2022 | Guest Lectures | | | | | |
| 22 | 28-11-2022 | Student Presentations | | | | | |
| 23 | 30-11-2022 | Student Presentations | | | | | |
| 24 | 05-12-2022 | Project Presentations | | | | | |
| 25 | 07-12-2022 | Project Presentations | | | | | |
Grading
The marks for the course will be distributed as follows (note: the instructors reserve the right to modify the marking scheme at any time, although the final marking scheme should be fairly close to that given here):
- A project (60%)
  - A project proposal (10%): a one-page writeup of your proposed project
  - Project writeup and presentation (50%): you will need to turn in your report (8 pages excluding references), code, and a presentation
- Course participation (20%)
  - Each student will present and lead the discussion of at least one paper (10%)
  - Taking notes (10%): for each lecture, one person takes notes and sends them to me, and I will post them to Canvas
- Homework (20%)
  - Two homework assignments, submitted on Canvas, with mathematical derivations, programming, and interpretation of results
Participation
For course participation, each student will be responsible for leading the discussion of one paper in class. First, the presenter will create a thread on Piazza one week before the paper presentation. Second, all other students will briefly post their thoughts on the paper in this thread. All posts must be submitted no later than 24 hours before the paper presentation.
Final Project
The final project will form the major assessment in this course. The project can be done in teams of 1-2 people. The project should be a novel piece of research in the field of bioinformatic algorithms. Suitable topics include the development of a new algorithm, theoretical result, or computational method. The final project will be assessed in two ways. First, you will deliver a 25-minute presentation to the class with 5 minutes of question time. Second, you will deliver a written report (8 pages excluding references) in the style of a bioinformatics journal article.
Course communication
Piazza will be used for course communication including announcements, questions about lectures and any other logistics. A link to the Piazza group can be found on the left of the Canvas page.