Selected Chapters of Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
by Daniel Jurafsky and James H. Martin (J&M). We will follow the draft chapters from the planned 3rd edition.
Week | Lectures | Readings | Notes/Supplemental Materials |
---|---|---|---|
Jan 10-14 |
Intro Course Overview (ppt , pdf ) Finite State Text Processing, Morphology, Pynini (ppt , pdf ) |
Chp. 2 |
FIRST CLASS on Jan 11. Background reading on Finite State Automata, only Sec. 2.2.1, 2.2.2, 2.2.3 (if you need it). Quiz 1 (questions 1-6) background/quizzes: FSA, Regular Expressions (out 10 - due 17) |
Jan 17-21 |
Finish FST + Text normalization, Spelling (ppt , pdf )
|
Appendix B
Chp. 3&7 |
Background reading on Probability and Information Theory (if you need it). Quiz 1 (questions 7-12) background/quizzes: (Conditional Prob., Bayes rule, ...) (out 10 - due 17). Hw1 Text Normalization, Pynini (W-FST) and Spelling Checker (out 18 - due 26) background/quizzes: MLP? |
Jan 24-28 |
Text Classification (Sentiment) (ppt , pdf )
|
Chp. 4&5 |
Hw2 Language Models: traditional vs. neural (out 27 - due Feb 3) |
Jan 31 - Feb 4 |
Sequence labeling: Markov Models -POS tagging and NER (ppt , pdf ) |
Chp. 8&9 |
Hw3 Text classification (newsgroup) with BOW and fixed embeddings (out Feb 4 - due 13)
|
Feb 7-11 | Chp. 9&10 | Hw4 + Seq modeling traditional & neural (only LSTM) (out 14 - due 23) | |
Feb 14-18 |
Finish Transformers (ppt , pdf ) Pre-trained language models, Transfer Learning with Contextual Embeddings (ppt , pdf ) |
Chp. 11 (draft now available) |
BERT, BART, RoBERTa... BERT, The Illustrated BERT, ELMo, and co., Chen2019, BERTScore
|
Feb 21-25 | Winter Session Term 2 mid-term break | ||
Feb 28 - Mar 4 |
March 1st MIDTERM (Practice Questions )
Intro to syntax, Context Free Grammars and Parsing (ppt , pdf ) |
Chp. 12-13 |
|
Mar 7-11 |
Chunking (Shallow Constituency Parsing by Fine Tuning), Dependency Parsing, Treebanks (ppt , pdf ) (cont'd) Dependency/Constituency Parsing PCFG Traditional CKY / Neural for Both Const. and Dep. (ppt , pdf ) |
Chp. 13-14 |
Dependency/Constituency Traditional CKY / Neural. Hw5 newsgroup classification (a) with RoBERTa document embeddings (b) with syntactic features (POS and dependency relations) |
Mar 14-18 |
|
Chp. 16 |
SemEval (Semantic tasks) Embeddings, WordNet, Concept Graphs, Lexicons for Sentiment. Hw6 Syntax & Lexical Semantics |
Mar 21-25 |
Chp. 18-19 Chp. 6 |
Hw7 LDA |
|
Mar 28 - Apr 1 |
- Intro Discourse & Discourse Parsing and Neural topic segmentation/labeling (ppt , pdf ) |
Chp. 21&22 |
Coreference, Discourse Parsing (shift-reduce neural)... e.g. for argumentation mining (mention the IBM Debater system). Hw8 Discourse Parsing (?just to "explore" SOTA system?) |
Apr 4-8 |
- Intro Summarization (ppt , pdf ) - Apr 7. (read paper: PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization (ICML 2020), paper blog-post) (ppt , pdf ) QUESTIONS (part1) |
Not in textbook | Extractive / Abstractive
(introduce pre-training objectives tailored for a specific task) Hw9 Summarization (?just to "explore" SOTA system?) |
Wed Apr 20 7pm | FINAL EXAM room IRC-4 |
|