CPSC 525: Course Outline and Reading List
Instructor: David Lowe
January-April 2014
Course home page: http://www.cs.ubc.ca/~lowe/525
Textbook: While most of the course is based on original research
papers, we will also consult the following textbook by Richard
Szeliski. It is available free online or can be purchased in
printed form.
Computer Vision: Algorithms and Applications by Richard Szeliski
The following is a tentative list of topics and readings for the
course. It will be changed and updated as the course proceeds.
Introduction
The first class will provide an overview of the computer vision field and
its applications.
Read Chapter 1 of Szeliski's book
for an introduction to computer vision and a brief history of the field.
Stereo vision
Topics: Epipolar geometry and rectification. Correlation and feature
matching. Discussion of the first assignment. Belief propagation.
Pascal Fua, "A parallel stereo algorithm that produces dense depth
maps and preserves image features," Machine Vision and Applications,
6 (1993), pp. 35-49.
[PDF]
D. Scharstein and R. Szeliski. "A Taxonomy and Evaluation of
Dense Two-Frame Stereo Correspondence Algorithms,"
International Journal of Computer Vision,
47 (2002), pp. 7-42.
[Web site with data and source code]
[PDF]
Pedro Felzenszwalb and Daniel Huttenlocher,
"Efficient Belief Propagation for Early Vision,"
Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
[Presentation]
[Web site with source code]
[PDF]
Image matching and recognition with invariant local features
Interest points. Rotation, scale, and illumination invariance. Image region
descriptors. RANSAC. The Hough transform.
Section 4.1, Feature Detection and Matching, from
Szeliski's book
David G. Lowe,
"Distinctive image features from scale-invariant keypoints,"
International Journal of Computer Vision,
60, 2 (2004), pp. 91-110.
[PDF]
M. Calonder, V. Lepetit, C. Strecha, and P. Fua,
"BRIEF: Binary Robust Independent Elementary Features,"
European Conference on Computer Vision (ECCV), 2010.
[PDF]
Articles from Wikipedia: RANSAC; The Hough Transform.
Image registration and 3D reconstruction
Non-linear least-squares with Gauss-Newton.
Levenberg-Marquardt. Robust solutions. Solving for 3D structure and
camera pose. Dense surface reconstruction.
Section 6.1, Feature-based alignment, from
Szeliski's book
Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring
photo collections in 3D," ACM Transactions on Graphics (SIGGRAPH),
25(3), 2006, 835-846.
[PDF]
Optional:
Michael Goesele, Noah Snavely, Brian Curless, Hugues Hoppe, Steven M. Seitz,
"Multi-View Stereo for Community Photo Collections,"
ICCV (2007).
[PDF]
[Project web site]
Background reading: Linear Least Squares; Gauss-Newton Algorithm; Levenberg-Marquardt.
Matching and recognition in large datasets
Scaling recognition to large image collections.
K-means clustering algorithm. K-d trees.
Approximate nearest-neighbour matching in high-dimensional spaces.
FLANN.
David Nister and Henrik Stewenius,
"Scalable recognition with a vocabulary tree,"
Conference on Computer Vision and Pattern Recognition, 2006.
[PDF]
Marius Muja and David G. Lowe,
"Fast approximate nearest neighbors with automatic algorithm configuration,"
International Conference on Computer
Vision Theory and Applications (VISAPP), 2009.
[PDF]
[Source code]
Articles from Wikipedia: K-means clustering; K-d trees.
Learning to recognize object categories
Face detection. The AdaBoost algorithm.
Learning generative and discriminative models. The bag-of-features approach
versus learned geometry. Object segmentation from recognition.
Paul Viola and Michael Jones,
"Rapid object detection using a boosted cascade of simple features,"
Conference on Computer Vision and Pattern Recognition, 2001,
pp. 511-518.
[PDF]
For background on AdaBoost, read Freund and Schapire,
"A short introduction to boosting," Journal of Japanese Society for
Artificial Intelligence, 1999.
[PDF]
Chapter 14, Recognition, from
Szeliski's book
Li Fei-Fei, Rob Fergus, Antonio Torralba,
"ICCV 2009 Short Course: Recognizing and Learning Object Categories."
[Course page, including Matlab code]
Optional:
Bastian Leibe, Edgar Seemann, and Bernt Schiele,
"Pedestrian detection in crowded scenes,"
CVPR 2005, San Diego (June 2005).
[PDF]
Scene perception
Recognition of scene categories. Recognition from low-resolution images.
Discriminative features for location recognition.
S. Lazebnik, C. Schmid, and J. Ponce,
"Beyond Bags of Features: Spatial Pyramid Matching for Recognizing
Natural Scene Categories,"
IEEE Conference on Computer Vision and Pattern Recognition,
New York (June 2006).
[PDF]
A. Torralba, R. Fergus, W. T. Freeman,
"80 million tiny images: a large dataset for non-parametric object and
scene recognition,"
PAMI, 30, 11 (2008).
[PDF]
Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros,
"What Makes Paris Look like Paris?"
SIGGRAPH (2012).
[PDF]
[Project page]
Motion tracking and interpretation
Measuring optical flow. Structure from motion. Kalman filter and
estimation theory. Colour histograms. Tracking with particle filters.
Action recognition.
Andrew J. Davison, Ian Reid, Nicholas Molton and Olivier Stasse,
"MonoSLAM: Real-Time Single Camera SLAM,"
IEEE PAMI (June 2007).
[PDF]
[Davison's web site]
P. Pérez, C. Hue, J. Vermaak and M. Gangnet,
"Color-based probabilistic tracking,"
European Conference on Computer Vision, ECCV 2002,
Copenhagen, Denmark (June 2002).
[PDF]
Alexei A. Efros, Alexander C. Berg, Greg Mori and Jitendra Malik,
"Recognizing Action at a Distance,"
International Conference on Computer Vision, Nice, France (2003).
[PDF]
Neurophysiology of vision
Structure of the visual cortex. Higher-level neurophysiology of
vision. "What" vs. "where" pathways in the brain. Models of
recognition in the brain.
Simon A.J. Winder, "A brief survey of central mechanisms in primate
visual perception," (2002).
[PDF]
R. Quiroga, et al., "Invariant visual representation by single neurons in
the human brain," Nature (2005).
[PDF]
T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio,
"Object recognition with cortex-like mechanisms,"
IEEE PAMI (2007).
[PDF]
Deep Learning for Vision
The back-propagation algorithm. Convolutional nets. Applications to object
category recognition.
Yann LeCun, Marc'Aurelio Ranzato,
"Deep Learning Tutorial,"
ICML (2013).
[PDF]
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton,
"ImageNet Classification with Deep Convolutional Neural Networks,"
NIPS (2012).
[PDF]
M.D. Zeiler, R. Fergus,
"Visualizing and Understanding Convolutional Networks,"
arXiv:1311.2901 (November 2013).
[PDF]
Andrej Karpathy,
"ConvNetJS: Deep Learning in your browser."
A JavaScript library for training deep learning models in the browser.
Articles from Wikipedia: The backpropagation algorithm; Deep learning.
Colour vision (optional)
Colour spaces. Colour constancy. The use of colour for recognition.
Section 2.3.2, Color, from
Szeliski's book
Brian Funt, Kobus Barnard and Lindsay Martin, "Is colour constancy
good enough?" European Conference on Computer Vision,
(1998), pp. 445-459.
[PDF]
Project presentations
The final few classes will consist of project presentations.