Outline
Introduction
Defining machine learning and data mining
Relation to other fields (statistics, probability, information theory)
Computational and I/O complexity
Scalability and on-line processing
Privacy issues and social impact
Applications in AI, computer vision, computer games, search engines, bioinformatics, robotics, HCI and graphics
Exploratory Data Analysis
Data extraction and preprocessing
Missing value imputation
Descriptive summarisation
Visualisation
Spectral methods and latent semantic indexing
Probabilistic component analysis
Multi-dimensional scaling
Examples: text mining, image compression and multimedia databases
Regression
Linear regression (least squares and ridge)
Model assessment and cross-validation
Batch and on-line optimisation
Nonlinear regression (neural nets and kernel machines)
Example: predicting commodity prices
Classification
Linear classification
Bayes risk
Decision trees
Boosting
Support vector machines
Example: DNA microarray classification
Clustering
Nearest neighbours and K-means
Spectral kernel methods
The EM algorithm
Mixture models for discrete and continuous data
Hidden Markov models
Examples: web mining, collaborative filtering, music and image clustering, automatic translation, computer games and object recognition