Outline

Introduction
  • Defining machine learning and data mining
  • Relation to other fields (statistics, probability, information theory)
  • Computational and I/O complexity
  • Scalability and on-line processing
  • Privacy issues and social impact
  • Applications in AI, computer vision, computer games, search engines, bioinformatics, robotics, HCI and graphics

Exploratory Data Analysis
  • Data extraction and preprocessing
  • Missing value imputation
  • Descriptive summarisation
  • Visualisation
  • Spectral methods and latent semantic indexing
  • Probabilistic component analysis
  • Multi-dimensional scaling
  • Examples: text mining, image compression and multimedia databases

Regression
  • Linear regression (least squares and ridge)
  • Model assessment and cross-validation
  • Batch and on-line optimisation
  • Nonlinear regression (neural nets and kernel machines)
  • Example: predicting commodity prices

Classification
  • Linear classification
  • Bayes risk
  • Decision trees
  • Boosting
  • Support vector machines
  • Example: DNA microarray classification

Clustering
  • Nearest neighbours and K-means
  • Spectral kernel methods
  • The EM algorithm
  • Mixture models for discrete and continuous data
  • Hidden Markov models
  • Examples: web mining, collaborative filtering, music and image clustering, automatic translation, computer games and object recognition