2nd edition. — Boston: The MIT Press, 2018. — 505 p.
What is machine learning?
What kind of problems can be tackled using machine learning?
Some standard learning tasks
Learning stages
Learning scenarios
Generalization
The PAC learning model
Guarantees for finite hypothesis sets — consistent case
Guarantees for finite hypothesis sets — inconsistent case
Deterministic versus stochastic scenarios
Bayes error and noise
Exercises
Rademacher Complexity and VC-Dimension
Rademacher complexity
Growth function
VC-dimension
Lower bounds
Exercises
Estimation and approximation errors
Empirical risk minimization (ERM)
Structural risk minimization (SRM)
Cross-validation
n-Fold cross-validation
Regularization-based algorithms
Convex surrogate losses
Exercises
Linear classification
Separable case
Primal optimization problem
Dual optimization problem
Leave-one-out analysis
Non-separable case
Primal optimization problem
Support vectors
Dual optimization problem
Margin theory
Exercises
Definitions
Reproducing kernel Hilbert space
Properties
SVMs with PDS kernels
Learning guarantees
Negative definite symmetric kernels
Sequence kernels
Weighted transducers
Rational kernels
Approximate kernel feature maps
Exercises
AdaBoost
Bound on the empirical error
Relationship with coordinate descent
VC-dimension-based analysis
L-geometric margin
Margin-based analysis
Margin maximization
Game-theoretic interpretation
L-regularization
Discussion
Exercises
On-Line Learning
Prediction with expert advice
Mistake bounds and Halving algorithm
Weighted majority algorithm
Randomized weighted majority algorithm
Exponential weighted average algorithm
Perceptron algorithm
Winnow algorithm
On-line to batch conversion
Game-theoretic connection
Exercises
Multi-class classification problem
Generalization bounds
Multi-class SVMs
Multi-class boosting algorithms
Decision trees
Aggregated multi-class algorithms
One-versus-one
Error-correcting output codes
Structured prediction algorithms
Exercises
Ranking
The problem of ranking
Generalization bound
Ranking with SVMs
RankBoost
Bound on the empirical error
Relationship with coordinate descent
Margin bound for ensemble methods in ranking
Bipartite ranking
Boosting in bipartite ranking
Area under the ROC curve
Second-stage ranking problem
Deterministic algorithm
Randomized algorithm
Other ranking criteria
Exercises
The problem of regression
Finite hypothesis sets
Rademacher complexity bounds
Pseudo-dimension bounds
Linear regression
Kernel ridge regression
Support vector regression
Lasso
On-line regression algorithms
Exercises
Density estimation problem
Maximum Likelihood (ML) solution
Density estimation problem augmented with features
Maxent principle
Dual problem
Generalization bound
Coordinate descent algorithm
Extensions
L-regularization
Exercises
Learning problem
Conditional Maxent models
Dual problem
Properties
Feature vectors
Generalization bounds
Logistic model
L-regularization
Proof of the duality theorem
Exercises
Definitions
Stability-based generalization guarantee
Stability of kernel-based regularization algorithms
Application to regression algorithms: SVR and KRR
Application to classification algorithms: SVMs
Exercises
Dimensionality Reduction
Principal component analysis
Kernel principal component analysis (KPCA)
Isomap
Laplacian eigenmaps
Locally linear embedding (LLE)
Johnson-Lindenstrauss lemma
Exercises
Finite automata
Efficient exact learning
Passive learning
Learning with queries
Learning automata with queries
Identification in the limit
Learning reversible automata
Exercises
Learning scenario
Markov decision process model
Definition
Optimal policies
Policy evaluation
Value iteration
Policy iteration
Linear programming
Learning algorithms
Stochastic approximation
TD() algorithm
Q-learning algorithm
TD(λ) algorithm
Large state space
Norms
Dual norms
Matrix norms
Symmetric positive semidefinite (SPSD) matrices
Convexity
Constrained optimization
Subgradients
Conjugate functions
Exercises
Random variables
Expectation and Markov’s inequality
Variance and Chebyshev’s inequality
Momentgenerating functions
Exercises
Hoeffding’s inequality
Sanov’s theorem
Multiplicative Chernoff bounds
Binomial distribution tails: Lower bound
Azuma’s inequality
McDiarmid’s inequality
Khintchine-Kahane inequality
Maximal inequality
Exercises
Entropy
Relative entropy
Bregman divergences
Exercises