Mehryar M., Rostamizadeh A., Talwalkar A. Foundations of Machine Learning

pdf file
size 8,30 MB

added by Masherov 01/29/2019 10:45
info modified 09/28/2021 17:15

Mehryar M., Rostamizadeh A., Talwalkar A. Foundations of Machine Learning

2nd edition. — Boston: The MIT Press, 2018. — 505 p.

What is machine learning?
What kind of problems can be tackled using machine learning?
Some standard learning tasks
Learning stages
Learning scenarios
Generalization
The PAC learning model
Guarantees for finite hypothesis sets — consistent case
Guarantees for finite hypothesis sets — inconsistent case
Deterministic versus stochastic scenarios
Bayes error and noise
Exercises
Rademacher Complexity and VC-Dimension
Rademacher complexity
Growth function
VC-dimension
Lower bounds

Exercises
Estimation and approximation errors
Empirical risk minimization (ERM)
Structural risk minimization (SRM)
Cross-validation
n-Fold cross-validation
Regularization-based algorithms
Convex surrogate losses

Exercises
Linear classification
Separable case
Primal optimization problem
Dual optimization problem
Leave-one-out analysis
Non-separable case
Primal optimization problem
Support vectors
Dual optimization problem
Margin theory
Exercises

Definitions
Reproducing kernel Hilbert space
Properties
SVMs with PDS kernels
Learning guarantees
Negative definite symmetric kernels
Sequence kernels
Weighted transducers
Rational kernels
Approximate kernel feature maps

Exercises

AdaBoost
Bound on the empirical error
Relationship with coordinate descent
VC-dimension-based analysis
L-geometric margin
Margin-based analysis
Margin maximization
Game-theoretic interpretation
L-regularization
Discussion

Exercises
On-Line Learning
Prediction with expert advice
Mistake bounds and Halving algorithm
Weighted majority algorithm
Randomized weighted majority algorithm
Exponential weighted average algorithm
Perceptron algorithm
Winnow algorithm
On-line to batch conversion
Game-theoretic connection

Exercises
Multi-class classification problem
Generalization bounds
Multi-class SVMs
Multi-class boosting algorithms
Decision trees
Aggregated multi-class algorithms
One-versus-one
Error-correcting output codes
Structured prediction algorithms

Exercises
Ranking
The problem of ranking
Generalization bound
Ranking with SVMs
RankBoost
Bound on the empirical error
Relationship with coordinate descent
Margin bound for ensemble methods in ranking
Bipartite ranking
Boosting in bipartite ranking
Area under the ROC curve
Second-stage ranking problem
Deterministic algorithm
Randomized algorithm
Other ranking criteria

Exercises
The problem of regression
Finite hypothesis sets
Rademacher complexity bounds
Pseudo-dimension bounds
Linear regression
Kernel ridge regression
Support vector regression
Lasso
On-line regression algorithms

Exercises
Density estimation problem
Maximum Likelihood (ML) solution
Density estimation problem augmented with features
Maxent principle
Dual problem
Generalization bound
Coordinate descent algorithm
Extensions
L-regularization

Exercises
Learning problem
Conditional Maxent models
Dual problem
Properties
Feature vectors
Generalization bounds
Logistic model
L-regularization
Proof of the duality theorem

Exercises
Definitions
Stability-based generalization guarantee
Stability of kernel-based regularization algorithms
Application to regression algorithms: SVR and KRR
Application to classification algorithms: SVMs

Exercises
Dimensionality Reduction
Principal component analysis
Kernel principal component analysis (KPCA)
Isomap
Laplacian eigenmaps
Locally linear embedding (LLE)
Johnson-Lindenstrauss lemma
Exercises

Finite automata
Efficient exact learning
Passive learning
Learning with queries
Learning automata with queries
Identification in the limit
Learning reversible automata

Exercises
Learning scenario
Markov decision process model
Definition
Optimal policies
Policy evaluation
Value iteration
Policy iteration
Linear programming
Learning algorithms
Stochastic approximation
TD() algorithm
Q-learning algorithm
TD(λ) algorithm
Large state space

Norms
Dual norms
Matrix norms
Symmetric positive semidefinite (SPSD) matrices
Convexity
Constrained optimization
Subgradients
Conjugate functions

Exercises
Random variables
Expectation and Markov’s inequality
Variance and Chebyshev’s inequality
Momentgenerating functions
Exercises
Hoeffding’s inequality
Sanov’s theorem
Multiplicative Chernoff bounds
Binomial distribution tails: Lower bound
Azuma’s inequality
McDiarmid’s inequality
Khintchine-Kahane inequality
Maximal inequality
Exercises
Entropy
Relative entropy
Bregman divergences

Exercises

Mehryar M., Rostamizadeh A., Talwalkar A. Foundations of Machine Learning

See also

Goodfellow Ian, Bengio Yoshua, Courville Aaron. Deep Learning Book

Murphy K.P. Machine Learning: A Probabilistic Perspective

Nagy Z. Artificial Intelligence and Machine Learning Fundamentals