
Schölkopf B., Smola A.J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond

The MIT Press, 2002. 645 p. ISBN 0262194759, 978-0262194754.
Series: Adaptive Computation and Machine Learning.
In the 1990s, a new type of learning algorithm was developed, based on results from statistical learning theory: the Support Vector Machine (SVM). This gave rise to a new class of theoretically elegant learning machines that use a central concept of SVMs — kernels — for a number of learning tasks. Kernel machines provide a modular framework that can be adapted to different tasks and domains by the choice of the kernel function and the base algorithm. They are replacing neural networks in a variety of fields, including engineering, information retrieval, and bioinformatics. Learning with Kernels provides an introduction to SVMs and related kernel methods. Although the book begins with the basics, it also includes the latest research. It provides all of the concepts necessary to enable a reader equipped with some basic mathematical knowledge to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms and to understand and apply the powerful algorithms that have been developed over the last few years.
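The blurb's central concept, the kernel, is a similarity function whose pairwise evaluations on a data set form a symmetric positive semidefinite Gram matrix. The following is a minimal NumPy sketch of that idea using the Gaussian (RBF) kernel; the `rbf_kernel` helper and its `gamma` parameter are illustrative names chosen here, not code from the book:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    # Pairwise squared distances via the expansion ||x||^2 + ||y||^2 - 2 x.y
    sq_dists = ((X**2).sum(axis=1)[:, None]
                + (Y**2).sum(axis=1)[None, :]
                - 2.0 * X @ Y.T)
    # Clamp tiny negative values caused by floating-point round-off
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

# Three toy points in the plane
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
K = rbf_kernel(X, X)

# A valid kernel yields a symmetric positive semidefinite Gram matrix,
# with k(x, x) = 1 on the diagonal for the RBF kernel
assert np.allclose(K, K.T)
assert np.allclose(np.diag(K), 1.0)
assert np.all(np.linalg.eigvalsh(K) >= -1e-10)
```

Swapping in a different kernel function (or a different base algorithm consuming the Gram matrix) is exactly the modularity the description refers to.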
A Tutorial Introduction.
Data Representation and Similarity, A Simple Pattern Recognition Algorithm,
Some Insights From Statistical Learning Theory, Hyperplane Classifiers, Support Vector Classification,
Support Vector Regression, Kernel Principal Component Analysis, Empirical Results and Implementations.
I. Concepts and Tools.
Kernels.
Product Features, The Representation of Similarities in Linear Spaces,
Examples and Properties of Kernels, The Representation of Dissimilarities in Linear Spaces.
Risk and Loss Functions.
Loss Functions, Test Error and Expected Risk, A Statistical Perspective, Robust Estimators.
Regularization.
The Regularized Risk Functional, The Representer Theorem, Regularization Operators,
Translation Invariant Kernels, Translation Invariant Kernels in Higher Dimensions, Dot Product Kernels,
Multi-Output Regularization, Semiparametric Regularization, Coefficient Based Regularization.
Elements of Statistical Learning Theory.
The Law of Large Numbers, When Does Learning Work: the Question of Consistency,
Uniform Convergence and Consistency, How to Derive a VC Bound, A Model Selection Example.
Optimization.
Convex Optimization, Unconstrained Problems, Constrained Problems,
Interior Point Methods, Maximum Search Problems.
II. Support Vector Machines.
Pattern Recognition.
Separating Hyperplanes, The Role of the Margin, Optimal Margin Hyperplanes, Nonlinear Support Vector Classifiers,
Soft Margin Hyperplanes, Multi-Class Classification, Variations on a Theme, Experiments.
Single-Class Problems: Quantile Estimation and Novelty Detection.
A Distribution’s Support and Quantiles, Algorithms, Optimization, Theory, Discussion, Experiments.
Regression Estimation.
Linear Regression with Insensitive Loss Function, Dual Problems.
ν-SV Regression, Convex Combinations and ℓ1-Norms.
Parametric Insensitivity Models, Applications.
Implementation.
Tricks of the Trade, Sparse Greedy Matrix Approximation, Interior Point Algorithms,
Subset Selection Methods, Sequential Minimal Optimization, Iterative Methods.
Incorporating Invariances.
Prior Knowledge, Transformation Invariance, The Virtual SV Method,
Constructing Invariance Kernels, The Jittered SV Method.
Learning Theory Revisited.
Concentration of Measure Inequalities, Leave-One-Out Estimates,
PAC-Bayesian Bounds, Operator-Theoretic Methods in Learning Theory.
III. Kernel Methods.
Designing Kernels.
Tricks for Constructing Kernels, String Kernels, Locality-Improved Kernels, Natural Kernels.
Kernel Feature Extraction.
Kernel PCA, Kernel PCA Experiments, A Framework for Feature Extraction, Algorithms for Sparse KFA, KFA Experiments.
Kernel Fisher Discriminant.
Fisher’s Discriminant in Feature Space, Efficient Training of Kernel Fisher Discriminants, Probabilistic Outputs, Experiments.
Bayesian Kernel Methods.
Bayesics, Inference Methods, Gaussian Processes, Implementation of Gaussian Processes,
Laplacian Processes, Relevance Vector Machines.
Regularized Principal Manifolds.
A Coding Framework, A Regularized Quantization Functional, An Algorithm for Minimizing Rreg[f].
Connections to Other Algorithms, Uniform Convergence Bounds, Experiments.
Pre-Images and Reduced Set Methods.
The Pre-Image Problem, Finding Approximate Pre-Images, Reduced Set Methods,
Reduced Set Selection Methods, Reduced Set Construction Methods, Sequential Evaluation of Reduced Set Expansions.
A. Addenda: Data Sets, Proofs.
B. Mathematical Prerequisites: Probability, Linear Algebra, Functional Analysis.