Springer, 2002. — 218 p.
Text classification, or the task of automatically assigning semantic categories to natural language text, has become one of the key methods for organizing online information. Since hand-coding such classification rules is costly or even impractical, most modem approaches employ machine learning techniques to automatically learn text classifiers from examples. However, none of these conventional approaches combines good prediction performance, theoretical understanding, and efficient training algorithms.
Based on ideas from Support Vector Machines (SVMs), this book presents a new approach to learning text classifiers from examples. It provides not only learning methods that empirically have state-of-the-art predictive performance, but also a theory that connects the properties of text-classification tasks with the generalization accuracy of the learning methods, as well as algorithms that efficiently implement the methods. In particular, the results show that the SVM approach to learning text classifiers is highly effective without greedy heuristic components. To explain these empirical findings, this book analyzes the statistical properties of textclassification tasks and presents a theoretical learning model that leads to bounds on the expected error rate of an SVM. The bounds are based on improved results about leave-one-out estimators for SVMs. These results also lead to a new group of performance estimators for SVMs, called ξα-estimators, and to an improved algorithm for computing the leave-one-out error of an SVM. While all results mentioned so far were for the inductive setting, this book also introduces the idea of transduction to text classification. It shows how and why exploiting the location of the test points during learning can improve predictive performance. To make the SVM approach to learning text classifiers applicable in practice, this book also derives new algorithms for training SVMs. For both the inductive and the transductive setting, these algorithms substantially extend the scalability of SVMs to large-scale problems.
While the work presented in this book is driven by the application, the techniques it develops are not limited to text classification, but can be applied in other contexts as well. In addition, not only can individual techniques be transferred to other tasks, this work as a whole can be useful as a case study for how to approach high-dimensional learning tasks.
Text Classification
Support Vector Machines
Part I TheoryA Statistical Learning Model of Text Classification for SVMs
Efficient Performance Estimators for SVMs
Part II MethodsInductive Text Classification
Transductive Text Classification
Part III AlgorithmsTraining Inductive Support Vector Machines
Training Transductive Support Vector Machines