Dissertation. Université Paris VI - Pierre et Marie Curie, 2010. 178 p.
In this thesis, we propose solutions to reduce the training time and memory requirements of learning algorithms while maintaining strong accuracy. In particular, among machine learning models, we focus on Support Vector Machines (SVMs), standard methods widely used for automatic classification; we describe them extensively in Chapter 2. Throughout this dissertation, we propose several original algorithms for learning SVMs, depending on the task they are intended for. First, in Chapter 3, we study the learning process of Stochastic Gradient Descent (SGD) in the particular case of linear SVMs. This leads us to define and validate the new SGD-QN algorithm, as illustrated by the sketch below.
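To make the starting point of Chapter 3 concrete, here is a minimal sketch of plain stochastic gradient descent on the primal linear SVM objective (L2 regularization plus hinge loss), with the classic 1/(lambda*t) step size; SGD-QN improves on this baseline by replacing the scalar step size with a diagonal rescaling estimated during training. The function name sgd_linear_svm and the hyperparameter values are illustrative assumptions, not the thesis's implementation.

    # Sketch only: plain SGD on lambda/2 ||w||^2 + mean_i max(0, 1 - y_i <w, x_i>).
    # This is the baseline that SGD-QN builds on, not SGD-QN itself.
    import numpy as np

    def sgd_linear_svm(X, y, lam=1e-4, epochs=5, seed=0):
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w, t = np.zeros(d), 1
        for _ in range(epochs):
            for i in rng.permutation(n):
                eta = 1.0 / (lam * t)      # decreasing step size 1/(lambda*t)
                margin = y[i] * X[i].dot(w)
                w *= 1.0 - eta * lam       # gradient step on the regularizer
                if margin < 1.0:           # hinge loss is active
                    w += eta * y[i] * X[i]
                t += 1
        return w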
We then introduce a new learning principle, the Process/Reprocess strategy, and present three algorithms implementing it. The Huller and LaSVM, discussed in Chapter 4, are designed for training SVMs for binary classification. For the more complex task of structured output prediction, we substantially refine LaSVM; this results in the LaRank algorithm, detailed in Chapter 5.
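The following schematic sketch conveys the Process/Reprocess principle: PROCESS inserts a fresh example into the kernel expansion and optimizes its coefficient, while REPROCESS revisits a stored example and discards it if its coefficient returns to zero. It uses a simple dual coordinate ascent step on a kernel SVM without bias term; the actual Huller, LaSVM and LaRank algorithms rely on more refined SMO-style direction searches. All names here (ProcessReprocessSVM, rbf, C) are illustrative assumptions, not the thesis's code.

    # Schematic Process/Reprocess loop, not the Huller/LaSVM/LaRank updates.
    import numpy as np

    def rbf(a, b, gamma=0.5):
        return np.exp(-gamma * np.sum((a - b) ** 2))

    class ProcessReprocessSVM:
        def __init__(self, C=1.0, kernel=rbf):
            self.C, self.kernel = C, kernel
            self.sv_x, self.sv_y, self.alpha = [], [], []

        def decision(self, x):
            return sum(a * y * self.kernel(sx, x)
                       for a, y, sx in zip(self.alpha, self.sv_y, self.sv_x))

        def _cd_step(self, i):
            # One coordinate ascent step on dual variable alpha_i, clipped to [0, C].
            g = 1.0 - self.sv_y[i] * self.decision(self.sv_x[i])
            k_ii = self.kernel(self.sv_x[i], self.sv_x[i])
            self.alpha[i] = min(self.C, max(0.0, self.alpha[i] + g / k_ii))

        def process(self, x, y):
            # PROCESS: insert the fresh example and optimize its coefficient.
            self.sv_x.append(x); self.sv_y.append(y); self.alpha.append(0.0)
            self._cd_step(len(self.alpha) - 1)
            if self.alpha[-1] == 0.0:   # keep only examples that enter the expansion
                del self.sv_x[-1], self.sv_y[-1], self.alpha[-1]

        def reprocess(self, rng):
            # REPROCESS: revisit a stored example; drop it if it leaves the expansion.
            if not self.alpha:
                return
            i = rng.integers(len(self.alpha))
            self._cd_step(i)
            if self.alpha[i] == 0.0:
                del self.sv_x[i], self.sv_y[i], self.alpha[i]

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2)); y = np.sign(X[:, 0] + X[:, 1])
    model = ProcessReprocessSVM()
    for xi, yi in zip(X, y):
        model.process(xi, yi)   # one fresh example...
        model.reprocess(rng)    # ...then revisit an old one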
Finally, Chapter 6 introduces the original framework of learning under ambiguous supervision, which we apply to the task of semantic parsing of natural language. Each algorithm introduced in this thesis achieves state-of-the-art performance, especially in terms of training speed. Almost all of them have been published in international peer-reviewed journals or conference proceedings, and the corresponding implementations have been released. We keep the description of our methods as generic as possible in order to ease the design of further derivations; many directions can indeed be followed to extend the work presented in this dissertation, and we list some of them in Chapter 7.