Setia L. Machine learning strategies for content based image retrieval

University of Freiburg, 2008. — 142 p.
The ever-increasing amount of digital information has created a need for effective information retrieval systems. As the saying goes, information that cannot be found easily is as good as lost. Because information comes in various formats and types, its retrieval mechanisms need to differ correspondingly. In this work, we deal with the task of content-based image retrieval, in which the system facilitates the interaction between a user and an image database by automatically analysing the image content.
We know about the ambiguities that can exist even in the simplest of phrases in a natural language. As an example, a sentence such as "Flying planes can be dangerous" can mean either "Flying planes are dangerous" or "Flying planes is dangerous". Images are no different. In fact, the old saying "A picture is worth a thousand words" is as true here as it is anywhere. In this work, we take the view that these ambiguities are natural, and thus the system should not commit to one viewpoint or the other from the very beginning. The earliest image retrieval systems allowed the user to specify his or her viewpoint by giving access to the system's internal parameters, which can be complicated or tiring for the user. A modern system, on the other hand, seeks to learn this viewpoint while imposing fewer responsibilities on the user. This can be achieved using relevance feedback, in which the user progressively gives the system more and more information in return for better results. Relevance feedback can be short-term, in which the data collected is discarded as soon as the session is over, or long-term, in which the data is collected over multiple sessions of one user, or even over multiple users. In this work, however, we restrict ourselves to short-term relevance feedback, as in our view the ambiguities or multiple interpretations present in an image cannot be handled otherwise.
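To make the idea of short-term relevance feedback concrete, the following is a minimal sketch of one feedback round. It uses a Rocchio-style query update, a classic technique chosen here purely for illustration; the abstract does not say which learning method the thesis uses. The feature vectors, image names, and weights (alpha, beta, gamma) are all hypothetical, and nothing is stored once the session ends.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors, dim):
    """Component-wise mean; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def rocchio_update(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Move the query towards images marked relevant and away from the others.

    The weights are illustrative defaults, not values from the thesis.
    """
    dim = len(query)
    pos = mean(relevant, dim)
    neg = mean(irrelevant, dim)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]

def rank(query, database):
    """Return image names sorted by decreasing similarity to the query."""
    return sorted(database, key=lambda name: cosine(query, database[name]), reverse=True)

# Hypothetical three-dimensional features for a toy image database.
database = {
    "beach": [0.9, 0.1, 0.0],
    "forest": [0.1, 0.9, 0.1],
    "sunset": [0.7, 0.0, 0.6],
    "meadow": [0.2, 0.8, 0.0],
}

# An ambiguous starting query: it sits between the two groups of images.
query = [0.5, 0.5, 0.1]
before = rank(query, database)

# One feedback round: the user marks "forest" as relevant and "beach" as not.
# The updated query lives only in this session (short-term feedback).
query2 = rocchio_update(query, [database["forest"]], [database["beach"]])
after = rank(query2, database)

print("before feedback:", before)
print("after feedback: ", after)
```

In this toy setting, a single round of feedback resolves the ambiguity of the starting query towards the interpretation the user indicated, without the user touching any system parameter.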
The later part of the thesis delves into image search as an offshoot of traditional text-based search engines. To this end, we explore the possibility of annotating an image database using keywords. One advantage of this approach is that the user does not need to provide a suitable starting image for the query. An equally important advantage is that the annotation process can be carried out offline for the whole database, unlike relevance feedback, which must be carried out in real time. Apart from annotation, we show that further data mining operations can also be carried out on image databases, which can help improve the effectiveness of the image search engine. We conclude by demonstrating various algorithms on a real-life medical image database, on which very competitive results were achieved in recent international benchmarks.