Springer, 2011, 263 p.
Searching for information is perhaps the most important application of today’s computing systems. In the new century, all the world’s citizens have become accustomed to thinking of the Web as the source for answering their information needs, and of search engines as their Web interface. Websites reporting a movie’s plot, tomorrow’s weather at our next destination, the risks of a surgical procedure, the fastest route to a friend’s house, or the video of the last opera at La Scala can all be found as the result of a keyword search. If any Web page in the world stores the answer to our information need, then we expect the search engine to link to that page and describe it through a snippet appearing on the first page of the search results.
A few search engine companies are able to meet such expectations, and completely cover the search engine market. However, offering a link to a Web page does not cover all information needs. Many problems cannot be solved by simple keyword-based queries. The notion of the best page for solving a given problem is typically inadequate when the problem requires solutions spanning multiple pages. We are indeed accustomed to using a variety of Web resources to solve our problems: while the search engine hints at useful information, the user’s brain is the fundamental platform for information integration.
Problems such as who is the best doctor to cure insomnia in a nearby hospital can be solved by using the Web multiple times, searching for partial results. Once a hospital’s website is located and the listing of its doctors is extracted, one can find out that there is one doctor in the list who has published recent papers on insomnia. While doing so, the user is performing information integration in her brain; specifically, she is applying ranking while extracting hospitals based on proximity and doctors based on their publications on insomnia, then matching on the basis of doctor names. Of course, the best way to build the matching is to use the search engine itself, by entering doctors’ names as keywords, extracted from either the hospital search or the literature search, but then the search is less focused and result interpretation is more difficult.
Complex queries are supported in certain domains, such as travel, hotel booking, and book purchasing, by specialized, domain-specific search systems or search engine integrators. It is instructive to look at how travel assistants solve the problem: they offer a few predefined queries to build the itinerary, then offer additional services (e.g., car rentals, hotels, local events, insurance) so as to complete the plan around the itinerary. Thus, they perform specialized steps of integration in place of the user’s brain, and then let the user enrich the solution incrementally and interactively, with customized interfaces. In other words, they solve complex queries in the context of given domains, which are supported by a substantial business; such specialized search systems dominate over general-purpose ones in their domain of expertise, and therefore attract users.
The search computing project (SeCo), funded by the European Research Council as an advanced IDEAS grant, aims at building concepts, algorithms, tools, and technologies to support complex Web queries. The project is now entering the third year of its five-year lifespan (November 2008 – November 2013); it proposes a new paradigm for solving complex queries based on combining data extraction from distinct sources and data integration by means of specialized integration engines. Data extraction retrieves data from different sources, ordered based on local rankings, and data integration merges such results into result combinations, with an associated global ranking, such that combinations with the highest ranking are produced as fast as possible; a result combination represents the solution of a complex search problem. Thus, the search computing project has the ambitious goal of lowering the technological barrier to building complex search applications, thereby enabling the development of many new applications which will cover relevant search needs.
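As a rough illustration of the paradigm just described, the following Python sketch joins two locally ranked result lists on a shared key and returns the combinations with the highest global score, using the doctor example from earlier in the text. It is only a minimal sketch under stated assumptions: the function name `top_k_combinations`, the sample data, the scores, and the weighted-sum global ranking are all hypothetical and are not taken from the SeCo project. It also joins the lists naively in full, whereas the rank join methods discussed in the book aim to produce the top combinations without reading all of the input.

```python
import heapq


def top_k_combinations(by_proximity, by_publications, k, weight=0.5):
    """Naive top-k join of two ranked lists (illustrative only).

    by_proximity:    list of (doctor, hospital, proximity_score)
    by_publications: list of (doctor, publication_score)
    The global score is an assumed weighted sum of the two local scores.
    """
    # Index the second ranked list by the join key (the doctor's name).
    publication_score = {name: score for name, score in by_publications}

    combinations = []
    for name, hospital, proximity in by_proximity:
        # Keep only doctors that appear in both result lists.
        if name in publication_score:
            global_score = (weight * proximity
                            + (1 - weight) * publication_score[name])
            combinations.append((global_score, name, hospital))

    # Return the k combinations with the highest global score.
    return heapq.nlargest(k, combinations)


# Hypothetical results from two search services, each in its local ranking.
by_proximity = [
    ("Rossi", "Hospital A", 0.9),
    ("Bianchi", "Hospital A", 0.9),
    ("Verdi", "Hospital B", 0.6),
]
by_publications = [("Verdi", 0.8), ("Rossi", 0.4), ("Neri", 0.3)]

best = top_k_combinations(by_proximity, by_publications, k=1)
print(best[0][1], "at", best[0][2])
```

With these made-up scores, the doctor who ranks first in neither list on its own can still form the best combination, which is why a global ranking over combinations is needed rather than the best single result from each source.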
Search computing covers many research directions, which are all required in order to provide an overall solution to a complex search. The core of the project is the technology for search service integration, which requires both theoretical investigation and engineering of efficient technological solutions. The core theory concerns the development of result integration methods that address not only top-k optimality, but also the need to deal with proximity, approximation, and uncertainty. Such a theory is supported by an open, extensible, and scalable architecture for computing queries over data services, designed so as to incorporate the project’s results by adding new operations, by encoding new join methods, and by injecting new features dealing with incremental evaluation and adaptivity.
A number of further research dimensions complement this core. Formulating a complex query and browsing over solutions is a complex cognitive task, whose intrinsic difficulty has to be lowered as much as possible so as to meet usability requirements. Therefore, we are investing considerable effort in the development of user-friendly interfaces which are targeted at assisting users in expressing their needs and then browsing the results. Solving a complex problem requires supporting users in the interactive and incremental design of their queries, thereby assisting search as a long-term process for exploring the solution space; result differences can be better appreciated by visualizing results (e.g., through maps or timelines). The project’s success also depends on the ability to register new sources and make them available for solving complex problems; therefore, we have designed abstractions, architectural solutions, and model-driven design tools for service registration and for application development, aiming at assisting service publishing, application design, and query execution tuning. While the current description of Web resources is very simple, so as to enable an equally simple description of Web interactions, we aim at linking the service description to ontological sources, so as to enable high-level expressive interfaces covering the gap from high-level interactions to query expression.
While focusing on technological dimensions, we are also investigating aspects crucial to the project’s success, such as business models and user involvement in the design process through user-centered design. We are additionally investigating the use of search computing for scientific applications, such as supporting bioinformatics research by enabling access to genetic and proteomic data sources.
This book reports the proceedings of the workshop New Trends in Search Computing, held in Como and Milan during May 25–31, 2010, as the follow-up of the workshop Search Computing Challenges and Directions, also published by Springer in 2010 (LNCS 5950).
Part 1: The Search Process
The New Frontier of Web Search Technology: Seven Challenges
Information Exploration in Search Computing
Trends in Search Interaction
Part 2: Interaction Design
Context and Action in Search Interfaces
Desktop, Tabletop or Mobile?
Visualization of Multi-domain Ranked Data
Part 3: Semantic Description
Semantic Resource Framework
Automatic Normalization and Annotation for Discovering Semantic Mappings
Towards an Ontological Representation of Services in Search Computing
Part 4: Rank Join
The Rank Join Problem
Proximity Rank Join in Search Computing
Uncertainty in Rank Join
Trends in Rank Join
Efficient Computation of Search Computing Queries
Run-Time Adaptivity for Search Computing
Part 6: Tools and Mashups
Tools Supporting Search Computing Application Development
Distributed User Interface Orchestration: On the Composition of Multi-User (Search) Applications
On Development Practices for End Users
Part 7: Bio-SeCo
Bio-SeCo: Integration and Global Ranking of Biomedical Search Results
Workflows for Information Integration in the Life Sciences
Complex Search, Ranks, and Biological Discovery: A User’s Perspective
Part 8: Towards a Sustainable Exploitation
An Experience in Applying User Centered Design to Search Computing
Analysis of Business Models for Search Computing