Browsing Posts tagged Linked Data

In the model proposed by Kuhlthau in 1991, the following phases are envisioned in the Information Seeking Process:

  • Initialization marks the introduction of a problem, including its definition and, based on this, the provision of suggested solutions drawn from previously executed processes.
  • During the second phase, Selection, users identify the general area for investigation through an overview of all available topics, together with automatic topic suggestions.
  • The third stage, Exploration, is known as the most difficult of the entire process, as it involves exploring the general topic(s) in order to extend understanding and relate it to what is already known. It requires a graphical representation of information that can be understood by the user, interaction possibilities that are self-explanatory and easy to use, access to details on demand, sorting and paging techniques to handle large datasets, functionalities capable of showing information at different levels of detail, grouping and clustering operations to highlight a specific data dimension, and suggestions on how to expand the current set of topics with related ones.
  • The fourth stage, Formulation, is the phase in which a focused perspective on the topic emerges, by selecting hypotheses through filtering or information re-shaping. It requires the interactive and intuitive formulation and change of filters, the combination of different filters, the traceability of the effects caused by each of them, and the possibility of changing the focus (pivoting).
  • The fifth stage, Collection, is used to gather information and requires easy mechanisms to select interesting findings and to export the selected information for further use in other systems.
  • The final stage, Presentation, is devoted to presenting the collected information, by providing different options for visualizing the results.

In our recent study we evaluated the exploration tools based on the coverage of these phases.

We classified the tools into four categories:

Linked Data Search Engines are devoted to crawling and indexing all Linked Data published on the Web, implementing effective ranking algorithms, and exposing search services to machines and humans, but with no specific focus on supporting information seeking processes. The first two Linked Data search engines to appear were Swoogle and Watson. Later attempts differ by extending or improving one of the three basic features above. Sindice is a remarkable example of a modern LD search engine, as it focuses on scaling to very large quantities of data and on supporting simple reasoning mechanisms for inverse-functional properties.
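To make the idea of reasoning over inverse-functional properties more concrete, here is a minimal sketch in Python. It is not how Sindice (or any other engine) implements it: the triples, the `ex:` identifiers and the helper `same_as_groups` are hypothetical, and we simply assume that `foaf:mbox` is treated as inverse-functional, so that two subjects sharing the same mailbox are considered the same resource.

```python
from collections import defaultdict

# Hypothetical sample triples (subject, predicate, object); not from a real dataset.
TRIPLES = [
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:a_smith", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:bob", "foaf:mbox", "mailto:bob@example.org"),
    ("ex:a_smith", "foaf:name", "Alice Smith"),
]

# Properties assumed to be inverse-functional for this sketch.
INVERSE_FUNCTIONAL = {"foaf:mbox"}


def same_as_groups(triples, ifps):
    """Group subjects that share a value for an inverse-functional property."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_subject = {}  # (property, value) -> first subject seen with it
    for subject, predicate, obj in triples:
        find(subject)  # register every subject, even if it is never merged
        if predicate in ifps:
            key = (predicate, obj)
            if key in first_subject:
                union(first_subject[key], subject)
            else:
                first_subject[key] = subject

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


GROUPS = same_as_groups(TRIPLES, INVERSE_FUNCTIONAL)
print(GROUPS)
```

In this toy dataset `ex:alice` and `ex:a_smith` end up in the same group because they share a mailbox, while `ex:bob` stays on his own.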

Linked Data Browsers were first demonstrated in Tabulator. Using outline and table modes, Tabulator provides a way to browse RDF data published on the Web. Disco is a lighter version of Tabulator, meant for debugging sites that publish LD. Another notable example of an LD browser is Marble, which aims instead to format linked data content for XHTML clients using Fresnel lenses and formats, including colors that convey provenance information. The first industrial result is OpenLink Data Explorer, which allows Web users to explore the linked data that may underlie a Web page. LD browsers are naturally inclined to triple exploration, with no support for search, filtering, data re-shaping or alternative visualizations; the only exception in the group is Tabulator, which enables custom exploration thanks to its collapsible hierarchical visualization.

Visual Interfaces for Linked Data Repositories allow users to navigate Linked Data repositories, both by means of a SPARQL endpoint and through a rich visual interface. This aspect is relevant since many academic and industrial research labs claim to be able to keep a repository containing all (or part of) the datasets published as Linked Data updated in real time, by mirroring changes from the original data sources. RDK Explorer, Sig.ma, Haystack and Uberblic are a few popular examples of these visual interfaces for Linked Data Repositories.
RDK Explorer offers a panel-based interface to large heterogeneous sets of information about people, publications, research topics and projects. The interface displays detailed information regarding the current resource and a graph that shows the context of related resources, through which the user can change the current resource. VisiNav is a system based on a visual query construction paradigm that exploits six atomic query operations such as keyword search, facet selection, path traversal, etc. Sig.ma offers an interface where the displayed links are collected from multiple Linked Data sources and merged. A key distinguishing feature of Sig.ma is the incremental display of data while relevant sources are discovered, thus enhancing the user experience; users can also highlight data provenance and favor (or discard) given data sources. Haystack aggregates Linked Data from multiple arbitrary locations and presents it to the user in a human-readable fashion, with point-and-click semantics that let the user navigate from one piece of data to another. Display is controlled by presentation recommendations, i.e., a kind of stylesheet that can be used to obtain multiple views (e.g., as thumbnails, Web pages, or taxonomies).
Uberblic is an industrial research result that provides an integration service tying all that Linked Data together into a more coherent experience. Other solutions, like Microsoft Pivot, have been adapted to Linked Data browsing too, thus letting the user explore results by zooming, panning, or pivoting. Our proposal is located in the same application space as the above-mentioned tools, sharing several key features such as native support for incremental exploration, data visualization, highlighting of data relationships, and navigation. However, only RDK Explorer provides support for the initialization and selection steps, whereas only Pivot and VisiNav support pivoting.

Facet-based systems also share several characteristics with our approach, as they try to build semantically unique search queries by enabling faceted search through navigation of facets and results. In Facet Graphs, facets and result set are represented as nodes in a graph visualization. The semantic relations that exist between facets and the result set, as well as between facets and other facets, are represented by labeled directed edges between the nodes. Other tools like mSpace, Humboldt and Parallax also allow for hierarchical filtering. Parallax also provides support for expansion with related topics, where the available relationships are the ones pre-defined in the underlying collection.
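As a rough illustration of hierarchical filtering over facets connected by labeled directed edges, consider the following Python sketch. It is only a toy model under our own assumptions, not the implementation of Facet Graphs, mSpace, Humboldt or Parallax: the triples, the edge labels `playsFor` and `locatedIn`, and the helper `filter_results` are all hypothetical.

```python
# Hypothetical data: labeled directed edges (source, label, target).
TRIPLES = [
    ("ex:player1", "playsFor", "ex:clubA"),
    ("ex:player2", "playsFor", "ex:clubB"),
    ("ex:player3", "playsFor", "ex:clubA"),
    ("ex:clubA", "locatedIn", "ex:city1"),
    ("ex:clubB", "locatedIn", "ex:city2"),
]


def filter_results(triples, results, path, selected):
    """Keep the results from which following `path` reaches a selected facet value.

    `path` is the sequence of edge labels leading from the result set to the
    facet being filtered (a facet of a facet when it has more than one label).
    """
    kept = []
    for item in results:
        frontier = {item}
        for label in path:
            # Follow every edge with this label that starts in the frontier.
            frontier = {o for s, p, o in triples if p == label and s in frontier}
        if frontier & selected:
            kept.append(item)
    return kept


PLAYERS = ["ex:player1", "ex:player2", "ex:player3"]

# Filter players by the city of their club (a two-step, hierarchical facet).
IN_CITY1 = filter_results(TRIPLES, PLAYERS, ["playsFor", "locatedIn"], {"ex:city1"})
print(IN_CITY1)
```

Selecting a value on the "city" facet narrows the set of clubs, which in turn narrows the set of players: this is the kind of chained restriction that hierarchical filtering refers to.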

Comparison.

Linked data exploration tools. Coverage of ISP phases

As the table above shows, the Liquid Query approach is the one that currently provides the widest coverage of the information seeking process requirements. Unlike all the analyzed solutions, our approach is not aimed at producing one general user interface: thanks to application configurations, we provide a methodology to configure ad-hoc vertical solutions for navigating data within specific domains. Our join-based approach saves the user several exploratory link navigations between concepts, and our tunable global ranking function provides a customizable ranking of combinations of objects. Furthermore, in our work exploration is not confined to data aggregated in one repository but, thanks to value-based joins, can span linked data and arbitrary data sources wrapped as Web services. Solution and topic suggestions are not currently covered, but they can be obtained by mining user behavior (studies on these aspects are part of our future agenda). Traceability and exporting facilities are not currently implemented in our prototypes, but the approach is ready to support them and we plan to complete their implementation in the near future. The online demo covers the stages of Exploration, Selection, Collection, and Presentation of results, while Initialization and Selection are covered by the configuration of the application, which is performed through specifically devised design tools.
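The combination of value-based joins and a tunable global ranking mentioned above can be illustrated with a small Python sketch. This is not the Liquid Query implementation: the two result lists, the `score` fields, the helper `join_and_rank` and the choice of a weighted sum as global ranking function are all assumptions made for the sake of the example.

```python
# Hypothetical results coming from two independent sources, each with a local score.
CONCERTS = [
    {"name": "Concert A", "city": "Milan", "score": 0.9},
    {"name": "Concert B", "city": "Rome", "score": 0.8},
]
RESTAURANTS = [
    {"name": "Restaurant X", "city": "Milan", "score": 0.5},
    {"name": "Restaurant Y", "city": "Milan", "score": 0.7},
    {"name": "Restaurant Z", "city": "Rome", "score": 0.6},
]


def join_and_rank(left, right, key, weights):
    """Join two result lists on a shared value and rank the combinations.

    The global score is assumed to be a weighted sum of the local scores;
    `weights` is the tunable part, e.g. (0.5, 0.5) or (0.0, 1.0).
    """
    w_left, w_right = weights
    combos = []
    for a in left:
        for b in right:
            if a[key] == b[key]:  # value-based join condition
                score = w_left * a["score"] + w_right * b["score"]
                combos.append((round(score, 3), a["name"], b["name"]))
    # Highest global score first.
    return sorted(combos, key=lambda combo: combo[0], reverse=True)


BALANCED = join_and_rank(CONCERTS, RESTAURANTS, "city", (0.5, 0.5))
FOOD_FIRST = join_and_rank(CONCERTS, RESTAURANTS, "city", (0.0, 1.0))
for combo in BALANCED:
    print(combo)
```

Changing the weights changes the order of the combinations without re-running the join, which is the intuition behind a customizable ranking of combinations of objects.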

After presenting our Liquid Query paper at 2010, we took some time to analyze the overall scientific and technical program, spotting some papers that relate closely to some of the research problems addressed by .

Georgi Kobilarov, the CEO of a German startup called Uberblic, issued an open challenge on his blog, asking: if we had a Web of Data, what would you build?

Here’s Georgi’s idea:

Here’s my idea: If we had a Web of Data, I would built an application for painless travel planning. It would integrate flight plans, train timetables, bus routes, car rental offers, etc. And the user would be able to just say: I want to go from A to B: Find me the best/cheapest/fastest routes.
