Flexible and efficient querying and ranking on hyperlinked data sources
Source
Extending Database Technology; Vol. 360
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Saint Petersburg, Russia
SESSION: Research sessions: Heterogeneous & distributed
Pages 553-564
Year of Publication: 2009
ISBN: 978-1-60558-422-5
Authors
Ramakrishna Varadarajan, Florida International University, Miami, FL
Vagelis Hristidis, Florida International University, Miami, FL
Louiqa Raschid, University of Maryland, College Park, MD
Maria-Esther Vidal, Universidad Simón Bolívar, Caracas, Venezuela
Luis Ibáñez, Universidad Simón Bolívar, Caracas, Venezuela
Héctor Rodríguez-Drumond, Universidad Simón Bolívar, Caracas, Venezuela
ABSTRACT
There has been an explosion of hyperlinked data in many domains, e.g., the biological Web. Expressive query languages and effective ranking techniques are required to convert this data into browsable knowledge. We propose the Graph Information Discovery (GID) framework to support sophisticated user queries on a rich web of annotated and hyperlinked data entries, where query answers need to be ranked according to customized ranking criteria, e.g., PageRank or ObjectRank. GID has a data model that includes a schema graph and a data graph, and an intuitive query interface. The GID framework allows users to easily formulate queries consisting of sequences of hard filters (selection predicates) and soft filters (ranking criteria); it can also be combined with other specialized graph query languages to enhance their ranking capabilities. GID queries have well-defined semantics and are implemented by a set of physical operators, each of which produces a ranked result graph. We discuss rewriting opportunities that enable efficient evaluation of GID queries. Soft filters are a key feature of GID and are implemented using authority flow ranking techniques; these rankings are query-dependent and expensive to compute at runtime. We present approximate optimization techniques for GID soft filter queries that build on the properties of random walks and use novel path-length-bound and graph-sampling approximation techniques. We experimentally validate our optimization techniques on large biological and bibliographic datasets. Our techniques can produce high-quality top-K answers while reducing evaluation time by up to an order of magnitude compared with the exact solution.
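To make the distinction between hard and soft filters concrete, the following is a minimal Python sketch. It is not the GID implementation. It assumes a hard filter is a plain selection predicate over node attributes, and it approximates a soft filter with a personalized PageRank-style power iteration whose restart set is the output of the hard filter. The function names (`hard_filter`, `soft_filter`), the `max_path_len` parameter, and the toy gene/protein/publication graph are all hypothetical. Stopping the iteration after `max_path_len` steps is only one simple reading of a path-length bound, not the paper's algorithm.

```python
from collections import defaultdict


def hard_filter(nodes, predicate):
    """Hard filter: keep the nodes whose attributes satisfy a selection predicate."""
    return {n for n, attrs in nodes.items() if predicate(attrs)}


def soft_filter(edges, nodes, base_set, damping=0.85, max_path_len=None, tol=1e-9):
    """Soft filter (illustrative): rank nodes by authority flowing from base_set.

    Runs a personalized PageRank-style power iteration. If max_path_len is
    given, the iteration stops after that many steps, so only authority
    carried along paths of at most that length is taken into account.
    Assumes every edge endpoint appears in `nodes`.
    """
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)

    # The random surfer restarts uniformly at the nodes in the base set.
    restart = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(restart)
    steps = 0
    while max_path_len is None or steps < max_path_len:
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for src, targets in out.items():
            share = damping * score[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        delta = sum(abs(nxt[n] - score[n]) for n in nodes)
        score = nxt
        steps += 1
        if delta < tol:  # converged before reaching the step bound
            break
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data graph: genes link to proteins, proteins to a publication.
nodes = {
    "g1": {"type": "gene"},
    "g2": {"type": "gene"},
    "p1": {"type": "protein"},
    "p2": {"type": "protein"},
    "pub1": {"type": "publication"},
}
edges = [
    ("g1", "p1"), ("g2", "p1"), ("g2", "p2"),
    ("p1", "pub1"), ("p2", "pub1"), ("pub1", "g1"),
]

base = hard_filter(nodes, lambda attrs: attrs["type"] == "gene")
exact = soft_filter(edges, nodes, base)                   # iterate to convergence
approx = soft_filter(edges, nodes, base, max_path_len=3)  # bounded number of steps
```

In this sketch the bounded run trades accuracy for fewer propagation steps. Whether the resulting top-K list matches the converged ranking depends on the graph and on the bound chosen.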