Hexastore: sextuple indexing for semantic web data management
Source
Proceedings of the VLDB Endowment
Volume 1, Issue 1 (August 2008)
SESSION: Indexing and query processing
Pages 1008-1019
Year of Publication: 2008
ISSN:2150-8097
Authors
Cathrin Weiss  University of Zurich, Zurich, Switzerland
Panagiotis Karras  National University of Singapore, Singapore
Abraham Bernstein  University of Zurich, Zurich, Switzerland
DOI: http://doi.acm.org/10.1145/1453856.1453965

ABSTRACT

Despite the intense interest in realizing the Semantic Web vision, most existing RDF data management schemes are constrained in terms of efficiency and scalability. Still, the growing popularity of the RDF format arguably calls for an effort to offset these drawbacks. Viewed from a relational-database perspective, these constraints derive from the very nature of the RDF data model, which is based on a triple format. Recent research has attempted to address these constraints using a vertical-partitioning approach, in which a separate two-column table is constructed for each property. However, as we show, this approach suffers from similar scalability drawbacks on queries that are not bound by RDF property value. In this paper, we propose an RDF storage scheme that uses the triple nature of RDF as an asset. This scheme enhances the vertical-partitioning idea and takes it to its logical conclusion. RDF data is indexed in six possible ways, one for each possible ordering of the three RDF elements (subject, property, and object). Each instance of an RDF element is associated with two vectors; each such vector gathers elements of one of the other types, along with lists of the third-type resources attached to each vector element. Hence, a sextuple-indexing scheme emerges. This format allows for quick and scalable general-purpose query processing; it confers significant advantages (up to five orders of magnitude) compared to previous approaches for RDF data management, at the price of a worst-case five-fold increase in index space. We experimentally document the advantages of our approach on real-world and synthetic data sets with practical queries.
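
The indexing idea described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: the class name SextupleIndex, the methods add and match, and the use of nested dictionaries and sets are all assumptions made for illustration. The sketch keeps one nested map per ordering of subject (s), property (p), and object (o), and answers a triple pattern by choosing the ordering whose leading elements are the bound ones. It makes no attempt to share storage between orderings, so it says nothing about the space bound quoted above.

```python
from collections import defaultdict
from itertools import permutations

# Position of each RDF element inside a (subject, property, object) triple.
POSITIONS = {"s": 0, "p": 1, "o": 2}


class SextupleIndex:
    """Toy index (hypothetical name): one nested map per ordering of s, p, o."""

    def __init__(self):
        # Six orderings: spo, sop, pso, pos, osp, ops.
        self.indexes = {
            "".join(order): defaultdict(lambda: defaultdict(set))
            for order in permutations("spo")
        }

    def add(self, s, p, o):
        """Insert one triple into all six orderings."""
        triple = (s, p, o)
        for order, index in self.indexes.items():
            first, second, third = (triple[POSITIONS[c]] for c in order)
            index[first][second].add(third)

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        pattern = {"s": s, "p": p, "o": o}
        bound = [c for c in "spo" if pattern[c] is not None]
        free = [c for c in "spo" if pattern[c] is None]
        order = "".join(bound + free)  # bound elements lead the ordering
        index = self.indexes[order]

        results = []
        firsts = [pattern[order[0]]] if order[0] in bound else list(index)
        for a in firsts:
            if a not in index:
                continue
            level2 = index[a]
            seconds = [pattern[order[1]]] if order[1] in bound else list(level2)
            for b in seconds:
                if b not in level2:
                    continue
                thirds = level2[b]
                if order[2] in bound:
                    wanted = pattern[order[2]]
                    thirds = [wanted] if wanted in thirds else []
                for c in thirds:
                    values = dict(zip(order, (a, b, c)))
                    results.append((values["s"], values["p"], values["o"]))
        return sorted(results)


if __name__ == "__main__":
    idx = SextupleIndex()
    idx.add("alice", "knows", "bob")
    idx.add("alice", "likes", "carol")
    idx.add("dave", "knows", "bob")
    # Object-bound query with no property given: served by the "osp" ordering.
    print(idx.match(o="bob"))
```

A query such as match(o="bob") is the kind that is not bound by property value: in this sketch it is answered from a single ordering that starts with the object, rather than by scanning one table per property.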


