ABSTRACT
Despite the intense interest in realizing the Semantic Web vision, most existing RDF data management schemes are constrained in terms of efficiency and scalability. Still, the growing popularity of the RDF format arguably calls for an effort to offset these drawbacks. Viewed from a relational-database perspective, these constraints derive from the very nature of the RDF data model, which is based on a triple format. Recent research has attempted to address these constraints using a vertical-partitioning approach, in which a separate two-column table is constructed for each property. However, as we show, this approach suffers from similar scalability drawbacks on queries that are not bound by RDF property value. In this paper, we propose an RDF storage scheme that uses the triple nature of RDF as an asset. This scheme enhances the vertical-partitioning idea and takes it to its logical conclusion. RDF data is indexed in six possible ways, one for each possible ordering of the three RDF elements (subject, property, and object). Each instance of an RDF element is associated with two vectors; each such vector gathers elements of one of the other two types, along with lists of the third-type resources attached to each vector element. Hence, a sextuple-indexing scheme emerges. This format allows for quick and scalable general-purpose query processing; it confers significant advantages (up to five orders of magnitude) compared to previous approaches for RDF data management, at the price of a worst-case five-fold increase in index space. We experimentally document the advantages of our approach on real-world and synthetic data sets with practical queries.
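The sextuple-indexing idea described in the abstract can be illustrated with a minimal in-memory sketch. This is not the paper's implementation: the class name `SextupleIndex`, its methods, and the sample triples are hypothetical, and the toy version stores each of the six orderings separately, so it makes no attempt to share storage between orderings and says nothing about the five-fold space bound.

```python
from collections import defaultdict
from itertools import permutations


class SextupleIndex:
    """Toy index over all six orderings of (subject, property, object).

    Hypothetical illustration only; names and layout are assumptions.
    """

    # The six orderings: spo, sop, pso, pos, osp, ops.
    ORDERINGS = ["".join(p) for p in permutations("spo")]

    def __init__(self):
        # index[ordering][first][second] -> set of third-type elements.
        # Each first-level key maps to a "vector" of second-type elements,
        # and each vector entry carries the list of third-type elements.
        self.index = {
            o: defaultdict(lambda: defaultdict(set)) for o in self.ORDERINGS
        }

    def add(self, s, p, o):
        """Insert one triple into every ordering."""
        triple = {"s": s, "p": p, "o": o}
        for ordering in self.ORDERINGS:
            a, b, c = (triple[k] for k in ordering)
            self.index[ordering][a][b].add(c)

    def match(self, s=None, p=None, o=None):
        """Return the set of (s, p, o) triples matching a pattern.

        None acts as a wildcard. The ordering is chosen so that bound
        positions come first, which lets the lookup descend directly.
        """
        pattern = {"s": s, "p": p, "o": o}
        bound = [k for k in "spo" if pattern[k] is not None]
        free = [k for k in "spo" if pattern[k] is None]
        k1, k2, k3 = bound + free
        top = self.index[k1 + k2 + k3]

        results = set()
        firsts = [pattern[k1]] if k1 in bound else list(top)
        for a in firsts:
            vector = top.get(a)  # .get avoids creating empty entries
            if vector is None:
                continue
            seconds = [pattern[k2]] if k2 in bound else list(vector)
            for b in seconds:
                thirds = vector.get(b)
                if thirds is None:
                    continue
                for c in thirds:
                    if k3 in bound and c != pattern[k3]:
                        continue
                    t = {k1: a, k2: b, k3: c}
                    results.add((t["s"], t["p"], t["o"]))
        return results


store = SextupleIndex()
store.add("book1", "author", "Alice")
store.add("book1", "year", "2007")
store.add("book2", "author", "Alice")

# A query bound only by object, with no property given: the kind of
# query the abstract says property-partitioned tables handle poorly.
by_object = store.match(o="Alice")
```

Because every ordering is present, a pattern with any combination of bound positions starts from an index whose leading keys are exactly the bound ones, rather than scanning one table per property.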