ABSTRACT
Efficient management of RDF data is an important factor in realizing the Semantic Web vision. Performance and scalability issues are becoming increasingly pressing as Semantic Web technology is applied to real-world applications. In this paper, we examine why current data management solutions for RDF data scale poorly, and explore the fundamental scalability limitations of these approaches. We review the state of the art for improving the performance of RDF databases and consider a recent suggestion, "property tables." We then discuss, practically and empirically, why this solution has undesirable features. As an improvement, we propose an alternative solution: vertically partitioning the RDF data. We compare the performance of vertical partitioning with prior art on queries generated by a Web-based RDF browser over a large-scale (more than 50 million triples) catalog of library data. Our results show that a vertically partitioned schema achieves performance similar to the property table technique while being much simpler to design. Further, if a column-oriented DBMS (a database architected specifically for the vertically partitioned case) is used instead of a row-oriented DBMS, another order of magnitude of performance improvement is observed, with query times dropping from minutes to several seconds.
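As a rough illustration of the vertical partitioning idea named above, the following minimal Python sketch splits a list of (subject, property, object) triples into one two-column (subject, object) table per property and answers a simple conjunctive query by intersecting subjects across tables. The function names, the in-memory representation, and the sample triples are assumptions made for illustration; this is not the implementation or the dataset evaluated in the paper.

```python
from collections import defaultdict


def vertically_partition(triples):
    """Group (subject, property, object) triples into one two-column
    (subject, object) table per property, each sorted by subject."""
    tables = defaultdict(list)
    for subject, prop, obj in triples:
        tables[prop].append((subject, obj))
    return {prop: sorted(rows) for prop, rows in tables.items()}


def subjects_matching(tables, conditions):
    """Return the sorted subjects that carry every (property, object)
    pair in `conditions`, touching only the tables those pairs name."""
    result = None
    for prop, obj in conditions:
        matches = {s for s, o in tables.get(prop, []) if o == obj}
        result = matches if result is None else result & matches
    return sorted(result or [])


# Hypothetical sample data, invented for this sketch.
triples = [
    ("book1", "type", "Text"),
    ("book1", "language", "fre"),
    ("book2", "type", "Text"),
    ("book2", "language", "eng"),
    ("cd1", "type", "Audio"),
]

tables = vertically_partition(triples)
french_texts = subjects_matching(tables, [("type", "Text"), ("language", "fre")])
print(french_texts)
```

In this layout a query reads only the tables for the properties it mentions, which is the access pattern a column-oriented store is designed to serve.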