Data mining-based fragmentation of XML data warehouses
Source
Data Warehousing and OLAP
Proceedings of the ACM 11th International Workshop on Data Warehousing and OLAP
Napa Valley, California, USA
Session: Performance optimization and tuning
Pages 9-16
Year of Publication: 2008
ISBN: 978-1-60558-250-4
Authors
Hadj Mahboubi, University of Lyon, Lyon, France
Jérôme Darmont, University of Lyon, Lyon, France
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1458432.1458435

ABSTRACT

With the multiplication of XML data sources, many XML data warehouse models have been proposed to handle data heterogeneity and complexity in a way relational data warehouses fail to achieve. However, XML-native database systems currently suffer from limited performance, in terms of both manageable data volume and response time. Fragmentation helps address both of these issues. Derived horizontal fragmentation is typically used in relational data warehouses and can be adapted to the XML context. However, the number of fragments produced by classical algorithms is difficult to control. In this paper, we propose a k-means-based fragmentation approach in which the number of fragments is controlled through the k parameter. We experimentally compare its efficiency to that of classical derived horizontal fragmentation algorithms adapted to XML data warehouses and show that it is superior.
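
To illustrate how the k parameter can bound the number of fragments, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not describe the procedure, so the sketch assumes that selection predicates from a query workload are clustered by which queries use them, and that each cluster of predicates then defines one horizontal fragment. The predicate strings, the usage matrix, and the deterministic centroid initialization are all hypothetical.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Sequence[float]], k: int, max_iter: int = 100) -> List[int]:
    """Basic k-means; returns the cluster index assigned to each point.

    Assumption for reproducibility: the initial centroids are the first k
    distinct points, not a random sample.
    """
    centroids: List[List[float]] = []
    for p in points:
        if list(p) not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("need at least k distinct points")

    assignment = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def fragment_predicates(predicates: Sequence[str],
                        usage: Sequence[Sequence[int]],
                        k: int) -> List[List[str]]:
    """Group predicates into at most k sets; each set would define one fragment."""
    assignment = kmeans(usage, k)
    groups: List[List[str]] = [[] for _ in range(k)]
    for predicate, cluster in zip(predicates, assignment):
        groups[cluster].append(predicate)
    return [g for g in groups if g]


# Hypothetical workload: one row per predicate, one column per query;
# a 1 means the query uses that predicate.
PREDICATES = [
    "customer/region = 'EU'",
    "date/year = 2008",
    "customer/segment = 'retail'",
    "date/quarter = 'Q4'",
]
USAGE = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]

if __name__ == "__main__":
    for fragment in fragment_predicates(PREDICATES, USAGE, k=2):
        print(fragment)
```

Whatever the workload looks like, `fragment_predicates` never returns more than k groups. That bound is the property the abstract highlights over classical derived horizontal fragmentation algorithms.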



