QoX-driven ETL design: reducing the cost of ETL consulting engagements
Source
International Conference on Management of Data
Proceedings of the 35th SIGMOD international conference on Management of data
Providence, Rhode Island, USA
SESSION: Industrial session 6: industrial directions
Pages 953-960
Year of Publication: 2009
ISBN: 978-1-60558-551-2
Authors
Alkis Simitsis, HP Labs, Palo Alto, CA, USA
Kevin Wilkinson, HP Labs, Palo Alto, CA, USA
Malu Castellanos, HP Labs, Palo Alto, CA, USA
Umeshwar Dayal, HP Labs, Palo Alto, CA, USA
ABSTRACT
As business intelligence becomes increasingly essential for organizations and as it evolves from strategic to operational, the complexity of Extract-Transform-Load (ETL) processes grows. In consequence, ETL engagements have become very time-consuming, labor-intensive, and costly. At the same time, additional requirements besides functionality and performance need to be considered in the design of ETL processes. In particular, the design quality needs to be determined by an intricate combination of different metrics like reliability, maintenance, scalability, and others. Unfortunately, there are no methodologies, modeling languages, or tools to support ETL design in a systematic, formal way for achieving these quality requirements. The current practice handles them with ad hoc approaches based only on designers' experience. This results in either poor designs that do not meet the quality objectives or costly engagements that require several iterations to meet them. A fundamental shift that uses automation in the ETL design task is the only way to reduce the cost of these engagements while obtaining optimal designs. Towards this goal, we present a novel approach to ETL design that incorporates a suite of quality metrics, termed QoX, at all stages of the design process. We discuss the challenges and tradeoffs among QoX metrics and illustrate their impact on alternative designs.