Conceptual modeling for ETL processes
Source: Data Warehousing and OLAP
Proceedings of the 5th ACM international workshop on Data Warehousing and OLAP
McLean, Virginia, USA
Pages: 14-21
Year of Publication: 2002
ISBN: 1-58113-590-4
Authors
Panos Vassiliadis  National Technical University of Athens, Athens, Greece
Alkis Simitsis  National Technical University of Athens, Athens, Greece
Spiros Skiadopoulos  National Technical University of Athens, Athens, Greece
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
SIGMIS: ACM Special Interest Group on Management Information Systems
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 43,   Downloads (12 Months): 390,   Citation Count: 12
DOI: http://doi.acm.org/10.1145/583890.583893

ABSTRACT

Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for extracting data from several sources, cleansing and customizing it, and inserting it into a data warehouse. In this paper, we focus on the problem of defining ETL activities and provide formal foundations for their conceptual representation. The proposed conceptual model is (a) customized for tracing inter-attribute relationships and the respective ETL activities in the early stages of a data warehouse project; (b) enriched with a 'palette' of frequently used ETL activities, such as the assignment of surrogate keys and the check for null values; and (c) constructed in a customizable and extensible manner, so that designers can enrich it with their own recurring patterns for ETL activities.
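
For readers unfamiliar with the two activities the abstract names, the following is a minimal Python sketch of a null-value check followed by a surrogate key assignment. It is an illustration only and not the conceptual model proposed in the paper; the function names, the attribute names (`pkey`, `qty`, `skey`) and the in-memory lookup table are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def check_not_null(rows: List[Row], attribute: str) -> Tuple[List[Row], List[Row]]:
    """Split rows into (accepted, rejected) depending on whether `attribute` is null."""
    accepted: List[Row] = []
    rejected: List[Row] = []
    for row in rows:
        if row.get(attribute) is not None:
            accepted.append(row)
        else:
            rejected.append(row)
    return accepted, rejected


def assign_surrogate_keys(
    rows: List[Row],
    source_key: str,
    lookup: Dict[object, int],
    target: str = "skey",  # hypothetical name for the warehouse key attribute
) -> List[Row]:
    """Replace each source key with an integer surrogate key.

    `lookup` maps source keys to surrogate keys and is extended in place.
    New keys are numbered len(lookup) + 1, which assumes the table is only
    ever filled by this function.
    """
    result: List[Row] = []
    for row in rows:
        key = row[source_key]
        if key not in lookup:
            lookup[key] = len(lookup) + 1
        new_row = {name: value for name, value in row.items() if name != source_key}
        new_row[target] = lookup[key]
        result.append(new_row)
    return result


# Hypothetical source rows; the second one has a null source key.
source_rows: List[Row] = [
    {"pkey": "A10", "qty": 5},
    {"pkey": None, "qty": 2},
    {"pkey": "B20", "qty": 1},
    {"pkey": "A10", "qty": 7},
]

accepted, rejected = check_not_null(source_rows, "pkey")
key_lookup: Dict[object, int] = {}
loaded = assign_surrogate_keys(accepted, "pkey", key_lookup)
```

In this sketch the row with the null key is diverted to `rejected` before key assignment, and both rows with source key `A10` receive the same surrogate key, which is the inter-attribute relationship (source key to warehouse key) that such an activity establishes.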


