Natural language reporting for ETL processes
Source
Data Warehousing and OLAP archive
Proceedings of the ACM 11th international workshop on Data warehousing and OLAP
Napa Valley, California, USA
SESSION: Multidimensional design and ETL
Pages 65-72
Year of Publication: 2008
ISBN: 978-1-60558-250-4
Authors
Alkis Simitsis, HP Labs, Palo Alto, CA, USA
Dimitrios Skoutas, National Technical University of Athens, Athens, Greece
Malú Castellanos, HP Labs, Palo Alto, CA, USA
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 27,   Downloads (12 Months): 170,   Citation Count: 1
DOI: http://doi.acm.org/10.1145/1458432.1458444

ABSTRACT

The conceptual design of Extract-Transform-Load (ETL) processes is a crucial, burdensome, and challenging procedure that takes place in the early phases of a Data Warehouse project. Several models have been proposed for the conceptual design and representation of ETL processes, but all share two drawbacks: they require intensive human effort from designers to create them, and technical knowledge from business people to understand them. In previous work, we eased the former difficulty by automating the conceptual design with Semantic Web technology. In this paper, we build upon our previous results and tackle the second issue by investigating the application of natural language generation techniques to the ETL environment. In particular, we provide a method for representing a conceptual ETL design as a narrative, which is the most natural means of communication and does not require knowledge of any specific model. We discuss how linguistic techniques can be used to establish a common application vocabulary. Finally, we present a flexible and customizable template-based mechanism for generating natural language representations of ETL process requirements and operations.



