From rhetorical structures to document structure: shallow pragmatic analysis for document engineering
Source
Proceedings of the 9th ACM symposium on Document engineering
Munich, Germany
Session: Document and linguistics (I)
Pages 185-192  
Year of Publication: 2009
ISBN:978-1-60558-575-8
Authors
Gersende Georg  Haute Autorité de Santé, Saint-Denis La Plaine Cedex, France
Hugo Hernault  The University of Tokyo, Tokyo, Japan
Marc Cavazza  University of Teesside, Middlesbrough, United Kingdom
Helmut Prendinger  National Institute of Informatics, Tokyo, Japan
Mitsuru Ishizuka  The University of Tokyo, Tokyo, Japan
Sponsors
SIGDOC: ACM Special Interest Group for Design of Communications
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1600193.1600235

ABSTRACT

In this paper, we extend previous work on the automatic structuring of medical documents using content analysis. Our long-term objective is to take advantage of specific rhetorical markers found in specialized medical documents (clinical guidelines) to automatically structure free text according to its role in the document. This should make it possible to generate multiple views of the same document depending on the target audience, to generate document summaries, and to facilitate knowledge extraction from text. We have established in previous work that the structure of clinical guidelines can be refined through the identification of a limited set of deontic operators. We now propose to extend this approach by analyzing the text delimited by these operators using Rhetorical Structure Theory (RST). The emphasis on causality and time in RST proves a powerful complement to the recognition of deontic structures while retaining the same philosophy of high-level recognition of sentence structure, which can be converted into application-specific mark-ups. Throughout the paper, we illustrate our findings with results produced by the automatic processing of English guidelines for the management of hypertension and Alzheimer disease.
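
As a rough illustration of the idea summarized above, the following Python sketch tags sentences that contain a deontic marker and records a crude causal or temporal cue as a stand-in for an RST relation. It is not the authors' system: the marker lists, the `recommendation` element, its `type` and `relation` attributes, and the function names are all hypothetical choices made for this example, and real RST parsing is far more involved than keyword matching.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical marker lists, for illustration only (not taken from the paper).
# Prohibitions come first so that "should not" is not read as "should".
DEONTIC_MARKERS = [
    ("prohibition", ["must not", "should not", "is not recommended"]),
    ("obligation", ["must", "should", "is recommended", "are recommended"]),
    ("permission", ["may", "can be considered"]),
]

# Very crude surface cues standing in for causal and temporal RST relations.
RELATION_CUES = [
    ("cause", ["because", "due to"]),
    ("temporal", ["after", "before", "until"]),
]


def find_label(sentence, table):
    """Return the label of the first cue found in the sentence, or None."""
    lowered = sentence.lower()
    for label, cues in table:
        for cue in cues:
            if re.search(r"\b" + re.escape(cue) + r"\b", lowered):
                return label
    return None


def annotate(text):
    """Split text into sentences and wrap deontic ones in hypothetical markup."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    annotated = []
    for sentence in sentences:
        deontic = find_label(sentence, DEONTIC_MARKERS)
        if deontic is None:
            # No deontic marker: leave the sentence as plain (escaped) text.
            annotated.append(escape(sentence))
            continue
        attrs = f' type="{deontic}"'
        relation = find_label(sentence, RELATION_CUES)
        if relation is not None:
            attrs += f' relation="{relation}"'
        annotated.append(f"<recommendation{attrs}>{escape(sentence)}</recommendation>")
    return annotated


if __name__ == "__main__":
    # Invented example sentences, not quoted from any clinical guideline.
    sample = (
        "Blood pressure should be measured after five minutes of rest. "
        "Hypertension is common."
    )
    for line in annotate(sample):
        print(line)
```

Keeping the marker tables as plain data means the same pass could emit a different application-specific mark-up by changing only the output template, which loosely mirrors the abstract's point about converting recognized structures into mark-ups.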

