A corpus analysis of simple account texts and the proposal of simplification strategies: first steps towards text simplification systems
Source
ACM Special Interest Group for Design of Communication archive
Proceedings of the 26th annual ACM international conference on Design of communication
Lisbon, Portugal
SESSION: Documentation and design
Pages 15-22  
Year of Publication: 2008
ISBN:978-1-60558-083-8
Authors
Sandra M. Aluísio  Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil
Lucia Specia  Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil
Thiago A. S. Pardo  Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil
Erick G. Maziero  Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil
Helena M. Caseli  Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil
Renata P. M. Fortes  Universidade de São Paulo, São Carlos/SP, Brasil
Sponsor
SIGDOC: ACM Special Interest Group for Design of Communications
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1456536.1456540

ABSTRACT

In this paper we investigate the main linguistic phenomena that can make texts complex and how they could be simplified. We focus on a corpus analysis of simple account texts available on the web for Brazilian Portuguese (BP). This study illustrates the need for text simplification to facilitate accessibility to information by poor readers and by people with cognitive disabilities. It also highlights features of simplification for BP, which may differ from other languages. Moreover, we propose simplification strategies and a Simplification Annotation Editor. This study consists of the first step towards building BP text simplification systems. One of the scenarios in which these systems could be used is that of reading electronic texts produced, e.g., by the Brazilian government or by news agencies.



