Towards Brazilian Portuguese automatic text simplification systems
Source
Proceedings of the Eighth ACM Symposium on Document Engineering
São Paulo, Brazil
SESSION: Content processing
Pages 240-248
Year of Publication: 2008
ISBN: 978-1-60558-081-4
Authors
Sandra M. Aluísio, University of São Paulo, São Carlos, Brazil
Lucia Specia, University of São Paulo, São Carlos, Brazil
Thiago A.S. Pardo, University of São Paulo, São Carlos, Brazil
Erick G. Maziero, University of São Paulo, São Carlos, Brazil
Renata P.M. Fortes, University of São Paulo, São Carlos, Brazil
Sponsors
SIGDOC: ACM Special Interest Group on Systems Documentation
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI
http://doi.acm.org/10.1145/1410140.1410191

ABSTRACT

In this paper we investigate the main linguistic phenomena that can make texts complex and how such texts could be simplified. We focus on a corpus analysis of simple account texts available on the web for Brazilian Portuguese and propose simplification strategies for this language. This study illustrates the need for text simplification to make information accessible to readers with poor literacy and potentially to people with other cognitive disabilities. It also highlights characteristics of simplification for Portuguese, which may differ from those of other languages. The study is a first step towards building Brazilian Portuguese text simplification systems. One scenario in which these systems could be used is the reading of electronic texts produced, e.g., by the Brazilian government or by relevant news agencies.


