ABSTRACT
In this paper we investigate the main linguistic phenomena that can make texts complex and how they could be simplified. We focus on a corpus analysis of simple account texts available on the web for Brazilian Portuguese and propose simplification strategies for this language. This study illustrates the need for text simplification to make information more accessible to low-literacy readers and, potentially, to people with other cognitive disabilities. It also highlights characteristics of simplification for Portuguese that may differ from those of other languages. The study is a first step towards building text simplification systems for Brazilian Portuguese. One scenario in which such systems could be used is the reading of electronic texts produced, for example, by the Brazilian government or by relevant news agencies.
Aluísio, S. M., Specia, L., Pardo, T. A. S., Maziero, E. G., Caseli, H. M., Fortes, R. P. M. 2008. A corpus analysis of simple account texts and the proposal of simplification strategies: first steps towards text simplification systems. In Proceedings of the 26th Annual ACM International Conference on Design of Communication (Lisbon, Portugal, September 22-24, 2008). doi: 10.1145/1456536.1456540