Classifying XML tags through "reading contexts"
Document Engineering
Proceedings of the 2005 ACM symposium on Document engineering
Bristol, United Kingdom
SESSION: Document authoring, markup and manipulation 1
Pages: 143 - 145
Year of Publication: 2005
ISBN: 1-59593-240-2
ABSTRACT
Some tags used in XML documents create arbitrary breaks in the natural flow of the text. This can impede the application of some document engineering methods. This article introduces the concept of "reading contexts" and gives pointers for handling it both theoretically and in practice. This work should notably make it possible to recognize emphasis tags in a text, to define a new concept of term proximity in structured documents, to improve indexing techniques, and to open the way to advanced linguistic analyses of XML corpora.
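To make the problem concrete, the following minimal sketch shows how the choice of which tags interrupt the reading flow changes the text segments recovered from an XML document. It is an illustration only and not the method described in the article: the function `reading_segments`, the tag set `TRANSPARENT_TAGS`, and the sample document are all hypothetical, and the abstract does not say how tags are actually classified.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of tags assumed NOT to interrupt the reading flow
# (an assumption for illustration; the article does not list these).
TRANSPARENT_TAGS = {"b", "i", "em"}


def reading_segments(xml_text, transparent=TRANSPARENT_TAGS):
    """Return text segments, starting a new one at every non-transparent tag."""
    root = ET.fromstring(xml_text)
    segments = [[]]

    def visit(elem):
        breaks = elem.tag not in transparent
        if breaks:
            segments.append([])          # opening tag breaks the flow
        if elem.text:
            segments[-1].append(elem.text)
        for child in elem:
            visit(child)
            if child.tail:               # text following the child's end tag
                segments[-1].append(child.tail)
        if breaks:
            segments.append([])          # closing tag breaks the flow

    visit(root)
    joined = ("".join(parts) for parts in segments)
    # Normalize whitespace and drop empty segments.
    return [" ".join(s.split()) for s in joined if s.strip()]


doc = (
    "<article><title>XML tags</title>"
    "<p>An <b>i</b>mportant word and <em>emphasis</em> here.</p>"
    "<p>Next paragraph.</p></article>"
)

# Emphasis-like tags treated as transparent: the sentence stays whole.
print(reading_segments(doc))
# Every tag treated as a break: the word "important" is cut in two.
print(reading_segments(doc, transparent=set()))
```

In the second call the sentence is fragmented and the word "important" is split across two segments, which is the kind of arbitrary break that would distort term proximity or indexing if every tag were treated as a boundary.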