Classifying XML tags through "reading contexts"
Source: Proceedings of the 2005 ACM Symposium on Document Engineering
Bristol, United Kingdom
Session: Document authoring, markup and manipulation 1
Pages: 143-145
Year of Publication: 2005
ISBN:1-59593-240-2
Authors
Xavier Tannier  École Nationale Supérieure des Mines, France
Jean-Jacques Girardot  École Nationale Supérieure des Mines, France
Mihaela Mathieu  École Nationale Supérieure des Mines, France
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1096601.1096638

ABSTRACT

Some tags used in XML documents create arbitrary breaks in the natural flow of the text, which can impede the application of some document engineering methods. This article introduces the concept of "reading contexts" and gives clues for handling it both theoretically and in practice. This work should notably make it possible to recognize emphasis tags in a text, to define a new concept of term proximity in structured documents, to improve indexing techniques, and to open the way to advanced linguistic analyses of XML corpora.
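
As a rough illustration of the problem the abstract describes, the following Python sketch splits the text of an XML fragment into contiguous reading segments. It is not the authors' method: the set INLINE_TAGS, the function name reading_segments, and the sample element names are all assumptions made for this example. Tags listed in INLINE_TAGS are treated as not interrupting the flow of the text (as an emphasis tag might), while every other tag is assumed to start a new segment.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of tags assumed NOT to interrupt the reading flow
# (for instance emphasis tags). The real classification is corpus-dependent.
INLINE_TAGS = {"b", "i", "em"}


def reading_segments(element, inline_tags=INLINE_TAGS):
    """Return the text under `element` as a list of contiguous segments.

    Text inside a tag from `inline_tags` stays in the current segment;
    any other tag closes the current segment and opens a new one.
    """
    segments = [""]

    def walk(node):
        segments[-1] += node.text or ""
        for child in node:
            if child.tag in inline_tags:
                walk(child)          # same segment: the tag is "transparent"
            else:
                segments.append("")  # the tag breaks the flow before its content
                walk(child)
                segments.append("")  # and again after its content
            segments[-1] += child.tail or ""

    walk(element)
    # Normalize whitespace and drop empty segments.
    return [" ".join(s.split()) for s in segments if s.strip()]


doc = ET.fromstring(
    "<sec><title>Intro</title>"
    "<p>A <b>bo</b>ld word and <i>more</i>.</p>"
    "<p>Next</p></sec>"
)
print(reading_segments(doc))
# ['Intro', 'A bold word and more.', 'Next']
```

Under these assumptions, the word "bold" is kept whole even though a tag splits it in the markup, and terms in different paragraphs end up in different segments, which is the kind of distinction a structure-aware notion of term proximity or indexing would rely on.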

