Enhancing composite digital documents using XML-based standoff markup
Source: Proceedings of the 2005 ACM Symposium on Document Engineering
Location: Bristol, United Kingdom
Session: Document authoring, markup and manipulation 2
Pages: 177-186
Year of Publication: 2005
ISBN: 1-59593-240-2
Authors
Peter L. Thomas  University of Nottingham, Nottingham, UK
David F. Brailsford  University of Nottingham, Nottingham, UK
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1096601.1096647

ABSTRACT

Document representations can rapidly become unwieldy if they try to encapsulate all possible document properties, ranging from abstract structure to detailed rendering and layout. We present a composite document approach wherein an XML-based document representation is linked via a 'shadow tree' of bi-directional pointers to a PDF representation of the same document. Using a two-window viewer any material selected in the PDF can be related back to the corresponding material in the XML, and vice versa. In this way the treatment of specialist material such as mathematics, music or chemistry (e.g. via 'read aloud' or 'play aloud') can be activated via standard tools working within the XML representation, rather than requiring that application-specific structures be embedded in the PDF itself. The problems of textual recognition and tree pattern matching between the two representations are discussed in detail. Comparisons are drawn between our use of a shadow tree of pointers to map between document representations and the use of a code-replacement shadow tree in technologies such as XBL.
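
The idea of a shadow tree of bi-directional pointers can be illustrated with a minimal sketch. It is not the authors' implementation: the names PdfRegion, ShadowNode and ShadowTree, the use of ElementTree paths as XML pointers, and the use of page-plus-bounding-box as PDF pointers are all assumptions made for illustration, and the coordinates are invented.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PdfRegion:
    """Hypothetical PDF pointer: a page number and a bounding box (x0, y0, x1, y1)."""
    page: int
    bbox: Tuple[float, float, float, float]

    def contains(self, x: float, y: float) -> bool:
        x0, y0, x1, y1 = self.bbox
        return x0 <= x <= x1 and y0 <= y <= y1


@dataclass
class ShadowNode:
    """One standoff node: points into the XML (by path) and into the PDF (by region)."""
    xml_path: str
    region: Optional[PdfRegion]  # None for nodes with no single region, e.g. the root
    children: List["ShadowNode"] = field(default_factory=list)


class ShadowTree:
    def __init__(self, root: ShadowNode) -> None:
        self.root = root
        self._by_path: Dict[str, ShadowNode] = {}
        self._index(root)

    def _index(self, node: ShadowNode) -> None:
        self._by_path[node.xml_path] = node
        for child in node.children:
            self._index(child)

    def xml_to_pdf(self, xml_path: str) -> Optional[PdfRegion]:
        """XML -> PDF direction: look up the region recorded for an XML path."""
        node = self._by_path.get(xml_path)
        return node.region if node is not None else None

    def pdf_to_xml(self, page: int, x: float, y: float) -> Optional[str]:
        """PDF -> XML direction: path of the deepest node whose region holds the point."""
        def search(node: ShadowNode) -> Optional[str]:
            region = node.region
            if region is not None and not (region.page == page and region.contains(x, y)):
                return None
            for child in node.children:
                hit = search(child)
                if hit is not None:
                    return hit
            return node.xml_path if region is not None else None

        return search(self.root)


# Invented example document and layout.
xml_doc = ET.fromstring(
    "<article><section><para>Energy is</para>"
    "<formula>E = mc^2</formula></section></article>"
)
tree = ShadowTree(
    ShadowNode(".", None, [
        ShadowNode("./section", PdfRegion(1, (50, 100, 550, 300)), [
            ShadowNode("./section/para", PdfRegion(1, (50, 250, 550, 300))),
            ShadowNode("./section/formula", PdfRegion(1, (200, 100, 400, 150))),
        ]),
    ])
)

selected = tree.pdf_to_xml(1, 300, 120)   # a selection made in the PDF window
print(selected, "->", xml_doc.find(selected).text)
print(tree.xml_to_pdf("./section/para"))  # a selection made in the XML window
```

Because the pointers live in a separate structure, neither the XML nor the PDF has to be modified, and a tool that understands only the XML (for example a reader for mathematical markup) can be applied to whatever the user selected in the PDF view.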

