Quantifying software requirements for supporting archived office documents using emulation
Source: International Conference on Digital Libraries
Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries
Chapel Hill, NC, USA
Session: Digital preservation
Pages: 86-94
Year of Publication: 2006
ISBN: 1-59593-354-9
Authors
Thomas Reichherzer, Indiana University, Bloomington, IN
Geoffrey Brown, Indiana University, Bloomington, IN
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1141753.1141770

ABSTRACT

This paper addresses the issues associated with building software images to support a collection of archived documents using machine emulators. Emulation has been proposed as a strategy for preserving digital documents that require their original software for access. Creating software images is a critical step in archiving documents via emulation. A software image includes the operating system, application software, and supporting software artifacts such as fonts and codecs (compression-decompression algorithms). A practical emulation environment for a digital document requires both an emulator and a software image. This paper considers the issues associated with creating such software images to support Microsoft Office documents. In particular, we discuss a set of software tools and strategies that we developed to automatically analyze the dependencies of Microsoft Office documents on software resources and supporting files. As a proof of concept, the tools and strategies have been applied to establish the dependencies of Office documents from a document library containing approximately 200,000 documents and to automatically collect missing resources such as fonts. The software tools are a first step toward an interactive system that aids in the construction of robust emulation environments for preserving digital artifacts. However, they may also be used in other contexts, for example, to screen documents for archiving and for migration to new platforms to ensure correct visualization.
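
The kind of dependency check the abstract describes can be illustrated with a minimal sketch. This is not the authors' tooling: it assumes that the fonts referenced by each document have already been extracted by some other means, and the function names (`missing_fonts`, `rank_missing`) and the sample file and font lists are hypothetical. It only shows the comparison step, that is, which referenced fonts are absent from a software image and which of those affect the most documents.

```python
from collections import Counter


def missing_fonts(doc_fonts, image_fonts):
    """Map each document to the referenced fonts that the image lacks.

    doc_fonts: dict of document name -> list of font names it references
               (assumed to be extracted beforehand; hypothetical input).
    image_fonts: iterable of font names installed in the software image.
    """
    # Compare names case-insensitively, since font names often differ in case.
    installed = {name.casefold() for name in image_fonts}
    report = {}
    for doc, fonts in doc_fonts.items():
        absent = sorted(f for f in set(fonts) if f.casefold() not in installed)
        if absent:
            report[doc] = absent
    return report


def rank_missing(report):
    """Count how many documents need each missing font, most needed first."""
    counts = Counter(font for fonts in report.values() for font in fonts)
    return counts.most_common()


# Hypothetical sample data, for illustration only.
doc_fonts = {
    "report.doc": ["Arial", "Garamond"],
    "slides.ppt": ["Arial", "Comic Sans MS", "Garamond"],
    "budget.xls": ["Arial"],
}
image_fonts = ["Arial", "Times New Roman"]

report = missing_fonts(doc_fonts, image_fonts)
ranking = rank_missing(report)

for doc, fonts in sorted(report.items()):
    print(f"{doc}: missing {', '.join(fonts)}")
for font, count in ranking:
    print(f"{font}: needed by {count} document(s)")
```

Ranking missing fonts by the number of affected documents is one plausible way to prioritize which resources to collect first when building an image for a large collection.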



