ABSTRACT
The concept of thumbnails is common in image representation. A thumbnail is a highly compressed version of an image that provides a small yet complete visual representation to the human eye. We propose adapting the concept of thumbnails to the domain of documents: a thumbnail of any document can be generated from its semantic content and provides an adequate amount of information about that document. Unlike image thumbnails, however, document thumbnails are mainly for the consumption of software such as search engines and other content-processing systems. With the advent of the semantic web, machine processing of documents has become an important requirement. We give particular attention to electronic documents in XML and in RDF/XML, with a view towards the processing of documents in the semantic web.