ABSTRACT
With the number and types of documents in digital library systems increasing, tools for automatically organizing and presenting the content have to be found. While many approaches focus on topic-based organization and structuring, hardly any system incorporates automatic structural analysis and representation. Yet genre information (often unconsciously) forms one of the most distinguishing features in conventional libraries and in information searches. In this paper we present an approach to automatically analyze the structure of documents and to integrate this information into an automatically created content-based organization. In the resulting visualization, documents on similar topics, yet representing different genres, are depicted as books in differing colors. This representation intuitively supports users in locating relevant information presented in a relevant form.
CITED BY 7
Eric Bier, Lance Good, Kris Popat, Alan Newberger, A document corpus browser for in-depth reading, Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries, June 07-11, 2004, Tucson, AZ, USA
Susan L. Price, Marianne Lykke Nielsen, Lois M. L. Delcambre, Peter Vedsted, Jeremy Steinhauer, Using semantic components to search for domain-specific documents: An evaluation from the system perspective and the user perspective, Information Systems, v.34 n.8, p.778-806, December, 2009
INDEX TERMS
Primary Classification: H. Information Systems; H.3 INFORMATION STORAGE AND RETRIEVAL
General Terms: Design, Documentation, Experimentation, Human Factors, Management, Measurement, Performance, Theory
Keywords: SOMLib, document clustering, genre analysis, metaphor graphics, self-organizing map (SOM), visualization