An initial evaluation of automated organization for digital library browsing
Source
International Conference on Digital Libraries
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Denver, CO, USA
SESSION: Tools & techniques track: browsing and visualizing collections
Pages: 246 - 255
Year of Publication: 2005
ISBN: 1-58113-876-8
ABSTRACT
In this article we present an evaluation of text clustering and classification methods for creating digital library browse interfaces, focusing on the particular case of collections made up of heterogeneous metadata records. This situation is common in "portal" style digital libraries, which are built by harvesting content from many disparate sources, typically using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). By studying the activity of users in an experimental system, we find that taxonomies built or populated using machine-learning (or "AI") techniques provide a potentially useful avenue for browsing in this digital library scenario.
CITED BY 4
David Newman, Kat Hagedorn, Chaitanya Chemudugunta, Padhraic Smyth, Subject metadata enrichment using statistical topic models, Proceedings of the 2007 conference on Digital libraries, June 18-23, 2007, Vancouver, BC, Canada
INDEX TERMS
Primary Classification:
H. Information Systems
H.3 INFORMATION STORAGE AND RETRIEVAL
H.3.7 Digital Libraries
General Terms: Experimentation, Human Factors, Measurement
Keywords: NMF, browsing, categorization, classification, clustering, digital libraries, harvesting, portals, taxonomies