ABSTRACT
In spite of its tremendous value, metadata is generally sparse and incomplete, which hampers the effectiveness of digital information services. Many of the existing mechanisms for the automated creation of metadata rely primarily on content analysis, which can be costly and inefficient. The automatic metadata generation system proposed in this article leverages resource relationships generated from existing metadata as a medium for propagation from metadata-rich to metadata-poor resources. Because it is independent of content analysis, it can be applied to a wide variety of resource media types and is shown to be computationally inexpensive. The proposed method operates in two distinct phases. First, occurrence and co-occurrence algorithms generate an associative network of repository resources from existing repository metadata. Second, using the associative network as a substrate, metadata associated with metadata-rich resources is propagated to metadata-poor resources by means of a discrete-form spreading activation algorithm. This article discusses the general framework for building associative networks, an algorithm for disseminating metadata through such networks, and the results of an experiment validating the proposed method on a standard bibliographic dataset.
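The two phases can be illustrated with a minimal Python sketch. It is not the article's implementation: the abstract does not specify edge weighting, decay, or stopping rules, so the function names (`build_network`, `propagate`), the choice of shared authors as the co-occurrence signal, the `decay` factor, and the fixed number of `steps` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_network(resources, key="authors"):
    """Phase 1 (assumed form): link resources that share a value of `key`.

    The edge weight is the number of shared values (a simple co-occurrence count).
    """
    index = defaultdict(set)  # metadata value -> ids of resources carrying it
    for rid, meta in resources.items():
        for value in meta.get(key, ()):
            index[value].add(rid)

    graph = defaultdict(dict)  # undirected weighted adjacency map
    for rids in index.values():
        for a, b in combinations(sorted(rids), 2):
            graph[a][b] = graph[a].get(b, 0.0) + 1.0
            graph[b][a] = graph[b].get(a, 0.0) + 1.0
    return graph


def propagate(graph, resources, field="keywords", steps=2, decay=0.5):
    """Phase 2 (assumed form): discrete spreading activation.

    Each value of `field` starts with unit energy at the resource that holds it.
    At every step the energy at a node is split among its neighbours in
    proportion to edge weight and multiplied by `decay` (hypothetical parameter).
    """
    scores = defaultdict(lambda: defaultdict(float))
    for source, meta in resources.items():
        for value in meta.get(field, ()):
            frontier = {source: 1.0}
            for _ in range(steps):
                nxt = defaultdict(float)
                for node, energy in frontier.items():
                    neighbours = graph.get(node, {})
                    total = sum(neighbours.values())
                    if total == 0:
                        continue  # isolated node: energy goes nowhere
                    for nb, weight in neighbours.items():
                        nxt[nb] += decay * energy * weight / total
                for nb, energy in nxt.items():
                    if nb != source:  # do not recommend a value back to its origin
                        scores[nb][value] += energy
                frontier = nxt
    return scores


# Toy repository: r2 has no keywords but shares author "B" with r1.
resources = {
    "r1": {"authors": ["A", "B"], "keywords": ["metadata"]},
    "r2": {"authors": ["B"], "keywords": []},
    "r3": {"authors": ["C"], "keywords": ["graphs"]},
}
graph = build_network(resources)
scores = propagate(graph, resources)
```

In this toy repository the metadata-poor resource `r2` receives the keyword `metadata` from `r1` with a score of 0.5, while `r3` shares no author with the others and neither sends nor receives anything. No content analysis is involved; only existing metadata is read.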