ABSTRACT
Measurements of the impact and history of research literature provide a useful complement to scientific digital library collections. Bibliometric indicators have been extensively studied, mostly in the context of journals. However, journal-based metrics poorly capture topical distinctions in fast-moving fields, and are increasingly problematic with the rise of open-access publishing. Recent developments in latent topic models have produced promising results for automatic sub-field discovery. The fine-grained, faceted topics produced by such models provide a clearer view of the topical divisions of a body of research literature and the interactions between those divisions. We demonstrate the usefulness of topic models in measuring impact by applying a new phrase-based topic discovery model to a collection of 300,000 Computer Science publications, collected by the Rexa automatic citation indexing system.