Mining concepts from code with probabilistic topic models
Source
Automated Software Engineering
Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering
Atlanta, Georgia, USA
POSTER SESSION: Posters
Pages 461-464
Year of Publication: 2007
ISBN: 978-1-59593-882-4
Authors
Erik Linstead, University of California, Irvine, Irvine, CA
Paul Rigor, University of California, Irvine, Irvine, CA
Sushil Bajracharya, University of California, Irvine, Irvine, CA
Cristina Lopes, University of California, Irvine, Irvine, CA
Pierre Baldi, University of California, Irvine, Irvine, CA
Bibliometrics
Downloads (6 Weeks): 18, Downloads (12 Months): 113, Citation Count: 0
ABSTRACT
We develop and apply statistical topic models to software as a means of extracting concepts from source code. The effectiveness of the technique is demonstrated on 1,555 projects from SourceForge and Apache consisting of 113,000 files and 19 million lines of code. In addition to providing an automated, unsupervised solution to the problem of summarizing program functionality, the approach provides a probabilistic framework with which to analyze and visualize source file similarity. Finally, we introduce an information-theoretic approach for computing tangling and scattering of extracted concepts, and present preliminary results.
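The abstract does not give the formulas behind the tangling and scattering measures, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes that a topic model has already produced a file-by-concept weight matrix (the `doc_topic` values below are made up), that scattering of a concept can be approximated by the normalized Shannon entropy of that concept's weight across files, and that tangling of a file can be approximated by the normalized entropy of that file's weight across concepts. All function names are hypothetical.

```python
import math


def normalized_entropy(weights):
    """Shannon entropy of nonnegative weights, scaled to the range [0, 1].

    0 means all mass sits on one entry; 1 means the mass is spread uniformly.
    """
    total = sum(weights)
    if total <= 0 or len(weights) < 2:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(weights))


# Hypothetical file-by-concept weights (rows: files, columns: concepts).
# In practice these would come from a fitted topic model.
doc_topic = [
    [0.9, 0.1, 0.0],
    [0.8, 0.0, 0.2],
    [0.0, 0.0, 1.0],
]


def scattering(doc_topic, topic):
    """Assumed measure: how evenly one concept is spread across files."""
    return normalized_entropy([row[topic] for row in doc_topic])


def tangling(doc_topic, doc):
    """Assumed measure: how evenly one file is spread across concepts."""
    return normalized_entropy(doc_topic[doc])


if __name__ == "__main__":
    for t in range(3):
        print("scattering of concept", t, "=", round(scattering(doc_topic, t), 3))
    for d in range(3):
        print("tangling of file", d, "=", round(tangling(doc_topic, d), 3))
```

Under these assumptions, a concept confined to a single file scores 0 for scattering, and a file devoted to a single concept scores 0 for tangling; values near 1 indicate a concept spread across many files or a file mixing many concepts.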