Classification of source code archives
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
POSTER SESSION: Posters
Pages: 425-426
Year of Publication: 2003
ISBN: 1-58113-646-3
ABSTRACT
The World Wide Web contains a number of source code archives. Programs within an archive are usually classified into categories by hand. We report on experiments in automatically classifying source code into these categories, and we examine a number of factors that affect classification accuracy. Weighting features by expected entropy loss significantly improves classification accuracy. We show that a Support Vector Machine can be trained to classify source code with a high degree of accuracy. We believe these results show promise for software reuse.
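To make the feature-weighting idea concrete, here is a minimal sketch of expected entropy loss for a single feature. It assumes binary token-presence features and a one-versus-rest category label; the abstract does not describe the paper's feature extraction, so the function names and the toy data below are hypothetical and only illustrate the standard information-theoretic definition.

```python
import math


def entropy(p):
    """Binary entropy, in bits, of a class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def expected_entropy_loss(docs, labels, feature):
    """Prior class entropy minus the expected entropy after observing
    whether `feature` is present.

    docs   : list of token sets, one per program (assumed representation)
    labels : list of bools, True if the program belongs to the category
    """
    n = len(docs)
    if n == 0:
        return 0.0

    # Split the labels by whether the feature occurs in each program.
    with_f = [lab for doc, lab in zip(docs, labels) if feature in doc]
    without_f = [lab for doc, lab in zip(docs, labels) if feature not in doc]

    def group_entropy(group):
        # An empty group contributes nothing to the expectation.
        return entropy(sum(group) / len(group)) if group else 0.0

    prior = entropy(sum(labels) / n)
    posterior = (len(with_f) / n) * group_entropy(with_f) \
        + (len(without_f) / n) * group_entropy(without_f)
    return prior - posterior


# Hypothetical toy data: two "networking" programs and two others.
docs = [
    {"socket", "bind"},
    {"socket", "listen"},
    {"malloc", "free"},
    {"printf", "free"},
]
labels = [True, True, False, False]

for token in ("socket", "bind", "xyz"):
    print(token, expected_entropy_loss(docs, labels, token))
```

A token that separates the category perfectly receives the highest weight, while a token that never occurs receives zero. Tokens ranked this way could then be kept or weighted before training a classifier such as a Support Vector Machine.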