Detecting similar Java classes using tree algorithms
Source: International Conference on Software Engineering
Proceedings of the 2006 international workshop on Mining software repositories
Shanghai, China
SESSION: Matching
Pages: 65-71
Year of Publication: 2006
ISBN:1-59593-397-2
Authors
Tobias Sager  University of Zurich, Switzerland
Abraham Bernstein  University of Zurich, Switzerland
Martin Pinzger  University of Zurich, Switzerland
Christoph Kiefer  University of Zurich, Switzerland
Sponsors
ACM: Association for Computing Machinery
SIGSOFT: ACM Special Interest Group on Software Engineering
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1137983.1138000

ABSTRACT

Similarity analysis of source code is helpful during development, for instance to provide better support for code reuse. Consider a development environment that analyzes code as it is typed and suggests similar code examples or existing implementations from a source code repository. Mining software repositories by means of similarity measures enables and enforces the reuse of existing code and reduces development effort by creating a shared knowledge base of code fragments. In information retrieval, similarity measures are often used to find documents similar to a given query document. This paper extends this idea to source code repositories. It introduces our approach to detecting similar Java classes in software projects using tree similarity algorithms. We show how our approach finds similar Java classes, based on an evaluation of three tree-based similarity measures in the context of five user-defined test cases as well as a preliminary software evolution analysis of a medium-sized Java project. Initial results indicate that our technique (1) is indeed useful for identifying similar Java classes, (2) successfully identifies the ex ante and ex post versions of refactored classes, and (3) provides some interesting insights into within-version and between-version dependencies of classes within a Java project.
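
The abstract does not say which three tree-based measures were evaluated or how classes are turned into trees. As an illustration of the general idea only, the sketch below computes one plausible measure: the size of the largest top-down ordered common subtree of two labeled trees, normalized by the combined tree sizes. The `Node` type, the node labels ("Class", "Attribute", "Method", "Parameter"), and the normalization are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A labeled, ordered tree node (hypothetical stand-in for a class model)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the tree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def common_size(a: Node, b: Node) -> int:
    """Size of the largest top-down ordered common subtree rooted at a and b.

    Roots must carry the same label. Children are aligned left to right with
    an LCS-style dynamic program that maximizes the total matched size.
    """
    if a.label != b.label:
        return 0
    m, n = len(a.children), len(b.children)
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            paired = best[i - 1][j - 1] + common_size(
                a.children[i - 1], b.children[j - 1]
            )
            best[i][j] = max(best[i - 1][j], best[i][j - 1], paired)
    return 1 + best[m][n]


def similarity(a: Node, b: Node) -> float:
    """Normalized similarity in [0, 1]; 1.0 means the trees are identical."""
    return 2 * common_size(a, b) / (size(a) + size(b))


# Two toy "class" trees that differ by one extra method with a parameter.
stack_cls = Node("Class", [
    Node("Attribute"),
    Node("Method", [Node("Parameter")]),
    Node("Method"),
])
queue_cls = Node("Class", [
    Node("Attribute"),
    Node("Method", [Node("Parameter")]),
    Node("Method", [Node("Parameter")]),
    Node("Method"),
])

if __name__ == "__main__":
    print(similarity(stack_cls, queue_cls))
```

Because this toy version compares labels only by node kind, it captures structural similarity and ignores identifiers. A practical measure would likely also compare names, types, or other attributes of each node.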


