Visualizing textual redundancy in legacy source
Source
IBM Centre for Advanced Studies Conference
Proceedings of the 1994 conference of the Centre for Advanced Studies on Collaborative research
Toronto, Ontario, Canada
Page: 32
Year of Publication: 1994
Author
J. Howard Johnson
Institute for Information Technology, National Research Council of Canada, Ottawa, Ontario K1A 0R6
Publisher
IBM Press
ABSTRACT
As a result of maintenance activity, legacy systems contain repeated text in the form of large and small blocks that appear in more or less the same form in several places. These repetitions define a structure that can contribute information about the development history of the source different from the documented version or the current directory structure.
A strategy based on fingerprinting is used to obtain raw matches indicating where repetitions occur. The information inherent in these matches is then reorganized for easier processing, leading to a natural clustering of substrings. Suppression of detail is usually necessary to make further progress and can be done in several different ways.
For example, matches of blocks of text identify associations within groups of files. In cases with complex clusters of files involving multiple overlapping subsets of files, Hasse diagrams can support visualization. Techniques useful for understanding such graphs can then be employed to provide significant insights into the structure of the redundancy and hence the source.
The paper discusses this approach and shows results obtained from an example of reasonable size (40 Mbytes of source based on two releases of the GNU gcc compiler).
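The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the fingerprint function, window size, or data structures, so the fixed window of lines, the SHA-1 digest, and the names `fingerprints`, `file_sets`, and `hasse_edges` are all assumptions made here for illustration. The sketch fingerprints every window of consecutive lines, groups files that share a fingerprint, and computes the covering relation of subset inclusion among those groups, which is the set of edges a Hasse diagram would draw.

```python
import hashlib
from collections import defaultdict


def fingerprints(text, window=2):
    """Map each fingerprint to the starting line indices of its windows.

    Assumption: a fingerprint is a truncated SHA-1 digest of `window`
    consecutive whitespace-stripped lines.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    found = defaultdict(list)
    for i in range(len(lines) - window + 1):
        chunk = "\n".join(lines[i:i + window])
        digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()[:16]
        found[digest].append(i)
    return found


def file_sets(files, window=2):
    """Return the distinct groups of files that share at least one fingerprint."""
    owners = defaultdict(set)
    for name, text in files.items():
        for fp in fingerprints(text, window):
            owners[fp].add(name)
    # Suppress detail: keep only which files match, not where they match.
    return {frozenset(group) for group in owners.values() if len(group) > 1}


def hasse_edges(groups):
    """Covering relation of subset inclusion among the file groups.

    (a, b) is an edge when a is a proper subset of b and no other group
    lies strictly between them.
    """
    edges = set()
    for a in groups:
        for b in groups:
            if a < b and not any(a < c < b for c in groups):
                edges.add((a, b))
    return edges


# Hypothetical toy input: three tiny "files" with overlapping lines.
files = {
    "a.c": "int x;\nint y;\nint z;\n",
    "b.c": "int x;\nint y;\nint w;\n",
    "c.c": "int x;\nint y;\nint z;\n",
}
groups = file_sets(files)
edges = hasse_edges(groups)
```

In this toy input, all three files share the first two lines, while only `a.c` and `c.c` share the last two, so the group of two files sits directly below the group of three in the inclusion order. A real corpus would need a more robust fingerprinting scheme and some merging of adjacent matches into longer blocks, which this sketch omits.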
CITED BY 9
Michael Whitney, Morris Bernstein, Renato De Mori, Kostas Kontogiannis, Brian Corrie, Hausi Müller, Scott Tilley, Ettore Merlo, John Mylopoulos, Kenny Wong, J. Howard Johnson, James McDaniel, Martin Stanley, Using an integrated toolset for program understanding, Proceedings of the 1995 conference of the Centre for Advanced Studies on Collaborative research, p.59, November 07-09, 1995, Toronto, Ontario, Canada