|
ABSTRACT
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities has made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview
of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify
cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
ACM, 1994. ACM CR Classifications. ACM Computing Surveys 35, 5-16.
|
| |
3
|
AL-SULTAN, K.S. 1995. A tabu search approach to clustering problems. Pattern Recogn. 28, 1443-1451.
|
| |
4
|
|
| |
5
|
ALLEN, P. A. AND ALLEN, J. R. 1990. Basin Analysis: Principles and Applications. Blackwell Scientific Publications, Inc., Cambridge, MA.
|
| |
6
|
ALTA VISTA, 1999. http://altavista.digital.com.
|
| |
7
|
|
| |
8
|
ANDERBERG, M. R. 1973. Cluster Analysis for Applications. Academic Press, Inc., New York, NY.
|
 |
9
|
|
| |
10
|
|
| |
11
|
BABU, G. P. AND MURTY, M.N. 1994. Clustering with evolution strategies. Pattern Recogn. 27, 321-329.
|
| |
12
|
BABU, G. P., MURTY, M. N., AND KEERTHI, S. S. 2000. Stochastic connectionist approach for pattern clustering (To appear). IEEE Trans. Syst. Man Cybern..
|
| |
13
|
BACKER, F. B. AND HUBERT, L.g. 1976. A graphtheoretic approach to goodness-of-fit in complete-link hierarchical clustering. J. Am. Stat. Assoc. 71,870-878.
|
| |
14
|
|
| |
15
|
|
| |
16
|
|
| |
17
|
BALL, G. H. AND HALL, D.J. 1965. ISODATA, a novel method of data analysis and classification. Tech. Rep.. Stanford University, Stanford, CA.
|
| |
18
|
BENTLEY, J. L. AND FRIEDMAN, J.H. 1978. Fast algorithms for constructing minimal spanning trees in coordinate spaces. IEEE Trans. Comput. C-27, 6 (June), 97-105.
|
| |
19
|
|
| |
20
|
BHUYAN, J. N., RAGHAVAN, V. V., AND VENKATESH, K.E. 1991. Genetic algorithm for clustering with an ordered representation. In Proceedings of the Fourth International Conference on Genetic Algorithms, 408-415.
|
| |
21
|
BISWAS, G., WEINBERG, J., AND LI, C. 1995. A Conceptual Clustering Method for Knowledge Discovery in Databases. Editions Technip.
|
| |
22
|
|
| |
23
|
BRODATZ, P. 1966. Textures: A Photographic Album for Artists and Designers. Dover Publications, Inc., Mineola, NY.
|
 |
24
|
|
| |
25
|
CARPENTER, G. AND GROSSBERG, S. 1990. ART3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures. Neural Networks 3, 129-152.
|
| |
26
|
CHEKURI, C., GOLDWASSER, M. H., RAGHAVAN, P., AND UPFAL, E. 1997. Web search using automatic classification. In Proceedings of the Sixth International Conference on the World Wide Web (Santa Clara, CA, Apr.), http:// theory.stanford.edu/people/wass/publications/ Web Search/Web Search.html.
|
| |
27
|
CHENG, C. H. 1995. A branch-and-bound clustering algorithm. IEEE Trans. Syst. Man Cybern. 25, 895-898.
|
| |
28
|
CHENG, Y. AND FU, K.S. 1985. Conceptual clustering in knowledge organization. IEEE Trans. Pattern Anal. Mach. Intell. 7, 592-598.
|
| |
29
|
|
| |
30
|
|
| |
31
|
|
| |
32
|
1996. Special issue on data mining. Commun. ACM 39, 11.
|
| |
33
|
COLEMAN, G. B. AND ANDREWS, H. C. 1979. Image segmentation by clustering. Proc. IEEE 67, 5, 773-785.
|
| |
34
|
|
| |
35
|
|
| |
36
|
DALE, M. B. 1985. On the comparison of conceptual clustering and numerical taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 7, 241-244.
|
| |
37
|
DAVE, R. N. 1992. Generalized fuzzy C-shells clustering and detection of circular and elliptic boundaries. Pattern Recogn. 25, 713-722.
|
| |
38
|
DAVIS, T., Ed. 1991. The Handbook of Genetic Algorithms. Van Nostrand Reinhold Co., New York, NY.
|
| |
39
|
DAY, W. H.E. 1992. Complexity theory: An introduction for practitioners of classification. In Clustering and Classification, P. Arabie and L. Hubert, Eds. World Scientific Publishing Co., Inc., River Edge, NJ.
|
| |
40
|
DEMPSTER, A. P., LAIRD, N. M., AND RUB IN, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. J. Royal Stat. Soc. B. 39, 1, 1-38.
|
| |
41
|
DIDAY, E. 1973. The dynamic cluster method in non-hierarchical clustering. J. Comput. Inf. Sci. 2, 61-88.
|
| |
42
|
DIDAY, E. AND SIMON, J. C. 1976. Clustering analysis. In Digital Pattern Recognition, K. S. Fu, Ed. Springer-Verlag, Secaucus, NJ, 47-94.
|
| |
43
|
DIDAY, E. 1988. The symbolic approach in clustering. In Classification and Related Methods, H. H. Bock, Ed. North-Holland Publishing Co., Amsterdam, The Netherlands.
|
| |
44
|
|
| |
45
|
DUBES, R. C. AND JAIN, A. K. 1976. Clustering techniques: The user's dilemma. Pattern Recogn. 8, 247-260.
|
| |
46
|
DUBES, R. C. AND JAIN, A. K. 1980. Clustering methodology in exploratory data analysis. In Advances in Computers, M. C. Yovits,, Ed. Academic Press, Inc., New York, NY, 113- 125.
|
| |
47
|
|
| |
48
|
|
| |
49
|
DUBUISSON, M. P. AND JAIN, A.K. 1994. A modified Hausdorff distance for object matching. In Proceedings of the International Conference on Pattern Recognition (ICPR '94), 566-568.
|
| |
50
|
|
| |
51
|
DUNN, S., JANOS, L., AND ROSENFELD, A. 1983. Bimean clustering. Pattern Recogn. Lett. 1, 169-173.
|
| |
52
|
DURAN, B. S. AND ODELL, P. L. 1974. Cluster Analysis: A Survey. Springer-Verlag, New York, NY.
|
| |
53
|
|
 |
54
|
|
| |
55
|
EVERITT, B.S. 1993. Cluster Analysis. Edward Arnold, Ltd., London, UK.
|
| |
56
|
FABER, V. 1994. Clustering and the continuous k-means algorithm. Los Alamos Science 22, 138-144.
|
| |
57
|
FABER, V., HOCHBERG, J. C., KELLY, P. M., THOMAS, T. R., AND WHITE, J.M. 1994. Concept extraction: A data-mining technique. Los Alamos Science 22, 122-149.
|
| |
58
|
|
| |
59
|
|
| |
60
|
|
| |
61
|
Doug Fisher , Ling Xu , James R. Carnes , Yoram Reich , Steven J. Fenves , Jason Chen , Richard Shiavi , Gautam Biswas , Jerry Weinberg, Applying AI Clustering to Engineering Tasks, IEEE Expert: Intelligent Systems and Their Applications, v.8 n.6, p.51-60, December 1993
[doi> 10.1109/64.248353]
|
| |
62
|
FISHER, L. AND VAN NESS, J. W. 1971. Admissible clustering procedures. Biometrika 58, 91-104.
|
| |
63
|
|
| |
64
|
FOGEL, D. B. AND SIMPSON, P.K. 1993. Evolving fuzzy clusters. In Proceedings of the International Conference on Neural Networks (San Francisco, CA), 1829-1834.
|
| |
65
|
FOGEL, D. B. AND FOGEL, L. J., Eds. 1994. Special issue on evolutionary computation. IEEE Trans. Neural Netw. (Jan.).
|
| |
66
|
FOGEL, L. J., OWENS, A. J., AND WALSH, M. J. 1965. Artificial Intelligence Through Simulated Evolution. John Wiley and Sons, Inc., New York, NY.
|
| |
67
|
|
| |
68
|
FRED, A. L. N. AND LEITAO, J. M. N. 1996. A minimum code length technique for clustering of syntactic patterns. In Proceedings of the International Conference on Pattern Recognition (Vienna, Austria), 680-684.
|
| |
69
|
|
| |
70
|
Fu, K. S. AND LU, S.Y. 1977. A clustering procedure for syntactic patterns. IEEE Trans. Syst. Man Cybern. 7, 734-742.
|
| |
71
|
Fu, K. S. AND MUI, J. K. 1981. A survey on image segmentation. Pattern Recogn. 13, 3-16.
|
| |
72
|
|
| |
73
|
|
| |
74
|
|
| |
75
|
GORDON, A. D. AND HENDERSON, J. T. 1977. Algorithm for Euclidean sum of squares. Biometrics 33, 355-362.
|
 |
76
|
|
| |
77
|
GOWDA, K. C. 1984. A feature reduction and unsupervised classification algorithm for multispectral data. Pattern Recogn. 17, 6, 667- 676.
|
| |
78
|
GOWDA, K. C. AND KRISHNA, G. 1977. Agglomerative clustering using the concept of mutual nearest neighborhood. Pattern Recogn. 10, 105-112.
|
| |
79
|
GOWDA, K. C. AND DIDAY, E. 1992. Symbolic clustering using a new dissimilarity meG- sure. IEEE Trans. Syst. Man Cybern. 22, 368-378.
|
| |
80
|
GOWER, J. C. AND ROSS, G. J.S. 1969. Minimum spanning rees and single-linkage cluster analysis. Appl. Stat. 18, 54-64.
|
| |
81
|
|
| |
82
|
HARALICK, R. M. AND KELLY, G. L. 1969. Pattern recognition with measurement space and spatial clustering for multiple images. Proc. IEEE 57, 4, 654-665.
|
| |
83
|
|
| |
84
|
|
| |
85
|
|
| |
86
|
|
| |
87
|
|
| |
88
|
|
| |
89
|
|
| |
90
|
Adam Hoover , Gillian Jean-Baptiste , Xiaoyi Jiang , Patrick J. Flynn , Horst Bunke , Dmitry B. Goldgof , Kevin Bowyer , David W. Eggert , Andrew Fitzgibbon , Robert B. Fisher, An Experimental Comparison of Range Image Segmentation Algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence, v.18 n.7, p.673-689, July 1996
[doi> 10.1109/34.506791]
|
| |
91
|
|
| |
92
|
ICHINO, M. AND YAGUCHI, H. 1994. Generalized Minkowski metrics for mixed feature-type data analysis. IEEE Trans. Syst. Man Cybern. 24, 698-708.
|
| |
93
|
1991. Proceedings of the International Joint Conference on Neural Networks. (IJCNN'91).
|
| |
94
|
1992. Proceedings of the International Joint Conference on Neural Networks.
|
| |
95
|
|
| |
96
|
|
| |
97
|
|
| |
98
|
|
| |
99
|
|
| |
100
|
JAIN, A. K. AND MAO, J. 1994. Neural networks and pattern recognition. In Computational Intelligence: Imitating Life, J. M. Zurada, R. J. Marks, and C. J. Robinson, Eds. 194- 212.
|
| |
101
|
JAIN, A. K. AND FLYNN, P.J. 1996. Image segmentation using clustering. In Advances in Image Understanding: A Festschrift for Azriel Rosenfeld, N. Ahuja and K. Bowyer, Eds, IEEE Press, Piscataway, NJ, 65-83.
|
| |
102
|
|
| |
103
|
JAIN, A. K., RATHA, N. K., AND LAKSHMANAN, S. 1997. Object detection using Gabor filters. Pattern Recogn. 30, 2, 295-309.
|
| |
104
|
|
| |
105
|
|
| |
106
|
JARVIS, R. A. AND PATRICK, E. A. 1973. Clustering using a similarity method based on shared near neighbors. IEEE Trans. Comput. C-22, 8 (Aug.), 1025-1034.
|
| |
107
|
|
| |
108
|
JONES, D. AND BELTRAMO, M.A. 1991. Solving partitioning problems with genetic algorithms. In Proceedings of the Fourth International Conference on Genetic Algorithms, 442-449.
|
| |
109
|
|
| |
110
|
KING, B. 1967. Step-wise clustering procedures. J. Am. Stat. Assoc. 69, 86-101.
|
| |
111
|
KIRKPATRICK, S., GELATT, C. D., JR., AND VECCHI, M.P. 1983. Optimization by simulated annealing. Science 220, 4598 (May), 671-680.
|
| |
112
|
KLEIN, R. W. AND DUBES, R. C. 1989. Experiments in projection and clustering by simulated annealing. Pattern Recogn. 22, 213-220.
|
| |
113
|
|
| |
114
|
KOONTZ, W. L. G., FUKUNAGA, K., AND NARENDRA, P.M. 1975. A branch and bound clustering algorithm. IEEE Trans. Comput. 23, 908- 914.
|
| |
115
|
|
| |
116
|
KRAAIJVELD, M., MAO, J., AND JAIN, A. K. 1995. A non-linear projection method based on Kohonen's topology preserving maps. IEEE Trans. Neural Netw. 6, 548-559.
|
| |
117
|
KRISHNAPURAM, R., FRIGUI, H., AND NASRAOUI, O. 1995. Fuzzy and probabilistic shell clustering algorithms and their application to boundary detection and surface approximation. IEEE Trans. Fuzzy Systems 3, 29-60.
|
| |
118
|
|
| |
119
|
LIBRARY OF CONGRESS, 1990. LC classification outline. Library of Congress, Washington, DC.
|
| |
120
|
|
| |
121
|
|
| |
122
|
LEE, R. C. T., SLAGLE, J. R., AND MONG, C. T. 1978. Towards automatic auditing of records. IEEE Trans. Softw. Eng. 4, 441- 448.
|
| |
123
|
LEE, R. C. T. 1981. Cluster analysis and its applications. In Advances in Information Systems Science, J. T. Tou, Ed. Plenum Press, New York, NY.
|
| |
124
|
LI, C. AND BISWAS, G. 1995. Knowledge-based scientific discovery in geological databases. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining (Montreal, Canada, Aug. 20-21), 204 -209.
|
| |
125
|
Lu, S. Y. AND FU, K. S. 1978. A sentence-tosentence clustering procedure for pattern analysis. IEEE Trans. Syst. Man Cybern. 8, 381-389.
|
| |
126
|
LUNDERVOLD, A., FENSTAD, A. M., ERSLAND, L., AND TAXT, T. 1996. Brain tissue volumes from multispectral 3D MRI: A comparative study of four classifiers. In Proceedings of the Conference of the Society on Magnetic Resonance,
|
| |
127
|
|
| |
128
|
MCQUEEN, J. 1967. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 281-297.
|
| |
129
|
|
| |
130
|
MAO, J. AND JAIN, A.K. 1995. Artificial neural networks for feature extraction and multivariate data projection. IEEE Trans. Neural Netw. 6, 296-317.
|
| |
131
|
MAO, J. AND JAIN, A.K. 1996. A self-organizing network for hyperellipsoidal clustering (HEC). IEEE Trans. Neural Netw. 7, 16-29.
|
| |
132
|
|
| |
133
|
MICHALSKI, R., STEPP, R. E., AND DIDAY, E. 1981. A recent advance in data analysis: Clustering objects into classes characterized by conjunctive concepts. In Progress in Pattern Recognition, Vol. 1, L. Kanal and A. Rosenfeld, Eds. North-Holland Publishing Co., Amsterdam, The Netherlands.
|
| |
134
|
MICHALSKI, R., STEPP, R. E., AND DIDAY, E. 1983. Automated construction of classifications: conceptual clustering versus numerical taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-5, 5 (Sept.), 396-409.
|
| |
135
|
MISHRA, S. K. AND RAGHAVAN, V. V. 1994. An empirical study of the performance of heuristic methods for clustering. In Pattern Recognition in Practice, E. S. Gelsema and L. N. Kanal, Eds. 425-436.
|
| |
136
|
|
| |
137
|
MOHIUDDIN, K. M. AND MAO, g. 1994. A comparative study of different classifiers for handprinted character recognition. In Pattern Recognition in Practice, E. S. Gelsema and L. N. Kanal, Eds. 437-448.
|
| |
138
|
MOOR, B.K. 1988. ART 1 and Pattern Clustering. In 1988 Connectionist Summer School, Morgan Kaufmann, San Mateo, CA, 174-185.
|
| |
139
|
MURTAGH, F. 1984. A survey of recent advances in hierarchical clustering algorithms which use cluster centers. Comput. J. 26, 354-359.
|
| |
140
|
MURTY, M. N. AND KRISHNA, G. 1980. A computationally efficient technique for data clustering. Pattern Recogn. 12, 153-158.
|
| |
141
|
MURTY, M. N. AND JAIN, A.K. 1995. Knowledgebased clustering scheme for collection management and retrieval of library books. Pattern Recogn. 28, 949-964.
|
| |
142
|
NAGY, G. 1968. State of the art in pattern recognition. Proc. IEEE 56, 836-862.
|
| |
143
|
|
| |
144
|
|
| |
145
|
|
| |
146
|
OJA, E. 1982. A simplified neuron model as a principal component analyzer. Bull. Math. Bio. 15, 267-273.
|
| |
147
|
OZAWA, K. 1985. A stratificational overlapping cluster scheme. Pattern Recogn. 18, 279-286.
|
| |
148
|
OPEN TEXT, 1999. http://index.opentext.net.
|
| |
149
|
KAMGAR-PARSI, B., GUALTIERI, J. A., DEVANEY, J. A., AND KAMGAR-PARSI, K. 1990. Clustering with neural networks. Biol. Cybern. 63, 201-208.
|
| |
150
|
LYCOS, 1999. http://www.lycos.com.
|
| |
151
|
PAL, N. R., BEZDEK, J. C., AND TSAO, E. C.-K. 1993. Generalized clustering networks and Kohonen's self-organizing scheme. IEEE Trans. Neural Netw. 4, 549-557.
|
| |
152
|
QUINLAN, J. R. 1990. Decision trees and decision making. IEEE Trans. Syst. Man Cybern. 20, 339-346.
|
 |
153
|
|
| |
154
|
RAGHAVAN, V. V. AND YU, C.T. 1981. A comparison of the stability characteristics of some graph theoretic clustering methods. IEEE Trans. Pattern Anal. Mach. Intell. 3, 393-402.
|
| |
155
|
|
| |
156
|
|
| |
157
|
|
| |
158
|
|
| |
159
|
|
| |
160
|
ROSENFELD, A., SCHNEIDER, V. B., AND HUANG, M. K. 1969. An application of cluster detection to text and picture processing. IEEE Trans. Inf. Theor. 15, 6, 672-681.
|
| |
161
|
Ross, G. J. S. 1968. Classification techniques for large sets of data. In Numerical Taxonomy, A. J. Cole, Ed. Academic Press, Inc., New York, NY.
|
| |
162
|
RuSPINI, E.H. 1969. A new approach to clustering. Inf. Control 15, 22-32.
|
| |
163
|
SALTON, G. 1991. Developments in automatic text retrieval. Science 253, 974-980.
|
| |
164
|
|
| |
165
|
SAMMON, J. W. JR. 1969. A nonlinear mapping for data structure analysis. IEEE Trans. Comput. 18, 401-409.
|
| |
166
|
|
| |
167
|
SCHACHTER, B. J., DAVIS, L. S., AND ROSENFELD, A. 1979. Some experiments in image segmentation by clustering of local feature values. Pattern Recogn. 11, 19-28.
|
| |
168
|
|
| |
169
|
SELIM, S. Z. AND ISMAIL, M.A. 1984. K-meanstype algorithms: A generalized convergence theorem and characterization of local optimality. IEEE Trans. Pattern Anal. Mach. Intell. 6, 81-87.
|
| |
170
|
|
| |
171
|
SEN, A. AND SRIVASTAVA, M. 1990. Regression Analysis. Springer-Verlag, New York, NY.
|
| |
172
|
|
| |
173
|
|
| |
174
|
|
| |
175
|
|
| |
176
|
SLAGLE, J. R., CHANG, C. L., AND HELLER, S. R. 1975. A clustering and data-reorganizing algorithm. IEEE Trans. Syst. Man Cybern. 5, 125-128.
|
| |
177
|
SNEATH, P. H. A. AND SOKAL, R. R. 1973. Numerical Taxonomy. Freeman, London, UK.
|
| |
178
|
SPATH, H. 1980. Cluster Analysis Algorithms for Data Reduction and Classification. Ellis Horwood, Upper Saddle River, NJ.
|
| |
179
|
SOLBERG, A., TAXT, T., AND JAIN, A. 1996. A Markov random field model for classification of multisource satellite imagery. IEEE Trans. Geoscience and Remote Sensing 34, 1, 100-113.
|
| |
180
|
|
| |
181
|
STAHL, H. 1986. Cluster analysis of large data sets. In Classification as a Tool of Research, W. Gaul and M. Schader, Eds. Elsevier North-Holland, Inc., New York, NY, 423-430.
|
| |
182
|
|
| |
183
|
SUTTON, M., STARK, L., AND BOWYER, K. 1993. Function-based generic recognition for multiple object categories. In Three-Dimensional Object Recognition Systems, A. Jain and P. J. Flynn, Eds. Elsevier Science Inc., New York, NY.
|
| |
184
|
SYMON, M. J. 1977. Clustering criterion and multi-variate normal mixture. Biometrics 77, 35-43.
|
| |
185
|
TANAKA, E. 1995. Theoretical aspects of syntactic pattern recognition. Pattern Recogn. 28, 1053-1061.
|
| |
186
|
TAXT, T. AND LUNDERVOLD, A. 1994. Multispectral analysis of the brain using magnetic resonance imaging. IEEE Trans. Medical Imaging 13, 3, 470-481.
|
| |
187
|
TITTERINGTON, D. M., SMITH, A. F. M., AND MAKOV, U.E. 1985. Statistical Analysis of Finite Mixture Distributions. John Wiley and Sons, Inc., New York, NY.
|
| |
188
|
TOUSSAINT, G. T. 1980. The relative neighborhood graph of a finite planar set. Pattern Recogn. 12, 261-268.
|
| |
189
|
|
| |
190
|
|
| |
191
|
URQUHART, R.B. 1982. Graph theoretical clustering based on limited neighborhood sets. Pattern Recogn. 15, 173-187.
|
| |
192
|
|
| |
193
|
VINOD, V. V., CHAUDHURY, S., MUKHERJEE, J., AND GHOSE, S. 1994. A connectionist approach for clustering with applications in image analysis. IEEE Trans. Syst. Man Cybern. 24, 365-384.
|
| |
194
|
WAH, B. W., Ed. 1996. Special section on mining of databases. IEEE Trans. Knowl. Data Eng. (Dec.).
|
| |
195
|
WARD, J. H. JR. 1963. Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58, 236-244.
|
| |
196
|
|
| |
197
|
WESZKA, J. 1978. A survey of threshold selection techniques. Pattern Recogn. 7, 259-265.
|
| |
198
|
|
| |
199
|
WILSON, D. R. AND MARTINEZ, T. R. 1997. Improved heterogeneous distance functions. J. Artif Intell. Res. 6, 1-34.
|
| |
200
|
|
| |
201
|
|
| |
202
|
ZADEH, L.A. 1965. Fuzzy sets. Inf. Control 8, 338 -353.
|
| |
203
|
ZAHN, C. T. 1971. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans. Comput. C-20 (Apr.), 68-86.
|
| |
204
|
ZHANG, K. 1995. Algorithms for the constrained editing distance between ordered labeled trees and related problems. Pattern Recogn. 28, 463-474.
|
| |
205
|
|
 |
206
|
|
| |
207
|
ZUPAN, J. 1982. Clustering of Large Data Sets. Research Studies Press Ltd., Taunton, UK.
|
CITED BY 550
|
|
|
|
|
|
|
|
Tapas Kanungo , David M. Mount , Nathan S. Netanyahu , Christine D. Piatko , Ruth Silverman , Angela Y. Wu, A local search approximation algorithm for k-means clustering, Proceedings of the eighteenth annual symposium on Computational geometry, p.10-18, June 05-07, 2002, Barcelona, Spain
|
|
|
Rebecca Cathey , Ling Ma , Nazli Goharian , David Grossman, Misuse detection for information retrieval systems, Proceedings of the twelfth international conference on Information and knowledge management, November 03-08, 2003, New Orleans, LA, USA
|
|
|
|
|
|
|
|
|
Jasit S. Suri , Sameer Singh , S. K. Setarehdan , Rakesh Sharma , Keir Bovis , Dorin Comaniciu , Laura Reden, A note on future research in segmentation techniques applied to neurology, cardiology, mammography and pathology, Advanced algorithmic approaches to medical image segmentation: state-of-the-art application in cardiology, neurology, mammography and pathology, Springer-Verlag New York, Inc., New York, NY, 2001
|
|
|
|
|
|
Taher H. Haveliwala , Aristides Gionis , Dan Klein , Piotr Indyk, Evaluating strategies for similarity search on the web, Proceedings of the 11th international conference on World Wide Web, May 07-11, 2002, Honolulu, Hawaii, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Michael Zink , David Westbrook , Sherief Abdallah , Bryan Horling , Vijay Lakamraju , Eric Lyons , Victoria Manfredi , Jim Kurose , Kurt Hondl, Meteorological command and control: an end-to-end architecture for a hazardous weather detection sensor network, Proceedings of the 2005 workshop on End-to-end, sense-and-respond systems, applications and services, June 05-05, 2005, Seattle, Washington
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Masahiro Motoyoshi , Takao Miura , Isamu Shioya, Clustering stream data by regression analysis, Proceedings of the second workshop on Australasian information security, Data Mining and Web Intelligence, and Software Internationalisation, p.115-120, January 01, 2004, Dunedin, New Zealand
|
|
|
|
|
|
|
|
|
A. Bona , M. Sami , D. Sciuto , V. Zaccaria , C. Silvano , R. Zafalon, Energy estimation and optimization of embedded VLIW processors based on instruction clustering, Proceedings of the 39th conference on Design automation, June 10-14, 2002, New Orleans, Louisiana, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
D. Schwartz , K. A. Smith , L. Churilov , M. Dally , R. Weber, Improving risk grouping rules for prostate cancer patients using self-organising maps, Design and application of hybrid intelligent systems, IOS Press, Amsterdam, The Netherlands, 2003
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Miles Efron , Jonathan Elsas , Gary Marchionini , Junliang Zhang, Machine learning for information architecture in a large governmental website, Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries, June 07-11, 2004, Tuscon, AZ, USA
|
|
|
Shie Mannor , Ishai Menache , Amit Hoze , Uri Klein, Dynamic abstraction in reinforcement learning via clustering, Proceedings of the twenty-first international conference on Machine learning, p.71, July 04-08, 2004, Banff, Alberta, Canada
|
|
|
Tapas Kanungo , David M. Mount , Nathan S. Netanyahu , Christine D. Piatko , Ruth Silverman , Angela Y. Wu, A local search approximation algorithm for k-means clustering, Computational Geometry: Theory and Applications, v.28 n.2-3, p.89-112, June 2004
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Tapas Kanungo , David M. Mount , Nathan S. Netanyahu , Christine D. Piatko , Ruth Silverman , Angela Y. Wu, An Efficient k-Means Clustering Algorithm: Analysis and Implementation, IEEE Transactions on Pattern Analysis and Machine Intelligence, v.24 n.7, p.881-892, July 2002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Wai-Ho Au , Keith C. C. Chan , Andrew K. C. Wong , Yang Wang, Attribute Clustering for Grouping, Selection, and Classification of Gene Expression Data, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), v.2 n.2, p.83-101, April 2005
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Daniel Zeng , Hsinchun Chen , Chunju Tseng , Catherine A. Larson , Millicent Eidson , Ivan Gotham , Cecil Lynch , Michael Ascher, Towards a national infectious disease information infrastructure: a case study in West Nile virus and botulism, Proceedings of the 2004 annual national conference on Digital government research, p.1-10, May 24-26, 2004, Seattle, WA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ying Liu , Shamkant B. Navathe , Jorge Civera , Venu Dasigi , Ashwin Ram , Brian J. Ciliax , Ray Dingledine, Text Mining Biomedical Literature for Discovering Gene-to-Gene Relationships: A Comparative Study of Algorithms, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), v.2 n.1, p.62-76, January 2005
|
|
|
|
|
|
Xinyu Wang , Jianling Sun , Xiaohu Yang , Chao Huang , Zhijun He , Srinivasa R. Maddineni, Reengineering standalone C++ legacy systems into the J2EE partition distributed environment, Proceeding of the 28th international conference on Software engineering, May 20-28, 2006, Shanghai, China
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Csaba Legány , Sándor Juhász , Attila Babos, Cluster validity measurement techniques, Proceedings of the 5th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases, p.388-393, February 15-17, 2006, Madrid, Spain
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Xavier Sevillano , Germán Cobo , Francesc Alías , Joan Claudi Socoró, Feature diversity in cluster ensembles for robust document clustering, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
|
|
|
|
|
|
|
|
|
Alessandro G. Di Nuovo , Maurizio Palesi , Davide Patti , Giuseppe Ascia , Vincenzo Catania, Fuzzy decision making in embedded system design, Proceedings of the 4th international conference on Hardware/software codesign and system synthesis, October 22-25, 2006, Seoul, Korea
|
|
|
Lei Li , Roop G. Singh , Guangzhi Zheng , Art Vandenberg , Vijay Vaishnavi , Sham Navathe, A methodology for semantic integration of metadata in bioinformatics data sources, Proceedings of the 43rd annual southeast regional conference, March 18-20, 2005, Kennesaw, Georgia
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Margit Antal , Gavril Toderean, Speaker recognition and broad phonetic groups, Proceedings of the 24th IASTED international conference on Signal processing, pattern recognition, and applications, p.155-159, February 15-17, 2006, Innsbruck, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Nachiketa Sahoo , Jamie Callan , Ramayya Krishnan , George Duncan , Rema Padman, Incremental hierarchical clustering of text documents, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
|
|
|
Chaolin Zhang , Xuegong Zhang , Michael Q. Zhang , Yanda Li, Neighbor number, valley seeking and clustering, Pattern Recognition Letters, v.28 n.2, p.173-180, January, 2007
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Bruno Pouliquen , Ralf Steinberger , Camelia Ignat , Emilia Käsper , Irina Temnikova, Multilingual and cross-lingual news topic tracking, Proceedings of the 20th international conference on Computational Linguistics, p.959-es, August 23-27, 2004, Geneva, Switzerland
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Marcel Brun , Chao Sima , Jianping Hua , James Lowey , Brent Carroll , Edward Suh , Edward R. Dougherty, Model-based evaluation of clustering validation measures, Pattern Recognition, v.40 n.3, p.807-824, March, 2007
|
|
|
|
|
|
|
|
|
|
|
|
Godfrey Tan , Massimiliano Poletto , John Guttag , Frans Kaashoek, Role classification of hosts within enterprise networks based on connection patterns, Proceedings of the USENIX Annual Technical Conference 2003 on USENIX Annual Technical Conference, p.2-2, June 09-14, 2003, San Antonio, Texas
|
|
|
|
|
|
|
|
|
Chengkai Li , Min Wang , Lipyeow Lim , Haixun Wang , Kevin Chen-Chuan Chang, Supporting ranking and clustering as generalized order-by and group-by, Proceedings of the 2007 ACM SIGMOD international conference on Management of data, June 11-14, 2007, Beijing, China
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yongguo Liu , Zhang Yi , Hong Wu , Mao Ye , Kefei Chen, A tabu search approach for the minimum sum-of-squares clustering problem, Information Sciences: an International Journal, v.178 n.12, p.2680-2704, June, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yong Xia , (David) Dagan Feng , Tianjiao Wang , Rongchun Zhao , Yanning Zhang, Image segmentation by clustering of spatial patterns, Pattern Recognition Letters, v.28 n.12, p.1548-1555, September, 2007
|
|
|
Sandro Pirkwieser , Günther R. Raidl, Finding consensus trees by evolutionary, variable neighborhood search, and hybrid algorithms, Proceedings of the 10th annual conference on Genetic and evolutionary computation, July 12-16, 2008, Atlanta, GA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jeffrey Erman , Anirban Mahanti , Martin Arlitt , Carey Williamson, Identifying and discriminating between web and peer-to-peer traffic in the network core, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Thierry Urruty , Stanislas Lew , Nacim Ihadaddene , Dan A. Simovici, Detecting eye fixations by projection clustering, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), v.3 n.4, p.1-20, December 2007
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Zekeriya Erkin , Alessandro Piva , Stefan Katzenbeisser , R. L. Lagendijk , Jamshid Shokrollahi , Gregory Neven , Mauro Barni, Protection and retrieval of encrypted multimedia content: when cryptography meets signal processing, EURASIP Journal on Information Security, v.7 n.2, p.1-20, January 2007
|
|
|
Aurel Cami , Garrick L. Wallstrom , Ashley L. Fowlkes , Cathy A. Panozzo , William R. Hogan, Mining aggregates of over-the-counter products for syndromic surveillance, Pattern Recognition Letters, v.30 n.3, p.255-266, February, 2009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rick Chow , Wei Zhong , Michael Blackmon , Richard Stolz , Marsha Dowell, An efficient SVM-GA feature selection model for large healthcare databases, Proceedings of the 10th annual conference on Genetic and evolutionary computation, July 12-16, 2008, Atlanta, GA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Guofei Gu , Roberto Perdisci , Junjie Zhang , Wenke Lee, BotMiner: clustering analysis of network traffic for protocol- and structure-independent botnet detection, Proceedings of the 17th conference on Security symposium, p.139-154, July 28-August 01, 2008, San Jose, CA
|
|
|
Donghai Guan , Weiwei Yuan , Young-Koo Lee , Andrey Gavrilov , Sungyoung Lee, Improving supervised learning performance by using fuzzy clustering method to select training data, Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, v.19 n.4,5, p.321-334, December 2008
|
|
|
Derchian Tsaih , Guangming Wu , Chiehyao Chang , Shaoshin Hung , Chinshan Wu , Huiling Lin, An efficient A* algorithm for the directed linear arrangement problem, WSEAS Transactions on Computers, v.7 n.12, p.1958-1967, December 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Prasenjit Majumder , Mandar Mitra , Swapan K. Parui , Gobinda Kole , Pabitra Mitra , Kalyankumar Datta, YASS: Yet another suffix stripper, ACM Transactions on Information Systems (TOIS), v.25 n.4, p.18-es, October 2007
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Wilma Penzo , Stefano Lodi , Federica Mandreoli , Riccardo Martoglia , Simona Sassatelli, Semantic peer, here are the neighbors you want!, Proceedings of the 11th international conference on Extending database technology: Advances in database technology, March 25-29, 2008, Nantes, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Maria Camila N. Barioni , Humberto L. Razente , Agma J. M. Traina , Caetano Traina, Jr., Accelerating k-medoid-based algorithms through metric access methods, Journal of Systems and Software, v.81 n.3, p.343-355, March, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ying Liu , Shamkant B. Navathe , Alex Pivoshenko , Venu G. Dasigi , Ray Dingledine , Brian J. Ciliax, Text analysis of MEDLINE for discovering functional relationships among genes: evaluation of keyword extraction weighting schemes, International Journal of Data Mining and Bioinformatics, v.1 n.1, p.88-110, June 2006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Umberto Castellani , Marco Cristani , Carlo Combi , Vittorio Murino , Andrea Sbarbati , Pasquina Marzola, Visual MRI: Merging information visualization and non-parametric clustering techniques for MRI dataset analysis, Arificial Intelligence in Medicine, v.44 n.3, p.183-199, November, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yi Hong , Sam Kwong , Hui Xiong , Qingsheng Ren, Genetic-guided semi-supervised clustering algorithm with instance-level constraints, Proceedings of the 10th annual conference on Genetic and evolutionary computation, July 12-16, 2008, Atlanta, GA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
In-Su Kang , Seung-Hoon Na , Seungwoo Lee , Hanmin Jung , Pyung Kim , Won-Kyung Sung , Jong-Hyeok Lee, On co-authorship for author disambiguation, Information Processing and Management: an International Journal, v.45 n.1, p.84-97, January, 2009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Stênio Fernandes , Carlos Kamienski , Judith Kelner , Dênio Mariz , Djamel Sadok, A stratified traffic sampling methodology for seeing the big picture, Computer Networks: The International Journal of Computer and Telecommunications Networking, v.52 n.14, p.2677-2689, October, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A. V. Miranskyy , N. H. Madhavji , M. S. Gittens , M. Davison , M. Wilding , D. Godwin , C. A. Taylor, SIFT: a scalable iterative-unfolding technique for filtering execution traces, Proceedings of the 2008 conference of the center for advanced studies on collaborative research: meeting of minds, October 27-30, 2008, Ontario, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
S. T. Milagre , C. D. Maciel , A. A. Shinoda , M. Hungria , J. R. B. Almeida, Multidimensional cluster stability analysis from a Brazilian Bradyrhizobium sp. RFLP/PCR data set, Journal of Computational and Applied Mathematics, v.227 n.2, p.308-319, May, 2009
|
|
|
|
|
|
|
|
|
Guang-Hui Yan , Li-Song Liu , Lin-Na Du , Xia-Xia Yang , Zhi-Cheng Ma , Xiao-Min Zhang, Multifractal-based cluster hierarchy optimisation algorithm, International Journal of Business Intelligence and Data Mining, v.3 n.4, p.353-374, January 2009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Guangli Nie , Lingling Zhang , Ying Liu , Xiuyu Zheng , Yong Shi, Decision analysis of data mining project based on Bayesian risk, Expert Systems with Applications: An International Journal, v.36 n.3, p.4589-4594, April, 2009
|
|
|
Melody Y. Kiang , Dorothy M. Fisher , Jeng-Chung Victor Chen , Steven A. Fisher , Robert T. Chi, The application of SOM as a decision support tool to identify AACSB peer schools, Decision Support Systems, v.47 n.1, p.51-59, April, 2009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Fabrício Benevenuto , Tiago Rodrigues , Virgílio Almeida , Jussara Almeida , Marcos Gonçalves, Detecting spammers and content promoters in online video social networks, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Gabriela Czibula , Grigoreta Sofia Cojocar , Istvan Gergely Czibula, A partitional clustering algorithm for crosscutting concerns identification, Proceedings of the 8th WSEAS International Conference on Software engineering, parallel and distributed systems, p.111-116, February 21-23, 2009, Cambridge, UK
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Wenyin Gong , Zhihua Cai , Charles X. Ling , Jun Du, Hybrid differential evolution based on fuzzy C-means clustering, Proceedings of the 11th Annual conference on Genetic and evolutionary computation, July 08-12, 2009, Montreal, Québec, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Erendira Rendón , Rene Garcia , Itzel Abundez , Citlalih Gutierrez , Eduardo Gasca , Federico Del Razo , Adrian Gonzalez, NIVA: a Robust cluster validity, Proceedings of the 12th WSEAS international conference on Communications, p.241-248, July 23-25, 2008, Heraklion, Greece
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
REVIEW
"Jose M. Ramirez : Reviewer"
Data clustering is not defined the same way in each of the
disciplines that use it to deal with problems that involve the
extraction of information or structure from data. The authors have
produced a good survey of this slippery topic. They d
more...
|