ACM Home Page
Please provide us with feedback. Feedback
SS-ClusterTree: a subspace clustering based indexing algorithm over high-dimensional image features
Full text PdfPdf (458 KB)
Source
Conference On Image And Video Retrieval archive
Proceedings of the 2008 international conference on Content-based image and video retrieval table of contents
Niagara Falls, Canada
SESSION: Subspace learning in content-based image retrieval table of contents
Pages 95-104  
Year of Publication: 2008
ISBN:978-1-60558-070-8
Authors
Hongli Xu  Beijing Jiaotong University, Beijing, China
Dantong Yu  Brookhaven National Lab, Upton, NY, USA
De Xu  Beijing Jiaotong University, Beijing, China
Aidong Zhang  The State University of New York at Buffalo, Buffalo, NY, USA
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 120,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1386352.1386369
What is a DOI?

ABSTRACT

The rapid growth in the volume of image and video data collections motivates the research of building an index structure in image information retrieval. Constructing an index in the image database poses a very challenging problem due to the facts of image databases containing data with high dimensions, and lack of domain knowledge. ClusterTree is an indexing approach representing clusters generated by any existing clustering approach and do not need any prior knowledge. It is a hierarchy of clusters and subcluster which incorporates the cluster representation into the index structure to achieve effective and efficient retrieval. However, one disadvantage of ClusterTree is that non-clustering data points are often ignored. These non-clustering data points might represent interesting targets in an image database. In this paper, we propose a modified ClusterTree structure(called SS-ClusterTree), which is based on subspace clustering. The SS-ClusterTree includes two kinds of leaf nodes, a cluster leaf node and a noise leaf node. When a new data item is added to the SS-ClusterTree, if it belongs to a cluster, it is inserted into the corresponding the cluster leaf node, otherwise into the noise leaf node. The noise leaf node will be split while its volume is more than a certain threshold. We present a novel updating technique which optimizes the internal structure of the SS-ClusterTree by utilizing the Newton's Universal Law of Gravitation. When a noise node is split, the attraction forces are calculated between every new node and its sibling nodes. These new nodes may be merged by their sibling nodes, if the attraction force between them is the most significant. Meanwhile the nodes intersecting boundaries are updated. This approach guarantees that the SS-ClusterTree always represents the current dataset structure, and helps to identify the pattern hiding in the newly added data. SS-ClusterTree can efficiently support the dynamic insertion and manage the dataset with non-clustering data, and is highly adaptive to any kind of cluster structure. Our experiment results also show that this index structure is effective and efficient.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
Ester M, Kriegel H, Sander J, Xu XW. A density-based algorithm for discovering clusters in large spatial databases with noise. Proc. of the 2nd Int'l Conf. on Knowledge Discovery and Data Mining (KDD'96). Portland: AAAI Press, 1996. 226--231.
 
3
Hinneburg A, Keim D. An efficient approach to clustering in large multimedia databases with noise. Proc. of the 4th Int'l Conf. on Knowledge Discovery and Data Mining (KDD'98). New York: AAAI Press, 1998. 58--65.
 
4
 
5
 
6
Hinneburg, A., Keim, D.A.: An optimal grid-clustering: Towards breaking the curse of dimensionality in high-dimensional clustering. The VLDB Journal.1999. 506--517
7
 
8
Karin Kailing, Hans-Peter Kriegel, Peer Kroger. Density-Connected Subspace Clustering for High-Dimensional Data. In Proc. 4th SIAM Int. Conf. on Data Mining, pp.246--257, Lake Buena Vista,FL,2004
 
9
Carlotta Domeniconi, D. Papadopoulos, D. Gunopulos, Sheng Ma. Subspace Clustering of High Dimensional Data, SIAM International Conference on Data Mining (SDM), Apr. 2004, http://www.cs.ucr.edu/~dimitris/publications.html
 
10
Hongli Xu, De Xu, and Enai Lin. An applicable hierarchical clustering algorithm for content based image retrieval, In Proc. Computer Vision / Computer Graphics Collaboration Techniques and Applications, MIRAGE'07 International Conference, France, Mar. 2007 pp.82--92
 
11
W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, C. Faloutsos, and G. Taubin, "The QBIC Project: Querying Images By Content Using Color, Texture, and Shape," Proc. of the SPIE Conf. on Storage and Retrieval for Image and Video Databases II, 1993. 173--187.
12
13
14
15
 
16
Sanjay Goil, H. Nagesh, A. Choudhary. MAFIA: Efficient and Scalable Subspace Clustering Clustering for Very Large Data Sets. Technical Report CPDC-TR-9906-010, Northwestern University, 2145 Sheridan Road, Evanston IL 60208, June 1999
17
 
18
Kyoung-Gu Woo, Jeong-Hoon Lee, Myoung-Ho Kim, Yoon-Joon Lee. FINDIT: a fast and intelligent subspace clustering algorithm using dimension voting. Information & Software Technology, VOL 46(4),2004,pp255--271.
 
19
WANG Jian-hui, SHEN Zhan, HU Yun-fa. An Applicable and Efficient Clustering Algorithm{J}.Journal of Software,2004,15(5):697--705.

Collaborative Colleagues:
Hongli Xu: colleagues
Dantong Yu: colleagues
De Xu: colleagues
Aidong Zhang: colleagues