A new distributed data mining model based on similarity
Source Symposium on Applied Computing archive
Proceedings of the 2003 ACM symposium on Applied computing table of contents
Melbourne, Florida
SESSION: Data mining table of contents
Pages: 432 - 436  
Year of Publication: 2003
ISBN:1-58113-624-2
Authors
Tao Li  University of Rochester, Rochester, NY
Shenghuo Zhu  University of Rochester, Rochester, NY
Mitsunori Ogihara  University of Rochester, Rochester, NY
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/952532.952618

ABSTRACT

Distributed Data Mining (DDM) has been an active research area and has received growing attention since its inception. Current DDM techniques regard the distributed data sets as a single virtual table and assume that there exists a global model which could be generated if the data were combined or centralized. This paper proposes a similarity-based distributed data mining (SBDDM) framework which explicitly takes the differences among distributed sources into consideration. A new similarity measure is introduced, and its effectiveness is evaluated and validated. This paper also illustrates the limitations of current DDM techniques through three concrete case studies. Finally, distributed clustering within the SBDDM framework is discussed.
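
The idea of accounting for differences among sources before mining can be illustrated with a minimal sketch. The abstract does not define the paper's similarity measure, so the code below substitutes a plain Jaccard similarity over each source's frequent items; that measure, the function names (`frequent_items`, `jaccard`, `group_sources`), and the `min_support` and `threshold` parameters are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def frequent_items(transactions, min_support):
    """Items that occur in at least a min_support fraction of transactions."""
    counts = {}
    for transaction in transactions:
        for item in set(transaction):
            counts[item] = counts.get(item, 0) + 1
    n = len(transactions)
    return {item for item, c in counts.items() if c / n >= min_support}


def jaccard(a, b):
    """Jaccard similarity of two sets (a stand-in, not the paper's measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_sources(sources, min_support=0.5, threshold=0.5):
    """Group sources whose frequent-item profiles are pairwise similar.

    Sources are linked when their similarity reaches the threshold, and
    each connected component of linked sources becomes one group.
    """
    names = list(sources)
    profiles = {n: frequent_items(sources[n], min_support) for n in names}

    # Small union-find structure to merge linked sources.
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: two grocery-like sites and one hardware-like site.
sources = {
    "site_a": [["milk", "bread"], ["milk", "bread", "eggs"]],
    "site_b": [["milk", "bread"], ["bread", "milk"]],
    "site_c": [["nails", "hammer"], ["hammer", "nails", "glue"]],
}
groups = group_sources(sources)
print(groups)
```

Under this sketch, a separate model would be mined for each group rather than one global model for all sources, which is the contrast with the single-virtual-table assumption that the abstract draws.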

