ACM Home Page
Please provide us with feedback. Feedback
Distributed classification in peer-to-peer networks
Full text PdfPdf (1.00 MB)
Source
International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
San Jose, California, USA
SESSION: Industrial and government track papers table of contents
Pages: 968 - 976  
Year of Publication: 2007
ISBN:978-1-59593-609-7
Authors
Ping Luo  Chinese Academy of Sciences, Beijing, China
Hui Xiong  Rutgers University, Newark, NJ
Kevin Lü  Brunel University, London, United Kingdom
Zhongzhi Shi  Chinese Academy of Sciences, Beijing, China
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 151,   Citation Count: 2
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1281192.1281296
What is a DOI?

ABSTRACT

This work studies the problem of distributed classification in peer-to-peer(P2P) networks. While there has been a significant amount of work in distributed classification, most of existing algorithms are not designed for P2P networks. Indeed, as server-less and router-less systems, P2P networks impose several challenges for distributed classification: (1) it is not practical to have global synchronization in large-scale P2P networks; (2)there are frequent topology changes caused by frequent failure and recovery of peers; and (3) there are frequent on-the-fly data updates on each peer.

In this paper, we propose an ensemble paradigm for distributed classification in P2P networks. Under this paradigm, each peer builds its local classifiers on the local data and the results from all local classifiers are then combined by plurality voting. To build local classifiers, we adopt the learning algorithm of pasting bites to generate multiple local classifierson each peer based on the local data. To combine local results, we propose a general form of Distributed Plurality Voting (DPV) protocol in dynamic P2P networks. This protocol keeps the single-site validity for dynamic networks, and supports the computing modes of both one-shot query and continuous monitoring. We theoretically prove that the condition (BOB CHECK THIS 'C')ω0 for sending messages used in DPV0 is locally communication-optimal to achieve the above properties. Finally, experimental results on real-world P2P networks show that: (1) the proposed ensemble paradigm is effective even if there are thousands of local classifiers; (2) in most cases, the DPV0 algorithm is local in the sense that voting is processed using information gathered from a very small vicinity, whose size is independent of the network size; (3) DPV0 is significantly more communication-efficient than existing algorithms for distributed plurality voting.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.
3
 
4
 
5
 
6
 
7
P. Chan and S. Stolfo. A comparative evaluation of voting and meta-learning on partitioned data. In ICML, pages 90--98, 1995.
 
8
 
9
 
10
S. Datta, C. Giannella, and H. Kargupta. K-means clustering over a large, dynamic network. In SDM, pages 153--164, 2006.
 
11
 
12
M. Demrekler and H. Altincay. Plurality voting-based multiple classifier systems: statistically independent with respect to dependent classifier sets. Pattern Recognition, 35(11):2365--2379, 2002.
 
13
 
14
Y. Freund and R. E. Schapire. Experiments with a new boosting algorithm. In ICML, pages 148--156, 1996.
15
 
16
 
17
 
18
19
 
20
S. Siersdorfer and S. Sizov. Automatic document organization in a p2p environment. In ECIR, pages 265--276, 2006.
 
21
 
22
G. Tsoumakas and I. Vlahavas. Effective stacking of distributed classifiers. In Proc. of the 15th European Conf. on AI, pages 340--344, 2002.
 
23
D. Watts and S. Strogatz. Collective dynamics of 'small-world' networks. Nature, 393(6684):440--442, 1998.
 
24
R. Wolff, K. Bhaduri, and H. Kargupta. Local l2 thresholding based data mining in peer-to-peer systems. In SDM, pages 430--441, 2006.
 
25
R. Wolff and A. Schuster. Association rule mining in peer-to-peer systems. IEEE Transactions on Systems, Man and Cybernetics - Part B, 34(6), 2004.
 
26
J. Zhao, R. Govindan, and D. Estrin. Computing aggregates for monitoring wireless sensor networks. In Proc. of the 1st IEEE Int'l Workshop on Sensor Network Protocols and Applications, 2003.


Collaborative Colleagues:
Ping Luo: colleagues
Hui Xiong: colleagues
Kevin Lü: colleagues
Zhongzhi Shi: colleagues