Distributed cooperative mining for information consortia
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Washington, D.C.
POSTER SESSION: Research track
Pages: 619 - 624
Year of Publication: 2003
ISBN: 1-58113-737-0
Authors
Satoshi Morinaga, NEC Corporation, Miyazaki, Miyamae, Kawasaki, Kanagawa
Kenji Yamanishi, NEC Corporation, Miyazaki, Miyamae, Kawasaki, Kanagawa
Jun-ichi Takeuchi, NEC Corporation, Miyazaki, Miyamae, Kawasaki, Kanagawa
ABSTRACT
We consider the situation where a number of agents are distributed and each of them collects a data sequence generated according to an unknown probability distribution. Each of the distributions is specified by common parameters and individual parameters, e.g., normal distributions with an identical mean and different variances. We introduce the notion of an information consortium, a framework in which the agents cannot show raw data to one another but would still like to obtain a significant information gain in estimating their respective distributions. Such information consortia have recently received much interest in a broad range of areas, including financial risk management and ubiquitous network mining. In this paper we are concerned with the following three issues: 1) how to design a collaborative strategy for agents to estimate their respective distributions in the information consortium, 2) characterizing when each agent benefits in terms of information gain for estimating its distribution or information loss for predicting future data, and 3) characterizing how much benefit each agent obtains. We give a statistical formulation of information consortia and solve all three problems for a general form of probability distributions. Specifically, we propose a basic strategy for cooperative estimation and derive a necessary and sufficient condition for each agent to have a significant benefit.
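The setting in the abstract can be illustrated with a small sketch. It is not the strategy proposed in the paper, which this record does not describe. It takes the abstract's own example of normal distributions with a shared mean and agent-specific variances, and assumes each agent is willing to share only three summary values (count, sample mean, sample variance) in place of its raw data. The function names `local_summary`, `pooled_mean`, and `simulate`, the inverse-variance weighting rule, and all numeric settings are assumptions made for illustration.

```python
import random
import statistics


def local_summary(data):
    """Summary an agent shares instead of raw data: (count, sample mean, sample variance)."""
    return len(data), statistics.fmean(data), statistics.variance(data)


def pooled_mean(summaries):
    """Estimate the common mean by weighting each agent's mean by n / variance.

    This is ordinary inverse-variance weighting, used here as a stand-in
    for a cooperative estimator; it is an assumption, not the paper's method.
    """
    weights = [n / var for n, _, var in summaries]
    total = sum(weights)
    return sum(w * mean for w, (_, mean, _) in zip(weights, summaries)) / total


def simulate(true_mean=1.0, sigmas=(0.5, 2.0, 5.0), n=200, seed=0):
    """Draw one data set per agent (hypothetical settings) and pool the summaries."""
    rng = random.Random(seed)
    datasets = [[rng.gauss(true_mean, s) for _ in range(n)] for s in sigmas]
    summaries = [local_summary(d) for d in datasets]
    return summaries, pooled_mean(summaries)


if __name__ == "__main__":
    summaries, joint = simulate()
    for i, (n, mean, var) in enumerate(summaries):
        print(f"agent {i}: n={n}, local mean={mean:.3f}, local variance={var:.3f}")
    print(f"pooled estimate of the common mean: {joint:.3f}")
```

In this toy setup the agents with noisier data stand to gain the most from pooling, since their local means are the least precise. Characterizing when and how much each agent benefits is the kind of question the abstract says the paper answers in general form.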