ACM Home Page
Please provide us with feedback. Feedback
Cost-effective outbreak detection in networks
Full text MovMov (16:52),  PdfPdf (1.11 MB)
Source
International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
San Jose, California, USA
SESSION: Research track papers table of contents
Pages: 420 - 429  
Year of Publication: 2007
ISBN:978-1-59593-609-7
Authors
Jure Leskovec  Carnegie Mellon University
Andreas Krause  Carnegie Mellon University
Carlos Guestrin  Carnegie Mellon University
Christos Faloutsos  Carnegie Mellon University
Jeanne VanBriesen  Carnegie Mellon University
Natalie Glance  Nielsen BuzzMetrics
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 21,   Downloads (12 Months): 247,   Citation Count: 11
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1281192.1281239
What is a DOI?

ABSTRACT

Given a water distribution network, where should we place sensors toquickly detect contaminants? Or, which blogs should we read to avoid missing important stories?.

These seemingly different problems share common structure: Outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network, in order to detect the spreading of a virus or information asquickly as possible. We present a general methodology for near optimal sensor placement in these and related problems. We demonstrate that many realistic outbreak detection objectives (e.g., detection likelihood, population affected) exhibit the property of "submodularity". We exploit submodularity to develop an efficient algorithm that scales to large problems, achieving near optimal placements, while being 700 times faster than a simple greedy algorithm. We also derive online bounds on the quality of the placements obtained by any algorithm. Our algorithms and bounds also handle cases where nodes (sensor locations, blogs) have different costs.

We evaluate our approach on several large real-world problems,including a model of a water distribution network from the EPA, andreal blog data. The obtained sensor placements are provably near optimal, providing a constant fraction of the optimal solution. We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude. We also show how the approach leads to deeper insights in both applications, answering multicriteria trade-off, cost-sensitivity and generalization questions.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
N. Bailey. The Mathematical Theory of Infectious Diseases and its Applications. Griffin, London, 1975.
 
2
J. Berry, W. E. Hart, C. E. Phillips, J. G. Uber, and J. Watson. Sensor placement in municipal water networks with temporal integer programming models. J. Water Resources Planning and Management, 2006.
 
3
S. Bikhchandani, D. Hirshleifer, and I. Welch. A theory of fads, fashion, custom, and cultural change as informational cascades. J. of Polit. Econ., (5), 1992.
 
4
 
5
R. Cohen, S. Havlin, and D. ben Avraham. Efficient immunization strategies for computer networks and populations. Physical Review Letters, 91:247901, 2003.
 
6
G. Dorini, P. Jonkergouw, and et. al. An efficient algorithm for sensor placement in water distribution systems. In at. Dist. Syst. An. Conf., 2006.
7
 
8
J. Goldenberg, B. Libai, and E. Muller. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters, 12, 2001.
9
10
 
11
 
12
A. Krause, J. Leskovec, C. Guestrin, J. VanBriesen, and C. Faloutsos. Efficient sensor placement optimization for securing large water distribution networks. Submitted to the J. of Water Resources Planning an Management, 2007.
13
14
 
15
J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective Outbreak Detection in Networks. TR, CMU-ML-07-111, 2007.
 
16
J. Leskovec, M. McGlohon, C. Faloutsos, N. S. Glance, and M. Hurst. Cascading behavior in large blog graphs. In SDM, 2007.
 
17
G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14, 1978.
 
18
A. Ostfeld and E. Salomons. Optimal layout of early warning detection stations for water distribution systems security. J. Water Resources Planning and Management, 130(5):377--385, 2004.
 
19
A. Ostfeld, J. G. Uber, and E. Salomons. Battle of water sensor networks: A design challenge for engineers and algorithms. In WDSA, 2006.
 
20
R. Pastor-Satorras and A. Vespignani. Immunization of complex networks. Physical Review E, 65, 2002.
21
 
22
 
23
E. Rogers. Diffusion of innovations. Free Press, 1995.
 
24
L. A. Rossman. The epanet programmer's toolkit for analysis of water distribution systems. In Ann. Wat. Res. Plan. Mgmt. Conference, 1999.
 
25
M. Sviridenko. A note on maximizing a submodular set function subject to knapsack constraint. Operations Research Letters, 32:41--43, 2004.

CITED BY  11

Collaborative Colleagues:
Jure Leskovec: colleagues
Andreas Krause: colleagues
Carlos Guestrin: colleagues
Christos Faloutsos: colleagues
Jeanne VanBriesen: colleagues
Natalie Glance: colleagues