Detecting distributed scans using high-performance query-driven visualization
Source: Conference on High Performance Networking and Computing
Proceedings of the 2006 ACM/IEEE Conference on Supercomputing
Tampa, Florida
Session: Technical papers
Article No. 82
Year of Publication: 2006
ISBN: 0-7695-2700-0
Authors
Kurt Stockinger  University of California, Berkeley, California
E. Wes Bethel  University of California, Berkeley, California
Scott Campbell  University of California, Berkeley, California
Eli Dart  University of California, Berkeley, California
Kesheng Wu  University of California, Berkeley, California
Sponsors
IEEE : Institute of Electrical and Electronics Engineers
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1188455.1188542

ABSTRACT

Modern forensic analytics applications, such as network traffic analysis, perform high-performance hypothesis testing, knowledge discovery, and data mining on very large datasets. One essential strategy for reducing the time these operations require is to select only the most relevant data records for a given computation. In this paper, we present a set of parallel algorithms that demonstrate how an efficient selection mechanism, bitmap indexing, significantly speeds up a common analysis task: computing conditional histograms on very large datasets. We present a thorough study of the performance characteristics of the parallel conditional histogram algorithms. As a case study, we compute conditional histograms to detect distributed scans hidden in a dataset of approximately 2.5 billion network connection records. We show that these conditional histograms can be computed on an interactive time scale (i.e., in seconds). We also show how to progressively modify the selection criteria to narrow the analysis and find the sources of the distributed scans.
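
To make the idea concrete, the following is a minimal, single-process sketch of how a bitmap index can drive a conditional histogram. It is an illustration only, not the parallel algorithms of the paper and not the API of any indexing library. The column names (`dest_port`, `hour`, `src_host`), the toy records, and the helper functions are all hypothetical, and Python integers stand in for compressed bitmaps.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Equality-encoded index: map each distinct value to an int whose
    bit i is set when values[i] equals that value."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def popcount(bits):
    """Number of set bits, i.e. number of selected rows."""
    return bin(bits).count("1")


def conditional_histogram(bin_index, selection):
    """Count, per bin value, the rows that also satisfy the selection.
    Only bitwise ANDs and bit counts are needed; raw rows are never scanned."""
    return {
        value: popcount(bitmap & selection)
        for value, bitmap in sorted(bin_index.items())
        if bitmap & selection
    }


# Hypothetical toy connection records, one list per column.
dest_port = [22, 80, 22, 22, 443, 22, 80, 22]
hour = [1, 1, 2, 2, 2, 3, 3, 3]
src_host = ["a", "b", "a", "c", "b", "d", "a", "d"]

port_index = build_bitmap_index(dest_port)
hour_index = build_bitmap_index(hour)
src_index = build_bitmap_index(src_host)

# Condition: dest_port == 22. Histogram the matching rows by hour.
selection = port_index.get(22, 0)
hist = conditional_histogram(hour_index, selection)
print(hist)  # {1: 1, 2: 2, 3: 2}

# Narrow the selection (dest_port == 22 AND hour == 3), then histogram by source.
narrowed = selection & hour_index.get(3, 0)
by_source = conditional_histogram(src_index, narrowed)
print(by_source)  # {'d': 2}
```

The second query mirrors the progressive narrowing described in the abstract: each refinement is one more bitwise AND against the existing selection, after which the histogram is recomputed over a different attribute.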



