LOF: identifying density-based local outliers
Source
International Conference on Management of Data
Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Dallas, Texas, United States
Pages: 93 - 104
Year of Publication: 2000
ISBN: 1-58113-217-4
Authors
Markus M. Breunig
Institute for Computer Science, University of Munich, Oettingenstr. 67, D-80538 Munich, Germany
Hans-Peter Kriegel
Institute for Computer Science, University of Munich, Oettingenstr. 67, D-80538 Munich, Germany
Raymond T. Ng
Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
Jörg Sander
Institute for Computer Science, University of Munich, Oettingenstr. 67, D-80538 Munich, Germany
ABSTRACT
For many KDD applications, such as detecting criminal activities in E-commerce, finding the rare instances, or outliers, can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for many scenarios it is more meaningful to assign to each object a degree of being an outlier. This degree is called the local outlier factor (LOF) of an object. It is local in that the degree depends on how isolated the object is with respect to the surrounding neighborhood. We give a detailed formal analysis showing that LOF enjoys many desirable properties. Using real-world datasets, we demonstrate that LOF can be used to find outliers that appear to be meaningful but cannot otherwise be identified with existing approaches. Finally, a careful performance evaluation of our algorithm confirms that our approach of finding local outliers can be practical.
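The idea of a degree of outlierness that depends on the surrounding neighborhood can be illustrated with a small sketch. The code below is not the authors' implementation. It is a minimal brute-force version that follows the commonly stated LOF definitions (k-distance, reachability distance, local reachability density). The function name `lof_scores`, the parameter `k`, and the sample points are assumptions made for illustration, and ties in the k-distance are broken by input order rather than by enlarging the neighborhood.

```python
import math


def lof_scores(points, k):
    """Return one LOF score per point (simplified brute-force sketch).

    Assumptions: Euclidean distance, exactly k neighbors per point
    (ties broken by input order), and no point has k exact duplicates,
    which would make a reachability density infinite.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbors of each point and the distance to the k-th one.
    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    def reach_dist(i, j):
        # Reachability distance of point i with respect to neighbor j.
        return max(k_distance[j], dist[i][j])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        mean_reach = sum(reach_dist(i, j) for j in neighbors[i]) / k
        lrd.append(1.0 / mean_reach)

    # LOF: mean ratio of the neighbors' densities to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical data: four corners of a unit square plus one isolated point.
    sample = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
    for point, score in zip(sample, lof_scores(sample, k=2)):
        print(point, round(score, 3))
```

With this sample, the four corner points score 1, meaning their density matches that of their neighbors, while the isolated point scores well above 1. The score is a graded value per object, not a binary label.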
REFERENCES
1. Arning A., Agrawal R., Raghavan P.: "A Linear Method for Deviation Detection in Large Databases", Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, Portland, OR, AAAI Press, 1996, pp. 164-169.
2. Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander, OPTICS: ordering points to identify the clustering structure, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.49-60, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
3. Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan, Automatic subspace clustering of high dimensional data for data mining applications, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.94-105, June 01-04, 1998, Seattle, Washington, United States
5. Barnett V., Lewis T.: "Outliers in statistical data", John Wiley, 1994.
6. DuMouchel W., Schonlau M.: "A Fast Computer Intrusion Detection Algorithm based on Hypothesis Testing of Command Transition Probabilities", Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, New York, NY, AAAI Press, 1998, pp. 189-193.
7. Ester M., Kriegel H.-P., Sander J., Xu X.: "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise", Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, Portland, OR, AAAI Press, 1996, pp. 226-231.
9. Fayyad U., Piatetsky-Shapiro G., Smyth P.: "Knowledge Discovery and Data Mining: Towards a Unifying Framework", Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, Portland, OR, 1996, pp. 82-88.
10. Hawkins, D.: "Identification of Outliers", Chapman and Hall, London, 1980.
11. Hinneburg A., Keim D.A.: "An Efficient Approach to Clustering in Large Multimedia Databases with Noise", Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, New York City, NY, 1998, pp. 58-65.
12. Johnson T., Kwok I., Ng R.: "Fast Computation of 2-Dimensional Depth Contours", Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, New York, NY, AAAI Press, 1998, pp. 224-228.
17. Sridhar Ramaswamy, Rajeev Rastogi, Kyuseok Shim, Efficient algorithms for mining outliers from large data sets, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.427-438, May 15-18, 2000, Dallas, Texas, United States
20. Tukey J. W.: "Exploratory Data Analysis", Addison-Wesley, 1977.
23. Tian Zhang, Raghu Ramakrishnan, Miron Livny, BIRCH: an efficient data clustering method for very large databases, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.103-114, June 04-06, 1996, Montreal, Quebec, Canada
CITED BY 108
Christian Böhm, Bernhard Braunmüller, Markus Breunig, Hans-Peter Kriegel, High performance clustering based on the similarity join, Proceedings of the ninth international conference on Information and knowledge management, p.298-305, November 06-11, 2000, McLean, Virginia, United States
Shichao Zhang, Feng Chen, Xindong Wu, Chengqi Zhang, Identifying bridging rules between conceptual clusters, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, August 20-23, 2006, Philadelphia, PA, USA
S. Subramaniam, T. Palpanas, D. Papadopoulos, V. Kalogeraki, D. Gunopulos, Online outlier detection in sensor data using non-parametric models, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
Lian Duan, Lida Xu, Feng Guo, Jun Lee, Baopin Yan, A local-density based spatial clustering algorithm with noise, Information Systems, v.32 n.7, p.978-986, November, 2007
Ji Zhang, Meng Lou, Tok Wang Ling, Hai Wang, Hos-Miner: a system for detecting outlying subspaces of high-dimensional data, Proceedings of the Thirtieth international conference on Very large data bases, p.1265-1268, August 31-September 03, 2004, Toronto, Canada
Chenyi Xia, Hongjun Lu, Beng Chin Ooi, Jing Hu, Gorder: an efficient method for KNN join processing, Proceedings of the Thirtieth international conference on Very large data bases, p.756-767, August 31-September 03, 2004, Toronto, Canada
Stefan Harmeling, Guido Dornhege, David Tax, Frank Meinecke, Klaus-Robert Müller, From outliers to prototypes: Ordering data, Neurocomputing, v.69 n.13-15, p.1608-1618, August, 2006
Xiuyao Song, Mingxi Wu, Christopher Jermaine, Sanjay Ranka, Statistical change detection for multi-dimensional data, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
Bo Sheng, Qun Li, Weizhen Mao, Wen Jin, Outlier detection in sensor networks, Proceedings of the 8th ACM international symposium on Mobile ad hoc networking and computing, September 09-14, 2007, Montreal, Quebec, Canada
Jian Yang, Ning Zhong, Yiyu Yao, Jue Wang, Local peculiarity factor and its application in outlier detection, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
Yufei Tao, Ke Yi, Cheng Sheng, Panos Kalnis, Quality and efficiency in high dimensional nearest neighbor search, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
Suratna Budalakoti, Ashok N. Srivastava, Matthew E. Otey, Anomaly detection and diagnosis algorithms for discrete symbol sequences with applications to airline safety, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, v.39 n.1, p.101-113, January 2009
Hui Xiong, Junjie Wu, Jian Chen, K-means clustering versus validation measures: a data-distribution perspective, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, v.39 n.2, p.318-331, April 2009
Nebojša Pejčić, Nataša Reljin, Samantha McDaniel, Dragoljub Pokrajac, Aleksandar Lazarević, Detection of moving objects using incremental connectivity outlier factor algorithm, Proceedings of the 47th Annual Southeast Regional Conference, March 19-21, 2009, Clemson, South Carolina