SNARE: a link analytic system for graph labeling and risk detection
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Paris, France
SESSION: Industrial track papers
Pages 1265-1274  
Year of Publication: 2009
ISBN:978-1-60558-495-9
Authors
Mary McGlohon  Carnegie Mellon University, Pittsburgh, PA, USA
Stephen Bay  PricewaterhouseCoopers, LLC, San Jose, CA, USA
Markus G. Anderle  PricewaterhouseCoopers, LLC, San Jose, CA, USA
David M. Steier  PricewaterhouseCoopers, LLC, San Jose, CA, USA
Christos Faloutsos  Carnegie Mellon University, Pittsburgh, PA, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557019.1557155

ABSTRACT

Classifying nodes in networks is a task with a wide range of applications. It can be particularly useful in anomaly and fraud detection. Many resources are invested in the task of fraud detection due to the high cost of fraud, and being able to automatically detect potential fraud quickly and precisely allows human investigators to work more efficiently. Many data analytic schemes have been put into use; however, schemes that bolster link analysis prove promising. This work builds upon the belief propagation algorithm for use in detecting collusion and other fraud schemes. We propose an algorithm called SNARE (Social Network Analysis for Risk Evaluation). By allowing one to use domain knowledge as well as link knowledge, the method was very successful at pinpointing misstated accounts in our sample of general ledger data, with a significant improvement over the default heuristic in true positive rates, and a lift factor of up to 6.5 (more than twice that of the default heuristic). We also apply SNARE to the task of graph labeling in general on publicly available datasets. We show that with only some information about the nodes themselves in a network, we obtain surprisingly high labeling accuracy. Not only is SNARE applicable in a wide variety of domains, but it is also robust to the choice of parameters and highly scalable: it scales linearly with the number of edges in a graph.
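The abstract names belief propagation as the basis of SNARE but gives no details, so the following is only a generic sketch of loopy belief propagation for two-class node labeling, not the authors' implementation. The function name `belief_propagation`, the state encoding (0 = normal, 1 = risky), the homophily parameter `eps`, and the toy graph are all assumptions made for illustration. Node priors stand in for "domain knowledge" and the edge potential for "link knowledge". Each iteration touches every edge a constant number of times per incident message, which is in the spirit of the linear-in-edges scaling the abstract mentions.

```python
import numpy as np

def belief_propagation(n_nodes, edges, priors, eps=0.1, n_iters=20):
    """Generic loopy BP on an undirected graph with two states per node.

    priors: array of shape (n_nodes, 2), each row a prior over (normal, risky).
    eps:    assumed probability that two linked nodes have different labels.
    """
    # Edge potential (assumption): neighbours share a label with prob. 1 - eps.
    psi = np.array([[1.0 - eps, eps], [eps, 1.0 - eps]])

    neighbors = {i: [] for i in range(n_nodes)}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # messages[(i, j)] is the message sent from node i to node j.
    messages = {}
    for u, v in edges:
        messages[(u, v)] = np.array([0.5, 0.5])
        messages[(v, u)] = np.array([0.5, 0.5])

    for _ in range(n_iters):
        updated = {}
        for (i, j) in messages:
            # Combine i's prior with messages from all neighbours except j.
            incoming = priors[i].copy()
            for k in neighbors[i]:
                if k != j:
                    incoming = incoming * messages[(k, i)]
            # Sum over i's states, weighted by the edge potential.
            m = psi.T @ incoming
            updated[(i, j)] = m / m.sum()
        messages = updated

    # Final belief: prior times all incoming messages, normalised.
    beliefs = np.zeros((n_nodes, 2))
    for i in range(n_nodes):
        b = priors[i].copy()
        for k in neighbors[i]:
            b = b * messages[(k, i)]
        beliefs[i] = b / b.sum()
    return beliefs

# Toy chain 0-1-2-3: only node 0 has an informative (risky) prior.
toy_edges = [(0, 1), (1, 2), (2, 3)]
toy_priors = np.array([[0.1, 0.9], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]])
toy_beliefs = belief_propagation(4, toy_edges, toy_priors)
print(toy_beliefs[:, 1])  # risk scores, decreasing with distance from node 0
```

In this toy example the risk score is highest at the flagged node and decays along the chain, which illustrates how a label known for only some nodes can inform the labels of the rest through the links.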


