ABSTRACT
This paper presents a novel algorithm for automatically extracting social hierarchy data from electronic communication behavior. The algorithm mines user behavior to automatically analyze and catalog patterns of communication between entities in an email collection and, from these patterns, to extract social standing. The advantage of such automatic methods is that they extract the relevancy between hierarchy levels and remain dynamic over time. We illustrate the algorithm on real-world data from the Enron Corporation's email archive. The results show great promise when compared with the corporation's work chart and with judicial proceedings that analyzed the major players.