Automated social hierarchy detection through email network analysis
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis
San Jose, California
Pages 109-117  
Year of Publication: 2007
ISBN: 978-1-59593-848-0
Authors
Ryan Rowe, Columbia University, New York, NY
German Creamer, Columbia University, New York, NY
Shlomo Hershkop, Columbia University, New York, NY
Salvatore J Stolfo, Columbia University, New York, NY
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1348549.1348562

ABSTRACT

This paper presents a novel algorithm for automatically extracting social hierarchy data from electronic communication behavior. The algorithm mines user behavior to analyze and catalog patterns of communication between entities in an email collection and, from those patterns, to extract social standing. The advantage of such automatic methods is that they extract relevancy between hierarchy levels and are dynamic over time.

We illustrate the algorithm on real-world data using the Enron corporation's email archive. The results show great promise when compared to the corporation's work chart and to judicial proceedings analyzing the major players.
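
As a rough illustration of the general idea of inferring standing from communication patterns, the sketch below ranks users by a simple score built from email volume and number of distinct contacts. This is not the algorithm from the paper: the abstract does not specify its features or weights, so the function name rank_by_email_activity, the choice of the two features, and the equal 0.5 weights are all assumptions made for this example.

```python
from collections import defaultdict


def rank_by_email_activity(messages):
    """Rank users by a toy 'standing' score.

    messages: iterable of (sender, recipient) pairs.
    Returns a list of (user, score) sorted from highest to lowest score.
    The features and weights are illustrative assumptions only.
    """
    sent = defaultdict(int)
    received = defaultdict(int)
    contacts = defaultdict(set)

    for sender, recipient in messages:
        if sender == recipient:
            continue  # ignore self-addressed mail
        sent[sender] += 1
        received[recipient] += 1
        contacts[sender].add(recipient)
        contacts[recipient].add(sender)

    users = set(sent) | set(received)
    if not users:
        return []

    # Normalize each feature by its maximum so both lie in [0, 1].
    max_volume = max(sent[u] + received[u] for u in users)
    max_degree = max(len(contacts[u]) for u in users)

    scores = {}
    for u in users:
        volume = (sent[u] + received[u]) / max_volume
        degree = len(contacts[u]) / max_degree
        scores[u] = 0.5 * volume + 0.5 * degree  # assumed equal weights

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical toy data, not drawn from the Enron archive.
    toy_messages = [
        ("ceo", "vp1"), ("ceo", "vp2"), ("vp1", "ceo"),
        ("vp2", "ceo"), ("vp1", "staff"), ("ceo", "staff"),
    ]
    for user, score in rank_by_email_activity(toy_messages):
        print(f"{user}: {score:.3f}")
```

A realistic system would likely need richer signals than these two counts, and recomputing the ranking over successive time windows would be one way to reflect the "dynamic over time" property mentioned in the abstract.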


