Intelligent file scoring system for malware detection from the gray list
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Paris, France
SESSION: Industrial track papers
Pages 1385-1394
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Yanfang Ye, Xiamen University, Xiamen, China
Tao Li, Florida International University, Miami, FL, USA
Qingshan Jiang, Xiamen University, Xiamen, China
Zhixue Han, Xiamen University, Xiamen, China
Li Wan, KingSoft Corporation, Zhuhai, China
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 32,   Downloads (12 Months): 107,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1557019.1557167

ABSTRACT

Currently, the most significant line of defense against malware is anti-virus products, which focus on authenticating valid software from a white list, blocking invalid software from a black list, and running any unknown software (i.e., the gray list) in a controlled manner. The gray list, containing unknown software programs that could be either normal or malicious, is usually authenticated or rejected manually by virus analysts. Unfortunately, as malware writing techniques develop, the number of file samples in the gray list that virus analysts need to analyze on a daily basis is constantly increasing. In this paper, we develop an intelligent file scoring system (IFSS for short) for malware detection from the gray list, built as an ensemble of heterogeneous base-level classifiers derived by different learning methods, using different feature representations on dynamic training sets. To the best of our knowledge, this is the first work to apply such ensemble methods to malware detection. IFSS makes it practical for virus analysts to identify malware samples from the huge gray list and improves the detection ability of anti-virus software. It has already been incorporated into the scanning tool of Kingsoft's Anti-Virus software. Case studies on large, real daily collections of the gray list illustrate that the detection ability and efficiency of IFSS exceed those of other popular scanning tools such as NOD32 and Kaspersky.
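
The general idea of scoring gray-list files with an ensemble of heterogeneous classifiers over different feature representations can be illustrated with a minimal sketch. This is not the IFSS implementation: the abstract does not specify the base learners, the features, or how their outputs are combined. The two toy classifiers, the feature view names ("api", "strings"), the weights, and the weighted-average combination below are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A base classifier maps one feature view of a file to a maliciousness
# estimate in [0, 1]. Both classifiers below are hypothetical stand-ins.
BaseClassifier = Callable[[Sequence[float]], float]


def flagged_call_ratio(features: Sequence[float]) -> float:
    """Toy classifier: fraction of API-call indicators that are set (1)."""
    return sum(features) / len(features) if features else 0.0


def suspicious_string_rule(features: Sequence[float]) -> float:
    """Toy classifier: 1.0 if any string-suspicion value exceeds 0.5."""
    return 1.0 if any(f > 0.5 for f in features) else 0.0


def score_file(
    views: Dict[str, Sequence[float]],
    ensemble: List[Tuple[str, BaseClassifier, float]],
) -> float:
    """Combine base classifier outputs into one score by weighted average.

    Each ensemble member is (feature view name, classifier, weight).
    The weighted average is an assumed combination rule.
    """
    total_weight = sum(weight for _, _, weight in ensemble)
    weighted_sum = sum(
        weight * classifier(views[view]) for view, classifier, weight in ensemble
    )
    return weighted_sum / total_weight


def rank_gray_list(
    gray_list: Dict[str, Dict[str, Sequence[float]]],
    ensemble: List[Tuple[str, BaseClassifier, float]],
) -> List[Tuple[str, float]]:
    """Return (file name, score) pairs, most suspicious first."""
    scored = [(name, score_file(views, ensemble)) for name, views in gray_list.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical ensemble and gray list, for illustration only.
ENSEMBLE = [
    ("api", flagged_call_ratio, 2.0),
    ("strings", suspicious_string_rule, 1.0),
]
GRAY_LIST = {
    "b.exe": {"api": [0, 0, 0, 0], "strings": [0.1]},
    "a.exe": {"api": [1, 1, 0, 0], "strings": [0.9]},
}
RANKING = rank_gray_list(GRAY_LIST, ENSEMBLE)
```

Sorting by score lets an analyst review the most suspicious files first, which is the practical benefit the abstract attributes to file scoring.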


