ACM Home Page
Please provide us with feedback. Feedback
Multipattern string matching with q-grams
Full text PdfPdf (386 KB)
Source Journal of Experimental Algorithmics (JEA) archive
Volume 11 ,  (2006) table of contents
SECTION: 1 - Regular Papers table of contents
Article No. 1.1  
Year of Publication: 2007
ISSN:1084-6654
Authors
Leena Salmela  Helsinki University of Technology, Finland
Jorma Tarhio  Helsinki University of Technology, Finland
Jari Kytöjoki  Helsinki University of Technology, Finland
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): n/a,   Downloads (12 Months): n/a,   Citation Count: 6
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1187436.1187438
What is a DOI?

ABSTRACT

We present three algorithms for exact string matching of multiple patterns. Our algorithms are filtering methods, which apply q-grams and bit parallelism. We ran extensive experiments with them and compared them with various versions of earlier algorithms, e.g., different trie implementations of the Aho--Corasick algorithm. All of our algorithms appeared to be substantially faster than earlier solutions for sets of 1,000--10,000 patterns and the good performance of two of them continues to 100,000 patterns. The gain is because of the improved filtering efficiency caused by q-grams.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
 
3
4
5
 
6
 
7
 
8
 
9
 
10
Fredriksson, K. and Grabowski, S. 2005. Practical and optimal string matching. In Proceedings of 12th Symposium on String Processing and Information Retrieval (SPIRE'05). Lecture Notes in Computer Science, vol. 3772. Springer-Verlag, Berlin.
 
11
Gum, B. and Lipton, R. 2001. Cheaper by the dozen: Batched algorithms. In Proceedings of the 1st SIAM International Conference on Data Mining.
 
12
Horspool, N. 1980. Practical fast searching in strings. Softw. Pract. Exper. 10, 501--506.
 
13
 
14
Kim, S. and Kim, Y. 1999. A fast multiple string-pattern matching algorithm. In Proceedings of 17th AoM/IAoM Conference on Computer Science.
 
15
Markatos, E. P., Antonatos, S., Polychronakis, M., and Anagnostakis, K. G. 2002. Exclusion-based signature matching for intrusion detection. In Proceedings of the IASTED International Conference on Communications and Computer Networks (CCN). ACTA Press. Calgary, AB, Canada, 146--152.
 
16
17
 
18
 
19
Peltola, H. and Tarhio, J. 2003. Alternative algorithms for bit-parallel string matching. In Proceedings of 10th Symposium on String Processing and Information Retrieval (SPIRE'03). Lecture Notes in Computer Science, vol. 2857. Springer-Verlag, Berlin. 80--94.
 
20
Ping, L., Jian-long, T., and Yan-bing, L. 2005. A partition-based efficient algorithm for large scale multiple-strings matching. In Proceedings of 12th Symposium on String Processing and Information Retrieval (SPIRE'05). Lecture Notes in Computer Science, vol. 3772. Springer-Verlag, Berlin.
 
21
Tuck, N., Sherwood, T., Calder, B., and Varghese, G. 2004. Deterministic memory-efficient string matching algorithms for intrusion detection. In Proceedings of the IEEE Infocom Conference.
 
22
Wu, S. and Manber, U. 1992a. Agrep---a fast approximate pattern-matching tool. In Proceedings of the Usenix Winter 1992 Technical Conference. 153--162.
23
 
24
Wu, S. and Manber, U. 1994. A fast algorithm for multi-pattern searching. Tech. Rep. TR-94-17, Department of Computer Science. University of Arizona, Tucson, Arizona.
25


Collaborative Colleagues:
Leena Salmela: colleagues
Jorma Tarhio: colleagues
Jari Kytöjoki: colleagues