|
ABSTRACT
We present three algorithms for exact string matching of multiple patterns. Our algorithms are filtering methods, which apply q-grams and bit parallelism. We ran extensive experiments with them and compared them with various versions of earlier algorithms, e.g., different trie implementations of the Aho--Corasick algorithm. All of our algorithms appeared to be substantially faster than earlier solutions for sets of 1,000--10,000 patterns and the good performance of two of them continues to 100,000 patterns. The gain is because of the improved filtering efficiency caused by q-grams.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
|
| |
3
|
|
 |
4
|
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
Fredriksson, K. and Grabowski, S. 2005. Practical and optimal string matching. In Proceedings of 12th Symposium on String Processing and Information Retrieval (SPIRE'05). Lecture Notes in Computer Science, vol. 3772. Springer-Verlag, Berlin.
|
| |
11
|
Gum, B. and Lipton, R. 2001. Cheaper by the dozen: Batched algorithms. In Proceedings of the 1st SIAM International Conference on Data Mining.
|
| |
12
|
Horspool, N. 1980. Practical fast searching in strings. Softw. Pract. Exper. 10, 501--506.
|
| |
13
|
|
| |
14
|
Kim, S. and Kim, Y. 1999. A fast multiple string-pattern matching algorithm. In Proceedings of 17th AoM/IAoM Conference on Computer Science.
|
| |
15
|
Markatos, E. P., Antonatos, S., Polychronakis, M., and Anagnostakis, K. G. 2002. Exclusion-based signature matching for intrusion detection. In Proceedings of the IASTED International Conference on Communications and Computer Networks (CCN). ACTA Press. Calgary, AB, Canada, 146--152.
|
| |
16
|
|
 |
17
|
|
| |
18
|
|
| |
19
|
Peltola, H. and Tarhio, J. 2003. Alternative algorithms for bit-parallel string matching. In Proceedings of 10th Symposium on String Processing and Information Retrieval (SPIRE'03). Lecture Notes in Computer Science, vol. 2857. Springer-Verlag, Berlin. 80--94.
|
| |
20
|
Ping, L., Jian-long, T., and Yan-bing, L. 2005. A partition-based efficient algorithm for large scale multiple-strings matching. In Proceedings of 12th Symposium on String Processing and Information Retrieval (SPIRE'05). Lecture Notes in Computer Science, vol. 3772. Springer-Verlag, Berlin.
|
| |
21
|
Tuck, N., Sherwood, T., Calder, B., and Varghese, G. 2004. Deterministic memory-efficient string matching algorithms for intrusion detection. In Proceedings of the IEEE Infocom Conference.
|
| |
22
|
Wu, S. and Manber, U. 1992a. Agrep---a fast approximate pattern-matching tool. In Proceedings of the Usenix Winter 1992 Technical Conference. 153--162.
|
 |
23
|
|
| |
24
|
Wu, S. and Manber, U. 1994. A fast algorithm for multi-pattern searching. Tech. Rep. TR-94-17, Department of Computer Science. University of Arizona, Tucson, Arizona.
|
 |
25
|
|
|