|
ABSTRACT
An algorithm is presented that searches for the location, “il” of the first occurrence of a character string, “pat,” in another string, “string.” During the search operation, the characters of pat are matched starting with the last character of pat. The information gained by starting the match at the end of the pattern often allows the algorithm to proceed in large jumps through the text being searched. Thus the algorithm has the unusual property that, in most cases, not all of the first i characters of string are inspected. The number of characters actually inspected (on the average) decreases as a function of the length of pat. For a random English pattern of length 5, the algorithm will typically inspect i/4 characters of string before finding a match at i. Furthermore, the algorithm has been implemented so that (on the average) fewer than i + patlen machine instructions are executed. These conclusions are supported with empirical evidence and a theoretical analysis of the average behavior of the algorithm. The worst case behavior of the algorithm is linear in i + patlen, assuming the availability of array space for tables linear in patlen plus the size of the alphabet.
3~
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
|
| |
3
|
Dewey, G. Relativ Frequency o f English Speech Sounds. Harvard U. Press, Cambridge, Mass., 1923, p. 185.
|
| |
4
|
Knuth, D.E., Morris, J.H., and Pratt, V.R. Fast pattern matching in strings. TR CS-74-440, Stanford U., Stanford, Calif., 1974.
|
| |
5
|
Knuth, D.E., Morris, J.H., and Pratt, V.R. Fast pattern matching in strings. (to appear in SIAM J. Comput.).
|
CITED BY 198
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Amihood Amir , Gary Benson , Martin Farach, Alphabet independent two dimensional matching, Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, p.59-68, May 04-06, 1992, Victoria, British Columbia, Canada
|
|
|
|
|
|
Ricardo A. Baeza-Yates , Gaston H. Gonnet , Mireille Régnier, Analysis of Boyer-Moore-type string searching algorithms, Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms, p.328-343, January 22-24, 1990, San Francisco, California, United States
|
|
|
|
|
|
Thanaruk Theeramunkong , Virach Sornlertlamvanich , Thanasan Tanhermhong , Wirat Chinnan, Character cluster based Thai information retrieval, Proceedings of the fifth international workshop on on Information retrieval with Asian languages, p.75-80, September 30-October 01, 2000, Hong Kong, China
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Y. Choueka , A. Fraenkel , S. Klein , E. Segal, Improved techniques for processing queries in full-text systems, Proceedings of the 10th annual international ACM SIGIR conference on Research and development in information retrieval, p.306-315, June 03-05, 1987, New Orleans, Louisiana, United States
|
|
|
|
|
|
|
|
|
Amihood Amir , Martin Farach , Ramana M. Idury , Johannes A. La Poutré , Alejandro A. Schäffer, Improved dynamic dictionary matching, Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms, p.392-401, January 25-27, 1993, Austin, Texas, United States
|
|
|
Stephen Alstrup , Gerth Stølting Brodal , Theis Rauhe, Pattern matching in dynamic texts, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.819-828, January 09-11, 2000, San Francisco, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
Andrew B. Kahng , Darko Kirovski , Stefanus Mantik , Miodrag Potkonjak , Jennifer L. Wong, Copy detection for intellectual property protection of VLSI designs, Proceedings of the 1999 IEEE/ACM international conference on Computer-aided design, p.600-605, November 07-11, 1999, San Jose, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ming Gu , Martin Farach , Richard Beigel, An efficient algorithm for dynamic text indexing, Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms, p.697-704, January 23-25, 1994, Arlington, Virginia, United States
|
|
|
|
|
|
Amihood Amir , Moshe Lewenstein , Ely Porat, Faster algorithms for string matching with k mismatches, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.794-803, January 09-11, 2000, San Francisco, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Torben Amtoft , Charles Consel , Olivier Danvy , Karoline Malmkjær, The abstraction and instantiation of string-matching programs, The essence of computation: complexity, analysis, transformation, Springer-Verlag New York, Inc., New York, NY, 2002
|
|
|
|
|
|
Lipyeow Lim , Min Wang , Sriram Padmanabhan , Jeffrey Scott Vitter , Ramesh Agarwal, Dynamic maintenance of web indexes using landmarks, Proceedings of the 12th international conference on World Wide Web, May 20-24, 2003, Budapest, Hungary
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Maxime Crochemore , Costas S. Iliopoulos , Thierry Lecroq , Yoan J. Pinzon , Wojciech Plandowski , Wojciech Rytter, Occurrence and substring heuristics for δ-matching, Fundamenta Informaticae, v.56 n.1,2, p.1-21, July 2003
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Reza Sadri , Carlo Zaniolo , Amir Zarkesh , Jafar Adibi, Optimization of sequence queries in database systems, Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.71-81, May 2001, Santa Barbara, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Subhabrata Sen , Oliver Spatscheck , Dongmei Wang, Accurate, scalable in-network identification of p2p traffic using application signatures, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA
|
|
|
|
|
|
Tun-Wen Pai , Margaret Dah-Tsyr Chang , Jia-Han Chu , Wei-Yuan Chang , Hsiu Ling Tai, Ladderlike stepping and interval jumping searching algorithms for DNA sequences, Proceedings of the second conference on Asia-Pacific bioinformatics, p.93-98, January 01, 2004, Dunedin, New Zealand
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Charles W. Glover , Nageswara S. V. Rao , E. M. Oblow, Hybrid pattern recognition system capable of self-modification, Proceedings of the second international conference on Information and knowledge management, p.239-244, November 01-05, 1993, Washington, D.C., United States
|
|
|
Amihood Amir , Gad M. Landau , Usi Vishkin, Efficient pattern matching with scaling, Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms, p.344-357, January 22-24, 1990, San Francisco, California, United States
|
|
|
|
|
|
K. Selçuk Candan , Mehmet E. Dönderler , J. Ramamoorthy , Jong W. Kim, Clustering and indexing of experience sequences for popularity-driven recommendations, Proceedings of the 3rd ACM workshop on Continuous archival and retrival of personal experences, October 28-28, 2006, Santa Barbara, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Sungwon Yi , Byoung-koo Kim , Jintae Oh , Jongsoo Jang , George Kesidis , Chita R. Das, Memory-efficient content filtering hardware for high-speed intrusion detection systems, Proceedings of the 2007 ACM symposium on Applied computing, March 11-15, 2007, Seoul, Korea
|
|
|
|
|
|
Nikola Grcevski , Allan Kielstra , Kevin Stoodley , Mark Stoodley , Vijay Sundaresan, JavaTM just-in-time compiler and virtual machine improvements for server and middleware applications, Proceedings of the 3rd conference on Virtual Machine Research And Technology Symposium, p.12-12, May 06-07, 2004, San Jose, California
|
|
|
|
|
|
Ying-Dar Lin , Kuo-Kun Tseng , Tsern-Huei Lee , Yi-Neng Lin , Chen-Chou Hung , Yuan-Cheng Lai, A platform-based SoC design and implementation of scalable automaton matching for deep packet inspection, Journal of Systems Architecture: the EUROMICRO Journal, v.53 n.12, p.937-950, December, 2007
|
|
|
|
|
|
|
|
|
|
|
|
Amihood Amir , Ayelet Butman , Moshe Lewenstein , Ely Porat , Dekel Tsur, Efficient one-dimensional real scaled matching, Journal of Discrete Algorithms, v.5 n.2, p.205-211, June, 2007
|
|
|
Amihood Amir , Eran Chencinski , Costas Iliopoulos , Tsvi Kopelowitz , Hui Zhang, Property matching and weighted matching, Theoretical Computer Science, v.395 n.2-3, p.298-310, May, 2008
|
|
|
Choochart Haruechaiyasak , Chatchawal Sangkeettrakarn , Pornpimon Palingoon , Sarawoot Kongyoung , Chaianun Damrongrat, A collaborative framework for collecting Thai unknown words from the web, Proceedings of the COLING/ACL on Main conference poster sessions, p.345-352, July 17-18, 2006, Sydney, Australia
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
H. V. Jagadish , S. Al-Khalifa , A. Chapman , L. V. S. Lakshmanan , A. Nierman , S. Paparizos , J. M. Patel , D. Srivastava , N. Wiwatwattana , Y. Wu , C. Yu, TIMBER: A native XML database, The VLDB Journal — The International Journal on Very Large Data Bases, v.11 n.4, p.274-291, December 2002
|
|
|
|
|
|
|
|
|
Z. M. Kedem , G. M. Landau , K. V. Palem, Optimal parallel suffix-prefix matching algorithm and applications, Proceedings of the first annual ACM symposium on Parallel algorithms and architectures, p.388-398, June 18-21, 1989, Santa Fe, New Mexico, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Atul Mahajan , Benfano Soewito , Sai K. Parsi , Ning Weng , Haibo Wang, Implementing high-speed string matching hardware for network intrusion detection systems, Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays, February 24-26, 2008, Monterey, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ying Liu , Lucian V. Lita , R. Stefan Niculescu , Kun Bai , Prasenjit Mitra , C. Lee Giles, Real-time data pre-processing technique for efficient feature extraction in large scale datasets, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Elias Athanasopoulos , Antonis Krithinakis , Georgios Kopidakis , Graeme Maxwell , Alistair Poustie , Bob Manning , Rod Webb , Martin Koyabe , Carla Di Cairano-Gilfedder, WISDOM: security-aware fibres, Proceedings of the Second European Workshop on System Security, p.22-27, March 31-31, 2009, Nuremburg, Germany
|
|
|
|
|
|
|
|
|
|
|
|
Lior Aronovich , Ron Asher , Eitan Bachmat , Haim Bitner , Michael Hirsch , Shmuel T. Klein, The design of a similarity based deduplication system, Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, May 04-April 06, 2009, Haifa, Israel
|
|
|
|
|
|
|
|
|
|
|