ACM Home Page
Please provide us with feedback. Feedback
Dynamic dictionary matching and compressed suffix trees
Full text PdfPdf (952 KB)
Source Symposium on Discrete Algorithms archive
Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms table of contents
Vancouver, British Columbia
SESSION: Session 1A table of contents
Pages: 13 - 22  
Year of Publication: 2005
ISBN:0-89871-585-7
Authors
Ho-Leung Chan  The University of Hong Kong, Hong Kong
Wing-Kai Hon  The University of Hong Kong, Hong Kong
Tak-Wah Lam  The University of Hong Kong, Hong Kong
Kunihiko Sadakane  Kyushu University, Japan
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
: SIAM Activity Group on Discrete Mathematics
Publisher
Society for Industrial and Applied Mathematics  Philadelphia, PA, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 37,   Citation Count: 0
Additional Information:

abstract   references   collaborative colleagues  

Tools and Actions: Review this Article  

ABSTRACT

Recent breakthrough in compressed indexing data structures has reduced the space for indexing a text (or a collection of texts) of length n from O(n log n) bits to O(n) bits, while allowing very efficient pattern matching [10, 13]. Yet the compressed nature of such indices also makes them difficult to update dynamically. This paper presents the first O(n)-bit representation of a suffix tree for a dynamic collection of texts whose total length is n, which supports insertion and deletion of a text T in O(|T | log2 n) time, as well as all suffix tree traversal operations, including forward and backward suffix links. This work can be regarded as a generalization of the compressed representation of static texts. Our new suffix tree representation serves as a core part in a compact solution for the dynamic dictionary matching problem, i.e., providing an O(d)-bit data structure for a dynamic collection of patterns of total length d that can support the dictionary matching query efficiently. When compared with the O(d log d)-bit suffix tree based solution of Amir et al., the compact solution increases the query time by roughly a factor of log d only. In the study of the above results, we also derive the first O(n)-bit representation for maintaining n pairs of balanced parentheses in O(log n/log log n) time per operation, matching the time complexity of the previous O(n log n)-bit solution.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
 
4
 
5
 
6
M. Burrows and D. J. Wheeler. A Block-sorting Lossless Data Compression Algorithm. Technical Report 124, Digital Equipment Corporation, Paolo Alto, California, 1994.
 
7
H. L. Chan, W. K. Hon, and T. W. Lam. Compressed Index for a Dynamic Collection of Texts. In Proceedings of Symposium on Combinatorial Pattern Matching, pages 445--456, 2004.
 
8
W. I. Chang and E. L. Lawler. Sublinear Approximate String Matching and Biological Applications. Algorithmica, 12(4/5):327--344, 1994.
 
9
A. L. Delcher, S. Kasif, R. D. Fleischmann, J. Peterson, O. White, and S. L. Salzberg. Alignment of whole genomes. Nucleic Acids Research, 27(11):2369--2376, 1999.
 
10
11
 
12
R. Grossi and G. F. Italiano. Suffix Trees and their Applications in String Algorithms. In Proceedings of South American Workshop on String Processing, pages 57--76, 1993.
13
 
14
 
15
 
16
G. Jacobson. Space-efficient Static Trees and Graphs. In Proceedings of Symposium on Foundations of Computer Science, pages 549--554, 1989.
 
17
18
 
19
 
20
 
21
 
22
K. Sadakane. Compressed Suffix Trees with Full Functionality. Theory of Computing Systems, accepted.
 
23
 
24
P. Weiner. Linear Pattern Matching Algorithms. In Proceedings of Symposium on Switching and Automata Theory, pages 1--11, 1973.
Collaborative Colleagues:
Ho-Leung Chan: colleagues
Wing-Kai Hon: colleagues
Tak-Wah Lam: colleagues
Kunihiko Sadakane: colleagues