|
ABSTRACT
Recent breakthrough in compressed indexing data structures has reduced the space for indexing a text (or a collection of texts) of length n from O(n log n) bits to O(n) bits, while allowing very efficient pattern matching [10, 13]. Yet the compressed nature of such indices also makes them difficult to update dynamically. This paper presents the first O(n)-bit representation of a suffix tree for a dynamic collection of texts whose total length is n, which supports insertion and deletion of a text T in O(|T | log2 n) time, as well as all suffix tree traversal operations, including forward and backward suffix links. This work can be regarded as a generalization of the compressed representation of static texts. Our new suffix tree representation serves as a core part in a compact solution for the dynamic dictionary matching problem, i.e., providing an O(d)-bit data structure for a dynamic collection of patterns of total length d that can support the dictionary matching query efficiently. When compared with the O(d log d)-bit suffix tree based solution of Amir et al., the compact solution increases the query time by roughly a factor of log d only. In the study of the above results, we also derive the first O(n)-bit representation for maintaining n pairs of balanced parentheses in O(log n/log log n) time per operation, matching the time complexity of the previous O(n log n)-bit solution.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
|
| |
3
|
|
| |
4
|
|
| |
5
|
|
| |
6
|
M. Burrows and D. J. Wheeler. A Block-sorting Lossless Data Compression Algorithm. Technical Report 124, Digital Equipment Corporation, Paolo Alto, California, 1994.
|
| |
7
|
H. L. Chan, W. K. Hon, and T. W. Lam. Compressed Index for a Dynamic Collection of Texts. In Proceedings of Symposium on Combinatorial Pattern Matching, pages 445--456, 2004.
|
| |
8
|
W. I. Chang and E. L. Lawler. Sublinear Approximate String Matching and Biological Applications. Algorithmica, 12(4/5):327--344, 1994.
|
| |
9
|
A. L. Delcher, S. Kasif, R. D. Fleischmann, J. Peterson, O. White, and S. L. Salzberg. Alignment of whole genomes. Nucleic Acids Research, 27(11):2369--2376, 1999.
|
| |
10
|
|
 |
11
|
|
| |
12
|
R. Grossi and G. F. Italiano. Suffix Trees and their Applications in String Algorithms. In Proceedings of South American Workshop on String Processing, pages 57--76, 1993.
|
 |
13
|
|
| |
14
|
|
| |
15
|
|
| |
16
|
G. Jacobson. Space-efficient Static Trees and Graphs. In Proceedings of Symposium on Foundations of Computer Science, pages 549--554, 1989.
|
| |
17
|
|
 |
18
|
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
K. Sadakane. Compressed Suffix Trees with Full Functionality. Theory of Computing Systems, accepted.
|
| |
23
|
|
| |
24
|
P. Weiner. Linear Pattern Matching Algorithms. In Proceedings of Symposium on Switching and Automata Theory, pages 1--11, 1973.
|
|