|
ABSTRACT
Consider a sequence S of n symbols drawn from an alphabet A = {1, 2,. . .,σ}, stored as a binary string of nlog σ bits. A succinct data structure on S supports a given set of primitive operations on S using just f (n) = o(n log σ) extra bits. We present a technique for transforming succinct data structures (which do not change the binary content of S) into compressed data structures using nHk + f(n) + O(n log σ + log logσ n + k)/ logσ n) bits of space, where Hk ≤ log σ is the kth-order empirical entropy of S. When k + log σ = o(log n), we improve the space complexity of the succinct data structure from n log σ + o(n log σ) to n Hk + o(nlog σ) bits by keeping S in compressed format, so that any substring of O(log σ n) symbols in S (i.e. O(log n) bits) can be decoded on the fly in constant time. Thus, the time complexity of the supported operations does not change asymptotically. Namely, if an operation takes t(n) time in the succinct data structure, it requires O(t(n)) time in the resulting compressed data structure. Using this simple approach we improve the space complexity of some of the best known results on succinct data structures We extend our results to handle another definition of entropy.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
|
| |
3
|
|
 |
4
|
|
| |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
P. Ferragina, G. Manzini, V. Mäkinen, and G. Navarro. Succinct Representation of Sequences. Tr/dcc-2004-5, August 2004. ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/sequences.ps.gz.
|
| |
11
|
Paolo Ferragina and Giovanni Manzini. On compressing and indexing data. Journal of the ACM, 2005.
|
| |
12
|
|
| |
13
|
Anna Gál and Peter Bro Miltersen. The cell probe complexity of succinct data structures. In Proc. ICALP, em LNCS 2719, pp. 332--344, 2003.
|
| |
14
|
R. F. Geary, N. Rahman, R. Raman, and V. Raman. A simple optimal representation for balanced parentheses. In Proc. CPM, LNCS 3109, pp. 159--172, 2004.
|
| |
15
|
|
 |
16
|
|
| |
17
|
|
| |
18
|
|
 |
19
|
|
| |
20
|
A. Gupta, W.-K. Hon, R. Shah, and J. S. Vitter. Compressed Data Structures: Dictionaries and Data-Aware Measures. Manuscript, 2005.
|
| |
21
|
G. Jacobson. Space-efficient static trees and graphs. In Proc. IEEE FOCS, pp. 549--554, 1989.
|
| |
22
|
|
| |
23
|
|
| |
24
|
|
| |
25
|
|
| |
26
|
|
| |
27
|
|
| |
28
|
J. Ian Munro, Rajeev Raman, Venkatesh Raman, and S. Srinivasa Rao. Succinct representations of permutations. In Proc. ICALP, LNCS 2719, pp. 345--356, 2003.
|
| |
29
|
|
| |
30
|
J. Ian Munro , Venkatesh Raman , Adam J. Storm, Representing dynamic binary trees succinctly, Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, p.529-536, January 07-09, 2001, Washington, D.C., United States
|
| |
31
|
J. Ian Munro and S. Srinivasa Rao. Succinct representations of functions. In Proc. ICALP, LNCS 3142, pp. 1006--1015, 2004.
|
| |
32
|
|
| |
33
|
C. K. Poon and W. K. Yiu. Opportunistic Data Structures for Range Queries. In Proc. COCOON, LNCS 3595, pp. 560--569, 2005.
|
| |
34
|
|
| |
35
|
Rajeev Raman and S. Srinivasa Rao. Succinct dynamic dictionaries and trees. In Proc. ICALP, LNCS 2719, pp. 357--368, 2003.
|
| |
36
|
|
| |
37
|
|
| |
38
|
J. Ziv and A. Lempel. Compression of individual sequences via variable-rate coding. IEEE Trans. Inform. Theory, IT-24(5):530--536, September 1978.
|
CITED BY 11
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jérémy Barbay , Meng He , J. Ian Munro , S. Srinivasa Rao, Succinct indexes for strings, binary relations and multi-labeled trees, Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, p.680-689, January 07-09, 2007, New Orleans, Louisiana
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|