ACM Home Page
Please provide us with feedback. Feedback
Reconstructing strings from random traces
Full text PdfPdf (245 KB)
Source Symposium on Discrete Algorithms archive
Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms table of contents
New Orleans, Louisiana
SESSION: Session 11A table of contents
Pages: 910 - 918  
Year of Publication: 2004
ISBN:0-89871-558-X
Authors
Tuǧkan Batu  University of Pennsylvania, Philadelphia, PA
Sampath Kannan  University of Pennsylvania, Philadelphia, PA
Sanjeev Khanna  University of Pennsylvania, Philadelphia, PA
Andrew McGregor  University of Pennsylvania, Philadelphia, PA
Sponsor
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
Society for Industrial and Applied Mathematics  Philadelphia, PA, USA
Bibliometrics
Downloads (6 Weeks): 3,   Downloads (12 Months): 25,   Citation Count: 4
Additional Information:

abstract   references   cited by   collaborative colleagues  

Tools and Actions: Review this Article  

ABSTRACT

We are given a collection of m random subsequences (traces) of a string t of length n where each trace is obtained by deleting each bit in the string with probability q. Our goal is to exactly reconstruct the string t from these observed traces. We initiate here a study of deletion rates for which we can successfully reconstruct the original string using a small number of samples. We investigate a simple reconstruction algorithm called Bitwise Majority Alignment that uses majority voting (with suitable shifts) to determine each bit of the original string. We show that for random strings t, we can reconstruct the original string (w.h.p.) for q = O(1/ log n) using only O(log n) samples. For arbitrary strings t, we show that a simple modification of Bitwise Majority Alignment reconstructs a string that has identical structure to the original string (w.h.p.) for q = O(1/n1/2+ε) using O(1) samples. In this case, using O(n log n) samples, we can reconstruct the original string exactly. Our setting can be viewed as the study of an idealized biological evolutionary process where the only possible mutations are random deletions. Our goal is to understand at what mutation rates, a small number of observed samples can be correctly aligned to reconstruct the parent string.In the process of establishing these results, we show that Bitwise Majority Alignment has an interesting self-correcting property whereby local distortions in the traces do not generate errors in the reconstruction and eventually get corrected.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
V. I. Levenshtein, Binary codes capable of correcting deletions, insertions and reversals (in Russian), Doklady Akademii Nauk SSSR, 163 (1965), no. 4, 845--848. Enlgish translation in Soviet Physics Dokl., 10 (1966), no. 8, 707--710.
 
4
V. I. Levenshtein, Binary codes capable of correcting spurious insertions and deletions of ones (in Russian), Problemy Peredachi Informatsii, 1 (1965), no. 1, 12--25. Enlgish translation in Problems of Information Transmission, 1 (1965), no. 1, 8--17.
 
5
V. I. Levenshtein, On perfect codes in the deletion/insertion metric (in Russian), Diskret. Mat. 3 (1991), no. 1, 3--20. English translation in Discrete Mathematics and Applications 2 (1992), no. 3, 241--258.
 
6
V. I. Levenshtein, Efficient reconstruction of sequences, IEEE Trans. Inform. Theory 47 (2001), no. 1, 2--22.
 
7
L. J. Shulman and D. Zuckerman, Asymptotically good codes correction insertions, deletions and transpositions, IEEE Trans. Inform. Theory 45 (1999), no. 7, 2552--2557.

Collaborative Colleagues:
Tuǧkan Batu: colleagues
Sampath Kannan: colleagues
Sanjeev Khanna: colleagues
Andrew McGregor: colleagues