ACM Home Page
Please provide us with feedback. Feedback
Multiseed Lossless Filtration
Full text PdfPdf (430 KB)
Source IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) archive
Volume 2 ,  Issue 1  (January 2005) table of contents
Pages: 51 - 61  
Year of Publication: 2005
ISSN:1545-5963
Authors
Publisher
IEEE Computer Society Press  Los Alamitos, CA, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 47,   Citation Count: 3
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: 10.1109/TCBB.2005.12

ABSTRACT

We study a method of seed-based lossless filtration for approximate string matching and related bioinformatics applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Kärkkäinen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
S. Altschul T. Madden A. Schäffer J. Zhang Z. Zhang W. Miller and D. Lipman, “Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs,” <i>Nucleic Acids Research,</i> vol. 25, no. 17, pp. 3389-3402, 1997.
 
4
B. Ma J. Tromp and M. Li, “PatternHunter: Faster and More Sensitive Homology Search,” <i>Bioinformatics,</i> vol. 18, no. 3, pp. 440-445, 2002.
 
5
S. Schwartz J. Kent A. Smit Z. Zhang R. Baertsch R. Hardison D. Haussler and W. Miller, “Human-Mouse Alignments with BLASTZ,” <i>Genome Research,</i> vol. 13, pp. 103-107, 2003.
 
6
L. Noé and G. Kucherov, “Improved Hit Criteria for DNA Local Alignment,” <i>BMC Bioinformatics,</i> vol. 5, no. 149, Oct. 2004.
 
7
P. Pevzner and M. Waterman, “Multiple Filtration and Approximate Pattern Matching,” <i>Algorithmica,</i> vol. 13, pp. 135-154, 1995.
 
8
9
 
10
11
 
12
B. Brejova D. Brown and T. Vinar, “Vector Seeds: An Extension to Spaced Seeds Allows Substantial Improvements in Sensitivity and Specificity,” <i>Proc. Third Int'l Workshop Algorithms in Bioinformatics (WABI),</i> pp. 39-54, Sept. 2003.
 
13
 
14
 
15
M. Csürös, “Performing Local Similarity Searches with Variable Length Seeds,” <i>Proc. 15th Ann. Combinatorial Pattern Matching Symp. (CPM),</i> pp. 373-387, 2004.
 
16
M. Li B. Ma D. Kisman and J. Tromp, “PatternHunter II: Highly Sensitive and Fast Homology Search,” <i>J. Bioinformatics and Computational Biology,</i> vol. 2, no. 3, pp. 417-440, Sept. 2004.
17
 
18
D.G. Brown, “Multiple Vector Seeds for Protein Alignment,” <i>Proc. Fourth Int'l Workshop Algorithms in Bioinformatics (WABI),</i> pp. 170-181, Sept. 2004.
 
19
J. Xu D. Brown M. Li and B. Ma, “Optimizing Multiple Spaced Seeds for Homology Search,” <i>Proc. 15th Symp. Combinatorial Pattern Matching,</i> pp. 47-58, 2004.
 
20
 
21
F. Li and G. Stormo, “Selection of Optimal DNA Oligos for Gene Expression Arrays,” <i>Bioinformatics,</i> vol. 17, pp. 1067-1076, 2001.
 
22
L. Kaderali and A. Schliep, “Selecting Signature Oligonucleotides to Identify Organisms Using DNA Arrays,” <i>Bioinformatics,</i> vol. 18, no. 10, pp. 1340-1349, 2002.
 
23
S. Rahmann, “Fast Large Scale Oligonucleotide Selection Using the Longest Common Factor Approach,” <i>J. Bioinformatics and Computational Biology,</i> vol. 1, no. 2, pp. 343-361, 2003.
 
24
J. Zheng T. Close T. Jiang and S. Lonardi, “Efficient Selection of Unique and Popular Oligos for Large EST Databases,” <i>Proc. 14th Ann. Combinatorial Pattern Matching Symp. (CPM),</i> pp. 273-283, 2003.
 
25


Collaborative Colleagues:
Gregory Kucherov: colleagues
Laurent Noe: colleagues
Mikhail Roytberg: colleagues