Mercury BLASTP: Accelerating Protein Sequence Alignment
Source
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Volume 1, Issue 2 (June 2008)
Article No. 9
Year of Publication: 2008
ISSN:1936-7406
Authors
Arpith Jacob  Washington University in St. Louis
Joseph Lancaster  Washington University in St. Louis
Jeremy Buhler  Washington University in St. Louis
Brandon Harris  Washington University in St. Louis, and BECS Technology, Inc.
Roger D. Chamberlain  Washington University in St. Louis, and BECS Technology, Inc.
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1371579.1371581

ABSTRACT

Large-scale protein sequence comparison is an important but compute-intensive task in molecular biology. BLASTP is the most popular tool for comparative analysis of protein sequences. In recent years, an exponential increase in the size of protein sequence databases has required either exponentially more running time or a cluster of machines to keep pace. To address this problem, we have designed and built a high-performance FPGA-accelerated version of BLASTP, Mercury BLASTP. In this article, we describe the architecture of the portions of the application that are accelerated in the FPGA, and we also describe the integration of these FPGA-accelerated portions with the existing BLASTP software. We have implemented Mercury BLASTP on a commodity workstation with two Xilinx Virtex-II 6000 FPGAs. We show that the new design runs 11 to 15 times faster than software BLASTP on a modern CPU while delivering close to 99% identical results.



