|
ABSTRACT
Protein sequences with unknown functionality are often compared to a set of known sequences to detect functional similarities. Efficient dynamic-programming algorithms exist for solving this problem, however current solutions still require significant scan times. These scan time requirements are likely to become even more severe due to exponential database growth. In this paper we present a new approach to bio-sequence database scanning using re-configurable FPGA-based hardware platforms to gain high performance at low cost. Efficient mappings of the Smith-Waterman algorithm using fine-grained parallel processing elements (PEs) that are tailored towards the parameters of a query have been designed. We use customization opportunities available at run-time to dynamically hyper customize the systolic array to make better use of available resource. Our FPGA implementation achieves a speedup of approximately 170 for linear gap penalties and 125 for affine gap penalties as compared to a standard desktop computing platform. We show how hyper-customization at run-time can be used to further improve the performance.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Alpha-Data http://www.alpha-data.co.uk
|
| |
2
|
Altschul, S.F., Gish, W., Miller, W., Myers, E.W, Lipman, D.J.: Basic local alignment search tool, Journal of Molecular Biology 215 (1990) 403--410.
|
| |
3
|
Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M.-C., Estreicher, A., Gasteiger, E., Martin, M.J., Michoud, K., O'Donovan, C., Phan, I., Pilbout, S., Schneider, M.: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003 Nucleic Acids Research 31(2003) 365--370.
|
| |
4
|
Borah, M., Bajwa, R.S., Hannenhalli, S., Irwin, M.J.: A SIMD solution to the sequence comparison problem on the MGAP, in Proc. ASAP'94, IEEE CS (1994) 144--160.
|
| |
5
|
Camilleri, N.: XAPP138 Status and Control Semaphore Registers Using Partial Reconfiguration, Xilinx Inc, Version 1.0, May 17, 1999.
|
| |
6
|
Chow, E., Hunkapiller, T., Peterson, J., Waterman, M.S.: Biological Information Signal Processor, Proc. ASAP'91, IEEE CS (1991) 144--160.
|
| |
7
|
Dahle, D., Grate L., Rice, E., Hughey, R.: The UCSC Kestrel general purpose parallel processor, Proc. Int. Conf. Parallel and Distributed Processing Techniques and Appilcations (1999) 1243--1249.
|
| |
8
|
Glemet, E., Codani, J.J.: LASSAP, a Large Scale Sequence compArison Package, CABIOS 13 (2) (1997) 145--150.
|
| |
9
|
|
| |
10
|
|
| |
11
|
|
| |
12
|
Guccione, S.A., Levi, D., Sundararajan, P.: JBits: A Java-based Interface for Reconfigurable Computing, 2nd Annual Military and Aerospace Applications of Programmable Devices and Technologies Conference (MAPLD Con'99).
|
| |
13
|
Guerdoux-Jamet, P., Lavenier, D.: SAMBA: hardware accelerator for biological sequence comparison, CABIOS 12 (6) (1997) 609--615.
|
| |
14
|
Hoang, D.T.: Searching genetic databases on Splash 2, in Proc. IEEE Workshop on FPGAs for Custom Computing Machines, IEEE CS, (1993) 185--191.
|
| |
15
|
Hughey, R.: Parallel Hardware for Sequence Comparison and Alignment, CABIOS 12 (6) (1996) 473--479.
|
| |
16
|
Lavenier, D., Pacherie, J.-L.: Parallel Processing for Scanning Genomic Data-Bases, Proc. PARCO'97, Elseiver (1998) 81--88.
|
| |
17
|
|
| |
18
|
Pearson, W.R.: Comparison of methods for searching protein sequence databases, Protein Science 4 (6) (1995) 1145--1160.
|
| |
19
|
Project Proteus, http://www.projectproteus.com
|
| |
20
|
Singh, R.K. et al.: BIOSCAN: a network sharable computational resource for searching biosequence databases, CABIOS, 12 (3) (1996) 191--196.
|
| |
21
|
Schmidt, B., Schröder, H., Schimmler, M: Massively Parallel Solutions for Molecular Sequence Analysis, Proc. 1st IEEE Int. Workshop on High Performance Computational Biology, Ft. Lauderdale, Florida, 2002.
|
| |
22
|
Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences, Journal of Molecular Biology 147 (1981) 195--197.
|
| |
23
|
Sturrock, S., Collins, J.: MPsrch version 1.3, Biocomputing Research Unit University of Edinburgh, UK, 1993.
|
| |
24
|
TimeLogic Corporation, http://www.timelogic.com./
|
| |
25
|
Yamaguchi, Y., Maruyama, T., Konagaya, A.: High Speed Homology Search with FPGAs, Proc. Pacific Symposium on Biocomputing'02, pp.271--282, (2002).
|
| |
26
|
Yu, C.W., Kwong, K.H., Lee, K.H., Leong, P.H.W.: A Smith-Waterman Systolic Cell, Proc. 13th Int. Workshop on Field Programmable Logic and Applications (FPL'03), Springer, LNCS 2778, (2003) 375--384.
|
| |
27
|
Xilinx Inc.: XAPP290 Two Flows for Partial Reconfiguration: Module Based or Small Bit Manipulations, Xilinx Inc, September 9, 2004.
|
| |
28
|
Xilinx Inc.: Virtex-II™ Platform FPGA User Guide G0U02, Xilinx Inc, Version 1.9, December 2002.
|
CITED BY 5
|
Rahul P. Maddimsetty , Jeremy Buhler , Roger D. Chamberlain , Mark A. Franklin , Brandon Harris, Accelerator design for protein sequence HMM search, Proceedings of the 20th annual international conference on Supercomputing, June 28-July 01, 2006, Cairns, Queensland, Australia
|
|
Justin Ma , Kirill Levchenko , Christian Kreibich , Stefan Savage , Geoffrey M. Voelker, Unexpected means of protocol inference, Proceedings of the 6th ACM SIGCOMM on Internet measurement, October 25-27, 2006, Rio de Janeriro, Brazil
|
|
|
|
|
|
|
|
|
|