HMMer acceleration using systolic array based reconfigurable architecture
Source
International Symposium on Field Programmable Gate Arrays
Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays
Monterey, California, USA
POSTER SESSION: Applications
Page 282
Year of Publication: 2009
ISBN:978-1-60558-410-2
Authors
Yanteng Sun  Harbin Engineering University, Harbin, China
Peng Li  Intel China Research Center, Beijing, China
Guochang Gu  Harbin Engineering University, Harbin, China
Yuan Wen  Harbin Engineering University, Harbin, China
Yuan Liu  Intel China Research Center, Beijing, China
Dong Liu  Intel China Research Center, Beijing, China
Sponsors
SIGDA: ACM Special Interest Group on Design Automation
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1508128.1508193

ABSTRACT

HMMer is a widely used bioinformatics software package that uses profile Hidden Markov Models (HMMs) to model the primary structure consensus of a family of protein or nucleic acid sequences. However, with the rapid growth of both sequence and model databases, running HMMer on traditional computer architectures is increasingly time-consuming. With modern field programmable gate array (FPGA) technology, applications can be accelerated on a CPU-FPGA cooperative system by mapping computation-intensive work onto the FPGA. In this paper, the computation kernel of HMMer, P7Viterbi, is selected for FPGA acceleration. After careful data dependency analysis, we propose a systolic array based reconfigurable architecture that exploits both inter-module and intra-module parallelism. P7Viterbi contains an infrequently taken feedback loop that updates the value of the beginning state (B state), and this loop limits further parallelization. Previous work either ignored the feedback loop or serialized the process, losing either precision or efficiency. Our proposed architecture exploits maximum parallelism without loss of precision: it runs speculatively with full parallelism, assuming that the feedback loop is not taken, and if the rare feedback case does occur, a rollback mechanism ensures correctness. Results show that on a Xilinx Virtex-5 110T FPGA, the proposed architecture achieves a speedup of about 56.8 times over an Intel Core2 Duo 2.33GHz CPU.
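
The speculate-and-rollback idea in the abstract can be illustrated with a small software sketch. This is not the authors' architecture or HMMer's P7Viterbi code. It is a toy model under stated assumptions: match states only (no insert or delete states), uniform made-up transition scores, and hypothetical names (`match_row`, `special_states`, `viterbi_serial`, `viterbi_speculative`). Each row normally needs the B state of the previous row, which in turn depends on the E and J states of that row. The speculative version starts each row from a B value that ignores the J path, then recomputes the row only if the J path turns out to have won.

```python
NEG = float("-inf")

# Hypothetical toy transition scores; real profiles carry per-position values.
T_NN, T_NB, T_BM, T_MM, T_EJ, T_JJ, T_JB = 0, -1, -2, 0, -3, 0, -1


def match_row(prev_m, b_prev, scores):
    """One DP row: M[k] = scores[k] + max(M_prev[k-1] + T_MM, B_prev + T_BM)."""
    row = []
    for k, s in enumerate(scores):
        diag = prev_m[k - 1] + T_MM if k > 0 else NEG
        row.append(s + max(diag, b_prev + T_BM))
    return row


def special_states(row, n_prev, j_prev):
    """Update N, E, J, B after a finished row; J -> B is the feedback path."""
    n = n_prev + T_NN
    e = max(row)
    j = max(j_prev + T_JJ, e + T_EJ)
    b = max(n + T_NB, j + T_JB)
    return n, e, j, b


def viterbi_serial(score_rows, m):
    """Reference: every row waits for the true B of the previous row."""
    prev_m, n, j = [NEG] * m, 0, NEG
    b, best = n + T_NB, NEG
    for scores in score_rows:
        prev_m = match_row(prev_m, b, scores)
        n, e, j, b = special_states(prev_m, n, j)
        best = max(best, e)  # stands in for the C state with zero-cost loops
    return best


def viterbi_speculative(score_rows, m):
    """Start each row from a B that ignores J; redo the row if that was wrong."""
    prev_m, n, j = [NEG] * m, 0, NEG
    b_true, best, rollbacks = n + T_NB, NEG, 0
    for scores in score_rows:
        b_spec = n + T_NB  # depends only on N, so it is known early
        row = match_row(prev_m, b_spec, scores)
        if b_true > b_spec:  # feedback case: the J -> B path actually won
            rollbacks += 1
            row = match_row(prev_m, b_true, scores)  # rollback: recompute
        n, e, j, b_true = special_states(row, n, j)
        best = max(best, e)
        prev_m = row
    return best, rollbacks
```

Here `score_rows` holds one list of `m` per-position match scores for each sequence residue. In this sketch the misprediction check happens immediately and a rollback redoes a single row; in a pipelined array the true B value would arrive later, so several in-flight rows would have to be discarded. The property worth checking is that `viterbi_speculative` returns the same score as `viterbi_serial`, with a rollback count that stays at zero whenever the J path never beats the N path.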

