ABSTRACT
Gaussian elimination with partial pivoting is a widely used, traditional method for solving dense linear systems of equations (LSEs). This paper presents a hardware-optimized variant of Gaussian elimination and an efficient 32-bit ANSI/IEEE Std 754-1985 floating-point implementation of it on a Xilinx Virtex-5 FPGA. The logic of the traditional algorithm is changed to exploit parallelism in hardware, which allows the proposed architecture to compute the solution very quickly. Its average running time for n×n 32-bit floating-point matrices with uniformly distributed entries is on the order of n² clock cycles, compared with the order of n³ in software. The design uses FPLibrary, an open-source library that provides parameterizable pipelined floating-point operators. The design is integrated into a prototype system in which it accelerates the work of a general-purpose processor, exchanging data between host and FPGA over PCI Express using DMA. Furthermore, by means of Strassen's algorithm, large LSEs can also be solved by several FPGAs working together. The complete implementation, placed and routed on an xc5vlx110t-3 FPGA, solves LSEs of dimension up to 22, can be clocked at up to 200 MHz, and computes the solution in 5.39 µs on average, a speed-up of almost 15 times over an equivalent software implementation on a 2.6 GHz Pentium IV CPU. To the best of the authors' knowledge, there has been no previous work on floating-point LSE-solving hardware and its implementation as an application function unit in a reconfigurable computing system.
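For reference, the traditional software algorithm that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration of textbook Gaussian elimination with partial pivoting, not the paper's hardware-optimized variant. The function name `solve_gepp` is hypothetical, and the sketch uses Python's double-precision floats rather than the 32-bit format of the FPGA design. The three nested loops in the elimination phase are the source of the roughly n³ software cost.

```python
def solve_gepp(a, b):
    """Solve a*x = b for a dense square system (illustrative sketch only).

    a is a list of n rows of n numbers and b is a list of n numbers.
    Returns x as a list of floats.
    """
    n = len(a)
    # Work on an augmented copy [a | b] so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            m[k], m[p] = m[p], m[k]

        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x
```

A call such as `solve_gepp([[0, 1], [1, 0]], [2, 3])` exercises the pivoting step, since the leading entry of the first row is zero and the rows must be swapped before elimination can proceed.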