ACM Home Page
Please provide us with feedback. Feedback
Digital Library logoTake a look at the new version of this page: [ beta version ]. Tell us what you think.
Dense Linear Algebra over Word-Size Prime Fields: the FFLAS and FFPACK Packages
Full text PdfPdf (522 KB)
Source
ACM Transactions on Mathematical Software (TOMS) archive
Volume 35 ,  Issue 3  (October 2008) table of contents
Article No.: 19  
Year of Publication: 2008
ISSN:0098-3500
Authors
Jean-Guillaume Dumas  Université Joseph Fourier
Pascal Giorgi  Université Montpellier 2 and Université de Perpignan
Clément Pernet  University of Washington
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6,   Downloads (12 Months): 53,   Citation Count: 0
Additional Information:

appendices and supplements   abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1391989.1391992
What is a DOI?

APPENDICES and SUPPLEMENTS
The proof of Theorem 3.1 is given in an electronic appendix, available online in the ACM Digital Library.


ABSTRACT

In the past two decades, some major efforts have been made to reduce exact (e.g. integer, rational, polynomial) linear algebra problems to matrix multiplication in order to provide algorithms with optimal asymptotic complexity. To provide efficient implementations of such algorithms one need to be careful with the underlying arithmetic. It is well known that modular techniques such as the Chinese remainder algorithm or the p-adic lifting allow very good practical performance, especially when word size arithmetic is used. Therefore, finite field arithmetic becomes an important core for efficient exact linear algebra libraries. In this article, we study high performance implementations of basic linear algebra routines over word size prime fields: especially matrix multiplication; our goal being to provide an exact alternate to the numerical BLAS library. We show that this is made possible by a careful combination of numerical computations and asymptotically faster algorithms. Our kernel has several symbolic linear algebra applications enabled by diverse matrix multiplication reductions: symbolic triangularization, system solving, determinant, and matrix inverse implementations are thus studied.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
 
4
Brassel, M., Giorgi, P., and Pernet, C. 2003. LUdivine: A symbolic block LU factorisation for matrices over finite fields using BLAS. Poster, http://ljk.imag.fr/membres/Jean−Guillaume.Dumas/FFLAS/FFLAS_Download/ludivine_poster_eccad2003.ps.gz.
 
5
Bunch, J. R. and Hopcroft, J. E. 1974. Triangular factorization and inversion by fast matrix multiplication. Mathematics of Computation 28, 231--236.
 
6
Chen, Z. and Storjohann, A. 2003. Effective reductions to matrix multiplication. 9th International Conference on Applications of Computer Algebra (ACA’2003), Raleigh, North Carolina State University.
 
7
 
8
Courrieu, P. 2005. Fast computation of Moore-Penrose inverse matrices. Neural Information Processing - Letters and Reviews 8, 2 (Aug.), 25--29.
 
9
Dixon, J. D. 1982. Exact solution of linear equations using p-adic expansions. Numerische Mathematik 40, 137--141.
 
10
Dongarra, J. and Eijkhout, V. 2003. Self-adapting numerical software and automatic tuning of heuristics. Lecture Notes in Computer Science 2660, 759--770.
11
 
12
 
13
Dumas, J.-G., , Giorgi, P., and Pernet, C. 2006. FFLAS-FFPACK: Finite field linear algebra subroutine/package. Software, http://ciel.ccsd.cnrs.fr/ciel−00000025. Feb.
 
14
Dumas, J.-G. 2004. Efficient dot product over finite fields. In Proceedings of the Seventh International Workshop on Computer Algebra in Scientific Computing, (Yalta, Ukraine). V. G. Ganzha, E. W. Mayr, and E. V. Vorozhtsov, Eds. Technische Universität München, Germany, 139--154.
 
15
Dumas, J.-G. 2007. Q-adic transform revisited. Tech. Rep. 0710.0510 {cs.SC}, ArXiv. Oct. http://hal.archives-ouvertes.fr/hal-00173894.
 
16
Dumas, J.-G., Gautier, T., Giesbrecht, M., Giorgi, P., Hovinen, B., Kaltofen, E., Saunders, B. D., Turner, W. J., and Villard, G. 2002a. LinBox: A generic library for exact linear algebra. In Proceedings of the 2002 International Congress of Mathematical Software (Beijing, China). A. M. Cohen, X.-S. Gao, and N. Takayama, Eds. World Scientific Pub, 40--50.
17
18
 
19
Dumas, J.-G., Pernet, C., and Roch, J.-L. 2006. Adaptive triangular system solving. In Challenges in Symbolic Computation Software. Dagstuhl Seminar proceedings 06271, paper 770.
20
 
21
Dumas, J.-G., Pernet, C., and Zhou, W. 2007. Memory efficient scheduling of Strassen-Winograd’s matrix multiplication algorithm. Tech. rep., arXiv:0707.2347v2. Aug. http://arxiv.org/abs/0707.2347v2.
 
22
 
23
 
24
 
25
Giorgi, P. 2003. From BLAS routines to finite field exact linear algebra solutions. 9th International Conference on Applications of Computer Algebra (ACA’2003), Raleigh, North Carolina State University.
26
 
27
 
28
Goto, K. and van de Geijn, R. 2002. On reducing tlb misses in matrix multiplication. Tech. Rep. TR-2002-55, University of Texas. Nov. FLAME working note #9.
 
29
30
 
31
 
32
Huss-Lederman, S., Jacobson, E. M., Johnson, J. R., Tsao, A., and Turnbull, T. 1996b. Strassen’s algorithm for matrix multiplication : modeling analysis, and implementation. Tech. rep., Center for Computing Sciences. Nov. CCS-TR-96-17.
 
33
Ibarra, O. H., Moran, S., and Hui, R. 1982. A generalization of the fast LUP matrix decomposition algorithm and applications. J. Algor. 3, 1 (Mar.), 45--56.
 
34
 
35
 
36
Laderman, J., Pan, V., and Sha, X.-H. 1992. On practical algorithms for accelerated matrix multiplication. Linear Algebra Appl. 162--164, 557--588.
 
37
Montgomery, P. L. 1985. Modular multiplication without trial division. Math. Computat. 44, 170 (Apr.), 519--521.
 
38
Montgomery, P. L. 1995. A block Lanczos algorithm for finding dependencies over gf(2). In Proceedings of the 1995 International Conference on the Theory and Application of Cryptographic Techniques (Saint-Malo, France). L. C. Guillou and J.-J. Quisquater, Eds. Lecture Notes in Computer Science, vol. 921. 106--120.
 
39
Noble, B. 1966. A method for computing the generalized inverse of a matrix. SIAM J. Numer. Anal. 3, 4 (Dec.), 582--584.
 
40
 
41
Pernet, C. 2001. Implementation of Winograd’s matrix multiplication over finite fields using ATLAS level 3 BLAS. Tech. Rep. RR011122, Laboratoire Informatique et Distribution. July. http://ljk.imag.fr/membres/Jean−Guillaume.Dumas/FFLAS/FFLAS_Download/FFLAS_technical_report.ps.gz.
42
 
43
Shoup, V. 2002. NTL 5.3: A library for doing number theory. www.shoup.net/ntl.
 
44
 
45
Strassen, V. 1969. Gaussian elimination is not optimal. Numerische Mathematik 13, 354--356.
 
46
Whaley, R. C., Petitet, A., and Dongarra, J. J. 2001. Automated empirical optimizations of software and the ATLAS project. Para. Comput. 27, 1--2 (Jan.), 3--35. http://www.netlib.org/utk/people/JackDongarra/PAPERS/atlas_pub.pdf.
 
47
Zassenhaus, H. 1978. A remark on the Hensel factorization method. Mathematics of Computation 32, 141 (Jan.), 287--292.

Collaborative Colleagues:
Jean-Guillaume Dumas: colleagues
Pascal Giorgi: colleagues
Clément Pernet: colleagues