Parallel out-of-core computation and updating of the QR factorization
Source: ACM Transactions on Mathematical Software (TOMS)
Volume 31, Issue 1 (March 2005)
Pages: 60 - 78  
Year of Publication: 2005
ISSN:0098-3500
Authors
Brian C. Gunter  The University of Texas at Austin, Austin, TX
Robert A. van de Geijn  The University of Texas at Austin, Austin, TX
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1055531.1055534

ABSTRACT

This article discusses the high-performance parallel implementation of the computation and updating of QR factorizations of dense matrices, including problems large enough to require out-of-core computation, where the matrix is stored on disk. The algorithms presented here are scalable both in problem size and as the number of processors increases. Implementation using the Parallel Linear Algebra Package (PLAPACK) and the Parallel Out-of-Core Linear Algebra Package (POOCLAPACK) is discussed. The methods are shown to attain excellent performance, in some cases attaining roughly 80% of the “realizable” peak of the architectures on which the experiments were performed.
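
As a rough illustration of what "updating" a QR factorization can mean when the matrix is too large to hold in memory, the sketch below folds row blocks into a running triangular factor R one block at a time. It is a serial NumPy sketch under stated assumptions, not the PLAPACK or POOCLAPACK implementation: the function names `update_r` and `blockwise_r` are hypothetical, and each in-memory block merely stands in for a panel that would be read from disk.

```python
import numpy as np


def update_r(r, new_rows):
    """Fold new_rows into an existing R factor (hypothetical helper).

    Stacking the current R on top of the new rows and refactoring gives
    the R factor of all rows seen so far.
    """
    stacked = np.vstack([r, new_rows])
    return np.linalg.qr(stacked, mode="r")


def blockwise_r(blocks, n):
    """Compute R for a tall matrix delivered as a sequence of row blocks."""
    r = np.zeros((0, n))  # empty factor: no rows processed yet
    for block in blocks:  # each block stands in for a panel read from disk
        r = update_r(r, block)
    return r


# Small demonstration: 600 x 8 matrix processed in blocks of 100 rows.
rng = np.random.default_rng(0)
a = rng.standard_normal((600, 8))
blocks = (a[i:i + 100] for i in range(0, 600, 100))

r_blocked = blockwise_r(blocks, 8)
r_direct = np.linalg.qr(a, mode="r")

# R is unique only up to the signs of its rows, so compare R^T R,
# which equals A^T A for either factor.
assert np.allclose(r_blocked.T @ r_blocked, r_direct.T @ r_direct)
```

Only one block and the small n-by-n factor are held in memory at any time, which is the property an out-of-core scheme relies on. The sketch discards Q and does not exploit the triangular structure of R when refactoring.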


