Least squares solvers for distributed-memory machines with GPU accelerators
Document type: Journal article
Language: English
Author(s):
Publisher: ACM New York
Year of publication: 2019
Field(s):
Subjects:
Keywords:
LC classification: QA76.88
Dewey classification: 004
English abstract:
- This work presents an implementation of a linear least squares solver for distributed-memory machines with GPU accelerators, developed as part of the Software for Linear Algebra Targeting Exascale (SLATE) package. From the algorithmic standpoint, the work leverages recent advances in dense linear algebra, specifically the communication-avoiding QR factorization. From the implementation standpoint, the work represents a sharp departure from the traditional conventions established by legacy packages, such as LAPACK and ScaLAPACK. It is based on representing the matrix as a collection of individual tiles, and using batch operations for offloading work to accelerators. The article lays out the principles of the new approach, discusses the implementation details and presents the performance results.
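The communication-avoiding QR idea mentioned in the abstract can be illustrated with a small single-process sketch. This is not SLATE code and does not reflect its API: the function name `tsqr_least_squares` and the `num_blocks` parameter are hypothetical, NumPy stands in for the distributed tile and batch machinery, and each row block is assumed to play the role of one process's local data in a one-level tall-skinny QR (TSQR) reduction.

```python
import numpy as np


def tsqr_least_squares(A, b, num_blocks=4):
    """Solve min ||A x - b||_2 using a one-level TSQR reduction.

    Illustrative assumption: A is tall with full column rank, and each
    row block stands in for the data owned by one process.
    """
    m, _ = A.shape
    # Partition the row indices into contiguous blocks.
    row_blocks = np.array_split(np.arange(m), num_blocks)

    r_factors = []
    rhs_parts = []
    for rows in row_blocks:
        # Local step: factor each block independently (no communication).
        q_local, r_local = np.linalg.qr(A[rows], mode="reduced")
        r_factors.append(r_local)
        # Apply the local Q^T to the matching slice of the right-hand side.
        rhs_parts.append(q_local.T @ b[rows])

    # Reduction step: a single QR of the stacked small R factors.
    q_top, r_final = np.linalg.qr(np.vstack(r_factors), mode="reduced")
    rhs = q_top.T @ np.concatenate(rhs_parts)

    # r_final is upper triangular; a general dense solve is used here for
    # simplicity rather than a dedicated triangular solver.
    return np.linalg.solve(r_final, rhs)
```

In a distributed setting only the small R factors and the transformed right-hand-side pieces would need to be exchanged, which is where the communication savings come from; the sketch above performs the reduction in one step rather than along a tree.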
Note: Excerpt from: Proceedings of the International Conference on Supercomputing, 26 June 2019, pages 117-126