This paper considers a method for accelerating finite-element simulations of electromagnetic problems on a workstation using graphics processing units (GPUs). The focus is on finite-element formulations that use higher-order elements and tetrahedral meshes and lead to sparse matrices too large to be handled by direct methods on a typical workstation. We discuss the problem of rapid matrix generation and assembly, as well as the acceleration of preconditioned iterative solvers under limited on-board GPU memory, and we show how to mitigate some of these problems using multiple GPUs. We propose a new fast data-distribution technique for multi-GPU platforms that allows optimal splitting of finite-element method (FEM) matrices between graphics accelerators. The technique draws upon the graph-partitioning approach used in nonoverlapping domain-decomposition methods and provides information that drives the FEM matrix-generation and assembly process so that it produces data structures for each GPU; this not only ensures load balancing and minimizes communication between GPUs, but also reflects the hierarchy of the basis functions. The concepts proposed in this paper are illustrated with examples involving sparse matrices of up to 13.9 million rows and over a billion nonzero elements.
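The two goals named in the abstract, load balancing and low inter-GPU communication, can be made concrete with a small sketch. This is not the paper's algorithm: it assumes a partition vector is already available (in practice it would come from a graph partitioner) and only measures the quality of that partition. The function name `partition_stats`, the row-list matrix layout, and the tridiagonal test matrix are all illustrative assumptions.

```python
def partition_stats(rows, part, n_parts):
    """Measure a row-wise split of a sparse matrix across n_parts devices.

    rows[i] -- column indices of the nonzeros in row i (hypothetical layout)
    part[i] -- index of the device that owns unknown i (assumed to be given,
               e.g. by a graph partitioner; it is not computed here)

    Returns (nnz, halo):
      nnz[p]  -- nonzeros stored on device p (proxy for work and memory)
      halo[p] -- distinct off-device unknowns that device p must receive
                 for one sparse matrix-vector product (proxy for communication)
    """
    nnz = [0] * n_parts
    halo = [set() for _ in range(n_parts)]
    for i, cols in enumerate(rows):
        p = part[i]
        nnz[p] += len(cols)
        for j in cols:
            if part[j] != p:
                halo[p].add(j)
    return nnz, [len(h) for h in halo]


# Toy stand-in for an FEM matrix: an 8x8 tridiagonal sparsity pattern.
n = 8
rows = [[j for j in (i - 1, i, i + 1) if 0 <= j < n] for i in range(n)]

contiguous = [0, 0, 0, 0, 1, 1, 1, 1]   # two connected blocks
interleaved = [i % 2 for i in range(n)]  # same load, poor locality

print(partition_stats(rows, contiguous, 2))   # ([11, 11], [1, 1])
print(partition_stats(rows, interleaved, 2))  # ([11, 11], [4, 4])
```

Both splits put the same number of nonzeros on each device, but the contiguous one needs a single remote vector entry per device while the interleaved one needs four. A graph partitioner aims at the first kind of split for general meshes.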
Additional information
- DOI: 10.1109/tmtt.2017.2714670
- Category: Journal publication
- Type: Article in a journal listed in the JCR
- Language: English
- Year of publication: 2017