Performance and Energy Aware Training of a Deep Neural Network in a Multi-GPU Environment with Power Capping

Grzegorz Koszczał; Jan Dobrosolski; Mariusz Matuszek; Paweł Czarnul

doi:10.1007/978-3-031-48803-0_1

In this paper we demonstrate that performance- and energy-aware metrics for training deep neural networks on a modern parallel multi-GPU system can be improved considerably by enforcing selected, non-default power caps on the GPUs. We measure the power and energy consumption of the whole node using a professional, certified hardware power meter. For a high-performance workstation with 8 GPUs, we found non-default GPU power cap settings in the range of 160–200 W that, compared to the default power cap of 260 W per GPU, improve the difference between percentage energy gain and performance loss by over 15.0%, EDP by over 17.3%, EDS with k = 1.5 by over 2.2%, EDS with k = 2.0 by over 7.5%, and pure energy by over 25% (abbreviations and terms are defined in the main text). These findings demonstrate the potential of today's CPU+GPU systems for configuration improvement in the context of performance-energy consumption metrics.
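As a rough illustration of how two of the metrics named in the abstract can be computed, the sketch below compares a capped run against a default-cap baseline. It assumes the common definition of EDP as energy multiplied by execution time, and it assumes that performance loss means the relative increase in training time; the paper's main text is the authoritative source for both, and EDS is left out because its definition is not given here. The `Run` class, the function names, and all numbers are hypothetical and are not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One training run at a given per-GPU power cap (hypothetical container)."""
    power_cap_w: int   # per-GPU power cap in watts
    time_s: float      # training time in seconds
    energy_j: float    # whole-node energy in joules


def edp(run: Run) -> float:
    """Energy-delay product, assumed here to be energy * time."""
    return run.energy_j * run.time_s


def energy_gain_pct(baseline: Run, run: Run) -> float:
    """Percentage of energy saved relative to the baseline."""
    return 100.0 * (baseline.energy_j - run.energy_j) / baseline.energy_j


def performance_loss_pct(baseline: Run, run: Run) -> float:
    """Percentage increase in training time relative to the baseline (assumed definition)."""
    return 100.0 * (run.time_s - baseline.time_s) / baseline.time_s


def gain_minus_loss_pct(baseline: Run, run: Run) -> float:
    """Difference between percentage energy gain and percentage performance loss."""
    return energy_gain_pct(baseline, run) - performance_loss_pct(baseline, run)


def edp_improvement_pct(baseline: Run, run: Run) -> float:
    """Percentage reduction of EDP relative to the baseline."""
    return 100.0 * (edp(baseline) - edp(run)) / edp(baseline)


# Made-up illustrative values, NOT results from the paper.
baseline = Run(power_cap_w=260, time_s=100.0, energy_j=200_000.0)
capped = Run(power_cap_w=180, time_s=110.0, energy_j=160_000.0)

print(f"energy gain:      {energy_gain_pct(baseline, capped):.1f}%")
print(f"performance loss: {performance_loss_pct(baseline, capped):.1f}%")
print(f"gain minus loss:  {gain_minus_loss_pct(baseline, capped):.1f}%")
print(f"EDP improvement:  {edp_improvement_pct(baseline, capped):.1f}%")
```

Under these assumptions, a cap is worthwhile in the gain-minus-loss sense whenever the energy saved, in percent, exceeds the slowdown, in percent.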

Additional information

DOI: 10.1007/978-3-031-48803-0_1
Category: Conference activity
Type: publication in a peer-reviewed collective work (including conference proceedings)
Language: English
Year of publication: 2024

Data source: MOSTWiedzy.pl - publication "Performance and Energy Aware Training of a Deep Neural Network in a Multi-GPU Environment with Power Capping"

Publication repository - Gdańsk University of Technology
