In this paper we demonstrate that it is possible to obtain considerable improvement of performance and energy aware metrics for training of deep neural networks using a modern parallel multi-GPU system, by enforcing selected, non-default power caps on the GPUs. We measure the power and energy consumption of the whole node using a professional, certified hardware power meter. For a high performance workstation with 8 GPUs, we were able to find non-default GPU power cap settings within the range of 160–200 W to improve the difference between percentage energy gain and performance loss by over 15.0%, EDP (Abbreviations and terms used are described in main text.) by over 17.3%, EDS with k = 1.5 by over 2.2%, EDS with k = 2.0 by over 7.5% and pure energy by over 25%, compared to the default power cap setting of 260 W per GPU. These findings demonstrate the potential of today’s CPU+GPU systems for configuration improvement in the context of performance-energy consumption metrics.
Autorzy
Informacje dodatkowe
- DOI
- Cyfrowy identyfikator dokumentu elektronicznego link otwiera się w nowej karcie 10.1007/978-3-031-48803-0_1
- Kategoria
- Aktywność konferencyjna
- Typ
- publikacja w wydawnictwie zbiorowym recenzowanym (także w materiałach konferencyjnych)
- Język
- angielski
- Rok wydania
- 2024