In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode, additional modes preventing data preloading from a file and synchronization on file close if the solution is used as temporary cache only. Furthermore, we have evaluated the solution for a real application that computes powers of an adjacency matrix of a graph in parallel. We demonstrated superiority of our solution compared to a regular MPI I/O implementation for various powers and numbers of graph nodes. Finally, we presented good scalability of the solution for more than 600 processes running on a large HPC cluster.
Autorzy
Informacje dodatkowe
- DOI
- Cyfrowy identyfikator dokumentu elektronicznego link otwiera się w nowej karcie 10.1007/978-3-319-59105-6_2
- Kategoria
- Aktywność konferencyjna
- Typ
- materiały konferencyjne indeksowane w Web of Science
- Język
- angielski
- Rok wydania
- 2017