Accelerating explicit ODE methods on GPUs by kernel fusion

Title data

Korch, Matthias ; Werner, Tim:
Accelerating explicit ODE methods on GPUs by kernel fusion.
In: Concurrency and Computation. Vol. 30 (2018) Issue 18. - Art. No. e4470.
ISSN 1532-0634
DOI: https://doi.org/10.1002/cpe.4470

Abstract

Graphics processing units (GPUs) have a promising architecture for implementing highly parallel solution methods for systems of ordinary differential equations (ODEs). However, their high performance comes at the price of caveats such as small caches or wide SIMD. For ODE methods, optimizing the memory access pattern is often crucial. In this article, instead of considering only one specific method, we generalize the description of explicit ODE methods by using data flow graphs consisting of basic operations that are suitable to cover the types of computations occurring in all common explicit methods. After showing that the straightforward approach for processing the data flow graph by calling one kernel per basic operation is memory bound, we explain how the number of memory accesses can be reduced by the kernel fusion technique, which fuses several basic operations into one kernel. Moreover, we will present enabling transformations that allow additional fusions and thus can reduce the number of memory accesses even further. We apply these optimizations to three different classes of explicit ODE methods: embedded Runge–Kutta (RK) methods, parallel iterated RK (PIRK) methods, and peer methods. A detailed experimental evaluation on three modern GPUs showed speedups between 1.86 and 3.51 compared to unfused implementations.
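
The memory-traffic argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Python loops as stand-ins for GPU kernels, a single explicit Euler step as a stand-in for the RK, PIRK, and peer methods named above, and an assumed componentwise right-hand side f(y) = -2y. The names `AccessCounter`, `rhs`, `unfused_step`, and `fused_step` are hypothetical. Each list access is counted as one access to global memory.

```python
class AccessCounter:
    """Counts simulated global-memory loads and stores (illustrative only)."""
    def __init__(self):
        self.loads = 0
        self.stores = 0

    @property
    def total(self):
        return self.loads + self.stores


def rhs(y_i):
    # Assumed right-hand side f(y) = -2*y, evaluated componentwise.
    return -2.0 * y_i


def unfused_step(y, h, counter):
    """One explicit Euler step with one 'kernel' (loop) per basic operation."""
    n = len(y)

    # Kernel 1: k = f(y)
    k = [0.0] * n
    for i in range(n):
        counter.loads += 1          # read y[i]
        k[i] = rhs(y[i])
        counter.stores += 1         # write k[i]

    # Kernel 2: t = h * k
    t = [0.0] * n
    for i in range(n):
        counter.loads += 1          # read k[i]
        t[i] = h * k[i]
        counter.stores += 1         # write t[i]

    # Kernel 3: y_new = y + t
    y_new = [0.0] * n
    for i in range(n):
        counter.loads += 2          # read y[i] and t[i]
        y_new[i] = y[i] + t[i]
        counter.stores += 1         # write y_new[i]
    return y_new


def fused_step(y, h, counter):
    """The same step with all three basic operations fused into one loop."""
    n = len(y)
    y_new = [0.0] * n
    for i in range(n):
        counter.loads += 1          # read y[i] once; keep it in a local
        y_i = y[i]
        y_new[i] = y_i + h * rhs(y_i)   # intermediates never touch memory
        counter.stores += 1         # write y_new[i]
    return y_new


if __name__ == "__main__":
    y0 = [1.0, 2.0, 3.0, 4.0]
    cu, cf = AccessCounter(), AccessCounter()
    a = unfused_step(y0, 0.1, cu)
    b = fused_step(y0, 0.1, cf)
    print("same result:", a == b)
    print("unfused accesses:", cu.total, "fused accesses:", cf.total)
```

In this toy model the unfused variant performs 7 memory accesses per vector element and the fused variant 2, because the intermediate vectors `k` and `t` are never written out. These counts describe the sketch only and say nothing about the speedups reported in the article. A right-hand side that couples neighbouring components would need additional care before it could be fused in this way.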

Further data

Item Type: Article in a journal
Refereed: Yes
Keywords: CUDA; explicit methods; GPU; initial value problems; kernel fusion; locality; numerical integration; OpenCL; ordinary differential equations; parallel; peer methods; PIRK methods; Runge-Kutta methods; scalability
Institutions of the University: Faculties > Faculty of Mathematics, Physics and Computer Science > Department of Computer Science > Chair Applied Computer Science II > Chair Applied Computer Science II - Univ.-Prof. Dr. Thomas Rauber
Result of work at the UBT: Yes
DDC Subjects: 000 Computer Science, information, general works > 004 Computer science
Date Deposited: 10 Dec 2019 09:07
Last Modified: 10 Dec 2019 09:07
URI: https://eref.uni-bayreuth.de/id/eprint/46036