Theses and Dissertations
ORCID
https://orcid.org/0000-0003-2289-0780
Issuing Body
Mississippi State University
Advisor
Luke, Edward A.
Committee Member
Jankun-Kelly, T.J.
Committee Member
Banicescu, Ioana
Committee Member
Zope, Anup
Date of Degree
8-9-2022
Document Type
Graduate Thesis - Open Access
Major
Computer Science
Degree Name
Master of Science (M.S.)
College
James Worth Bagley College of Engineering
Department
Department of Computer Science and Engineering
Abstract
Irregular applications, such as unstructured mesh operations, do not map easily onto the GPU programming paradigms endorsed by GPU manufacturers, which mostly focus on maximizing concurrency for latency hiding. In this work, we show how alternative techniques focused on latency amortization can be used to control overall latency while requiring less concurrency. We used a custom-built microbenchmarking framework to test several GPU kernels and show how the GPU behaves under relevant workloads. We demonstrate that coalescing is not required for good performance: an uncoalesced access pattern can achieve high bandwidth, in certain circumstances exceeding 80% of the theoretical global memory bandwidth. We also make further observations on specific relevant behaviors of GPUs. We hope that this study opens the door for further investigation into techniques that can exploit latency amortization when latency hiding does not achieve sufficient performance.
Recommended Citation
Winans-Pruitt, Dalton R., "GPGPU microbenchmarking for irregular application optimization" (2022). Theses and Dissertations. 5591.
https://scholarsjunction.msstate.edu/td/5591
Included in
Numerical Analysis and Scientific Computing Commons, Software Engineering Commons, Systems Architecture Commons