Theses and Dissertations



Luke, Edward A.

Committee Member

Jankun-Kelly, T.J.

Committee Member

Banicescu, Ioana

Committee Member

Zope, Anup

Date of Degree


Document Type

Graduate Thesis - Open Access


Computer Science

Degree Name

Master of Science (M.S.)


James Worth Bagley College of Engineering


Department of Computer Science and Engineering


Irregular applications, such as unstructured mesh operations, do not map easily onto the GPU programming paradigms endorsed by GPU manufacturers, which mostly focus on maximizing concurrency to hide latency. In this work, we show how alternative techniques focused on latency amortization can control overall latency while requiring less concurrency. We used a custom-built microbenchmarking framework to test several GPU kernels and to show how the GPU behaves under relevant workloads. We demonstrate that coalescing is not required for good performance: an uncoalesced access pattern can achieve high bandwidth, in certain circumstances exceeding 80% of the theoretical global memory bandwidth. We also make further observations on specific relevant GPU behaviors. We hope that this study opens the door to further investigation of techniques that exploit latency amortization when latency hiding does not achieve sufficient performance.
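
The two quantities the abstract refers to, the fraction of theoretical bandwidth achieved and the concurrency needed to sustain it, can be illustrated with a short back-of-the-envelope sketch. This is not taken from the thesis or its microbenchmarking framework. The function names and all numeric values are hypothetical, and the concurrency estimate assumes Little's law (data in flight = latency x bandwidth) as a simplified model, which may differ from how the thesis itself characterizes latency amortization.

```python
def bandwidth_fraction(bytes_moved, seconds, peak_bytes_per_s):
    """Fraction of the theoretical peak bandwidth that a kernel achieved."""
    return (bytes_moved / seconds) / peak_bytes_per_s


def requests_in_flight(latency_s, target_bytes_per_s, bytes_per_request):
    """Concurrent memory requests needed to sustain a target bandwidth.

    Assumes Little's law: bytes in flight = latency * bandwidth.
    """
    return latency_s * target_bytes_per_s / bytes_per_request


if __name__ == "__main__":
    # Hypothetical device figures, chosen only for illustration.
    peak = 900e9       # theoretical global memory bandwidth, bytes/s
    latency = 500e-9   # memory latency, seconds
    target = 0.8 * peak

    # Larger transfers per request spread the same latency over more
    # bytes, so fewer requests must be in flight for the same bandwidth.
    for size in (32, 128):
        n = requests_in_flight(latency, target, size)
        print(f"{size:4d} B per request: about {n:.0f} requests in flight")
```

Under this simplified model, quadrupling the bytes moved per request cuts the required number of in-flight requests by a factor of four. That is one way to reason about trading concurrency for work per request.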