Advisor

Banicescu, Ioana

Committee Member

Hansen, Eric

Committee Member

Philip, Thomas

Date of Degree

1-1-2004

Document Type

Graduate Thesis - Open Access

Major

Computer Science

Degree Name

Master of Science

Department

Department of Computer Science

Abstract

Scientific applications are large, complex, irregular, and computationally intensive and are characterized by data parallel loops. The prevalence of independent iterations in these loops, makes parallel computing as the natural choice for solving these applications. The computational requirements of these problems vary due to variations in problem, algorithmic and systemic characteristics during parallelization, leading to performance degradation. Considerable amount of research has been dedicated to the development of dynamic scheduling techniques based on probabilistic analysis to address these predictable and unpredictable factors that lead to severe load imbalance. The mathematical foundations of these scheduling algorithms have been previously developed and published in the literature. These techniques have been successfully integrated into scientific applications as well as into runtime systems. Recently, efforts have also been directed to integrate these techniques into dynamic load balancing libraries for scientific applications. The optimal scheduling algorithm to load balance a specific scientific application in a dynamic parallel computing environment is very difficult without the exhaustive testing of all the scheduling techniques. This is a time consuming process, and therefore, there is a need for developing an automatic mechanism for the selection of dynamic scheduling algorithms. In recent years, extensive work has been dedicated to the development of reinforcement learning and some of its techniques have addressed load-balancing problems. However, they do not cover a number of aspects regarding the performance of scientific applications. First, these previously developed techniques address the load balancing problem only at a coarse granularity level (for example, job scheduling), and the reinforcement learning techniques used for load balancing are based on learning from trained datasets which are obtained prior to the execution of the application. Moreover, scientific applications contain parameters whose variations are so irregular that the use of training sets would not be able to accurately capture the entire spectrum of possible characteristics. Finally, algorithm selection using reinforcement learning has only been used for simple sequential problems. This thesis addresses these limitations and provides a novel integrated approach for automating the selection of dynamic scheduling algorithms at a finer granularity level to improve the performance of scientific applications using reinforcement learning. This integrated approach will experimentally be tested on a scientific application that involves a large number of time steps: The Quantum Trajectory Method (QTM). A qualitative and quantitative analysis of the effectiveness of this novel approach will be presented to underscore the significance of its use in improving the performance of large-scale scientific applications.

URI

https://hdl.handle.net/11668/19976

Share

COinS