Theses and Dissertations

Issuing Body

Mississippi State University


Banicescu, Ioana

Committee Member

Luke, Edward A.

Committee Member

Allen, Edward B.

Committee Member

Zhang, Song

Date of Degree


Document Type

Dissertation - Open Access

Degree Name

Doctor of Philosophy


James Worth Bagley College of Engineering


Department of Computer Science and Engineering


High performance parallel and distributed computing systems are used to solve large, complex, and data parallel scientific applications that require enormous computational power. Data parallel workloads which require performing similar operations on different data objects, are present in a large number of scientific applications, such as N-body simulations and Monte Carlo simulations, and are expressed in the form of loops. Data parallel workloads that lack precedence constraints are called arbitrarily divisible workloads, and are amenable to easy parallelization. Load imbalance that arise from various sources such as application, algorithmic, and systemic characteristics during the execution of scientific applications degrades performance. Scheduling of arbitrarily divisible workloads to address load imbalance in order to obtain better utilization of computing resources is a major area of research. Divisible load theory (DLT) and dynamic loop scheduling (DLS) algorithms are two algorithmic approaches employed in the scheduling of arbitrarily divisible workloads. Despite sharing the same goal of achieving load balancing, the two approaches are fundamentally different. Divisible load theory algorithms are linear, deterministic and platform dependent, whereas dynamic loop scheduling algorithms are probabilistic and platform agnostic. Divisible load theory algorithms have been traditionally used for performance prediction in environments characterized by known or expected variation in the system characteristics at runtime. Dynamic loop scheduling algorithms are designed to simultaneously address all the sources of load imbalance that stochastically arise at runtime from application, algorithmic, and systemic characteristics. In this dissertation, an analysis and performance evaluation of DLT and DLS algorithms are presented in the form of a scalability study and a robustness investigation. The effect of network topology on their performance is studied. A hybrid scheduling approach is also proposed that integrates DLT and DLS algorithms. The hybrid approach combines the strength of DLT and DLS algorithms and improves the performance of the scientific applications running in large scale parallel and distributed computing environments, and delivers performance superior to that which can be obtained by applying DLT algorithms in isolation. The range of conditions for which the hybrid approach is useful is also identified and discussed.



3D torus||parallel and distributed computing||robustness||scalability||dynamic loop scheduling||divisible load theory||arbitrarily divisible workloads||data parallel workloads||cluster