Mississippi State University
Jones, A. Bryan
Follett, F. Randolph
Date of Degree
Graduate Thesis - Open Access
Master of Science
James Worth Bagley College of Engineering
Department of Electrical and Computer Engineering
Large-scale distributed computing systems such as data centers are hosted on heterogeneous, networked servers that execute in a dynamic and uncertain operating environment, shaped by factors such as time-varying user workload and various failures. Achieving stringent quality-of-service goals is therefore a challenging task that requires a comprehensive approach to performance control, fault diagnosis, and failure recovery. This work presents a model-based approach to fault management that integrates limited lookahead control (LLC), diagnosis, and fault-tolerance concepts in order to (1) enable systems to adapt to environment variations, (2) maintain the availability and reliability of the system, and (3) facilitate system recovery from failures. This thesis focuses on memory leak errors. A characterization function is designed to detect memory leaks. LLC is then applied to enable the computing system to adapt efficiently to variations in the workload, recover from memory leaks, and maintain functionality.
Wang, Zimin, "A model-based approach for automatic recovery from memory leaks in enterprise applications" (2011). Theses and Dissertations. 186.