Theses and Dissertations

Author

Gaurav Marwah

Issuing Body

Mississippi State University

Advisor

Hansen, A. Eric

Date of Degree

8-6-2005

Document Type

Graduate Thesis - Open Access

Major

Computer Science

Degree Name

Master of Science

College

James Worth Bagley College of Engineering

Department

Department of Computer Science and Engineering

Abstract

A partially observable Markov decision process (POMDP) is a mathematical framework for planning and control problems in which actions have stochastic effects and observations provide uncertain state information. It is widely used for research in decision-theoretic planning and reinforcement learning. % To cope with partial observability, a policy (or plan) must use memory, and previous work has shown that a finite-state controller provides a good policy representation. This thesis considers a previously-developed bounded policy iteration algorithm for POMDPs that finds policies that take the form of stochastic finite-state controllers. Two new improvements of this algorithm are developed. First improvement provides a simplification of the basic linear program, which is used to find improved controllers. This results in a considerable speed-up in efficiency of the original algorithm. Secondly, a branch and bound algorithm for adding the best possible node to the controller is presented, which provides an error bound and a test for global optimality. Experimental results show that these enhancements significantly improve the algorithm's performance.

URI

https://hdl.handle.net/11668/15456

Share

COinS