Advisor

Picone, Joseph

Committee Member

Lazarou, Georgios

Committee Member

Hansen, Eric

Date of Degree

1-1-2003

Document Type

Graduate Thesis - Open Access

Major

Computer Engineering

Degree Name

Master of Science

College

College of Engineering

Department

Department of Electrical and Computer Engineering

Abstract

Spoken language processing is one of the oldest and most natural modes of information exchange between humans beings. For centuries, people have tried to develop machines that can understand and produce speech the way humans do so naturally. The biggest problem in our inability to model speech with computer programs and mathematics results from the fact that language is instinctive, whereas, the vocabulary and dialect used in communication are learned. Human beings are genetically equipped with the ability to learn languages, and culture imprints the vocabulary and dialect on each member of society. This thesis examines the role of pattern classification in the recognition of human speech, i.e., machine learning techniques that are currently being applied to the spoken language processing problem. The primary objective of this thesis is to create a network training paradigm that allows for direct training of multi-path models and alleviates the need for complicated systems and training recipes. A traditional trainer uses an expectation maximization (EM)based supervised training framework to estimate the parameters of a spoken language processing system. EM-based parameter estimation for speech recognition is performed using several complicated stages of iterative reestimation. These stages typically are prone to human error. The network training paradigm reduces the complexity of the training process while retaining the robustness of the EM-based supervised training framework. The hypothesis of this thesis is that the network training paradigm can achieve comparable recognition performance to a traditional trainer while alleviating the need for complicated systems and training recipes for spoken language processing systems.

URI

https://hdl.handle.net/11668/18539

Comments

speech recognition||network training

Share

COinS