Theses and Dissertations

Issuing Body

Mississippi State University

Advisor

Hansen, Eric A.

Committee Member

Luke, Edward A.

Committee Member

Fowler, James E.

Committee Member

Younan, Nicholas H.

Committee Member

Bridges, Susan M.

Date of Degree

12-11-2009

Document Type

Dissertation - Open Access

Major

Computer Engineering

Degree Name

Doctor of Philosophy

College

James Worth Bagley College of Engineering

Department

Department of Electrical and Computer Engineering

Abstract

State-of-the-art speech-recognition systems can successfully perform simple tasks in real time on most computers when the tasks are performed in controlled, noise-free environments. However, current algorithms and processors are not yet powerful enough for real-time large-vocabulary conversational speech recognition in noisy, real-world environments. Parallel processing can improve the real-time performance of speech-recognition systems and increase their applicability, and developing an effective approach to parallelization is especially important given the recent trend toward multi-core processor design. In this dissertation, we introduce methods for parallelizing a single-pass, across-word, n-gram, lexical-tree-based Viterbi recognizer, which is the most popular architecture for Viterbi-based large-vocabulary continuous speech recognition. We parallelize two different open-source implementations of such a recognizer, one developed at Mississippi State University and the other developed at Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University in Germany. We describe three methods for parallelization. The first, called parallel fast likelihood computation, parallelizes likelihood computations by decomposing mixtures among CPU cores, so that each core computes the likelihood of the set of mixtures allocated to it. The second, lexical-tree division, parallelizes the search-management component of a speech recognizer by dividing the lexical tree among the cores. The third, an alternative method for parallelizing the search-management component called lexical-tree copies decomposition, dynamically distributes the active lexical-tree copies among the cores. All parallelization methods were tested on two and four cores of an Intel Core2 Quad processor and significantly improved real-time performance. Several challenges for parallelizing a lexical-tree-based Viterbi speech recognizer are also identified and discussed.
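
The first method in the abstract, decomposing mixtures among cores so that each core scores only its own share, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the diagonal-covariance Gaussian mixtures, the round-robin allocation, and all function names below are assumptions made for illustration. Python threads are used only to show the decomposition; in CPython they do not give true multi-core speedup for this kind of arithmetic.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def mixture_log_likelihood(x, components):
    """Log-likelihood of x under one mixture of (weight, mean, var) components."""
    return log_sum_exp(
        [math.log(w) + log_gaussian(x, mean, var) for w, mean, var in components]
    )


def score_mixtures_serial(x, mixtures):
    """Baseline: score every mixture on a single worker."""
    return {name: mixture_log_likelihood(x, comps) for name, comps in mixtures.items()}


def score_mixtures_parallel(x, mixtures, n_workers=4):
    """Allocate mixtures to workers round-robin; each worker scores its share."""
    names = list(mixtures)
    shares = [names[i::n_workers] for i in range(n_workers)]  # hypothetical allocation

    def score_share(share):
        return {name: mixture_log_likelihood(x, mixtures[name]) for name in share}

    scores = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(score_share, shares):
            scores.update(partial)  # merge per-worker results
    return scores
```

Each worker touches a disjoint set of mixtures, so the merge step needs no locking; the parallel result should match the serial baseline for any number of workers.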

URI

https://hdl.handle.net/11668/17071

Comments

fast Gaussian calculations; fast likelihood computations; prefix tree; lexical tree; parallel speech decoding; parallel speech recognition; multi-core processors
