Theses and Dissertations
Issuing Body
Mississippi State University
Advisor
Picone, Joseph
Committee Member
Lazarou, Georgios Y.
Committee Member
Younan, Nicholas H.
Date of Degree
12-13-2003
Document Type
Graduate Thesis - Open Access
Major
Electrical Engineering
Degree Name
Master of Science
College
James Worth Bagley College of Engineering
Department
Department of Electrical and Computer Engineering
Abstract
Supervised learning using Hidden Markov Models has been used to train acoustic models for automatic speech recognition for several years. Typically, clean transcriptions form the basis for this training regimen. However, results have shown that using readily available transcriptions, which can be erroneous at times (e.g., closed captions), does not degrade performance significantly. This work analyzes the effects of mislabeled data on recognition accuracy. For this purpose, training is performed using manually corrupted training data, and the results are observed on three different databases: TIDigits, Alphadigits and Switchboard. For Alphadigits, with 16% of the data mislabeled, the performance of the system degrades by 12% relative to the baseline results. For a complex task like Switchboard, with 16% of the training data mislabeled, the performance of the system degrades by 8.5% relative to the baseline results. The training process is robust to mislabeled data because the Gaussian mixtures that are used to model the underlying distribution tend to cluster around the majority of the correct data. The outliers (incorrect data) do not contribute significantly to the reestimation process.
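The robustness argument in the abstract can be illustrated with a toy sketch. This is not the experimental setup of the thesis: it assumes two hypothetical one-dimensional classes, each modeled by a single maximum-likelihood Gaussian rather than an HMM with Gaussian mixtures, and it flips 16% of the training labels at random. All function names, class means, and sample sizes are assumptions chosen for illustration.

```python
import math
import random
import statistics


def make_data(n_per_class, seed):
    """Draw labeled 1-D samples from two hypothetical classes (means -2 and +2)."""
    rng = random.Random(seed)
    data = []
    for label, mean in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            data.append((rng.gauss(mean, 1.0), label))
    return data


def corrupt_labels(data, fraction, seed):
    """Flip the label of a randomly chosen fraction of the samples."""
    rng = random.Random(seed)
    n_flip = int(round(fraction * len(data)))
    flip_idx = set(rng.sample(range(len(data)), n_flip))
    return [(x, 1 - y if i in flip_idx else y) for i, (x, y) in enumerate(data)]


def fit(data):
    """Maximum-likelihood mean and standard deviation for each labeled class."""
    params = {}
    for label in (0, 1):
        xs = [x for x, y in data if y == label]
        params[label] = (statistics.mean(xs), statistics.pstdev(xs))
    return params


def log_likelihood(x, mean, std):
    """Gaussian log-density up to an additive constant shared by both classes."""
    return -math.log(std) - 0.5 * ((x - mean) / std) ** 2


def classify(x, params):
    """Pick the class whose Gaussian assigns x the higher likelihood."""
    return max(params, key=lambda label: log_likelihood(x, *params[label]))


def accuracy(params, data):
    """Fraction of samples whose predicted class matches the given label."""
    return sum(classify(x, params) == y for x, y in data) / len(data)


if __name__ == "__main__":
    train = make_data(2000, seed=0)
    test = make_data(2000, seed=42)  # clean held-out data
    clean_model = fit(train)
    noisy_model = fit(corrupt_labels(train, fraction=0.16, seed=1))
    print("clean-label accuracy:    ", accuracy(clean_model, test))
    print("16%-mislabeled accuracy: ", accuracy(noisy_model, test))
    print("class means, clean:", [round(clean_model[c][0], 2) for c in (0, 1)])
    print("class means, noisy:", [round(noisy_model[c][0], 2) for c in (0, 1)])
```

In this toy setting the class means estimated from corrupted labels shift toward each other and the variances grow, but each mean stays on the side where most of its correctly labeled data lies, so the decision boundary barely moves. That is the intuition the abstract describes. The sketch says nothing about the error rates reported for the actual databases.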
URI
https://hdl.handle.net/11668/21086
Recommended Citation
Sundaram, Ramasubramanian H., "Effects of Transcription Errors on Supervised Learning in Speech Recognition" (2003). Theses and Dissertations. 1815.
https://scholarsjunction.msstate.edu/td/1815