Theses and Dissertations


Joe Crumpton

Issuing Body

Mississippi State University


Bethel, Cindy L.

Committee Member

Williams, Byron J.

Committee Member

Anderson, Derek T.

Committee Member

Swan II, J. Edward

Date of Degree


Document Type

Dissertation - Open Access


Computer Science

Degree Name

Doctor of Philosophy


James Worth Bagley College of Engineering


Department of Computer Science and Engineering


Vocal prosody (pitch, timing, loudness, etc.) and its use to convey emotions are essential components of speech communication between humans. The objective of this dissertation research was to determine the efficacy of using varying vocal prosody in robotic speech to convey emotion. Two pilot studies and two experiments were performed to address the shortcomings of previous HRI research in this area. The pilot studies were used to determine a set of vocal prosody modification values for a female voice model using the MARY speech synthesizer to convey the emotions: anger, fear, happiness, and sadness. Experiment 1 validated that participants perceived these emotions along with a neutral vocal prosody at rates significantly higher than chance. Four of the vocal prosodies (anger, fear, neutral, and sadness) were recognized at rates approaching the recognition rate (60%) of emotions in person to person speech. During Experiment 2 the robot led participants through a creativity test while making statements using one of the validated emotional vocal prosodies. The ratings of the robot’s positive qualities and the creativity scores by the participant group that heard nonnegative vocal prosodies (happiness, neutral) did not significantly differ from the ratings and scores of the participant group that heard the negative vocal prosodies (anger, fear, sadness). Therefore, Experiment 2 failed to show that the use of emotional vocal prosody in a robot’s speech influenced the participants’ appraisal of the robot or the participants’ performance on this specific task. At this time robot designers and programmers should not expect that vocal prosody alone will have a significant impact on the acceptability or the quality of human-robot interactions. Further research is required to show that multi-modal (vocal prosody along with facial expressions, body language, or linguistic content) expressions of emotions by robots will be effective at improving human-robot interactions.