Abstract:
This thesis presents the results of two speech perception experiments, showing that sentence structure, delivery style, and participants’ age and language background all have an impact on the intelligibility of synthetic speech. The first experiment was delivered in person and involved 10 young participants (aged between 18 and 25) and 9 older individuals (aged over 60). The results showed that younger adults comprehended synthetic speech better than older adults, that key words placed at the end of a sentence were more easily recalled than those placed at the start, and that unfamiliar and complicated stimuli such as medication names were recalled much less successfully than simpler, familiar stimuli such as names or times. The second experiment had a very similar structure to the first but was delivered over the web, and involved 58 young adults (aged between 18 and 25) and 19 older individuals (aged over 45). This experiment contrasted participants’ recall of information from natural speech with their recall from synthetic speech in which key words were indicated either by a small pause and pitch inflection prior to the word, or by both a pause and a slowed speech rate. The findings were that native English speakers had a better overall recall rate than non-native English speakers regardless of age, and that, whilst natural speech produced the best results, emphasizing key words with either modification method markedly increased participants’ recall of synthetic speech (compared to the results of the first experiment), but at the cost of speech quality. The thesis also discusses which features of the voice have the biggest impact on voice quality, and finally it discusses the implications of the findings for the voice of the Healthcare robot, which is being developed for people over 60 years old.