Abstract:
This thesis presents the development of customised synthetic voices using speech enhancement for rapid voice adaptation. It begins with background on the relevant concepts of array processing and microphone array beamforming as they pertain to the speech enhancement algorithm implemented. This is followed by a discussion of the voice building aspects of text-to-speech systems and hidden Markov model based speech synthesis, which is linked to an existing rapid voice adaptation system for creating customised synthetic voices. The framework of the speech enhancement algorithm is detailed and demonstrated using a simulation and an acoustic analysis of test material. The two systems, speech enhancement and voice adaptation, are merged to develop a platform for creating adapted synthetic voices. Voices are developed by using the speech enhancement algorithm to condition a training corpus that serves as seeding material for voice adaptation. Some of the issues arising during the merger are addressed before synthetic voices are developed using a practically measured environment. The new voices created by this system motivated a perception study with thirty participants. The study tested synthetic voice discrimination and synthetic voice quality as a function of the degree of speech enhancement and of varying noise conditions. The results showed that at low noise levels speech enhancement was of questionable benefit. However, at higher noise levels, speech enhancement improved the perceived quality of the synthetic voices.