Towards the Creation of Customised Synthetic Voices using Hidden Markov Models on a Healthcare Robot

dc.contributor.advisor Watson, C en
dc.contributor.author Jain, Sahil en
dc.date.accessioned 2015-11-19T23:27:12Z en
dc.date.issued 2015 en
dc.identifier.citation 2015 en
dc.identifier.uri http://hdl.handle.net/2292/27521 en
dc.description Full text is available to authenticated members of The University of Auckland only. en
dc.description.abstract This thesis presents the design and implementation of a novel system for the creation of customised synthetic voices for laypersons using a healthcare robot, using hidden Markov models. First the relevant background knowledge of text-to-speech systems and hidden Markov models, especially as they pertain to speech synthesis, is provided. The text-to-speech system, OpenMARY, is described. Steps taken to build the first New Zealand English hidden Markov model synthetic voices using OpenMARY are detailed, including the steps taken to implement New Zealand English as a new locale of English in OpenMARY. Loudness variation issues due to global variance for the New Zealand English female synthetic voices are described, and the use of frequency warping and mel-cepstral coefficient order to mitigate these effects is discussed. A perception study is conducted on the newly developed New Zealand English synthetic voices, based on the design of the Blizzard Challenge, which assesses similarity, naturalness, multidimensional scaling, and intelligibility of the voices. The analysis of the survey, as well as a metadata analysis, is provided. Finally, the Rapid Voice Adaptation System, a novel system designed to facilitate the creation of customised synthetic voices, is described. The creation of the "average" speaker for New Zealand English is described. The client-server/daemon architecture of the system is discussed, with the client situated on the healthcare robot, and the server/daemon component situated on a remote server machine. The usage of the client on the healthcare robot to record audio and transmit the data to the server using HTTP is described, and the subsequent adaptation of the target speaker by the daemon is detailed. The retrieval of the created synthetic voice by the healthcare robot is also discussed.
It is found that spectral dissonance between the target speaker and the "average" speaker, such as due to noise and echo cancellation, negatively impacts voice adaptation. The Rapid Voice Adaptation System is shown to be effective in facilitating the creation of customised synthetic voices by laypersons. en
dc.publisher ResearchSpace@Auckland en
dc.relation.ispartof Masters Thesis - University of Auckland en
dc.relation.isreferencedby UoA99264808508402091 en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher. en
dc.rights Restricted Item. Available to authenticated members of The University of Auckland. en
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/nz/ en
dc.title Towards the Creation of Customised Synthetic Voices using Hidden Markov Models on a Healthcare Robot en
dc.type Thesis en
thesis.degree.discipline Electrical Engineering en
thesis.degree.grantor The University of Auckland en
thesis.degree.level Masters en
dc.rights.holder Copyright: The Author en
pubs.elements-id 505674 en
pubs.record-created-at-source-date 2015-11-20 en



Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc-sa/3.0/nz/
