A speech recognition system utilising a word-graph parser

dc.contributor.advisor Macdonald, Bruce
dc.contributor.author Waters, Christopher J. F. en
dc.date.accessioned 2020-07-08T05:04:11Z en
dc.date.available 2020-07-08T05:04:11Z en
dc.date.issued 2000 en
dc.identifier.uri http://hdl.handle.net/2292/52296 en
dc.description Full text is available to authenticated members of The University of Auckland only. en
dc.description.abstract A language model is used by a speech recogniser to guide a search process or to choose between alternative recognition hypotheses. The language model makes its choice on the basis of the likelihood that the hypothesis could be generated by the model. Language models that incorporate information about longer-range grammatical effects, such as phrase structure grammars, tend to be computationally expensive and so are not commonly used. The hypotheses generated by the acoustic search component of a speech recogniser are typically represented as a simple list, called an N-best list, or as a word-graph. A word-graph compactly stores the hypotheses in a network structure with a single branch for words or phrases common to multiple hypotheses. In this thesis a new phrase structure grammar parser capable of operating on word-graphs is described. Based on the Earley parser, this graph parser parses every hypothesis in a word-graph in a single traversal of the graph. This results in a significant computational saving and an increase in recogniser accuracy, as it allows more hypotheses to be considered. Experiments have been performed with the graph parser on word-graphs generated from the common Resource Management benchmark. The graph parser achieved an 8.0% reduction in word error rate on the speaker-dependent task. At the same time, it required only 13% of the computation that processing an equivalent N-best list would require. During the course of the research a new, flexible speech recognition architecture, called ARISTOTLE, was developed. ARISTOTLE is a speaker-independent, large-vocabulary, continuous speech system. Through its built-in scripting language, modular structure, client-server communications, and implementation of the most commonly used algorithms, ARISTOTLE is a tool for research into new methods and techniques used in speech recognition. en
dc.publisher ResearchSpace@Auckland en
dc.relation.ispartof PhD Thesis - University of Auckland en
dc.relation.isreferencedby UoA9994867914002091 en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. en
dc.rights Restricted Item. Full text is available to authenticated members of The University of Auckland only. en
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.title A speech recognition system utilising a word-graph parser en
dc.type Thesis en
thesis.degree.discipline Electrical and Electronic Engineering en
thesis.degree.grantor The University of Auckland en
thesis.degree.level Doctoral en
thesis.degree.name PhD en
dc.rights.holder Copyright: The author en
dc.identifier.wikidata Q112902890

