A speech recognition system utilising a word-graph parser

Waters, Christopher J. F.

dc.contributor.advisor	Macdonald, Bruce
dc.contributor.author	Waters, Christopher J. F.	en
dc.date.accessioned	2020-07-08T05:04:11Z	en
dc.date.available	2020-07-08T05:04:11Z	en
dc.date.issued	2000	en
dc.identifier.uri	http://hdl.handle.net/2292/52296	en
dc.description	Full text is available to authenticated members of The University of Auckland only.	en
dc.description.abstract	A language model is used by a speech recogniser to guide a search process or to choose between alternative recognition hypotheses. The language model makes its choice on the basis of the likelihood that the hypothesis could be generated by the model. Language models that incorporate information about longer-range grammatical effects, such as phrase structure grammars, tend to be computationally expensive and so are not commonly used. The hypotheses generated by the acoustic search component of a speech recogniser are typically represented as a simple list, called an N-best list, or as a word-graph. A word-graph compactly stores the hypotheses in a network structure with a single branch for words or phrases common to multiple hypotheses. In this thesis, a new phrase structure grammar parser capable of operating on word-graphs is described. Based on the Earley parser, this graph parser parses every hypothesis in a word-graph in a single traversal of the graph. This results in a significant computational saving and an increase in recogniser accuracy, as it allows more hypotheses to be considered. Experiments have been performed with the graph parser on word-graphs generated from the common Resource Management benchmark. The graph parser achieved an 8.0% reduction in word error rate on the speaker-dependent task. At the same time, it required only 13% of the computation that processing an equivalent N-best list would require. During the course of the research, a new, flexible speech recognition architecture, called ARISTOTLE, was developed. ARISTOTLE is a speaker-independent, large-vocabulary, continuous speech system. Through the use of a built-in scripting language, a modular structure, client-server communications, and implementations of the most commonly used algorithms, ARISTOTLE is a tool for research into new methods and techniques used in speech recognition.	en
dc.publisher	ResearchSpace@Auckland	en
dc.relation.ispartof	PhD Thesis - University of Auckland	en
dc.relation.isreferencedby	UoA9994867914002091	en
dc.rights	Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated.	en
dc.rights	Restricted Item. Full text is available to authenticated members of The University of Auckland only.	en
dc.rights.uri	https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm	en
dc.title	A speech recognition system utilising a word-graph parser	en
dc.type	Thesis	en
thesis.degree.discipline	Electrical and Electronic Engineering	en
thesis.degree.grantor	The University of Auckland	en
thesis.degree.level	Doctoral	en
thesis.degree.name	PhD	en
dc.rights.holder	Copyright: The author	en
dc.identifier.wikidata	Q112902890
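
The abstract's contrast between an N-best list and a word-graph can be illustrated with a small sketch. This is not the thesis's parser or the ARISTOTLE code: the edge list, the node numbering, and the function name `enumerate_hypotheses` are all hypothetical, chosen only to show how a word-graph keeps a single branch for words shared by several hypotheses, whereas an N-best list repeats them in every entry.

```python
from collections import defaultdict

# Hypothetical toy word-graph: each edge is (from_node, to_node, word).
# Words common to several hypotheses ("show", "in", "port") appear once.
EDGES = [
    (0, 1, "show"),
    (1, 2, "the"),
    (1, 2, "a"),
    (2, 3, "ships"),
    (2, 3, "chips"),
    (3, 4, "in"),
    (4, 5, "port"),
]


def enumerate_hypotheses(edges, start, end):
    """Expand an acyclic word-graph into the N-best-style list it encodes."""
    outgoing = defaultdict(list)
    for src, dst, word in edges:
        outgoing[src].append((dst, word))

    results = []

    def walk(node, words):
        # Reaching the end node completes one hypothesis.
        if node == end:
            results.append(" ".join(words))
            return
        for dst, word in outgoing[node]:
            walk(dst, words + [word])

    walk(start, [])
    return results


hypotheses = enumerate_hypotheses(EDGES, 0, 5)

# Word tokens stored by each representation of the same hypothesis set.
n_best_words = sum(len(h.split()) for h in hypotheses)
graph_words = len(EDGES)

print(len(hypotheses), "hypotheses")
print(n_best_words, "word tokens in the list form")
print(graph_words, "word edges in the graph form")
```

In this toy case the graph stores each shared word once, so a parser that works directly on the graph has fewer word tokens to process than one that parses each list entry separately. The savings reported in the abstract come from the thesis's own experiments, not from this sketch.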