A speech recognition system utilising a word-graph parser

Waters, Christopher J. F.

Identifier: http://hdl.handle.net/2292/52296

Issue Date: 2000

Degree Name: PhD

Degree Grantor: The University of Auckland

Rights: Copyright: The author

Rights (URI): https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm

Abstract:

A language model is used by a speech recogniser to guide a search process or to choose between alternative recognition hypotheses. The language model makes its choice on the basis of the likelihood that the hypothesis could be generated by the model. Language models that incorporate information about longer-range grammatical effects, such as phrase structure grammars, tend to be computationally expensive and so are not commonly used. The hypotheses generated by the acoustic search component of a speech recogniser are typically represented as a simple list, called an N-best list, or as a word-graph. A word-graph compactly stores the hypotheses in a network structure with a single branch for words or phrases common to multiple hypotheses. In this thesis a new phrase structure grammar parser capable of operating on word-graphs is described. Based on the Earley parser, this graph parser parses every hypothesis in a word-graph in a single traversal of the graph. This results in a significant computational saving and an increase in recogniser accuracy, as it allows more hypotheses to be considered. Experiments have been performed with the graph parser on word-graphs generated from the common Resource Management benchmark. The graph parser achieved an 8.0% reduction in word error rate on the speaker-dependent task. At the same time, it required only 13% of the computation that processing an equivalent N-best list would require. During the course of the research a new, flexible speech recognition architecture, called ARISTOTLE, was developed. ARISTOTLE is a speaker-independent, large-vocabulary, continuous speech system. Through its built-in scripting language, modular structure, client-server communications, and implementation of the most commonly used algorithms, ARISTOTLE is a tool for research into new methods and techniques used in speech recognition.
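The difference between an N-best list and a word-graph can be illustrated with a minimal Python sketch. It is not taken from the thesis: the edge list, the example words, and the function names `build_graph` and `enumerate_hypotheses` are all assumptions made for illustration. The sketch expands a small word-graph into the flat list of hypotheses it encodes and compares the number of stored words in each representation.

```python
from collections import defaultdict

# Hypothetical word-graph: each edge is (from_node, to_node, word).
# Node 0 is the start node and node 4 is the end node; the words are invented.
EDGES = [
    (0, 1, "show"),
    (1, 2, "the"),
    (1, 2, "a"),
    (2, 3, "ship"),
    (2, 3, "ships"),
    (3, 4, "position"),
]


def build_graph(edges):
    """Index the edges by their source node."""
    graph = defaultdict(list)
    for src, dst, word in edges:
        graph[src].append((dst, word))
    return graph


def enumerate_hypotheses(graph, start, end):
    """Expand the word-graph into the flat list of word sequences it encodes."""
    if start == end:
        return [[]]
    hypotheses = []
    for dst, word in graph[start]:
        for tail in enumerate_hypotheses(graph, dst, end):
            hypotheses.append([word] + tail)
    return hypotheses


graph = build_graph(EDGES)
n_best = enumerate_hypotheses(graph, 0, 4)

# Words a per-hypothesis process would have to visit in the flat list,
# versus the word edges stored once in the graph.
words_in_list = sum(len(hypothesis) for hypothesis in n_best)
print(len(n_best), "hypotheses,", words_in_list, "words in the list,",
      len(EDGES), "edges in the graph")
```

In this toy case the shared words "show" and "position" are stored once in the graph but repeated in every entry of the list. That sharing is what a parser traversing the graph directly can exploit. The figures printed here say nothing about the savings reported in the abstract.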

Description:

Full text is available to authenticated members of The University of Auckland only.

Files in this item

Name: Waters-2000-whole.pdf

Size: 11.57 MB

Format: PDF

This item appears in the following Collection(s)

Doctoral Theses - Authenticated Access [1680]
