Single-Channel Statistical Bayesian Short-Time Fourier Transform Speech Enhancement with Deterministic A Priori Information

dc.contributor.advisor Guillemin, B en McCallum, Matthew en 2015-02-25T20:45:01Z en 2015 en
dc.identifier.citation 2015 en
dc.identifier.uri en
dc.description.abstract Emergency service workers are constantly required to communicate in environments with very low acoustic signal-to-noise ratios (SNRs), where both quality and intelligibility of speech are of critical importance. Attempts to improve such aspects of speech have long been investigated under the umbrella of speech enhancement. Bayesian short-time Fourier transform (STFT) speech enhancement algorithms are a key candidate for real-time radio communications applications, as relatively good increases in speech quality can be achieved with relatively low computational complexity. In the context of empirical Bayesian statistics, a predictable or deterministic component of a speech STFT coefficient, due to information sources external to data at a given time-frequency index, may be represented as a non-zero mean in the respective a priori speech pdf, about which there is some uncertainty (i.e., non-zero variance). Additionally, considering that public service workers often encounter few, but recurring, noise sources, non-zero mean a priori pdfs are also of interest in modelling noise, where they may exploit predictable characteristics of a known noise source. Such a unimodal non-zero mean a priori speech/noise pdf is novel to Bayesian STFT speech enhancement, and the research here establishes a framework for Bayesian STFT speech enhancement under this consideration. Here, this is restricted to non-zero means representing sinusoidal signal components in both speech and noise. These components are typically underexploited in Bayesian STFT speech enhancement, and in theory, the framework established here may also be extended to more arbitrary predictable signal components. Several novel methods for the estimation of the amplitude, phase and frequency of potentially non-stationary sinusoidal deterministic components in speech and noise are presented.
These estimated signal features may then specify a non-zero mean a priori pdf, allowing the development of several novel estimators for the clean speech STFT. The parameter estimation methods, and the clean speech estimators that are dependent upon them, are then combined to form a number of speech enhancement algorithms. The ideas developed in this research result in both improved recovery of speech information and improved removal of undesirable noise features, according to a range of quality/intelligibility measures under a range of conditions. en
dc.publisher ResearchSpace@Auckland en
dc.relation.ispartof PhD Thesis - University of Auckland en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher. en
dc.rights.uri en
dc.title Single-Channel Statistical Bayesian Short-Time Fourier Transform Speech Enhancement with Deterministic A Priori Information en
dc.type Thesis en The University of Auckland en Doctoral en PhD en
dc.rights.holder Copyright: The Author en
dc.rights.accessrights en
pubs.elements-id 476904 en
pubs.record-created-at-source-date 2015-02-26 en
dc.identifier.wikidata Q112909882
