Impact on Forensic Voice Comparison of Speech Acquired from the GSM Mobile Phone Network

dc.contributor.advisor Guillemin, B en
dc.contributor.advisor Watson, C en
dc.contributor.author Bahuleyan Nair Thulasi Kumari, Balamurali en
dc.date.accessioned 2015-06-09T23:58:20Z en
dc.date.issued 2015 en
dc.identifier.citation 2015 en
dc.identifier.uri http://hdl.handle.net/2292/25824 en
dc.description.abstract This thesis investigates the impact of the Global System for Mobile Communications (GSM) network on forensic voice comparison (FVC). Forensic scientists have in the past often used intercepted speech recorded from the landline telephone network for evaluating the strength of evidence between suspect and offender speech data. However, these days mobile phones are a much more widely used means of communication amongst the criminal fraternity. There exists the mistaken belief among some forensic scientists that the impacts of landline and mobile phone technologies on speech parameters relevant to FVC are more or less the same. In fact, these two communication technologies are entirely different in their ways of handling the speech signal. This research focuses on the impact of one of the current mobile phone network technologies, namely the GSM network, on the speech signal and, consequently, its impact on FVC. Such findings are important for forensic scientists undertaking FVC using GSM-coded speech. Forensic speech evidence for this research was evaluated using the likelihood ratio (LR) framework. A new approach for calculating LR values, namely Principal Component Analysis Kernel Likelihood Ratio (PCAKLR), was developed as part of this research. The motivation for this was the observation that one of the existing approaches, namely Multivariate Kernel Density (MVKD), can sometimes over- or underestimate the strength of evidence when the number of input parameters exceeds approximately four. Another approach for calculating LRs, namely Gaussian Mixture Model – Universal Background Model (GMM–UBM), was not used in this research because it typically requires a large amount of data for determining the background model, something that may not be available in a real-world FVC scenario. PCAKLR is computationally robust irrespective of the number of parameters used and does not require a large amount of data for determining the background model. 
Furthermore, it is shown to produce comparable results to MVKD for small numbers of parameters. A number of key aspects of the GSM network that can negatively impact the speech signal were considered in this research, namely dynamic rate coding (DRC), frame loss (FL) and background noise (BN) at the transmitting end. Experiments were undertaken by directly driving the speech codec in this network (i.e., the Adaptive Multi-Rate (AMR) codec) to operate according to various channel conditions. Since this codec is the only component responsible for the quality of the transmitted speech under all possible modes of operation of the network, this was considered to be a better strategy than conducting a large number of experiments with speech that has been passed through an actual network. An investigation was undertaken as part of this research to determine which speech parameters give the best FVC performance when using GSM-coded speech. Mel-Frequency Cepstral Coefficients (MFCCs) marginally outperformed others and have thus been used in this research. A number of experiments have been undertaken using MFCCs to investigate the impact of DRC, FL and BN on FVC when acting separately, and these are presented. DRC relates to dynamically changing the bit rate at which speech is coded in response to GSM channel conditions. Results are presented showing that this feature surprisingly resulted in both better FVC accuracy and precision for coded speech compared with uncoded speech. FL was found to negatively impact same-speaker comparisons but, again surprisingly, it improved the accuracy of different-speaker comparisons. GSM speech recordings are often corrupted by BN picked up by the transmitter microphone. The proportion of same- and different-speaker misclassifications was slightly higher when using BN-corrupted speech and this in turn negatively impacted FVC accuracy. Of the three different types of BN investigated, babble noise was found to have the greatest impact. 
Finally, in this thesis, all three GSM features are brought together and their combined impact investigated. The result, as expected, is a reduction in FVC accuracy compared to when using clean or uncoded speech. However, the reliability of the FVC was found to be better than that for uncoded speech. Overall, it is shown that the use of GSM-coded speech does negatively impact FVC. Of the three aspects considered, FL appears to have the greatest impact. A brief investigation into the impact of mismatch between the background and testing sets has also been undertaken. It is shown that minimising this mismatch marginally improved the accuracy of the FVC analysis using coded speech, but with some small impact on reliability. en
dc.publisher ResearchSpace@Auckland en
dc.relation.ispartof PhD Thesis - University of Auckland en
dc.relation.isreferencedby UoA99264817312802091 en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher. en
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/nz/ en
dc.title Impact on Forensic Voice Comparison of Speech Acquired from the GSM Mobile Phone Network en
dc.type Thesis en
thesis.degree.discipline Electrical and Computer Engineering en
thesis.degree.grantor The University of Auckland en
thesis.degree.level Doctoral en
thesis.degree.name PhD en
dc.rights.holder Copyright: The Author en
dc.rights.accessrights http://purl.org/eprint/accessRights/OpenAccess en
pubs.elements-id 488341 en
pubs.record-created-at-source-date 2015-06-10 en
dc.identifier.wikidata Q112908074

