Development of Bioinformatic Methods for Data-Driven Forensic Short Tandem Repeat Analyses


dc.contributor.advisor Harbison, SallyAnn
dc.contributor.advisor Welch, David
dc.contributor.advisor Elliot, Douglas
dc.contributor.author Liu, Yao-Yuan (Alexander)
dc.date.accessioned 2021-11-26T00:57:47Z
dc.date.available 2021-11-26T00:57:47Z
dc.date.issued 2021 en
dc.identifier.uri https://hdl.handle.net/2292/57553
dc.description.abstract Short tandem repeats (STRs) are a class of markers that currently play an important role in forensic DNA typing. The arrival of massively parallel sequencing (MPS) platforms in forensic science has revealed previously unseen complexity and variability in these markers, and produces volumes of data too large to analyse manually. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these technologies in forensic applications, efficient tools for each stage of data processing are being developed and standardized, and faster, more accurate methods have improved on the original approaches. This Thesis reviews the current state of bioinformatic methods and tools used to analyse forensic markers sequenced on the most widely used MPS platforms, and explores data-driven methods for the analysis of STR sequencing data. Part I of the Thesis (Chapters 1-3) explores the algorithms and methodology for identifying STR sequences in sequence data. Two novel STR extraction tools that learn from data were developed as practical implementations of the data-driven approach: Fragsifier, a tool that uses a machine learning approach to detect and extract STR sequences from MPS data (Chapter 2); and STRgazer, a tool that uses a convolutional neural network to detect the presence of STRs in unprocessed ForenSeq™ primer mix B reads (Chapter 3). Part II of the Thesis (Chapters 4-6) explores the characteristics of the extracted STR data, such as the amounts of allele and artefact sequences in the data extracted by different tools (Chapter 4), and the modelling of these components to inform allele calling (Chapter 5).
After exploring existing methods, an allele caller named the Automatic denoising allele caller (autoDAC) was developed to examine the feasibility of learning all allele-calling models and thresholds from sequence extraction data and using them to call alleles from sequence extraction results (Chapter 6). Overall, the work in this Thesis introduces novel approaches and paradigms to forensic STR analysis.
dc.publisher ResearchSpace@Auckland en
dc.relation.ispartof PhD Thesis - University of Auckland en
dc.relation.isreferencedby UoA en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. en
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/nz/
dc.title Development of Bioinformatic Methods for Data-Driven Forensic Short Tandem Repeat Analyses
dc.type Thesis en
thesis.degree.discipline Forensic Science
thesis.degree.grantor The University of Auckland en
thesis.degree.level Doctoral en
thesis.degree.name PhD en
dc.date.updated 2021-11-01T04:23:41Z
dc.rights.holder Copyright: The author en
dc.rights.accessrights http://purl.org/eprint/accessRights/OpenAccess en
dc.identifier.wikidata Q112955930

