Unsworth, Charles; David, Charles; Warren, Benjamin
2020-08-19
2020
https://hdl.handle.net/2292/52704
Full Text is available to authenticated members of The University of Auckland only.

Gene annotation remains a significant, non-trivial problem in bioinformatics. The two main aspects of the gene annotation problem are the identification of gene positions in the genome and the identification of each gene's biological function. In this thesis, we use experimental and simulated RNA-Seq (ribonucleic acid sequencing) data from Arabidopsis thaliana and experimental RNA-Seq data from the kiwifruit variety Actinidia chinensis, aligned to their respective host genomes, to train linear predictor and multi-layer perceptron (MLP) artificial neural networks (ANNs) to identify exon regions within 99 Arabidopsis thaliana genes and 99 Actinidia chinensis "Red5" genes. These genes were selected to represent a broad range of exon structure complexities, categorised as high, medium, or low complexity. We conclude that linear predictor models are not generally capable of accurately predicting exon regions within the genes used in this study. MLP-based ANN models, by contrast, perform well in our experiments, though the optimal ANN architecture, an MLP with one or two hidden layers, differs depending on the exon structure complexity of the genes being predicted. Finally, the RNA-Seq data produced by the simulation method used in this work are not sufficient to train the ANN models to accurately recognise exon regions in the genes used in this study.

Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher.
Restricted Item.
https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm
https://creativecommons.org/licenses/by-nc-sa/3.0/nz/
Prediction of Exon Position in Plant Genomes from RNA-Seq using Multi-Layer Perceptrons
Thesis
Copyright: The author
Q112954255