dc.contributor.author |
Pearman, William S |
|
dc.contributor.author |
Freed, Nikki E |
|
dc.contributor.author |
Silander, Olin K |
|
dc.coverage.spatial |
England |
|
dc.date.accessioned |
2023-11-05T22:23:27Z |
|
dc.date.available |
2023-11-05T22:23:27Z |
|
dc.date.issued |
2020-05 |
|
dc.identifier.citation |
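Pearman, W. S., Freed, N. E., & Silander, O. K. (2020). Testing the advantages and disadvantages of short- and long- read eukaryotic metagenomics using simulated reads. BMC Bioinformatics, 21(1), 220. |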
|
dc.identifier.issn |
1471-2105 |
|
dc.identifier.uri |
https://hdl.handle.net/2292/66436 |
|
dc.description.abstract |
Background: The first step in understanding ecological community diversity and
dynamics is quantifying community membership. An increasingly common method
for doing so is through metagenomics. Because of the rapidly increasing popularity
of this approach, a large number of computational tools and pipelines are available
for analysing metagenomic data. However, the majority of these tools have been
designed and benchmarked using highly accurate short read data (i.e. Illumina), with
few studies benchmarking classification accuracy for long error-prone reads (PacBio
or Oxford Nanopore). In addition, few tools have been benchmarked for non-microbial communities.
Results: Here we compare simulated long reads from Oxford Nanopore and Pacific
Biosciences (PacBio) with high-accuracy Illumina read sets to systematically investigate
the effects of sequence length and taxon type on classification accuracy for
metagenomic data from both microbial and non-microbial communities. We show that
in general, classification accuracy is far lower for non-microbial communities, even
at low taxonomic resolution (e.g. family rather than genus). We then show that for two
popular taxonomic classifiers, long reads can significantly increase classification
accuracy, and this is most pronounced for non-microbial communities.
Conclusions: This work provides insight into the expected accuracy of metagenomic
analyses for different taxonomic groups, and establishes the point at which read
length becomes more important than error rate for assigning the correct taxon. |
|
dc.format.medium |
Electronic |
|
dc.language |
eng |
|
dc.publisher |
Springer Nature |
|
dc.relation.ispartofseries |
BMC bioinformatics |
|
dc.rights |
Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher. |
|
dc.rights.uri |
https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm |
|
dc.rights.uri |
https://creativecommons.org/licenses/by/4.0/ |
|
dc.subject |
Sequence Analysis, DNA |
|
dc.subject |
Computer Simulation |
|
dc.subject |
Metagenomics |
|
dc.subject |
Eukaryota |
|
dc.subject |
High-Throughput Nucleotide Sequencing |
|
dc.subject |
Nanopore Sequencing |
|
dc.subject |
Community composition |
|
dc.subject |
Illumina |
|
dc.subject |
Long read |
|
dc.subject |
Nanopore |
|
dc.subject |
3107 Microbiology |
|
dc.subject |
31 Biological Sciences |
|
dc.subject |
3102 Bioinformatics and Computational Biology |
|
dc.subject |
3103 Ecology |
|
dc.subject |
Human Genome |
|
dc.subject |
Genetics |
|
dc.subject |
Science & Technology |
|
dc.subject |
Life Sciences & Biomedicine |
|
dc.subject |
Biochemical Research Methods |
|
dc.subject |
Biotechnology & Applied Microbiology |
|
dc.subject |
Mathematical & Computational Biology |
|
dc.subject |
Biochemistry & Molecular Biology |
|
dc.subject |
SPECIES DEFINITION |
|
dc.subject |
DNA |
|
dc.subject |
CLASSIFICATION |
|
dc.subject |
COMMUNITIES |
|
dc.subject |
TAXONOMY |
|
dc.subject |
01 Mathematical Sciences |
|
dc.subject |
06 Biological Sciences |
|
dc.subject |
08 Information and Computing Sciences |
|
dc.subject |
46 Information and computing sciences |
|
dc.subject |
49 Mathematical sciences |
|
dc.title |
Testing the advantages and disadvantages of short- and long- read eukaryotic metagenomics using simulated reads |
|
dc.type |
Journal Article |
|
dc.identifier.doi |
10.1186/s12859-020-3528-4 |
|
pubs.issue |
1 |
|
pubs.begin-page |
220 |
|
pubs.volume |
21 |
|
dc.date.updated |
2023-10-28T02:43:57Z |
|
dc.rights.holder |
Copyright: The authors |
en |
dc.identifier.pmid |
32471343 (pubmed) |
|
pubs.author-url |
https://www.ncbi.nlm.nih.gov/pubmed/32471343 |
|
pubs.publication-status |
Published |
|
dc.rights.accessrights |
http://purl.org/eprint/accessRights/OpenAccess |
en |
pubs.subtype |
research-article |
|
pubs.subtype |
Journal Article |
|
pubs.elements-id |
852969 |
|
dc.identifier.eissn |
1471-2105 |
|
dc.identifier.pii |
10.1186/s12859-020-3528-4 |
|
pubs.number |
220 |
|
pubs.record-created-at-source-date |
2023-10-28 |
|
pubs.online-publication-date |
2020-05-29 |
|