Abstract:
SARS-CoV-2 and its recent evolution are currently a major research topic; however, much less work has been done from the perspective of macroevolution, which seeks insights from long-term patterns across millions of years of coronavirus evolution. Studying the macroevolution of coronaviruses should allow a more in-depth understanding of how coronaviruses can vary, and have varied, in structure and sequence. To begin exploring the potential of macroevolutionary perspectives on coronavirus evolution, this thesis investigated the degree to which homologs of SARS-CoV-2 proteins could be detected in distant relatives, the differences in the rate of sequence evolution among these proteins, and the congruence of phylogenetic trees derived from the most conserved proteins. Following this, the patterns and rates of substitution across the coronavirus genome were compared between the modern SARS-CoV-2 outbreak, spanning a few years, and the macroevolutionary history of the family, spanning millions of years.
SARS-CoV-2 homologs were detected using HMMER, which found that the coronavirus structural proteins were the most conserved. ORF1ab was the most conserved protein, with homologs detected throughout the order Nidovirales. Conserved proteins were used to estimate the coronavirus phylogeny. These protein phylogenies showed different evolutionary patterns, suggesting that a combination of evolutionary processes has acted on coronaviruses. Amino acid substitution rates were calculated for all sites in each conserved protein at the macroevolutionary scale (the whole family) and the microevolutionary scale (the SARS-CoV-2 outbreak). Statistical tests found that macroevolutionary substitutions significantly, although only partially, predicted microevolutionary substitutions, knowledge that future studies could use to help predict evolutionary patterns in coronavirus outbreaks. This thesis strengthens the argument that studying the large-scale, macroevolutionary history of pathogen groups can help us understand pressing problems such as current and new outbreaks.