Abstract:
With the advent of DNA and protein sequencing, X-ray crystallography, and other methods of analysis, researchers interested in biological questions have been presented with a wealth of data to parse and interpret. Additionally, some hypotheses, such as those pertaining to the origin of life, are difficult to test experimentally using real molecules or systems. Computational and statistical models have therefore become essential tools for understanding the natural world. Such models can be applied to questions ranging from the origin of life on Earth to the spread of infectious diseases. This work provides background on and descriptions of novel computational methods pertaining to: (a) a simulation study of sets of catalytic molecules that carry a minimal amount of genetic information for replication; (b) a new substitution method for phylogenetic inference on the aminoacyl-tRNA synthetase enzymes, which are responsible for attaching amino acids to tRNA molecules and are thereby instrumental to the functioning of the genetic code; and (c) the development, testing, and implementation of a new model for joint phylogenetic and epidemiological inference from the sequences of rapidly evolving infectious agents, the results it generates, and a comparison of two models of this type that differ in how they describe branching processes.