Unraveling the genetic mechanisms of complex diseases using the 3D genome
Abstract
Genetics is a major contributor to human disease. However, while association studies have identified thousands of disease-associated genetic variants, the mechanisms through which these variants contribute to disease are poorly characterized. This is largely because >90% of disease-associated variants do not fall within protein-coding regions of the genome, which makes their functional impact difficult to identify. To address this challenge, I developed a computational method that integrates evidence from genotype, gene expression, and chromatin interaction data to identify the genes that are regulated by disease-associated genetic variants, and the tissues in which the regulatory interactions are active. I analyzed 20,782 variants that are associated with 1,351 human diseases and traits to reveal the genes they control. Approximately 75% of the 16,248 gene regulatory interactions I identified have been missed in association studies because the genes are distant from their regulators in the linear DNA sequence. However, the three-dimensional structure of the genome brings the distant genes and their regulators close enough in space for regulatory interactions to occur. Unsupervised machine learning of spatially regulated genes identified disease clusters that are consistent with known multimorbid conditions. Further investigation revealed that the genes shared by phenotypes in a cluster are involved in the same biochemical pathways. These observations suggest that although multimorbid conditions may not share the same set of associated genetic variants, the variants affect common pathways, which could explain the multimorbidity. Finally, I demonstrated the application of this method to identify putative tissues and druggable genes that contribute to changes in blood metabolite levels.
The novel SNP-gene associations identified in this thesis are a valuable resource for understanding the molecular mechanisms that guide pathologic metabolite levels in human tissues, and for further investigation into disease diagnosis, drug repurposing, and personalised therapy.