Statistical methods in clinical proteomic studies 2007-2014 -A protein concerto-

Reference

2014

Degree Grantor

The University of Auckland

Abstract

Clinical proteomics is a branch of systems biology that investigates large numbers of protein biomarkers associated with human disease. Like the other “omics”, proteomics uses systems biology techniques to identify proteome-wide markers simultaneously. Unlike genomics, which has been established for decades, proteomics is still in its infancy. Current biotechnologies have limited power to detect all of the more than 20,000 proteins in the human body, and biologists do not yet understand the molecular functions of many of the proteins that have been identified. Statistical techniques are essential in proteomics research because clinical proteomic studies generate large amounts of quantitative information about proteins’ molecular activities. The complexity of clinical studies and proteomic experiments also requires statistical input to achieve valid and unbiased inferences. This PhD research first proposed a new method to assess reproducibility in clinical proteomic studies when a new device or new tissue is used for a proteomic experiment. The reproducibility assessment uses a dimension reduction technique and a permutation method to extend the evaluation from the scale of a single feature to a proteome-wide scale. Second, it proposed algorithms to optimize the design of a multiple-stage study that bridges biomarker discovery and clinical utility. The optimal design algorithms use a hybrid simulated annealing approach to find the design parameters that maximize the number of discoveries under constraints on cost and on the number of false discoveries. These algorithms were implemented in an R package named “proteomicdesign”. Finally, a multivariate multilevel model was proposed for the analysis of proteomic data. The non-random missing data that arise in proteomic mass spectrometric experiments were estimated under a Bayesian framework. The proposed analytical method was tested in a simulation study and applied in two real-life clinical proteomic studies.
