Abstract:
Motivation. Chronic conditions place a considerable burden on modern healthcare systems.
In New Zealand and worldwide, cardiovascular disease (CVD) affects a significant proportion
of the population and is the leading cause of death. An enduring challenge in population health
is to accurately identify at-risk individuals within the population. Like other chronic diseases, the
course of cardiovascular disease is usually prolonged and its management necessarily long-term.
The state of the art in automated clinical decision support for risk assessment is regression-based.
While these models are useful in identifying factors contributing to risk, they have a number of
limitations: notably, they are sequence-agnostic and have a limited capacity to model time.
Objectives. The aim of this thesis is to examine whether, by explicitly modelling the temporal
dimension of patient history using sequential data, risk assessment may be improved. This thesis
investigates methods for multivariate sequential modelling, with a particular emphasis on long
short-term memory (LSTM), a type of recurrent neural network.
Methods. The Vascular Informatics Using Epidemiology and the Web (VIEW) programme data
set is linked to other routinely collected national data sets including pharmaceutical dispensing,
hospitalisation, lab test results and deaths. Selected methods are then applied to the linked
data set. The experiments focus on three levels of outcomes: risk factor (TC/HDL, the ratio of
total cholesterol to high-density lipoprotein cholesterol), disease management (medication
adherence) and CVD event (hospitalisation/CVD death).
Results. The experiments showed that temporal models are valuable for predicting
mean TC/HDL, medication adherence and CVD events over a 5-year interval. This is especially
the case for LSTM, which produced the best predictive performance among all models compared.
However, the task of making a fine-grained forecast of a risk factor remains challenging, with
LSTM unable to outperform a linear-model-based approach for predicting TC/HDL at each 90-day
quarter. In the context of adherence prediction, experiments also demonstrated that lengthening
the observation window, which allows a longer patient history to be integrated into the analytic
task, further improves the performance of LSTM.
Discussion. The findings of this thesis provide evidence that the use of deep temporal models,
particularly LSTM, in clinical decision support for CVD would be advantageous, with LSTM
significantly improving on the state of the art at all three levels of outcomes. Future work should
explore model bias for patient subgroups, the opportunities for explainable neural networks, and
application to other chronic illnesses.