A Hybrid Clustering-Association Rule Mining Framework for Medical Knowledge Discovery

dc.contributor.advisor Warren, JR en
dc.contributor.advisor Riddle, P en
dc.contributor.author Song, Shen en
dc.date.accessioned 2017-10-16T01:23:11Z en
dc.date.issued 2017 en
dc.identifier.uri http://hdl.handle.net/2292/36070 en
dc.description.abstract To improve the management of hospital readmissions related to heart disease, this thesis proposes a machine-learning framework for modeling patients at risk of preventable hospital readmission. Unlike traditional statistical methods, the proposed framework integrates Association Rule Mining (ARM) and clustering techniques to build at-risk identification models that offer new insights into at-risk populations. These insights allow general practitioners to view their patients in groups characterized by the predicted risk factors. They are intended to complement conventional regression models, which are limited in scope to the significance and weight of specific expected predictors. The proposed framework is called the ‘Hybrid Clustering-ARM framework’ (HCA framework). To assess the feasibility of the HCA framework experimentally, we obtained approval to access two data sources: the Framingham Heart Study, a well-known historical dataset for heart events; and the New Zealand VIEW (Vascular Informatics Using Epidemiology and the Web) dataset, which is relatively new. We applied the HCA framework to both data sources with a series of sensitivity analyses. To some extent, the HCA framework is able to produce a model that identifies risk factors as well as traditional regression-based models do. Beyond the traditional perspective of detecting risk factors in medical prediction models, the identification model derived by the HCA framework provides insight into ‘at-risk’ patients in clusters as well as ‘low-risk’ patients. The detected ‘at-risk’ patients can be mapped into multiple clusters, which brings the understanding of ‘at-risk’ patients closer to the natural distribution of the sampled patients. Theoretically, all sampled patients are at some risk of having the cardiovascular disease (CVD) conditions of interest. However, some of the sampled patients are more likely to develop the disease than others. Segmenting the sampled patients into multiple ‘at-risk’ clusters and one ‘low-risk’ group helps general practitioners (or others with an interest, e.g. public health physicians, cardiologists, or health policy and resource planners) to categorize their patients for better health management. en
dc.publisher ResearchSpace@Auckland en
dc.relation.ispartof PhD Thesis - University of Auckland en
dc.relation.isreferencedby UoA99264936012102091 en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher. en
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.title A Hybrid Clustering-Association Rule Mining Framework for Medical Knowledge Discovery en
dc.type Thesis en
thesis.degree.discipline Computer Science en
thesis.degree.grantor The University of Auckland en
thesis.degree.level Doctoral en
thesis.degree.name PhD en
dc.rights.holder Copyright: The author en
dc.rights.accessrights http://purl.org/eprint/accessRights/OpenAccess en
pubs.elements-id 694034 en
pubs.record-created-at-source-date 2017-10-16 en
