Balancing Utility and Fairness Against Privacy for Sensitive Data

dc.contributor.advisor Koh, Yun Sing
dc.contributor.advisor Wicker, Jörg
dc.contributor.advisor Lee, Junjae
dc.contributor.author Chester, Andrew
dc.date.accessioned 2021-09-01T20:27:45Z
dc.date.available 2021-09-01T20:27:45Z
dc.date.issued 2021 en
dc.identifier.uri https://hdl.handle.net/2292/56312
dc.description.abstract Machine learning models continue to replace human decision making in high-stakes environments. In fields as diverse as healthcare, college acceptance, and loan approval, these algorithms can have significant effects on people's lives. Privacy has long been a concern when dealing with sensitive information. Recently, fairness in machine learning has become a significant concern as well. Previous research has established a well-known trade-off between privacy and accuracy. More recently, research has investigated the trade-off between fairness and accuracy. The focus of this thesis is the interaction between these two concepts. This is a recent field of study, and most research focuses on the interaction of an individual de-identification mechanism and the subsequent bias mitigation methods that follow a similar methodology. In this thesis we investigate the complex interaction between privacy, fairness, and accuracy through multiple de-identification and bias mitigation mechanisms. As motivating examples we address the general privacy case as well as the healthcare field. We consider a scenario where a company that has access to sensitive records, be they medical or financial, would like to publish this data for users to analyse. The company operates under a regulatory environment which requires it to de-identify the data to protect the privacy of the individuals in the data. Additionally, the company would like to ensure that the data is not biased towards any particular groups. We first developed a technique to assess the fairness loss and accuracy loss due to the de-identification process, along with two novel measures to quantify this loss. We applied this technique to multiple levels of de-identified data to assess the impact that de-identification has on these measures. Following this, we adapted the technique to assess the fairness gain and the accuracy loss due to bias mitigation on the original and de-identified data, adapting our novel measures to this comparison. Finally, we adapted a bandit-based hyper-parameter optimisation mechanism to tune the hyper-parameters of the mitigation mechanisms and achieve a good trade-off between fairness and accuracy on de-identified data.
dc.publisher ResearchSpace@Auckland en
dc.relation.ispartof Masters Thesis - University of Auckland en
dc.relation.isreferencedby UoA en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/nz/
dc.title Balancing Utility and Fairness Against Privacy for Sensitive Data
dc.type Thesis en
thesis.degree.discipline Computer Science
thesis.degree.grantor The University of Auckland en
thesis.degree.level Masters en
dc.date.updated 2021-09-01T03:05:01Z
dc.rights.holder Copyright: the author en
dc.rights.accessrights http://purl.org/eprint/accessRights/OpenAccess en
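As an illustration of the kind of comparison the abstract describes, the sketch below trains the same classifier on the original and the de-identified features and reports the accuracy loss and fairness loss introduced by de-identification. This is a minimal sketch only: the thesis's two novel measures are not reproduced here, so standard stand-ins (accuracy and the demographic parity difference) are used, and all names (deidentification_loss, X_orig, X_deid, group) are illustrative assumptions rather than the thesis's own.

    # Illustrative sketch only; accuracy and demographic parity difference
    # stand in for the thesis's two (unpublished here) novel measures.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    def demographic_parity_difference(y_pred, group):
        """Absolute gap in positive-prediction rates between two groups."""
        y_pred, group = np.asarray(y_pred), np.asarray(group)
        return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

    def deidentification_loss(X_orig, X_deid, y, group):
        """Fit the same model on original and de-identified features and
        report the accuracy loss and fairness loss due to de-identification."""
        scores = {}
        for name, X in (("original", X_orig), ("de-identified", X_deid)):
            model = LogisticRegression(max_iter=1000).fit(X, y)
            # Resubstitution predictions for brevity; a held-out split
            # would be used in practice.
            y_pred = model.predict(X)
            scores[name] = (accuracy_score(y, y_pred),
                            demographic_parity_difference(y_pred, group))
        accuracy_loss = scores["original"][0] - scores["de-identified"][0]
        fairness_loss = scores["de-identified"][1] - scores["original"][1]
        return accuracy_loss, fairness_loss

Applying this at several de-identification levels (e.g. increasing generalisation of quasi-identifiers) would trace out the kind of privacy/fairness/accuracy trade-off curve the abstract refers to.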
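The final contribution mentions a bandit-based hyper-parameter optimisation mechanism. The sketch below is a minimal successive-halving style search, a common bandit-based approach, standing in for whatever mechanism the thesis actually adapts; the evaluate callback and the eta and budget parameters are illustrative assumptions.

    # Illustrative sketch only: successive halving as a stand-in for the
    # thesis's bandit-based hyper-parameter optimisation mechanism.
    def successive_halving(configs, evaluate, budget=1, eta=2):
        """Evaluate all surviving configs on a growing budget, keep the
        top 1/eta fraction each round, and return the last survivor."""
        survivors = list(configs)
        while len(survivors) > 1:
            scored = sorted(((evaluate(c, budget), c) for c in survivors),
                            key=lambda t: t[0], reverse=True)
            survivors = [c for _, c in scored[:max(1, len(scored) // eta)]]
            budget *= eta
        return survivors[0]

In this setting, evaluate(config, budget) might fit a bias mitigation mechanism with the given hyper-parameters on a budget-sized sample of the de-identified data and return, for example, accuracy minus a weighted demographic parity difference, so that the surviving configuration balances fairness against accuracy.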

