dc.contributor.advisor |
Koh, Yun Sing |
|
dc.contributor.advisor |
Wicker, Jörg |
|
dc.contributor.advisor |
Junjae Lee |
|
dc.contributor.author |
Chester, Andrew |
|
dc.date.accessioned |
2021-09-01T20:27:45Z |
|
dc.date.available |
2021-09-01T20:27:45Z |
|
dc.date.issued |
2021 |
en |
dc.identifier.uri |
https://hdl.handle.net/2292/56312 |
|
dc.description.abstract |
Machine learning models continue to replace human decision making in high-stakes environments.
In fields as diverse as healthcare, college admission, and loan approval, these
algorithms can have significant effects on people's lives. Privacy has long been a concern
when dealing with sensitive information. Recently, fairness in machine learning has
become a significant concern as well. Previous research has established a well-known trade-off
between privacy and accuracy. More recently, research has investigated the trade-off
between fairness and accuracy. The focus of this thesis is the interaction between these
two concepts. This is a recent field of study, and most research focuses on the interaction
of an individual de-identification mechanism and the subsequent bias mitigation methods,
which follow a similar methodology.
In this thesis we investigate the complex interaction between privacy, fairness, and
accuracy through multiple de-identification and bias mitigation mechanisms. As motivating
examples we address the general privacy case as well as the healthcare field. We
consider a scenario where a company that has access to sensitive records, be they medical
or financial, would like to publish this data for users to analyse. The company operates
under a regulatory environment which requires it to de-identify the data to protect
the privacy of the individuals in the data. Additionally, the company would like to ensure
that the data is not biased towards any particular groups.
We first developed a technique to assess the level of fairness loss and accuracy loss
due to the de-identification process, introducing two novel measures to quantify this
loss. We applied these measures to multiple levels of de-identified data to assess the impact that
de-identification has on them. Following this, we adapted the technique to
assess the fairness gain and the accuracy loss due to bias mitigation on the original and
de-identified data, adapting our novel measures to apply them to this comparison.
Finally, we adapted a bandit-based hyper-parameter optimisation mechanism to tune
the hyper-parameters of the mitigation mechanisms to achieve a
good trade-off between fairness and accuracy on de-identified data. |
|
dc.publisher |
ResearchSpace@Auckland |
en |
dc.relation.ispartof |
Masters Thesis - University of Auckland |
en |
dc.relation.isreferencedby |
UoA |
en |
dc.rights |
Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. |
|
dc.rights.uri |
https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm |
en |
dc.rights.uri |
http://creativecommons.org/licenses/by-nc-sa/3.0/nz/ |
|
dc.title |
Balancing Utility and Fairness Against Privacy for Sensitive Data |
|
dc.type |
Thesis |
en |
thesis.degree.discipline |
Computer Science |
|
thesis.degree.grantor |
The University of Auckland |
en |
thesis.degree.level |
Masters |
en |
dc.date.updated |
2021-09-01T03:05:01Z |
|
dc.rights.holder |
Copyright: the author |
en |
dc.rights.accessrights |
http://purl.org/eprint/accessRights/OpenAccess |
en |
dc.identifier.wikidata |
Q112954992 |
|