Discovering Functional Dependencies From Possibilistic Data

He, Senyang

dc.contributor.advisor	Link, S	en
dc.contributor.author	He, Senyang	en
dc.date.accessioned	2016-11-06T20:38:22Z	en
dc.date.issued	2016	en
dc.identifier.uri	http://hdl.handle.net/2292/30970	en
dc.description	Full text is available to authenticated members of The University of Auckland only.	en
dc.description.abstract	Functional dependencies can express important relationships in data. They can therefore help enforce the semantics of an application domain within a database system. This has applications in many database tasks, including database design, data consistency, update and query processing. Traditional functional dependencies (FDs) were introduced for applications in which data occurs with full certainty. An important problem, that has been studied over the last three decades, concerns the discovery of those functional dependencies that hold on a given data set. In modern applications, such as information extractions, financial risk assessment, or sensor management, data is inherently uncertain. Recently, a new notion of possibilistic functional dependency (pFD) has been introduced. In this data model, each tuple is assigned one of finitely many available degrees of possibility with which it is perceived to occur in the data. Each FD is assigned one of finitely many available degrees of certainty that says to which tuples the FD applies. The new notion of pFD has been shown to have important applications in the schema design of databases. In this thesis, we study the problem of discovering those pFDs that hold on a given possibilistic data set. For this purpose, we review several FD discovery algorithms. We then describe how to extend these algorithms to the discovery problem of pFDs. The extended algorithms have been implemented and their performance in terms of time and memory usage have been tested on real-world benchmark data sets whose tuples have been augmented by randomly assigned degrees of possibility. In addition, we apply canonical cover algorithms to reduce the size of the output of the discovery algorithms, showing that little time is necessary to significantly reduce the size of the output's representation. Given particular features of the input, such as the number of rows, number of columns and number of available possibility degrees, we identify those algorithms that perform the fastest on data sets with these features. This provides guidelines for practitioners to choose the most suitable algorithm for a given data set at hand. Our results support and extend recent findings for the performance of algorithms that address the traditional FD discovery problem.	en
dc.publisher	ResearchSpace@Auckland	en
dc.relation.ispartof	Masters Thesis - University of Auckland	en
dc.relation.isreferencedby	UoA99264880713602091	en
dc.rights	Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher.	en
dc.rights	Restricted Item. Available to authenticated members of The University of Auckland.	en
dc.rights.uri	https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/nz/	en
dc.title	Discovering Functional Dependencies From Possibilistic Data	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Science	en
thesis.degree.grantor	The University of Auckland	en
thesis.degree.level	Masters	en
dc.rights.holder	Copyright: The author	en
pubs.elements-id	544787	en
pubs.record-created-at-source-date	2016-11-07	en
dc.identifier.wikidata	Q112924961

Files in this item

Name: whole.pdf

Size: 881.8Kb

Format: PDF

This item appears in the following Collection(s)

Masters Theses - Authenticated Access [6749]

Show simple item record

Discovering Functional Dependencies From Possibilistic Data

Files in this item

This item appears in the following Collection(s)

Search ResearchSpace

Browse

All of ResearchSpace

This Collection

Statistics

Discovering Functional Dependencies From Possibilistic Data

Files in this item

This item appears in the following Collection(s)

Share

Search ResearchSpace

Browse

All of ResearchSpace

This Collection

Statistics