Discovery and Ranking of Functional Dependencies

dc.contributor.author Wei, Z en
dc.contributor.author Link, S en
dc.date.accessioned 2020-01-10T01:36:46Z en
dc.date.available 2020-01-10T01:36:46Z en
dc.date.issued 2019 en
dc.identifier.citation CDMTCS Research Reports CDMTCS-531 (2019) en
dc.identifier.issn 1178-3540 en
dc.identifier.uri http://hdl.handle.net/2292/49481 en
dc.description.abstract Computing the functional dependencies that hold on a given data set is one of the most important problems in data profiling. Our research advances the state of the art in various ways. Utilizing new data structures and original techniques for the dynamic computation of stripped partitions, we devise a new hybridization strategy that outperforms the best existing algorithms in terms of efficiency, column scalability, and row scalability. This is demonstrated on real-world benchmark data. We show that current outputs contain many redundant functional dependencies, but canonical covers greatly reduce output sizes. Smaller representations of outputs are easier to comprehend and use. We propose the number of redundant data values as a natural measure to rank the output of discovery algorithms. Our ranking assesses the relevance of functional dependencies for the given data set. en
dc.publisher Department of Computer Science, The University of Auckland, New Zealand en
dc.relation.ispartofseries CDMTCS Research Report Series en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher. en
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.source.uri https://www.cs.auckland.ac.nz/research/groups/CDMTCS/researchreports/index.php en
dc.title Discovery and Ranking of Functional Dependencies en
dc.type Technical Report en
dc.subject.marsden Fields of Research en
dc.rights.holder Copyright: The author(s) en
dc.rights.accessrights http://purl.org/eprint/accessRights/OpenAccess en
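
The abstract mentions stripped partitions and a ranking by the number of redundant data values. The following is a minimal sketch of those two ideas, not the report's algorithm: the function names and the sample rows are hypothetical, and the redundancy count assumes one common reading of the measure (for a valid dependency X -> A, every occurrence of an A value in a row that shares its X values with at least one other row counts as redundant). Null handling is ignored.

```python
from collections import defaultdict

def stripped_partition(rows, attrs):
    """Group row indices by their values on attrs, dropping singleton classes."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(i)
    return [cls for cls in groups.values() if len(cls) > 1]

def holds(rows, lhs, rhs):
    """lhs -> rhs holds iff every class of the stripped partition agrees on rhs."""
    return all(
        len({rows[i][rhs] for i in cls}) == 1
        for cls in stripped_partition(rows, lhs)
    )

def redundant_values(rows, lhs, rhs):
    """Assumed measure: rhs occurrences in rows that share lhs values with another row."""
    if not holds(rows, lhs, rhs):
        return 0
    return sum(len(cls) for cls in stripped_partition(rows, lhs))

# Hypothetical sample data, for illustration only.
rows = [
    {"zip": "1010", "city": "Auckland", "name": "A"},
    {"zip": "1010", "city": "Auckland", "name": "B"},
    {"zip": "6011", "city": "Wellington", "name": "C"},
    {"zip": "1010", "city": "Auckland", "name": "D"},
]

# Rank candidate dependencies by the assumed redundancy count, highest first.
candidates = [(["zip"], "city"), (["name"], "city"), (["city"], "name")]
ranking = sorted(
    ((redundant_values(rows, lhs, rhs), lhs, rhs)
     for lhs, rhs in candidates if holds(rows, lhs, rhs)),
    key=lambda t: t[0],
    reverse=True,
)
```

Under this reading, a dependency whose left-hand side is a key (such as `name` in the sample rows) causes no redundant values and ranks last, while a dependency with many repeated left-hand-side values ranks first.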

