Discovery and Ranking of Functional Dependencies

Wei, Z; Link, S

dc.contributor.author	Wei, Z	en
dc.contributor.author	Link, S	en
dc.date.accessioned	2020-01-10T01:36:46Z	en
dc.date.available	2020-01-10T01:36:46Z	en
dc.date.issued	2019	en
dc.identifier.citation	CDMTCS Research Reports CDMTCS-531 (2019)	en
dc.identifier.issn	1178-3540	en
dc.identifier.uri	http://hdl.handle.net/2292/49481	en
dc.description.abstract	Computing the functional dependencies that hold on a given data set is one of the most important problems in data profiling. Our research advances the state of the art in various ways. Utilizing new data structures and original techniques for the dynamic computation of stripped partitions, we devise a new hybridization strategy that outperforms the best algorithms in terms of efficiency, column-, and row-scalability. This is demonstrated on real-world benchmark data. We show that current outputs contain many redundant functional dependencies, but canonical covers greatly reduce output sizes. Smaller representations of outputs are easier to comprehend and use. We propose the number of redundant data values as a natural measure to rank the output of discovery algorithms. Our ranking assesses the relevance of functional dependencies for the given data set.	en
dc.publisher	Department of Computer Science, The University of Auckland, New Zealand	en
dc.relation.ispartofseries	CDMTCS Research Report Series	en
dc.rights	Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher.	en
dc.rights.uri	https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm	en
dc.source.uri	https://www.cs.auckland.ac.nz/research/groups/CDMTCS/researchreports/index.php	en
dc.title	Discovery and Ranking of Functional Dependencies	en
dc.type	Technical Report	en
dc.subject.marsden	Fields of Research	en
dc.rights.holder	Copyright: The author(s)	en
dc.rights.accessrights	http://purl.org/eprint/accessRights/OpenAccess	en