Abstract:
The goal of classical normalization is to maintain data consistency under updates, with a minimum level of effort. Given functional dependencies (FDs) alone,
this goal is only achievable in the special case an FD-preserving Boyce-Codd Normal Form (BCNF) decomposition exists. As we show, in all other cases the level
of effort can be neither controlled nor quantified. In response, we establish the
`-Bounded Cardinality Normal Form, parameterized by a positive integer `. For
every `, the normal form condition requires from every instance that every value
combination over the left-hand side of every non-trivial FD does not occur in more
than ` tuples. BCNF is captured when ` = 1. We demonstrate that schemata
in this normal form characterize the instances that are i) free from level ` data
redundancy and update inefficiency, and ii) permit level ` join efficiency. We establish algorithms that compute schemata in `-Bounded Cardinality Normal Form
for the smallest level ` attainable across all FD-preserving decompositions. Additional algorithms i) attain even smaller levels of effort based on the loss of some
FDs, and ii) decompose schemata based on prioritized FDs that cause high levels
of effort. Our framework informs de-normalization already during logical design.
In particular, level ` quantifies both the incremental maintenance and join support
of materialized views. Experiments with synthetic and real-world data illustrate
which properties the schemata have that result from our algorithms, and how these
properties predict the performance of update and query operations on instances
over the schemata, without and with materialized views.