Abstract:
Spreadsheets are one of the most widely used data management tools. Although intuitive
and easy to use, they suffer from several issues. Numerous publications indicate that most
spreadsheets contain errors and that spreadsheet-based Data Shadow Systems lead to
problems such as the “Spreadmart Dilemma”, which creates conflicting views of organizational
data.
In this thesis, we describe the Semantic Spreadsheet, a new data model for a semantically
accurate spreadsheet system. Unlike existing data models, the described model
is a presentation-independent data model based on the Resource Description Framework
(RDF) that avoids inconsistencies across spreadsheets by providing a semantically sound
data structure.
By introducing set-theoretic and relational-algebraic operations for the Semantic Spreadsheet,
we extend the functionality of traditional spreadsheets to the
level available in relational databases.
We provide a collaborative implementation based on SPARQL queries with a change
propagation mechanism that keeps the view of the data synchronized across all spreadsheets
in the system. In addition, theoretical time complexities, validated experimentally,
are presented for all Semantic Spreadsheet operations.
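
To make the idea of set-theoretic and relational operations over a presentation-independent model more concrete, the following is a minimal illustrative sketch, not the implementation described in this thesis. It assumes a sheet can be approximated as a set of RDF-style (subject, predicate, object) triples held in plain Python tuples; the function names, the `ex:` identifiers, and the reading of projection as "keep only the listed predicates" are all hypothetical choices made for this example.

```python
from typing import AbstractSet, Callable, FrozenSet, Tuple

# Assumption: a "sheet" is a set of (subject, predicate, object) triples.
Triple = Tuple[str, str, str]
Sheet = FrozenSet[Triple]


def union(a: Sheet, b: Sheet) -> Sheet:
    """All triples that appear in either sheet."""
    return a | b


def intersection(a: Sheet, b: Sheet) -> Sheet:
    """Only the triples that both sheets share."""
    return a & b


def difference(a: Sheet, b: Sheet) -> Sheet:
    """Triples of a that do not appear in b."""
    return a - b


def select(sheet: Sheet, condition: Callable[[Triple], bool]) -> Sheet:
    """Relational selection: keep the triples that satisfy the condition."""
    return frozenset(t for t in sheet if condition(t))


def project(sheet: Sheet, predicates: AbstractSet[str]) -> Sheet:
    """Projection, read here as keeping only the given predicates."""
    return frozenset(t for t in sheet if t[1] in predicates)


# Hypothetical example data: two sheets that overlap on one fact.
sales_a: Sheet = frozenset({
    ("ex:order1", "ex:amount", "100"),
    ("ex:order1", "ex:region", "EU"),
    ("ex:order2", "ex:amount", "250"),
})
sales_b: Sheet = frozenset({
    ("ex:order2", "ex:amount", "250"),
    ("ex:order3", "ex:amount", "75"),
})

merged = union(sales_a, sales_b)
shared = intersection(sales_a, sales_b)
only_a = difference(sales_a, sales_b)
amounts = project(merged, {"ex:amount"})
large = select(amounts, lambda t: int(t[2]) > 90)
```

Because each fact is stored once as a triple rather than as a value tied to a cell position, the overlapping fact about `ex:order2` collapses to a single entry in `merged`. This is the kind of consistency a presentation-independent model is meant to provide, shown here only in miniature.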