Result: Impact Study of NoSQL Refactoring in SkyServer Database
More Information
Data modeling in NoSQL databases is notoriously complex and driven by multiple, possibly conflicting requirements. Researchers have proposed methodologies to optimize the schema design of a given domain for a given workload; however, due to the agile environment in which NoSQL databases are usually employed, both domain and workload are frequently subject to change and evolution, possibly neutralizing the benefits of optimization. When this happens, the benefits of a new optimal schema design must be weighed against the costs of migrating the data. In this work, we empirically show the benefits of schema redesign in a real, publicly available database. In particular, we identify multiple snapshots (in terms of domain extension and querying workload) in the 20+ year evolution of SkyServer, demonstrate how NoSQL schema optimization at a given time can later backfire, and evaluate the conditions under which data migration becomes beneficial. This leads us to define the foundations and challenges of a framework for continuous NoSQL database refactoring, with the goal of helping DBAs and data engineers decide whether, when, and how a NoSQL database should be reconsidered to restore schema design optimality.