Section: New Results
Scalable storage for polystores
Big data applications routinely involve diverse datasets: flat or nested relations, graphs with complex structure, documents, poorly structured logs, or even text data. To handle such data, application designers usually rely on several data stores used side by side, each capable of handling one or a few data models (e.g., many relational stores can also handle JSON data), and each very efficient for some, but not all, kinds of processing on the data.
A current limitation is that applications are written taking into account which part of the data is stored in which store and how. This fails to take advantage of (
In [11], we present Estocada, a novel approach connecting applications to the potentially heterogeneous systems where their input data resides. Estocada can be used in a polystore setting to transparently enable each query to benefit from the best combination of stored data and available processing capabilities. Estocada leverages recent advances in the area of view-based query rewriting under constraints, which we use to describe the various data models and stored data. Our experiments illustrate the significant performance gains achieved by Estocada.
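To give an intuition for describing stored data as views, the toy sketch below routes a query to the cheapest stored fragment that exposes all the attributes it needs. It is not Estocada's rewriting algorithm, which combines several fragments and reasons under constraints; the `Fragment` class, the `choose_fragment` function, the store labels, and the cost figures are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A stored fragment, described as a view over the application dataset."""
    name: str
    store: str             # hypothetical store label, e.g. "relational"
    attributes: frozenset  # attributes exposed by the view
    cost: float            # made-up relative access cost


def choose_fragment(needed, fragments):
    """Return the cheapest fragment whose attributes cover `needed`, or None."""
    candidates = [f for f in fragments if needed <= f.attributes]
    return min(candidates, key=lambda f: f.cost, default=None)


# Illustrative catalog: the same data is reachable from several stores.
catalog = [
    Fragment("orders_rel", "relational",
             frozenset({"order_id", "customer", "total"}), 5.0),
    Fragment("orders_kv", "key-value",
             frozenset({"order_id", "total"}), 1.0),
    Fragment("orders_doc", "document",
             frozenset({"order_id", "customer", "items"}), 3.0),
]

best = choose_fragment(frozenset({"order_id", "total"}), catalog)
print(best.name, best.store)
```

In this simplified setting the application states only which attributes it needs; the choice of store is made from the catalog rather than being hard-coded in the application.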