Datacubes as the New Virtual Research Environment Paradigm

Publication date: 12 October, 2015
Author: Baumann, P.
Medium / Event: EO Science 2.0 Conference, ESA, Frascati, IT

A paradigm shift is becoming reality. We begin to see the datacubes behind the millions of files. We start combining heterogeneous datacubes in an ad-hoc fashion. And we begin to overcome the age-old, technology-imposed divide between data and metadata. Enablers are query languages like SQL/MDA, the forthcoming extension of the ISO SQL standard with massive multi-dimensional arrays, but also increasing support for large-scale point cloud and mesh handling.

Europe's long-term Big Earth Data activity EarthServer in its Phase 1 has established horizontal platform technology for Agile Analytics based on scalable, distributed processing of complex Big Data requests. According to reviewers and EU DG CNECT, Phase 1 "with no doubt has been shaping the Big Earth Data landscape through the standardization activities within OGC, ISO and beyond". The underlying rasdaman array database will "significantly transform the way that scientists in different areas of Earth Science will be able to access and use data in a way that hitherto was not possible".

In EarthServer Phase 2, the datacube paradigm will be at the heart. The existing 130+ TB database of ESA will be extended; likewise, PML and NCI Australia will establish 3D Landsat timeseries datacubes, and ECMWF will establish 4D climate datacubes. The goal is real-time manipulation, such as scaling a petabyte datacube in under 1 second. Further, ad-hoc fusion of datacubes across continents will be demonstrated.

In our talk we present ongoing activities in this direction, based on OGC, ISO, W3C, and INSPIRE standardization and EarthServer. We point out results achieved and challenges ahead, with the aim of stimulating creativity for novel access paradigms.

Partners involved: