
Version 22 (modified by pbaumann, 7 years ago)

Features

This is a high-level feature summary of rasdaman. For details, refer to Why rasdaman?, Technology, the documentation, and the scientific publications.

Per se, rasdaman is domain neutral, which makes it suitable for all applications where raster data management is an issue.

The petascope package (see documentation) is a servlet package implementing OGC standard interfaces WCS, WCPS, WCS-T, and WPS; see EarthLook for a kaleidoscope of hands-on interactive demos.

The features described below make up the rasdaman community version. Add-ons with further functionality (such as multi-server multiplexing and nearline tape archive management) and performance boosters (such as just-in-time compilation and graphics card support) are available with rasdaman enterprise from rasdaman GmbH. Further add-ons exist for managing large-scale geo maps (ortho images, thematic layers, DEM, radar, etc.) via OGC WMS.

Array data model

Arrays are defined by their extent ("domain") and the type of their cells ("pixels", "voxels"). The rasdl type definition language (also known as the DDL) allows new types to be defined dynamically; its syntax is kept close to ODMG's ODL. All base and composite data types available in languages like C/C++ (except for pointers and arrays) can be used as cell types, including nested structs. For example, the following statement defines an XGA image type:

typedef marray< [0:1023,0:767], struct{unsigned char red, green, blue;} > XGA;
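
To make the relation between domain and cell type concrete, the following Python sketch computes the shape and raw size implied by the XGA definition above. The `ArrayType` class and its method names are hypothetical stand-ins invented for this illustration, not part of rasdl or any rasdaman API; the one-byte component sizes assume that `unsigned char` occupies one byte.

```python
class ArrayType:
    """Hypothetical stand-in for an marray type: a domain plus a cell layout."""

    def __init__(self, domain, cell_fields):
        self.domain = domain            # one (low, high) pair per dimension, inclusive
        self.cell_fields = cell_fields  # component name -> size in bytes (assumed)

    def shape(self):
        # Bounds are inclusive, so [0:1023] spans 1024 cells.
        return tuple(high - low + 1 for low, high in self.domain)

    def cell_count(self):
        count = 1
        for extent in self.shape():
            count *= extent
        return count

    def cell_size(self):
        return sum(self.cell_fields.values())

    def raw_size_bytes(self):
        # Uncompressed payload only; storage overhead is not modeled.
        return self.cell_count() * self.cell_size()


# The XGA type from the text: domain [0:1023,0:767], three unsigned char components.
XGA = ArrayType(
    domain=((0, 1023), (0, 767)),
    cell_fields={"red": 1, "green": 1, "blue": 1},
)

print(XGA.shape())           # (1024, 768)
print(XGA.raw_size_bytes())  # 2359296
```

Under these assumptions, one XGA object holds 1024 x 768 cells of three bytes each, or about 2.25 MiB of raw data.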

Over such typed arrays, collections (i.e., tables, again in ODMG style) are built. Collections have two columns (attributes): a system-maintained object identifier (OID) and the array itself. This makes it convenient to embed arrays into relational modeling: foreign keys in conventional tables can reference particular array objects and, in combination with a domain specification, even parts of arrays. A collection of XGA images is defined in rasdl through

typedef set< XGA > XgaSet;

This collection type can be instantiated through

create collection myXgaCollection XgaSet

After this instantiation, the collection is ready for population through rasql.

Query language

The rasdaman query language, rasql, offers raster processing formulated as SQL-style expressions over raster operations. Consider the following query: "The difference of the red and green channels of all images in collection LandsatImages where the red channel intensity exceeds 127 somewhere". In rasql, it is expressed as

select ls.red - ls.green
from LandsatImages as ls
where max_cells( ls.red ) > 127
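
The semantics of this query can be mirrored in a few lines of plain Python. This is only an illustrative sketch: images are modeled as dictionaries of nested lists, the function names are chosen for this example, and the subtraction uses ordinary Python integers, so the result type that rasql would actually produce is not modeled.

```python
def max_cells(channel):
    """Largest cell value in a 2-D channel given as a list of rows."""
    return max(max(row) for row in channel)


def red_minus_green(images):
    """Mirror the query: filter on the red channel, then subtract cell by cell."""
    results = []
    for img in images:
        # "where" clause: keep the image only if some red cell exceeds 127.
        if max_cells(img["red"]) > 127:
            # "select" clause: cell-wise difference of the two channels.
            diff = [
                [r - g for r, g in zip(red_row, green_row)]
                for red_row, green_row in zip(img["red"], img["green"])
            ]
            results.append(diff)
    return results


# Two tiny made-up 2x2 "images"; only the first has a red value above 127.
landsat_images = [
    {"red": [[10, 200], [30, 40]], "green": [[5, 100], [30, 50]]},
    {"red": [[10, 20], [30, 40]], "green": [[1, 2], [3, 4]]},
]

print(red_minus_green(landsat_images))  # [[[5, 100], [0, -10]]]
```

Note that the filter applies to whole images, while the subtraction applies to every cell of each image that passes the filter.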

Rasql is a full query language, supporting select, insert, update, and delete. Additionally, it introduces the concept of a partial update, which selectively updates parts of an array. Given the potentially large size of arrays, this is highly relevant in practice, e.g., for updating satellite image maps with newly arriving imagery.
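
The idea of a partial update can be sketched in Python as overwriting a rectangular region of a larger array in place. The function name `partial_update` and the list-of-rows representation are assumptions made for this illustration; the sketch says nothing about how rasdaman performs such updates internally.

```python
def partial_update(array, origin, patch):
    """Overwrite the region of `array` that starts at `origin` with `patch`, in place."""
    row0, col0 = origin
    if row0 < 0 or col0 < 0:
        raise ValueError("origin must be non-negative")
    if row0 + len(patch) > len(array):
        raise ValueError("patch exceeds the array in the row direction")
    for offset, patch_row in enumerate(patch):
        target_row = array[row0 + offset]
        if col0 + len(patch_row) > len(target_row):
            raise ValueError("patch exceeds the array in the column direction")
        # Only the cells covered by the patch are touched.
        target_row[col0:col0 + len(patch_row)] = patch_row
    return array


# A 4x4 "map" receives a new 2x2 "scene" at row 1, column 2.
mosaic = [[0] * 4 for _ in range(4)]
partial_update(mosaic, (1, 2), [[1, 2], [3, 4]])

for row in mosaic:
    print(row)
```

Cells outside the patched region keep their previous values, which is what makes this kind of update attractive for large mosaics.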

Query formulation is declarative: queries express what the result should look like, not how to compute it. This allows for extensive optimization on the server side. Further, rasql is safe in evaluation: every valid query is guaranteed to terminate in finite time.

C++ and Java API

Client development is supported by the C++ API, raslib, and the Java API, rasj; both adhere to the ODMG standard. Communication with a rasdaman database is simple: open a connection, send the query string, receive the result set. Iterators allow convenient access to query results.

Tiled storage

On the server side, arrays are stored inside a PostgreSQL database (support for further database systems and file-based storage is available from rasdaman GmbH). To this end, arrays are partitioned into subarrays called tiles; each tile goes into a BLOB (binary large object) in a relational table. This allows conventional relational database systems to maintain arrays of unlimited size.

A spatial index makes it possible to quickly locate the set of tiles addressed by a query.

The partitioning scheme is open: any kind of tiling can be specified during array instantiation. A set of tiling strategies is provided to help administrators pick the most efficient tiling.
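
As a rough illustration of tiling and tile lookup, the Python sketch below assumes the simplest possible case: a regular grid of equally sized tiles, where the tiles intersecting a query box follow from integer division. The function name and the 256 x 256 tile size are assumptions for this example. rasdaman's spatial index and its tiling strategies are more general and are not modeled here.

```python
from itertools import product


def tiles_for_query(domain, tile_shape, query):
    """Return the grid coordinates of all tiles that intersect the query box.

    domain and query hold one inclusive (low, high) pair per dimension;
    tile_shape holds the tile edge length per dimension.
    """
    per_dimension = []
    for (d_low, d_high), edge, (q_low, q_high) in zip(domain, tile_shape, query):
        # Clip the query to the array domain first.
        low = max(q_low, d_low)
        high = min(q_high, d_high)
        if low > high:
            return []  # the query lies entirely outside the array
        first_tile = (low - d_low) // edge
        last_tile = (high - d_low) // edge
        per_dimension.append(range(first_tile, last_tile + 1))
    return list(product(*per_dimension))


# An XGA-sized domain cut into 256x256 tiles forms a 4x3 grid of 12 tiles.
domain = ((0, 1023), (0, 767))
hits = tiles_for_query(domain, (256, 256), ((200, 300), (0, 100)))
print(hits)  # [(0, 0), (1, 0)]
```

In this example only 2 of the 12 tiles need to be read to answer the query, which is the effect the tiling and index are meant to achieve.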

Tile streaming

Query evaluation in the server follows the principle of tile streaming. Each operator node processes a stream of incoming tiles and generates an output tile stream itself. In many cases, this makes it possible to keep only one database tile at a time in main memory, so query processing is efficient even on low-end server machines.
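
The streaming principle can be sketched with Python generators, where each operator node consumes a tile stream and yields a new one. This is a simplified illustration and not the server's implementation: tiles are flat lists of cell values, the operator names are invented for this example, and only cell-wise operators are shown, since these are the ones that can work strictly tile by tile.

```python
def read_tiles(stored_tiles):
    """Source node: yield one tile at a time, as if fetched from storage."""
    for tile in stored_tiles:
        yield tile


def add_constant(tiles, value):
    """Operator node: add a constant to every cell of each incoming tile."""
    for tile in tiles:
        yield [cell + value for cell in tile]


def threshold(tiles, limit):
    """Operator node: map each cell to 1 if it exceeds the limit, else 0."""
    for tile in tiles:
        yield [1 if cell > limit else 0 for cell in tile]


# Three small made-up tiles standing in for database tiles.
stored = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Building the pipeline does no work; tiles flow through only when consumed.
pipeline = threshold(add_constant(read_tiles(stored), 10), 14)
result = list(pipeline)
print(result)  # [[0, 0, 0], [0, 1, 1], [1, 1, 1]]
```

Because generators are lazy, each tile passes through the whole operator chain before the next tile is read, so at any moment only one tile per stage is held.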

Server multiplexing

A rasdaman server installation can consist of an arbitrary number of rasdaman server processes. A dynamic scheduler, rasmgr, receives incoming connection requests and assigns a free server process. This server process is then dedicated to that client until the connection is closed. This allows for highly concurrent access and, at the same time, increases overall safety because clients are isolated from each other.
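
The assignment scheme described above can be sketched as a small dispatcher in Python. The `Scheduler` class, its method names, and the choice to return `None` when no server is free are all assumptions made for this illustration; the sketch does not reflect rasmgr's actual implementation or protocol.

```python
class Scheduler:
    """Hypothetical dispatcher that hands each client a dedicated server process."""

    def __init__(self, server_names):
        self._free = list(server_names)  # servers currently without a client
        self._assigned = {}              # client id -> server name

    def connect(self, client_id):
        """Assign a free server to the client, or return None if none is free."""
        if client_id in self._assigned:
            raise ValueError("client is already connected")
        if not self._free:
            return None
        server = self._free.pop(0)
        self._assigned[client_id] = server
        return server

    def close(self, client_id):
        """Release the client's server so that it can serve another client."""
        server = self._assigned.pop(client_id)
        self._free.append(server)


scheduler = Scheduler(["server1", "server2"])
print(scheduler.connect("alice"))  # server1
print(scheduler.connect("bob"))    # server2
print(scheduler.connect("carol"))  # None, all servers are busy
scheduler.close("alice")
print(scheduler.connect("carol"))  # server1, freed by alice
```

Since no server is ever shared between two clients at the same time, a misbehaving client can affect at most its own server process.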