Thesis / Project Topics 2024
This is an overview of topics offered for Bachelor and Master thesis projects.
Contact me for further topics, or if you have your own idea that fits into our Big Datacube research.
All work follows the procedures, so you may want to study these first.
Programming prerequisites should be taken seriously: in all cases, non-trivial implementation in one of several languages is involved.
Code will regularly add functionality to our rasdaman system and, as such, be used by our project partners and the general scientific and technical community;
hence code quality (including, e.g., concise tests and documentation) is an integral evaluation criterion.
Generally, I appreciate not only the result but also the way towards it; showing continuous progress, initiative, and well-planned work is therefore certainly an asset.
Knowledge characterized as "advantageous" is not mandatory, but lacking it will increase the workload significantly and make deadlines tight.
For the student's sake, we reserve the right not to assign a topic if there is too much risk that a good result will not be achieved.
If your report is of sufficient quality to be submitted successfully to a conference or journal for publication, this will be considered a strong plus.
Note that only the topics below will be accepted for supervision, due to resource constraints.
Overview (strikethrough = topic taken)
Human Brain Datacube
One of the largest and most widely accessed connectomic datasets is the human cortex “h01” dataset, a 3D nanometer-resolution image of human brain tissue.
The raw imaging data is 1.4 petabytes (roughly 500,000 * 350,000 * 5,000 pixels) and is associated with additional content, such as 3D segmentations and annotations, that resides in the same coordinate system, based on a human brain atlas.
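To get a feeling for these numbers, the following back-of-the-envelope check multiplies out the extents quoted above. It assumes that a petabyte means 10^15 bytes, which the text does not state; under that assumption the cube has roughly 8.75 * 10^14 cells at about 1.6 bytes each.

```python
# Extents as quoted above; a petabyte is taken as 10**15 bytes (assumption).
nx, ny, nz = 500_000, 350_000, 5_000
voxels = nx * ny * nz

raw_bytes = 1.4e15
bytes_per_voxel = raw_bytes / voxels

print(f"{voxels:.3e} voxels, about {bytes_per_voxel:.1f} bytes per voxel")
```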
In the “Neuroglancer precomputed” format, the data is more compact.
Google has built an optimized web-based interactive viewer, which can be manipulated from TensorStore.
The task at hand is to reproduce the Google demo on rasdaman. This involves:
fetch and understand the brain dataset;
establish a datacube with rasdaman;
create an interactive 3D visualization demo.
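An interactive viewer on a cube of this size cannot load everything, so it will typically request only the blocks that intersect the current view. The sketch below illustrates that idea in plain Python; the function name `chunks_for_box` and the chunk size in the usage example are hypothetical and do not reflect how rasdaman or the Google demo actually tile the data.

```python
from itertools import product


def chunks_for_box(lo, hi, chunk):
    """Return index tuples of all chunks intersecting the half-open box [lo, hi).

    lo, hi and chunk are equal-length tuples of non-negative integers,
    one entry per axis; chunk holds the (hypothetical) chunk edge lengths.
    """
    ranges = [
        range(l // c, (h - 1) // c + 1)  # first and last chunk index on this axis
        for l, h, c in zip(lo, hi, chunk)
    ]
    return list(product(*ranges))


# A 512 x 512 single-slice view with 256 x 256 x 64 chunks (illustrative sizes).
needed = chunks_for_box((0, 0, 0), (512, 512, 1), (256, 256, 64))
print(len(needed), "chunks needed")
```

Whatever tiling is eventually chosen, the point is that the number of blocks touched depends on the view, not on the size of the whole cube.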
- team size: 1
- prerequisites: (Python?) programming skills, data wrangling
- classification: database setup, querying, application programming, visualization
- particularities: none
Vector Files as Datacube Query Parameter
OGC Web Coverage Processing Service (WCPS) is a geo datacube query language with integrated spatio-temporal semantics based on the notion of a multi-dimensional coverage which may represent a datacube.
Queries can be parametrized, among others with vector polygons that allow "cutting out" arbitrary regions.
Currently, these vectors have to be provided in an ASCII representation called Well-Known Text (WKT).
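For readers who have not seen WKT, here is a minimal sketch of what such an ASCII polygon looks like and how it can be produced from a list of coordinates. The helper `polygon_to_wkt` is a hypothetical name for illustration only and is not part of rasdaman; it is written in Python for brevity.

```python
def polygon_to_wkt(ring):
    """Format one exterior ring [(x, y), ...] as a WKT POLYGON string."""
    points = list(ring)
    if len(points) < 3:
        raise ValueError("a polygon ring needs at least three distinct points")
    if points[0] != points[-1]:
        points.append(points[0])  # WKT rings repeat the first point at the end
    coords = ", ".join(f"{x} {y}" for x, y in points)
    return f"POLYGON (({coords}))"


wkt = polygon_to_wkt([(0, 0), (10, 0), (10, 5), (0, 5)])
print(wkt)
```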
However, the most widely used format in the geo universe is not WKT, but ESRI Shapefiles, a binary format.
The goal is to add support for the Shapefile format for vector upload in the petascope component of rasdaman, next to the existing WKT decoder.
Open-source libraries for decoding exist, for example GeoTools and shapelib; one of those should be used.
Appropriate tests should be established to demonstrate that the Shapefile decoder works properly.
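To illustrate what such a test could look like, the sketch below writes a one-polygon Shapefile main file (.shp) into memory, decodes it again, and compares the result against the expected WKT. It is a hand-rolled illustration in Python using only the `struct` module and the published Shapefile layout; the actual work should rely on one of the libraries named above and be done in Java inside petascope. All function names are hypothetical, only polygon records are handled, and every ring is emitted as-is without checking orientation or nesting.

```python
import struct

POLYGON = 5  # shape type code for polygons in the Shapefile layout


def encode_polygon_shp(ring):
    """Build a minimal in-memory .shp file holding one single-ring polygon (test fixture)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    bbox = (min(xs), min(ys), max(xs), max(ys))
    content = struct.pack("<i4d2i", POLYGON, *bbox, 1, len(ring))
    content += struct.pack("<i", 0)  # the single part starts at point index 0
    for x, y in ring:
        content += struct.pack("<2d", x, y)
    # Record header: record number and content length in 16-bit words, big-endian.
    record = struct.pack(">2i", 1, len(content) // 2) + content
    # File header: 100 bytes, file code 9994, total length in 16-bit words.
    header = struct.pack(">i5ii", 9994, 0, 0, 0, 0, 0, (100 + len(record)) // 2)
    header += struct.pack("<2i", 1000, POLYGON)
    header += struct.pack("<8d", *bbox, 0.0, 0.0, 0.0, 0.0)
    return header + record


def decode_polygon_shp(data):
    """Return a list of polygons, each a list of rings, each a list of (x, y) tuples."""
    (file_code,) = struct.unpack_from(">i", data, 0)
    version, shape_type = struct.unpack_from("<2i", data, 28)
    if file_code != 9994 or version != 1000:
        raise ValueError("not a Shapefile main file")
    if shape_type != POLYGON:
        raise ValueError("this sketch handles polygon shapefiles only")
    polygons, offset = [], 100
    while offset < len(data):
        _rec_no, content_words = struct.unpack_from(">2i", data, offset)
        start = offset + 8
        (rec_type,) = struct.unpack_from("<i", data, start)
        if rec_type == POLYGON:  # null shapes (type 0) are skipped
            num_parts, num_points = struct.unpack_from("<2i", data, start + 36)
            parts = struct.unpack_from(f"<{num_parts}i", data, start + 44)
            flat = struct.unpack_from(f"<{2 * num_points}d", data, start + 44 + 4 * num_parts)
            points = list(zip(flat[0::2], flat[1::2]))
            bounds = list(parts) + [num_points]
            polygons.append([points[bounds[i]:bounds[i + 1]] for i in range(num_parts)])
        offset = start + 2 * content_words
    return polygons


def rings_to_wkt(rings):
    """Format the rings of one polygon as WKT."""
    body = ", ".join("(" + ", ".join(f"{x} {y}" for x, y in r) + ")" for r in rings)
    return f"POLYGON ({body})"


ring = [(0.0, 0.0), (0.0, 5.0), (10.0, 5.0), (10.0, 0.0), (0.0, 0.0)]
decoded = decode_polygon_shp(encode_polygon_shp(ring))
print(rings_to_wkt(decoded[0]))
```

A real test suite would additionally cover multi-part polygons, holes, and malformed files, and would compare the clipping result of a Shapefile upload against that of the equivalent WKT polygon.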
- team size: 1
- prerequisites: Java, Linux
- classification: query language enhancement
- particularities: -