The continuous progress in remote sensor resolutions of Earth observation platforms generates large quantities of hyperspectral data for the mapping and monitoring of natural and man-made land covers.

Current Synthetic Aperture Radar missions, with high spatial resolution and frequent repeat passes, place heavy demands on the analysis of satellite time series data. Such analysis enables the observation of dynamic processes involving natural landscapes and built-up sites with significant socio-economic, environmental, and geopolitical impact. Similarly, 3D point cloud datasets created by 3D laser scanners in the earth sciences drive data growth, with scans covering areas as large as whole countries.

The University of Iceland team provides applications based on three data analytics methods used to extract knowledge from data:
•    clustering (NextDBSCAN),
•    classification (NextSVM),
•    Deep Learning (TensorFlow with Horovod).

Data Analytics in DEEP-EST

These Earth Science data analytics applications explore innovative I/O and storage methods using the DCPMM persistent memory modules and the DEEP-EST NAM devices, and their codes will be adapted to leverage the DEEP-EST MSA. The requirements of the three principal data analytics techniques mentioned above are part of the DEEP-EST co-design effort.

NextDBSCAN

NextDBSCAN explores two different mappings to the MSA to perform density-based clustering on very large 3D point-cloud LiDAR datasets. The first mapping uses the ESB to construct a novel data structure that encodes possible connections between the data points in a compact search tree. Subsequently, the CM parses the tree to determine the actual connections between the points and to label them. The data is then stored in the NAM in the ESB; optionally, further data is selected for processing and the workflow cycle starts again.

The second mapping is similar to the first, but the ESB module is replaced with the DAM, which uses the DCPMM as a fast, persistent storage target rather than the NAM.

While the existing HPDBSCAN algorithm scaled very well, it could not use accelerators such as GPUs. This led to the completely new NextDBSCAN algorithm, designed with accelerators in mind. The new algorithm and its implementation outperform the old HPDBSCAN even without GPUs, as observed on our DEEP-EST CM.
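
To make the two-stage workflow described above more concrete, the following is a minimal sketch of generic density-based clustering (classic DBSCAN), not the NextDBSCAN code itself. It assumes a simple grid of cells with side length eps as a stand-in for the compact search tree: stage 1 records which points could possibly be connected, and stage 2 confirms the actual connections by distance and labels the points. All function names are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def build_cell_index(points, eps):
    """Stage 1: bucket point indices into grid cells with side length eps."""
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(floor(c / eps) for c in p)].append(i)
    return cells


def candidate_neighbours(cells, cell):
    """Yield point indices from a cell and its adjacent cells.

    Two points closer than eps always fall into the same or adjacent cells,
    so these are the only possible connections.
    """
    for offset in product((-1, 0, 1), repeat=len(cell)):
        yield from cells.get(tuple(c + o for c, o in zip(cell, offset)), ())


def dbscan(points, eps, min_pts):
    """Stage 2: confirm connections by distance and label clusters (-1 = noise)."""
    cells = build_cell_index(points, eps)
    neighbours = []
    for p in points:
        cell = tuple(floor(c / eps) for c in p)
        neighbours.append(
            [j for j in candidate_neighbours(cells, cell) if dist(p, points[j]) <= eps]
        )
    # A core point has at least min_pts points (itself included) within eps.
    core = [len(n) >= min_pts for n in neighbours]
    labels = [-1] * len(points)
    cluster = 0
    for i in range(len(points)):
        if not core[i] or labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:  # grow the cluster outwards from core points only
            j = stack.pop()
            for k in neighbours[j]:
                if labels[k] == -1:
                    labels[k] = cluster
                    if core[k]:
                        stack.append(k)
        cluster += 1
    return labels


points = [(0, 0, 0), (0.5, 0, 0), (0, 0.5, 0),
          (10, 10, 10), (10.5, 10, 10), (10, 10.5, 10),
          (50, 50, 50)]
print(dbscan(points, eps=1.0, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy run the two dense groups become clusters 0 and 1, and the isolated point is labelled as noise. On the MSA, the two stages would run on different modules; here they simply run one after the other in a single process.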

NextSVM

NextSVM will explore two different mappings for supervised learning using support vector machines:

[Figure: Workflow Tk 1.6]

One of the major differences between these mappings is that the first uses the NAM in the ESB as intermediate storage, whereas the second uses the DAM’s DCPMM for that purpose. Additionally, the first mapping represents a CPU/GPGPU combination communicating across different interconnects, as the CM is likely to use a different interconnect from the other two modules.
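
For readers unfamiliar with the underlying technique, the following is a minimal sketch of a linear support vector machine trained by sub-gradient descent on the regularised hinge loss. It is not the NextSVM implementation; the function names, the fixed learning rate, and the toy data are assumptions made for illustration, and a production code would distribute this work across the modules described above.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=100):
    """Minimise lam/2 * |w|^2 + hinge loss with per-sample sub-gradient steps.

    Labels must be +1 or -1. Returns the weight vector w and the bias b.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Gradient of the regulariser: shrink the weights slightly.
            w = [(1.0 - lr * lam) * wi for wi in w]
            if margin < 1.0:
                # Sample is inside the margin: push the boundary away from it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


samples = [(2, 2), (3, 3), (2, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3), (-3, -2)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
w, b = train_linear_svm(samples, labels)
print(predict(w, b, (1, 1)), predict(w, b, (-1, -1)))
```

The regularisation strength lam is the kind of parameter that would typically be selected through cross-validation.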

Deep Learning

The third application, Deep Learning, will explore two MSA mappings onto the same modules, namely the ESB and the DAM.

The main difference between these mappings lies in which MSA module trains the CNN models and which one runs inference to assess their quality. After training and inference, the resulting models are stored either in the DAM’s DCPMM or in the ESB’s NAM, and are later re-used for additional training via transfer learning and/or further inference.
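
The train, infer, store, and re-use cycle described above can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: a one-variable linear model stands in for the CNN, a JSON file in a temporary directory stands in for the DCPMM or NAM storage target, and all function names are hypothetical. Re-use is shown as a warm start from the stored parameters, which is only the simplest form of transfer learning.

```python
import json
import os
import tempfile


def train(weights, data, lr=0.05, epochs=500):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def infer(weights, data):
    """Run the model on data and return its mean squared error (quality check)."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def save_model(weights, path):
    """Persist the model parameters (stand-in for the storage module)."""
    with open(path, "w") as fh:
        json.dump(weights, fh)


def load_model(path):
    """Read previously stored model parameters back."""
    with open(path) as fh:
        return json.load(fh)


base_data = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
new_data = [(x, 2.0 * x + 1.5) for x in (0.0, 1.0, 2.0, 3.0)]
store = os.path.join(tempfile.mkdtemp(), "model.json")

save_model(train([0.0, 0.0], base_data), store)      # train, then store
print("quality on base data:", infer(load_model(store), base_data))
reused = train(load_model(store), new_data, epochs=5)  # warm start on new data
print("quality after re-use:", infer(reused, new_data))
```

In the two mappings, the training and inference steps would run on different MSA modules; the sketch only shows the order of the steps and the hand-over through persistent storage.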

Goal

The expected outcome is a reduced time to solution for classification, clustering, and deep learning applications, including the search for correct parameters through cross-validation, and a significant speed-up over standard parallel versions thanks to the adoption of the MSA. The innovative integrated computing and data architecture of DEEP-EST will contribute to cutting-edge knowledge discovery with unprecedented effectiveness and efficiency.