Examining an alternative approach to heterogeneous computing in the many-core era.
Understanding how the global climate evolves and changes is of utmost importance in the 21st century. The complexity of simulating the climate is reflected in the structure of the scientific codes used in this research field. Scientists from The Cyprus Institute (CYI) examined an alternative approach to heterogeneous computing in the many-core era for such Earth system models: in concrete terms, they ported the EMAC Earth System Model application to the DEEP platform. The aim was to test whether the application scales better on the new type of architecture that the DEEP Cluster-Booster concept offers.
EMAC is composed of two very different code parts. It couples a global circulation model (ECHAM) with local physical (MESSy) and chemical (MECCA) models to study climate change and air quality scenarios. Whereas the former has high communication demands, the latter two have high computational demands. On traditional homogeneous parallel supercomputers, this heterogeneity of the code poses a major challenge: it results in significant load imbalance, which in turn hinders the scalability of the application.
To alleviate this load imbalance and improve scalability, the DEEP Cluster-Booster approach offers a suitable solution: the computationally demanding code parts of the MESSy and MECCA models are offloaded to the Booster, while the communication-intensive ECHAM model stays on the Cluster. To make this code division work, CYI scientists modelled their code with the help of OmpSs:
- Intranode taskification: The chemistry mechanism of the code (MECCA) was taskified using OmpSs directives. OmpSs lets the user specify the inputs and outputs of blocks of code or functions. This gives the Nanos++ runtime enough information to detect which tasks are ready to be executed concurrently, so the programmer does not have to manage the parallelization explicitly.
- The result: The chemistry calculations can be massively parallelized. In addition, the load imbalance is automatically alleviated by dynamic load balancing via OmpSs.
- Internode taskification: For this, OmpSs provides two important features. First, it allows offloading to remote nodes, not just to locally available coprocessors or accelerators, which is key to using the Booster effectively. Second, it allows the Booster to be used as a pool of coprocessors, so tasks can be offloaded to any Booster node with enough free cores. The latter eliminates the load imbalance caused by sunlight gradients in MECCA. The second source of imbalance, heterogeneous reactions, is also automatically alleviated by dynamic load balancing across the massively parallel Booster.
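The effect of treating the Booster as a pool of workers can be illustrated with a small simulation. The sketch below is plain Python, not OmpSs or EMAC code. The function names, the cosine-shaped "sunlight" cost profile, and the worker and column counts are all illustrative assumptions. It compares a fixed block decomposition with a pool in which each task goes to whichever worker becomes free first.

```python
import heapq
import math


def column_costs(n_columns):
    """Hypothetical per-column chemistry cost: sunlit columns cost more.

    The cosine profile is an illustrative assumption, not data from EMAC.
    """
    costs = []
    for i in range(n_columns):
        # Longitude-like angle from -pi to pi; the sun is overhead at 0.
        angle = -math.pi + 2.0 * math.pi * (i + 0.5) / n_columns
        sunlight = max(0.0, math.cos(angle))
        costs.append(1.0 + 4.0 * sunlight)  # night side = 1, local noon = 5
    return costs


def static_makespan(costs, n_workers):
    """Each worker gets one contiguous block of columns, fixed in advance."""
    block = math.ceil(len(costs) / n_workers)
    loads = [sum(costs[w * block:(w + 1) * block]) for w in range(n_workers)]
    return max(loads)


def dynamic_makespan(costs, n_workers):
    """Workers form a pool; each task goes to the worker that frees up first."""
    finish_times = [0.0] * n_workers  # min-heap of per-worker finish times
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


if __name__ == "__main__":
    costs = column_costs(256)
    print(f"ideal   {sum(costs) / 8:.1f}")
    print(f"static  {static_makespan(costs, 8):.1f}")
    print(f"dynamic {dynamic_makespan(costs, 8):.1f}")
```

Under these assumptions, the static decomposition is limited by the worker that holds the sunlit block. The pooled variant stays within one task's cost of the ideal even split. This mirrors, in a very simplified way, why offloading to any Booster node with free cores removes the sunlight-driven imbalance.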
The new version of EMAC, which runs ECHAM with MPI processes and MECCA with shared-memory OmpSs tasks, outperforms the old pure-MPI EMAC and continues to scale beyond the region where the scaling of the original implementation plateaus. Importantly, the OmpSs API largely frees the programmer from implementing the offloading logic. Since EMAC is developed and used by a large community working on all aspects of the model, this can facilitate adoption of the concept in the MESSy community.
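The idea that the runtime, rather than the programmer, works out what may run concurrently can be sketched with a toy data-flow scheduler. This is plain Python and does not use OmpSs or Nanos++. The task names and data labels are hypothetical. Grouping tasks into "waves" is a simplification, since a real task runtime starts each task as soon as its own dependencies are satisfied. The only information the programmer supplies is each task's inputs and outputs.

```python
from collections import defaultdict


def schedule_waves(tasks):
    """Group tasks into waves whose members can run concurrently.

    `tasks` is a list of (name, inputs, outputs) tuples in program order.
    Dependencies are inferred only from the declared inputs and outputs.
    """
    last_writer = {}             # datum -> index of the last task that wrote it
    readers = defaultdict(list)  # datum -> tasks that read it since that write
    level = []                   # level[i] = wave in which task i can run

    for i, (_name, inputs, outputs) in enumerate(tasks):
        deps = set()
        for datum in inputs:
            if datum in last_writer:
                deps.add(last_writer[datum])   # read-after-write
        for datum in outputs:
            if datum in last_writer:
                deps.add(last_writer[datum])   # write-after-write
            deps.update(readers[datum])        # write-after-read
        level.append(1 + max((level[j] for j in deps), default=-1))

        # Record this task's accesses for the tasks that follow it.
        for datum in inputs:
            readers[datum].append(i)
        for datum in outputs:
            last_writer[datum] = i
            readers[datum] = []

    waves = defaultdict(list)
    for i, (name, _inputs, _outputs) in enumerate(tasks):
        waves[level[i]].append(name)
    return [waves[k] for k in sorted(waves)]


# Hypothetical two-column chemistry step, for illustration only.
EXAMPLE_TASKS = [
    ("photolysis_col0", ["sun0"], ["rates0"]),
    ("photolysis_col1", ["sun1"], ["rates1"]),
    ("kinetics_col0", ["rates0", "conc0"], ["conc0"]),
    ("kinetics_col1", ["rates1", "conc1"], ["conc1"]),
    ("diagnostics", ["conc0", "conc1"], ["total"]),
]

if __name__ == "__main__":
    for wave in schedule_waves(EXAMPLE_TASKS):
        print(wave)
```

In this sketch the two columns are processed independently, and only the final diagnostics task waits for both. No explicit synchronization is written anywhere in the task list.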
“We expect that the changes we propose with our work will contribute to the eventual adoption of Cluster-Booster division and Many Integrated Core (MIC) accelerated architectures in presently available implementations of Earth system models. This way, we believe, the codes of our scientific community will be able to exploit the potential of fully Exascale-capable platforms,” says Theodoros Christoudias, Assistant Professor at The Cyprus Institute.
The work presented here was recently published in an open access paper:
Title of the paper: Earth system modelling on system-level heterogeneous architectures: EMAC (version 2.42) on the Dynamical Exascale Entry Platform (DEEP)
Authors: Michalis Christou (The Cyprus Institute, CYI), Theodoros Christoudias (CYI), Julián Morillo (Barcelona Supercomputing Centre), Damian Alvarez (Jülich Supercomputing Centre), and Hendrik Merx (Max Planck Institute for Chemistry)
First appeared in: Geosci. Model Dev., 9, 3483–3491, 2016
The full paper is available at: http://www.geosci-model-dev.net/9/3483/2016/gmd-9-3483-2016.pdf