

Learn more about the comprehensive set of resiliency methods developed in DEEP-ER.

Exascale systems will require a combination of powerful resiliency techniques that are also flexible enough to accommodate the heterogeneous nature of systems like the DEEP and DEEP-ER prototypes. In the DEEP-ER project, we are developing a comprehensive set of resiliency methods that target different failure types but can also be combined to provide a high level of resiliency at an affordable cost.

The overall aim is to isolate soft or partial system failures so that full application restarts are not necessary. This will be key to enabling computation at Exascale.
 

The following figure presents the encompassing set of resiliency techniques that are being developed within the project to address these different types of errors.

 

DEEP-ER Resiliency Scheme
DEEP-ER Resiliency Software Stack