On top of the resiliency toolkit, a failure model has been developed.
The toolkit automatically optimises the checkpoint frequency, redundancy level and storage location for each application. The goal is to minimise both the application execution time and the energy consumption of the system.
The model takes into account the probability and type of failure of the main hardware and software components (expressed as their MTBF), together with the time needed to create and restore a checkpoint, and from these derives the optimal checkpoint frequency for each application.
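
The text does not state which formula the failure model uses. As an illustration only, the sketch below uses Young's well-known first-order approximation, in which the optimal compute time between checkpoints is sqrt(2 * C * MTBF), where C is the time to write a checkpoint. The function names, the assumption of independent components with exponentially distributed failures, and the simplified waste estimate are choices made for this example, not details of the toolkit.

```python
import math


def system_mtbf(component_mtbfs_s):
    """Combined MTBF (seconds) of independent components.

    Assumes exponentially distributed failures, so failure rates add.
    """
    return 1.0 / sum(1.0 / m for m in component_mtbfs_s)


def young_interval(checkpoint_cost_s, mtbf_s):
    """Young's first-order estimate of the optimal time between checkpoints."""
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def waste_fraction(interval_s, checkpoint_cost_s, restore_cost_s, mtbf_s):
    """First-order fraction of time lost to checkpointing and failures.

    C/T is the checkpoint overhead; (T/2 + R)/MTBF is the expected
    rework plus restore time per unit of time. Hypothetical simplification.
    """
    return (checkpoint_cost_s / interval_s
            + (interval_s / 2.0 + restore_cost_s) / mtbf_s)


if __name__ == "__main__":
    # Illustrative numbers only: two components, 60 s checkpoint, 120 s restore.
    mtbf = system_mtbf([48 * 3600.0, 48 * 3600.0])
    t_opt = young_interval(60.0, mtbf)
    print(f"system MTBF: {mtbf:.0f} s, checkpoint every {t_opt:.0f} s")
    print(f"estimated waste: {waste_fraction(t_opt, 60.0, 120.0, mtbf):.3%}")
```

In this first-order model the restore time raises the estimated waste but does not shift the optimal interval; a more detailed model could couple the two.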
