IRMA-International.org: Creator of Knowledge
Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations

Analyzing the Robustness of HPC Applications Using a Fine-Grained Soft Error Fault Injection Tool

Author(s): Qiang Guan (Los Alamos National Laboratory, USA), Nathan DeBardeleben (Los Alamos National Lab, USA), Sean Blanchard (Los Alamos National Lab, USA), Song Fu (University of North Texas, USA), Claude H. Davis IV (Clemson University, USA) and William M. Jones (Coastal Carolina University, USA)
Copyright: 2016
Pages: 29
Source title: Innovative Research and Applications in Next-Generation High Performance Computing
Source Author(s)/Editor(s): Qusay F. Hassan (Mansoura University, Egypt)
DOI: 10.4018/978-1-5225-0287-6.ch011


Abstract

As the high performance computing (HPC) community pushes toward exascale computing, soft errors affect today's HPC applications only to a small degree, but we expect them to become a more serious issue as HPC systems grow. We propose F-SEFI, a Fine-grained Soft Error Fault Injector, as a tool for profiling software robustness against soft errors. We use soft error injection to mimic the effect of errors on logic circuit behavior. Leveraging the open source virtual machine hypervisor QEMU, F-SEFI enables users to modify emulated machine instructions to introduce soft errors. F-SEFI controls which application and which sub-function to target, and when and how to inject soft errors, at different granularities, without interfering with other applications that share the same environment. We demonstrate use cases of F-SEFI on several benchmark applications with different characteristics to show how data corruption can propagate to incorrect results. Findings from such fault injection campaigns can inform the design of robust software and power-efficient hardware.
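To make the idea of a soft error concrete, the following is a minimal Python sketch of a single bit flip in the result of one arithmetic operation. It is not F-SEFI's implementation, which works on emulated machine instructions inside QEMU; the names flip_bit and dot, and the fault_at and bit parameters, are hypothetical and chosen only for illustration.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return value with one bit of its IEEE 754 double encoding inverted.

    Hypothetical helper: bit 0 is the least significant mantissa bit,
    bit 63 is the sign bit.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range 0..63")
    # Reinterpret the double as a 64-bit unsigned integer.
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    # Invert the selected bit and reinterpret the result as a double.
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return corrupted


def dot(xs, ys, fault_at=None, bit=0):
    """Dot product that optionally corrupts one intermediate product.

    fault_at selects the iteration whose multiply result is corrupted
    (None means a fault-free run); bit selects which bit to flip.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        product = x * y
        if i == fault_at:
            # Assumed fault model: a single bit flip in the multiply result.
            product = flip_bit(product, bit)
        total += product
    return total


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    print("fault-free:        ", dot(xs, ys))
    print("mantissa bit 0:    ", dot(xs, ys, fault_at=1, bit=0))
    print("sign bit (bit 63): ", dot(xs, ys, fault_at=1, bit=63))
```

In this sketch, the effect of a flip depends heavily on which bit is hit: a flip in the lowest mantissa bit of one product is absorbed by rounding and leaves the sum at 32.0, while a flip in the sign bit of the same product changes the sum from 32.0 to 12.0. That spread is the kind of propagation behavior a fault injection campaign is meant to profile.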
