Abstract
Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their performance and flexibility advantages. Typically, CGRAs incorporate many processing elements in the form of an array, which is suitable for implementing spatial redundancy, as used in the design of fault-tolerant systems. This article introduces a recovery time model for transient faults in CGRAs. The proposed fault-tolerant CGRAs are based on triple modular redundancy and coding techniques for error detection and correction. To evaluate the model, several kernels from space computing are mapped onto the suggested architecture. We demonstrate the tradeoff between recovery time, performance, and area. In addition, the average execution time of an application including recovery time is evaluated using area-based error-rate estimates in harsh radiation environments. The results show that task partitioning is important for bounding the recovery time of applications that have long execution times. It is also shown that error-correcting code (ECC) is of limited practical value for tasks with long execution times in high radiation environments, or when the degree of task partitioning is high.
Original language | English |
---|---|
Article number | 42 |
Pages (from-to) | 1-21 |
Number of pages | 21 |
Journal | ACM Transactions on Embedded Computing Systems |
Volume | 17 |
Issue number | 2 |
Publication status | Published - Nov 2017 |
Keywords
- coarse-grained reconfigurable architecture
- triple modular redundancy