A Generic Dual Core Architecture with Error Containment
keywords: Dual core, error detection, self-checking checker, common mode failure, fail-silent processor
The dual core strategy allows to construct a fail-silent processor from two instances (master/checker) of any arbitrary standard processor. Its main drawbacks are its vulnerability with respect to common mode failures and the existence of residual single points of failure. In this paper we propose a generic frame that systematically eliminates these drawbacks. First, we employ temporal redundancy to cope with common mode failures. Unlike similar approaches we can ensure error containment even if -- as a result of the temporal redundancy -- the comparison by the checker core is delayed. We attain this by introducing a specific delay element for outgoing data. Second, we perform a systematic analysis of potential single points of failure and eliminate these by careful layout, self-checking circuits and similar methods. We finally validate our approach by means of exhaustive fault injection experiments. The results indicate a 100% self-checking coverage for stuck-at faults and complete error containment. Since the proposed framework has been kept generic in the sense that the individual standard processor cores are treated as black boxes, these results are valid independent of the core actually used.
reference: Vol. 23, 2004, No. 5-6, pp. 517–535