Error Detection Using Dynamic Dataflow Verification

Albert Meixner and Daniel Sorin
Duke University


Abstract

Continued scaling of CMOS technology to smaller transistor sizes makes modern processors more susceptible to both transient and permanent hardware faults. Circuit-level techniques for reducing fault rates exist, but it is widely accepted that architects will have to design processors to tolerate faults. A large portion of the logic within a superscalar processor converts the linear instruction stream into a representation that allows instructions to execute in data dependence order rather than program order, in order to extract instruction-level parallelism. Errors caused by faults in this logic (which includes the fetch and decode stages, the renaming and scheduling logic, and the commit stage) manifest themselves as an incorrectly constructed dataflow graph. Dynamic Dataflow Verification (DDFV) compares the dynamically constructed and executed dataflow graph to the expected dataflow graph of the static program binary, which is represented by a checksum embedded in the instruction stream, and thereby comprehensively detects errors in the logic and structures involved. DDFV can detect errors due to permanent faults, transient faults, and design bugs with high probability at a low hardware and performance cost.
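
The comparison described above can be illustrated with a small software sketch. It is not the hardware mechanism of DDFV: the tuple encoding of instructions, the four-register file, the truncated SHA-256 hash, and the names `_h` and `dataflow_signature` are all assumptions made for illustration. The idea shown is only that a checksum derived from producer-consumer relationships stays the same when independent instructions are reordered, but changes when an operand is wired to the wrong producer.

```python
import hashlib


def _h(*parts):
    # Hypothetical hash: 32 bits of SHA-256 over the joined parts.
    data = "|".join(str(p) for p in parts).encode()
    return hashlib.sha256(data).hexdigest()[:8]


def dataflow_signature(block, num_regs=4):
    """Fold the dataflow graph of one basic block into a single checksum.

    `block` is a list of (opcode, dest_reg, src_regs) tuples (assumed encoding).
    """
    # Each register starts with a history that names its block-entry value.
    history = {r: _h("in", r) for r in range(num_regs)}
    for opcode, dest, srcs in block:
        # A result's history depends on the opcode and on the histories
        # of the values it consumes, i.e. on its incoming dataflow edges.
        history[dest] = _h(opcode, *(history[s] for s in srcs))
    # The block signature combines the final history of every register.
    return _h(*(history[r] for r in range(num_regs)))


# Signature of the static program, as it might be embedded in the binary.
program = [("add", 2, (0, 1)), ("mul", 3, (2, 0)), ("sub", 1, (3, 2))]
expected = dataflow_signature(program)

# Simulated fault: "mul" reads r1 instead of r2 (a wrong dataflow edge).
faulty = [("add", 2, (0, 1)), ("mul", 3, (1, 0)), ("sub", 1, (3, 2))]
observed = dataflow_signature(faulty)
error_detected = observed != expected

# Two independent instructions give the same signature in either order.
in_order = [("add", 2, (0, 1)), ("mul", 3, (0, 1))]
reordered = [("mul", 3, (0, 1)), ("add", 2, (0, 1))]
order_independent = dataflow_signature(in_order) == dataflow_signature(reordered)
```

Because the signature is a fixed-width hash, two different graphs can in principle collide, which is consistent with detection holding "with high probability" rather than with certainty.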