Access Restriction Open

Source CiteSeerX
Content type Text
File Format PDF
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Abstract In coming years, a number of factors will lead to a significant shift in the way computer systems manage reliability, variation, and fabrication. Currently, computer systems assume perfect device fabrication and operation. For high-reliability systems, the usual methods of increasing system reliability are error-correcting codes (ECC) on memories and triple modular redundancy (TMR) for critical components. These brute-force methods can increase system reliability when silicon fabrication processes deliver high individual device reliability and low variation. However, as the critical dimensions of the devices used to implement computer systems, such as transistors and wires, shrink to only a few nanometers, rates of transient faults, permanent faults, and variation between devices on the same die are expected to increase to the point where this approach will no longer be practical. Instead, computer systems will need to adopt a model in which each layer in the abstraction hierarchy (applications, O/S, architecture, circuits) is prepared for the layer below to transmit bad data, and in which all of the layers in the hierarchy cooperate to deliver correct operation in spite of faults, variations, and other effects. This shift to a multi-level approach to resilience is further motivated by trends in fabrication
Educational Role Student ♦ Teacher
Age Range Above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Publisher Date 2008-01-01