Reducing energy while maintaining reliability: FPGAs and neural networks

Presented at the 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), the LEGaTO paper ‘Comprehensive Evaluation of Supply Voltage Underscaling in FPGA on-Chip Memories’ sought to answer whether you could reduce the supply voltage of hardware accelerators while maintaining reliability. In this article, the paper’s main author Behzad Salami sets out how this innovative research study will help deliver more energy-efficient computing.

Voltage underscaling is an energy-saving technique that can be applied to all processors and accelerators. However, while reducing the supply voltage lowers energy use, it also increases circuit delay, which can lead to timing errors and so harm reliability. In this paper, we studied voltage underscaling for accelerators by experimentally evaluating the trade-off between these two parameters: energy and reliability.

We evaluated this technique for field-programmable gate arrays (FPGAs): more flexible than application-specific integrated circuits (ASICs) and more energy efficient than central processing units (CPUs), they have become increasingly popular for high-performance computing (HPC), cloud computing, and data centres. As an example, a TOP500 report quoted Intel executives as saying that by 2020 FPGAs would be present in roughly one in three data centres. Their increasing popularity is partly due to the wider availability of support such as high-level synthesis tools, which means that developers do not need to have intimate knowledge of the underlying hardware.

The nominal supply voltage for processors and accelerators can be very conservative; indeed, in many cases it is unnecessarily high, because vendors such as Xilinx set it to accommodate worst-case needs. Our experiments showed that the supply voltage of FPGAs is overprovisioned by 40%, meaning that the voltage could be reduced to 60% of the nominal level without any performance degradation. This would cut energy use to around a tenth of the current level. However, reducing the voltage further, below this guardband, adversely affects reliability.

Evaluating voltage underscaling in FPGAs

FPGAs are composed of many components, such as on-chip memories and digital signal processors (DSPs). We concentrated initially on the on-chip memories because of their crucial role in accelerating state-of-the-art applications such as neural networks. When the voltage is reduced below the guardband, faults (bit-flips) can appear in on-chip memories. In this paper, we undertook an extensive characterization of these faults, looking at how they are distributed inside FPGA on-chip memories and how they behave in terms of rate, location and so on. We obtained some interesting results: for example, the fault rate increases exponentially with voltage reduction, faults are non-uniformly distributed over the chip due to within-die process variation, and changing the environmental temperature also has some interesting effects.
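To make the exponential trend concrete, here is a minimal Python sketch of a fault-injection model: no faults at or above the guardband edge, and a per-bit fault probability that grows exponentially as the voltage drops below it. The function names and all parameters (v_min, rate_at_vmin, growth) are hypothetical placeholders chosen for illustration. They are not values or code from the paper, and the sketch assumes faults are independent and uniformly likely per bit, which ignores the non-uniform distribution we observed on real chips.

```python
import math
import random


def fault_rate(voltage, v_min, rate_at_vmin, growth):
    """Hypothetical per-bit fault probability at a given supply voltage.

    Above or at v_min (the edge of the guardband) no faults are modelled.
    Below it, the probability grows exponentially with the voltage reduction.
    All parameters are illustrative assumptions, not measured values.
    """
    if voltage >= v_min:
        return 0.0
    return min(1.0, rate_at_vmin * math.exp(growth * (v_min - voltage)))


def inject_faults(words, bits_per_word, p, rng):
    """Return a copy of `words` with each bit flipped independently with probability p."""
    faulty = []
    for word in words:
        for bit in range(bits_per_word):
            if rng.random() < p:
                word ^= 1 << bit  # flip this bit
        faulty.append(word)
    return faulty


def count_bit_flips(original, faulty):
    """Count how many bits differ between two equally long lists of words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(original, faulty))
```

A model like this could be used to corrupt a copy of a memory image at several voltages and count the resulting bit-flips, for example `count_bit_flips(mem, inject_faults(mem, 16, fault_rate(v, v_min, r, k), random.Random(0)))`, before moving to measurements on real hardware.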

We also evaluated several techniques for reliability management, such as error-correcting code (ECC). ECC is the traditional way of detecting and correcting errors. Although ECC is normally used for soft errors, we found that it covers most of the undervolting-related faults.
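As a small illustration of how ECC corrects a bit-flip, the following Python sketch implements the textbook Hamming(7,4) code, which stores 4 data bits with 3 parity bits and can correct any single flipped bit in the 7-bit codeword. This is not the ECC scheme evaluated in the paper; memory ECC generally protects much wider words, and Hamming(7,4) is used here only because it is small enough to read. The function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Codeword positions are numbered 1..7; parity bits sit at positions
    1, 2 and 4, and data bits d0..d3 at positions 3, 5, 6 and 7.
    Bit i of the returned integer holds position i + 1.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(bit << i for i, bit in enumerate(bits))


def hamming74_decode(codeword):
    """Decode a 7-bit codeword, correcting at most one flipped bit.

    Returns (nibble, syndrome). A syndrome of 0 means no error was seen;
    otherwise it is the 1-based position of the bit that was corrected.
    """
    bits = [(codeword >> i) & 1 for i in range(7)]
    # XOR together the positions of all set bits; a valid codeword gives 0,
    # and a single flipped bit gives exactly its own position.
    syndrome = 0
    for position in range(1, 8):
        if bits[position - 1]:
            syndrome ^= position
    if syndrome:
        bits[syndrome - 1] ^= 1  # correct the faulty bit
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, syndrome
```

Flipping any one bit of `hamming74_encode(n)` and passing the result to `hamming74_decode` recovers `n`. Two or more flips in the same codeword are beyond what this code can correct, which is why the number of faults per word matters when ECC is used against undervolting faults.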

As a case study, we evaluated neural network applications. We found that although undervolting leads to significant energy savings, it can dramatically reduce the accuracy of the neural network. To obtain the energy savings without compromising accuracy, we evaluated some novel fault mitigation techniques, which rely on the behaviour of the undervolting-related faults. As a result, we reduced energy use by almost ten times with a negligible loss in neural network accuracy.

Enabling vehicles to talk with less energy

In HPC centres, edge computing and cloud computing, FPGAs allow energy savings at a lower cost than ASICs. In particular, an FPGA-based, low-energy, and fault-resilient neural network accelerator could benefit a number of real-world applications. One very promising application is the machine learning use case in LEGaTO, where we are working with LEGaTO partner Machine Intelligence Sweden (MIS) to optimize their tool for smart vehicles. In parallel, we are working with colleagues at BSC and the Universitat Politècnica de Catalunya (UPC) – Barcelona Tech on the low-energy, task-based programming model OmpSs@FPGA.

LEGaTO aims to achieve 10X greater energy efficiency, and this technique will help the project meet this goal through hardware technology and programming models.