The Cisco Global Cloud Index has predicted a three-fold increase in data center traffic by 2019, with about 86% of workloads processed in cloud data centers and the remaining 14% in traditional data centers. Ever-increasing demand from individuals and enterprises has created the need for hyperscale infrastructure: state-of-the-art data centers able to adapt to constantly fluctuating internet demand.
FPGAs (Field Programmable Gate Arrays) present a flexible way to accelerate network performance. Microsoft is deploying them in its Azure hyperscale data centers as networking cards and accelerators, and semiconductor giant Intel is shipping Stratix 10 FPGAs alongside its Xeon processors as accelerators for its customers. The conventional role of SRAM-based FPGAs in networking has now expanded to accelerating servers.
What is FPGA?
Field-programmable gate arrays (FPGAs) are versatile silicon chips that are proving to be extremely fast at certain operations. An FPGA is an array of bit-processing units (logic blocks) and programmable interconnect; the logic blocks and interconnect can be configured after fabrication to wire the inputs and outputs (I/O) together in many different ways, so the device can even perform massively parallel operations such as real-time processing.
The FPGA attraction
The difficulty of scaling single-thread performance without undue power dissipation has led CPU vendors to integrate multiple cores onto a single die. Even so, a traditional CPU executes a sequential instruction stream: the operating system queues up instructions and the processor works through them essentially one at a time. FPGAs, however, can offload work from the CPU and handle extremely high bandwidths by processing many data streams simultaneously.
The overarching goal of hardware acceleration is to increase the speed at which data can be processed by using custom hardware designed to implement a specific routine. This offers two potential speed advantages:
- First, the CPU can process other data while the computation for the accelerated routine is offloaded to the co-processor. The only time the processor must spend on the computation is the time it takes to set up the co-processor and the time it takes to receive the results.
- Second, the hardware accelerator can be structured to calculate the result faster than a software implementation.
Ideally, both conditions are met: the processor processes data concurrently with the co-processor, and the custom hardware is faster than the software implementation.
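The offload pattern described above can be sketched in plain C. This is only an illustration, not real accelerator driver code: a worker thread stands in for the FPGA co-processor, and the `accel_job` type and `run_offload_demo` function are hypothetical names.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical accelerator job: sum an array. On a real system this
 * computation would run in FPGA fabric; here a thread stands in for it. */
typedef struct {
    const int *data;
    int n;
    long result;
} accel_job;

static void *accelerator(void *arg) {
    accel_job *job = (accel_job *)arg;
    long sum = 0;
    for (int i = 0; i < job->n; i++)
        sum += job->data[i];
    job->result = sum;
    return NULL;
}

/* The offload pattern: set up the co-processor, keep the CPU busy with
 * other work, then collect the result. Returns the accelerator's result. */
long run_offload_demo(void) {
    static const int data[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    accel_job job = {data, 8, 0};

    pthread_t t;
    pthread_create(&t, NULL, accelerator, &job);  /* set-up cost */

    long cpu_work = 0;  /* the CPU is free to process other data meanwhile */
    for (int i = 0; i < 1000; i++)
        cpu_work += i;

    pthread_join(t, NULL);  /* cost of receiving the results */
    printf("accelerator: %ld, cpu-side work: %ld\n", job.result, cpu_work);
    return job.result;
}
```

The two costs named in the first bullet above correspond to the `pthread_create` and `pthread_join` calls; everything between them is time the CPU spends on unrelated work.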
FPGAs offer tremendous performance potential. They can host several different parallel computations and execute them concurrently, producing results on every clock cycle, and because FPGAs are reprogrammable, the same chip can serve a number of applications. On-chip memory can hold the co-processor logic's working data, so memory bandwidth is not restricted by the number of I/O pins on the device. Moreover, memory sits closely coupled to the algorithm logic, which removes the need for an external high-speed memory cache; this in turn saves power and improves coherence. Using internal memory also means that no additional I/O pins are required to increase the accessible memory size, simplifying design scaling.
The standard way of describing software is with high-level languages (HLLs) such as C, C++, or Fortran; the standard way of describing hardware is with hardware description languages (HDLs) such as VHDL and Verilog. Describing hardware in HLLs is possible and has been tried in several commercial products such as Xilinx Forge, Celoxica Handel-C, Impulse-C, Mitrion-C, and OpenCL. These languages trade a shorter development time against the performance overhead imposed by the higher level of abstraction.
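To make the HLL-to-hardware idea concrete, the sketch below shows the kind of C loop kernel that high-level synthesis flows typically map to pipelined, parallel hardware. The function name and sizes are illustrative only; on a CPU the loop runs sequentially, while a synthesis tool can unroll or pipeline it.

```c
#include <stddef.h>

/* Illustrative HLS-style kernel: element-wise vector addition.
 * Each iteration is independent, so a synthesis tool can turn the loop
 * into parallel adders or a pipeline producing one result per cycle;
 * an ordinary compiler executes it one iteration at a time. */
void vadd(const int *a, const int *b, int *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

The appeal of describing this in C rather than an HDL is exactly the trade-off noted above: the source stays readable to software developers, at the cost of giving up some control over the generated hardware.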
Describing hardware in HLLs, or at least with data flow diagrams, is a major and distinctive feature of high-performance accelerators. It allows mathematicians and computer scientists to develop entire applications without relying on hardware designers, and it substantially increases design productivity. A compiler for this kind of design must combine the capabilities of traditional microprocessor compilation tools with those of computer-aided design tools for FPGAs. It must also extend these two separate sets of tools with capabilities for mutual synchronization and data transfer between the microprocessor and the accelerators.
Patented ideas for acceleration tech
Altera was one of the very first to patent acceleration technology, in 1997. Over the years, networking, electronics, semiconductor, social media, and e-commerce giants such as LG, IBM, Qualcomm, Facebook, Microsoft, and Amazon have recognized the technology and implemented it to boost the performance of their products.
When addressing the hardware needs of data centers, network acceleration, and deep learning, FPGAs provide an attractive alternative. In particular, their ability to exploit pipeline parallelism at an efficient rate of power consumption gives FPGAs a unique advantage and makes them a key technology for the future.
(Featured image source: https://i.ytimg.com/vi/13CHJehqqGs/maxresdefault.jpg)