Comparing FPGA vs. Custom CMOS and the Impact on Processor Architecture

Henry Wong, Vaughn Betz, and Jonathan Rose

University of Toronto

February, 2011

As soft processors are increasingly used in diverse applications, there is a need to evolve their microarchitectures in a way that suits the FPGA implementation substrate. As a first step in our research on a new soft processor architecture, we compare the delay and area of a comprehensive set of processor building block circuits when implemented on custom CMOS and FPGA substrates. We then use the results of these comparisons to infer how the microarchitecture of soft processors on FPGAs should differ from that of hard processors on custom CMOS.

We find that the ratios of the area required by an FPGA to that of custom CMOS for different building blocks vary significantly more than the speed ratios. As area is often a key design constraint in FPGA circuits, area ratios have the most impact on microarchitecture choices. Complete processor cores have area ratios of 17-27x and delay ratios of 18-26x. Building blocks that have dedicated hardware support on FPGAs, such as SRAMs, adders, and multipliers, are particularly area-efficient (2-7x area ratio), while multiplexers and CAMs are particularly area-inefficient (>100x area ratio).

On FPGAs, ALUs and cache capacity are relatively cheaper, and bypass networks relatively more expensive, than on similar hard processors. As soft processor microarchitecture evolves, microarchitecture choices that take these costs into account can be different from those made for hard processors of similar complexity in the past.
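
As a rough illustration of how the area ratios above can be read, the sketch below computes an FPGA-to-custom-CMOS area ratio and compares it with the 17-27x range reported for complete processor cores. The function names and the three-way classification are assumptions made for illustration, not the authors' methodology, and the input areas in the usage lines are hypothetical placeholders rather than measured data.

```python
# Whole-core area ratio range quoted in the abstract (FPGA area / custom CMOS area).
CORE_AREA_RATIO = (17.0, 27.0)


def area_ratio(fpga_area, cmos_area):
    """Return FPGA area divided by custom CMOS area (both in the same units)."""
    if fpga_area <= 0 or cmos_area <= 0:
        raise ValueError("areas must be positive")
    return fpga_area / cmos_area


def relative_cost(ratio, core_range=CORE_AREA_RATIO):
    """Classify a building block's area ratio against the whole-core range.

    Illustrative assumption: a block whose ratio falls below the core range is
    treated as relatively cheap on an FPGA, and one above it as relatively
    expensive.
    """
    low, high = core_range
    if ratio < low:
        return "relatively cheap on FPGA"
    if ratio > high:
        return "relatively expensive on FPGA"
    return "comparable to a whole core"


# Hypothetical areas in arbitrary units, chosen only to exercise each branch.
print(relative_cost(area_ratio(5.0, 1.0)))    # ratio inside the 2-7x range
print(relative_cost(area_ratio(120.0, 1.0)))  # ratio above 100x
```

Under this reading, a block in the 2-7x range (such as an SRAM, adder, or multiplier) consumes a smaller share of a soft processor's area than it would of a hard processor's, while a block above 100x (such as a multiplexer or CAM) consumes a larger share.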