Henry Wong, Vaughn Betz, and Jonathan Rose
University of Toronto
February, 2011
As soft processors are increasingly used in diverse applications, there is a need to evolve their microarchitectures in a way that suits the FPGA implementation substrate. As a first step in our research on a new soft processor architecture, we compare the delay and area of a comprehensive set of processor building block circuits when implemented on custom CMOS and FPGA substrates. We then use the results of these comparisons to infer how the microarchitecture of soft processors on FPGAs should differ from that of hard processors on custom CMOS.
We find that the ratio of FPGA area to custom CMOS area varies significantly more across building blocks than the corresponding delay ratio does. As area is often a key design constraint in FPGA circuits, area ratios have the most impact on microarchitecture choices. Complete processor cores have area ratios of 17-27x and delay ratios of 18-26x. Building blocks that have dedicated hardware support on FPGAs, such as SRAMs, adders, and multipliers, are particularly area-efficient (2-7x area ratio), while multiplexers and CAMs are particularly area-inefficient (>100x area ratio).
Compared with similar hard processors, FPGA implementations have relatively cheaper ALUs and cache capacity and relatively more expensive bypass networks. As soft processor microarchitecture evolves, microarchitecture choices that take these costs into account can differ from those made in the past for hard processors of similar complexity.
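As a reading aid, the area and delay ratios quoted above are the FPGA measurement divided by the custom CMOS measurement for the same circuit. The following is a minimal sketch of that arithmetic, not the authors' tooling: the function name `ratio`, the dictionary layout, and every number in it are hypothetical placeholders chosen only to show the calculation.

```python
def ratio(fpga: float, cmos: float) -> float:
    """Return the FPGA-to-custom-CMOS ratio for one metric (area or delay).

    Both measurements must be in the same unit for the ratio to be meaningful.
    """
    if cmos <= 0:
        raise ValueError("custom CMOS measurement must be positive")
    return fpga / cmos


# Hypothetical, normalized area figures (custom CMOS = 1.0).
# These are NOT measurements from the paper; they only illustrate the formula.
blocks = {
    "adder": {"fpga_area": 6.0, "cmos_area": 1.0},
    "multiplexer": {"fpga_area": 150.0, "cmos_area": 1.0},
}

area_ratios = {
    name: ratio(m["fpga_area"], m["cmos_area"]) for name, m in blocks.items()
}

for name, r in area_ratios.items():
    print(f"{name}: {r:.1f}x area ratio")
```

A larger ratio means the block is relatively more expensive on an FPGA, which is the sense in which the abstract calls multiplexers area-inefficient and adders area-efficient.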