Objective

Diamond is a behavioral synthesis system being developed at the University of Toronto. It automatically synthesizes programs written in ANSI C, with the full construct set and unmodified semantics, into either an architectural signoff (in hardware carrying code) or a register transfer level signoff (in RTL Verilog). The primary objective of the project is to scale behavioral synthesis technology, traditionally limited to programs with the complexity of a hundred-line filter, to large C programs of SPEC 2000 complexity, while maintaining competitive synthesis quality.

 

Background

Despite the intensive research effort invested since the early 1980s, behavioral synthesis has not been embraced by the design community as widely as intended. Although many have attributed this reluctance to cultural reasons, we believe the reasons are technical.

 
  Size matters The complexity of the applications that current behavioral synthesis tools can accept has not been compelling enough to justify a complete departure from RTL design entry.

 
  Verification matters Unlike synchronous RTL, the untimed nature of a behavioral specification makes it difficult to compose components into larger systems and to verify them against low level implementations by regression.

 
  Quality matters The quality of large synthesized designs is far from what human designers can produce. In particular, the lack of integration with downstream physical design tools sometimes makes synthesis decisions made at a high level meaningless.

 
  To C or Not C Although an encouraging number of academic and commercial efforts have recently emerged to synthesize hardware directly from C/C++ based algorithms, none of the problems mentioned above has been resolved, and more confusion seems to be building up: some restrict the C/C++ language to their ``synthesis subsets'', some repeat the mistake of VHDL/Verilog by mixing specification with simulation, and some effectively define new languages with merely the syntactic look of C.

 

Ideas

Diamond is built on several basic ideas.

 
  Leaving C alone The true value of using C as design entry lies in the amount of legacy code already available and the amount of C education already established in engineering schools and industry. It is important to use C in its entirety, and in the same way as it is used in software. The starting principle of Diamond is therefore not only to maintain the full syntax and semantics of C as design entry, but also to give the synthesis tool the feel of a compiler.

 
  Scaling application To scale the targeted application to large C programs, we develop interprocedural analysis algorithms to aggressively infer complex behavior. In particular, pointer analysis is performed to attack one of the most cited difficulties in parallelizing C programs.

 
  Scaling architecture Unlike the common practice of synthesizing different procedures in a large program into separate hardware blocks, each with an FSMD microarchitecture (finite state machine based controller with a datapath), we devise a new microarchitecture called a stacked FSMD (SFSMD), which contains a shared datapath and a set of separate controllers, one for each procedure. A hardware stack is used to implement procedure call linkage, and Diamond can optionally perform inlining and exlining optimizations such that the stack can be reduced to a single register.

 
  Scaling quality Diamond adapts and invents interprocedural optimization techniques to perform aggressive optimizations across hyperblock, loop and procedure boundaries. Scheduling, register allocation and binding tasks are also performed in an interprocedural manner.

 
  Standardizing interfaces Despite our enthusiasm, we do not believe C synthesis is a signoff technology, due to C's inherent lack of composability. It is therefore important to keep behavioral synthesis as a component technology and put it in the context of system level design. To this end, we standardize the interaction of the synthesized component with its environment, typically a system-on-chip, upfront, such that verification and integration can be performed seamlessly.
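  The stacked FSMD organization described under Scaling architecture can be pictured with a small software model. The sketch below is a toy illustration only, not Diamond's actual design: the register names (n, result), the frame layout, the stack depth and the factorial example are all assumptions made for this page. It shows two controllers (main and fact) taking turns driving one shared set of datapath registers, with a stack recording which controller and state to resume after a call.

```c
#include <assert.h>

#define STACK_DEPTH 16              /* assumed depth of the call stack */

enum proc { PROC_MAIN, PROC_FACT }; /* one controller per procedure */

struct frame {                      /* one entry of the call stack */
    enum proc ctrl;                 /* controller to resume */
    int state;                      /* state to resume in */
    int saved_n;                    /* caller's copy of register n */
};

struct sfsmd {
    int n, result;                  /* shared datapath registers */
    enum proc active;               /* controller driving the datapath */
    int state;                      /* current state of that controller */
    struct frame stack[STACK_DEPTH];
    int sp;
    int halted;
};

/* Call linkage: save the resume point and n, then start the callee. */
static void call(struct sfsmd *m, enum proc callee, int resume_state)
{
    assert(m->sp < STACK_DEPTH);
    m->stack[m->sp].ctrl = m->active;
    m->stack[m->sp].state = resume_state;
    m->stack[m->sp].saved_n = m->n;
    m->sp++;
    m->active = callee;
    m->state = 0;
}

/* Return linkage: restore the caller's controller, state and n. */
static void ret(struct sfsmd *m)
{
    assert(m->sp > 0);
    m->sp--;
    m->active = m->stack[m->sp].ctrl;
    m->state = m->stack[m->sp].state;
    m->n = m->stack[m->sp].saved_n;
}

/* Controller for main: call fact on register n, then halt. */
static void step_main(struct sfsmd *m)
{
    if (m->state == 0)
        call(m, PROC_FACT, 1);
    else
        m->halted = 1;
}

/* Controller for fact(n) = n <= 1 ? 1 : n * fact(n - 1). */
static void step_fact(struct sfsmd *m)
{
    if (m->state == 0) {
        if (m->n <= 1) {
            m->result = 1;
            ret(m);
        } else {
            call(m, PROC_FACT, 1);  /* saves the current n */
            m->n -= 1;              /* argument for the callee */
        }
    } else {
        m->result *= m->n;          /* n was restored by the callee's ret */
        ret(m);
    }
}

/* Step the machine until main halts; returns the number of steps taken. */
int run_factorial(int n, int *result)
{
    struct sfsmd m = {0};
    int steps = 0;

    m.n = n;
    m.active = PROC_MAIN;
    while (!m.halted) {
        if (m.active == PROC_MAIN)
            step_main(&m);
        else
            step_fact(&m);
        steps++;
    }
    assert(m.sp == 0);              /* every call was matched by a return */
    *result = m.result;
    return steps;
}
```

  Calling run_factorial(5, &r) leaves 120 in r. In this toy model only one controller is active at a time, so the registers n and result are never duplicated per procedure; the model handles n up to the assumed stack depth and ignores integer overflow.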

 

Usage

Diamond runs in three modes.

 
  standalone mode In this mode, Diamond synthesizes a complete C program into a hardware block with procedure main as the entry point. This mode is only used for research purposes.

 
  streaming mode In this mode, Diamond synthesizes arbitrary C functions into hardware blocks that process streaming data with specified throughput. This mode is targeted towards real time signal processing applications.

 
  codesign mode In this mode, Diamond synthesizes a hardware accelerator for the performance demanding procedures of a complete program, and also generates the rest of the functionality as software running on an extensible processor, with frequently executed patterns automatically extracted as instructions.
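  As an illustration of the kind of input the streaming mode is aimed at, the sketch below shows an ordinary ANSI C function that processes a stream of samples. The function name, its signature and the four-tap window are assumptions made for this page; how Diamond is told the required throughput is not described here, so no tool-specific annotation is shown.

```c
#include <stddef.h>

#define TAPS 4                      /* assumed window length */

/*
 * Hypothetical streaming kernel: out[i] is the integer average of the
 * last TAPS input samples (samples before the start count as zero).
 * One input sample is consumed and one output produced per iteration.
 */
void moving_average(const int *in, int *out, size_t n)
{
    int window[TAPS] = {0};         /* circular buffer of recent samples */
    int sum = 0;                    /* running sum of the window */
    size_t i;

    for (i = 0; i < n; i++) {
        size_t slot = i % TAPS;

        sum += in[i] - window[slot];  /* add new sample, drop the oldest */
        window[slot] = in[i];
        out[i] = sum / TAPS;
    }
}
```

  The loop body does a fixed amount of work per sample, which is the property that makes a throughput target meaningful for a function of this shape.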

 

Publications

 
 

[1] Rami Beidas and Jianwen Zhu, ``Soft register allocation,'' IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, submitted 2005.
 

[2] Rami Beidas and Jianwen Zhu, ``Scalable inter-procedural register allocation for high level synthesis,'' in Proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), Shanghai, China, January 2005.
 

[3] Jianwen Zhu, ``Color permutation: an iterative algorithm for memory packing,'' in Proceedings of the International Conference on Computer Aided Design (ICCAD), San Jose, California, November 2001.
 

[4] Jianwen Zhu, ``Static memory allocation by pointer analysis and coloring,'' in Proceedings of the Design Automation and Test Conference in Europe (DATE), Munich, Germany, March 2001.
 

[5] Rami Beidas, ``Context-flow SOC architecture,'' M.S. thesis, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Apr. 2004.
 

[6] Khushwinder Jasrotia and Jianwen Zhu, ``Hardware implementation of a memory allocator,'' in Euromicro Symposium on Digital System Design, Dortmund, Germany, September 2002.
 

[7] Khushwinder Jasrotia, ``Stacked FSMD: A new micro-architecture model for high level synthesis,'' M.S. thesis, Department of Electrical and Computer Engineering, University of Toronto, Toronto, July 2003.
 

[8] Khushwinder Jasrotia and Jianwen Zhu, ``Stacked FSMD: A power efficient micro-architecture for high-level synthesis,'' in International Symposium on Quality Electronic Design (ISQED), San Jose, California, March 2004.
 

[9] Rami Beidas and Jianwen Zhu, ``Context-flow system-on-chip platform,'' Tech. Rep. TR-04-01-03, Department of Electrical and Computer Engineering, University of Toronto, Apr. 2003.
 

[10] Rami Beidas and Jianwen Zhu, ``A queuing theoretic performance model for context-flow system-on-chip platform,'' in Workshop on Embedded Systems for Real-Time Multimedia, Stockholm, Sweden, September 2004.
 

[11] Rami Beidas and Jianwen Zhu, ``Performance efficiency of context-flow system-on-chip platform,'' in Proceedings of the International Conference on Computer Aided Design (ICCAD), San Jose, California, November 2003.

Contributors

Toronto Synthesis Group: Rami Beidas, Wai Sum Mong, Jianwen Zhu

 

Web http://www.eecg.toronto.edu/~jzhu/diamond.html