Diamond is a behavioral synthesis system being developed at the University
of Toronto. It automatically synthesizes programs written in ANSI C, with
its full construct set and unmodified semantics, into either an
architectural signoff (in hardware carrying code) or a register transfer
level signoff (in RTL Verilog). The primary objective of the project is to
scale the application of behavioral synthesis technology, traditionally
limited to programs of hundred-line filter complexity, to large C programs
of SPEC 2000 complexity, while maintaining competitive synthesis quality.
Despite the intensive research effort invested since the early 1980s,
behavioral synthesis has not been embraced by the design community as
much as was intended. Although many have attributed this reluctance to
cultural reasons, we believe the reasons are technical.
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
Objective
Background
Size matters
The complexity of the applications that current behavioral synthesis
tools can accept has not been compelling enough to justify a complete
departure from RTL design entry.
Verification matters
Unlike synchronous RTL, the untimed nature of behavioral specification
makes it difficult to compose components into larger systems and to
verify them against low level implementations by regression.
Quality matters
The quality of large synthesized designs is far from what human
designers can produce. In particular, the lack of integration with
downstream physical design tools sometimes makes synthesis decisions
made at a high level meaningless.
To C or Not C
Although an encouraging number of academic and commercial efforts have
recently emerged to synthesize hardware directly from C/C++ based
algorithms, none of the above mentioned problems has been resolved, and
more confusion seems to be building up: some restrict the C/C++ language
to their ``synthesis subsets'', some repeat the mistake of VHDL/Verilog
by mixing specification with simulation, and some effectively define new
languages with merely the syntactic look of C.
Ideas
Diamond is built on several basic ideas.
Leaving C alone
The true value of using C as design entry lies in the amount of legacy
code already available and the amount of education already in place in
engineering schools and industry. It is important to use C in its
entirety, and in the same way as it is used in software. The starting
principle of Diamond is therefore not only to maintain the full syntax
and semantics of C as design entry, but also to maintain the feel of a
compiler in a synthesis tool.
Scaling application
To scale the targeted application to large C programs, we develop
interprocedural analysis algorithms to aggressively infer complex
behavior. In particular, pointer analysis is performed to attack one of
the most cited difficulties in parallelizing C programs.
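To illustrate why pointer analysis matters for parallelization, consider
the small sketch below. The function name scale_add and its parameters are
hypothetical and are not taken from Diamond. If dst and src may refer to
overlapping storage, a store in one iteration can feed a load in a later
iteration, so the iterations must run in order. Only when analysis can
show that the two pointers never overlap may the iterations be scheduled
in parallel.

```c
#include <stddef.h>

/* Hypothetical example, not part of Diamond.
 * Each iteration reads src[i] and writes dst[i].  If dst and src overlap
 * (for example dst == src + 1), iteration i writes the value that
 * iteration i + 1 reads, which creates a loop-carried dependence. */
void scale_add(int *dst, const int *src, size_t n, int k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k + 1;
}
```

Called on two disjoint arrays, every output element depends only on its
own input element. Called as scale_add(buf + 1, buf, 3, 2) on a single
buffer, each result depends on the previous one, so reordering the
iterations would change the answer.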
Scaling architecture
Unlike the common practice of
synthesizing different procedures in a large program into separate
hardware blocks, each with an FSMD microarchitecture (finite state
machine based controller with a datapath), we devise a new
microarchitecture called a stacked FSMD (SFSMD), which contains a
shared datapath and a set of separate controllers, one for each
procedure. While a hardware stack is also used to implement procedure
call linkage, Diamond can optionally perform inlining and exlining
optimizations such that the stack can be reduced to a single register.
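The organization described above can be pictured with a small software
model. This is only an illustrative sketch under our own assumptions, not
Diamond's actual microarchitecture: the type names (Datapath, ReturnLink,
Sfsmd), the two toy procedures and the register names are all
hypothetical. It shows one shared datapath, one controller per procedure,
and a stack that records which controller and state to resume on return.

```c
#include <stddef.h>

enum { PROC_MAIN, PROC_SQUARE };   /* one controller per procedure */
enum { STACK_DEPTH = 8 };          /* assumed bound; no overflow check */

typedef struct { int r0, acc; } Datapath;         /* shared by all controllers */
typedef struct { int proc, state; } ReturnLink;   /* where to resume on return */

typedef struct {
    Datapath dp;
    ReturnLink stack[STACK_DEPTH];
    size_t sp, max_sp;
    int proc, state, done;
} Sfsmd;

/* Push the resume point, then hand control to the callee's controller. */
static void call(Sfsmd *m, int callee, int resume_state)
{
    m->stack[m->sp].proc = m->proc;
    m->stack[m->sp].state = resume_state;
    m->sp++;
    if (m->sp > m->max_sp)
        m->max_sp = m->sp;
    m->proc = callee;
    m->state = 0;
}

/* Pop the resume point and hand control back to the caller. */
static void ret(Sfsmd *m)
{
    m->sp--;
    m->proc = m->stack[m->sp].proc;
    m->state = m->stack[m->sp].state;
}

/* Controller for main: computes square(3) + square(4) in acc. */
static void step_main(Sfsmd *m)
{
    switch (m->state) {
    case 0:  m->dp.r0 = 3; call(m, PROC_SQUARE, 1); break;
    case 1:  m->dp.acc = m->dp.r0; m->dp.r0 = 4; call(m, PROC_SQUARE, 2); break;
    default: m->dp.acc += m->dp.r0; m->done = 1; break;
    }
}

/* Controller for square: a single state that squares r0 and returns. */
static void step_square(Sfsmd *m)
{
    m->dp.r0 *= m->dp.r0;
    ret(m);
}

/* Run until main finishes; report the deepest stack use seen. */
int sfsmd_run(size_t *max_depth)
{
    Sfsmd m = { 0 };
    while (!m.done) {
        if (m.proc == PROC_MAIN)
            step_main(&m);
        else
            step_square(&m);
    }
    if (max_depth)
        *max_depth = m.max_sp;
    return m.dp.acc;
}
```

In this toy program the stack never holds more than one entry, so a single
register would be enough to store the return link. That is the kind of
situation in which a stack can shrink to one register, although how
Diamond reaches it through inlining and exlining is not modeled here.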
Scaling quality
Diamond adapts and invents interprocedural
optimization techniques to perform aggressive optimizations across
hyperblock, loop and procedure boundaries. Scheduling, register
allocation and binding tasks are also performed in an interprocedural
manner.
Standardizing interfaces
Despite our enthusiasm, we do not
believe C synthesis is a signoff technology, due to C's inherent lack
of composability. It is therefore important to keep behavioral
synthesis as a component technology and put it in the context of
system level design. To this end, we standardize upfront the interaction
of the synthesized component with its environment, typically a
system-on-chip, such that verification and integration can be performed
seamlessly.
Usage
Diamond runs in three modes.
standalone mode
In this mode, Diamond synthesizes a
complete C program into a hardware block with procedure main as
the entry point. This mode is only used for research purposes.
streaming mode
In this mode, Diamond synthesizes arbitrary
C functions into hardware blocks that process streaming data with
specified throughput. This mode is targeted towards real time signal
processing applications.
codesign mode
In this mode, Diamond synthesizes not only a hardware accelerator for
the performance demanding procedures of a complete program, but also
the rest of the functionality as software running on an extensible
processor, with frequently executed patterns automatically extracted as
instructions.
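As an illustration of the kind of C function that streaming mode is aimed
at, the sketch below shows a small three-tap FIR filter written in plain
C. The function name fir3, the coefficients and the array-based interface
are hypothetical. How Diamond is told the required throughput is not
described here, so the sketch only shows the shape of the computation.

```c
#include <stddef.h>

#define TAPS 3

/* Hypothetical example, not part of Diamond.
 * Computes out[i] = sum over k of coeff[k] * in[i - k], treating samples
 * before the start of the stream as zero. */
void fir3(const int *in, int *out, size_t n)
{
    static const int coeff[TAPS] = { 1, 2, 1 };
    int delay[TAPS] = { 0 };   /* most recent sample at index 0 */

    for (size_t i = 0; i < n; i++) {
        /* Shift the delay line and take in one new sample. */
        for (size_t k = TAPS - 1; k > 0; k--)
            delay[k] = delay[k - 1];
        delay[0] = in[i];

        /* Multiply-accumulate across the taps. */
        int acc = 0;
        for (size_t k = 0; k < TAPS; k++)
            acc += coeff[k] * delay[k];
        out[i] = acc;
    }
}
```

Feeding the unit impulse 1, 0, 0, 0 through fir3 reproduces the
coefficients followed by zero, that is 1, 2, 1, 0, which is a quick sanity
check for any filter of this form.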
Publications
Rami Beidas and Jianwen Zhu, ``Soft register allocation,'' IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, submitted 2005.
Rami Beidas and Jianwen Zhu, ``Scalable inter-procedural register allocation for high level synthesis,'' in Proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), Shanghai, China, January 2005.
Jianwen Zhu, ``Color permutation: an iterative algorithm for memory packing,'' in Proceedings of the International Conference on Computer Aided Design (ICCAD), San Jose, California, November 2001.
Jianwen Zhu, ``Static memory allocation by pointer analysis and coloring,'' in Proceedings of the Design Automation and Test Conference in Europe (DATE), Munich, Germany, March 2001.
Rami Beidas, ``Context-flow SOC architecture,'' M.S. thesis, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Apr. 2004.
Khushwinder Jasrotia and Jianwen Zhu, ``Hardware implementation of a memory allocator,'' in Euromicro Symposium on Digital System Design, Dortmund, Germany, September 2002.
Khushwinder Jasrotia, ``Stacked FSMD: A new micro-architecture model for high level synthesis,'' M.S. thesis, Department of Electrical and Computer Engineering, University of Toronto, Toronto, July 2003.
Khushwinder Jasrotia and Jianwen Zhu, ``Stacked FSMD: A power efficient micro-architecture for high-level synthesis,'' in International Symposium on Quality Electronic Design (ISQED), San Jose, California, March 2004.
Rami Beidas and Jianwen Zhu, ``Context-flow system-on-chip platform,'' Tech. Rep. TR-04-01-03, Department of Electrical and Computer Engineering, University of Toronto, Apr. 2003.
Rami Beidas and Jianwen Zhu, ``A queuing theoretic performance model for context-flow system-on-chip platform,'' in Workshop on Embedded Systems for Real-Time Multimedia, Stockholm, Sweden, September 2004.
Rami Beidas and Jianwen Zhu, ``Performance efficiency of context-flow system-on-chip platform,'' in Proceedings of the International Conference on Computer Aided Design (ICCAD), San Jose, California, November 2003.
Contributors
Toronto Synthesis Group
Rami Beidas
Wai Sum Mong
Jianwen Zhu
Web
http://www.eecg.toronto.edu/~jzhu/diamond.html