The long term goal of the Toronto Synthesis Group is to produce a silicon compiler at the electronic system level (ESL). This research is driven by the needs of embedded systems-on-chip (SOCs), the heart of many performance-demanding electronic products and applications, including, but not limited to, handheld, mobile and internet phones, audio/video entertainment, gateways, wireless access points and internet TVs.
The design of embedded SOCs from product specification to silicon is fundamentally challenged by the sheer complexity involved: the functional complexity, driven by the exponential growth of the targeted application functionality, and the silicon complexity, driven by the electrical complications of deep-submicron manufacturing processes. Although many concrete point problems need to be attacked, we are particularly intrigued by three haunting, cross-cutting intellectual challenges.
In answering these questions, we attempt to carry the two conceptual frameworks forward and build an operational design methodology consisting of tangible tools and languages. To do better than merely offering design guidelines, we are particularly interested in a highly automated design methodology in the same spirit as silicon compilation. We now summarize our perspectives in a series of simple equations. Note that these equations do not pretend to be a formal model; they serve only as articulation points. Also, since they are meant to be a vision, we dare to be rather idealistic.
[1] ESL = ESL platform + ESL application
Our first equation breaks the complete SOC design information at ESL into the platform part, or the generic, reused portion, and the signoff part, or the application-specific, value-added portion. The importance of this breakdown is that it allows us to decompose chip design into two entities that can be separated in space, time, or vendor, and that can practice drastically different design methodologies. A perfect example is the field programmable gate array (FPGA) business model, where FPGA vendors supply platforms designed with an expensive custom design methodology, and system vendors personalize those platforms by supplying applications as programming bits produced with an inexpensive FPGA design methodology. To keep non-recurring engineering (NRE) cost manageable, it is desirable to have ``thick platforms and thin applications''.
[2] ESL platform = behavioral platform + architectural platform + physical platform
The second equation states that a complete platform at ESL needs to provide reuse abstractions at the behavioral level, the architectural level and the physical level. Examples of behavioral platforms are C, Matlab, or various extensions of C/C++. Examples of architectural platforms include processor-centric architectures, heterogeneous multiprocessor architectures, and massively parallel architectures. Examples of physical platforms include placement, routing, power, and communication grid abstractions for ASIC, structured ASIC, and FPGA technologies; such abstraction is particularly relevant for the latter. This of course does not say much more than the platform stack model, except that we emphasize that the three abstractions must be present upfront and simultaneously. Having the abstractions defined upfront makes automation-based synthesis possible. Having the three abstractions present simultaneously makes it possible to incorporate physical effects during synthesis. Also note that one behavioral platform can work with different types of architectural platforms. For example, the C language can work with either a single-processor or a multiprocessor architectural platform. Likewise, an architectural platform can work with different physical platforms. For example, a multiprocessor architecture can work with FPGA, structured ASIC, or ASIC.
[3] behavioral platform | architectural platform | physical platform = configuration language + programming language + verification tool + synthesis tool
The third equation materializes platforms into concrete forms. It first states that each platform exposes to the platform user a configuration language, so as to add flexibility, and a programming language, so as to personalize the platform with the desired function. It then states that each platform encapsulates all the information necessary for final implementation within two tools, one for verification, the other for synthesis. It is precisely this information that should be abstracted away and made implicit in the application development process. Through such abstraction, reuse becomes more powerful than an Intellectual Property (IP) assembly based methodology, since IPs are not visible even though they may be used under the hood.
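
To make the split concrete, consider the simplified sketch below; every name in it is hypothetical. The configuration language is reduced to a handful of C macros that tune the platform, the programming language is plain C, and everything the verification and synthesis tools need stays hidden behind those knobs.

    /* platform_sketch.c -- hypothetical split between a platform's
       configuration language (tunable knobs) and its programming language
       (plain C); all names are invented for illustration. */
    #include <stdio.h>

    /* Configuration: the user tunes knobs without ever seeing the IP blocks,
       verification models, or synthesis scripts behind them. */
    #define NUM_PROCESSORS  2       /* processor cores to instantiate */
    #define NOC_FLAVOR      "mesh"  /* on-chip network flavor         */
    #define SCRATCHPAD_KB   64      /* on-chip scratchpad size        */

    /* Program: the application personalizes the configured platform. */
    int main(void)
    {
        printf("Application built for %d core(s), %s network, %d KB scratchpad\n",
               NUM_PROCESSORS, NOC_FLAVOR, SCRATCHPAD_KB);
        return 0;
    }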
[4] ESL application = behavioral signoff + architectural signoff + physical signoff
The fourth equation states that the application consists of a series of signoffs (sometimes also called handoffs) that are used to program the behavioral, architectural and physical platforms. The word signoff implies both completeness and verifiability. For example, a C program is complete in the semantic domain of a C-based behavioral platform and is verifiable by using a compiler tool chain on the desktop. A binary executable is complete in the semantic domain of an instruction-set based architectural platform and is verifiable by using an instruction set simulator. The programming bits of an FPGA are complete in that they completely define the FPGA function, and are verifiable by downloading them into the corresponding FPGA device.
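
For instance, the toy C program below (a made-up four-tap filter, used purely for illustration) is already a behavioral signoff in this sense: it is complete in the C semantic domain and verifiable with any desktop compiler tool chain, long before an architecture or silicon target is chosen.

    /* fir.c -- a toy behavioral signoff: complete in the C semantic domain
       and verifiable with a desktop compiler (e.g., cc fir.c && ./a.out). */
    #include <stdio.h>

    #define TAPS 4

    /* Four-tap moving-sum filter over one input window. */
    static int fir(const int x[TAPS], const int h[TAPS])
    {
        int acc = 0;
        int i;
        for (i = 0; i < TAPS; i++)
            acc += x[i] * h[i];
        return acc;
    }

    int main(void)
    {
        const int x[TAPS] = { 1, 2, 3, 4 };
        const int h[TAPS] = { 1, 1, 1, 1 };
        printf("fir = %d (expected 10)\n", fir(x, h));
        return 0;
    }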
[5] behavioral signoff | architectural signoff | physical signoff = hardware + software
The fifth equation states that the application signoffs carry both the hardware and the software. Note that this view differs significantly from the layered abstraction model of traditional computer systems, where hardware is abstracted away by the architecture and the operating system, and the application is purely software. Constraining the programmability of platforms to software programmability alone would seriously limit their applicability. This equation hence lifts the value-added hardware, typically application-specific accelerators, into the user space, bypassing the slow operating system/IO interface.
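
One common way such an accelerator can surface in the user space, sketched below under the assumption of a memory-mapped device (the device path and register offsets are hypothetical), is to map its registers directly into the application's address space, so that invoking the hardware becomes an ordinary load/store rather than a trip through the operating system.

    /* accel_user.c -- sketch of user-space access to a memory-mapped
       accelerator; /dev/accel0 and the register map are hypothetical. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define REG_ARG    0    /* word offset of the argument register */
    #define REG_START  1    /* word offset of the start register    */
    #define REG_RESULT 2    /* word offset of the result register   */

    int main(void)
    {
        volatile uint32_t *regs;
        int fd = open("/dev/accel0", O_RDWR | O_SYNC);
        if (fd < 0) { perror("open"); return 1; }

        /* Map one page of accelerator registers into the user address space. */
        regs = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (regs == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

        regs[REG_ARG]   = 42u;          /* pass an argument      */
        regs[REG_START] = 1u;           /* kick off the hardware */
        while (regs[REG_START] != 0u)   /* poll for completion   */
            ;
        printf("result = %u\n", (unsigned)regs[REG_RESULT]);

        munmap((void *)regs, 4096);
        close(fd);
        return 0;
    }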
[6] hardware = software = program
The last equation breaks the dichotomy of application hardware and software and boldly states that they should make no difference from the design methodology point of view: they should be programmed in the same language, not only in the behavioral signoff but also in the architectural signoff. This view can only be made possible by a capable behavioral synthesis tool.
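
As a small illustration of this equation (the kernel is ours, not drawn from any of our tools), the same C function below can serve as a program in both senses: compiled with an ordinary compiler it becomes application software; handed to a behavioral synthesis tool it can become a custom hardware core with the same observable function.

    /* popcount.c -- one program, two fates: compiled, it is application
       software; behaviorally synthesized, it is a custom hardware core. */
    #include <stdio.h>
    #include <stdint.h>

    /* Bit-count kernel: a typical candidate for either a software routine
       or a small hardware accelerator. */
    static unsigned popcount32(uint32_t x)
    {
        unsigned n = 0;
        while (x) {
            n += x & 1u;
            x >>= 1;
        }
        return n;
    }

    int main(void)
    {
        printf("popcount32(0xF0F0) = %u (expected 8)\n", popcount32(0xF0F0u));
        return 0;
    }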
We now return to the engineering mode and describe the efforts taking place in the Toronto Synthesis Group in pursuing the vision captured in the ring chart.
We build an architectural platform called Metabacus. Metabacus is heterogeneous in nature and contains an extensible processor (OpenRISC, Nios, etc.), generic IO cores, an on-chip network, and custom cores. By specializing the on-chip network into different flavors, we obtain several families of the platform. Although they target mid-end, high-end and massively parallel applications respectively, a common characteristic of Metabacus I, II, and III is direct hardware support for the context flow programming model.
We intend to build several physical platforms to abstract the silicon structures of FPGA, structured ASIC and ASIC technologies. The idea is to extract the placement, routing, power and clock grid information necessary for high-level design exploration. Although nothing has been carved in stone at this point, it is perhaps best to start with the VPR architecture specifications used in FPGA research and generalize from there.
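
The kind of information such a physical platform might expose is sketched below as a C-level abstraction; the structure and field names are ours for illustration and are not drawn from VPR or from any vendor format.

    /* phys_platform.h -- hypothetical abstraction of a physical platform;
       the structure and field names are illustrative only. */

    /* Placement abstraction: a regular grid of sites. */
    struct placement_grid {
        unsigned rows, cols;          /* grid dimensions            */
        double   site_width_um;       /* physical size of one site  */
        double   site_height_um;
    };

    /* Routing abstraction: channel capacities and a wire delay model. */
    struct routing_grid {
        unsigned chan_width_x;        /* tracks per horizontal channel */
        unsigned chan_width_y;        /* tracks per vertical channel   */
        double   wire_delay_ps_per_mm;
    };

    /* Power and clock grid abstraction for early feasibility checks. */
    struct power_clock_grid {
        double vdd_volts;
        double ir_drop_budget_mv;
        double clock_skew_budget_ps;
    };

    /* A physical platform bundles the three views for one target
       technology (FPGA, structured ASIC, or ASIC). */
    struct phys_platform {
        const char              *technology;
        struct placement_grid    place;
        struct routing_grid      route;
        struct power_clock_grid  grid;
    };

    /* Example: a made-up FPGA target described through the abstraction. */
    static const struct phys_platform example_fpga = {
        "generic-fpga",
        { 64, 64, 20.0, 20.0 },
        { 100, 100, 150.0 },
        { 1.2, 50.0, 100.0 }
    };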
Goal
Challenge
Abstraction Wall
Despite constant debates in the community, the exact meaning of ESL and how it can be abstracted have not been agreed upon. To make things worse, the layered abstraction model no longer reconciles with quality of results: low-level timing and power issues may invalidate decisions made at the high level, leading to design convergence problems.
Heterogeneity Wall
The heterogeneity of an SOC architecture, a dichotomy of hardware (processor cores, generic and custom hardware cores) and software (firmware, operating systems, embedded application software), dictates the heterogeneity of design tools, methodologies and teams, which often leads to a reduced explored design space and a prolonged design schedule, and therefore decreased design performance and increased design cost.
Scalability Wall
Conventional algorithms may not keep up with the exponential growth of the information they process. It is not uncommon even today to see tasks that need to solve billions of problem instances, yet our options for improving algorithm capacity seem to have been exhausted.
Wisdom
Clearly, overcoming the first wall must precede any algorithmic solution. For that, we need to consult the prophets.
Y Chart
We first draw wisdom from the Gajski Y-chart (1983). The Y-chart taught us that a design can be orthogonally decomposed into the behavioral domain, which captures the chip function, the structural domain, which captures the chip architecture as a network of computational resources, and the physical domain, which adds the geometric configuration to the architecture. A natural question to ask is: if we are to push the envelope above the register transfer level (RTL), then, other than calling it the electronic system level (ESL), how should the behavioral, structural and physical domains be abstracted?
Platform Stack
We next draw wisdom from Sangiovanni-Vincentelli's platform stack model (2002), which states that chip design is carried out through a stack of application platform, architecture platform and silicon platform, where a platform is an abstraction layer. It is easy to relate the platform stack to the Y-chart by equating the application, architecture and silicon platforms to the behavioral, structural and physical domains, but what the platform stack model taught us is that chip design does not have to follow a top-down approach, as we took for granted from silicon compilation. In other words, the architecture does not have to be derived from the application, and the silicon does not have to be derived from the architecture. Instead, parts of the architecture or silicon can be designed separately, or reused. What is learned from the mistake of intellectual property (IP) based design is that reuse should not just be components stored in a library, which does not reduce integration and verification complexity, but should somehow be abstracted. What remains unanswered, again, is how exactly we can abstract reuse.
Vision
[1] ESL = ESL platform + ESL application
[2] ESL platform = behavioral platform + architectural platform + physical platform
[3] behavioral platform | architectural platform | physical platform = configuration language + programming language + verification tool + synthesis tool
[4] ESL application = behavioral signoff + architectural signoff + physical signoff
[5] behavioral signoff | architectural signoff | physical signoff = hardware + software
[6] hardware = software = program
Ring Chart
We can visualize the above ideas using the new ring chart, which recasts the platform stack model onto the Y-chart framework and enforces several programming abstractions that eliminate the differences between hardware and software.
Action
Platforms
Given the existing investment in education and the amount of legacy code, we envision that the behavioral platform should be C/C++, perhaps enhanced with composability and concurrency. Before the dust of yet another language war settles, we concentrate only on the minimal technical core: we build a language platform based on, and only on, the complete ANSI C language, equipped with a simple component-based programming model, called context flow, which formalizes component interaction in, and only in, the form of interface invocations.
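
A minimal sketch of what a context flow style component might look like in plain C is given below; the interface and component names are invented for illustration, and the real programming model may differ in detail. The point is that components interact only by invoking each other's interfaces.

    /* contextflow.c -- sketch of components that interact only through
       interface invocations; names are illustrative, not the real model. */
    #include <stdio.h>

    /* Interface exposed by a consuming component. */
    struct sink_if {
        void *self;                          /* component instance state */
        void (*put)(void *self, int token);  /* the only way to reach it */
    };

    /* A consumer component: keeps a running sum of the tokens it receives. */
    struct adder { int sum; };

    static void adder_put(void *self, int token)
    {
        struct adder *a = self;
        a->sum += token;
    }

    /* A producer component: it knows nothing about the adder except its
       interface, so all interaction is an interface invocation. */
    static void counter_run(struct sink_if *out, int n)
    {
        int i;
        for (i = 1; i <= n; i++)
            out->put(out->self, i);
    }

    int main(void)
    {
        struct adder a = { 0 };
        struct sink_if a_if = { &a, adder_put };

        counter_run(&a_if, 4);               /* sends 1, 2, 3, 4 */
        printf("sum = %d (expected 10)\n", a.sum);
        return 0;
    }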
Diamond: Scalable Behavioral Synthesis
We deliver a new type of behavioral synthesis tool in the Diamond project, which synthesizes a C program, or the behavioral signoff, into the binary code for application software and hardware, or the architectural signoff. Optionally, Verilog code for the hardware can be generated. In addition to putting behavioral synthesis technology within a complete ESL design setting, Diamond strives to scale the processor architecture by automatic instruction extension, to scale the custom hardware architecture by supporting memory and procedure abstractions, and to scale the scope of applications by employing the scalable program analysis techniques separately pursued in the Quanton project.
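
To give a flavor of automatic instruction extension (the kernel and the fused operation below are hypothetical), the idea is to find a recurring dataflow pattern in a hot loop and collapse it into a single custom instruction of the extensible processor; here the candidate pattern is emulated in C as a fused multiply-accumulate-saturate helper.

    /* instr_ext.c -- illustration of instruction extension: a recurring
       dataflow pattern is collapsed into one custom operation, emulated
       here in plain C. */
    #include <stdio.h>

    /* The fused pattern a synthesis tool might promote to a custom
       instruction: multiply, accumulate, then saturate to 16 bits. */
    static int mac_sat(int acc, int a, int b)
    {
        long t = (long)acc + (long)a * b;
        if (t >  32767) t =  32767;
        if (t < -32768) t = -32768;
        return (int)t;
    }

    int main(void)
    {
        int a[4] = { 100, 200, 300, 400 };
        int b[4] = { 400, 300, 200, 100 };
        int acc = 0;
        int i;

        /* Hot loop: each iteration becomes one extended instruction
           instead of several base-ISA instructions. */
        for (i = 0; i < 4; i++)
            acc = mac_sat(acc, a[i], b[i]);

        printf("acc = %d (expected 32767 after saturation)\n", acc);
        return 0;
    }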
Matrix: System-On-Chip Programming and Verification
We envision that the architectural signoff should be a binary program based on instruction sets. While instruction sets for processors are readily available, the challenge is to extend the same concept to custom hardware. We pursue this effort in the Matrix project, which strives to define a binary abstraction, called hardware carrying code (HCC), used as the architectural signoff. In addition, we produce a verification environment in which both the application software and the hardware can be simulated together with the implicit resources of the platform, such as generic IO devices. This makes full-system simulation possible, in which even the operating system kernel is simulated. To scale simulation performance, we employ trace-based dynamic compilation technology.
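
Although the actual HCC format is defined within the Matrix project, the gist of a ``binary program for custom hardware'' can be sketched as below; every field is hypothetical and purely illustrative. The container carries, side by side, the software image for the processor and an instruction-set-like stream for the custom cores, so that both can be loaded into one full-system simulation.

    /* hcc_sketch.h -- hypothetical container for an architectural signoff;
       none of these fields are drawn from the real HCC definition. */
    #include <stdint.h>
    #include <stddef.h>

    /* A section holds either a processor binary or a hardware description
       expressed as an instruction-set-like stream for a custom core. */
    enum hcc_section_kind {
        HCC_SOFTWARE_IMAGE,    /* e.g., an executable image for the processor */
        HCC_HARDWARE_STREAM    /* e.g., an operation stream for a custom core */
    };

    struct hcc_section {
        enum hcc_section_kind kind;
        uint32_t              core_id;   /* which platform core it targets */
        size_t                size;      /* payload size in bytes          */
        const uint8_t        *payload;
    };

    /* The signoff bundles all sections so that the verification environment
       (full-system simulation) and the downstream synthesis tool can consume
       the same artifact. */
    struct hcc_signoff {
        uint32_t            version;
        size_t              nsections;
        struct hcc_section *sections;
    };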
Grid: Scalable Architecture and RTL Synthesis
We deliver a synthesis tool in the Grid project, which bridges the gap between the architectural signoff and the physical signoff. The tool performs several tasks. First, it generates RTL code for custom cores from HCC. Second, it retrieves the reused IP components and generates the top-level RTL for the personalized platform. Third, it performs physically driven RTL synthesis. The key challenges we address in this project are, first, the scalability problem, where we initiate the effort separately in the FBDD logic synthesis project, and then the timing closure problem, where we pursue a strategy called soft synthesis, which delays scheduling, a crucial step that defines chip timing behavior, until after the completion of global physical planning.
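
The intuition behind soft synthesis can be shown with a toy scheduler; everything below is a simplification invented for illustration, not the Grid algorithm. Because scheduling runs after global physical planning, each operation's start time can honor an interconnect delay estimate taken from the plan instead of assuming zero wire delay.

    /* soft_sched.c -- toy illustration of scheduling after physical planning:
       start times honor wire-delay estimates taken from the global plan. */
    #include <stdio.h>

    #define NOPS 4

    struct op {
        const char *name;
        double logic_delay_ns;        /* delay of the operation itself      */
        int    pred[NOPS];            /* predecessor op indices             */
        int    npred;
        double wire_delay_ns[NOPS];   /* planned interconnect delay from
                                         each predecessor                   */
    };

    int main(void)
    {
        /* A tiny dataflow graph: mul0 and mul1 feed add, which feeds store. */
        struct op ops[NOPS] = {
            { "mul0",  2.0, { 0 },    0, { 0.0 } },
            { "mul1",  2.0, { 0 },    0, { 0.0 } },
            { "add",   1.0, { 0, 1 }, 2, { 0.5, 1.5 } },
            { "store", 1.0, { 2 },    1, { 0.8 } }
        };
        double start[NOPS];
        int i, j;

        for (i = 0; i < NOPS; i++) {  /* ops are listed in topological order */
            start[i] = 0.0;
            for (j = 0; j < ops[i].npred; j++) {
                int p = ops[i].pred[j];
                double ready = start[p] + ops[p].logic_delay_ns
                             + ops[i].wire_delay_ns[j];
                if (ready > start[i])
                    start[i] = ready;
            }
            printf("%-5s starts at %.1f ns\n", ops[i].name, start[i]);
        }
        return 0;
    }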
Web
http://www.eecg.toronto.edu/~jzhu/vision.html