Full Custom Layout of an SRAM-Based FPGA
Blair Fort, Daniele Paladino, Franjo Plavec
1. Introduction
The goal of this project was to design an SRAM-based Field Programmable Gate Array (FPGA) and implement it as a full-custom layout in the TSMC 0.35 µm process. The design was constrained by a maximum area of 4 mm² and by the time available for the design process (6 weeks).
The tasks necessary to build and verify an FPGA include system level planning, schematic design, cell layout, and final chip layout. These steps are described in more detail in the following sections.
2. System Outline
At the highest level, an FPGA consists of programmable logic elements and programmable routing resources. The logic elements implement the combinational and sequential logic functions required by the user, and the routing resources interconnect the logic elements to form the desired system. An FPGA design usually consists of an array of identical blocks of logic and routing resources called tiles. The structure of the tile used in this work, together with the corresponding programming logic, is shown in Figure 1.
Figure 1. The tile structure with programming
The logic element (LE) is connected to two connection boxes: connection box right (CBR) and connection box bottom (CBB). The CBR connects the vertical routing tracks to the inputs of the LEs on its left and right, while the CBB connects the horizontal routing tracks to the inputs of the LEs above and below it, and connects the output of the LE above it to the routing tracks. Finally, the switch box (SB) interconnects the horizontal and vertical routing tracks. Our FPGA contains 4 vertical and 4 horizontal routing tracks. The LE, the connection boxes, and the switch box contain SRAM cells that can be programmed to implement the desired functionality.
3. Design Decisions
We made the following design decisions for our SRAM-based FPGA.
In our FPGA design, we use nMOS pass transistors (as opposed to transmission gates) controlled by SRAM cells to implement programmable routing and logic. Because the design contains many programmable switches, using only nMOS pass transistors reduces the area. Since an nMOS pass transistor degrades the signal when passing a high value, a weak pMOS feedback transistor was added to restore a strong logical ‘1’.
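The signal degradation mentioned above can be illustrated with a first-order model: an nMOS pass transistor whose gate is at the supply voltage passes a high level only up to roughly one threshold voltage below the supply. The sketch below is illustrative only. The supply voltage, the threshold voltage, the switching point, and the assumption that the weak pMOS is driven by a downstream inverter are not taken from this report, and body effect is ignored.

```python
# Illustrative values only; none of these numbers come from the report.
VDD = 3.3        # assumed supply voltage, volts
V_TN = 0.6       # assumed nMOS threshold voltage, volts (body effect ignored)
V_SWITCH = VDD / 2  # assumed switching point of the downstream inverter


def nmos_pass_output(v_in, v_gate=VDD, v_tn=V_TN):
    """First-order level at the output of an nMOS pass transistor.

    The output follows the input but cannot rise above v_gate - v_tn.
    """
    return min(v_in, v_gate - v_tn)


def restored_output(v_in):
    """Level after a weak pMOS feedback restorer (hypothetical model).

    If the degraded level is high enough to flip the downstream inverter,
    the pMOS turns on and pulls the node up to VDD.
    """
    degraded = nmos_pass_output(v_in)
    return VDD if degraded > V_SWITCH else degraded


print(nmos_pass_output(VDD))  # degraded high level, about VDD - V_TN
print(restored_output(VDD))   # restored to the full supply
print(restored_output(0.0))   # a low level passes unchanged
```

A low value is passed without loss, which is why only the high level needs restoring.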
There has been a great deal of study on how routing tracks should be designed in an FPGA. More flexible routing requires larger switch boxes. Since our design has area restrictions, we decided to use a method that provides reasonable flexibility while keeping the switch box relatively small: segmented routing. The routing consists of 4 routing wires (segments) of various lengths, where the length of a wire is the number of tiles the segment spans between two switch boxes. In our design, these segments are of lengths 1, 2, 4, and 4, respectively. At every switch box, 2 wires can be programmed to connect in any direction, which requires 12 pass transistors and associated SRAM cells to configure the switch box. The remaining two routing tracks pass straight through the switch box. Since not all segments are of unit length, not every track needs a switch point at every switch box, so this is enough to route all the segments.
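The count of 12 pass transistors is consistent with placing one pass transistor between every pair of the four sides of the switch box for each of the two switchable tracks. The sketch below checks that arithmetic. The side names and the one-transistor-per-pair topology are assumptions made for illustration, not a description of the actual switch box schematic.

```python
from itertools import combinations

# Assumed naming of the four sides of a switch box.
SIDES = ("north", "east", "south", "west")


def switches_per_track(sides=SIDES):
    """Return every unordered pair of sides; one pass transistor per pair."""
    return list(combinations(sides, 2))


def switch_box_pass_transistors(num_switched_tracks=2):
    """Total pass transistors (and SRAM cells) for the switchable tracks."""
    return num_switched_tracks * len(switches_per_track())


print(len(switches_per_track()))      # 6 connections per switchable track
print(switch_box_pass_transistors())  # 12 for two switchable tracks
```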
To save area, programming is performed using shift registers placed along the top and left edges of the chip, which are shared by all the tiles on the chip. The programming bits are shifted into the top programming registers (through inputs prog_data_col and col_clk), while the bits in the left programming registers (shifted in through inputs prog_data_row and row_clk) determine which part of the logic is programmed by the bits currently held in the top programming registers. Programming is performed only when the program input is high.
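The programming scheme can be sketched as a small behavioural model. This is a simplified illustration, not the actual circuit: the class and method names are hypothetical, the shift direction is assumed, and the model assumes that each bit in the left register selects one row of SRAM cells to be written with the contents of the top register.

```python
class ProgrammingModel:
    """Hypothetical behavioural model of row/column shift-register programming."""

    def __init__(self, n_rows, n_cols):
        self.col_reg = [0] * n_cols  # top register: data bits
        self.row_reg = [0] * n_rows  # left register: row select bits
        self.sram = [[0] * n_cols for _ in range(n_rows)]

    def shift_col(self, bit):
        """One col_clk pulse: shift prog_data_col into the top register."""
        self.col_reg = [bit] + self.col_reg[:-1]

    def shift_row(self, bit):
        """One row_clk pulse: shift prog_data_row into the left register."""
        self.row_reg = [bit] + self.row_reg[:-1]

    def apply(self, program):
        """Write the top register into every selected row, only if program is high."""
        if not program:
            return
        for row, selected in enumerate(self.row_reg):
            if selected:
                self.sram[row] = list(self.col_reg)


model = ProgrammingModel(n_rows=2, n_cols=3)
for bit in (1, 0, 1):
    model.shift_col(bit)
model.shift_row(1)        # select the first row
model.apply(program=0)    # program input low: nothing is written
model.apply(program=1)    # program input high: selected row is written
print(model.sram)
```

Sharing the two registers across all tiles means that only the SRAM cells themselves, and not per-tile addressing logic, are replicated in the array.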
4. Schematic Design and Simulation
To verify the functionality of our design, we drew the schematics of our system in a hierarchical manner. The top-level schematic is shown in Figure 2.
Figure 2. The top-level schematic of the design
Our design contains 240 tiles organized in an array of 16x15 tiles, and contains 10,800 programmable elements. To test the whole device, we would need to determine the state of each of those programming bits, which is normally done by an FPGA synthesis tool, and then run the simulation with the input sequence of programming bits followed by the test vectors. Since such a simulation exceeds the capacity of the available tool (HSpice) and requires an excessive amount of time, we simulated the functionality of a single tile with the corresponding routing and programming resources. All the simulations produced the expected behaviour, and since we tested every component in the configuration in which it is used in the system, we consider this sufficient evidence of the correct functionality of our design.
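The figures quoted above can be cross-checked with a few lines of arithmetic. The per-tile average computed below is a derived number, and it assumes that all 10,800 programmable elements are distributed evenly over the tiles, which the report does not state explicitly.

```python
ROWS, COLS = 16, 15          # array dimensions from the text
TOTAL_BITS = 10_800          # programmable elements from the text
SWITCH_BOX_BITS = 12         # SRAM cells per switch box, from Section 3

tiles = ROWS * COLS
# Assumption: bits are spread evenly over the tiles.
bits_per_tile, remainder = divmod(TOTAL_BITS, tiles)

print(tiles)            # 240
print(bits_per_tile)    # 45
print(remainder)        # 0
print(bits_per_tile - SWITCH_BOX_BITS)  # bits left for the LE and connection boxes
```

Even at this scale, a full-chip simulation would have to shift in every one of these bits before the first test vector could be applied, which is part of why a single tile was simulated instead.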
5. Layout
We designed all our basic cells to have the same height. This was necessary to match the power and ground lines when the cells abut. The cell with the largest height in our design was an inverter of size 2. The SRAM was the most used cell in our design and its height was only slightly increased to match the height of the inverter cell. This allowed us to decrease the width of the SRAM cell thus minimizing the area consumed by the SRAM cell.
The top level view of the FPGA layout is shown in Figure 3.
Figure 3. Top-level view of the chip layout
The final design contains 48 pins. The number of pins was limited by the number of pads that can fit on a die of size 2x2 mm. Of these, 38 pins are dedicated to general-purpose input and output, while the remaining 10 pins are used for power supply (2 pins), ground (2 pins), programming (5 pins), and the global clock (1 pin), which is distributed to the flip-flops in all LEs in the device. The global clock is distributed by an H-tree structure inside the device to provide a low-skew clock.
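The pin budget can be tallied as a quick consistency check. The dictionary keys are illustrative labels, and mapping the five programming pins to the five signal names from Section 3 is an inference rather than something the report states.

```python
# Pin counts as given in the text; key names are illustrative labels.
PIN_BUDGET = {
    "general_io": 38,
    "power": 2,
    "ground": 2,
    "programming": 5,
    "global_clock": 1,
}

# Assumed mapping: the five programming signals named in Section 3.
PROGRAMMING_PINS = ("prog_data_col", "col_clk", "prog_data_row", "row_clk", "program")

total_pins = sum(PIN_BUDGET.values())
dedicated_pins = total_pins - PIN_BUDGET["general_io"]

print(total_pins)      # 48
print(dedicated_pins)  # 10
print(len(PROGRAMMING_PINS) == PIN_BUDGET["programming"])  # True
```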
6. Conclusions and Future Work
Our final design passes all DRC checks, and LVS shows that the netlist matches our schematic. Currently, this project fully implements an FPGA, but some future work may be beneficial. The areas that should be covered by future work include simulation, power distribution network, and I/O pads.
Firstly, we were able to simulate a full tile with its programming logic using HSpice, but no larger designs. To simulate the whole design, we would like to use another tool, such as NanoSim, which is known to have been used for simulation by other researchers who have fabricated FPGAs. Another option is a hybrid VHDL/Verilog and schematic simulation in Cadence, which could also reduce simulation time and complexity.
Secondly, the power distribution grid could be improved. Although our power distribution lines are comparable to those of other FPGA designs used in research, there is some concern that the metal lines are too narrow. Since the design is very regular, the cells could be spaced further apart to allow for wider metal lines, which should be done before tapeout.
Another area for improvement is the use of I/O pads. In the current implementation, each general-purpose pin is fixed as either an input or an output. Most FPGAs allow every general-purpose pin to be programmed as either an input or an output, which requires an extra SRAM cell per pin. This was not done in the current implementation because the provided pad library does not contain bidirectional I/O pads.
Finally, more advanced features, such as memory resources, PLL blocks, DSP blocks, and carry chains, could be added to the design. This would make the design usable in a wider variety of applications.
7. Division of Tasks
Blair Fort: LUT (planning, schematic, simulation, layout, DRC, LVS), Logic element (planning, schematic, simulation, layout, DRC, LVS), connection boxes (layout, DRC, LVS), parts of reports and presentations
Daniele Paladino: Area estimation for the whole design, SRAM (planning, schematic, simulation, layout, DRC, LVS), shift registers (planning, schematic, simulation, layout, DRC, LVS), programming logic (planning, schematic, simulation, layout, DRC, LVS), parts of reports and presentations
Franjo Plavec: connection boxes (planning, schematic, simulation), switch box (planning, schematic, simulation, layout, DRC, LVS), top-level (schematic, simulation, layout, DRC, LVS), parts of reports and presentations