# The Effect of Switch Box Flexibility on Routability of Field Programmable Gate Arrays

Jonathan Rose, Stephen Brown
Dept. Electrical Engineering, University of Toronto, Ontario, Canada M5S 1A4

#### **Abstract**

This paper explores the relationship between the routability of a Field Programmable Gate Array (FPGA) and the flexibility of its interconnection structures. A set of industrial circuits are implemented as FPGAs in a range of routing structures with varying flexibility. Experiments indicate that high flexibility is essential for the connection box that joins the logic blocks to the routing channel, but a relatively low flexibility is sufficient for switch boxes at the junction of horizontal and vertical channels.

#### 1 Introduction

The architecture of Field Programmable Gate Arrays is a new and difficult science because the programmable nature of the devices require complex building blocks. The architecture of an FPGA consists of its logic block function, interconnection structure, and I/O block design. In previous work, we investigated the effect of logic block functionality on the area of FPGAs [Rose89, Rose90]. In this paper we focus on the interconnection design and study the effect of switching block flexibility on routability. Routability is defined as the percentage of the total number of connections that are successfully routed.

The FPGA was introduced in [Cart86] and newer versions have been presented in [Hsie88] [ElGa89] [Wong89] [Ples89] and [Marr89]. In some cases these architectures were driven by process technology and the assumption that a user would be able to hand-tune the implementation. In other cases, the architectures were a result of a natural progression of PLDs. It is now clear that these devices will soon be so complex that users must have CAD tools to design with them. Thus the architecture should be heavily influenced by its ability to ease the automated design process. This means that architectural decisions should be made at the same time that the CAD tools are crafted, using information generated by those tools and the constraints of the algorithms. We use this approach to explore routing structures for FPGAs.

Figure 1 depicts the general structure of the FPGA that we consider. The logic blocks, ("L" blocks in the figure) connect to the horizontal and vertical channels using a connection box ("C"

This work was supported by NSERC Operating Grant #URF0043298 and a research grant from Bell-Northern Research.

blocks). Connections are switched at the intersection of horizontal and vertical channels by a switch box ("S" blocks). The flexibility of the S and C blocks is the key to obtaining efficient use of the routing area. Loosely defined, flexibility is the number of choices offered to each wire entering a switching block. A high flexibility means a large number of choices, and a low flexibility means few choices. If the blocks are more flexible than is required to obtain complete connectivity, then there will be too many switches and the blocks will be unnecessarily slow and large. If they are insufficiently flexible, then a large number of routing tracks per channel will be required and area will be wasted.



Figure 1 - General Routing Structure of FPGA

The particular questions this paper addresses are:

- What is the effect of connection (C) block flexibility on the routing completion rate? Experiments indicate that the connection block should have high flexibility because it is otherwise unlikely that any path will be able to connect at its terminating point.
- 2. What is the effect of switch (S) block flexibility on the routing completion rate? Our results show that low flexibility is sufficient for switch blocks because a cascade of switch blocks will provide many possible paths, even when there are only a few choices at one block.

## 2 Experimental Procedure

To answer these questions, our approach is to implement a set of circuits in a variety of interconnection structures and to

27.5.1

**IEEE 1990 CUSTOM INTEGRATED CIRCUITS CONFERENCE** 

CH2860-5/90/0000-0140 © 1990 IEE

determine the percentage routing completion for each. Using the FPGA model shown in Figure 1, we assume that the the number of tracks in each of the routing channels, W, is the same.

The logic block used in these experiments is the one resulting from our previous study [Rose89,Rose90]. It is a four-input lookup table-based combinational block, a D flip-flop and a tristate output. In the previous study, this block achieved the minimum total area for 12 industrial circuits when compared to many other logic blocks with differing numbers of inputs, including and excluding a flip-flop, over a wide range of programming technologies (the method of FPGA customization). The block used in these experiments allows all logical pins to connect to all four channels on the sides of the block.

The implementation procedure will be used to transform each circuit into a variety of FPGAs with interconnection structures of varying flexibility. We are concerned with the effect of the flexibility of the connection block and the switch block on routing completion rate. The general nature of the switch box under consideration is illustrated in Figure 2. Define the flexibility of the switch box,  $F_s$ , to be the number of possible connections from each incoming wire to each opposing side. For the example shown in Figure 2, wire R0 can be connected to two wires on each other side (U0, U2, L0, L2, and D0, D2) and so  $F_{s} = 2$ , if the other wires are similarly connected. For the experiments in this paper the connections in the switch box are chosen to be spread out evenly in the following way: the wire on the left side at position i ( $0 \le i < W$ ) is connected to  $F_s$  wires at the positions i,  $i + \frac{W}{F_s}$ ,  $i + \frac{2W}{F_s}$  and so on. Similar connections are made between the other sides. The implementation of the connections depends on the programming technology - they could be SRAM-driven transistor switches [Hsie88], anti-fuses [ElGa89], or EPROM-driven transistors [Wong89], among others.



Figure 2 - Definition of Flexibility of Switch Box

Figure 3 illustrates an example connection box. The channel wires flow uninterrupted through the connection box and are connected to logic block pins via a set of switches. The flexibility of the connection box,  $F_c$ , is defined as the number of channel wires that each logical pin can connect to. For the example shown in Figure 3, logic block pin P0 can connect to channel wires W0 and W2 and so  $F_c$  is 2, as all the other pins are connected similarly.

The procedure described below realizes each circuit as an FPGA. Its inputs are the circuit netlist, the interconnection structure of the switchbox (characterized by  $F_s$ ), the connection box (characterized by  $F_c$ ), and the number of tracks per channel, W. The output of the procedure is the percentage of nets that were routed successfully.



Figure 3 - Definition of Flexibility of Connection Box

#### Implementation Procedure: For each circuit:

- Perform the technology mapping [Keut87] from the original circuit into the logic block. The result of this step is a new netlist that interconnects only logic blocks, and is functionally equivalent to the original circuit.
- Perform the placement of the resulting netlist. This is done using the Altor placement program [Rose85], which is based on the min-cut placement algorithm. Altor makes the array as square as possible.
- 3. Perform the global routing of the circuit. Global routing determines the path of channels that each wire is to take. This procedure gives the number of tracks that would be required in each channel if the switch and connection boxes had full flexibility. The approach used is similar to the LocusRoute standard cell global routing algorithm described in [Rose88].

For the set of parameters  $F_s$ ,  $F_c$ , and W, do the following:

 Perform the detailed routing of the circuit. For each wire's global path, determine the exact wires and explicit connections in the switch box and connection box. We have developed a new algorithm, for this constrained routing problem, that is able to take the configuration of the S and C boxes and W as input [Brow90]. There is reason for confidence in the router because it can consistently achieve high routing completion when the number of tracks is near the absolute minimum possible, as determined by the global router. The detailed routing determines the percentage routing completion.

### 3 Experimental Results

The implementation procedure was performed on five circuits from four sources - Bell-Northern Research, Zymos, and two different designers at the University of Toronto. Table 1 gives the name, size (number of logic blocks, and two-point connections), source and the function of each circuit.

| Circuit | #Blocks | #Conn | Source | Туре        |
|---------|---------|-------|--------|-------------|
| BUSC    | 109     | 392   | UTD1   | Bus Cntl    |
| DMA     | 224     | 771   | UTD2   | DMA Cntl    |
| BNRE    | 362     | 1257  | BNR    | Logic/Data  |
| DFSM    | 401     | 1422  | UTD1   | State Mach. |
| Z03     | 586     | 2135  | Zymos  | 8-bit Mult  |

Table 1 - Experimental Circuit Characteristics

## 3.1 Connection Box Flexibility

Figure 4 is a plot of the percentage routing completion versus connection box flexibility,  $F_c$ , with  $F_s = 1$ , 2 and 3, for circuit BNRE. W is set to 13, slightly more than the global router-indicated minimum. The figure indicates that the routing completion rate is very low for small values of  $F_c$  and only approaches 100% when  $F_c$  is near to W. The figure also shows that the increasing switch box flexibility improves the completion rate at a given  $F_c$ , but to get near 100% the  $F_c$  must still be high. Although it is not shown in the figure, these curves are also affected by the number of tracks per channel, W. As W increases, the curves shift up - as more tracks are added, there are more paths to make the connections. The other circuits show similar trends.

The key observation from the data of Figure 4, for all of the circuits, is that  $F_c$  must be close to W for high routing completion - that is  $\frac{F_c}{W}$  must be between 0.5 and 1. The reason for this is that  $\frac{F_c}{W}$  is the probability that a wire arriving on a

particular track at the C block is able to connect to the required logical pin. As  $\frac{F_c}{W}$  decreases from 1, it is more likely that wires won't be able to connect to the desired pin, and so routing completion declines.



Figure 4 - % Complete vs. Fc and Fs, Circuit BNRE

Table 2 summarizes the results for the other circuits. It gives the minimum value of  $F_c$  and  $\frac{F_c}{W}$  required to achieve 100% routing completion for each circuit for three values of  $F_s$ , at a fixed W (near the minimum). These results clearly show that  $\frac{F_c}{W}$  should be well over 0.5, and are remarkably consistent as a function of  $F_s$ .

| Circuit     | w  | F, | 100% F <sub>c</sub> | $\frac{F_c}{W}$ |
|-------------|----|----|---------------------|-----------------|
| BUSC        | 11 | 1  | 9                   | 0.82            |
| BUSC        | 11 | 2  | 7                   | 0.64            |
| BUSC        | 11 | 3  | 7                   | 0.64            |
| DMA         | 12 | 1  | 10                  | 0.83            |
| DMA         | 12 | 2  | 8                   | 0.66            |
| DMA         | 12 | 3  | 8                   | 0.66            |
| BNRE        | 15 | 1  | - 11                | 0.73            |
| BNRE        | 15 | 2  | 9                   | 0.60            |
| BNRE        | 15 | 3  | 8                   | 0.53            |
| DFSM        | 14 | 1  | 12                  | 0.86            |
| DFSM        | 14 | 2  | 8                   | 0.57            |
| DFSM        | 14 | 3  | 8                   | 0.57            |
| <b>Z03</b>  | 16 | 1  | 13                  | 0.82            |
| <b>Z</b> 03 | 16 | 2  | 9                   | 0.56            |
| Z03         | 16 | 3  | 9                   | 0.56            |

Table 2 - Minimum F. Required for 100% Completion

#### 3.2 Switch Box Flexibility

Table 3 gives the percentage routing completion for each circuit where  $\frac{F_c}{W}=1$ , with the switchbox flexibility,  $F_s$ , varying from one to four. The values of W are set to the minimum possible for 100% completion as indicated by the global router. Clearly, low flexibilities achieve almost complete routability (note that the maximum value of  $F_s$  is W). This makes intuitive sense because even for  $F_s=1$ , every incoming wire to the switch block has at least one way of connecting to each outgoing side. In the general case, it is easy to show that the number of different paths between the beginning and terminating C-blocks is given by:

$$#Paths = F_c \times F_s^N$$

where N is the number of switch blocks in the path. The average value of N for these circuits is about 3, and so for the average connection, using  $F_s = 2$ , and  $F_c = 10$ , there are 80 different paths. For  $F_s = 3$  there are 270 paths. Thus a small increase in flexibility of the switch box vastly increases the number of paths, and hence the routability.

| Circuit | W  | %Routing Completion |           |        |           |
|---------|----|---------------------|-----------|--------|-----------|
|         | İ  | F, = 1              | $F_s = 2$ | F, = 3 | $F_s = 4$ |
| BUSC    | 8  | 97.2                | 99.0      | 99.7   | 99.7      |
| DMA     | 7  | 94.9                | 97.2      | 97.9   | 98.1      |
| BNRE    | 10 | 98.8                | 99.1      | 99.5   | 99.5      |
| DFSM    | 9  | 98.1                | 99.7      | 99.6   | 99.8      |
| Z03     | 11 | 98.5                | 99.8      | 99.9   | 99.9      |

**Table 3 -** Effect of  $F_s$ , for  $\frac{F_c}{W} = I$ 

For lower  $F_c/W$  ratios, we observe that increasing  $F_s$  has quickly diminishing returns, as shown in Table 4. A higher  $F_s$ , however, will compensate for a lower  $F_c$ .

| Circuit     | W  | %Routing Completion |           |           |           |
|-------------|----|---------------------|-----------|-----------|-----------|
|             |    | $F_s = 1$           | $F_s = 2$ | $F_s = 3$ | $F_s = 4$ |
| BUSC        | 11 | 76.8                | 95.2      | 95.2      | 95.2      |
| DMA         | 11 | 83.3                | 93.5      | 93.8      | 93.9      |
| BNRE        | 13 | 82.8                | 94.6      | 94.8      | 95.0      |
| DFSM        | 13 | 84.3                | 96.1      | 96.1      | 96.1      |
| Z03         | 13 | 79.6                | 94.2      | 94.4      | 94.6      |
| <b>Z</b> 03 | 13 | 79.6                | 94.2      | 94.4      | 94        |

**Table 4 -** Effect of 
$$F_s$$
, for  $\frac{F_c}{W} = 0.5$ 

#### 4 Conclusions and Future Work

This paper has explored the relationship between routing structure flexibility and routability of FPGAs. The principal conclusions are that connection blocks should have high flexibility to achieve high percentage routing completion, but that switch blocks need limited flexibility. In the future, we will look at finer gradations of the switchblock flexibility - values of  $F_s$  between 0 and 1, as well as 1 and 2. Also the effect of flexibility on area and speed will be modeled and measured.

### **5 References**

[Brow90]

S. Brown, "Routing Architectures and Algorithms for FPGAs", Ph.D. dissertation, Univ. Toronto, in preparation.

[Cart86]

W. Carter et. al, "A User Programmable Reconfigurable Gate Array," Proc. 1986 CICC, May 1986, pp. 233-235.

[Keut87]

K. Keutzer, "DAGON: Technology Binding and Local Optimization by DAG Matching," Proc. 24th DAC, June 1987, pp. 341-347.

[ElGa89]

A. El Gamal, et. al, "An Architecture for Electrically Configurable Gate Arrays," IEEE JSSC Vol. 24, No. 2, April 1989, pp. 394-398.

[Hsie88]

H. Hsieh, et. al "A 9000-Gate User-Programmable Gate Array," Proc. 1988 CICC, May 1988, pp. 15.3.1 - 15.3.7.

[Marr90]

C. Marr, "Logic Array Beats Development Time Blues," Electronic System Design Mag., Nov. 1989, pp. 38-42.

[Ples89]

Plessey Semiconductor ERA60100 preliminary data sheet.

[Rose8

J. Rose, Z. Vranesic, W. M. Snelgrove, "ALTOR: An Automatic Standard Cell Layout Program," Proc. Can. Conf. on VLSI, Nov. 1985, pp. 168-173.

[Rose88]

J. Rose, "LocusRoute: A Parallel Global Router for Standard Cells," Proc. 25th DAC, June 1988, pp. 189-195.

[Rose89]

J.S. Rose, R.J. Francis, P. Chow, and D. Lewis, "The Effect of Logic Block Complexity on Area of Programmable Gate Arrays," Proc. 1989 CICC, May 1989, pp. 5.3.1-5.3.5.

[Rose90]

J.S. Rose, R.J. Francis, D. Lewis, and P. Chow, "Architecture of Programmable Gate Arrays: The Effect of Logic Block Functionality on Area Efficiency," to appear in IEEE JSSC, 1990.

[Wong89]

S.C. Wong, H.C. So, J.H. Ou, J. Costello, "A 5000-Gate CMOS EPLD with Multiple Logic and Interconnect Arrays," Proc. 1989 CICC, May 1989, pp. 5.8.1 - 5.8.4.