This section describes how the set of architectures obtained by varying the parameters discussed in Section 2 can be compared on the basis of speed, area, and flexibility.
One way to compare architectures would be to gather a set of benchmark circuits (each containing memory), and attempt to map the logical memory configuration of each circuit to each architecture. Each mapping attempt may or may not be successful. A flexibility measure could be obtained by counting the number of successful mappings to each architecture, and the architecture with the highest count would be deemed the most flexible. Detailed access time and area models can be used to estimate the access times and chip area of each memory implementation.
The problem with this approach stems from the fact that circuits typically have only a few logical memories. This is in contrast to previous studies on logic block architectures, where each circuit contains enough logic blocks that, even for a moderate number of benchmark circuits, hundreds (or thousands) of logic blocks will be used. Thus, all architectural features of the logic block are thoroughly exercised. This is not the case with memory; to adequately exercise each configurable memory architecture, thousands of logical memory configurations would be required. Clearly, it is not feasible to gather that many benchmark circuits.
As an alternative, we have developed a "logical memory configuration generator" that generates logical memory configurations randomly, constrained by the set of parameters shown in Table 3. This table also gives the parameter values used to gather all the results in this paper. Each configuration is generated as follows. First, the number of logical memories is chosen randomly, with each number between 1 and the maximum given in Table 3 equally likely. Then, for each logical memory, a width and a depth are selected, each between the minimum and maximum values given in Table 3 for that dimension. A further parameter indicates what proportion of the generated dimensions are a power of two. We have set this proportion to 0.8; this means that 80% of the depths and 80% of the widths generated are a power of two (all powers of two between the minimum and maximum width, or between the minimum and maximum depth, are equally likely). Once the dimensions of all memories have been chosen, the total number of bits is compared to the maximum total number of bits given in Table 3, and if it is larger, a completely new set of dimensions is chosen (for the same number of logical memories). This is repeated until the total number of bits no longer exceeds this maximum. To gather all results in the next section, 10,000 logical memory configurations were generated.
Table 3: Parameters for workload generator
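The generation procedure described above can be illustrated with a minimal Python sketch. It is not the generator used in this study: the class `GeneratorParams`, its field names, and all of its default values are hypothetical placeholders rather than the entries of Table 3. The sketch also assumes that a dimension that is not a power of two is drawn uniformly from the non-powers-of-two in its range, which the text does not specify.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneratorParams:
    # Hypothetical names and placeholder values; these are NOT the Table 3 values.
    max_memories: int = 4
    min_width: int = 1
    max_width: int = 64
    min_depth: int = 8
    max_depth: int = 2048
    pow2_fraction: float = 0.8   # proportion of dimensions that are powers of two
    max_total_bits: int = 65536


def is_pow2(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def pick_dimension(lo: int, hi: int, pow2_fraction: float, rng: random.Random) -> int:
    """Pick one dimension in [lo, hi]; a power of two with probability pow2_fraction."""
    pow2 = [v for v in range(lo, hi + 1) if is_pow2(v)]
    other = [v for v in range(lo, hi + 1) if not is_pow2(v)]
    if pow2 and (not other or rng.random() < pow2_fraction):
        return rng.choice(pow2)      # all powers of two in range equally likely
    return rng.choice(other)         # assumption: uniform over the remaining values


def generate_configuration(p: GeneratorParams, rng: random.Random):
    """Return a list of (width, depth) pairs, one per logical memory."""
    n = rng.randint(1, p.max_memories)   # each count from 1 to the maximum equally likely
    while True:
        mems = [
            (pick_dimension(p.min_width, p.max_width, p.pow2_fraction, rng),
             pick_dimension(p.min_depth, p.max_depth, p.pow2_fraction, rng))
            for _ in range(n)
        ]
        # Redraw every dimension (same n) if the total bit count is too large.
        if sum(w * d for w, d in mems) <= p.max_total_bits:
            return mems


rng = random.Random(0)
params = GeneratorParams()
workload = [generate_configuration(params, rng) for _ in range(200)]
```

Keeping the number of logical memories fixed while redrawing the dimensions follows the description in the text; a side effect is that the loop terminates only if the chosen count can fit within the bit limit at all, which holds for the placeholder values above.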
To map a logical memory configuration onto an architecture, an algorithm that assigns arrays, address buses, and data buses to each logical memory is required. If the mapping blocks are fully populated, that is, any external bus can be connected to any array, the mapping problem is easy. However, for architectures with mapping blocks similar to Figure 4, the task is much less straightforward. Such an algorithm was developed, and is described in [14].
To compare implementations in terms of speed and area, detailed access time and area models are needed. The access time model used in this study was modified from a detailed cache access time model [15]. It contains terms for the delays due to the decoder, word lines, bit lines, column multiplexors, and sense amplifiers, as well as routing. The area model is based on a cache area model [16]. Area measurements are given in memory bit equivalents (mbe); one mbe is equal to the size of one memory cell in an SRAM array (1 mbe = 0.6 rbe in [16] ≈ 250 μm² in a 0.8 μm CMOS process).
As described earlier, flexibility is also an important metric. Each attempt to map a logical memory configuration to an architecture might or might not be successful. There are several reasons why an attempt might not be successful: the architecture might not contain enough bits, it might not have enough data lines, the mapping blocks might not be flexible enough to combine arrays in such a way that the configuration can be mapped, or the granularity of the arrays might be such that too many bits are wasted (the last of these reasons will be described in more detail in Section 4). A measure of flexibility can be obtained by counting the number of successful mappings to each architecture, and by using the following definition:
Flexibility = (number of successful mappings) / (number of attempted mappings)
Using this definition, an architecture that is better able to adapt to the generated logical memory configurations has a higher flexibility.
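As an illustration, the following Python sketch computes such a measure, assuming that flexibility is the fraction of attempted mappings that succeed. The function `toy_can_map` is a hypothetical stand-in and not the mapping algorithm of [14]: it checks only the first two failure reasons listed above (too few bits, too few data lines) and ignores mapping block flexibility and array granularity. Its parameters `total_bits` and `data_lines`, and their values, are likewise assumptions made for the example.

```python
from typing import Callable, Iterable, Sequence, Tuple

Memory = Tuple[int, int]      # (width, depth) of one logical memory
Config = Sequence[Memory]     # one logical memory configuration


def flexibility(configs: Iterable[Config], can_map: Callable[[Config], bool]) -> float:
    """Fraction of configurations that the architecture can implement."""
    configs = list(configs)
    if not configs:
        raise ValueError("at least one configuration is required")
    successes = sum(1 for cfg in configs if can_map(cfg))
    return successes / len(configs)


def toy_can_map(cfg: Config, total_bits: int = 4096, data_lines: int = 16) -> bool:
    """Hypothetical necessary-condition check; NOT the algorithm of [14]."""
    bits_needed = sum(w * d for w, d in cfg)
    lines_needed = sum(w for w, _ in cfg)
    return bits_needed <= total_bits and lines_needed <= data_lines


configs = [
    [(8, 256)],              # fits: 2048 bits, 8 data lines
    [(16, 512)],             # fails: 8192 bits exceed the bit capacity
    [(8, 128), (8, 128)],    # fits: 2048 bits, 16 data lines
    [(32, 64)],              # fails: 32 data lines exceed the 16 available
]
result = flexibility(configs, toy_can_map)
```

Passing the mapper as a callable keeps the measure independent of any particular mapping algorithm, so the same function can score different architectures against a common set of generated configurations.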