Automated Debugging with High Level Abstraction and Refinement

Sean Safarpour 1  Andreas Veneris 2

Abstract—Design debugging is a manual and time consuming task which takes as much as 60% of the verification effort. To alleviate the debugging pain automated debuggers must tackle industrial problems by increasing their capacity and improving their performance. This work introduces an abstraction and refinement methodology for debugging that leverages the high level information inherent to RTL designs. Function abstraction uses the modular nature of designs to simplify the debugging problem. If required, refinement re-introduces the necessary circuitry back into the design in order to find all error locations. The abstraction and refinement process is applied throughout the design's hierarchy allowing for a divide and conquer methodology. The proposed technique is shown to reduce the memory requirement by as much as 27 × and reduce the run-time by two orders of magnitude over a conventional debugger.

I. INTRODUCTION

The continuous increase in design size and complexity has led to a significant escalation in the verification cost and effort. This trend is confirmed in the industry as the number of verification engineers has quadrupled with respect design engineers over the last decade [1]. As a result, functional verification has received much attention from the academic and industrial communities to alleviate this pain [2], [3], [4]. In contrast relatively little attention is paid to the task of debugging, or locating the error source, once verification fails.

Today, debugging is predominantly performed manually by verification engineers with no more than waveform viewers and navigation tools at their disposal. The debug process is comprised of manually collecting information from the failed simulation trace (or counter-examples) and back-tracing with “what-if” analysis until the error source is identified. As typical design block sizes today exceed the half million synthesized gates mark and traces range from a few hundred to a few thousand of clock cycles, debugging has grown to take as much as 60% of the verification effort [5]. As a result, scalable automated debugging techniques have become an urgent necessity to alleviate this pain [1].

Broadly speaking, there are two factors that influence the effectiveness of automated debugging. The first factor is the design size that impacts the solution space. As designs are implemented at higher levels of abstraction and the gate count increases, the number of suspect error locations that needs to be examined also increases. In turn, a debugger’s performance can dramatically slow down [6], [7]. Secondly, the length of the error trace, that is, the number of clock cycles from the beginning of simulation until the design fails, increases the solution space one must examine. Modern debugging solutions must cope with the complexity introduced by these factors to be adopted by the industry.

This work aims to bridge this gap between current debugging capabilities and contemporary industrial needs. It does so by introducing the concepts of abstraction and refinement for automated debuggers using high level information. More specifically, the modular and hierarchical nature of a design at the RTL is leveraged to develop an efficient abstraction and refinement methodology. In the past, similar techniques have led to dramatic improvements in the scalability and applicability of model checking methodologies [8], [9], [10]. More recently, results from abstraction/refinement at the gate level demonstrated significant performance gains for debugging as well [11].

The proposed functional abstraction-based debugging methodology operates iteratively on the design hierarchy. At the topmost level, it begins by abstracting or “simplifying” high level components. Next, a conventional debugger [12], [7], [6] is applied to the abstracted model to identify all error locations. Since the debugging problem is much smaller than the original, it can be solved much more efficiently. If the debugger fails to identify all error locations, refinement is performed to systematically re-introduce the abstracted design components back into the design.

This pairing sequence of debugging and refinement is repeated until all solutions are found at the given hierarchy level. Next, the process is repeated at the next lower level to improve the granularity of the solutions. The methodology terminates once all the error locations are found at the lowest hierarchy level (i.e. gate level). It is important to notice that the proposed theory is not tied to any particular debugging technique and applies to SAT, simulation, and BDD based methods [12], [6]. Extensive experiments on large industrial problems demonstrate a drastic memory reduction of over 27 × and run time reductions of over two orders of magnitude. This work demonstrates that abstraction and refinement has a significant impact on the performance of automated debugging.

This paper is organized as follows. Section II provides background material while the main contribution is introduced in Section III. Section IV and V present the experimental results and the conclusion, respectively.

II. PRELIMINARIES

A circuit \( C \) (combinational or sequential) at the RTL can be hierarchically composed of modules or functions. In this work, a function is said to generate a Boolean value for a variable \( y \) based on \( m \) input variables \( x_1, x_2, ..., x_m \) and zero or more state variables. In this work, we are primarily concerned with the structural connectivity between the input variables and the output variable \( y \) of a function. As a result, we label the

01 Vennsa Technologies Inc., Toronto, ON M5V 3B1 (sean@vennsa.com)
02 University of Toronto, ECE Department, Toronto, ON M5S 3G4 (veneris@eecg.toronto.edu)
function of $y$ as $f(x_1, x_2, ..., x_m)$ and omit its dependence on any state variables. The terms modules, components and functions are used interchangeably to refer to entities implementing functions as defined above. For example, a Verilog function or a collection of logic gates and flip-flops can define a module. Each module implements a multi-output function $F = \{f_1(X), f_2(X), ..., f_p(X)\}$ where each single-output function $f_i$ is defined on input variables $X = \{x_1, x_2, ..., x_q\}$. In the remaining paper, single output functions and multi-output functions are not distinguished unless explicitly stated otherwise.

Modules can also contain sub-modules thus resulting in a hierarchy tree $H$ for the design [13]. A hierarchy tree $H$ contains nodes representing modules and edges representing parent and child (sub-module) relationships. The hierarchy tree $H$ can contain many levels, each function is tagged with a superscript that indicates its level and a subscript to uniquely label the function. For example a function $F^i_j$ is at level $i$ of the tree and it can have sub-functions $F^{i+1}_k$ and $F^{i+1}_l$ at the next level $i + 1$. The output of the entire design $C$ is represented by $F^0_1$ at root level 0.

A. Debugging Background

Error traces (or counter-examples) returned by simulation or formal verification tools along with a corresponding set of correct output vector sequences comprise the set of diagnosis vectors $V$. Given an erroneous design $C$ with the corresponding set of diagnosis vectors $V$, design debugging identifies components (gates, modules, etc.) responsible for the erroneous behavior. We assume that the reference (golden) model can be in some high level language (C/C++, Matlab, etc) and provide no abstracted state elements. Since this representation contains less logic than the original one, the size of the problem may be reduced considerably, in favor of debugging. Debugging an abstract model can sometimes return abstracted state elements as error sources. In these cases, a refinement procedure replaces some of the abstracted variables with the original state elements. Note that this technique does not leverage any high level information when performing abstraction and thus cannot iterate over the different hierarchy levels.

III. Debugging with Function Abstraction

The abstraction/refinement method introduced in [11] operates on the circuit’s state elements. Although powerful, deciding which states to abstraction is not a trivial task. In contrast, function abstraction, presented in this paper, utilizes the high level information contained in the circuit RTL and operates on functions and modules providing a natural way to structure the debugging problem. Furthermore, the hierarchical composition of designs can be leveraged to apply abstraction in a systematic and iterative manner.

A. Formulating the Problem

At any hierarchy level, the original design $C$ can undergo function abstraction based on the functions available at that level. The abstraction function is given by

$$C' = h_f(C, i),$$

where $C$ is the original circuit and $i$ is a given hierarchy level. The mapping of $h_f$ is a design $C'$ which contains a set of functions $Abs_i = \{F_j, F_k, \ldots\}$ abstracted at hierarchy level $i$. In other words, the circuitry corresponding to $Abs_i$ is removed from $C$ and hierarchy $H$, resulting in $C'$ and $H'$. In our methodology, determining which functions to abstract is found through heuristics outlined in Section IV. After removing $Abs_i$ from $C$, the functions $\{F_j, F_k, \ldots\}$ are replaced with new primary input $\{X_{F_j}, X_{F_k}, \ldots\}$ in $C'$.

Example 1 Figure 1 (a) and (b) show a design with its corresponding hierarchy tree. The functions of the design are $F^1_1, F^2_1, F^3_1$ at level 1 and $F^2_2, F^2_3$ at level 2. By abstracting functions $F^1_1$ and $F^3_1$ the resulting design $C'$ with the new primary input $X_{F^2_1}$, $X_{F^3_1}$ shown in Figure 1 (c). The corresponding hierarchy tree $H'$ for the new abstracted design is shown in Figure 1 (d). Notice that the sub-module of $F^2_1$, $F^2_3$ is also abstracted in $C'$.

As a consequence of the abstraction operation, some of the transitive fanin of the abstracted functions may be dangling, that is, at level $i$, fanin circuitry of $Abs_i$ may not be connected.
Figure 2 shows the correction models as black dots on the example of Figure 1 at level 1. Notice that suspects are similar models exist for other debugging techniques [13].

In the context of SAT-based debugging a mechanism is required to identify when suspects may be constrained [11], [6], is a multiplexer that allows the debugger to identify any gate as suspects (including primary inputs).

To resolve this situation, the logic values of the abstracted functions [12] are removed. This allows the debugger to identify any gate as suspects (including primary inputs). To find all equivalent locations, the added primary input, which are unconstrained in the vectors must be captured and used to constrain the primary input, while operating on the problem. Such logic can be deleted with a logic removal algorithm to further simplify the concrete design.

New solutions returned by the debugger in this formulation cannot contain the set of newly added variables while operating on the problem. Such logic values of the abstracted design is less complex than the concrete design.

To create a new stimulus, the logic values of the abstracted design must be extracted and the abstracted design (1) (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) (m) (n) (o) (p) (q) (r) (s) (t) (u) (v) (w) (x) (y) (z)

Abstract modules corresponding to spurious solutions must be refined to determine where the error location is with respect to these variables. As explained in [11], modular debugging. Refinement is performed on line 8 if any of the new primary inputs is added to the abstract design. If any of the new primary inputs are found as suspects, then the error in the concrete design may be encapsulated inside the abstracted functions. These primary inputs are constrained by their original fanin logic. Consequently the added primary input, which are unconstrained in the vectors resulting from Algorithm 1 contains the spurious solutions.

The steps of debugging again to determine the error location corresponding to the original functions are re-introduced, all its transitive fanin is also added to the abstract design resulting from Algorithm 1. Once the abstracted design is less complex than the concrete design, the abstracted modules corresponding to spurious solutions are called.

Proof Theorem 1 and other results in this section.

Lemma 1 below is useful to prove Theorem 1 and other results in this section.
Lemma 1 Assuming that an erroneous behavior is only observed at the primary output of a design, a module $m_c$ demonstrates an erroneous behavior only if its parent module $m_p$ demonstrates an erroneous behavior.

Proof: Take any parent module $m_p$ of module $m_c$ from a hierarchical design. All paths from any set of output of module $m_c$ to the primary output of the circuit will contain the output of modules $m_p$. Thus, the outputs of $m_p$ dominate any set of output of $m_c$. As a result, the parent module demonstrates an erroneous behavior if an error behavior from $m_c$ is observed at the primary output.

Theorem 1 The debugging technique presented in Algorithm 1 finds all equivalent error modules at a given hierarchy level.

Proof: Based on Lemma 1, error modules demonstrate an erroneous behavior only if their parent modules demonstrate an erroneous behavior. This means that if a child module at level $i$ is erroneous, then the debugger will find the parent modules erroneous at levels $j < i$. At any hierarchy level, if the erroneous module or its parents are abstracted, the corresponding spurious solutions will be found. Refinement ensures that the content of the abstracted modules are reintroduced into the design thus allowing the erroneous module to be identified. Since all equivalent error locations cannot be distinguished for a given set of vectors, all equivalent error modules will be found.

C. Hierarchical abstraction

This function abstraction scheme introduced in the previous section is most effective when it is used in a hierarchical manner, where module-based debugging can be applied at each hierarchy level [13], [15]. At each level $i$ of the hierarchy, the design can be represented by a set of functions $\{F^1_i, F^2_i, \ldots\}$. As discussed in the previous section, the iterative sequence of abstraction, debugging and refinement is applied until no spurious solutions are found. Thus at each level the abstracted functions in $C'$ and their children in the hierarchy tree $H$ are not required for debugging. In other words, it is not necessary to consider the internal logic of abstracted functions in $C'$ at level $i$ when debugging at the level $i + 1$ as presented in Theorem 2.

Theorem 2 After applying Algorithm 1, only the modules at level $i$ present in design $C'$ are necessary for debugging at level $j \geq i + 1$.

Proof: By the contrapositive of Lemma 1, no errors can be encapsulated in child modules if the parent modules are not refined at level $i$. Thus, to debug errors at level $j \geq i + 1$, it is not necessary to consider abstracted modules at level $i$.

Apart from the functions abstracted at level $i$, more functions can be abstracted at level $i + 1$. Intuitively, at each level of the hierarchy, functions are comprised of one or more sub-functions. This allows for sub-functions to be abstracted at lower levels of hierarchy even though their collection cannot be abstracted at a higher level. For example, consider Figure 3 where an error resides in $F^2_2$. At level 1, function $F^1_1$ cannot be abstracted since it contains the error, however, at level 2 sub-function $F^2_2$ may be abstracted since it is independent from $F^2_2$ and its output.

D. Multiple Errors

The module-based abstraction and refinement methodology described above is capable of debugging multiple error locations. The process for finding multiple error locations is similar to that in [11]. In that process, the cardinality used to find the errors starts at one and increases as the error location is not found. However, when spurious solutions are found and the corresponding module is refined, the error cardinality must be reset to 1. For function abstraction, the same process must be followed since resetting the cardinality is crucial for maintaining correctness.

Consider the example in Figure 4 where the modules $Abs_1 = \{F^1_2, F^1_3\}$ are abstracted at level 1. The abstraction results in the removal of modules $F^1_3$ and $F^3_3$ as well because they fan-in to $Abs_3$. The pre- and post- abstraction circuits are shown in Figure 4 (a) and (b), respectively. Assuming that the error is in module $F^3_3$, the error effect can propagate to the output of $Y_1$ and $Y_2$. Under those conditions, the debugger will not identify a single error module, but will find the error pair of $X_{F^3_3}$ and $X_{F^3_3}$. Through refinement, these modules and their fanin circuitry will be re-introduced in the circuit. At this point, the number of error modules the debugger seeks must be reset to 1, otherwise, the single error source inside $F^3_3$ will be missed.

E. Overall algorithm

Hierarchical debugging and function abstraction can be combined to give further gains when debugging a design. Hierarchy-based debugging can identify erroneous modules in a divide and conquer approach while abstraction simplifies each step of the problem. The overall proposed technique for function abstraction is shown in Algorithm 2. Here the
debugging problem is solved iteratively by descending the hierarchy while the details of abstraction and refinement are performed as require at each level by Module Debug according to Algorithm 1.

IV. EXPERIMENTS

This section presents experimental results for the proposed high level abstraction and refinement methodology. All the circuits used are Verilog designs from the OpenCores.org website [16] except for an industrial communication design (comm). Each circuit contains a functional level error such as an incorrect statement, incorrect module instantiation, bad wiring between modules, etc. These RTL errors typically represent tens or hundreds of gate-level errors. The debugger used in all experiments is the module-aware SAT-based automated debugger of [13]. This set of experiments is conducted on a 64 bit Intel Core 2 Quad processor with 2.66 GHz and 8GB of memory.

Table I presents a summary of the circuits and the statistics using SAT-based debugging [13]. Columns one, two, and three show the name of the debugging problem based on the design, and its size in terms of gates and state elements (DFFs), respectively. Column four presents the length of the erroneous trace in terms of clock cycles required to observe the erroneous behavior from an initial state. When the trace is too long for the debugger, the trace is reduced to only contain the last 25 or 40 transitions in order to make automated debugging feasible. The number of clock cycle traces used to formulate the debugging problem are presented in the parentheses in column four. The column # literals, time (s), and peak mem (M) present the benefit of the proposed technique in terms of the number of literals required in the problem formulation, the total run time in seconds and peak memory requirement by the entire algorithm.

The improvement provided by the proposed technique when compared to a state-of-the-art method such as this of [13] is shown in the last columns of Table II. Its effectiveness is attributed to reducing the problem size which is directly related to the number of literals. For example, in vga1 where 5 / 14 modules are used, it leads to 630.48× reduction in literals which results in a 260.89× improvement in run time and 27.17× reduction in overall memory requirement. For all problems, the number of refinement and debugging iterations performed is larger than one. Therefore, it is clear that each iteration is much easier and faster when abstraction is used, thus it is more advantageous to run more iterations on easier problem than fewer iterations on harder problems.

Table II presents the result of the proposed technique where initially all functions are abstracted. In other words, we rely on refinement to re-introduce all the circuitry required to debug the design. Column one shows the names of the problems, while column two shows the maximum error cardinality (maxN) required to solve the debugging problem. As discussed in Section III-D the cardinality required to locate the bug using an abstracted design can be larger than required to solve the original problem. Even though the problems shown here have a single functional-level (RTL) error, for the problem comm1 and comm3 a higher cardinality of 2 is used by the algorithm to find the error site.

### Algorithm 1

**Hierarchical Debugging**

1. Solutions = ∅, level = 0, N = 1, C’ = C
2. while (1) do
3.  level = level + 1
4.  (New_sols, C’) = Module Debug(C’, level, N)
5.  if New_sols = ∅ then
6.     return Solutions
7.  else
8.     Solutions = Solutions ∪ New_sols
9.   end if
10. end while

### Table I

**SUMMARY OF PROBLEMS FOR FUNCTION ABSTRACTION**

<table>
<thead>
<tr>
<th>design</th>
<th>Problem statistics</th>
<th>Debugger engine [13]</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>size</td>
<td># literals</td>
</tr>
<tr>
<td>wb_com1</td>
<td>81K</td>
<td>19 (19)</td>
</tr>
<tr>
<td>wb_com2</td>
<td>81K</td>
<td>189 (40)</td>
</tr>
<tr>
<td>fdct1</td>
<td>564K</td>
<td>189 (40)</td>
</tr>
<tr>
<td>mem ctrl</td>
<td>39K</td>
<td>1145 (38)</td>
</tr>
<tr>
<td>vga1</td>
<td>147K</td>
<td>1610 (40)</td>
</tr>
<tr>
<td>vga2</td>
<td>197K</td>
<td>17712</td>
</tr>
<tr>
<td>comm1</td>
<td>450K</td>
<td>19 (25)</td>
</tr>
<tr>
<td>comm2</td>
<td>454K</td>
<td>88 (25)</td>
</tr>
<tr>
<td>comm3</td>
<td>26852</td>
<td>138 (25)</td>
</tr>
</tbody>
</table>

In Table II, the column labeled # itr states the number of refinement and debugging iterations required to find all equivalent locations (number of times line 5 of Algorithm 1 is run). The column mod refined / total presents the number of modules refined out of the total number of modules in the concrete design. These modules are the only ones required to diagnose the error. The smaller this number is, the more effective is the abstraction technique. The next three columns, # literals, time (s), and peak mem (M) present the benefit of the proposed technique in terms of the number of literals required in the problem formulation, the total run time in seconds and peak memory requirement by the entire algorithm.

In Table II there are two problems that experience a slow-down. For problem fdct1, six iterations are required to solve the problem, at which stage all 5 modules are used. Thus in this case, the extra iterations simply add overhead as the entire circuit is needed in order to solve the problem. The problem vga2, also experiences a slow down, but in this case, a 2.26× reduction in memory is observed. In this case, unlike the overall trend, the simpler and faster debugging problems cannot compensate for the extra iterations performed.

Figure 5 (a), (b) and (c) provide detail into the numbers of Table II for vga2, fdct1 and comm1, respectively. These figures illustrate the relationship between the run time shown in solid line and the number of literals shown in dashed line against the refinement and debugging iterations. Notice the general trend where both run time and number of literals appear to increase exponentially with the increase in the number of iterations. For the majority of cases where the proposed technique is effective, abstraction allows the problem to be solved with a fraction of its size thus leading to smaller memory requirements and run times. Considering problem vga2, notice that for iterations 3,4,5 the solve time is quite high thus not providing any run time benefit.
In general, it is found that more aggressive abstraction leads to more debugging and refinement steps. However, due to the simplicity of the design when abstracted aggressively, the initial debugging and refinement iterations are relatively much easy problems and thus quicker to solve. This behavior is observed in Figure 5 where the initial iterations have a faster run time than later ones. It may be possible to find an abstraction heuristic that can balance the number of iterations and the functions abstracted, but this is not a trivial task. However, in these experiments, it is found that simply abstracting all functions and modules is quite effective as it results with the smallest problems possible.

V. Conclusion

This work presents abstraction and refinement techniques for automated debugging that leverage the high level information contained in RTL design. More specifically, functions of the designs are first abstracted resulting in smaller debugging problems. To ensure that all the equivalent error locations are found in the original design, a refinement process is performed. The performance of the methodology is enhanced as it is applied hierarchically to the RTL design. The experiments demonstrate two orders of magnitude of performance improvement over a conventional debugging.

REFERENCES