## Body Biased Sense Amplifier with Auto-Offset Mitigation for Low-Voltage SRAMs

Dhruv Patel, Student Member, IEEE, Adam Neale, Member, IEEE, Derek Wright, Member, IEEE, and Manoj Sachdev, Fellow Member, IEEE

Abstract—This paper proposes a Differential-Input Body Bias Sense Amplifier (DIBBSA) with an auto-offset mitigation feature for low-voltage SRAMs, in which the differential bitline signals are applied to the sources as well as to the bodies of the critical sensing transistors. We simulated and fabricated the proposed DIBBSA architecture with various operational modes in 65-nm CMOS technology to analyze the effectiveness of body biasing in mitigating the offset. The standard deviation of offset ( $\sigma_{OS}$ ) was measured over 5120 SAs in 10 ICs. At iso-gate area, a 0.4 V supply, and 25 °C, the proposed DIBBSA-FL and DIBBSA-PD modes reduce $\sigma_{OS}$ by 68.1% and 61.9% compared to the conventional Current Latch SA (CLSA), and by 24.1% and 18.1% compared to the Voltage Latch SA (VLSA), respectively. Measurements on 512 SAs in one IC show that, across the temperature range of 0 °C to 75 °C at 0.4 V, both the DIBBSA-FL and DIBBSA-PD modes achieve a minimum required differential input voltage that is 48% lower than that of the CLSA and 28% lower than that of the VLSA.

*Index Terms*— Offset Cancellation, Static Random Access Memory (SRAM), Dynamic Body Biasing, Threshold Voltage Mismatch, Comparator, Variation Tolerant Circuits, Latch.

## I. INTRODUCTION

Applications with stringent energy constraints, such as health monitoring, Internet of Things (IoT), bio-implantable, wearable, and many other battery-operated devices, are the impetus for low-energy, low-voltage, and reliable Systems on Chip (SoCs). Such SoCs are primarily occupied by the Static Random Access Memory (SRAM) used for cache data storage [1]. Therefore, SRAM is a critical block that determines the performance, yield, and reliability of SoCs. A Sense Amplifier (SA) is a critical SRAM circuit responsible for faithfully amplifying and digitizing the data signal sensed from the SRAM cell [2]. The key SRAM performance metrics, such as minimum supply voltage, minimum read access time, and power consumption, rely significantly on the SA's performance [3, 4]. The most important SA characteristics are the minimum differential input voltage ( $\Delta V_{BL-min}$ ), the power consumption, and the sensing delay [5].

The aggressive downscaling of nanoscale CMOS technologies poses a significant design challenge for reliable low-voltage operation due to increased random mismatch variations and leakage [6-10]. The primary challenge in SA design is to make correct decisions with the smallest possible $\Delta V_{BL-min}$ while tolerating any adverse mismatch conditions. Minimizing $\Delta V_{BL-min}$ improves the overall SRAM read sensing delay because it takes less time to discharge the highly capacitive bitlines; this also reduces energy consumption. The smallest possible $\Delta V_{BL-min}$ is mainly determined by the SA input offset, $V_{OS}$ [3, 12-15]. The work of Abu Rahma et al. [16], implementing a 16 Mb SRAM in 28 nm technology, highlighted that a single mV of increase in the standard deviation of the SA offset distribution ( $\sigma_{OS}$ ) requires 10 mV of additional bitline discharge to maintain the read yield threshold of 97%. Hence, the $V_{OS}$ of the SA significantly affects the overall SRAM performance.

The overall $V_{OS}$ of an SA is an aggregate of the mismatches in the threshold voltages $(V_T)$, the drain currents, the gain factors, and the physical SA layout [8,17,18]. However, the $V_T$ mismatch is the most significant factor determining the $V_{OS}$ [19-24]. Strategies to compensate for the SA's $V_T$ mismatches fall into two categories: calibration techniques and offset compensation techniques. The simplest way to reduce $V_{OS}$ is to increase the device sizes [25]; however, the consequences are increased die area, bitline loading, and power dissipation. One approach is to add devices that either provide a feedback mechanism to reduce the sensitivity to $V_T$ mismatches [26-29] or calibrate the offset, either dynamically [18, 24, 30-33] or with post-process trimming [34-37]. Another alternative is to use multistage timing, where the connections between the SAs are changed to reduce $V_T$ mismatches [38-40]. While there are many potential solutions to the offset problem, they incur added costs in die area, power dissipation, or design complexity. Many researchers have exploited the body terminal to fine-tune circuit performance in CMOS technologies [41-53]. However, few have used the body terminal to calibrate and mitigate the $V_{OS}$ of SAs [54, 55], and those approaches come at the cost of increased circuit area and periodic recalibration to mitigate ageing effects on $V_{OS}$.

This paper builds on our previous work [56] and presents a Differential Input Body Biased Sense Amplifier (DIBBSA) that is suitable for low-voltage SRAMs, does not require calibration, and can compensate the $V_{OS}$ dynamically. It is compatible with most symmetrical and fully-differential 6T, 8T, and 10T SRAM cells. The rest of the paper is organized as follows: Section II provides a detailed operation of the DIBBSA. Section III analyzes the offset tolerance and estimation. Section IV analyzes the sensing delay and power consumption. Section V analyzes the effectiveness of body bias across the supply voltage. Sections VI and VII show the test chip implementation and measurement results. Section VIII compares the proposed DIBBSA with the state-of-the-art SAs. Finally, Section IX concludes the paper.

This work was supported in part by NSERC under grant NSERC-RGPIN-205034-2012 052714.

Dhruv Patel was with the University of Waterloo and is now with the Electrical and Computer Engineering Department, University of Toronto, Toronto, ON M5S 3G4 (e-mail: dhruv.patel@isl.utoronto.ca).

Adam Neale was with the University of Waterloo and is now with Intel, Hillsboro, OR 97124, USA (e-mail: adam.neale@gmail.com).

Derek Wright and Manoj Sachdev are with the Electrical and Computer Engineering Department, University of Waterloo, Waterloo, ON N2L 3G1 (e-mails: derek.wright@uwaterloo.ca, msachdev@uwaterloo.ca).

## II. DIFFERENTIAL INPUT BODY BIASED SENSE AMPLIFIER

#### A. Body Bias and Threshold Voltage Shift

Body biasing exploits the dependence of a transistor's $V_T$ on its source-to-body voltage $(V_{SB})$. The threshold voltage can be modeled by (1), where $V_{T0}$ is the threshold voltage without body bias (i.e. $V_{SB} = 0$ ), $\gamma$ is the body effect coefficient, and $\varphi_F$ is the Fermi potential [57].

$$V_T = V_{T0} + \gamma \left( \sqrt{\left| -2\varphi_F + V_{SB} \right|} - \sqrt{\left| -2\varphi_F \right|} \right)$$
(1)

Fig. 1 shows the impact of applying a $V_{SB}$ on the magnitude of a PMOS transistor's $V_T$, $|V_{TP}|$, in 65-nm CMOS. Forward body bias ( $V_{SB} > 0$ ) lowers $|V_{TP}|$, and reverse body bias ( $V_{SB} < 0$ ) raises it. Simulations of a saturated PMOS transistor at 0.4 V and 1.0 V show, for example, that changing the $V_{SB}$ from -40 mV to +40 mV lowers $|V_{TP}|$ by 8 mV and 10 mV, respectively. A judicious application of body bias can partially offset the impact of $V_T$ mismatches in the SA, which is critical in reducing $\Delta V_{BL-min}$ [18, 54, 56]. The proposed DIBBSA exploits this technique to dynamically alter the $V_T$ of the critical sensing transistors to reduce $V_{OS}$.
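
As a rough numerical illustration of (1), the sketch below evaluates the $V_T$ shift for a small forward and reverse body bias. The values of `GAMMA` and `PHI_F` are placeholder assumptions chosen only for illustration; they are not extracted from the 65-nm models used in this work, so the computed shift should not be compared directly against Fig. 1.

```python
import math

# Hypothetical parameters, NOT taken from the paper or from any foundry model.
GAMMA = 0.2   # body effect coefficient in V^0.5 (assumed)
PHI_F = 0.4   # Fermi potential in V (assumed)


def vt_shift(v_sb: float, gamma: float = GAMMA, phi_f: float = PHI_F) -> float:
    """Return V_T - V_T0 in volts for a given V_SB, following the form of (1).

    With this sign convention a positive V_SB (forward body bias) gives a
    negative shift, and a negative V_SB (reverse body bias) a positive one.
    """
    return gamma * (math.sqrt(abs(-2.0 * phi_f + v_sb))
                    - math.sqrt(abs(-2.0 * phi_f)))


if __name__ == "__main__":
    for v_sb_mv in (-40, 0, 40):
        shift_mv = 1e3 * vt_shift(v_sb_mv * 1e-3)
        print(f"V_SB = {v_sb_mv:+4d} mV -> V_T shift = {shift_mv:+.2f} mV")
```

The only point of the sketch is the sign and the order of magnitude of the shift: a few tens of millivolts of body bias move $V_T$ by a few millivolts, in opposite directions for forward and reverse bias.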



Fig. 1. Impact of body bias on the $V_T$ of the PMOS device in the saturation region. Simulations performed at $V_{DD} = 0.4$ V and 1.0 V with a $V_{SB}$ resolution of 1 mV.

#### B. DIBBSA Design and Operation

Fig. 2 (a) and (b) show the schematics of the proposed DIBBSA-FL (7T) and DIBBSA-PD (9T) modes. The only difference between the DIBBSA-FL and DIBBSA-PD is that the latter has additional N3/N4 transistors to predischarge (PD) the output nodes, whereas the former only equalizes the output nodes with N5, leaving the outputs floating (FL) before the sensing phase. We describe the DIBBSA operation for the DIBBSA-PD mode, which enables both body bias and predischarge.

The transistors P1/P2 are connected to the bitlines BL/BLB, while their bodies are connected to BLB/BL, respectively. Transistors N1/N2, together with P1/P2, complete the latching element. Transistors P3/P4 act as switches and provide operational control of the DIBBSA-PD. Arguably, P3/P4 can be merged with the column select signal ( $Y_{mux}$ ) of the SRAM architecture; however, in this arrangement, they provide an additional benefit because their respective source and body terminals receive the BL/BLB inputs. The N5 transistor equalizes OUT/OUTB before the sensing phase.



Fig. 2. Schematic of the proposed (a) DIBBSA with floating output nodes (DIBBSA-FL) and (b) DIBBSA with predischarge output nodes (DIBBSA-PD).

The operation of DIBBSA-PD is as follows. Initially, the SAEB signal is high, which turns off P3 and P4 and turns on N3/N4/N5 to equalize the outputs, discharging OUT/OUTB to GND. The sensing operation starts once an adequate $\Delta V_{BL}$ signal is developed. Assuming that $V_{BL} = V_{DD}$ while $V_{BLB} = V_{DD} - \Delta V_{BL}$, P1 is forward body biased by $\Delta V_{BL}$ while P2 is reverse body biased by the same voltage. Subsequently, SAEB transitions to '0', activating the DIBBSA-PD and making P3 and P4 similarly forward and reverse body biased, respectively. P1/P3, being forward body biased, provide a relatively lower resistance to the OUT node, while P2/P4, being reverse body biased, provide a relatively higher resistance to the OUTB node. This arrangement creates a current race in which OUT starts to charge up faster than OUTB, amplifying their voltage difference. At some point, with the N1/N2 transistors conducting, the regenerative feedback kicks in, converging OUT to $V_{DD}$ and OUTB to GND.
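
The body bias polarities described above can be summarized with a small sketch. It is an idealized bookkeeping aid, not a circuit simulation: it assumes that the source side of P1/P3 sits at the BL voltage with the body at BLB, and that the source side of P2/P4 sits at the BLB voltage with the body at BL. The function names are hypothetical.

```python
def body_bias(v_bl: float, v_blb: float) -> dict:
    """Return V_SB (source minus body, in volts) for the four PMOS devices.

    Assumed connectivity: P1/P3 have BL on the source side and BLB on the
    body; P2/P4 have BLB on the source side and BL on the body.
    """
    left = v_bl - v_blb    # branch driving OUT
    right = v_blb - v_bl   # branch driving OUTB
    return {"P1": left, "P3": left, "P2": right, "P4": right}


def describe(v_sb: float) -> str:
    """Classify a PMOS V_SB as forward, reverse, or zero body bias."""
    if v_sb > 0:
        return "forward"
    if v_sb < 0:
        return "reverse"
    return "zero"


if __name__ == "__main__":
    vdd, dv_bl = 0.4, 0.05  # example values: V_BL = V_DD, V_BLB = V_DD - dV_BL
    for name, v_sb in sorted(body_bias(vdd, vdd - dv_bl).items()):
        print(f"{name}: V_SB = {1e3 * v_sb:+.0f} mV ({describe(v_sb)})")
```

Swapping the two bitline voltages swaps the roles of the two branches, which is the symmetry the sensing scheme relies on.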

Fig. 3 (a) shows the DIBBSA architecture implemented in 65-nm GP CMOS with four operational modes, selectable with the SEL[0:1] control signals, including its non-body-biased Differential Input SA (DISA) modes, DISA-FL/PD. These modes are listed in Table I. The layout of the proposed DIBBSA-PD without test mode devices (i.e. multiplexers and switches) is shown in Fig. 3 (b). The DIBBSA-FL layout removes the N3/N4 devices, further reducing the area. The post-layout parasitic capacitance on each n-well is extracted to be 0.6 fF, two to three orders of magnitude smaller than the typical SRAM bitline capacitance range. The mode selection feature allows the effectiveness of dynamic body bias to be independently controlled and characterized. Comparing the performance of the modes within the same cell helps isolate the differences due to body biasing by avoiding discrepancies between cells, such as process variations and node capacitances.



Fig. 3. (a) On-chip implemented proposed DIBBSA with different operational test modes for characterization purposes. (b) DIBBSA-PD layout (stripped of test mode selection switches).

TABLE I BODY BIAS SAS WITH OPERATIONAL TEST MODES FOR CHARACTERIZATION

| SEL1 | SEL0 | Mode Name                 | Mode Description                                                          |
|------|------|---------------------------|---------------------------------------------------------------------------|
| 0    | 0    | DIBBSA-FL (proposed mode) | Signals (BL/BLB) to Body, Outputs equalized but not pre-discharged to GND |
| 0    | 1    | DISA-FL                   | V<sub>DD</sub> to Body, Outputs equalized but not pre-discharged to GND   |
| 1    | 0    | DIBBSA-PD (proposed mode) | Signals (BL/BLB) to Body, Outputs equalized and pre-discharged to GND     |
| 1    | 1    | DISA-PD                   | V<sub>DD</sub> to Body, Outputs equalized and pre-discharged to GND       |
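
For test automation, the mode selection of Table I can be captured in a small lookup. The sketch below is only one possible encoding; the `Mode` type and the `decode_mode` helper are hypothetical names and are not part of the test chip or its software.

```python
from typing import NamedTuple


class Mode(NamedTuple):
    name: str
    body_input: str      # what drives the PMOS bodies
    predischarge: bool   # True if OUT/OUTB are pre-discharged to GND


# Keys are (SEL1, SEL0), following the rows of Table I.
MODES = {
    (0, 0): Mode("DIBBSA-FL", "BL/BLB", False),
    (0, 1): Mode("DISA-FL", "VDD", False),
    (1, 0): Mode("DIBBSA-PD", "BL/BLB", True),
    (1, 1): Mode("DISA-PD", "VDD", True),
}


def decode_mode(sel1: int, sel0: int) -> Mode:
    """Return the operational test mode selected by SEL1 and SEL0."""
    return MODES[(sel1, sel0)]


if __name__ == "__main__":
    for (sel1, sel0), mode in sorted(MODES.items()):
        print(sel1, sel0, mode.name, mode.body_input, mode.predischarge)
```

In this encoding SEL0 selects whether the bodies follow the bitlines or $V_{DD}$, and SEL1 selects whether the outputs are pre-discharged, which matches the row pattern of Table I.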

Fig. 4 (a) and (b) show the conventional CLSA and VLSA, respectively, which are also simulated and fabricated alongside the DIBBSA architecture on the same chip for comparison.

Fig. 5 compares the DIBBSA and conventional architectures, with the transistor gate area and layout area normalized to the respective areas of the VLSA. The layout area for the DIBBSA-FL/PD includes the area required to provide the n-well contacts. Compared with the VLSA, the proposed DIBBSA-FL and DIBBSA-PD achieve a 14% and 1.5% reduction in layout area and a 23.4% and 15.6% reduction in total transistor gate area, respectively.



Fig. 4. On-chip fabricated conventional (a) CLSA and (b) VLSA.



Fig. 5. SA comparison of relative gate and layout area in 65-nm CMOS.



Fig. 6. 1000 Monte-Carlo transient simulation of decision-making OUT/OUTB signals of SAs (a) DISA-FL (b) DISA-PD (c) DIBBSA-FL (d) DIBBSA-PD (e) CLSA (f) VLSA at  $V_{DD} = 0.4 \text{ V}$ ,  $\Delta V_{BL} = -40 \text{ mV}$  and TT/25 °C.

One thousand Monte-Carlo transient simulations of the DIBBSA modes, VLSA, and CLSA are performed at $V_{DD} = 0.4 \text{ V}$, $\Delta V_{BL} = -40 \text{ mV}$, and TT/25 °C, as shown in Fig. 6. The results reveal that all SAs except DISA-PD make correct decisions under these operating conditions.



Fig. 7. Transient response of the DISA–PD and the proposed DIBBSA-PD at  $V_{DD} = 0.4$  V and TT/25 °C with (a)  $\Delta V_{BL} = +25$  mV,  $\Delta V_{T:P1-P2} = -30$  mV and  $\Delta V_{T:P3-P4} = -30$  mV. (b)  $\Delta V_{BL} = -25$  mV,  $\Delta V_{T:P1-P2} = +30$  mV and  $\Delta V_{T:P3-P4} = +30$  mV.



Fig. 8. (a) Design concept and (b) the corresponding signals of the proposed DIBBSA-PD realized with symmetric and fully differential 6T SRAM cells and peripheral circuits in 65-nm CMOS technology at  $V_{DD} = 0.4$  V and TT/25 °C.

To distinguish between the DISA and DIBBSA modes, transient simulations are performed with adversely forced $V_T$ mismatches in the critical sensing transistor pairs P1/P2 (i.e. $V_{TP1} - V_{TP2} = \Delta V_{T:P1-P2}$ ) and P3/P4 (i.e. $V_{TP3} - V_{TP4} = \Delta V_{T:P3-P4}$ ). For simplicity, the non-critical N1-N5 transistors are kept perfectly matched; because the outputs are initially either predischarged or equalized close to GND, N1/N2 remain off during the critical amplification phase. The $\Delta V_{T:P1-P2}$ and $\Delta V_{T:P3-P4}$ mismatches are explicitly enforced in the undesirable direction by parametrizing the $V_{TP}$ of the PMOS transistors in the model file. For example, to enforce a $\Delta V_{T:P1-P2}$ of 30 mV, the $V_{TP}$ of P1 and P2 would be set as $|V_{TP1}| = |V_{TP0}| + 15$ mV and $|V_{TP2}| = |V_{TP0}| - 15$ mV, where $|V_{TP0}|$ is the magnitude of the nominal $V_T$ of the PMOS transistor given by the foundry model. Fig. 7 (a) shows the transient simulation comparison between the DIBBSA-PD and DISA-PD modes with $\Delta V_{BL} = -25$ mV and $\Delta V_{T:P1-P2} = \Delta V_{T:P3-P4} = +30$ mV at $V_{DD} = 0.4$ V and TT/25 °C. Fig. 7 (b) illustrates the complementary situation, with the polarities of the mismatch and of $\Delta V_{BL}$ reversed. In both Fig. 7 (a) and (b), under such adverse mismatch conditions, DIBBSA-PD with dynamic body biasing makes correct decisions, whereas DISA-PD makes incorrect decisions. Note that, when the DISA-PD makes an incorrect decision, its logic '1' resolves at $V_{DD} - \Delta V_{BL}$ instead of $V_{DD}$ because BL/BLB supply the sources of the PMOS transistors.

The design concept of the proposed DIBBSA in symmetrical and fully differential SRAM cells is similar to that of the conventional SAs. For example, Fig. 8(a) shows the utilization of the DIBBSA-PD with conventional 6T SRAM cells, and Fig. 8(b) shows the corresponding transient signals at $V_{DD} = 0.4$ V and TT/25 °C.

## III. OFFSET TOLERANCE AND ESTIMATION

To highlight the impact of body bias alone on the $V_T$ shift, and how its judicious application in the proposed DIBBSA helps to mitigate the offset related to $V_T$ mismatch, the total $V_T$ mismatch ( $V_{TP-mis}$ ) of the critical sensing transistors P1/P2 and P3/P4, defined by (2), is analyzed.

$$V_{TP-mis} = |\Delta V_{T:P1-P2}| + |\Delta V_{T:P3-P4}|$$
(2)

Fig. 9 shows the distribution of  $V_{TP-mis}$ , extracted by evaluating the  $V_T$  of P1/P2 and P3/P4 from the 5120 Monte-Carlo DC simulations. The simulations are performed with  $V_{DD} = 0.4$  V and in the presence of three different  $\Delta V_{BL}$  conditions on the DIBBSA state just before the SAEB is about to make a 1 $\rightarrow$ 0 transition (i.e. SAEB =  $V_{DD}$ ). The black graph illustrates the  $V_{TP-mis}$  distribution with no bitline signal (i.e.  $V_{BL} = V_{BLB} =$  $V_{DD}$  giving  $\Delta V_{BL} = 0$  mV). This distribution of  $V_{TP-mis}$  at  $\Delta V_{BL} = 0$  mV can be defined as an intrinsic  $V_T$  mismatch where the distribution is approximately centred at 0 V, as expected.



Fig. 9. Distribution of 5120 Monte-Carlo simulations evaluating the  $V_{TP-mis}$  of DIBBSA with  $\Delta V_{BL} = -50$  mV, 0 mV and +50 mV at  $V_{DD} = 0.4$  V. It shows the  $V_{TP-mis}$  distribution shifts towards the correct direction as  $|\Delta V_{BL}|$  is applied.

The red graph depicts the case where $V_{BL}$ is 400 mV and $V_{BLB}$ is 350 mV, resulting in $\Delta V_{BL} = +50$ mV. In this case, P1/P3 are forward body biased, and P2/P4 are reverse body biased, all by 50 mV. The mean ( $\mu_{VTP-mis}$ ) and standard deviation ( $\sigma_{VTP-mis}$ ) of the red curve are +23 mV and 26.1 mV, respectively. Importantly, this positive shift of the $V_{TP-mis}$ distribution for $V_{BL} > V_{BLB}$ indicates a lower $V_T$ for P1/P3 and a higher $V_T$ for P2/P4 compared to the black graph with $\Delta V_{BL} = 0$ mV (i.e. no bitline signals applied to the body). Hence, P1/P3 have become stronger, and P2/P4 have become weaker than in the $\Delta V_{BL} = 0$ mV case. This further amplifies the voltage difference between OUT and OUTB in the correct direction of the expected digital output. The blue graph shows the complementary case. Hence, the auto-offset mitigation feature is achieved.

Several researchers have carried out offset estimation and offset model development for different types of SAs. Singh and Bhat [22] developed an analytical model for a VLSA. The intrinsic offset of the SA is mostly due to the  $V_T$  mismatch between the pair of NMOS latching transistors, while the extrinsic offset is caused by the  $V_T$  mismatch between the PMOS pass transistors. Woo et al. derived a complex offset model for CLSA and VLSA latching that considers secondary transistor effects [23]. In this work, we focus on factors that most significantly impact the  $V_{OS}$ , namely the transistors'  $V_T$  mismatches. Equation (3) models the contribution of the  $V_T$  mismatches within the transistor pairs P1/P2, P3/P4, and N1/N2 on the overall offset voltage of the DIBBSA due to  $V_T$  mismatches,  $V_{OS-T}$ .

$$V_{OS-T} = \boldsymbol{\alpha} |\Delta V_{T:P1-P2}| + \boldsymbol{\beta} |\Delta V_{T:P3-P4}| + \boldsymbol{\gamma} |\Delta V_{T:N1-N2}|$$
(3)

 $\Delta V_{T:P1-P2}$, $\Delta V_{T:P3-P4}$, and $\Delta V_{T:N1-N2}$ are the $V_T$ mismatches within the P1/P2, P3/P4, and N1/N2 pairs, respectively. The coefficients $\alpha$, $\beta$, and $\gamma$ determine the contributions of $\Delta V_{T:P1-P2}$, $\Delta V_{T:P3-P4}$, and $\Delta V_{T:N1-N2}$, respectively, to the overall $V_{OS-T}$ of the DIBBSA.

Simulations in Fig. 10 (a) show the amount of  $\Delta V_{T:P1-P2}$  mismatch in an undesirable direction that can be applied before making a wrong decision for a given  $\Delta V_{BL}$ . The remaining SA transistors are kept perfectly matched to isolate the effect of P1/P2 mismatch. Simulations are performed at  $V_{DD} = 0.4$  V and the TT/25 °C corner. To derive the offset coefficient,  $\alpha$  associated with the  $\Delta V_{T:P1-P2}$  mismatch, the inverse slopes of the mismatch tolerance plots from Fig. 10(a) are calculated and shown in Fig. 10 (b). Similar offset tolerance simulations and offset coefficient extraction are performed for P3/P4 and N1/N2 pairs as shown in Fig. 10 (c)-(f).
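
The extraction of an offset coefficient from a mismatch tolerance plot can be sketched as follows. This is a simplified illustration that fits a single straight line to the whole curve and inverts its slope; the data points are synthetic placeholders, not values read from Fig. 10, and the helper names are hypothetical.

```python
def fit_slope(x, y):
    """Least-squares slope of y versus x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def offset_coefficient(dv_bl_mv, tolerance_mv):
    """Inverse slope of tolerated V_T mismatch (mV) versus applied dV_BL (mV).

    The result has units of mV of offset per mV of V_T mismatch.
    """
    return 1.0 / fit_slope(dv_bl_mv, tolerance_mv)


if __name__ == "__main__":
    # Synthetic placeholder data: tolerated mismatch grows by 2 mV per mV of dV_BL.
    dv_bl = [10, 20, 30, 40, 50]
    tolerance = [20, 40, 60, 80, 100]
    print(f"coefficient = {offset_coefficient(dv_bl, tolerance):.3f}")
```

A steeper tolerance curve therefore maps to a smaller coefficient, i.e. a transistor pair whose mismatch contributes less to the overall offset.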

A few conclusions can be drawn from the offset tolerance analysis. N1/N2 are not a significant design target because they can tolerate a much higher mismatch. On the other hand, P1/P2 are the most critical as they are the least tolerant of mismatch, followed by P3/P4. Table II summarizes the resulting average values of the $\alpha$, $\beta$, and $\gamma$ coefficients from Fig. 10 (b), (d), and (f) in units of mV $V_{OS-T}$ per mV $\Delta V_T$ for each of the DIBBSA modes. Lower offset coefficient values indicate higher tolerance in a given transistor pair. DIBBSA-FL/PD have lower coefficient values than their DISA-FL/PD counterparts, indicating that body biasing mitigates the overall voltage offset.



Fig. 10. Mismatch tolerance and derived offset factors of various modes of the proposed SA architecture across  $\Delta V_{BL}$  at  $V_{DD} = 0.4$  V, TT/25 °C and  $F_{CLK} = 3.33$  MHz. (a) P1/P2:  $\Delta V_{T:P1-P2}$ , (b)  $\alpha$  offset factor (c) P3/P4:  $\Delta V_{T:P3-P4}$ , (d)  $\beta$  offset factor, (e) N1/N2:  $\Delta V_{T:N1-N2}$ , and (f)  $\gamma$  offset factor

TABLE II EXTRACTED OFFSET COEFFICIENTS FOR THE PREDICTION OF $V_{OS-T}$ FOR OPERATIONAL TEST MODES OF THE DIBBSA ARCHITECTURE

| Offset Coefficients (units of mV $V_{OS-T}$ per mV $\Delta V_T$ ) | α     | β     | γ     |
|-------------------------------------------------------------------|-------|-------|-------|
| DISA-FL                                                           | 0.584 | 0.370 | 0.199 |
| DISA-PD                                                           | 0.545 | 0.407 | 0.187 |
| DIBBSA-FL                                                         | 0.520 | 0.340 | 0.181 |
| DIBBSA-PD                                                         | 0.491 | 0.366 | 0.184 |
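
As a worked example of (3), the sketch below evaluates $V_{OS-T}$ for each mode using the coefficients of Table II. The 10 mV mismatch applied to every pair is an arbitrary illustrative input, not a measured or simulated value, and the function name is hypothetical.

```python
# (alpha, beta, gamma) per mode, copied from Table II.
COEFFS = {
    "DISA-FL": (0.584, 0.370, 0.199),
    "DISA-PD": (0.545, 0.407, 0.187),
    "DIBBSA-FL": (0.520, 0.340, 0.181),
    "DIBBSA-PD": (0.491, 0.366, 0.184),
}


def v_os_t(mode: str, dvt_p12_mv: float, dvt_p34_mv: float,
           dvt_n12_mv: float) -> float:
    """Evaluate (3): offset in mV from the three pair mismatches in mV."""
    alpha, beta, gamma = COEFFS[mode]
    return (alpha * abs(dvt_p12_mv)
            + beta * abs(dvt_p34_mv)
            + gamma * abs(dvt_n12_mv))


if __name__ == "__main__":
    # Arbitrary example: 10 mV of mismatch in each transistor pair.
    for mode in COEFFS:
        print(f"{mode:10s} V_OS-T = {v_os_t(mode, 10.0, 10.0, 10.0):.2f} mV")
```

For equal mismatch in all three pairs, the result scales with the sum of the three coefficients, which is smaller for each DIBBSA mode than for its DISA counterpart.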

To further verify the benefit of dynamic body biasing, the mismatch offset tolerance analysis for the critical P1/P2 and P3/P4 pairs is extended across various global corners at a fixed $\Delta V_{BL} = 40$ mV, as shown in Fig. 11. Across all global corners, DIBBSA-FL and DIBBSA-PD tolerate higher mismatch than DISA-FL and DISA-PD, respectively.



Fig. 11. Mismatch offset tolerance across various process corners for DIBBSA modes at  $V_{DD} = 0.4 \text{ V}, \Delta V_{BL} = 40 \text{ mV}$  and  $F_{CLK} = 3.33 \text{ MHz}.$ 

#### A. Offset Tolerance Across SAEB Pulse Width

The sensitivity of the critical transistor pairs' mismatch tolerance to the SAEB pulse width (the sensing window) is analyzed. Fig. 12 (a) shows the $\Delta V_{T:P1-P2}$ mismatch tolerance versus the SAEB pulse width, with $\Delta V_{BL} = 40$ mV, $V_{DD} = 0.4$ V, and TT/25 °C. To validate the offset tolerance more rigorously, transistor noise is enabled for all devices, and 5 mV<sub>rms</sub> of white Gaussian noise is superimposed on $V_{DD}$, SAEB, and BL/BLB. Fig. 12 (b) shows a similar analysis for the $\Delta V_{T:P3-P4}$ mismatch. Both Fig. 12 (a) and (b) show higher tolerance for the proposed DIBBSA-FL/PD modes than for their corresponding non-body-biased DISA-FL/PD modes.



Fig. 12. (a)  $\Delta V_{T:P1-P2}$ , and (b)  $\Delta V_{T:P3-P4}$  mismatch tolerance of various modes of the proposed SA architecture across SAEB pulse width at  $V_{DD} = 0.4 V$ ,  $\Delta V_{BL} = 40$  mV, and TT/25 °C. Simulations are performed with 5 mV<sub>rms</sub> noise superimposed on  $V_{DD}$ , SAEB, and BL/BLB, and transistor noise enabled.

## IV. SENSING DELAY AND POWER ANALYSIS

#### A. SA Sensing Delay Simulations

The sensing delay is analyzed with identical load and input driving conditions applied to each SA. The sensing delay is measured from the 50% rise of the SAEB to the 50% rise of the buffering inverter outputs following OUT/OUTB. The simulation results shown in Fig. 13 are obtained by fixing $\Delta V_{BL} = -40$ mV at TT/25 °C with the transistors perfectly matched. Fig. 14 shows a snapshot of the sensing delay at $V_{DD} = 0.4$ V and highlights that the proposed DIBBSA-FL/PD modes outperform the DISA-FL/PD modes by about 10%. The conventional CLSA shows nearly the same sensing delay as the DIBBSA-FL/PD modes, and the conventional VLSA achieves a 44% lower sensing delay than the proposed DIBBSA-FL/PD modes. Nevertheless, the sensing delay of an SA is a relatively small fraction of the overall SRAM read path delay. The majority of the overall delay comes from discharging the highly capacitive bitlines, as dictated by the worst-case SA $V_{OS}$ requirement, and hence, reducing the SA $V_{OS}$ is most important [3, 58].
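
A delay measurement between two 50% crossings can be sketched as below for sampled waveforms. The sketch is a generic post-processing helper under stated assumptions, not the measurement script used in this work: it takes the first crossing of the level in either direction and interpolates linearly between samples. All names and the example waveforms are hypothetical.

```python
def crossing_time(t, v, level):
    """Time of the first crossing of `level` (either direction), by linear interpolation."""
    for i in range(1, len(t)):
        v0, v1 = v[i - 1], v[i]
        if (v0 < level <= v1) or (v0 > level >= v1):
            frac = (level - v0) / (v1 - v0)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    raise ValueError("waveform never crosses the requested level")


def sensing_delay(t, saeb, out_buf, vdd):
    """Delay from the 50% point of SAEB to the 50% point of the buffered output."""
    half = 0.5 * vdd
    return crossing_time(t, out_buf, half) - crossing_time(t, saeb, half)


if __name__ == "__main__":
    # Idealized example waveforms (time in ns, voltages in V), for illustration only.
    t = [0.0, 1.0, 2.0, 3.0, 4.0]
    saeb = [0.4, 0.4, 0.0, 0.0, 0.0]
    out_buf = [0.0, 0.0, 0.0, 0.4, 0.4]
    print(f"sensing delay = {sensing_delay(t, saeb, out_buf, 0.4):.2f} ns")
```

Because both crossings use the same 50% reference of $V_{DD}$, the result is insensitive to the absolute supply level, which makes delays at different $V_{DD}$ values directly comparable.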



Fig. 13. Simulated sensing delay trend comparison of SAs across  $V_{DD}$  at  $\Delta V_{BL} = -40$  mV and TT/25 °C.



Fig. 14. Simulated sensing delay comparison of SAs at  $V_{DD} = 0.4$  V,  $\Delta V_{BL} = -40$  mV and TT/25 °C.

#### B. SA Power Consumption Simulations

The power consumption of the SAs is also analyzed. The static power consumption of the SA ( $P_{st-SA}$ ) in low-voltage, energy-limited applications is an essential metric for appropriate SA circuit selection since SRAMs predominantly stay in an idle state. Fig. 15 shows the $P_{st-SA}$ comparison across $V_{DD}$ and indicates that the VLSA has the highest $P_{st-SA}$, whereas DIBBSA-FL and DISA-FL have the lowest. The higher $P_{st-SA}$ of the VLSA mainly results from its relatively large devices, especially the higher W/L ratio of the footer SAE transistor, N3. Fig. 16 illustrates a snapshot of the $P_{st-SA}$ at $V_{DD} = 0.4$ V, where the DIBBSA-FL and DIBBSA-PD consume 62% and 52% lower $P_{st-SA}$ than the VLSA, and 42% and 26% lower $P_{st-SA}$ than the CLSA, respectively.



Fig. 15. Simulated static power consumption of SAs across V<sub>DD</sub> at TT/25 °C.



Fig. 16. Simulated static power consumption comparison of SAs at  $V_{DD} = 0.4$  V and TT/25 °C.

The plot in Fig. 17 shows the total average power consumption $(P_{AV-SA})$ of the SAs across $V_{DD}$ while fixing $\Delta V_{BL} = -40$ mV at $F_{CLK} = 3.33$ MHz and TT/25 °C. The trendlines indicate that the CLSA has the highest $P_{AV-SA}$, whereas both DIBBSA-FL and DISA-FL have the lowest. The snapshot and breakdown of the $P_{AV-SA}$ at $V_{DD} = 0.4$ V are shown in Fig. 18. At a 0.4 V supply, the DIBBSA-FL consumes 12% and 36% lower $P_{AV-SA}$ compared to the VLSA and CLSA, respectively. The proposed DIBBSA-PD consumes 0.7% and 30% lower $P_{AV-SA}$ compared to the VLSA and CLSA, respectively. The $P_{AV-SA}$ breakdown is shown per supply rail where applicable. As expected, the SAEB loading contribution is the least since it is only connected to the transistor gates, whereas the BL/BLB contribution is the most as they act as a supply for the SA via the sources of P1/P2.



Fig. 17. Simulated total average power consumption trend comparison of SAs across  $V_{DD}$  at  $\Delta V_{BL} = -40$  mV,  $F_{CLK} = 3.33$  MHz and TT/25 °C.



Fig. 18. Simulated total average power consumption breakdown comparison of SAs at  $V_{DD} = 0.4$  V,  $\Delta V_{BL} = -40$  mV,  $F_{CLK} = 3.33$  MHz and TT/25 °C.

The SA dynamic power consumption, $P_{dy-SA}$, is derived at $V_{DD} = 0.4 \text{ V}$, $F_{CLK} = 3.33 \text{ MHz}$, and TT/25 °C by subtracting $P_{st-SA}$ from $P_{AV-SA}$, and is shown in Fig. 19 with a contribution breakdown. Compared to the VLSA, the total $P_{dy-SA}$ of the DIBBSA-FL is 2% lower, whereas that of the DIBBSA-PD is 12% higher. One may trade off some sensing delay to reduce $P_{dy-SA}$ by altering the device sizes, but this must be done without negatively impacting the offset voltage. Nevertheless, the $P_{dy-SA}$ is a small fraction of the total dynamic read power ( $P_{dy-read}$ ) of the SRAM read access, which is mainly due to discharging the highly capacitive bitlines [24, 33, 59, 60]. Ultimately, the offset reduction benefits of the DIBBSA reduce the overall $P_{dy-read}$, as described in Section X.



Fig. 19. Simulated dynamic power consumption breakdown of SAs at  $V_{DD}$  = 0.4 V,  $\Delta V_{BL}$  = -40 mV,  $F_{CLK}$  = 3.33 MHz and TT/25 °C.

## V. BODY BIAS EFFECTIVENESS ACROSS SUPPLY

An appropriate supply voltage range for optimum sensing yield of the DIBBSA architecture is analyzed. It is determined by examining the differential drive strength of the current sensing branches at the critical sensing instant as a function of the supply voltage. The initial condition when the SAEB signal makes the $1 \rightarrow 0$ transition is critical for the SA outputs to converge along the correct trajectory and to enable the positive feedback mechanism. At this instant, OUT and OUTB are at GND; consequently, N1 and N2 are off and do not participate in setting the initial SA decision trajectory. Therefore, only transistors P1 through P4, which are responsible for this trajectory, are shown in the equivalent circuits of Fig. 20 (a) and (b) for DISA-PD and DIBBSA-PD, respectively. The P3/P4 drains and P1/P2 gates are connected to GND since OUT and OUTB are predischarged at this instant. DC simulations are performed for these two equivalent circuits by sweeping $V_{DD}$ while keeping $\Delta V_{BL}$ = +25 mV and assuming perfectly matched devices. The transconductance model parameter of each transistor, $g_m$, is extracted. Finally, the differences within each pair are calculated to find the differential transconductance, $|\Delta g_m|$, which indicates the differential drive strength of the P1/P2 and P3/P4 pairs. The $|\Delta g_m|$ trend is plotted on the left y-axis in Fig. 20 (c) as the key sensing strength indicator. The pink curve plotted on the right y-axis in Fig. 20 (c) shows the $\frac{\Delta V_{BL}}{V_{DD}}$ ratio, indicating the relative bitline signal strength.

The main takeaway from Fig. 20 (c) is that the maximum differential drive strength is achieved at around $V_{DD} = 0.5$ V for the equivalent circuits of both DIBBSA-PD and DISA-PD. This analysis suggests that these SAs should provide the best sensing yield at $V_{DD}$ from 0.4 V to 0.7 V. Moreover, the $\frac{\Delta V_{BL}}{V_{DD}}$ ratio emphasizes that a higher $V_{DD}$ alone can lead to a higher $g_m$, but not necessarily to a higher $|\Delta g_m|$. Instead, the $\frac{\Delta V_{BL}}{V_{DD}}$ ratio must be increased for a higher $|\Delta g_m|$, which necessitates a higher $\Delta V_{BL}$ and, therefore, higher power. Also, the P1/P2 pair's $|\Delta g_m|$ is lower than the P3/P4 pair's because P1/P2 are operating in the triode region, whereas P3/P4 are operating in the saturation region at the instant of interest. Lastly, owing to body biasing, the peak $|\Delta g_m|$ of the DIBBSA-PD is higher than that of the DISA-PD.
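
The two quantities compared in this analysis are simple to compute once the per-transistor $g_m$ values have been extracted from the DC simulations. The sketch below shows the arithmetic only: the $\frac{\Delta V_{BL}}{V_{DD}}$ ratio uses the fixed 25 mV bitline signal from the text, while the $g_m$ values passed to `differential_gm` are hypothetical placeholders, not simulation results.

```python
DV_BL = 0.025  # fixed bitline differential in volts, as used in this analysis


def bitline_signal_ratio(vdd: float, dv_bl: float = DV_BL) -> float:
    """Relative bitline signal strength, dV_BL / V_DD."""
    return dv_bl / vdd


def differential_gm(gm_a: float, gm_b: float) -> float:
    """Magnitude of the transconductance difference within a transistor pair."""
    return abs(gm_a - gm_b)


if __name__ == "__main__":
    for vdd in (0.4, 0.5, 0.7, 1.0):
        print(f"V_DD = {vdd:.1f} V -> dV_BL/V_DD = {100 * bitline_signal_ratio(vdd):.2f} %")
    # Placeholder g_m values in siemens, for illustration only.
    print(f"|dgm| = {differential_gm(3.0e-6, 2.0e-6):.1e} S")
```

With $\Delta V_{BL}$ held at 25 mV, the ratio falls monotonically as $V_{DD}$ rises, which is why raising the supply alone does not guarantee a larger $|\Delta g_m|$.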



Fig. 20. (a) Equivalent circuit of DISA-PD (b) Equivalent circuit of DIBBSA-PD just when their SAEB signal makes the  $1 \rightarrow 0$  transition. (c) The simulated magnitude of differential transconductance,  $|\Delta g_m|$ , of the transistor pair from (a) and (b) across  $V_{DD}$  is plotted on left y-axis. The pink curve plotted on the right y-axis shows  $\frac{\Delta V_{BL}}{V_{DD}}$  ratio. Simulations are performed with  $\Delta V_{BL} = 25$  mV at TT/25 °C.

#### VI. TEST CHIP AND MEASUREMENT SETUP

Arrays of SAs are implemented on a 65-nm GP CMOS technology to measure the offset characteristics, similar to previous works [16, 21, 61]. A test chip with an SA array architecture shown in Fig. 21 is designed with 64 rows, each having 8 SAs of DIBBSA, VLSA and CLSA. Thus, a single test chip contains 512 SAs of each type that can be individually accessed with a row and a column decoder. Each row slice contains a common precharge and keeper circuit, a row selection driver, hold-path buffers, pull-down NMOS transistors, and a level shifter with a transparent latch. The BL/BLB signals are routed vertically throughout the array where  $\Delta V_{BL}$  is achieved by setting appropriate voltages at the BL and BLB analog pads (e.g.,  $V_{BL} = V_{DD}$ ,  $V_{BLB} = V_{DD} - \Delta V_{BL}$ ). All the address and control signals for the mode selection transistors operate under a fixed 1 V supply, whereas the rest of the circuitry, including SAs, operate under the desired  $V_{DD}$ . The fabricated test chip micrograph is shown in Fig. 22.

Fig. 23 (a) illustrates the test bench schematic, and Fig. 23 (b) shows the laboratory setup. CLK, ADDR and SEL signals are provided by the Tektronix DGA-200 data generator, and  $V_{DD}$ , BL, and BLB are provided by the benchtop precision power supplies. A Tektronix TLA-5101 logic analyzer captures DOUT. A temperature chamber controls the temperature of the test chip and the PCB.



Fig. 21. Test chip architecture for characterizing SA offset.

[Micrograph annotations: 512 VLSA, 512 DIBBSA/DISA, and 512 CLSA blocks with row decoder and column decoder; marked dimensions 345 µm and 170 µm.]

Fig. 22. Micrograph of the 65-nm GP CMOS test chip.



Fig. 23. (a) Test bench setup schematic. (b) Laboratory test bench setup.

#### VII. MEASUREMENT RESULTS

All the offset measurements are performed across 10 ICs with the hysteresis stress test pattern of $\Delta V_{BL}$ and SAEB shown in Fig. 24. The hysteresis stress test pattern has two phases. The first phase is the stress phase, where a high negative $\Delta V_{BL}$ relative to the desired $\Delta V_{BL}$ under test is applied to ensure that all the SAs flip to the opposite state. The second phase is the test phase, where the desired $\Delta V_{BL}$ is applied, and the SA yield is recorded. For example, if the SA yield is to be measured for $\Delta V_{BL} = 10$ mV, then in the stress phase, a $\Delta V_{BL}$ of -70 mV is first applied to flip all SAs to logic '0'. Then in the test phase, the desired $\Delta V_{BL}$ of 10 mV is applied, and the yield of SAs that output the correct logic '1' is recorded. This procedure stimulates any adverse hysteresis effect caused by asymmetric parasitics and accounts for it in the measurements of the offset's standard deviation, $\sigma_{OS}$, and mean, $\mu_{OS}$, which are the key metrics for the SA's offset variation and skew, respectively.
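The two-phase procedure can be illustrated with a minimal sketch. It assumes a toy SA model in which each SA has a fixed input offset and a hysteresis term that shifts its trip point against a change of state; the function names, the fixed 70 mV stress magnitude, and the model itself are illustrative assumptions, not the measurement code used for the test chip.

```python
from typing import List, Sequence, Tuple


def hysteresis_pattern(test_mv: float, expected_logic: int = 1,
                       stress_mv: float = 70.0) -> List[Tuple[str, float]]:
    """Return the (phase, delta_v_bl in mV) sequence for one test point.

    The stress phase drives every SA to the state opposite to the expected
    one; the test phase applies the value under test. The fixed stress
    magnitude is an assumption taken from the -70 mV example in the text.
    """
    stress = -stress_mv if expected_logic == 1 else stress_mv
    return [("stress", stress), ("test", test_mv)]


def sa_decision(delta_mv: float, offset_mv: float, prev_state: int,
                hysteresis_mv: float) -> int:
    """Toy SA: the previous state shifts the trip point against flipping."""
    trip = offset_mv + (hysteresis_mv if prev_state == 0 else -hysteresis_mv)
    return 1 if delta_mv > trip else 0


def yield_at(offsets_mv: Sequence[float], test_mv: float,
             hysteresis_mv: float = 0.0, stress_mv: float = 70.0) -> float:
    """Fraction of SAs that output logic '1' after the stress/test sequence."""
    pattern = hysteresis_pattern(test_mv, expected_logic=1, stress_mv=stress_mv)
    ones = 0
    for offset in offsets_mv:
        state = 1  # arbitrary state before the stress phase
        for _phase, delta_mv in pattern:
            state = sa_decision(delta_mv, offset, state, hysteresis_mv)
        ones += state
    return ones / len(offsets_mv)


if __name__ == "__main__":
    offsets = [-5.0, 0.0, 5.0, 12.0]  # hypothetical per-SA offsets in mV
    print(yield_at(offsets, 10.0, hysteresis_mv=0.0))
    print(yield_at(offsets, 10.0, hysteresis_mv=6.0))
```

In this toy model, a non-zero hysteresis term lowers the yield at a given $\Delta V_{BL}$, which is the effect the stress phase is meant to expose.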



Fig. 24. Hysteresis stress test pattern of  $\Delta V_{BL}$  used for SA  $V_{OS}$  characterization.

Fig. 25 shows the measured percentage of the SAs flipped to logic '1' across $\Delta V_{BL}$ at $V_{DD} = 0.4$ V, forming a cumulative distribution function (CDF). These measurements are performed across 10 ICs, totalling 5120 SAs of each type at 25 °C, where $\Delta V_{BL}$ is applied with a 1 mV resolution using the hysteresis stress test pattern. The resulting probability density function (PDF) curves are presented in Fig. 26. They represent the portion of SAs flipped to logic '1' per mV of $\Delta V_{BL}$. The standard deviation of the measured PDF of each SA type is computed and indicated as $\sigma_{OS}$. The $\sigma_{OS}$ of DISA-PD and DISA-FL is measured to be 11.8 mV and 11.7 mV, respectively. The conventional VLSA and CLSA achieved a $\sigma_{OS}$ of 11.3 mV and 18.0 mV, respectively. Both DIBBSA-FL and DIBBSA-PD achieved a measured $\sigma_{OS}$ of 10.2 mV, which is 9.7% lower than VLSA and 43% lower than CLSA.
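The CDF-to-PDF step and the resulting statistics can be sketched as follows. This is a minimal illustration under stated assumptions: the CDF is synthetic (a normal distribution with an assumed 1.5 mV mean and the reported 10.2 mV standard deviation) rather than measured data, and `offset_stats` and `improvement_pct` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of a normal distribution, used here only to synthesize data."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def offset_stats(dv_mv: Sequence[float],
                 frac_ones: Sequence[float]) -> Tuple[float, float]:
    """Differentiate a yield CDF into per-bin probability and return (mean, std).

    Each bin's probability is the CDF increment between adjacent sweep points
    and is assigned to the bin midpoint.
    """
    mids: List[float] = []
    pmf: List[float] = []
    for i in range(len(dv_mv) - 1):
        mids.append(0.5 * (dv_mv[i] + dv_mv[i + 1]))
        pmf.append(frac_ones[i + 1] - frac_ones[i])
    total = sum(pmf)
    mu = sum(m * p for m, p in zip(mids, pmf)) / total
    var = sum(p * (m - mu) ** 2 for m, p in zip(mids, pmf)) / total
    return mu, math.sqrt(var)


def improvement_pct(sigma_ref: float, sigma_new: float) -> float:
    """Relative reduction of sigma_new with respect to sigma_ref, in percent."""
    return 100.0 * (sigma_ref - sigma_new) / sigma_ref


# Synthetic sweep: 1 mV steps from -60 mV to +60 mV (assumed range).
dv = list(range(-60, 61))
cdf = [normal_cdf(x, 1.5, 10.2) for x in dv]
mu_os, sigma_os = offset_stats(dv, cdf)

if __name__ == "__main__":
    print(f"mu_OS = {mu_os:.2f} mV, sigma_OS = {sigma_os:.2f} mV")
    print(f"vs VLSA: {improvement_pct(11.3, 10.2):.1f} %")
    print(f"vs CLSA: {improvement_pct(18.0, 10.2):.1f} %")
```

The last two lines reuse the measured $\sigma_{OS}$ values quoted above to reproduce the relative improvements over the VLSA and CLSA.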



Fig. 25. Measured cumulative distribution showing the SA yield percentage across  $\Delta V_{BL}$ . Measurements are performed at  $V_{DD} = 0.4$  V and 25 °C across 5120 samples (from 10 ICs) of each SA type using the hysteresis stress test pattern of  $\Delta V_{BL}$  with a step resolution of 1 mV.



Fig. 26. Probability density derived from Fig. 25. It shows the portion of SAs flipped to logic '1' per mV of $\Delta V_{BL}$ at $V_{DD} = 0.4$ V and 25 °C.

Similar $\sigma_{OS}$ measurements are repeated across $V_{DD}$ from 0.33 V to 1.0 V for all 10 ICs at 25 °C. The $\sigma_{OS}$ versus $V_{DD}$ is shown in Fig. 27. It highlights that DIBBSA-FL/PD performed similarly and had lower $\sigma_{OS}$ compared to VLSA from $V_{DD}$ of 0.33 V up to 0.7 V, with a minimum at 0.5 V. This optimum $V_{DD}$ range for the DIBBSA is consistent with the differential drive strength analysis presented in Section V.



Fig. 27. Measured standard deviation of offset ( $\sigma_{OS}$ ) from 5120 SAs (10 ICs) across  $V_{DD}$  with hysteresis stress test pattern at 25 °C.

The offset distribution parameters, $\sigma_{OS}$ and $\mu_{OS}$, from each of the 10 ICs at $V_{DD} = 0.4$ V are individually reported in Fig. 28 and Fig. 29, respectively. They show that both $\sigma_{OS}$ and $\mu_{OS}$ remained lowest for the DIBBSA-FL/PD, highlighting its offset mitigation feature against within-die and inter-die variations.



Fig. 28. Measured inter-die (die-to-die) standard deviation of offset distribution ( $\sigma_{OS}$ ) from 512 SAs per IC across each of the 10 ICs with hysteresis stress test pattern at  $V_{DD} = 0.4$  V and 25 °C.



Fig. 29. Measured inter-die (die-to-die) mean of offset distribution ( $\mu_{OS}$ ) from 512 SAs per IC across each of the 10 ICs with hysteresis stress test pattern at  $V_{DD} = 0.4$  V and 25 °C.



Fig. 30. Measured minimum required $\Delta V_{BL}$ ($\Delta V_{BL-min}$) across temperature from 512 SAs of IC #1 with hysteresis stress test pattern at $V_{DD} = 0.4$ V.

Equation (4) gives the minimum  $\Delta V_{BL}$  necessary for all SAs to converge to a correct logic output,  $\Delta V_{BL-min}$ . Fig. 30 shows  $\Delta V_{BL-min}$  measured on IC #1 across temperatures from 0 °C to 75 °C.

$$\Delta V_{BL-min} = \max \begin{pmatrix} \left|\Delta V_{BL}\ \text{for 100\% SAs in logic '1'}\right|, \\ \left|\Delta V_{BL}\ \text{for 100\% SAs in logic '0'}\right| \end{pmatrix} \tag{4}$$

The DIBBSA-FL/PD performed best across the measured temperature range, with  $\Delta V_{BL-min} = 27 \pm 2$  mV, compared to other SAs. On average, this is 28% and 48% lower than the VLSA and CLSA, respectively.
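Equation (4) can be evaluated directly from a measured yield sweep. The sketch below assumes the sweep is given as pairs of $\Delta V_{BL}$ (in mV) and the fraction of SAs reading logic '1', and that it extends far enough to reach both 0% and 100%; the function name and the sample data are hypothetical.

```python
from typing import Sequence


def delta_v_bl_min(dv_mv: Sequence[float], frac_ones: Sequence[float]) -> float:
    """Evaluate (4): the larger magnitude of the two full-yield edges.

    The positive edge is the lowest sweep point from which every higher point
    reads 100% logic '1'; the negative edge is the highest sweep point up to
    which every lower point reads 100% logic '0' (0% logic '1').
    """
    pairs = sorted(zip(dv_mv, frac_ones))

    pos_edge = None
    for dv, frac in reversed(pairs):
        if frac < 1.0:
            break
        pos_edge = dv

    neg_edge = None
    for dv, frac in pairs:
        if frac > 0.0:
            break
        neg_edge = dv

    if pos_edge is None or neg_edge is None:
        raise ValueError("sweep does not reach both 0% and 100% yield")
    return max(abs(pos_edge), abs(neg_edge))


if __name__ == "__main__":
    # Hypothetical, slightly skewed sweep (not measured data).
    sweep_mv = [-30, -20, -10, 0, 10, 20, 30]
    ones = [0.0, 0.02, 0.20, 0.55, 0.95, 1.0, 1.0]
    print(delta_v_bl_min(sweep_mv, ones))
```

Taking the maximum of the two edges means a skewed distribution (non-zero $\mu_{OS}$) raises $\Delta V_{BL-min}$ even when $\sigma_{OS}$ is unchanged.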

The $P_{dy-read}$, given by (5), is analyzed to determine the power savings using DIBBSA-FL/PD. The $P_{dy-read}$ is composed of the bitline discharge power, $P_{dy-BL/BLB}$, and $P_{dy-SA}$, specified in (6) and (7), respectively. The $\Delta V_{BL-min}$ is computed from (8), where the measured $\sigma_{OS}$ of an SA is used as derived from Fig. 26. The $\zeta$ factor given by (9) represents the amount of $\Delta V_{BL-min}$ required per 1 mV of $\sigma_{OS}$ to meet a certain read yield for a particular SRAM size in a given technology. Hence, it is a function of bitline capacitance, SRAM cell strength and targeted yield. The concept of the $\zeta$ factor was previously mentioned in [16] for a 16 Mb SRAM in 28-nm CMOS and was measured to be $\zeta = 10$ mV/mV for 97% yield. For this analysis in 65-nm CMOS technology, a value of $\zeta = 8$ mV/mV is chosen as a starting point. The $P_{dy-SA}$ values are taken from Fig. 19. Hence, the $P_{dy-read}$ is computed and its reduction compared to the VLSA is determined for DIBBSA-FL/PD and DISA-FL/PD. This analysis is plotted across a typical range of bitline capacitance, $C_{BL}$, in Fig. 31. It predicts that DIBBSA-PD provides $P_{dy-read}$ savings for $C_{BL} > 25$ fF, whereas DIBBSA-FL provides $P_{dy-read}$ savings irrespective of $C_{BL}$. A similar analysis is repeated for a probable range of $\zeta$ values, where the minimum required $C_{BL}$ for a reduction in $P_{dy-read}$ compared to VLSA is shown in Fig. 32. It reveals that a $P_{dy-read}$ reduction is assured with DIBBSA-FL without any minimum required $C_{BL}$. For DIBBSA-PD, $C_{BL} > 32$ fF is required to assure a $P_{dy-read}$ reduction compared to VLSA.

$$P_{dy-read} = P_{dy-BL/BLB} + P_{dy-SA} \tag{5}$$

$$P_{dy-BL/BLB} = C_{BL} \cdot V_{DD} \cdot \Delta V_{BL-min} \cdot F_{CLK} \tag{6}$$

$$P_{dy-SA} = P_{dy-SA-V_{DD}} + P_{dy-SA-BL/BLB} + P_{dy-SA-SAE} \tag{7}$$

$$\Delta V_{BL-min} = \zeta \cdot \sigma_{OS} \tag{8}$$

$$\zeta = \frac{\Delta V_{BL-min}}{\sigma_{OS}} \left(\frac{mV}{mV}\right) \tag{9}$$
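The read-power model in (5), (6), and (8) also gives a closed-form $C_{BL}$ crossover point: a candidate SA consumes less $P_{dy-read}$ than a reference SA when $C_{BL} > (P_{dy-SA,new} - P_{dy-SA,ref}) / (V_{DD} \cdot \zeta \cdot (\sigma_{OS,ref} - \sigma_{OS,new}) \cdot F_{CLK})$. The sketch below illustrates this under stated assumptions: the $\sigma_{OS}$ values, $V_{DD}$, and $\zeta$ are taken from the text, while the clock frequency and the two $P_{dy-SA}$ values are hypothetical placeholders (the values of Fig. 19 are not reproduced here), so the printed crossover is illustrative only.

```python
def p_dy_read(c_bl_f: float, v_dd: float, zeta: float, sigma_os_v: float,
              f_clk_hz: float, p_dy_sa_w: float) -> float:
    """Dynamic read power in watts from (5), (6), and (8)."""
    dv_bl_min = zeta * sigma_os_v                    # (8)
    p_bl = c_bl_f * v_dd * dv_bl_min * f_clk_hz      # (6)
    return p_bl + p_dy_sa_w                          # (5)


def crossover_c_bl(v_dd: float, zeta: float, sigma_ref_v: float,
                   sigma_new_v: float, f_clk_hz: float,
                   p_sa_ref_w: float, p_sa_new_w: float) -> float:
    """Bitline capacitance above which the new SA has lower P_dy-read.

    Returns 0.0 when the new SA is better for any C_BL (its SA power is not
    higher than the reference while its sigma_OS is lower).
    """
    bl_saving_per_farad = v_dd * zeta * (sigma_ref_v - sigma_new_v) * f_clk_hz
    if bl_saving_per_farad <= 0.0:
        raise ValueError("new SA must have a lower sigma_OS than the reference")
    return max(0.0, (p_sa_new_w - p_sa_ref_w) / bl_saving_per_farad)


# From the text: V_DD = 0.4 V, zeta = 8 mV/mV, sigma_OS of VLSA and DIBBSA.
V_DD, ZETA = 0.4, 8.0
SIGMA_VLSA, SIGMA_DIBBSA = 11.3e-3, 10.2e-3
# Hypothetical placeholders, NOT values from the paper.
F_CLK = 100e6
P_SA_VLSA, P_SA_NEW = 50e-9, 60e-9

c_cross = crossover_c_bl(V_DD, ZETA, SIGMA_VLSA, SIGMA_DIBBSA, F_CLK,
                         P_SA_VLSA, P_SA_NEW)

if __name__ == "__main__":
    print(f"crossover C_BL = {c_cross * 1e15:.1f} fF (illustrative inputs)")
```

When the candidate SA has both a lower $\sigma_{OS}$ and no higher $P_{dy-SA}$ than the reference, the crossover is zero, which corresponds to savings irrespective of $C_{BL}$.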







Fig. 32. Minimum  $C_{BL}$  crossover point for  $P_{dy-read}$  improvement greater than 0% compared to VLSA across various values of  $\zeta$ .

#### VIII. COMPARISON WITH STATE-OF-THE-ART

Table III compares state-of-the-art SAs to the proposed DIBBSA-FL/PD and other recent SAs with offset mitigation features in similar planar CMOS technologies. The SA from [33] with multi-phase MOS capacitor-based $V_T$ matching implemented in 28 nm offers a 50% offset improvement at $V_{DD} = 0.5$ V compared to the VLSA. However, it takes an additional 4 transistors, 2 MOS capacitors, and 5 inverters, resulting in 3.2% area overhead in the overall SRAM. [54], implemented in 65-nm CMOS, offers a 49% offset improvement compared to the CLSA, but with many additional devices per SA for body-bias-based offset calibration at startup, which results in a 3.5% overall SRAM area increase. [40] considers boosting the VLSA's differential and common mode $\Delta V_{BL}$ by adding a switched capacitor-based boosting circuit, providing a 23% offset improvement at $V_{DD} = 0.3$ V, but at the cost of a 12% SA area overhead and increased complexity in SA timing circuits. [62] simply precharges the CLSA outputs with $\Delta V_{BL}$ to achieve 140 mV operation with a 1.7% improvement in SRAM read yield. [61] modifies the CLSA by judiciously precharging internal nodes with $\Delta V_{BL}$ to achieve a 7.6% iso-gate area improvement in offset at $V_{DD} = 0.4$ V. With the offset mitigation advantage of the dynamic body biasing with bitlines, the proposed DIBBSA-FL and DIBBSA-PD have reduced layout and total gate area compared to the conventional VLSA and CLSA. Similar to [21] and [61], the iso-gate offset improvement of the proposed SAs can further be justified by using Pelgrom's mismatch model, which gives the standard deviation of the $V_T$ mismatch between two devices, $\sigma(\Delta V_T)$, as

$$\sigma(\Delta V_T) = \frac{A_{T_0}}{\sqrt{WL}} \tag{10}$$

where $A_{T_0}$, W, and L are the technology-related area constant, width, and length of the two devices placed in proximity, respectively. If the additional 23.4% gate area is added to the DIBBSA-FL to equal the gate area of the VLSA, as shown in Fig. 5, it results in a further reduction of $\sigma_{OS}$ by $\left|1 - \frac{1}{\sqrt{1-0.234}}\right| = 14.4\%$. Hence, the overall iso-gate-area improvement of $\sigma_{OS}$ for the DIBBSA-FL is 14.4% + 9.7% = 24.1%. Similarly, the iso-gate-area improvement for the DIBBSA-PD is 18.1% compared to the VLSA. Table III makes similar comparisons with the CLSA.
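As a numeric check of the area scaling implied by (10), the sketch below assumes that $\sigma_{OS}$ scales as $1/\sqrt{\text{gate area}}$ and uses hypothetical helper names. It reports the change under two normalizations; the first (normalized to the area-matched $\sigma$) gives roughly 14.3%, in line with the figure quoted above, while normalizing to the original $\sigma$ gives roughly 12.5%.

```python
import math


def sigma_ratio(area_new_over_old: float) -> float:
    """Pelgrom scaling from (10): sigma_new / sigma_old for a gate-area change."""
    return 1.0 / math.sqrt(area_new_over_old)


# The DIBBSA-FL gate area is 23.4% smaller than the VLSA gate area (Table III).
AREA_REDUCTION = 0.234

# Growing the DIBBSA-FL to the VLSA gate area multiplies its area by 1/(1-0.234).
ratio = sigma_ratio(1.0 / (1.0 - AREA_REDUCTION))   # sigma_matched / sigma_original

# Change expressed relative to the area-matched sigma: |1 - 1/sqrt(1-0.234)|.
rel_to_matched = abs(1.0 - 1.0 / ratio)
# Change expressed relative to the original (smaller-area) sigma.
rel_to_original = 1.0 - ratio

if __name__ == "__main__":
    print(f"sigma ratio after area matching: {ratio:.4f}")
    print(f"relative to area-matched sigma:  {100 * rel_to_matched:.1f} %")
    print(f"relative to original sigma:      {100 * rel_to_original:.1f} %")
```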

It is noted that the body biasing dependence of the DIBBSA relies on the body-effect, which varies across technology and has become less significant as the technology scales down. Nevertheless, so long as the body terminal is available, this proposed DIBBSA-FL/PD can help provide a further reduction in offset with lower layout/gate area in low-voltage operations and hence, is a suitable replacement for the conventional VLSA and CLSA.

TABLE III. COMPARISON OF THE PROPOSED DIBBSA-FL/PD WITH STATE-OF-THE-ART OFFSET MITIGATING SAS.

| | [33] | [54] | [40] | [62] | [61] (w.r.t. CLSA) | [61] (w.r.t. VLSA) | This work: DIBBSA-FL (w.r.t. CLSA) | This work: DIBBSA-FL (w.r.t. VLSA) | This work: DIBBSA-PD (w.r.t. CLSA) | This work: DIBBSA-PD (w.r.t. VLSA) |
|---|---|---|---|---|---|---|---|---|---|---|
| Publication | JSSC 2016 | JSSC 2014 | LSSC 2018 | JSSC 2017 | TCAS-I 2019 | TCAS-I 2019 | This work | This work | This work | This work |
| Technology | 28-nm HP | 65-nm LP | 65-nm GP | 65-nm LP | 65-nm GP | 65-nm GP | 65-nm GP | 65-nm GP | 65-nm GP | 65-nm GP |
| Measured Samples | 2080<br>(1 IC) | 512<br>(1 IC) | 8192<br>(16 ICs) | 16<br>(1 IC) | 5120<br>(10 ICs) | 5120<br>(10 ICs) | 5120<br>(10 ICs) | 5120<br>(10 ICs) | 5120<br>(10 ICs) | 5120<br>(10 ICs) |
| Char. Type | Full SRAM | Full SRAM | SA array | Full SRAM | SA array | SA array | SA array | SA array | SA array | SA array |
| Latch Type / Sensing Topology | Multi-phase MOS capacitor-based threshold matching | CLSA with body-bias offset calibration on power-up | Large signal differential and common mode boost, and using bitlines as SA supply | Precharging SA outputs with bitlines | Precharging multiple internal nodes with bitlines | Precharging multiple internal nodes with bitlines | Applying bitlines as a SA supply and at the body of the critical sensing transistors. Outputs are equalized before the evaluation. | Applying bitlines as a SA supply and at the body of the critical sensing transistors. Outputs are equalized before the evaluation. | Applying bitlines as a SA supply and at the body of the critical sensing transistors. Outputs are pre-discharged and equalized before the evaluation. | Applying bitlines as a SA supply and at the body of the critical sensing transistors. Outputs are pre-discharged and equalized before the evaluation. |
| # of Devices | 11T + 2 MOS Caps + 5 INV | 15T + 2 NOR + 2 NAND + 3 INV + 1 Latch | 11T + 2 MOS cap | 9T | 11T | 11T | 7T | 7T | 9T | 9T |
| Design Effort Overhead | None | Static body biasing + calibration | Complex timing | None | None | None | Body-biasing with bitline signals | Body-biasing with bitline signals | Body-biasing with bitline signals | Body-biasing with bitline signals |
| SA Area Reduction or Overhead<sup>a.</sup> | 3.2% overhead in overall 128 kb SRAM (w.r.t. VLSA) | 3.5% overhead in overall 128 kb SRAM (w.r.t. CLSA) | 12%<sup>d.</sup> overhead w.r.t. VLSA | 16.8% overhead w.r.t. CLSA | 6.5%<sup>c.</sup>, 4.5%<sup>d.</sup> overhead w.r.t. CLSA | 30.7%<sup>c.</sup>, 18.8%<sup>d.</sup> overhead w.r.t. VLSA | 35.9%<sup>c.</sup>, 21%<sup>d.</sup> reduction w.r.t. CLSA | 23.4%<sup>c.</sup>, 14%<sup>d.</sup> reduction w.r.t. VLSA | 29.3%<sup>c.</sup>, 16.2%<sup>d.</sup> reduction w.r.t. CLSA | 15.6%<sup>c.</sup>, 1.5%<sup>d.</sup> reduction w.r.t. VLSA |
| Offset Improvement | 49%<sup>a.</sup> w.r.t. VLSA @ 0.5 V, 85 °C | 50%<sup>a.</sup> w.r.t. CLSA @ 1.2 V | 23.3% in Std. of offset @ 0.3 V, 25 °C | BER improvement from 2.8% to 1.1%<sup>a.</sup> @ 0.14 V, 25 °C | 46.6%<sup>f.</sup> w.r.t. CLSA @ 0.4 V, 25 °C | 7.6%<sup>f.</sup> w.r.t. VLSA @ 0.4 V, 25 °C | 68.1%<sup>f.</sup> w.r.t. CLSA @ 0.4 V, 25 °C | <b>24.1%</b><sup>f.</sup> w.r.t. VLSA @ 0.4 V, 25 °C | 61.8%<sup>f.</sup> w.r.t. CLSA @ 0.4 V, 25 °C | <b>18.1%</b><sup>f.</sup> w.r.t. VLSA @ 0.4 V, 25 °C |
| V<sub>DD-min</sub> at 25 °C | 500 mV | 370 mV | 230 mV | 140 mV | 260 mV | | 260 mV | | 330 mV | 330 mV |

<sup>a.</sup> Compared to conventional topology implemented in respective work

<sup>b.</sup> Area overhead in 2 Metal layers only and not in

<sup>c.</sup> Total gate area overhead

<sup>d.</sup> Layout area overhead

<sup>e.</sup> Improvement in measured $\Delta V_{BL-min}$

<sup>f.</sup> Iso-gate area improvement in measured $\sigma_{OS}$

#### IX. CONCLUSION

The judicious application of body bias to the sensing transistors of a sense amplifier can help mitigate offset voltage, which reduces the required discharge of bitlines in SRAMs. We designed, simulated, fabricated, and tested the proposed CMOS sense amplifiers, the DIBBSA-FL and DIBBSA-PD, which use the body bias effect to constructively reinforce output convergence to the right decision. The dynamic body biasing applied to the critical PMOS sensing transistors shifts the threshold voltages, setting the output trajectory in the correct direction towards the expected digital levels. The DIBBSA-FL and DIBBSA-PD are 7T and 9T designs, respectively, with lower gate and layout area than the conventional VLSA. Importantly, DIBBSA-FL and DIBBSA-PD reduce the iso-gate-area standard deviation of offset by 24.1% and 18.1% compared to VLSA, respectively. This study also showed that the DIBBSA-FL/PD works reliably at $V_{DD} = 0.4$ V from 0 °C to 75 °C. Ultimately, for low-voltage applications, DIBBSA-FL/PD can replace conventional SAs in SRAMs implemented in planar CMOS technologies.

#### REFERENCES

- [1] R. Kuppuswamy, S. R. Sawant, S. Balasubramanian, P. Kaushik, N. Natarajan and J. D. Gilbert, "Over one million TPCC with a 45nm 6-core Xeon® CPU," 2009 IEEE Intl. Solid-State Circuits Conf. (ISSCC) Digest of Technical Papers, San Francisco, CA, 2009, pp. 70-71,71a.
- [2] Y. Zorian and S. Shoukourian, "Embedded-memory test and repair: infrastructure IP for SoC yield," in *IEEE Design & Test of Computers*, vol. 20, no. 3, pp. 58-66, May-June 2003.
- [3] B. S. Amrutur and M. A. Horowitz, "Speed and power scaling of SRAM's," in IEEE J. Solid–State Circuits, vol. 35, no. 2, pp. 175-185, Feb. 2000, doi: 10.1109/4.823443.
- [4] K. Zhang, K. Hose, V. De, and B. Senyk, "The scaling of data sensing schemes for high speed cache design in sub-0.18 μm technologies," VLSI Symp. Tech. Dig., 2000, pp. 226–227.
- [5] R. Houle, "Simple statistical analysis techniques to determine minimum sense amp set times," *Proc. IEEE CICC*, Sep. 2007, pp. 37–40.
- [6] K. Sasaki et al., "A 9 ns 1 Mb CMOS SRAM," 1989 IEEE Intl. Solid-State Circuits Conf. (ISSCC). Digest of Technical Papers, New York, NY, USA, 1989, pp. 34-35.
- [7] S. Saxena et al., "Variation in Transistor Performance and Leakage in Nanometer-Scale Technologies," in *IEEE Trans. on Electron Devices*, vol. 55, no. 1, pp. 131-144, Jan. 2008.
- [8] S. Lovett, G. Gibbs, and A. Pancholy, "Yield and matching implications for static RAM memory array sense-amplifier design," *IEEE J. Solid-State Circuits*, vol. 35, no. 8, pp. 1200–1204, Aug. 2000.
- [9] A. Fritsch et al., "24.1 A 6.2 GHz Single Ended Current Sense Amplifier (CSA) Based Compileable 8T SRAM in 7nm FinFET Technology," 2021 IEEE Intl. Solid-State Circuits Conf. (ISSCC), San Francisco, CA, USA, 2021, pp. 334-336.
- [10] J. Keane et al., "17.2 5.6Mb/mm2 1R1W 8T SRAM arrays operating down to 560mV utilizing small-signal sensing with charge-shared bitline and asymmetric sense amplifier in 14nm FinFET CMOS technology," 2016 IEEE Intl. Solid-State Circuits Conf. (ISSCC), San Francisco, CA, USA, 2016, pp. 308-309.
- [11] B. S. Amrutur and M. A. Horowitz, "Speed and power scaling of SRAM's," in IEEE J. Solid-State Circuits, vol. 35, no. 2, pp. 175-185, Feb. 2000, doi: 10.1109/4.823443.
- [12] B. Wicht, T. Nirschl and D. Schmitt-Landsiedel, "Yield and speed optimization of a latch-type voltage sense amplifier," *IEEE J. Solid-State Circuits*, vol. 39, no. 7, pp. 1148-1158, July 2004.
- [13] B. Wicht, J. -. Larguier and D. Schmitt-Landsiedel, "A 1.5V 1.7ns 4k × 32 SRAM with a fully-differential auto-power-down current sense amplifier," 2003 IEEE Intl. Solid-State Circuits Conf. (ISSCC). Digest of Technical Papers., San Francisco, CA, USA, 2003, pp. 462-508 vol.1.
- [14] N. Verma and A. P. Chandrakasan, "A High-Density 45nm SRAM Using Small-Signal Non-Strobed Regenerative Sensing," 2008 IEEE Intl. Solid-State Circuits Conf. (ISSCC), San Francisco, CA, USA, 2008, pp. 380-621.
- [15] B. Wicht, D. Schmitt-Landsiedel, S. Paul and A. Sanders, "SRAM current-sense amplifier with fully-compensated bit line multiplexer," 2001 IEEE Intl. Solid-State Circuits Conf. (ISSCC), San Francisco, CA, USA, 2001, pp. 172-173.
- [16] M. Abu-Rahma, Y. Chen, W. Sy, W. L. Ong, L. Y. Ting, S. S. Yoon, M. Han, and E. Terzioglu, "Characterization of SRAM sense amplifier input offset for yield prediction in 28 nm CMOS," *Proc. IEEE CICC*, Sep. 2011, pp. 1–4.
- [17] D. Laurent, "Sense amplifier signal margins and process sensitivities," *IEEE Trans. Circuits Syst. I, Fundam. Theory Appl.*, vol. 49, no. 3, pp. 269–275, Mar. 2002.
- [18] J. S. Shah, D. Nairn, and M. Sachdev, "An Energy-Efficient Offset-Cancelling Sense Amplifier," *IEEE Trans. Circuits and Systems-II*, 2013.
- [19] A. Bhavnagarwala, S. Kosonocky, C. Radens, K. Stawiasz, R. Mann, Y. Qiuyi, and K. Chen, "Fluctuation limits and scaling opportunities for CMOS SRAM cells," *IEEE IEDM Tech. Dig.*, Dec. 2005, pp. 659–662.
- [20] A. Do, Z. Kong and K. Yeo, "Criterion to Evaluate Input-Offset Voltage of a Latch-Type Sense Amplifier," in *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 57, no. 1, pp. 83-92, Jan. 2010.

- [21] L. Pileggi, G. Keskin, X. Li, Ken Mai and J. Proesel, "Mismatch analysis and statistical design at 65 nm and below," 2008 IEEE CICC, San Jose, CA, 2008, pp. 9-12.
- [22] R. Singh and N. Bhat, "An offset compensation technique for latch type sense amplifiers in high-speed low-power SRAMs," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 12, no. 6, pp. 652–657, Jun. 2004.
- [23] S. H. Woo, H. Kang, K. Park and S. O. Jung, "Offset voltage estimation model for latch-type sense amplifiers," *IET Circuits, Devices & Systems*, vol. 4, no. 6, pp. 503-513, Nov. 2010.
- [24] Giridhar, Bharan, et al. "13.7 A reconfigurable sense amplifier with auto-zero calibration and pre-amplification in 28nm CMOS." 2014 IEEE Intl. Solid-State Circuits Conf. (ISSCC). IEEE, 2014, San Francisco, CA, USA, 2014, pp. 242-243.
- [25] M. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, "Matching properties of MOS transistors," *IEEE J. Solid-State Circuits*, vol. 24, no. 5, pp. 1433–1439, Oct. 1989.
- [26] K. Seno, K. Knorpp, L.-L. Shu, N. Teshima, H. Kihara, H. Sato, F.Miyaji, M. Takeda, M. Sasaki, P. T. Chuang, and K. Kobayashi, "A 9-ns 16-Mb CMOS SRAM with offset-compensated current sense amplifier," *IEEE J. Solid-State Circuits*, vol. 28, no. 11, pp. 1119–1124, Nov. 1993.
- [27] K. Ishibashi, K. Takasugi, K. Komiyaji, H. Toyoshima, T. Yamanaka, A. Fukami, N. Hashimoto, N. Ohki, A. Shimizu, T. Hashimoto, T. Nagano, and T. Nishida, "A 6-ns 4-Mb CMOS SRAM with offset-voltage-insensitive current sense amplifiers," *IEEE J. Solid-State Circuits*, vol. 30, no. 4, pp. 480–486, Apr. 1995.
- [28] Bhatia, Praneet, B. S. Reniwal, and S. K. Vishvakarma. "An offset-tolerant self-correcting sense amplifier for robust high speed SRAM." VLSI Design and Test (VDAT), 2015 19th Intl. Symp. on. IEEE, 2015.
- [29] Khayatzadeh, Mahmood, et al. "A Reconfigurable Sense Amplifier with 3X Offset Reduction in 28nm FDSOI CMOS." 2015 Symp. on VLSI Circuits (VLSI Circuits). IEEE, 2015.
- [30] J. Takahashi, T. Wada, and Y. Nishimura, "A dynamic current-offset calibration (DCC) sense amplifier with fish-bone shaped bit-line (FBB) for high-density SRAMs," *VLSI Symp. Tech. Dig.*, Jun. 1994, pp. 115–116.
- [31] Beshay, Peter, Benton H. Calhoun, and Joseph F. Ryan. "Sub-threshold sense amplifier compensation using auto-zeroing circuitry." *Subthreshold Microelectronics Conf. (SubVT), 2012 IEEE*. IEEE, 2012.
- [32] Xu, Heqing, et al. "A current mode sense amplifier with self-compensation circuit for SRAM application." ASIC (ASICON), 2013 IEEE 10th Intl. Conf. on. IEEE, 2013.
- [33] M. E. Sinangil et al., "A 28 nm 2 Mbit 6 T SRAM With Highly Configurable Low-Voltage Write-Ability Assist Implementation and Capacitor-Based Sense-Amplifier Input Offset Compensation," *IEEE J. Solid–State Circuits*, vol. 51, no. 2, pp. 557-567, 2016.
- [34] M. Bhargava, M. McCartney, A. Hoefler, and K. Mai, "Low-overhead, digital offset compensated, SRAM sense amplifiers," *Proc. IEEE CICC*, Sep. 2009, pp. 705–708.
- [35] Beshay, Peter, et al. "SRAM sense amplifier offset cancellation using BTI stress." 2012 IEEE Subthreshold Microelectronics Conf. (SubVT).
- [36] N. Verma and A. P. Chandrakasan, "A 65nm 8T Sub-Vt SRAM Employing Sense-Amplifier Redundancy," 2007 IEEE Intl. Solid-State Circuits Conf. (ISSCC), San Francisco, CA, USA, 2007, pp. 328-606.
- [37] B. Goll and H. Zimmermann, "A 65nm CMOS comparator with modified latch to achieve 7GHz/1.3mW at 1.2V and 700MHz/47μW at 0.6V," 2009 IEEE Intl. Solid-State Circuits Conf. (ISSCC), San Francisco, CA, USA, 2009, pp. 328-329,329a.
- [38] Y. Watanabe, N. Nakamura, and S. Watanabe, "Offset compensating bitline sensing scheme for high density DRAMs," *IEEE J. Solid-State Circuits*, vol. 29, no. 1, pp. 9–13, Jan. 1994.
- [39] M. Sharifkhani, E. Rahiminejad, S.M. Jahinuzzaman, and M. Sachdev, "A compact hybrid current/voltage sense amplifier with offset cancellation for high-speed SRAMs," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 5, pp. 883–894, May 2011.
- [40] D. Patel and M. Sachdev, "0.23-V Sample-Boost-Latch-Based Offset Tolerant Sense Amplifier," in *IEEE Solid-State Circuits Letters*, vol. 1, no. 1, pp. 6-9, Jan. 2018.
- [41] J. Kong, L. Siek, and C.-L. Kok, "A 9-bit Body-biased Vernier Ring Time-to-Digital Converter in 65 nm CMOS Technology," *IEEE ISCAS*, 2015.
- [42] J. Han and K. Kwon, "A SAW-Less Receiver Front-End Employing Body-Effect Control IIP2 Calibration," *IEEE TCAS I*, Sept. 2014.

- [43] C.-P. Chang, J.-H. Chen, and Y.-H. Wang, "A Fully Integrated 5 GHz Low-Voltage LNA Using Forward Body Bias Technology," *IEEE MWCL*, vol. 19, no. 3, Mar. 2009.
- [44] Masuda, Chotaro, et al. "High current efficiency sense amplifier using body-bias control for ultra-low-voltage SRAM." 2011 IEEE 54th MWSCAS. IEEE, 2011.
- [45] Feki, Anis, et al. "280mV sense amplifier designed in 28nm UTBB FD-SOI technology using back-biasing control." 2013 IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conf. (S3S). IEEE, 2013.
- [46] R. L. Reger, K. Verma, S. K. Jaiswal and D. Jain, "Design and Analysis of High Speed Body Bias Control and CLSA Sense Amplifier," 2013 Int. Conf. on Machine Intelligence and Research Advancement, Katra, 2013, pp. 383-386.
- [47] L. Yang, Y. Cheng, Y. Wang, H. Yu, W. Zhao and A. Todri-Sanial, "A body-biasing of readout circuit for STT-RAM with improved thermal reliability," 2015 IEEE Intl. Symp. on Circuits and Systems (ISCAS), Lisbon, 2015, pp. 1530-1533.
- [48] J. W. Tschanz, S. G. Narendra, Y. Ye, B. A. Bloechel, S. Borkar and V. De, "Dynamic sleep transistor and body bias for active leakage power control of microprocessors," in *IEEE J. Solid–State Circuits*, vol. 38, no. 11, pp. 1838-1845, Nov. 2003.
- [49] J. T. Kao, M. Miyazaki and A. P. Chandrakasan, "A 175-mV multiply-accumulate unit using an adaptive supply voltage and body bias architecture," in *IEEE J. Solid–State Circuits*, vol. 37, no. 11, pp. 1545-1554, Nov. 2002.
- [50] S. Chatterjee, Y. Tsividis and P. Kinget, "0.5-V analog circuit techniques and their application in OTA and filter design," in *IEEE J. Solid–State Circuits*, vol. 40, no. 12, pp. 2373-2387, Dec. 2005.
- [51] J. Kim, P. K. T. Mok and C. Kim, "A 0.15 V Input Energy Harvesting Charge Pump With Dynamic Body Biasing and Adaptive Dead-Time for Efficiency Improvement," in *IEEE J. Solid–State Circuits*, vol. 50, no. 2, pp. 414-425, Feb. 2015.
- [52] Y. Huang, W. Woo, Y. Yoon and C. Lee, "Highly Linear RF CMOS Variable Attenuators With Adaptive Body Biasing," in *IEEE J. Solid–State Circuits*, vol. 46, no. 5, pp. 1023-1033, May 2011.
- [53] J. W. Tschanz et al., "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," in *IEEE J. Solid–State Circuits*, vol. 37, no. 11, pp. 1396-1402, Nov. 2002.
- [54] Y. Sinangil and A. Chandrakasan, "128 kbit SRAM with an Embedded Energy Monitoring Circuit and Sense-Amplifier Offset Compensation Using Body Bias," *IEEE J. Solid–State Circuits*, pp. 2730-2739, November 2014.
- [55] B. Liu, J. Cai, J. Yuan, and Y. Hei, "A Low Voltage SRAM Sense Amplifier with Offset Cancelling Digitized Multiple Body Biasing," *IEEE Trans. Circuits and Systems-II*, 2016.
- [56] Shakir, Tahseen, and Manoj Sachdev. "A body-bias based current sense amplifier for high-speed low-power embedded SRAMs." 2014 27th IEEE Intl. System-on-Chip Conf. (SOCC). IEEE, 2014.
- [57] Rabaey, J., 1996. Digital Integrated Circuits. Upper Saddle River (NJ): Prentice Hall.
- [58] M. E. Sinangil and A. P. Chandrakasan, "An SRAM using output prediction to reduce BL-switching activity and statistically-gated SA for up to 1.9× reduction in energy/access," 2013 IEEE Intl. Solid-State Circuits Conf. (ISSCC), San Francisco, CA, USA, 2013, pp. 318-319.
- [59] P. Chiu, B. Zimmer and B. Nikolić, "A double-tail sense amplifier for low-voltage SRAM in 28nm technology," 2016 IEEE Asian Solid-State Circuits Conf. (A-SSCC), Toyama, 2016, pp. 181-184.
- [60] X. Liang, K. Turgay and D. Brooks, "Architectural power models for SRAM and CAM structures based on hybrid analytical/empirical techniques," 2007 IEEE/ACM Intl. Conf. on Computer-Aided Design, San Jose, CA, 2007, pp. 824-830.
- [61] D. Patel, A. Neale, D. Wright and M. Sachdev, "Hybrid Latch-Type Offset Tolerant Sense Amplifier for Low-Voltage SRAMs," in *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 66, no. 7, pp. 2519-2532, July 2019.
- [62] K. Sarfraz, J. He and M. Chan, "A 140-mV Variation-Tolerant Deep Sub-Threshold SRAM in 65-nm CMOS," in *IEEE J. Solid–State Circuits*, vol. 52, no. 8, pp. 2215-2220, Aug. 2017.







**Dhruv Patel** (M'12) received the BASc. (Co-op, Hons.) and MASc. degrees in Electrical Engineering from the University of Waterloo and the University of Toronto in 2016 and 2020, respectively. He was involved in research on variation-tolerant sub-threshold SRAM circuits during his undergraduate studies. Since 2019, he has been pursuing a PhD. at the University of Toronto in integrated circuits for optical links.

**Adam Neale** (M'08) received the BASc., MASc. and Ph.D. degrees in Electrical and Computer Engineering from the University of Waterloo in 2008, 2010 and 2015, respectively. He is currently working as a Quality and Reliability Engineer with Intel Corporation in Hillsboro, Oregon, where he is responsible for developing radiation-induced upset reliability models. Dr. Neale was the Best Poster Paper Award recipient at the 2014 IEEE Custom Integrated Circuits Conference (CICC) for his work on multiple adjacent-bit upset error correcting codes for SRAMs.

**Derek Wright** (M'09) is a faculty member in the Electrical and Computer Engineering department at the University of Waterloo. His research interests include low-power digital circuit design, multidomain modeling and simulation, biomedical devices, and neuromorphic circuits and systems. He received the BASc. and MASc. degrees in Electrical and Computer Engineering from the University of Waterloo in 2003 and 2005, and the PhD degree in collaborative Electrical and Biomedical Engineering from the University of Toronto in 2010.

**Manoj Sachdev** (M'85-SM'97-F'12) received his B.E. degree (with Honors) in Electronics and Communication Engineering from IIT Roorkee (India), and his Ph.D. from Brunel University (UK). He was with Semiconductor Complex Limited, Chandigarh (India) from 1984 to 1989, where he designed CMOS integrated circuits. From 1989 to 1992, he worked in the ASIC division of SGS-Thomson at Agrate (Italy). In 1992, he joined Philips Research Laboratories, Eindhoven (The Netherlands), where he researched various aspects of VLSI testing and manufacturing.

Dr. Sachdev is a professor in the Electrical and Computer Engineering department at the University of Waterloo, Canada. His research interests include low-power and high-performance digital circuit design, mixed-signal circuit design, and test and manufacturing issues of integrated circuits. He has contributed to five books and two book chapters, and has co-authored over 200 technical articles in conferences and journals. He holds more than 30 granted and several pending US patents on various aspects of VLSI circuit design, reliability, and test. He, his students, and his colleagues have received several international awards. In 1997, he received the best paper award at the IEEE European Design and Test Conference. In 1998, he was a co-recipient of the honorable mention award at the IEEE International Test Conference. He received the best paper award at the IEEE International Symposium on Quality Electronic Design. In 2015, he was a co-recipient of the best poster award at the IEEE Custom Integrated Circuits Conference.

Professor Sachdev is an IEEE Fellow, a Fellow of the Engineering Institute of Canada, and a Fellow of the Canadian Academy of Engineering, and has served on several conference committees. In 1999, he was the Technical Program Co-Chair for the IEEE Mixed-Signal Test Workshop. He also served as the Technical Program Chair for the IEEE IDDQ Test Workshop in 1999 and 2000. In 2007-08, he was an Associate Editor for IEEE Transactions on Vehicular Technology. In 2008 and 2009, he was the Program Chair for the Microsystems and Nanoelectronics Research Conference, Ottawa, Canada. He serves on the editorial board of the Journal of Electronic Testing: Theory and Applications. He was a member of the technical program committee of the IEEE Custom Integrated Circuits Conference from 2008 to 2014, and since 2016 he has been a program committee member of the IEEE Design, Automation and Test in Europe (DATE) conference.