# A Multi-Level Cell for STT-MRAM with Biaxial Magnetic Tunnel Junction

Aynaz Vatankhahghadim, Ali Sheikholeslami Department of Electrical and Computer Engineering University of Toronto, Toronto, Canada Email: {aynaz, ali}@ecc.utoronto.ca

Abstract—A multi-level cell for STT-MRAM is proposed using a biaxial magnetic tunnel junction (MTJ). The proposed cell consists of one transistor and one MTJ (1T1MTJ) with a biaxial magnetic layer. Using the four stable states of the biaxial layer, the cell stores two bits without voltage headroom limitations. Current pulses with different amplitudes are applied during the write operation to switch the magnetization vector to the corresponding region. This avoids the multi-step write operations required by previously proposed multi-level cells using uniaxial MTJs. On average, the simulated write speed of the proposed cell is 33% faster than that of previous work, and the proposed cell consumes 8% less power. Current sensing and voltage sensing are also compared for the biaxial MTJ; current sensing provides a uniform distribution of the sense margin.

#### I. INTRODUCTION

Spin-transfer-torque magnetoresistive random access memory (STT-MRAM) has been studied extensively during the last decade [1]. STT-MRAM is now a prime candidate for a universal memory as it is nonvolatile and can offer high endurance (on the order of $10^{15}$) and a low access time (on the order of nanoseconds) [2]. Much research has been dedicated to improving this memory technology by reducing the switching current, eliminating the read disturbance issue, and addressing other device/cell challenges. In particular, the high ratio between the high and low resistance of the cell has inspired interest in multi-level cell structures for STT-MRAM [4]-[5]. Multi-level structures have the additional benefit of increasing the cell density, allowing higher memory capacity to be achieved. While existing STT-MRAM multi-level cells use uniaxial MTJs, we propose a multi-level cell using biaxial MTJs.

While magnetic tunnel junctions (MTJs) can be stacked in series or used in parallel, the series multi-level cell (MLC) demonstrates much higher read and write reliability [3]. Previous MLC structures of STT-MRAM use two serially stacked MTJs with one access transistor (1T-2MTJ) [4]-[5], as shown in Fig. 1(a). Each MTJ stores one bit and the access transistor is shared, leading to improved cell density. However, this approach cannot be extended to more than two bits per cell because of limited voltage headroom. This design also requires a two-step write operation in half of the write cases, resulting in longer average write times and higher power consumption. We propose an MLC STT-MRAM that avoids the voltage headroom issue by using a biaxial MTJ to store two bits in one magnetic layer. The proposed cell is illustrated in Fig. 1(b). In addition, with a novel write scheme (discussed later in this paper), the proposed cell has a faster write operation with lower power consumption.

Fig. 1. (a) Previous work on multi-level cell STT-MRAM [4]-[5], (b) Proposed multi-level cell STT-MRAM.

MRAM cells with biaxial MTJs are discussed in [10]. However, these MRAM cells require a magnetic field to perform the write operation, so an MLC requires a sequence of magnetic fields for each write. Compared to STT-MRAM, they are also less scalable and require more switching current. Our proposed STT-MRAM with biaxial magnetic layers therefore makes MLC possible while avoiding the problems of field-written MRAM and of previous MLC structures.

This paper is organized as follows. Previous work on STT-MRAM MLCs is discussed in Section II. We propose our cell and its read/write operation in Section III. Simulation results are presented in Section IV, followed by a comparison in Section V and conclusions in Section VI.

#### II. PREVIOUS WORK

A uniaxial MTJ, the main element of the conventional STT-MRAM cells, consists of two ferromagnetic layers with a thin insulating layer in between as shown in Fig. 2 [2]. It includes one pinned magnetized layer and one free layer, whose magnetization can be changed by spin-polarized current in the process of writing to the memory. Depending on the direction of the current, magnetization of the free layer will be aligned either in parallel or anti-parallel to the pinned layer. The read operation involves measuring the effective resistance of the MTJ in two different states representing "0" and "1". Resistance between the pinned layer and the free layer is higher when the layers have their magnetization in anti-parallel state  $(R_{AP})$  than when they are in parallel  $(R_P)$ .

Fig. 2. Conventional STT-MRAM cell and MTJ characteristic.

The authors of [4]-[5] stack two MTJs of different size and switching current (one has twice the area and twice the switching current of the other), as shown in Fig. 3(a). Each MTJ stores one bit and, depending on the direction of the magnetization of its free layer, represents either "0" or "1". The combination of the two MTJs provides the four possible states "00", "01", "10", and "11".

Writing to the cell is a one-step operation for "00" and "11", as the current passes in one direction or the other. It is a two-step operation for "01" and "10", as these states require current in both directions: first, a higher current is applied to switch both MTJs to the same state, and then a lower current in the opposite direction switches the smaller MTJ only. This is illustrated in Fig. 3(b). As a result of this two-step process, longer write times and higher power consumption are expected. Writing "01" or "10" to the cell is expected to take twice as long as writing "00" or "11". Therefore, writing a sequence of random data is expected to result in an average write time that is 50% longer than the basic one-step write.
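The 50% figure follows from averaging the step count over the four two-bit patterns. The short calculation below spells this out, assuming (as the text does) that the four patterns are equally likely; the symbol $\bar{n}_{conv}$ is introduced here only for illustration:

$$\bar{n}_{conv} = \frac{1}{4}\left(1 + 2 + 2 + 1\right) = 1.5 \text{ steps per write}$$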

To read the bits from the cell, three reference levels are required to distinguish between the four resistance levels of the cell:  $R_{P1} + R_{P2}$ ,  $R_{P1} + R_{AP2}$ ,  $R_{AP1} + R_{P2}$ , and  $R_{AP1} + R_{AP2}$ . The read operation can follow parallel sensing, serial sensing, or binary search [6]. In parallel sensing, all the levels are compared at the same time using multiple sense amplifiers [4]. In serial sensing, the levels are compared one at a time using a single sense amplifier. Binary search also performs one comparison at a time, but it follows a dichotomic algorithm: each comparison halves the remaining range, which reduces the number of comparisons to two [5].
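The binary-search read described above can be sketched as follows. This is a minimal illustration rather than the circuit of [5]: the function name `binary_search_read` and the resistance values are hypothetical, chosen only to show that two comparisons resolve four levels.

```python
def binary_search_read(r_cell, refs):
    """Resolve one of four levels with two comparisons.

    refs = (ref_low, ref_mid, ref_high) in ascending resistance order.
    Returns the level index 0..3 (0 = lowest resistance).
    """
    ref_low, ref_mid, ref_high = refs
    if r_cell < ref_mid:                      # first comparison: lower or upper half
        return 0 if r_cell < ref_low else 1   # second comparison
    return 2 if r_cell < ref_high else 3      # second comparison


# Hypothetical resistance levels in kOhm, for illustration only.
levels = [2.0, 3.0, 4.0, 5.0]
# References placed midway between neighbouring levels.
refs = tuple((a + b) / 2 for a, b in zip(levels, levels[1:]))
decoded = [binary_search_read(r, refs) for r in levels]
print(decoded)
```

Serial sensing would instead compare against the three references in turn, needing up to three comparisons with the same single sense amplifier.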

We propose a cell that eliminates the voltage headroom issue (as it does not introduce an additional MTJ resistance to the path). Furthermore, the proposed cell reduces the number of write steps resulting in faster and more power-efficient write operations.

#### III. PROPOSED CELL

The proposed STT-MRAM cell using a biaxial magnetic layer is shown in Fig. 4. A biaxial magnetic layer exhibits an in-plane magnetization that can be stable along two distinct axes, corresponding to four distinct directions in the plane [7]. Assuming four stable states in the y-z plane, one per quadrant, identified as regions 1, 2, 3, and 4 in Fig. 4(b) and in the remainder of this paper, it is possible to store 2 bits per cell.

Fig. 3. (a) Serially stacked multi-level-cell STT-MRAM, (b) Write operation [5].

Fig. 4. (a) Proposed MLC STT-MRAM, (b) Fixed and free layer from top view.

The stable states correspond to the minimum anisotropy energy levels of the magnetic layer. The anisotropy energy of uniaxial and biaxial layers can be calculated [8] from:

$$E_{uniaxial} = K_u \sin^2(\theta) \tag{1}$$

$$E_{biaxial} = K_u \sin^2(\theta) + \frac{1}{4} K_1 \sin^2(2\theta) \tag{2}$$

where  $\theta$  is the angle between the magnetization vector and the z-axis. These two energies are shown in Fig. 5. While the minimum energy points of the uniaxial layer occur at  $\theta = 0$ and  $\theta = \pi$ , those of the biaxial layer occur at two points between  $\theta = 0$  and  $\theta = \pi$  whose positions depend on the values of  $K_u$  and  $K_1$ . Considering the easy plane (the y-z plane), there are two further stable states in the negative-y region, resulting in four possible stable states. This is shown in Fig. 6 by plotting the anisotropy energy of the biaxial layer versus the z-axis. The four minimum energy points lie on the intersection of the energy mesh with the  $y^2 + z^2 = 1$  circle on the y-z plane, as marked.

To identify the  $\theta$  for which  $E_{biaxial}$  is minimum, we equate  $\frac{\partial E}{\partial \theta}$  to zero. This results in

$$\theta = \frac{1}{2} \cos^{-1}\left(\frac{-K_u}{K_1}\right) \tag{3}$$
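For reference, the intermediate step can be written out as follows. It uses only Eq. (2) and the identity $\sin(2\theta) = 2\sin(\theta)\cos(\theta)$, and it identifies stationary points only:

$$\frac{\partial E_{biaxial}}{\partial \theta} = K_u \sin(2\theta) + K_1 \sin(2\theta)\cos(2\theta) = \sin(2\theta)\left(K_u + K_1 \cos(2\theta)\right) = 0$$

The factor $\sin(2\theta) = 0$ gives the stationary points on the axes, while $K_u + K_1 \cos(2\theta) = 0$ gives Eq. (3), which has a real solution only when $|K_u / K_1| \le 1$.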

Fig. 5. Energy of uniaxial (dashed) and biaxial (solid) magnetic layers.

Fig. 6. Energy of biaxial magnetic layer vs. z-axis intersected with  $y^2 + z^2 = 1$  circle on y-z plane

For reasons to be discussed later, we pick  $K_u = 2.75$ ,  $K_1 = 3.75$ , and find the angles of the stable states (minimum energy points) to be  $68.4^{\circ}$  and  $111.6^{\circ}$  from the positive direction of the z-axis; the same angles apply in regions 2 and 3, which have negative y-values. Note that while the magnetization vector has both positive y and z components in region 1, both of these components are negative in region 3. Similarly, the y and z components are negative and positive, respectively, for region 2, and the reverse for region 4. The magnetization vector of the fixed layer makes a  $45^{\circ}$  angle with the z-axis ( $\alpha = 45^{\circ}$ ), whereas in conventional cells it is parallel to the z-axis. The reason for tilting the magnetization vector of the fixed layer will be discussed later in this section.
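As a quick numerical check, Eq. (3) can be evaluated directly for the chosen constants. The sketch below is illustrative only; the helper name `stable_angle_deg` is not from the paper. With $K_u = 2.75$ and $K_1 = 3.75$ it returns an angle within a fraction of a degree of the $68.4^{\circ}$ quoted above.

```python
import math

def stable_angle_deg(k_u, k_1):
    """Evaluate Eq. (3): theta = 0.5 * acos(-K_u / K_1), returned in degrees."""
    ratio = -k_u / k_1
    if abs(ratio) > 1:
        raise ValueError("Eq. (3) has no real solution for |K_u / K_1| > 1")
    return math.degrees(0.5 * math.acos(ratio))

theta = stable_angle_deg(2.75, 3.75)
# Unit-vector components in the y-z plane, with theta measured from the +z axis.
m_y = math.sin(math.radians(theta))
m_z = math.cos(math.radians(theta))
print(f"theta = {theta:.1f} deg, mirrored angle = {180 - theta:.1f} deg")
print(f"m_y = {m_y:.2f}, m_z = {m_z:.2f}")
```

The mirrored angle $180^{\circ} - \theta$ corresponds to the second stable state on the positive-y side.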

The four stable states in four regions are verified with our developed MTJ model [9] (with modified magnetic anisotropy) as shown in Fig. 7. In the absence of any applied current, depending on the region in which the initial angle of the magnetization vector lies, it will settle around one of the stable states of (0, 0.92, 0.36), (0, 0.92, -0.36), (0, -0.92, -0.36), or (0, -0.92, 0.36), where the magnetization vector is assumed to have a unity amplitude. Adjusting the energy constants results in different angles. We now discuss how these angles are chosen.

The conductance of the MTJ is calculated based on,

$$G(\phi) = G_0(1 + p^2 \cos(\phi)) \tag{4}$$

in which  $G_0$  is the average conductance over  $\phi$ , p is the polarization factor, and  $\phi$  is the angle between the magnetization vectors of the two layers [11] (i.e.  $\phi = \theta - \alpha$ ). Conventionally,  $\alpha = 0$ , which means  $\phi = \theta$ . In this case, the two stable states correspond to  $\theta = 0^{\circ}$  and  $\theta = 180^{\circ}$ . In the proposed cell, to distinguish between the resistances of regions 1 and 2 and also of regions 3 and 4, the magnetization vector of the fixed layer is tilted to provide four distinct angles between the fixed layer and the stable states of the free layer.

Fig. 7. Four stable states in the four regions.

Ideally, we should choose a  $\phi$  that results in a uniform distribution of the resistances, i.e. a fixed  $\Delta R$  between adjacent regions. However, as shown in Fig. 8, this is not possible, as no  $\phi$  results in an equal  $\Delta R$ . As the angle between the fixed layer and the free layer increases, the resistances tend to move apart, but while  $R_{region4}$  (plotted in red) moves away from  $R_{region1}$  (shown in blue), it gets closer to  $R_{region2}$ (illustrated in green). We can, however, achieve a nearly constant  $\Delta G$  if we consider conductances rather than resistances. We compare the conductances vs. angle  $(\phi)$  in Fig. 9. According to Fig. 9, a  $\phi$  of around 24° results in similar  $\Delta G$  between the regions, ensuring nearly equal sense margins if we deploy current sensing while applying a constant voltage to the MTJs. This translates to a 69° angle with the z-axis (as  $\theta = \alpha + \phi, 45^{\circ} + 24^{\circ} = 69^{\circ}$ ) and requires the aforementioned energy constants of  $K_u = 2.75, K_1 = 3.75$ .

Based on these angles and MTJ parameters such as p = 0.7, and  $G_0 = 0.67$ , the corresponding resistances/conductances are calculated as

| Region | Resistance | Conductance |
|--------|------------|-------------|
| 1 | $R_{region1} = R_{00} = 1.03\ \text{k}\Omega$ | $G_{00} = 0.97$ mmho |
| 2 | $R_{region2} = R_{10} = 1.82\ \text{k}\Omega$ | $G_{10} = 0.55$ mmho |
| 3 | $R_{region3} = R_{11} = 2.76\ \text{k}\Omega$ | $G_{11} = 0.36$ mmho |
| 4 | $R_{region4} = R_{01} = 1.27\ \text{k}\Omega$ | $G_{01} = 0.79$ mmho |
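The values above can be approximately reproduced from Eq. (4). The sketch below is a rough check, not the authors' model: it assumes $G_0$ is expressed in mmho, takes the free-layer angles as $\pm 69^{\circ}$ and $\pm 111^{\circ}$ from the z-axis (negative meaning the negative-y side), and uses $\phi = |\theta - \alpha|$. Under these assumptions the results land within a few percent of the listed numbers.

```python
import math

P = 0.7      # polarization factor quoted in the text
G0 = 0.67    # average conductance; units of mmho are assumed here

def conductance(phi_deg):
    """Eq. (4): G(phi) = G0 * (1 + p^2 * cos(phi))."""
    return G0 * (1 + P ** 2 * math.cos(math.radians(phi_deg)))

ALPHA = 45.0   # fixed-layer tilt from the z-axis, in degrees
THETA = 69.0   # free-layer stable angle from the z-axis, in degrees
# Assumed signed free-layer angle per region (negative = negative-y side).
region_theta = {1: THETA, 4: 180 - THETA, 2: -THETA, 3: -(180 - THETA)}

g = {region: conductance(abs(t - ALPHA)) for region, t in region_theta.items()}
for region in (1, 4, 2, 3):
    print(f"region {region}: G = {g[region]:.2f} mmho, R = {1 / g[region]:.2f} kohm")
```

The ordering of the computed conductances (region 1 highest, then 4, 2, and 3) matches the ordering of the listed values.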



Fig. 8.  $R_{00}$ ,  $R_{01}$ ,  $R_{11}$ , and  $R_{10}$  vs. the angle between the fixed and free layer.



Fig. 9.  $G_{00}$ ,  $G_{01}$ ,  $G_{11}$ , and  $G_{10}$  vs. the angle between the fixed and free layer.

### A. Read Operation

The read operation compares the resistance or conductance of the MTJ with reference resistances or conductances. Each of the three references is chosen to lie between the values of two neighbouring states (between regions 1 and 4, between regions 4 and 2, and between regions 2 and 3). In voltage sensing, a current is applied to the cell and the resulting voltages are compared, as shown in Fig. 10. In current sensing, the current through the cell is compared with that of a reference cell using current sense amplifiers (CSAs) [12], as illustrated in Fig. 11. In the precharge phase, the bitlines are forced to the same potential and the sense amplifier outputs are equalized. In the sensing phase, the difference between the cell currents pushes the outputs to latch to one side. Because current sensing gives almost equal sense margins for all states of the cell (as discussed above), it is preferred over voltage sensing.

Each reference should use the same kind of device as the biaxial MTJ and have a conductance equal to the average of two neighbouring conductance values. To achieve this, the following current-averaging approach is used [13]: neighbouring conductances are connected in parallel to provide the average of the currents of the neighbouring regions. The conductance pairs  $(G_{00}, G_{01})$ ,  $(G_{01}, G_{10})$ , and  $(G_{10}, G_{11})$  are used for the three reference cells.

To obtain the sense margins and to characterize the current through the reference cells, the voltage  $V_{in}$  is swept across them as well as across the data cell in each state. The corresponding currents are shown in Fig. 12. With current sensing, the sense margins are proportional to 0.18x, 0.24x, and 0.19x, whereas with voltage sensing they are proportional to 0.24x, 0.55x, and 0.94x (where x is related to the amplitude of the applied voltage or applied current, respectively).
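These margin factors are simply the differences between neighbouring conductances (current sensing) or neighbouring resistances (voltage sensing) listed in Section III. The short sketch below recomputes them from the tabulated values; the max/min spread printed at the end is an added illustration of how uneven the voltage-sensing margins are, not a figure from the paper.

```python
# Values as listed in the text, ordered from highest to lowest conductance.
states = ["00", "01", "10", "11"]
G = {"00": 0.97, "01": 0.79, "10": 0.55, "11": 0.36}   # mmho
R = {"00": 1.03, "01": 1.27, "10": 1.82, "11": 2.76}   # kohm

pairs = list(zip(states, states[1:]))                  # neighbouring states
current_margins = [round(G[a] - G[b], 2) for a, b in pairs]
voltage_margins = [round(R[b] - R[a], 2) for a, b in pairs]

print("current sensing margins:", current_margins)
print("voltage sensing margins:", voltage_margins)
print("max/min spread:",
      round(max(current_margins) / min(current_margins), 2),
      round(max(voltage_margins) / min(voltage_margins), 2))
```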



Fig. 10. Parallel read operation, voltage sensing [4].

Parallel search is used for read operation as in [4] since performing parallel read operation saves time with not much area overhead (about 14% of the memory macro-area). Three sense amplifiers are used to compare the cell conductance against three reference conductances. Their outputs will be encoded to give the two stored bits.
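A behavioural sketch of this parallel read is given below. It models the three sense amplifiers as three simultaneous comparisons against references placed midway between neighbouring conductances, and it encodes the resulting thermometer code into two bits. The encoder mapping and the function name `parallel_read` are assumptions made for illustration; the text does not specify the encoder.

```python
G = {"00": 0.97, "01": 0.79, "10": 0.55, "11": 0.36}   # mmho, from the text

# Reference conductances taken midway between neighbouring states.
order = ["00", "01", "10", "11"]
refs = [(G[a] + G[b]) / 2 for a, b in zip(order, order[1:])]

def parallel_read(g_cell):
    """Model three simultaneous comparisons and encode them into two bits."""
    thermometer = [g_cell > ref for ref in refs]   # one flag per sense amplifier
    level = 3 - sum(thermometer)                   # 0 -> "00", ..., 3 -> "11"
    return format(level, "02b")

decoded = {state: parallel_read(g_value) for state, g_value in G.items()}
print(decoded)
```

Because the states happen to be ordered "00", "01", "10", "11" from highest to lowest conductance, the count of references exceeded maps directly onto the binary value of the stored bits.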



Fig. 11. Current sensing [12].



Fig. 12. Current of the cell in different states as well as the reference currents.

### B. Write Operation

The write operation of the proposed cell consists of a read operation to determine the currently stored bits, followed by a current pulse through the MTJ whose amplitude depends on both the stored bits and the bits to be written. In other words, once the currently stored state is determined, a current pulse with the corresponding amplitude is applied to switch the magnetization vector accordingly. Switching from regions 1 and 4 to other regions requires a positive current (from the fixed layer to the free layer), whereas switching from regions 2 and 3 occurs when a negative current (from the free layer to the fixed layer) is applied. Reading the stored bits as part of the write operation was introduced and discussed in [14]-[15]. This eliminates unnecessary transitions and unwanted write operations.

#### IV. WRITE OPERATION SIMULATION RESULTS

The magnetization trajectories as we force the magnetization vector to move from region 1 to regions 2, 3, and 4 are shown in Figs. 13, 14, and 15, respectively. Corresponding to these three trajectories, we plot the components of the magnetization vector,  $m_z$ ,  $m_y$ , and  $m_x$ , as well as the voltage across the MTJ as a function of time, in Figs. 16, 17, and 18, respectively. The voltage across the MTJ changes as it switches to different regions, confirming the change in the resistance of the MTJ. With a pulse width of 20 ns, the current amplitudes for switching between regions are reported in Table I.

While  $m_y$  switches from 0.92 to -0.92 in switching from region 1 to 2,  $m_z$  and  $m_y$  switch from 0.36 to -0.36 and 0.92 to -0.92 respectively in switching from region 1 to 3. In switching from region 1 to 4,  $m_x$  and  $m_y$  remain constant while  $m_z$  reveals an oscillatory behaviour between -0.36 and 0.36 and will settle to either 0.36 or -0.36 depending on the exact timing at which the applied current is reset to zero.



Fig. 13. Switching from region 1 to 2.



Fig. 14. Switching from region 1 to 3.



Fig. 15. Switching from region 1 to 4.

A similar situation exists when switching from region 4 to 1 and when switching between regions 2 and 3. To avoid this problem, we propose a two-step write operation for these particular write cases only, and a one-step operation for all other transitions. A two-step operation in moving from region 1 to 4, for example, consists of moving from region 1 to region 2 (or 3) as an intermediate region, followed by moving from region 2 (or 3) to region 4. Out of a total of 16 write cases, only 4 require a two-step write operation, and another 4 require no write at all because the data to be written is the same as the data already stored in the cell. As a result, the average write time is equivalent to a one-step write time. This should be compared against an average of 1.5 steps per case in the conventional MLC, resulting in a 33% faster write operation. Note that even when read-first is applied to the conventional case, it still has 1.25x the average switching time of the proposed cell.
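The step count can be checked by enumerating the 16 write cases. The sketch below assumes equiprobable transitions, as the text does, and compares against the conventional MLC without read-first; the function and variable names are illustrative.

```python
from itertools import product

REGIONS = (1, 2, 3, 4)
# Transitions that need an intermediate region in the proposed scheme.
TWO_STEP = {(1, 4), (4, 1), (2, 3), (3, 2)}

def steps_proposed(src, dst):
    """Number of write pulses for one transition of the proposed cell."""
    if src == dst:
        return 0                      # read-before-write: nothing to do
    return 2 if (src, dst) in TWO_STEP else 1

cases = list(product(REGIONS, repeat=2))           # 16 equiprobable write cases
avg_proposed = sum(steps_proposed(s, d) for s, d in cases) / len(cases)
avg_conventional = (1 + 2 + 2 + 1) / 4             # "00", "01", "10", "11"

print(avg_proposed, avg_conventional)
print(f"write-time reduction: {1 - avg_proposed / avg_conventional:.0%}")
```

Eight one-step cases, four two-step cases, and four no-write cases average to exactly one step, which is where the 33% figure comes from.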



Fig. 16. Components of magnetization vector and the voltage across the MTJ for switching from region 1 to 2.

 
TABLE I. CURRENT AMPLITUDE TO SWITCH FROM ONE REGION TO ANOTHER

| From \ To | Region 1 | Region 2 | Region 3 | Region 4 |
|-----------|----------|----------|----------|----------|
| Region 1  | -        | 100 µA   | 70 µA    | 30 µA    |
| Region 2  | -150 µA  | -        | -130 µA  | -160 µA  |
| Region 3  | -75 µA   | -30 µA   | -        | -95 µA   |
| Region 4  | 35 µA    | 120 µA   | 80 µA    | -        |

#### V. AREA AND POWER COMPARISON

In this section we compare our proposed cell against the previous work in terms of area and power.

In terms of area, if a minimum-size transistor is used as the access transistor in both cases, the difference would be in the MTJs. In the conventional MLC, two MTJs are stacked on top of the transistor, while in our case there is one biaxial MTJ to be integrated with the transistor. Since the MTJs sit on top of the CMOS, the area for both cases would be the same.

Fig. 17. Components of magnetization vector and the voltage across the MTJ for switching from region 1 to 3.

Fig. 18. Components of magnetization vector and the voltage across the MTJ for switching from region 1 to 4.

For write power consumption, the conventional MLC requires  $60\mu$ A and  $120\mu$ A for anti-parallel switching of MTJ1 and MTJ2 (with parameters similar to those of the proposed cell), and  $25\mu$ A and  $50\mu$ A for parallel switching, respectively. This translates to the following switching currents for the P-P, AP-P, P-AP, and AP-AP cases:  $50\mu$ A,  $60\mu$ A+ $50\mu$ A,  $25\mu$ A+ $120\mu$ A, and  $120\mu$ A. Assuming random and equiprobable transitions, this results in an average switching current of  $106.25\mu$ A. With the proposed cell and the currents given in Table I, and taking the two-step operations into account, an average current of  $97.8\mu$ A is achieved. The proposed cell therefore achieves an 8% power saving in addition to the 33% faster write operation compared to the conventional MLC design.
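Both averages can be recomputed from the numbers given in the text. The sketch below makes one assumption that the text does not state: for a two-step write, the intermediate region is the one with the lower summed current amplitude. Under that assumption, and counting the four no-write cases as zero current, the Table I magnitudes give the quoted average of about 97.8 µA.

```python
# Table I, magnitudes in microamps: CURRENT[src][dst].
CURRENT = {
    1: {2: 100, 3: 70, 4: 30},
    2: {1: 150, 3: 130, 4: 160},
    3: {1: 75, 2: 30, 4: 95},
    4: {1: 35, 2: 120, 3: 80},
}
TWO_STEP = {(1, 4), (4, 1), (2, 3), (3, 2)}

def write_current(src, dst):
    """Summed pulse amplitude (uA) for one write case of the proposed cell."""
    if src == dst:
        return 0                                   # read-first: no pulse needed
    if (src, dst) in TWO_STEP:
        # Assumption: route through whichever intermediate region costs least.
        mids = [m for m in CURRENT if m not in (src, dst)]
        return min(CURRENT[src][m] + CURRENT[m][dst] for m in mids)
    return CURRENT[src][dst]

cases = [(s, d) for s in CURRENT for d in CURRENT]      # 16 equiprobable cases
avg_proposed = sum(write_current(s, d) for s, d in cases) / len(cases)

# Conventional MLC: P-P, AP-P, P-AP, AP-AP currents from the text (uA).
avg_conventional = (50 + (60 + 50) + (25 + 120) + 120) / 4

saving = 1 - avg_proposed / avg_conventional
print(avg_proposed, avg_conventional, f"{saving:.1%}")
```

The comparison follows the text in using average current as the proxy for write power.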

#### VI. CONCLUSION

We propose an STT-MRAM cell that uses a biaxial MTJ to store two bits per cell. We also propose a new write operation based on reading the stored bits first and then applying the corresponding current amplitude. This enables new STT-MRAM MLCs that are not constrained by limited voltage headroom and that, on average, operate faster and more power-efficiently. By using current sensing rather than voltage sensing, a nearly equal sense margin is achieved for all states of the cell.

#### ACKNOWLEDGMENT

This work is partially supported by NSERC of Canada. Authors would like to thank CMC for providing the CAD tools, and Safeen Huda and Joshua Liang for providing insightful feedback.

#### REFERENCES

- [1] M. Hosomi, H. Yamagishi, T. Yamamoto, K. Bessho, Y. Higo, K. Yamane, H. Yamada, M. Shoji, H. Hachino, C. Fukumoto, H. Nagao, H. Kano, "A novel nonvolatile memory with spin torque transfer magnetization switching: spin-ram," *IEEE International Electron Devices Meeting*, 2005. *IEDM Technical Digest*. pp. 459-462, Dec. 2005.
- [2] D. D. Tang, Y. Lee, Magnetic memory: Fundamentals and Technology, Cambridge University Press, 2010.
- [3] Yaojun Zhang, Lu Zhang, Wujie Wen, Guangyu Sun, Yiran Chen, "Multilevel cell STT-RAM: Is it realistic or just a dream?," 2012 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 526-532, 5-8 Nov. 2012.
- [4] M. Aoki, H. Noshiro, K. Tsunoda, Y. Iba, A. Hatada, M. Nakabayashi, A. Takahashi, C. Yoshida, Y. Yamazaki, T. Takenaga, T. Sugii, "Novel highly scalable multi-level cell for STT-MRAM with stacked perpendicular MTJs," 2013 Symposium on VLSI Technology (VLSIT), pp. T134,T135, 11-13 June 2013.
- [5] T. Ishigaki, T. Kawahara, R. Takemura, K. Ono, K. Ito, H. Matsuoka, H. Ohno, "A multi-level-cell spin-transfer torque memory with series-stacked magnetotunnel junctions," 2010 Symposium on VLSI Technology (VLSIT), pp. 47-48, 15-17 June 2010.
- [6] C. Calligaro, V. Daniele, R. Gastaldi, A. Manstretta, G. Torelli, "A new serial sensing approach for multistorage non-volatile memories," *Records* of the 1995 IEEE International Workshop on Memory Technology, Design and Testing, pp. 21-26, 7-8 Aug 1995.
- [7] Taehee Yoo, S. Khym, Sun-young Yea, Sunjae Chung, Sanghoon Lee, X. Liu, J.K. Furdyna, "Four discrete Hall resistance states in single-layer Fe film for quaternary memory devices," *Applied Physics Letters*, vol. 95, no. 20, pp. 202505,202505-3, Nov 2009.
- [8] M.V. Pitke, "Biaxial anisotropy for memory applications," *Czechoslovak Journal of Physics B*, vol. 21, no. 4-5, pp. 467-469, 1971.
- [9] A. Vatankhahghadim, S. Huda, A. Sheikholeslami, "A Survey on Circuit Modeling of Spin-Transfer-Torque Magnetic Tunnel Junctions," *IEEE Trans. on Circuits and Systems I (TCAS-I)*, vol. 61, no. 9, pp. 2634-2643, 2014.
- [10] T. Uemura, T. Marukame, K.-I Matsuda, M. Yamamoto, "Four-State Magnetic Random Access Memory and Ternary Content Addressable Memory Using CoFe-Based Magnetic Tunnel Junctions," *37th International Symposium on Multiple-Valued Logic, 2007. ISMVL 2007*, pp.49,49, 13-16 May 2007.
- [11] M. Julliere, "Tunneling between ferromagnetic films," *Phys. Lett.*, vol. 54A, pp. 225-226, 1975.
- [12] T.N. Blalock, R.C. Jaeger, "A high-speed clamped bit-line current-mode sense amplifier," *IEEE Journal of Solid-State Circuits*, vol. 26, no. 4, pp. 542,548, Apr 1991.
- [13] Taehui Na, Jisu Kim, Jung Pill Kim, S.H. Kang, Seong-Ook Jung, "Reference-Scheme Study and Novel Reference Scheme for Deep Submicrometer STT-RAM," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 61, no. 12, pp. 3376-3385, Dec. 2014.
- [14] Yiran Chen, Xiaobin Wang, Wenzhong Zhu, Hai Li, Zhenyu Sun, Guangyu Sun, Yuan Xie, "Access scheme of Multi-Level Cell Spin-Transfer Torque Random Access Memory and its optimization," 2010 53rd IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 1109-1112, 1-4 Aug. 2010.
- [15] R. Bishnoi, F. Oboril, M. Ebrahimi, M.B. Tahoori, "Avoiding unnecessary write operations in STT-MRAM for low power implementation," 2014 15th International Symposium on Quality Electronic Design (ISQED), pp.548-553, 3-5 March 2014.