Course: ECE658 VLSI DESIGN PRINCIPLES
Instructor: Professor Wayne Burleson
Semester/year: Fall/2000
Ramshankar Ramanarayanan
This project studies a low-voltage sense amplifier technique advocated for high-performance sub-90nm static CMOS RAMs. The performance of a recently proposed sense amplifier, the Charge Transfer Sense Amplifier (CTSA) [1], is presented together with the bit-line interconnect of a 128-bit SRAM. The project involves custom layout of a parameterized single-bit SRAM cell using Cadence [2] pcell technology, as well as custom layout of the CTSA. The available 0.24 um technology is used for the layouts, with Vdd set to 2.5 V. Transistors are sized to save area while ensuring good performance.
The architecture used for this project is shown in Figure 1.
The model contains 128 SRAM cells in a column and the bit-line interconnect with its RC parasitics. Also shown is the CTSA circuit, which functions by transferring charge from the high capacitance of the interconnect to the low output capacitance of the sense amplifier. The CTSA is controlled by two input signals, SAen and Precharge (Pch), which govern the charge transfer; its outputs are SAout and SAout'. The following sections detail the schematic and simulation of a 1-bit SRAM with the CTSA circuit, the pcell implementation of the 1-bit SRAM cell, and the CTSA layout. HSpice plots are presented toward the end of this document.
Figure 1: 128-SRAM bit cell architecture
The schematic is shown in Figure 2. It features a 1-bit SRAM cell using minimum-width PMOS pull-up transistors and NMOS pass transistors. The NMOS transistors in the cross-coupled inverter pair are sized with a large W/L, around 5 times minimum size, to allow fast discharge of the corresponding bit-line during the read operation. The operation of the CTSA is based on charge redistribution from the high bit-line capacitance to the low capacitance of the nodes SA and SA' shown in Figure 2. The CTSA consists of two parts. The first part is a common-gate cascode formed by M1, M3 and M5 for bit-line BL and by M2, M4 and M6 for bit-line BL'. The second part is the cross-coupled inverter latch that latches the outputs of the common-gate amplifiers, SA and SA'.
There are two phases of operation for the CTSA circuit [1]. In the precharge phase, the bit-lines and all the intermediate nodes (A, B, C and D) are precharged high, and the nodes SA and SA' are pre-discharged by keeping SAen high. In the evaluation phase, Pch is pulled high and the CTSA is enabled by pulling SAen low. When the bit-line that is going low approaches Vb + |Vtp|, the corresponding transistor, M1 or M2, enters the sub-threshold region of operation and prevents its node, SA or SA', from charging. The node corresponding to the other bit-line charges high. The SA and SA' node voltages are latched by the cross-coupled inverters, and the latch outputs drive inverters to generate a rail-to-rail output swing.
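The charge redistribution idea can be illustrated with a first-order estimate. The sketch below assumes ideal, lossless charge transfer, so that all charge leaving the bit-line appears on the sense node; it ignores device non-idealities and the sub-threshold cutoff described above. The function name and the capacitance values are hypothetical and are not taken from the project.

```python
def sense_node_swing(c_bitline_f, c_sense_f, dv_bitline_v, v_max=None):
    """Idealized sense-node swing from lossless charge transfer.

    Charge conservation: C_BL * dV_BL = C_SA * dV_SA, so
    dV_SA = (C_BL / C_SA) * dV_BL. Optionally clamp to a supply limit.
    """
    dv_sense = (c_bitline_f / c_sense_f) * dv_bitline_v
    if v_max is not None:
        dv_sense = min(dv_sense, v_max)
    return dv_sense


# Hypothetical capacitances chosen only for illustration.
C_BL = 500e-15   # assumed bit-line capacitance, farads
C_SA = 10e-15    # assumed sense-node capacitance, farads

# A 20 mV change on the bit-line, clamped to the 2.5 V supply from the text.
print(sense_node_swing(C_BL, C_SA, 0.02, v_max=2.5))
```

Under these assumptions the voltage gain is simply the capacitance ratio, which is why a small bit-line change can produce a large swing at SA or SA'.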

Figure 2: Schematic of a SRAM cell with the CTSA circuit
The layout of a single SRAM cell is shown in Figure 3. Minimum-sized transistors are used for the PMOS pull-up transistors and for the NMOS pass transistors. Since the high-capacitance bit-line is pulled low by one of the NMOS pull-down transistors during a memory read, these transistors are sized liberally, at approximately 5 times the minimum width. This translates to a width of 360 nm for the minimum-sized transistors and a width of 1800 nm for the large NMOS transistors. The length, which is technology dependent, is 240 nm. The layout in Figure 3 is parameterized using Cadence pcell technology. The input and output lines, WL, BL and BL', are laid out in the Metal 2 layer, and the supply lines, Vdd and Gnd, are laid out in the Metal 1 layer. A section of the layout of the 128-cell SRAM column, built from pcell SRAM instances, is shown in Figure 4.
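The sizing figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch using only the widths and length stated in the text; the variable names and the `aspect_ratio` helper are illustrative, not part of the project.

```python
# Dimensions quoted in the text, in nanometres.
L_NM = 240            # channel length for the 0.24 um technology
W_MIN_NM = 360        # minimum-sized PMOS pull-up and NMOS pass transistors
W_PULLDOWN_NM = 1800  # large NMOS pull-down transistors


def aspect_ratio(width_nm, length_nm):
    """Return the W/L aspect ratio of a transistor."""
    return width_nm / length_nm


width_scale = W_PULLDOWN_NM / W_MIN_NM        # pull-down width relative to minimum
wl_min = aspect_ratio(W_MIN_NM, L_NM)         # W/L of the minimum-sized devices
wl_pulldown = aspect_ratio(W_PULLDOWN_NM, L_NM)  # W/L of the pull-down devices

print(f"width scale: {width_scale}")
print(f"W/L (minimum): {wl_min}")
print(f"W/L (pull-down): {wl_pulldown}")
```

The factor of 5 is the ratio of the two widths; because both device types share the same length, it is also the ratio of their W/L values.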


The layout of the CTSA circuit is shown in Figure 5. Poly has been replaced with metal wherever possible. Since the bit-lines are long, the precharge PMOS transistors are sized with a W/L of 10. The inverters are sized as usual with a W/L ratio of 3:1. The input/output lines, Vb, Pch, SAen, SAout and SAout', are laid out in the Metal 2 layer, while the supply is laid out in the Metal 1 layer. The width of this layout is exactly the same as that of one bit column. Typically each sense amplifier is shared among 4 bit-lines through column multiplexing, so there is potential for reducing the height and increasing the width of the CTSA layout.

Figure 5: Layout of CTSA
Figure 6 shows the simulation for the layout. Initially, Q and Qbar of the topmost SRAM cell in the column are set to logic high and logic low respectively. In the layout, the word lines of the bottom 127 cells are connected to Gnd to disable them, so that reads are simulated on the topmost cell only. When Pch and the WL of the topmost cell go high, the circuit evaluates the voltage on the bit-lines. A small latching delay is introduced by pulling SAen low some time after Pch goes high. As a result, the latch acts only once a significant, still-increasing differential voltage has developed, which reduces the effect of common-mode noise. After SAen goes low, at around 0.1 ns, the output SAout is pulled low and SAout' is pulled high. A rail-to-rail swing is obtained by the use of inverters before the output, as shown in Figure 2. The circuit shows stable operation for a read clock of 333 MHz, as indicated by the 3 ns period of the Pch/WL waveform. The common-gate cascode amplifier pulls down the bit-line BL' that is swinging low. The increasing differential between BL and BL' during the read operation leads to a maximum bit-line swing of 0.82 V for BL' and 0.33 V for BL around the end of the read operation. A small static voltage (100 mV) remains on the 0-bit line. Figure 7 shows the simulation for the schematic in Figure 2.
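The timing and swing numbers above follow from simple arithmetic, sketched below. The helper name is hypothetical, and the differential is computed on the assumption that both swings are measured from the same precharge level, which the text does not state explicitly.

```python
def frequency_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1e3 / period_ns


PERIOD_NS = 3.0        # period of the Pch/WL waveform quoted in the text
SWING_BL_BAR_V = 0.82  # maximum swing on BL' quoted in the text
SWING_BL_V = 0.33      # maximum swing on BL quoted in the text

read_clock_mhz = frequency_mhz(PERIOD_NS)

# Assumption: both swings are referenced to the same precharge level,
# so the bit-line differential is the difference of the two swings.
differential_v = SWING_BL_BAR_V - SWING_BL_V

print(f"read clock: {read_clock_mhz:.0f} MHz")
print(f"bit-line differential: {differential_v:.2f} V")
```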

Figure 6: HSpice simulations for layout of 128-bit SRAM and CTSA

Figure 7: HSpice simulation of the 1-bit SRAM and CTSA schematic