Memory with Single-Ended Sensing Using Reset-Set Latch
Various implementations provide systems and methods for reading data from memory bit cells. An example implementation includes a read circuit that provides a single-ended output from a sensing stage. The single-ended output is received by a reset-set (RS) latch, which also receives a virtual bit line signal. The single-ended output and the virtual bit line signal provide complementary inputs to the RS latch, which stores a sensed bit that may be driven onto a data bus.
The present application relates, generally, to memory read circuits and, more specifically, to single-ended memory read circuits employing reset-set latches.
BACKGROUND
An example memory may include a multitude of memory bit cells (also called memory cells) arranged in columns, with the bit cells in a given column sharing bitlines. The bitlines may be driven by memory write circuits and read by a sense amplifier coupled to a latch. For instance, some differential read circuits may use a reset-set (RS) latch that receives as inputs a value derived from a bitline and a value derived from a complementary bitline. By contrast, some single-ended read circuits may use a clocked latch. The example single-ended read circuits do not include complementary values, such as may be derived from a bitline and a complementary bitline. Single-ended read circuits that use clocked latches may suffer from excessive dynamic power.
Accordingly, there is a need in the art for techniques for reading data from bit cells that are both compatible with a single-ended scheme and lower in power consumption.
SUMMARY
Various implementations provide systems and methods for reading data from memory bit cells. An example implementation includes a single-ended sensing scheme that includes a virtual complementary bitline signal. A value derived from a bitline and a value derived from the virtual complementary bitline signal may be input to a reset-set (RS) latch, where the value may be stored.
According to one implementation, a memory includes: a first bitline coupled to a bit cell; a second bitline coupled to the first bitline; a first logic gate coupled to the second bitline; and a reset-set (RS) latch having a first input and a second input, wherein the first input is coupled to the second bitline and the second input is coupled to the second bitline via the first logic gate.
According to one implementation, a method of reading data from a bit cell includes: causing a first bit line to assume a state in accordance with a digital bit that is stored in the bit cell; sensing the state, including causing a second bit line, which is coupled to the first bit line, to assume a single-ended digital value based at least in part on the digital bit; generating a virtual bit line value; receiving the single-ended digital value and the virtual bit line value at a reset-set (RS) latch; and storing the digital bit in the RS latch.
According to one implementation, a memory device includes: means for storing a bit of data; means for sensing a state of a bit line that is coupled to the means for storing the bit of data; means for generating a virtual bit line value from an output of the means for sensing; and a reset-set (RS) latch having a first input coupled to the means for sensing and a second input coupled to the means for generating the virtual bit line value.
According to one implementation, a system includes: a memory device coupled to a processor and configured to perform read operations and write operations in response to the processor; a plurality of bit cells arranged in rows and columns within the memory device; and read circuitry coupled to a first one of the bit cells, the read circuitry including: a bit line pair coupled to a first logic gate and to the first one of the bit cells; a reset-set (RS) latch having a first input coupled to an output of the first logic gate; and a second logic gate coupled to a second input of the RS latch, the second logic gate further coupled to the output of the first logic gate and to a control signal.
An example implementation includes memory having a single-ended sensing architecture that employs a reset-set (RS) latch to receive and store a sensed bit of data. A given bit cell that is being read includes a pair of bitlines: a bitline and a complementary bitline. One of the bitlines may be coupled to a local bitline that feeds into a global bitline at a sensing stage of the memory system.
The sensing stage may include an upper bitline and lower bitline pair that are coupled to the local bitline through, e.g., the upper bitline. A logic gate, such as a NAND gate, may receive the lower bitline and the upper bitline pair as inputs and provide output onto the global bitline. The global bitline may include one or more inverters in line with an output of the logic gate. The global bitline may take on a value that is derived from one of the bitlines of the bitline pair of the bit cell. For instance, if the local bitline is coupled to a complementary bitline of the bit cell, then the value of the global bitline may be dependent, at least in part, upon the value of the complementary bitline.
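The sensing stage described above can be sketched at the logic level as follows. This is a minimal illustration rather than the circuit itself: the function names are hypothetical, and the default of two in-line inverters is an assumption, since the text only says "one or more".

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 logic values."""
    return int(not (a and b))


def sense(upper_bitline: int, lower_bitline: int, num_inverters: int = 2) -> int:
    """Return the global bitline value for the given upper/lower bitline states.

    num_inverters is an assumed parameter; the description does not fix a count.
    """
    value = nand(upper_bitline, lower_bitline)
    for _ in range(num_inverters):
        value = 1 - value  # each in-line inverter flips the value
    return value


# Both bitlines pre-charged, then one of them discharged by the bit cell.
for ubl, lbl in [(1, 1), (0, 1), (1, 0)]:
    print(f"upper={ubl} lower={lbl} -> global bitline={sense(ubl, lbl)}")
```

With an even number of inverters, the global bitline follows the NAND output: it stays at zero while both bitlines remain pre-charged and rises when either one discharges.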
Continuing with the example, the memory system may further include a logic gate that is coupled to the second bitline, e.g., the global bitline. In one implementation, the logic gate may include a NOR gate that receives as inputs the global bitline and a control signal. The control signal may be any appropriate signal, but in some implementations may be a periodic signal that has a desirable period and duty cycle. The output of the logic gate may then be used as a virtual bitline (or bitline bar) signal that is input to the RS latch. The RS latch may include a pair of cross-coupled NOR gates. A first one of the cross-coupled NOR gates may receive as inputs the output of the logic gate (the virtual bitline or bitline bar signal) and an output from the second one of the cross-coupled NOR gates. The second cross-coupled NOR gate may receive as inputs the output of the first cross-coupled NOR gate as well as the global bitline. The output of the system may be taken from the output of the second cross-coupled NOR gate.
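The latch arrangement just described can be illustrated with a small logic-level model. It is a sketch under stated assumptions: the function names are hypothetical, signals are idealized 0/1 values, and the cross-coupled gates are settled by simple iteration rather than by modeling real gate delays.

```python
def nor(a: int, b: int) -> int:
    """Two-input NOR on 0/1 logic values."""
    return int(not (a or b))


def virtual_bitline(global_bitline: int, control: int) -> int:
    """Virtual bitline produced by the NOR gate fed by the global bitline and the control signal."""
    return nor(global_bitline, control)


def rs_latch(global_bitline: int, virtual_bl: int, out: int, out_n: int):
    """Settle the two cross-coupled NOR gates and return (out, out_n).

    out is the output of the second NOR gate (the system output);
    out_n is the output of the first NOR gate.
    """
    for _ in range(4):  # a few passes are enough for this two-gate loop
        new_out_n = nor(virtual_bl, out)
        new_out = nor(global_bitline, new_out_n)
        if (new_out, new_out_n) == (out, out_n):
            break
        out, out_n = new_out, new_out_n
    return out, out_n


# Global bitline high: the output is driven to 0.
state = rs_latch(1, virtual_bitline(1, 1), out=1, out_n=0)
# Global bitline low and control low: the virtual bitline goes high, output is driven to 1.
state = rs_latch(0, virtual_bitline(0, 0), *state)
# Both latch inputs low: the stored value is held.
state = rs_latch(0, virtual_bitline(0, 1), *state)
print(state)
```

Because the virtual bitline is the NOR of the global bitline and the control signal, it is zero whenever the global bitline is one, so in this model the two latch inputs are never high at the same time.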
Various implementations may include methods as well. An example method includes sensing a value from a bit cell, where the bit cell stores a data value. The example method may include outputting from the sensing stage a single-ended value from a logic gate, such as a NAND gate. The output from the sensing stage may correspond to a global bitline that provides a value to be latched. The method may further include latching the value from the sensing stage by employing an RS latch. One example action of the method may include generating a virtual bitline (or bitline bar) signal to use as an input to the RS latch, where another input may include the value on the global bitline. The RS latch then stores the value that was sensed in the sensing stage.
High power consumption is a concern for systems on chip (SoCs), central processing units (CPUs), graphics processing units (GPUs), and the like. The circuits and methods discussed herein may be implemented in SoCs, CPUs, GPUs, and other circuits that include memory systems. For example, some implementations may include a multi-port memory providing two reads and one write per clock cycle (2R+1W), which may occupy a large portion of a GPU (768 instances). In some traditional GPUs, the dynamic power (e.g., read and write operations) coming from the memories may be as high as 15% of total GPU power. Traditionally, multi-port memories may have single-ended sensing followed by a latch controlled by local clock signals. But the single-ended sensing scheme with the clocked latch may contribute to excessive input-output (IO) read dynamic power from the local clock and control signals. By contrast, various implementations described herein may instead use single-ended sensing with an RS latch, which may be a lower-power solution than the clock-controlled latch of other systems.
Furthermore, another advantage of various implementations is that the RS latches may be used for voltage level shifting. For instance, some implementations may include different power domains at the sensing stage versus at the latch stage. As described in further detail below, the latch stage may provide the level shifting to interface between the two power domains. Moreover, the various implementations described herein are counterintuitive because they use an RS latch with two inputs to interface with a global bitline having a single-ended value. The single-ended value is accommodated by use of the virtual bitline or bitline bar signal so that the RS latch receives two inputs.
Various aspects of a memory will now be presented in the context of a static random access memory (SRAM). SRAM is volatile memory that requires power to retain data. However, as those skilled in the art will readily appreciate, such aspects may be extended to other memories and/or circuit configurations. Examples of other memories include random access memory (RAM), dynamic RAM (DRAM), synchronous dynamic RAM (SDRAM), double data rate RAM (DDRAM), read only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), a general register on a processor, processor cache, flash memory, or any other suitable memory. Accordingly, all references to an SRAM are intended only to illustrate exemplary aspects of memory with the understanding that such aspects may be extended to a wide range of applications.
The processor 102 illustrated in
The control bus 112 may include a read signal and a write signal. The read signal and the write signal may be used to generate a read enable and a write enable, respectively, within the memory 104. The address bus 106 may be used to indicate which location within the memory 104 the processor is reading or writing. For example, if the processor 102 wishes to read a memory location in the memory 104 the processor 102 may output the address of the memory location on the address bus 106. Additionally, the processor 102 may drive the read signal, which may be part of the control bus 112, active. The memory 104 may then output the data in the memory location indicated by the address bus 106 on the read data bus 110. Similarly, if the processor 102 is writing a memory location in the memory 104, the processor 102 may output the address of the memory location to be written on the address bus 106. Additionally, the processor 102 may drive the write signal, which may be part of the control bus 112, active. The processor 102 may drive the write data bus 108 with the data that is to be written to the memory 104.
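The read and write transactions described above can be sketched with a small behavioral model. The class and parameter names below are hypothetical, the model handles one operation per transaction as a simplification, and returning None for a location that was never written is an assumption of this sketch rather than a behavior stated in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MemoryModel:
    """Behavioral stand-in for the memory: address -> stored data."""
    cells: Dict[int, int] = field(default_factory=dict)

    def transaction(self, address: int, read: bool = False, write: bool = False,
                    write_data: Optional[int] = None) -> Optional[int]:
        """Model one bus transaction.

        address    : value on the address bus
        read/write : read and write signals on the control bus
        write_data : value on the write data bus (used only when write is active)
        Returns the value driven on the read data bus, or None if no read occurs.
        """
        if read and write:
            raise ValueError("this sketch handles one operation per transaction")
        if write:
            self.cells[address] = write_data
            return None
        if read:
            return self.cells.get(address)  # None if never written (assumption)
        return None


memory = MemoryModel()
memory.transaction(address=0x10, write=True, write_data=0xAB)  # processor writes
read_value = memory.transaction(address=0x10, read=True)       # processor reads back
print(hex(read_value))
```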
The write data bus 108 and the read data bus 110 are illustrated as separate buses in
The memory cell 214 is illustrated with two inverters 202, 204. The first inverter 202 includes a positive-channel metal-oxide-semiconductor field effect (PMOS) transistor 206 and a negative-channel metal-oxide-semiconductor field effect (NMOS) transistor 208. The second inverter 204 includes a PMOS transistor 210 and an NMOS transistor 212. In the described implementation, the inverters 202 and 204 are powered by VDD and have a return power rail VSS (e.g., ground). The first inverter 202 and the second inverter 204 are interconnected to form a cross-coupled latch. A first NMOS access transistor 214 couples the output node 216 from the first inverter 202 to a bitline b1 222, and a second NMOS transistor 218 couples the output node 220 from the second inverter 204 to a bitline b1b 224 (the value of which is the opposite or inverse of the bitline b1 222). The gates of the NMOS access transistors 214, 218 are coupled to a wordline WWL.
A write operation may be performed by setting the bitlines b1 222 and b1b 224 to the value to be written to the memory cell 214 and asserting the wordline WWL. The wordline WWL may be asserted before the value to be written (e.g., write data) is provided to the bitlines b1 222 and b1b 224. A high value, e.g., a logic level “1” may be written to the memory cell 214 by setting the bitline b1 to a logic level “0” and the bitline b1b 224 to a logic level “1.” The logic level “0” at the bitline b1 222 is applied to the second inverter 204 through the first NMOS transistor 214, which in turn forces the output node 220 of the second inverter 204 to VDD. The output node 220 of the second inverter 204 is applied to the input of the first inverter 202, which in turn forces the output node 216 of the first inverter 202 to VSS. A logic level “0” may be written to the memory cell 214 by inverting the values of the bitlines b1 222 and b1b 224.
Once the write operation is complete, the wordline WWL is de-asserted, thereby causing the NMOS access transistors 214 and 218 to disconnect the bitlines b1 222 and b1b 224 from the two inverters 202, 204. The cross-coupling between the two inverters 202, 204 maintains the state of the inverter outputs as long as power is applied to the memory cell 214.
The memory cell 214 stores data according to the data values stored at output nodes 216 and 220. If the memory cell 214 stores a logic high (i.e., a ‘1’), then output node 216 is at a logic high and output node 220 is at a logic low (i.e., a ‘0’). If the memory cell 214 stores a logic low, then output node 216 is at a logic low and output node 220 is at logic high. During a read operation, differential bitlines b1 222 and b1b 224 may be pre-charged by a pre-charge circuit. The word line WWL is then asserted, thereby turning on NMOS transistors 214, 218. The timing between the pre-charging and asserting the wordline WWL may be controlled by a row decoder 204 (not shown).
If memory cell 214 stores a logic high, then bitline b1 222 remains charged via the first NMOS access transistor 214, and complementary bitline b1b 224 is discharged via the second NMOS transistor 218. If memory cell 214 stores a logic low, then bitline b1 222 is discharged via the first NMOS transistor 214, and complementary bitline b1b 224 remains charged via the second NMOS transistor 218.
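The read behavior of the differential bitlines can be summarized in a short logic-level sketch. The function name is hypothetical, and the model is idealized: it treats pre-charge and discharge as instantaneous 0/1 transitions and ignores analog effects.

```python
def read_bitlines(stored_bit: int):
    """Return (bitline, complementary_bitline) after a read of the cell.

    Both bitlines start pre-charged to 1. When the wordline is asserted, the
    bitline connected to the internal node holding 0 is discharged.
    """
    bitline, bitline_bar = 1, 1          # pre-charged state
    node_216 = stored_bit                # output node of the first inverter
    node_220 = 1 - stored_bit            # output node of the second inverter
    if node_216 == 0:
        bitline = 0                      # discharged through the first access transistor
    if node_220 == 0:
        bitline_bar = 0                  # discharged through the second access transistor
    return bitline, bitline_bar


for bit in (1, 0):
    print(f"stored={bit} -> bitlines={read_bitlines(bit)}")
```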
The upper bitline UBL and the lower bitline LBL are pre-charged via transistors 402 and 403, respectively. For instance, when the signals upre_n and lpre_n are low, the PMOS transistors 402, 403 are on, thereby charging the upper bitline and lower bitline to a logic one in the power domain MX. The local data path of
The output of NAND gate 401 is provided to the global bitline 410. In other words, the local bitline rb1 301 of
The signal grb1_2 is a first bitline signal, which is supplemented and complemented in this example by a virtual bitline signal q. In this example, the virtual bitline signal is used in the RS latch like a bitline signal, but it is not a bitline signal itself because it is derived from grb1_2 and delayed_pre after the single-ended sensing. The virtual bitline signal is generated by logic gate 411, which in this example is a NOR gate. NOR gate 411 receives as an input grb1_2 as well as the signal delayed_pre. The signal delayed_pre is shown in more detail in
The NOR gates 412, 413 are cross-coupled and, together, form an RS latch to temporarily store the bit that results from the single-ended sensing provided by NAND gate 401. The NOR gates 412, 413 are described in more detail with respect to
Examples of reading a logic zero and a logic one are now discussed with respect to
The output sro_n of the NOR gate 412 in the latch has already gone to logic one. Now ub1 is pre-charged back to logic one at times T3-T4 so that the output of the NAND gate 401 goes to logic zero, which causes grb1_2 (qb) to go back to logic zero at time T4. Now both NOR gates 412, 413 have inputs of zero, which holds the value in the latch.
Reading a logic one: once again, the bitlines are pre-charged. The read word line (RWL) goes high at time T5, and both ub1 and lb1 remain high. Now grb1_2 (qb) is zero. The signal delayed_pre goes to zero at time T6, so that NOR gate 411 receives logic zeros on both inputs, and that pulls the node q from 0 to 1 at time T7. The logic one at the output of NOR gate 411 is similar to a virtual bitline value of logic one. Now the output sro_n of the NOR gate 412 is logic zero, and the output sro of the NOR gate 413 is logic one. The signal delayed_pre returns to logic one at time T8, which causes the node q to go back to logic zero at time T9, and now both q and qb are at logic zero, which causes the latch to hold its value.
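The two read sequences can be traced with an event-by-event logic simulation. This is a sketch under several assumptions: gates are ideal and delay-free, the global bitline follows the NAND output with no net inversion, the upper and lower bitline signals are named ub1 and lb1 here, and the exact ordering of delayed_pre relative to the bitline discharge in the read-zero case is assumed rather than taken from the timing diagram.

```python
def nor(a: int, b: int) -> int:
    return int(not (a or b))


def nand(a: int, b: int) -> int:
    return int(not (a and b))


def settle(ub1, lb1, delayed_pre, sro, sro_n):
    """Evaluate the sensing NAND, the virtual-bitline NOR, and the NOR latch."""
    qb = nand(ub1, lb1)            # global bitline value (grb1_2 in the text)
    q = nor(qb, delayed_pre)       # virtual bitline from NOR gate 411
    for _ in range(4):             # let the cross-coupled gates settle
        sro_n = nor(q, sro)        # NOR gate 412
        sro = nor(qb, sro_n)       # NOR gate 413
    return qb, q, sro, sro_n


def run(events, sro, sro_n):
    """Apply (ub1, lb1, delayed_pre) events in order and record each settled state."""
    trace = []
    for ub1, lb1, delayed_pre in events:
        qb, q, sro, sro_n = settle(ub1, lb1, delayed_pre, sro, sro_n)
        trace.append({"qb": qb, "q": q, "sro": sro, "sro_n": sro_n})
    return trace


# Read one: bitlines stay high, delayed_pre pulses low, then returns high.
READ_ONE = [(1, 1, 1), (1, 1, 0), (1, 1, 1)]
# Read zero: ub1 discharges, delayed_pre pulses low (assumed), ub1 is pre-charged again.
READ_ZERO = [(1, 1, 1), (0, 1, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)]

read_one = run(READ_ONE, sro=0, sro_n=1)
read_zero = run(READ_ZERO, sro=1, sro_n=0)
print([step["sro"] for step in read_one])
print([step["sro"] for step in read_zero])
```

In both traces the final event has q and qb at zero, which is the hold condition, and sro retains the value that was sensed.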
Table 1 is a truth table for the NOR gate 413:
NOR gate 412 is constructed similarly. It has two PMOS transistors 611 and 612 as well as two NMOS transistors 613, 614. The virtual bitline signal at node q is applied to the gates of transistors 612, 613, and the cross-coupled output of NOR gate 413 is applied to the gates of transistors 611, 614. Table 2 is a truth table for the NOR gate 412:
An advantageous consequence of using NOR gates 412, 413 as the RS latch is that NOR gates may provide voltage level shifting without modification. For instance, in this example, there are two power domains shown in
Some implementations include methods, such as method 700 of
At action 710, the system stores a digital bit in a bit cell. An example bit cell is shown at
At action 720, a first bitline is caused to assume a state in accordance with the digital bit. In the example of
At action 730, the state of the bitline (e.g., rb1) is sensed. In the example of
At action 740, the method includes generating a virtual bitline value. In the example of
At action 750, the single-ended digital value is received at an RS latch. The RS latch also receives the virtual bitline value. An example of
At action 760, the digital bit is stored in the RS latch. Action 760 may include causing the single-ended digital value and the virtual bitline value to both be zero, thereby putting the RS latch in a stable state.
Action 760 may further include level shifting from one power domain to another power domain. For instance, the latch itself may operate in a power domain (CX) different from a power domain (MX) of the sensing stage. As explained above with respect to
The scope of implementations is not limited to the specific actions shown in
Various implementations described herein may be suitable for use in a system on chip (SoC). An example of a SoC includes a semiconductor chip having multiple processing devices within it, including a graphics processing unit (GPU), a central processing unit (CPU), a modem unit, a camera unit, and the like. In some examples, the SoC may be included within a chip package, mounted on a printed circuit board, and disposed within a portable device, such as a smart phone or tablet computer. However, the scope of implementations is not limited to a chip implemented within a tablet computer or smart phone, as other applications are possible.
RAM memory unit 890 may include reading circuits, such as those described above with respect to
Furthermore, in this example, GPU 820 includes memory 821. Memory 821 may be implemented as a local memory for GPU 820. In one implementation, memory 821 may be a single-bank or multi-bank memory having the architecture described above with respect to
As those of some skill in this art will by now appreciate and depending on the particular application at hand, many modifications, substitutions and variations can be made in and to the materials, apparatus, configurations and methods of use of the devices of the present disclosure without departing from the scope thereof. In light of this, the scope of the present disclosure should not be limited to that of the particular implementations illustrated and described herein, as they are merely by way of some examples thereof, but rather, should be fully commensurate with that of the claims appended hereafter and their functional equivalents.
Implementation examples are described in the following numbered clauses:
1. A memory comprising:
a first bitline coupled to a bit cell;
a second bitline coupled to the first bitline;
a first logic gate coupled to the second bitline; and
a reset-set (RS) latch having a first input and a second input, wherein the first input is coupled to the second bitline and the second input is coupled to the second bitline via the first logic gate.
2. The memory of clause 1, wherein the first logic gate comprises a NOR gate.
3. The memory of clause 2, wherein the first logic gate is further coupled to a periodic control signal.
4. The memory of clauses 1-3, wherein the second bitline is implemented in an upper bitline and lower bitline pair and coupled to the first input through a NAND gate.
5. The memory of clauses 1-4, wherein the RS latch comprises a second logic gate and a third logic gate, the second logic gate comprising a first p-type metal oxide semiconductor (PMOS) transistor and a second PMOS transistor arranged in series, wherein the first PMOS transistor is gate-coupled to a first control voltage associated with a first power domain and the second PMOS transistor is gate-coupled to a second control voltage associated with a second power domain.
6. The memory of clause 5, wherein the third logic gate comprises a third PMOS transistor and fourth PMOS transistor arranged in series, wherein the third PMOS transistor is gate-coupled to the first control voltage and the fourth PMOS transistor is gate-coupled to a voltage associated with the second power domain.
7. The memory of clause 5, wherein the first power domain has a lower voltage level than does the second power domain.
8. The memory of clauses 1-7, wherein the first bitline comprises a local bitline, and wherein the second bitline comprises a global bitline.
9. The memory of clauses 1-8, wherein the second bitline is associated with a first power domain, and wherein the RS latch is associated with a second power domain.
10. A method of reading data from a bit cell, the method comprising:
causing a first bit line to assume a state in accordance with a digital bit that is stored in the bit cell;
sensing the state, including causing a second bit line, which is coupled to the first bit line, to assume a single-ended digital value based at least in part on the digital bit;
generating a virtual bit line value;
receiving the single-ended digital value and the virtual bit line value at a reset-set (RS) latch; and
storing the digital bit in the RS latch.
11. The method of clause 10, wherein generating the virtual bit line value comprises:
receiving the single-ended digital value at a logic gate;
receiving a periodic control signal at the logic gate; and
outputting the virtual bit line value from the logic gate.
12. The method of clause 11, wherein the logic gate comprises a NOR gate that is coupled to an input of the RS latch, wherein the RS latch receives the single-ended digital value as an additional input.
13. The method of clauses 10-11, wherein sensing the state comprises:
coupling the first bit line to a pre-charged third bit line, wherein the pre-charged third bit line is coupled to an input of a logic gate, and wherein a pre-charged fourth bit line is coupled to an additional input of the logic gate; and
outputting the single-ended digital value from the logic gate.
14. The method of clause 13, wherein the logic gate comprises a NAND gate.
15. The method of clauses 10-14, wherein the virtual bit line value and the single-ended digital value correspond to a first power domain, and wherein an output of the RS latch corresponds to a second power domain, further wherein the second power domain has a lower voltage level than the first power domain.
16. The method of clauses 10-15, further comprising:
performing voltage level shifting between a first power domain and a second power domain at the RS latch, wherein the virtual bit line value and the single-ended digital value correspond to the first power domain, and wherein an output of the RS latch corresponds to the second power domain.
17. The method of clauses 10-16, further comprising:
driving the digital bit from an output of the RS latch to a data bus.
18. A memory device comprising:
means for storing a bit of data;
means for sensing a state of a bit line that is coupled to the means for storing the bit of data;
means for generating a virtual bit line value from an output of the means for sensing; and
a reset-set (RS) latch having a first input coupled to the means for sensing and a second input coupled to the means for generating the virtual bit line value.
19. The memory device of clause 18, wherein the means for generating the virtual bit line value comprises a NOR gate.
20. The memory device of clause 19, wherein the NOR gate is further coupled to a periodic control signal.
21. The memory device of clauses 18-20, wherein the means for sensing comprises: an upper bitline and lower bitline pair and coupled to the RS latch through a NAND gate.
22. The memory device of clauses 18-21, wherein the RS latch comprises a first logic gate and a second logic gate, the first logic gate comprising a first p-type metal oxide semiconductor (PMOS) transistor and a second PMOS transistor arranged in series, wherein the first PMOS transistor is gate-coupled to a first control voltage associated with a first power domain and the second PMOS transistor is gate-coupled to a second control voltage associated with a second power domain.
23. The memory device of clause 22, wherein the second logic gate comprises a third PMOS transistor and fourth PMOS transistor arranged in series, wherein the third PMOS transistor is gate-coupled to the first control voltage and the fourth PMOS transistor is gate-coupled to a voltage associated with the second power domain.
24. The memory device of clause 22, wherein the first power domain has a lower voltage level than does the second power domain.
25. A system comprising:
a memory device coupled to a processor and configured to perform read operations and write operations in response to the processor;
a plurality of bit cells arranged in rows and columns within the memory device; and
read circuitry coupled to a first one of the bit cells, the read circuitry including:
- a bit line pair coupled to a first logic gate and to the first one of the bit cells;
- a reset-set (RS) latch having a first input coupled to an output of the first logic gate; and
- a second logic gate coupled to a second input of the RS latch, the second logic gate further coupled to the output of the first logic gate and to a control signal.
26. The system of clause 25, wherein the processor comprises a graphics processing unit (GPU), and wherein the memory device is included within the GPU.
27. The system of clauses 25-26, wherein the bit line pair comprises a pre-charged upper bit line and lower bit line, wherein either the upper bit line or the lower bit line is coupled to a bit line bar of the first one of the bit cells.
28. The system of clauses 25-27, wherein the second logic gate comprises a NOR gate, and wherein the control signal comprises a periodic control signal.
29. The system of clauses 25-28, wherein the RS latch comprises a third logic gate and a fourth logic gate, the third logic gate comprising a first p-type metal oxide semiconductor (PMOS) transistor and a second PMOS transistor arranged in series, wherein the first PMOS transistor is gate-coupled to a first control voltage associated with a first power domain and the second PMOS transistor is gate-coupled to a second control voltage associated with a second power domain, wherein the fourth logic gate comprises a third PMOS transistor and fourth PMOS transistor arranged in series, wherein the third PMOS transistor is gate-coupled to the first control voltage and the fourth PMOS transistor is gate-coupled to a voltage associated with the second power domain.
30. The system of clause 29, wherein the first power domain has a lower voltage level than does the second power domain.
Claims
1. A memory comprising:
- a first bitline coupled to a bit cell;
- a second bitline coupled to the first bitline;
- a first logic gate coupled to the second bitline; and
- a reset-set (RS) latch having a first input and a second input, wherein the first input is coupled to the second bitline and the second input is coupled to the second bitline via the first logic gate.
2. The memory of claim 1, wherein the first logic gate comprises a NOR gate.
3. The memory of claim 2, wherein the first logic gate is further coupled to a periodic control signal.
4. The memory of claim 1, wherein the second bitline is implemented in an upper bitline and lower bitline pair and coupled to the first input through a NAND gate.
5. The memory of claim 1, wherein the RS latch comprises a second logic gate and a third logic gate, the second logic gate comprising a first p-type metal oxide semiconductor (PMOS) transistor and a second PMOS transistor arranged in series, wherein the first PMOS transistor is gate-coupled to a first control voltage associated with a first power domain and the second PMOS transistor is gate-coupled to a second control voltage associated with a second power domain.
6. The memory of claim 5, wherein the third logic gate comprises a third PMOS transistor and fourth PMOS transistor arranged in series, wherein the third PMOS transistor is gate-coupled to the first control voltage and the fourth PMOS transistor is gate-coupled to a voltage associated with the second power domain.
7. The memory of claim 5, wherein the first power domain has a lower voltage level than does the second power domain.
8. The memory of claim 1, wherein the first bitline comprises a local bitline, and wherein the second bitline comprises a global bitline.
9. The memory of claim 1, wherein the second bitline is associated with a first power domain, and wherein the RS latch is associated with a second power domain.
10-30. (canceled)
Type: Application
Filed: Nov 29, 2021
Publication Date: Jun 1, 2023
Inventors: Arun Babu PALLERLA (San Diego, CA), Anil Chowdary KOTA (San Diego, CA), Changho JUNG (San Diego, CA), Chulmin JUNG (San Diego, CA)
Application Number: 17/456,773