Memory malfunction prediction system and method
A memory malfunction prediction system and method, such as those that sequentially stress each row of memory cells in an array by decreasing the refresh rate of the row. Prior to doing so, the data stored in the row can be copied to a holding row, and a CRC value for the data can be generated and stored. After the test, the data stored in the row being tested can be read, and a CRC value for the data can then be generated. This post-test CRC value can be compared to the stored pre-test CRC value. In the event of a match, the row can be considered to be functioning properly, and the next row can then be tested. If the CRC values do not match, a predicted malfunction of the row can be considered to exist, and corrective action can be taken, such as by repairing the row by substituting a redundant row of memory cells.
This application is a continuation of U.S. patent application Ser. No. 12/141,716, filed on Jun. 18, 2008, U.S. Pat. No. 7,773,441, which application is incorporated herein by reference, for any purpose.
TECHNICAL FIELD
This invention relates to memory devices, and, more particularly, in various embodiments, to a system and method for predicting memory malfunctions before they occur to allow corrective action to be taken before the memory device malfunction occurs.
BACKGROUND OF THE INVENTION
A wide variety of memory devices are found in electronic systems. For example, dynamic random access memory (“DRAM”) devices are commonly used as system memory in computer systems. Although DRAM devices are highly reliable, they nevertheless do, at times, malfunction. A common DRAM device malfunction mode is the data retention error, which results when a memory device is unable to store data for a period of adequate duration. As is well known in the art, DRAM cells must be periodically refreshed to retain data. Data retention errors often result from the inability of DRAM memory cells to retain data between refreshes.
DRAM devices used in a computer system are normally tested during “boot-up” of the computer system. However, even if the DRAM devices pass the test during boot-up, they may malfunction during subsequent use. A DRAM device malfunction usually does not create too much of a problem because the system can simply be powered down and repaired by obtaining and installing a new DRAM device. Although the system must be shut down while the DRAM device is being installed, that also is usually not much of a problem. However, there are systems that cannot be shut down without creating somewhat greater problems. For example, shutting down a computer used to service a network of automatic teller machines (“ATMs”) would render the ATMs unusable for the entire period that the repair was being made. Another example results from malfunctions of a DRAM device used as system memory in a computer system performing a computation that may take a very long time, such as several weeks, to complete. If the DRAM device malfunctions well into the computation, it is often necessary to repeat the entire calculation after the malfunctioning DRAM device has been replaced. Unfortunately, there have been no suitable techniques to mitigate the adverse effects of such DRAM malfunctions.
There is therefore a need for a system and method that, for example, reduces the risk of unexpected memory device malfunctions occurring during use of electronic systems, such as computer systems, containing DRAM devices.
A DRAM device 10 according to one embodiment of the invention is shown in
With further reference to
The DRAM device 10 includes a stress controller 24 that controls the operation of the DRAM device to predict future malfunctions, as explained in greater detail below. The DRAM device also includes a test counter 26 that is incremented to provide row addresses in sequence as each row is tested. The address of the row currently being tested is applied to the steering logic 22, which remaps that address to a holding row 28 when the steering logic 22 receives the address of the row currently being tested from the auto refresh counter 20. As a result, when the auto refresh counter 20 outputs the address of the row being tested, the address is remapped to the holding row 28 so that the row being tested is not refreshed. Instead, the holding row 28 is refreshed. As explained in greater detail below, the holding row 28 is where the data that was stored in the row being tested is stored during the test so that no data is lost during testing. Although a dedicated holding row 28 is used in the embodiment of
The DRAM device 10 also includes a refresh stress counter 30, which is incremented by an address comparator 34. The address comparator 34 receives the address of the row being refreshed from the auto refresh counter 20 and the address of the row being tested from the test counter 26. In the event of an address match, the address comparator 34 outputs a signal that causes the refresh stress counter 30 to increment. The refresh stress counter 30 thus keeps track of how many times a refresh of the row being tested has been skipped. When the count of the refresh stress counter 30 reaches a particular (e.g., predetermined) number, it outputs a “row complete” signal to the stress controller 24 to indicate that testing of the row has been completed. The stress controller 24 then issues a signal to the test counter 26 that causes it to increment to the address of the next row to be tested.
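The interaction of the auto refresh counter 20, the steering logic 22, the test counter 26, the address comparator 34 and the refresh stress counter 30 can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the circuitry itself: the class name, the field names, the array size and the skip limit are all hypothetical, and the holding row 28 is represented simply by an index outside the normal row range.

```python
from dataclasses import dataclass


@dataclass
class RefreshStressSim:
    """Toy model of refresh steering; all names and sizes are hypothetical."""

    num_rows: int         # rows in the array
    holding_row: int      # index standing in for the holding row
    skip_limit: int       # skipped refreshes that end the test of a row
    test_row: int = 0     # models the test counter
    refresh_row: int = 0  # models the auto refresh counter
    skipped: int = 0      # models the refresh stress counter

    def refresh_tick(self):
        """Perform one auto refresh.

        Returns (row actually refreshed, row_complete flag).
        """
        address = self.refresh_row
        self.refresh_row = (self.refresh_row + 1) % self.num_rows
        if address != self.test_row:
            # No address match: the row is refreshed normally.
            return address, False
        # Address match: the refresh is steered to the holding row,
        # so the row under test is skipped and the skip is counted.
        self.skipped += 1
        return self.holding_row, self.skipped >= self.skip_limit

    def advance_test_row(self):
        """Move the test counter to the next row and clear the skip count."""
        self.test_row = (self.test_row + 1) % self.num_rows
        self.skipped = 0
```

In this model a caller would invoke `refresh_tick()` once per refresh cycle and, when the second returned value is true, perform the post-test comparison before calling `advance_test_row()`.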
As mentioned above, prior to testing each row of DRAM cells, the data stored in that row is transferred to the holding row 28. This is accomplished by the stress controller 24 outputting a signal to a row copy controller 38. The row copy controller 38 outputs a signal to the steering logic 22 and a row decoder 40 which causes the row to be tested to be actuated so that the data in that row are output from the sense amplifiers 18. The steering logic 22 then actuates the holding row 28 so that the data output from the sense amplifiers 18 are stored in the holding row 28.
When the sense amplifiers 18 output the data stored in the row to be tested, the data are received by a cyclic redundancy check (“CRC”) generator 44, which generates a CRC value corresponding to the data. The CRC value is then stored in a CRC storage device 46, such as a conventional register, during the testing of the row that stored that data. When the test of each row is completed, the refresh stress counter 30 outputs a signal to the stress controller 24. The stress controller 24 then outputs a signal to the steering logic 22, which again actuates the row being tested. The data stored in that row during the test are then output by the sense amplifiers 18, and the CRC generator 44 generates a CRC value corresponding to that data. The generated CRC value is applied to a CRC comparator 48, which also receives the CRC value stored in the CRC storage device 46. In the event of a match, which indicates that the row being tested was able to retain the stored data during the test despite being skipped for refreshes, the CRC comparator 48 outputs a pass signal to the stress controller 24. The stress controller 24 responds by applying a signal to the test counter 26, which causes it to increment to the address of the next row to be tested.
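The pre-test and post-test CRC comparison can be sketched in software as follows. This is an illustration only: the description does not specify a CRC polynomial or width, so the use of CRC-32 from Python's standard `zlib` module is an assumption, and the function names are hypothetical.

```python
import zlib


def crc_of_row(row_data: bytes) -> int:
    """Return a CRC value for one row of data.

    CRC-32 is an assumption made for illustration; the description
    does not name a particular CRC.
    """
    return zlib.crc32(row_data)


def row_passed(stored_crc: int, data_after_test: bytes) -> bool:
    """Compare the stored pre-test CRC with a CRC generated after the test.

    A match models the pass signal; a mismatch models the fail signal.
    """
    return crc_of_row(data_after_test) == stored_crc
```

Storing only the CRC value during the test, rather than a second full copy of the row, keeps the comparison input small, at the cost of a very small chance that corrupted data produces the same CRC value.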
If the data stored in the row being tested at the end of the test does not match the data that was stored in the row prior to the test, the CRC values corresponding to the different data will not match. As a result, the CRC comparator 48 will output a “fail” signal to the stress controller 24. The stress controller 24 then issues a signal to row repair logic 50 that causes a redundant row of memory cells in the array 14 to be substituted for the malfunctioning row. This may be accomplished by programming the row repair logic 50 to remap the address of the malfunctioning row to the address of the redundant row that is being substituted for the malfunctioning row. However, in other embodiments, the stress controller 24 causes other types of corrective action to be taken. For example, the stress controller 24 may output a signal to circuitry (not shown) such as a clock generator that generates a signal that increments the auto refresh counter 20 to cause it to more quickly increment. Doing so decreases the refresh interval so that the memory cells in the malfunctioning row are refreshed more frequently. Other types of corrective action may also be taken.
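The remapping performed by the row repair logic 50 can be modeled as a lookup table from malfunctioning row addresses to redundant row addresses. The sketch below is a simplified software analogy; the class name, method names and the way spare rows are allocated are assumptions, not details taken from the description.

```python
class RowRepairLogic:
    """Toy model of address remapping to redundant rows (hypothetical API)."""

    def __init__(self, redundant_rows):
        self._spares = list(redundant_rows)  # unused redundant row addresses
        self._remap = {}                     # malfunctioning row -> redundant row

    def repair(self, failing_row: int) -> bool:
        """Substitute a redundant row for failing_row.

        Returns True if the row is (or already was) remapped, and
        False if no redundant row is left.
        """
        if failing_row in self._remap:
            return True
        if not self._spares:
            return False
        self._remap[failing_row] = self._spares.pop(0)
        return True

    def translate(self, row: int) -> int:
        """Return the address actually accessed for a requested row."""
        return self._remap.get(row, row)
```

When `repair()` returns False in this model, one of the other corrective actions mentioned above, such as refreshing more frequently, would be the remaining option.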
A method of testing the memory cells in the array 14 according to another embodiment of the invention is shown in
Returning to step 78, if a determination is made that the CRC value generated from the data stored in the row being tested before the test matches the CRC value generated from the data stored in that row after the test, the data stored in the holding row is copied back to the row being tested at step 86, and the test counter is advanced at step 88, as explained above. However, in some embodiments, the method progresses directly to step 88 from step 78 if the CRC generated from the data stored in the row under test after the test matches the CRC generated from the data stored in the row before the test since the row under test will be storing the correct data, thus making step 86 unnecessary.
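The overall per-row procedure, including the optional copy-back at step 86, might be sketched as shown below. This is a hedged illustration rather than the claimed method: rows are modeled as byte strings in a list, the `stress` callable is a hypothetical stand-in for the effect of skipped refreshes, CRC-32 is an assumed choice of CRC, and corrective action after a mismatch is left to the caller.

```python
import zlib
from typing import Callable, List


def check_row(rows: List[bytes], row: int, holding: int,
              stress: Callable[[bytes], bytes],
              copy_back: bool = True) -> bool:
    """Stress one row and report whether its data survived."""
    rows[holding] = rows[row]            # save the data in the holding row
    stored_crc = zlib.crc32(rows[row])   # pre-test CRC value
    rows[row] = stress(rows[row])        # stand-in for the skipped refreshes
    passed = zlib.crc32(rows[row]) == stored_crc
    if passed and copy_back:
        # Step 86; some embodiments skip it because a matching CRC
        # indicates the row already holds the correct data.
        rows[row] = rows[holding]
    # On a mismatch the good copy remains in the holding row.
    return passed


def scan(rows: List[bytes], holding: int,
         stress: Callable[[bytes], bytes]) -> List[int]:
    """Test every row except the holding row; return the failing rows."""
    failing = []
    for row in range(len(rows)):
        if row != holding and not check_row(rows, row, holding, stress):
            failing.append(row)
    return failing
```

Passing `copy_back=False` models the embodiments that proceed directly from step 78 to step 88.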
The memory device 10 or a memory device according to some other embodiment of the invention may be used in a wide variety of electronic systems. For example, the memory device 10 is used in a computer system 100 as shown in
From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. For example, although the memory malfunction prediction system and method has been described in the context of a system for predicting malfunctions of DRAM devices resulting from data retention problems, it may also be applied to predict a variety of other problems in DRAM devices or other types of memory devices. For example, it may be used to predict malfunctions in flash memory devices. Instead of testing and repairing the flash memory cells on a row-by-row basis, the flash memory cells could be tested and repaired on a block-by-block or other basis. In such case, the data stored in the block to be tested would be transferred to a holding block during the test. The data stored there, or a compressed version of the data such as a CRC value, would then be compared to the same generated from the data stored in the block after the test had been completed. Since flash memory cells need not be refreshed, the flash memory cells could be tested (e.g., stressed) in ways other than by reducing a refresh rate. For example, the memory cells in the block could be erased and then rewritten in a manner not normally used during normal operation, such as by altering the word line voltage from the word line voltage normally used for write operations. Also, although the system and method is explained in the context of testing and refreshing rows of memory cells, it will be understood that the memory cells may be tested and/or refreshed in groups of other types. Finally, although the predicted malfunctions may be failures, they can also be limitations on the performance of the memory cells or memory device that do not amount to a failure. Other variations and alternatives will be apparent to one skilled in the art. 
Accordingly, the invention is not limited except as by the appended claims.
Claims
1. A system for monitoring memory cells, comprising:
- refresh circuitry operable to refresh the memory cells, the refresh circuitry being operable to refresh a selected plurality of memory cells with a test refresh rate and to refresh the remaining memory cells with a normal refresh rate that is faster than the test refresh rate;
- data transfer circuitry coupled to the memory cells, the data transfer circuitry being operable to copy the data stored in selected plurality of memory cells to another storage location before the refresh circuitry refreshes the selected plurality of memory cells with the test refresh rate;
- data comparison circuitry coupled to the memory cells, the data comparison circuitry being operable to compare at least some of the data stored in the another storage location to at least some of the data stored in the selected plurality of memory cells after the selected plurality of memory cells have been refreshed with the test refresh rate, the data comparison circuitry being further operable to generate a malfunction indication if the data comparison circuitry determines that at least some of the data stored in the another storage location does not match the data stored in the selected plurality of memory cells after the selected plurality of memory cells have been refreshed with the test refresh rate; and
- repair logic coupled to the comparison circuitry, the repair logic being operable responsive to the malfunction indication to remap accesses to the selected plurality of memory cells to a redundant plurality of memory cells.
2. The system of claim 1, wherein the data comparison circuitry is configured to compare a CRC value derived from the at least some of the data stored in another storage location to a CRC value derived from the at least some of the data stored in the selected plurality of memory cells.
3. The system of claim 1, wherein the repair logic is further configured to change at least one of the test refresh rate or normal refresh rate responsive to the malfunction indication.
4. The system of claim 1, wherein the refresh circuitry further comprises:
- steering logic coupled to the data comparison circuitry and repair logic and configured to refresh the memory cells responsive to a malfunction indication.
5. The system of claim 1, wherein the another storage location includes another plurality of memory cells.
6. The system of claim 1, wherein the data comparison circuitry is further configured to output an external signal responsive to at least some of the data stored in the another storage location not matching the data stored in the selected plurality of memory cells after the selected plurality of memory cells have been refreshed with the test refresh rate.
7. A system for monitoring memory malfunctions, comprising:
- test circuitry configured to compare a first plurality of memory cells to a second plurality of memory cells, the test circuitry further configured to provide a pass signal responsive to a match between the first plurality of memory cells and the second plurality of memory cells and provide a fail signal responsive to a mismatch between the first plurality of memory cells and the second plurality of memory cells;
- transfer circuitry coupled to the first plurality of memory cells and second plurality of memory cells and configured to transmit data from the first plurality of memory cells to the second plurality of memory cells; and
- malfunction correction circuitry coupled to the test circuitry and configured to remap accesses to the first plurality of memory cells responsive to receipt of a fail signal.
8. The system of claim 7, wherein the test circuitry is configured to compare CRC values of data in the first and second plurality of memory cells.
9. The system of claim 7, wherein the malfunction circuitry is further configured to change a memory refresh rate responsive to receipt of a fail signal.
10. The system of claim 7, wherein the malfunction circuitry is further configured to output an external signal responsive to receipt of a fail signal.
11. The system of claim 7, further comprising:
- a test counter coupled to the test circuitry and configured to specify the location of the first plurality of memory cells on a memory device.
12. The system of claim 7, wherein the first plurality of memory cells comprises a row of memory cells in a memory device.
13. The system of claim 7, wherein the transfer circuitry is further configured to transfer data from the second plurality of memory cells to the first plurality of memory cells.
14. A method for monitoring memory cells, comprising:
- generating a test address corresponding to at least one of a first plurality of memory cells;
- generating a plurality of test addresses corresponding to a second plurality of memory cells;
- refreshing the first plurality of memory cells with a test refresh rate;
- refreshing the second plurality of memory cells with a normal refresh rate;
- copying data from the first plurality of memory cells to another location before the first plurality of memory cells are refreshed with the test refresh rate;
- comparing at least some stored data at the another location to at least some of the data in the first plurality of memory cells; and
- refreshing the first plurality of memory cells if the stored data does not match the data in the first plurality of memory cells.
15. The method of claim 14, further comprising:
- refraining from refreshing the first plurality of memory cells if the at least some stored data matches the data in the first plurality of memory cells.
16. The method of claim 14, further comprising:
- generating a malfunction signal if the at least some stored data does not match the data in the first plurality of memory cells.
17. The method of claim 14, further comprising:
- remapping accesses to the first plurality of memory cells responsive, at least in part, to the at least some stored data not matching the data in the first plurality of memory cells.
18. The method of claim 14, wherein said comparing at least some stored data comprises:
- generating a pre-test CRC value corresponding to the data stored in the first plurality of memory cells before the first plurality of memory cells is refreshed;
- generating a post-test CRC value corresponding to the data stored in the first plurality of memory cells after the first plurality of memory cells are refreshed; and
- comparing the pre-test CRC value and the post-test CRC value.
19. The method of claim 14, wherein the normal refresh rate is faster than the test refresh rate.
20. The method of claim 14, further comprising:
- decreasing the refresh interval for the first plurality of memory cells responsive to the stored data not matching the data in the first plurality of memory cells.
6097644 | August 1, 2000 | Shirley |
6272588 | August 7, 2001 | Johnston et al. |
6697992 | February 24, 2004 | Ito et al. |
6862240 | March 1, 2005 | Burgan |
6868021 | March 15, 2005 | Tanabe et al. |
7158433 | January 2, 2007 | Riho et al. |
7167403 | January 23, 2007 | Riho et al. |
7450458 | November 11, 2008 | Mori et al. |
7773441 | August 10, 2010 | Bunker et al. |
20030147295 | August 7, 2003 | Frankowsky et al. |
20030204689 | October 30, 2003 | Shimoda |
20040062119 | April 1, 2004 | Stimak et al. |
20050249010 | November 10, 2005 | Klein |
20070030746 | February 8, 2007 | Best et al. |
20070174718 | July 26, 2007 | Fouquet-Lapar |
20080046798 | February 21, 2008 | Brown |
20090316501 | December 24, 2009 | Bunker et al. |
Type: Grant
Filed: Jul 12, 2010
Date of Patent: Sep 20, 2011
Patent Publication Number: 20100271896
Assignee: Micron Technology, Inc. (Boise, ID)
Inventors: Layne Bunker (Boise, ID), Ebrahim Hargan (Boise, ID)
Primary Examiner: Son Mai
Attorney: Dorsey & Whitney LLP
Application Number: 12/834,618
International Classification: G11C 7/00 (20060101); G11C 29/00 (20060101);