TECHNIQUES FOR DETECTING AND CORRECTING ERRORS IN A MEMORY DEVICE

Info

Publication number: 20110131471
Type: Application
Filed: Feb 14, 2011
Publication Date: Jun 2, 2011
Inventor: Guillermo Rozas (Los Gatos, CA)
Application Number: 13/026,607

Abstract

A technique for detecting and correcting errors in a memory device, in accordance with one embodiment, includes a data storage area arranged in a plurality of blocks, wherein each block contains a plurality of words. The memory device also includes an error detection/correction storage area for storing error detection/correction bytes corresponding to each word in each block and error detection words corresponding to each block.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of and claims priority to U.S. patent application Ser. No. 11/395,710, filed on Mar. 31, 2006, which is hereby incorporated by reference in its entirety.

BACKGROUND

In the conventional art, various forms of error detection and correction are utilized to correct errors in memories, such as caches, system memory, frame buffers and the like that are implemented using static and dynamic random access memory (RAM), read only memory (ROM), and the like. A conventional memory device 100 is illustrated in FIG. 1. The memory device includes an array of memory cells 110, 120, a row decoder 130, a column decoder 140, and error detection/correction logic 150. Typically, the array of memory cells for storing data 110 is extended with additional memory cells for storing error detecting and/or correcting codes 120. The memory cells for storing the error detection/correction codes 120 store a quantity derived from the memory cells utilized for storing data 110. The error detection/correction codes allow corrupted data to be detected and corrected most of the time. One conventional error detecting technique extends every 8 bits of data with an additional parity bit used for detecting a single-bit error. Another conventional technique extends every 64 bits of data with an additional 8 bits of error correcting code (ECC) to detect and correct single-bit errors and to detect double-bit errors without correction.

Other techniques for detecting and correcting multi-bit errors have been developed. However, conventional methods for detecting and correcting multi-bit errors consume a large portion of the memory cell array and/or result in undesirable memory latency.

SUMMARY

Embodiments are directed toward techniques for detecting and correcting errors in a memory device. In one embodiment, a memory device includes a data storage area arranged in a plurality of blocks, wherein each block contains a plurality of words. The memory device also includes an error detection/correction storage area for storing error detection/correction bytes corresponding to each word in each block and error detection words corresponding to the words in each block.

In another embodiment, a method of writing data in a memory device includes computing an error detection/correction byte for each word in a block. An error detection word is computed from the words in the block and an error detection/correction byte is computed for the error detection word. The words, the corresponding error detection/correction bytes, the error detection word and its error detection/correction byte are written to the corresponding block in the memory device.

In yet another embodiment, a method of reading data in a memory device includes detecting errors in a word using an error detection/correction byte corresponding to the word and an error detection word corresponding to a block containing the word to be read. A single-bit error is corrected using the error detection/correction byte, if a single-bit error in the word is detected. A double-bit error is corrected using the error detection/correction byte and the error detection word, if a double-bit error in the word is detected.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments are illustrated by way of example and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 shows a block diagram of a memory device according to the conventional art.

FIG. 2 shows a block diagram of a memory device in accordance with one embodiment.

FIG. 3 shows a flow diagram of a method of writing data in a memory device in accordance with one embodiment.

FIGS. 4A and 4B show a flow diagram of a method of reading data in a memory device in accordance with one embodiment.

FIG. 5 shows a block diagram of a memory device in accordance with another embodiment.

DETAILED DESCRIPTION

Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings. While this disclosure will be described in conjunction with these embodiments, it will be understood that they are not intended to limit the disclosure to these embodiments. On the contrary, the disclosure is intended to cover alternatives, modifications and equivalents, which may be included within the scope of the disclosure as defined by the appended claims. Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a thorough understanding. However, it is understood that embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the disclosure.

Referring to FIG. 2, an exemplary memory device, in accordance with one embodiment, is shown. The memory device may be a computer readable medium, such as dynamic or static random access memory (RAM), read only memory (ROM), flash memory or the like. The memory device includes an array of memory cells 210, 220, 230, a row decoder 240, a column decoder 250 and error detection/correction logic 260. The memory cell array includes a data storage area 210, a word-wise (e.g., row) error detection/correction storage area 220 and a bit-wise (e.g., column) error detection storage area 230.

The data storage area 210 includes an array of memory cells arranged in a plurality of blocks. Each block contains m words. Each word includes n bytes of p bits. A q-bit error detection/correction code is calculated for each word to produce an error detection/correction byte corresponding to the particular word. The error detection/correction byte of each word is stored in the corresponding word-wise error detection/correction storage area 220. An error detection bit is also calculated from each respective bit in the words of a block to produce an error detection word for each block. The error detection word of each block is stored in the corresponding bit-wise error detection storage area 230. A q-bit error detection/correction code is also calculated for the error detection word and stored in the corresponding word-wise error detection/correction storage area 220.

The error detection/correction logic 260 may be adapted to generate the error detection/correction bytes and/or the error detection words. The error detection/correction logic 260 may also be adapted to detect and correct single-bit errors in a word utilizing the error detection/correction bytes. In addition, the error detection/correction logic 260 may also be adapted to detect and correct double-bit errors in a single word in a given block utilizing the error detection/correction bytes of the given block in combination with the error detection word of the given block. Although the memory device is discussed herein as having error detection/correction logic 260 coupled to a single array of memory cells 210, 220, 230, it is understood that the error detection/correction logic 260 may be external to the array since it can be shared per access rather than per row or column.

In an exemplary implementation, the data storage area 210 is organized in 64-bit units. Each 64-bit unit, referred to herein as a word, is arranged as eight 8-bit bytes and extended by an additional 8-bit ECC byte in a word-wise ECC byte storage area 220. The data storage is further organized in blocks of eight words. Each bit position of the eight words (e.g., each bit of the eight 8-bit bytes) in a block is extended by an additional parity bit in a bit-wise ECC word storage area 230. That is, bit 0 of byte 0 of the ECC word storage area 230 stores the exclusive-or (XOR) of bit 0 of byte 0 of all the words in the data storage area 210 for the given block. The parity bits for each respective bit in the eight words form an ECC word of eight bytes of 8 bits each. An ECC byte of the ECC word is determined and stored in the word-wise ECC byte storage area 220. The data bits arranged in eight words of eight 8-bit bytes, the corresponding ECC bytes and the ECC word are illustrated in Table 1.

TABLE 1

[word 0]    byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
[word 1]    byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
[word 2]    byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
[word 3]    byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
[word 4]    byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
[word 5]    byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
[word 6]    byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
[word 7]    byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
[ECC word]  byte 0  byte 1  byte 2  byte 3  byte 4  byte 5  byte 6  byte 7  ECC byte
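As a rough check on the storage cost of the layout in Table 1, the following sketch counts the data bits and check bits of one block using the symbols m, n, p and q from the description of FIG. 2. The function name `ecc_overhead_bits` is hypothetical and is used here only for illustration.

```python
def ecc_overhead_bits(m, n, p, q):
    """Return (data_bits, check_bits) for one block.

    m: words per block, n: bytes per word, p: bits per byte,
    q: bits in the error detection/correction code of one word.
    """
    data_bits = m * n * p
    word_ecc_bits = m * q            # one error detection/correction byte per data word
    detection_word_bits = n * p      # one parity bit per bit position of a word
    detection_word_ecc_bits = q      # code protecting the error detection word itself
    check_bits = word_ecc_bits + detection_word_bits + detection_word_ecc_bits
    return data_bits, check_bits


# Exemplary implementation: eight words of eight 8-bit bytes, 8-bit ECC bytes.
data_bits, check_bits = ecc_overhead_bits(m=8, n=8, p=8, q=8)
print(data_bits, check_bits, check_bits / data_bits)
```

For the exemplary implementation this gives 512 data bits and 136 check bits per block (64 bits of ECC bytes for the data words, 64 bits for the ECC word, and 8 bits for the ECC byte of the ECC word).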

Any conventional double-bit error detection (DED) single-bit error correction (SEC) code may be utilized to generate the error detection/correction bytes. Accordingly, the error detection/correction byte can be utilized to detect two-bit errors in the corresponding word and correct a single-bit error in the corresponding word. Any conventional single-bit error detection (SED) code may be utilized to generate the error detection word. For example, in one implementation, the bits of the error detection word are generated by the column-wise parity (e.g., XOR) of all the data bits for the m words in a block. Implementations may use either positive or negative parity to generate the bits of the error detection word. The error detection/correction byte of the error detection word is computed using the selected double-bit error detection single-bit error correction (DED-SEC) code applied to the XOR generated bytes of the error detection word.
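The disclosure leaves the choice of DED-SEC code open. As one possible illustration only, the sketch below uses an extended Hamming code over a 64-bit word: seven Hamming check bits plus one overall parity bit make up the 8-bit ECC byte. The bit layout and the names `ecc_byte` and `check_word` are assumptions made for this example and do not come from the disclosure.

```python
# Codeword positions 1..71; powers of two hold check bits, the rest hold data bits.
DATA_POSITIONS = [pos for pos in range(1, 72) if pos & (pos - 1)]
assert len(DATA_POSITIONS) == 64


def _hamming(word):
    """XOR of the codeword positions of all set data bits (7-bit value)."""
    h = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> i) & 1:
            h ^= pos
    return h


def _parity(x):
    return bin(x).count("1") & 1


def ecc_byte(word):
    """Bits 0-6: Hamming check bits. Bit 7: overall parity of data and check bits."""
    h = _hamming(word)
    return h | ((_parity(word) ^ _parity(h)) << 7)


def check_word(word, ecc):
    """Return (status, word, ecc) with status 'ok', 'single' or 'multi'.

    For 'single', the returned word and ECC byte are the corrected values.
    """
    syndrome = _hamming(word) ^ (ecc & 0x7F)
    odd = _parity(word) ^ _parity(ecc)      # 1 when an odd number of bits flipped
    if syndrome == 0 and not odd:
        return "ok", word, ecc
    if odd:
        if syndrome == 0:                   # the overall parity bit itself flipped
            return "single", word, ecc ^ 0x80
        if (syndrome & (syndrome - 1)) == 0:  # one Hamming check bit flipped
            return "single", word, ecc ^ syndrome
        if syndrome in DATA_POSITIONS:      # one data bit flipped
            return "single", word ^ (1 << DATA_POSITIONS.index(syndrome)), ecc
    return "multi", word, ecc               # detected but not correctable here


w = 0x0123456789ABCDEF
e = ecc_byte(w)
print(check_word(w, e)[0])                  # clean word
print(check_word(w ^ (1 << 17), e)[0])      # one flipped data bit
print(check_word(w ^ 0b11, e)[0])           # two flipped data bits
```

The three statuses correspond to the three outcomes the read method distinguishes when a word is checked against its ECC byte. A hardware implementation would likely compute the same quantities with XOR trees rather than loops.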

Referring now to FIG. 3, a method of writing data in a memory device, in accordance with one embodiment, is shown. The method includes computing one or more error detection/correction bytes for each word in a block and one or more error detection words from all the words in the block. The error detection/correction bytes are stored in the respective error detection/correction extension corresponding to the words in the block. The error detection word is stored in an error detection extension corresponding to the block. In addition, an error detection/correction byte is computed for the error detection word and stored in a corresponding error detection/correction byte extension. In one implementation, the method of writing data includes computing and storing one or more ECC bytes and one or more ECC words.

More specifically, the method includes writing one or more words in a given block of a memory device, at 310. At 320, an ECC byte is computed for each of the words that are written in the block. The ECC byte may be computed in accordance with any DED-SEC technique. Each ECC byte is written to a corresponding portion of the ECC byte extension of the given block, at 330. In one implementation, the ECC byte extension of the given block may be located adjacent to the block and arranged along the rows of the block. In other embodiments, the ECC byte extension may be organized based on a plurality of blocks, one or more pages, one or more sectors, one or more banks, or the like.

At 340, parity bits are computed from all corresponding data bits of all the words in the block to generate an ECC word. That is, a column-wise exclusive-OR (XOR) is calculated for each of the respective data bits of the words 0 through m-1 of the block. The ECC word is written to the ECC word extension of the given block at 350. In one implementation, the ECC word extension of the given block may be located adjacent to the block and arranged as an additional row in the block. In another embodiment, the ECC word extension may be located adjacent to the block and arranged along the rows of the block by dividing the ECC word into m chunks, as described in more detail with reference to FIG. 5.

At 360, an ECC byte of the ECC word is computed. The ECC byte of the ECC word is written to the corresponding ECC byte extension, at 370. Accordingly, the bits corresponding to the ECC byte for the ECC word are the double-bit error detection single-bit error correction code of the parity bits forming the ECC word.
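The write steps 310 through 370 can be sketched as follows for a block of 64-bit words. This is a minimal illustration, not the disclosed circuit: the `ecc_byte` function is one assumed DED-SEC code (an extended Hamming code), positive (even) parity is assumed for the ECC word, and the returned dictionary stands in for the storage areas.

```python
from functools import reduce
from operator import xor

# Assumed code layout: positions 1..71, powers of two reserved for check bits.
DATA_POSITIONS = [pos for pos in range(1, 72) if pos & (pos - 1)]


def ecc_byte(word):
    """Illustrative 8-bit DED-SEC code: 7 Hamming bits plus overall parity."""
    h = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> i) & 1:
            h ^= pos
    overall = (bin(word).count("1") + bin(h).count("1")) & 1
    return h | (overall << 7)


def write_block(words):
    """Return everything that would be stored for one block."""
    ecc_bytes = [ecc_byte(w) for w in words]   # 320, 330: one ECC byte per word
    ecc_word = reduce(xor, words, 0)           # 340, 350: column-wise parity
    ecc_word_ecc = ecc_byte(ecc_word)          # 360, 370: ECC byte of the ECC word
    return {
        "words": list(words),
        "ecc_bytes": ecc_bytes,
        "ecc_word": ecc_word,
        "ecc_word_ecc": ecc_word_ecc,
    }


block = write_block([1, 2, 3, 4, 5, 6, 7, 8])
print(hex(block["ecc_word"]), [hex(b) for b in block["ecc_bytes"]])
```

Because each bit of the ECC word is the XOR of one column, the XOR of all data words together with the ECC word is zero for an undamaged block.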

Referring now to FIGS. 4A and 4B, a method of reading data in a memory device, in accordance with one embodiment, is shown. The method of reading data includes detecting data errors using the error detection/correction bytes. Single-bit errors within a data word are corrected using the error detection/correction bytes. In addition, a double-bit error in a single word of a block is corrected using the error detection word. In one implementation, the method of reading includes detecting data errors and correcting the errors using one or more ECC bytes and one or more ECC words.

More specifically, each word in a block, the corresponding ECC byte, and the ECC word for the block are read, at 405. At 410, each word is checked against the corresponding ECC byte. It is determined from the check whether: 1) the ECC byte indicates that there is no error in the corresponding word 415, 2) the ECC byte indicates a single-bit error in the corresponding word 420, or 3) the ECC byte indicates a multi-bit error in the corresponding word 425. The check, at 410, is repeated for each word read in the block 430.

If there are no errors in any of the words read in the block, at 435, then the read process is done, at 440. If the ECC byte indicates a single bit error in a given corresponding word, then the error in the given word is corrected according to the ECC algorithm that is utilized, at 420. At 445, the corrected quantity is stored back in the corresponding word, if there were no multiple-bit errors in any of the words read in the block. After the single-bit errors are corrected and stored back in the memory array, the read process is done 440, if there were no multiple-bit errors in any of the words.

If there are multi-bit errors in more than one word in the block, then the errors are uncorrectable, at 450. A report may be sent to the operating system, application that generated the read request, or the like, indicating that an uncorrectable memory read error has occurred.

At 455, if there is a multi-bit error in the ECC word, then the ECC word is recomputed from the data words. The re-computed ECC word is then stored back, at 460, and the read process is done, at 465.

If there is a multi-bit error in a single data word, then the corrected bits for all the other data words and the ECC word, and the uncorrected bits for the data word with the multi-bit error are used to correct the multi-bit error, at 470. In particular, the column parities from the data words are re-computed as if computing the ECC word anew, at 475. At 480, the re-computed ECC word is compared to the ECC word as read. For any bit position in which the re-computed ECC word differs from the ECC word as read, the corresponding bit in the data word with the multi-bit error is flipped, at 485. It is to be noted that the errors in the data word with multi-bit errors may all be in the ECC byte corresponding to the data word rather than the data bits. In such cases there may not be any data bits to flip.
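The correction at 470 through 485 can be sketched as below. The sketch assumes positive (even) parity for the ECC word, that single-bit errors in the other data words and in the ECC word have already been corrected, and that exactly one data word was flagged with a multi-bit error. The function name `repair_word` is hypothetical.

```python
from functools import reduce
from operator import xor


def repair_word(words, bad_index, ecc_word_read):
    """Correct the single data word that was flagged with a multi-bit error."""
    recomputed = reduce(xor, words, 0)     # 475: column parities computed anew
    diff = recomputed ^ ecc_word_read      # 480: bit positions that disagree
    repaired = list(words)
    repaired[bad_index] ^= diff            # 485: flip those bits in the flagged word
    return repaired


good = [0x1111111111111111 * (i + 1) for i in range(8)]
ecc_word = reduce(xor, good, 0)

damaged = list(good)
damaged[3] ^= 0b101                        # two flipped bits in word 3
print(repair_word(damaged, 3, ecc_word) == good)
```

When every error in the flagged word lies in its ECC byte rather than in its data bits, `diff` is zero and no data bit is flipped, which matches the case noted above.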

At 490, the newly-corrected data bits in the data word, with the multi-bit error, are used to re-compute the data word's ECC byte. The corrected data word that had the multi-bit error and the re-computed ECC byte are stored back, at 495. Once the corrected data word and re-computed ECC byte are stored the memory read process is done, at 497.
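The decisions of the read method can be summarized in a small dispatch function. This is a sketch only: the status strings, the action names and the function `plan_read` are hypothetical, and counting the ECC word together with the data words when testing for multi-bit errors in more than one word is an assumption rather than something the description states explicitly.

```python
def plan_read(data_status, ecc_word_status):
    """Choose the read-path action for one block.

    data_status: one of 'ok', 'single', 'multi' per data word (result of 410).
    ecc_word_status: the same classification for the ECC word.
    Returns (action, index_of_word_to_repair_or_None).
    """
    bad = [i for i, s in enumerate(data_status) if s == "multi"]
    multi_count = len(bad) + (1 if ecc_word_status == "multi" else 0)
    if multi_count > 1:
        return "uncorrectable", None            # 450: report the read error
    if ecc_word_status == "multi":
        return "recompute_ecc_word", None       # 455-465
    if bad:
        return "repair_word", bad[0]            # 470-497
    if "single" in data_status or ecc_word_status == "single":
        return "store_single_bit_fixes", None   # 420, 445
    return "done", None                         # 435, 440


print(plan_read(["ok"] * 8, "ok"))
print(plan_read(["ok", "single"] + ["ok"] * 6, "ok"))
print(plan_read(["ok", "multi"] + ["ok"] * 6, "ok"))
print(plan_read(["multi", "multi"] + ["ok"] * 6, "ok"))
```

Single-bit corrections are still applied to the affected words before the `recompute_ecc_word` and `repair_word` actions, since both rely on the corrected contents of the other words.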

Those skilled in the art appreciate that adding extra rows in the memory array to store the corresponding ECC word of each block complicates the address decoding. Accordingly, it may be advantageous to store the ECC word of each block as extensions of each word in the block. Referring now to FIG. 5, an exemplary memory device, in accordance with another embodiment, is shown. The memory device includes an array of memory cells 510, 520, 530, a row decoder 540, a column decoder 550 and error detection/correction logic 560. The memory cell array includes a data storage area 510 and an error detection/correction extension 520, 530.

The data storage area 510 includes an array of memory cells that are arranged in a plurality of blocks. Each block contains m words. Each word includes n bytes of p bits. A q-bit error detection/correction code is calculated for each word to produce an error detection/correction byte corresponding to the particular word. The error detection/correction byte of each word is stored in a first portion of the corresponding error detection/correction extension 520. An error detection bit is also calculated from each respective bit of a corresponding block of m words to produce an error detection word for each block. An error detection/correction byte is also calculated for the error detection word. The error detection word and the corresponding error detection/correction byte are divided into m chunks. The respective chunks of the error detection word and corresponding error detection/correction byte are stored in a second portion of the corresponding error detection/correction extension 530.

In an exemplary implementation, the data storage area 510 is organized in 64-bit units. Each 64-bit unit, referred to herein as a word, is arranged as eight 8-bit bytes and extended by an additional 8-bit ECC byte and an additional 9-bit ECC word chunk. The ECC word chunks include the ECC word and the ECC byte of the ECC word. The data bits arranged in eight words of eight 8-bit bytes, the corresponding ECC bytes and the ECC word chunks are illustrated in Table 2.

TABLE 2

[word 0]  byte 0  byte 1  . . .  byte 7  ECC byte  ECC word chunk 0
[word 1]  byte 0  byte 1  . . .  byte 7  ECC byte  ECC word chunk 1
[word 2]  byte 0  byte 1  . . .  byte 7  ECC byte  ECC word chunk 2
[word 3]  byte 0  byte 1  . . .  byte 7  ECC byte  ECC word chunk 3
[word 4]  byte 0  byte 1  . . .  byte 7  ECC byte  ECC word chunk 4
[word 5]  byte 0  byte 1  . . .  byte 7  ECC byte  ECC word chunk 5
[word 6]  byte 0  byte 1  . . .  byte 7  ECC byte  ECC word chunk 6
[word 7]  byte 0  byte 1  . . .  byte 7  ECC byte  ECC word chunk 7
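The 9-bit chunk size follows from dividing the 64-bit ECC word plus its 8-bit ECC byte (72 bits) across the eight words of a block. The sketch below shows one way to split and rejoin the chunks. The bit ordering (ECC byte in the top eight bits, chunk 0 holding the lowest nine bits) and the function names are assumptions made for illustration; the description does not specify an ordering.

```python
CHUNK_BITS = 9
CHUNKS = 8
WORD_BITS = 64


def split_chunks(ecc_word, ecc_word_ecc):
    """Split the 72-bit (ECC byte, ECC word) pair into eight 9-bit chunks."""
    combined = (ecc_word_ecc << WORD_BITS) | ecc_word
    mask = (1 << CHUNK_BITS) - 1
    return [(combined >> (CHUNK_BITS * i)) & mask for i in range(CHUNKS)]


def join_chunks(chunks):
    """Rebuild (ecc_word, ecc_word_ecc) from the chunks stored beside each word."""
    combined = 0
    for i, chunk in enumerate(chunks):
        combined |= chunk << (CHUNK_BITS * i)
    return combined & ((1 << WORD_BITS) - 1), combined >> WORD_BITS


chunks = split_chunks(0xFEDCBA9876543210, 0xA5)
print([hex(c) for c in chunks])
print(join_chunks(chunks) == (0xFEDCBA9876543210, 0xA5))
```

With this arrangement the ECC word occupies no extra row, so reading the eight rows of a block returns the data words, their ECC bytes and all of the ECC word chunks.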

Any conventional double-bit error detection (DED) single-bit error correction (SEC) code may be utilized to generate the error detection/correction bytes. Accordingly, the error detection/correction byte can be utilized to detect two-bit errors in the corresponding word and correct a single-bit error in the corresponding word. The bits of the error detection word are generated by the column-wise parity (e.g., XOR) of all the data bits for words 0-7. That is, bit 0 of byte 0 of the error detection word is the exclusive-OR of all the bits 0 of all the bytes 0 of all the data words. Implementations may use either positive or negative parity to generate the bits of the error detection word. The error detection/correction byte of the error detection word is computed using the selected DED-SEC algorithm applied to the XOR generated bytes of the error detection word.

In accordance with embodiments, single-bit errors due to soft errors do not become double-bit errors due to additional soft errors. In addition, although hard errors are not corrected, such errors are not aggravated either. The embodiments also advantageously utilize less of the memory cell array to detect and correct two-bit errors in a given block of memory. The embodiments also do not incur as much memory latency as conventional double-bit error detection and correction techniques.

The foregoing descriptions of specific embodiments have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the disclosure to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical application, to thereby enable others skilled in the art to best utilize the disclosure and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the scope of the disclosure be defined by the Claims appended hereto and their equivalents.

Claims

1. A memory device comprising:

a first type of error information based on a first group of data; and

a second type of error information based on a second group of the data, wherein the second group is greater than the first group.

2. The memory device of claim 1, wherein the first group is a word.

3. The memory device of claim 2, wherein the second group is a block of words.

4. The memory device of claim 1, wherein the first type of error information comprises error detection and correction information.

5. The memory device of claim 1, wherein the second type of error information comprises error detection information.

6. The memory device of claim 1, wherein the first type of error information comprises a byte of bits.

7. The memory device of claim 6, wherein the second type of error information comprises a word of bytes.

8. A method comprising:

determining a first type of error information by using a first group of data;

determining a second type of error information by using a second group of the data, wherein the second group is greater than the first group; and

storing the first type of error information and the second type of error information in a memory.

9. The method of claim 8, wherein the first group is a word.

10. The method of claim 9, wherein the second group is a block of words.

11. The method of claim 8, wherein the first type of error information comprises error detection and correction information.

12. The method of claim 8, wherein the second type of error information comprises error detection information.

13. The method of claim 8, wherein the first type of error information comprises a byte of bits.

14. The method of claim 13, wherein the second type of error information comprises a word of bytes.

15. A method comprising:

detecting an error in data, wherein said detecting includes: using a first type of error information that is calculated from a first group of the data, and using a second type of error information that is calculated from a second group of the data, wherein the second group is greater than the first group; and

correcting the error by using at least one of the first type of error information and the second type of error information.

16. The method of claim 15, wherein the first group is a word.

17. The method of claim 16, wherein the second group is a block of words.

18. The method of claim 15, wherein the first type of error information comprises error detection and correction information.

19. The method of claim 15, wherein the second type of error information comprises error detection information.

20. The method of claim 15, wherein the first type of error information comprises a byte of bits, and wherein the second type of error information comprises a word of bytes.