DOUBLE BIT ERROR CORRECTION IN A CODE WORD WITH A HAMMING DISTANCE OF THREE OR FOUR

Info

Publication number: 20150341055
Type: Application
Filed: Jun 26, 2013
Publication Date: Nov 26, 2015
Inventors: Samuel EVAIN (SACLAY), Valentin GHERMAN (PALAISEAU)
Application Number: 14/411,067

Abstract

A method for determining the erroneous bits in an initial binary word affected by a double error and arising from a code endowed with a minimum Hamming distance equal to 3 or 4 comprises reception of a datum indicative of a binary level of confidence, low or high, assigned to each of the bits of at least one part of the initial word, a step of generating the syndrome on the basis of the initial word and a step of determining whether the syndrome is that of a code word affected by a double error, in which if it identifies, on the basis of the syndrome, an error in the initial word whose two affected bits correspond to bits of low confidence in the initial word, the two erroneous bits are selected to be corrected. The method applies notably to the fields of error correcting codes and nanometric technologies.

Description

The subject of the invention is a method and a device for determining and correcting double bit errors in a code word provided with a minimum Hamming distance equal to only 3 or 4. The invention proposes a device judiciously utilizing confidence information about the bits of a code word affected by an error in one or two bits so as to correct this error. The invention applies notably to the fields of error correcting codes and nanometric technologies.

In a digital system, the data are customarily stored and/or transmitted in the form of binary values called bits. These data may be affected by faults which may give rise to operating errors and ultimately the failure of the system in the absence of any correction mechanism or masking.

In order to guarantee an acceptable level of integrity for the data stored or transmitted, certain electronic systems use codes, customarily designated by the acronym ECC standing for the expression “Error Correcting Codes”. For this purpose, the data are encoded during the writing of said data to a storage system or during the transmission of said data through a system of interconnections. During the encoding of the data with an ECC code, check bits are added to the data bits so as to form code words.

In order to allow the correction of errors which affect up to n bits in the words of an ECC code, these code words must be separated by a minimum Hamming distance of 2n+1. To allow, in addition to the correction of the errors affecting up to n bits, the detection of errors which affect n+1 bits, these code words must be separated by a minimum Hamming distance of 2n+2.
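As a rough illustration of these two distance relations, the following sketch computes, for a given minimum Hamming distance, how many bit errors can be corrected and how many can still be detected at the same time. The function name `ecc_capability` is a hypothetical helper introduced here; it does not appear in the application.

```python
def ecc_capability(d_min):
    """Return (correctable, detectable) error counts for a minimum distance d_min.

    correctable: largest n such that 2n + 1 <= d_min
    detectable:  largest error weight still detected while correcting up to n
    """
    correctable = (d_min - 1) // 2
    detectable = d_min - correctable - 1
    return correctable, detectable


for d in (3, 4, 5):
    print(d, ecc_capability(d))
```

Under these formulas, a distance of 3 or 4 guarantees only single error correction, which is why additional information is needed to handle double errors with such codes.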

The code words of a linear corrector code are defined with the aid of a parity-check matrix H, also called a control matrix. A binary vector V is a code word if and only if its product with the matrix H generates a zero vector. A word V of a linear corrector code is checked by evaluating the matrix product H•V, computed modulo 2. The result of this operation is a vector called a syndrome. If the syndrome is a zero vector, the word V is considered to be correct. A non-zero syndrome indicates the presence of at least one error. If the syndrome makes it possible to identify the positions of the affected bits, the errors affecting the code word can be corrected.
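The syndrome check can be sketched as follows. The matrix used here is an assumption chosen for illustration: it is a common form of the matrix of a Hamming(7,4) code, in which column j (counted from 1) is the binary representation of j. The names `H` and `syndrome` are hypothetical.

```python
# Illustrative matrix of a Hamming(7,4) code (an assumption, not taken from
# the application): column j, counted from 1, is the binary value of j.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(matrix, word):
    """Compute the matrix product H.V over GF(2), one syndrome bit per row."""
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in matrix]


codeword = [1, 1, 1, 0, 0, 0, 0]   # columns 1, 2 and 3 sum to zero modulo 2
received = list(codeword)
received[4] ^= 1                   # single error on the bit at index 4

print(syndrome(H, codeword))       # zero vector: the word is a code word
print(syndrome(H, received))       # equals column index 4 of H
```

With this particular matrix, the non-zero syndrome equals the column of H at the erroneous position, which is what allows a single error to be located.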

Various linear ECC codes exist with different error detection and correction capabilities. By way of example, a Hamming code makes it possible to correct a single error, that is to say an error which affects a single bit. This correction capability is dubbed SEC, the acronym standing for the expression “Single Error Correction”. The words of an SEC code are separated by a minimum Hamming distance of 3.

Another example of ECC code is the DEC code, the acronym standing for the expression “Double Error Correction”, which allows the correction of double errors, that is to say of errors affecting two bits in a code word. Quite obviously, the codes of this family are also capable of correcting all single errors. The words of a DEC code are separated by a minimum Hamming distance of 5.

This difference between the correction capability of an SEC code and that of a DEC code is due to an increase in the number of check bits in the case of a DEC code. For example, for eight data bits an SEC code needs four check bits while a DEC code needs eight check bits. Such a 100% increase in the number of check bits may be prohibitive because of the impact on the area of the components, their efficiency, their consumption and ultimately their cost.
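The figure of four check bits for eight data bits can be checked with the usual Hamming condition for an SEC code: r check bits must distinguish the error-free case from a single error in any of the k + r positions, so that 2^r >= k + r + 1. The sketch below applies this condition; the function name `sec_check_bits` is hypothetical, and the DEC figure quoted above is not derived here.

```python
def sec_check_bits(k):
    """Smallest r such that 2**r >= k + r + 1 (single error correction)."""
    r = 1
    while 2 ** r < k + r + 1:
        r += 1
    return r


for k in (4, 8, 64):
    print(k, sec_check_bits(k))
```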

Now, memories manufactured with the aid of emerging technologies, such as MRAMs, the acronym standing for the expression “Magnetic Random Access Memory”, may be affected by a significant rate of transient and/or permanent errors. For these types of memories, the correction capability of SEC codes is too low and DEC codes remain too expensive. In such situations, innovative masking solutions and/or corrections are necessary.

An aim of the invention is notably to propose a scheme for determining the erroneous bits of a double error, that is to say an error in two bits, in a code word whose minimum Hamming distance is less than 5. For this purpose, the subject of the invention is a method for determining the erroneous bits in an initial binary word affected by a double error and arising from a code endowed with a minimum Hamming distance equal to 3 or 4, comprising the reception of a datum indicative of a binary level of confidence, low or high, assigned to each of the bits of at least one part of the initial word, the method comprising a step of generating the syndrome on the basis of the initial word and a step of determining whether said syndrome is that of a code word affected by a double error, characterized in that if it identifies, on the basis of the syndrome, a double error in the initial word whose two affected bits correspond to bits of low confidence in the initial word, said two erroneous bits are selected so as to be corrected and in that if no double error generating said syndrome affects two bits of low confidence in the initial word, then a double error, for which one of the two bits affected is a bit of low confidence in the initial word, is identified.

The method thus makes it possible to correct double errors for a code provided with a minimum Hamming distance of 3 or 4 by identifying, out of all the possible double errors that could have produced the initial word, a double error whose affected bits both correspond to bits of low confidence. The second step, in which a double error with only one bit of low confidence is sought, makes it possible to discriminate between the errors when the most favorable case is not realized, that is to say when, out of all the double errors that could have produced the initial word, none affects two bits of low confidence. In this case, the method searches for a double error, one of whose two bits has a low confidence level.

Advantageously, the method comprises a step of determining whether the number of bits of low confidence is strictly greater than 1:

- in the case where this number is strictly greater than 1, a double error generating said syndrome and affecting two bits of low confidence in the initial word is sought;
- in the case where this number is equal to 1, a double error generating said syndrome and affecting the bit of low confidence in the initial word is sought;
  if such an error is found, the two bits affected by this error are selected so as to correct them.

The subject of the invention is also a method for correcting a double error in an initial binary word comprising the steps of the determining method such as is described above, and a step of inverting the erroneous bit or bits determined by said determining method.

The subject of the invention is also a method for correcting the erroneous bits in an initial binary word affected by a single error or by a double error and arising from a code endowed with a minimum Hamming distance equal to 3 or 4, in which, when the syndrome arising from the initial word corresponds to a double error, the method for correcting a double error such as presented above is executed, and in which, when said syndrome corresponds to a single error, a correction of single error in the initial word is executed.

The subject of the invention is also a device for correcting the erroneous bits in a binary initial word affected by a double or single error and arising from a code endowed with a minimum Hamming distance equal to 3 or 4, the device being able to receive a datum indicative of a binary level of confidence, low or high, assigned to each of the bits of at least one part of the initial word, the device comprising a syndrome generator fed with the initial word and means for selectively correcting bits of the initial word, characterized in that the means for selectively correcting bits are fed by a module for selecting the bits to be inverted which is able to select the two bits affected by a double error in the initial word as a function of said datum indicative of confidence, of the response of a module indicating whether there is more than one bit of low confidence level in the binary initial word and of the syndrome arising from said syndrome generator, said selecting module being able to identify a double error in the initial word at least one of whose two affected bits corresponds to a bit of low confidence in the initial word, and in that the means for selectively correcting bits are also fed by a module for selecting the bits to be inverted which is suitable for selecting the bit affected by a single error in the initial word as a function of the syndrome arising from the syndrome generator.

The subject of the invention is also a memory module comprising a correcting device such as described above.

Other characteristics and advantages of the invention will become apparent with the aid of the description which follows, given by way of nonlimiting illustration and with regard to the appended drawings among which:

FIG. 1 is an exemplary device according to the invention allowing notably the correction of single and double errors in a word of a code characterized by a minimum Hamming distance equal to 3 or 4;

FIG. 2 is an exemplary method according to the invention for selecting a double error from among several double errors producing one and the same syndrome.

FIG. 1 presents an exemplary device according to the invention allowing notably the correction of single and double errors in a word of a code provided with a minimum Hamming distance equal to 3 or 4; stated otherwise, the smallest Hamming distance separating two words of this code is equal to 3 or 4.

Hereinafter, it is considered that a datum indicative of confidence, in the example in the form of a word 102, is used to indicate a binary level of confidence of each bit in a code word 101, termed the initial word 101, at the input of the device according to the invention. In the examples developed, the initial word 101 and the confidence word 102 comprise the same number of bits and certain bits in the initial word 101 may be erroneous. Without affecting the generality of the invention, it is also considered subsequently that in the confidence word 102, a bit with the logic-1 value signals that the corresponding bit in the initial word 101 has a low confidence level, a bit with the logic-0 value signaling a bit with a high confidence level. The binary initial word may be affected by a double or single error.
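The convention just described can be sketched as follows, with the confidence word written as a string of bits. The helper name `low_confidence_positions` and the indexing convention (positions counted from the left, starting at 0) are assumptions made for this illustration.

```python
def low_confidence_positions(confidence_word):
    """Return the positions whose confidence bit is 1, i.e. low confidence."""
    return [i for i, bit in enumerate(confidence_word) if bit == "1"]


initial_word = "00110101"
confidence_word = "01100010"

# Both words carry the same number of bits, as in the examples developed.
assert len(initial_word) == len(confidence_word)
print(low_confidence_positions(confidence_word))
```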

The device 100 receives on a first input 100a an initial word 101 and on a second input 100b a confidence word 102 comprising the binary information relating to the confidence level of each bit in the initial word 101. This confidence word 102 is used to allow the correction of double errors which generate a different syndrome to all the syndromes produced by a single error. The device 100 delivers a corrected word 104 as output 100c.

The device according to the invention 100 comprises a syndrome generating module 110 fed with the initial word 101, a single error selecting module 140, a double error selecting module 130, a module for correcting errors 150, as well as a test module 120 able to determine whether the number of bits of low confidence in the initial word is strictly greater than 1.

The syndrome generating module 110 receives as input the initial word 101 and generates a syndrome on the basis of the bits of the initial word 101. In the case where a linear ECC is used, this module performs the matrix product H•V, where H is the parity matrix characterizing the code to which the initial word 101 belongs and V is the initial word 101. The implementation of the calculation of this matrix product is carried out according to known techniques. When the initial word 101 does not contain any error, the syndrome generating module 110 produces a syndrome 103 which is a zero vector; on the other hand, when the initial word 101 contains an error in at least 1 bit, then the syndrome 103 is a non-zero vector. If the minimum Hamming distance of the code that served for the generation of the initial word 101 is equal to 4, then all the double errors produce different syndromes to the syndromes produced by the single errors. If the code to which the initial word 101 belongs is imperfect and characterized by a minimum Hamming distance equal to 3, then a non-zero fraction of the double errors produces different syndromes to the syndromes produced by single errors.
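The relation between single-error and double-error syndromes can be checked by enumeration on small codes. The sketch below uses two illustrative matrices that are assumptions, not codes named in the application: a perfect Hamming(7,4) code (distance 3) and an extended Hamming(8,4) code (distance 4). A single error produces the column of H at its position, and a double error produces the sum modulo 2 of two columns.

```python
from itertools import combinations


def xor(a, b):
    """Bitwise sum modulo 2 of two columns of H."""
    return tuple(p ^ q for p, q in zip(a, b))


def syndromes(columns):
    """Return the sets of syndromes produced by single and by double errors."""
    single = set(columns)
    double = {xor(columns[i], columns[j])
              for i, j in combinations(range(len(columns)), 2)}
    return single, double


# Hamming(7,4), distance 3, perfect: columns are the 3-bit values 1..7.
HAMMING_7_4 = [((x >> 2) & 1, (x >> 1) & 1, x & 1) for x in range(1, 8)]
# Extended Hamming(8,4), distance 4: a leading 1 precedes the 3-bit values 0..7.
EXT_HAMMING_8_4 = [(1, (x >> 2) & 1, (x >> 1) & 1, x & 1) for x in range(8)]

s3, d3 = syndromes(HAMMING_7_4)
s4, d4 = syndromes(EXT_HAMMING_8_4)
print(d3 <= s3)            # every double-error syndrome is also a single-error one
print(s4.isdisjoint(d4))   # no double-error syndrome is a single-error one
```

The perfect distance-3 code is the extreme case in which no double error is distinguishable from a single error; an imperfect distance-3 code lies between these two situations.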

The test module 120 is suitable for receiving the confidence indicator word 102 and for producing a binary signal as output. In the example, the test module 120 indicates via the output signal whether this word 102 contains more than one bit equal to a logic-1, thereby signaling the presence of more than one bit with a low confidence level in the initial word 101. A person skilled in the art can choose from among several possible implementations of such a module. According to another embodiment of the device according to the invention, the device does not comprise any distinct test module 120, the test being performed by the double error selecting module 130.
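One possible software analogue of this test, given only as an assumption about how it might be done, treats the confidence word as an integer and clears its lowest set bit: the result is non-zero exactly when at least two bits are set. The function name is hypothetical.

```python
def more_than_one_low_confidence(confidence):
    """True when the integer confidence word has at least two bits set to 1."""
    # confidence & (confidence - 1) clears the lowest set bit; anything left
    # over means that a second bit was set.
    return confidence & (confidence - 1) != 0


print(more_than_one_low_confidence(0b00000000))
print(more_than_one_low_confidence(0b00000010))
print(more_than_one_low_confidence(0b01100010))
```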

The double error selecting module 130 is configured to receive the confidence indicator word 102, the syndrome 103 produced by the module 110 described above and the binary output of the module 120. When the syndrome 103 produced by the module 110 is different from all the syndromes produced by a single error and when the number, indicated by the word 102, of bits of low confidence is greater than 0, the module 130 makes it possible, by utilizing the information item contained in the confidence indicator word 102, to determine which one out of the double errors possibly contained in the initial word 101 has occurred. An exemplary method implemented by the double error selecting module 130 is described further on with regard to FIG. 2.

The single error selecting module 140, linked at output of the syndrome generating module 110, makes it possible to identify a single error in the case where the syndrome calculated by the module 110 corresponds to such an error. The module 140 is suitable for selecting the bit affected by a single error in the initial word as a function of the syndrome arising from said syndrome generator. The implementation of such a module 140 is known to the person skilled in the art.

The error correcting module 150, receiving the initial word 101 and fed by the single error selecting module 140 and by the double error selecting module 130, allows the correction of single or double errors affecting the initial word 101 by using the information items provided by one of these two modules 130, 140 indicating the position of the erroneous bits. For example, if the single error selecting module 140 indicates that the bit of order n is erroneous, the error correcting module 150 performs the correction by inverting the value of this bit, toggling it from 0 to 1 or from 1 to 0 according to its value in the initial word 101. Thus, if this erroneous bit is 1, it is toggled to the value 0. The implementation of such a module 150 is known to the person skilled in the art. The correcting module 150 delivers the corrected word 104 as output.
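The inversion performed by the correcting module can be sketched as follows, with the word written as a string of bits. The helper name `invert_bits` and the indexing convention (positions counted from the left, starting at 0) are assumptions made for this illustration.

```python
def invert_bits(word, positions):
    """Return the word with the bits at the selected positions inverted."""
    bits = list(word)
    for p in positions:
        bits[p] = "0" if bits[p] == "1" else "1"
    return "".join(bits)


print(invert_bits("00110101", [2]))      # single error selected
print(invert_bits("00110101", [2, 6]))   # double error selected
print(invert_bits("00110101", []))       # no error selected: word unchanged
```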

The assembly formed by the syndrome generating module 110, the single error selecting module 140 and the error correcting module 150 fulfils the functions of a module of SEC (“Single Error Correction”) type, known elsewhere.

In the case where the minimum Hamming distance of the code to which the word 101 belongs is equal to 4 and all the double errors produce different syndromes to the syndromes produced by single errors, the device of FIG. 1 allows the correction of all the double errors.

FIG. 2 presents an exemplary method according to the invention for selecting a double error from among several double errors producing one and the same syndrome. For example, this method can be executed by the double error selecting module 130, with the aid of the test module 120, these modules being presented above with regard to FIG. 1. This method is necessary only if the same syndrome 103 can be generated by several different double errors, this generally being the case when the minimum Hamming distance between the words of the code is less than 5.
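The ambiguity that this method resolves can be made visible by grouping all double errors by the syndrome they produce. The sketch below does so for an illustrative extended Hamming(8,4) code of distance 4; the matrix columns and the names used are assumptions, not taken from the application.

```python
from collections import defaultdict
from itertools import combinations

# Columns of an illustrative matrix H: a leading 1 precedes the 3-bit values 0..7.
COLUMNS = [(1, (x >> 2) & 1, (x >> 1) & 1, x & 1) for x in range(8)]


def double_errors_by_syndrome(columns):
    """Map each syndrome to the list of position pairs that generate it."""
    groups = defaultdict(list)
    for i, j in combinations(range(len(columns)), 2):
        s = tuple(a ^ b for a, b in zip(columns[i], columns[j]))
        groups[s].append((i, j))
    return dict(groups)


groups = double_errors_by_syndrome(COLUMNS)
for s, pairs in sorted(groups.items()):
    print(s, pairs)
```

For this code, the 28 possible double errors share 7 syndromes, so each syndrome is compatible with 4 different double errors and cannot identify the erroneous bits on its own.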

The method utilizes the confidence indicator word 102 and the number of bits which suffer a low confidence level so as to select a double error in the initial word 101 from among several double errors generating the same syndrome 103.

In a first step 210, a double error is not selected if the syndrome 103 does not correspond to a detectable double error or if there is no bit with a low confidence level in the initial word 101. A double error is considered to be detectable if it generates a different syndrome to all the syndromes produced by a single error. In the example, this first step performs the following test: does the syndrome 103 generated by the module 110 correspond to a detectable double error and is there at least one bit with a low confidence level according to the datum indicative of confidence 102? It should be noted that the test determining whether there is at least one bit of low confidence in the initial word can be executed by the modules 120 or 130 illustrated in FIG. 1. The first step 210 also indicates that only double errors which correspond to the syndrome 103 are considered. When the test of the first step 210 is positive, we proceed to a second step 220. In the converse case, no double error is indicated 260.

The second step 220 performs a test on the number of bits with a low confidence level in the initial word 101. In the case where this second test step 220 indicates that the initial word 101 does not contain more than one bit with a low confidence level, we proceed to a third step 240 performing the following test: does there exist a double error corresponding to the syndrome 103 and one of whose two bits has a low confidence level? Stated otherwise, does there exist a pair {code word, double error}, for which the code word affected by this double error generates the syndrome 103, and for which one bit out of the two bits affected by the error corresponds to the bit whose confidence level is signaled as low in the initial word 101?

In the case where the second test step 220 indicates that the initial word 101 contains more than one bit with a low confidence level, it is followed by an alternative third test step 230 making it possible to choose a double error if the two bits that it affects have a low confidence level. The test performed is the following: does there exist a double error whose two bits correspond to bits of low confidence in the initial word 101? Stated otherwise, does there exist a pair {code word, double error}, for which the code word affected by this double error generates the syndrome 103, and for which the two bits affected by the error correspond to bits whose confidence level is signaled as low in the initial word 101? This case corresponds to the most favorable case for choosing a double error, since there is agreement between two bits of low confidence and at least one of the double errors from which the syndrome 103 arising from the initial word 101 could originate. For example, for an initial word equal to “00110101” and a confidence indicator word equal to “01100010”, if there exists a word of the code from among the following words: “01010101”, “01110111”, “00010111”, then this code word is considered to be the error-free word from which the initial word originates. In the example, if “00010111” is a code word, the two bits in which it differs from the initial word, namely the third and the seventh bits counted from the left, are considered to be the erroneous bits and are therefore selected.
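The three candidate words of this example can be reproduced by inverting every pair of low-confidence bits of the initial word. The sketch below is an illustration only; the function name `candidate_words` and the indexing convention (positions counted from the left, starting at 0) are assumptions.

```python
from itertools import combinations


def candidate_words(initial_word, confidence_word):
    """List the words obtained by inverting each pair of low-confidence bits."""
    low = [i for i, bit in enumerate(confidence_word) if bit == "1"]
    candidates = []
    for pair in combinations(low, 2):
        bits = list(initial_word)
        for p in pair:
            bits[p] = "0" if bits[p] == "1" else "1"
        candidates.append("".join(bits))
    return candidates


print(candidate_words("00110101", "01100010"))
```

Whether one of these candidates is actually a code word depends on the code in use, which the example in the text leaves unspecified.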

For these two third steps 230, 240, when the result of the test is positive, the double error is indicated 250. When the result is negative, no double error is indicated 260.
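The flow of steps 210 to 260 can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the implementation of the module 130: the code is an illustrative extended Hamming(8,4) code given by the columns of its matrix H, the names `COLUMNS` and `select_double_error` are hypothetical, and when several double errors pass the test the sketch simply returns the first one, a case on which the description gives no rule.

```python
from itertools import combinations

# Columns of an illustrative matrix H (assumption): extended Hamming(8,4).
COLUMNS = [(1, (x >> 2) & 1, (x >> 1) & 1, x & 1) for x in range(8)]


def select_double_error(columns, syndrome, low_conf):
    """Return the pair of positions to invert, or None (no double error, 260).

    columns:  columns of H, one tuple per bit position
    syndrome: syndrome of the initial word, as a tuple
    low_conf: set of positions signaled as low confidence
    """
    # Only double errors generating the syndrome are considered.
    candidates = [(i, j) for i, j in combinations(range(len(columns)), 2)
                  if tuple(a ^ b for a, b in zip(columns[i], columns[j])) == syndrome]
    # Step 210: detectable double error and at least one low-confidence bit.
    detectable = bool(candidates) and syndrome not in set(columns)
    if not detectable or not low_conf:
        return None
    # Step 220: more than one low-confidence bit?
    if len(low_conf) > 1:
        # Step 230: both affected bits must have low confidence.
        matches = [c for c in candidates if c[0] in low_conf and c[1] in low_conf]
    else:
        # Step 240: one affected bit must be the low-confidence bit.
        matches = [c for c in candidates if c[0] in low_conf or c[1] in low_conf]
    # Step 250 when a match exists (first match: an assumption), else 260.
    return matches[0] if matches else None


s = (0, 1, 0, 0)   # syndrome of a double error on positions 2 and 6
print(select_double_error(COLUMNS, s, {1, 2, 6}))
print(select_double_error(COLUMNS, s, {6}))
print(select_double_error(COLUMNS, s, set()))
```

In this sketch the tests are evaluated one after the other for readability; nothing in it prevents the candidate search and the confidence tests from being evaluated in parallel.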

In a hardware implementation, these steps, described above as following one another sequentially, or subsets formed by some of these steps, may be executed concurrently.

Claims

1. A method for determining the erroneous bits in an initial binary word affected by a double error and arising from a code endowed with a minimum Hamming distance equal to 3 or 4, comprising the reception of a datum indicative of a binary level of confidence, low or high, assigned to each of the bits of at least one part of the initial word, the method comprising a step of generating the syndrome on the basis of the initial word and a step of determining whether said syndrome is that of a code word affected by a double error, comprising at least the following steps:

a test step on the number of bits with a low confidence level in the initial word

if the test indicates that said initial word contains more than one bit suffering a low confidence level, then a third step is executed,

does there exist a pair {code word, double error}, for which the code word affected by this double error generates said syndrome, and for which the two bits affected by the error correspond to bits whose confidence level is signaled as low in said initial word, if yes then a double error in said initial word whose two affected bits correspond to bits of low confidence in said initial word is chosen on the basis of said syndrome, said two erroneous bits are selected so as to be corrected,

if the test indicates that said initial word does not contain more than one bit with a low confidence level, a step is executed which performs the following test: does there exist a pair {code word, double error}, for which the code word affected by this double error generates said syndrome, and for which one bit out of the two bits affected by the error corresponds to the bit whose confidence level is signaled as low in said initial word, if no double error generating said syndrome affects two bits of low confidence in said initial word, then a double error, for which one of the two bits affected is a bit of low confidence in said initial word, is identified.

2. The method as claimed in claim 1, in which the method comprises a step for determining whether the number of bits of low confidence is strictly greater than 1;

in the case where this number is strictly greater than 1, a double error generating said syndrome and affecting two bits of low confidence in the initial word is sought;

in the case where this number is equal to 1, a double error generating said syndrome affecting the bit of low confidence in the initial word is sought;

if a double error generates a different syndrome to all the syndromes produced by a single error, the two bits affected by this error are selected so as to correct them.

3. The method as claimed in claim 1, in which the method comprises a prior test step where it is checked whether said syndrome arising from the initial word corresponds to that of a double error generating a different syndrome to all the syndromes produced by a single error and whether at least one bit of the initial word suffers a low confidence level; if the test is negative, no erroneous bit is indicated.

4. A method for correcting a double error in an initial binary word (101) comprising the steps of the determining method as claimed in claim 1, and a step of inverting the selected erroneous bit or bits.

5. A method for correcting the erroneous bits in an initial binary word affected by a single error or by a double error and arising from a code endowed with a minimum Hamming distance equal to 3 or 4, in which when the syndrome arising from the initial word corresponds to a double error, the method for correcting a double error as claimed in claim 4 is executed, and in which when said syndrome corresponds to a single error, a correction of single error in the initial word is executed.

6. A device for correcting the erroneous bits in a binary initial word affected by a double or single error and arising from a code endowed with a minimum Hamming distance equal to 3 or 4, the device being able to execute the steps of the method as claimed in claim 1 and to receive a datum indicative of a binary level of confidence, low or high, assigned to each of the bits of at least one part of the initial word, the device comprising a syndrome generator fed with the initial word and means for selectively correcting bits of the initial word, comprising:

said means for selectively correcting bits are fed by a module for selecting the bits to be inverted which is able to select the two bits affected by a double error in the initial word as a function of said datum indicative of confidence, of the response of a module indicating whether there is more than only one bit of low confidence level in the word and of the syndrome arising from said syndrome generator,

said selecting module being able to identify a double error in the initial word at least one of whose two affected bits correspond to bits of low confidence in the initial word, and if no double error generating said syndrome affects two bits of low confidence in the initial word, then the selecting module identifies a double error, for which one of the two bits affected is a bit of low confidence in the initial word,

the means for selectively correcting bits are fed by a module for selecting the bits to be inverted which is suitable for selecting the bit affected by a single error in the initial word as a function of the syndrome arising from said syndrome generator.

7. A memory module comprising a correcting device as claimed in claim 6.