Storage device with data quality metric and selectable data recovery scheme
A storage device is disclosed. The storage device may include storage for data. A controller may manage writing the data to the storage and reading the data from the storage. A data quality metric table may map a first number of errors to a first data quality metric and map a second number of errors to a second data quality metric. A transmitter may return the data quality metric table to a host.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/130,399, filed Dec. 23, 2020, which is incorporated by reference herein for all purposes.
FIELD
The inventive concepts relate generally to storage devices, and more particularly to improving error correction processing.
BACKGROUND
Host machines may write data to and read data from storage devices. On occasion, when processing a host read request, a storage device may detect an error. Error correcting codes provide a mechanism by which such errors might be corrected. But applying error correction may take additional time: the storage device may end up taking more time to respond to the read request than is acceptable to the host. In addition, error correction techniques consume power, increasing the operating expense for the storage device.
A need remains to improve the performance of a storage device to reduce the time required to perform error correction and reduce the power used by the storage device.
Reference will now be made in detail to embodiments of the inventive concept, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth to enable a thorough understanding of the inventive concept. It should be understood, however, that persons having ordinary skill in the art may practice the inventive concept without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first module could be termed a second module, and, similarly, a second module could be termed a first module, without departing from the scope of the inventive concept.
The terminology used in the description of the inventive concept herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the inventive concept. As used in the description of the inventive concept and the appended claims, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The components and features of the drawings are not necessarily drawn to scale.
Data recovery from Not-And (NAND) flash chips inside a storage device, such as a Solid State Drive (SSD), may be done by error correction coding (ECC). Based on the ECC decoding results, an SSD only reports either a correction or a failure to correct data in a block in which an error was detected.
Some applications may tolerate data errors up to a pre-determined rate. In addition, a protocol, such as Non-Volatile Memory Express (NVMe), may support read recovery levels, which allow a storage device to be configured with the level of effort to apply when attempting to recover data from a block with an error (or when to return an error). With this data quality feature, the estimated time to recover data from a block with an error may be computed in advance based on the protocol's read recovery level requirements, without attempting to use a recovery scheme that may have a long latency and/or require additional power to execute.
Non-Volatile Memory (NVM) may support a configurable Read Recovery Level (RRL) attribute that balances the completion time for read commands with the amount of error recovery to be applied to those read commands (when an error occurs). The RRL attribute may be applied to an NVM Set associated with the RRL attribute. A namespace associated with an NVM Set may inherit the RRL of that NVM Set. If NVM Sets are not supported, all namespaces in the NVM subsystem may use an identical RRL. Table 1 shows an example RRL table (“O” stands for “optional”, “M” stands for “mandatory”).
A storage device may include a data quality feature implementation. In a storage device without the data quality metric, when attempting to read the data, the storage device may need to perform a brute force method or simplified brute force method to try all error recovery schemes based on the error recovery enhancement order. A brute force method may be to try error recovery schemes in a sequence (for example, from fastest to slowest) until either some scheme results in a successful error recovery or all error recovery schemes have been tried. A simplified brute force method may be to try error recovery schemes in a sequence (for example, from fastest to slowest) until either some scheme results in a successful error recovery or the time spent trying to read and recover the data has exceeded the available time (that is, the latency the host expects from the storage device). Trying all error recovery schemes in this manner may require a long latency and additional power consumption, especially for poor data quality.
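The two methods described above may be sketched as follows. This is a minimal illustration, not the implementation of any particular storage device: the `Scheme` type, the function names, and the injectable `clock` parameter are assumptions introduced here, and each scheme is modeled as a callable that returns corrected bytes or `None` on failure.

```python
import time
from typing import Callable, Optional, Sequence

# Hypothetical type: a recovery scheme takes the raw bytes and returns the
# corrected bytes, or None if it could not correct them.
Scheme = Callable[[bytes], Optional[bytes]]


def brute_force(raw: bytes, schemes: Sequence[Scheme]) -> Optional[bytes]:
    """Try each scheme in order until one succeeds or all have been tried."""
    for scheme in schemes:
        result = scheme(raw)
        if result is not None:
            return result
    return None


def simplified_brute_force(
    raw: bytes,
    schemes: Sequence[Scheme],
    budget_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[bytes]:
    """Like brute_force, but give up once the latency budget has been spent."""
    start = clock()
    for scheme in schemes:
        if clock() - start >= budget_s:
            break  # the host's expected latency has been exceeded
        result = scheme(raw)
        if result is not None:
            return result
    return None
```

In both functions the caller is assumed to pass `schemes` already ordered from fastest to slowest, so the cost of a failed read is the sum of every scheme attempted.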
In contrast, in a storage device with the data quality metric, the RRL table may be built during namespace creation in the admin command stage. The RRL of each namespace may be set by the host using protocol feature commands. For Quality of Service (QoS) control purposes, a host may issue verify( ) or get_LBA_status( ) commands, in response to which the storage device data quality metric estimator may return the data quality metric level to the host quickly without applying a real data recovery procedure (which may have a long latency and/or require additional power consumption to execute). During the normal read( ) command stage, a first data quality metric estimator module may calculate data quality metrics. Based on the data quality metrics and the RRL attribute, the error recovery mode selector may select an error recovery scheme to be used. Because the RRL attribute and the data quality metrics may identify a single error recovery scheme to be used, the storage device may avoid having to try multiple error recovery schemes in a brute force method or a simplified brute force method to attempt to find an error recovery scheme that may succeed. For example, for data with the poorest data quality, the storage device may know error recovery is unlikely to succeed or may exceed the RRL attribute. In such cases, the storage device may return the data directly to the host and bypass error recovery.
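The read-path decision described above may be illustrated with a small sketch. All names and values here are hypothetical: the quality level names, the scheme names, and the numeric RRL keys are placeholders chosen for illustration and do not follow the NVMe numbering or any table in this disclosure.

```python
from typing import Optional, Tuple

# Hypothetical: the single recovery scheme associated with each quality level.
SCHEME_FOR_LEVEL = {
    "perfect": None,            # no errors, nothing to recover
    "high": "hard_decode",
    "medium": "soft_decode",
    "poor": "full_rebuild",
}

# Hypothetical: which schemes each read recovery level permits.
ALLOWED_BY_RRL = {
    0: {"hard_decode"},
    1: {"hard_decode", "soft_decode"},
}


def handle_read(level: str, rrl: int) -> Tuple[str, Optional[str]]:
    """Pick one action for a read, given the estimated level and the RRL."""
    scheme = SCHEME_FOR_LEVEL[level]
    if scheme is None:
        return ("return_data", None)   # data is clean
    if scheme in ALLOWED_BY_RRL[rrl]:
        return ("recover", scheme)     # exactly one scheme is run
    return ("bypass", None)            # recovery would exceed the RRL
```

Under these assumptions, a "medium" read in a namespace with RRL 0 is bypassed rather than run through progressively slower schemes, which is the saving the paragraph above describes.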
Data Quality Estimation Algorithm
For each ECC, there may be a parity check matrix H associated with it. For a code of (N,K) with N as the code length and K the data length, H may be a matrix with N-K independent rows and N columns as shown below:
An ECC codeword C=(c1, c2, c3, . . . cN) of length N should satisfy the linear equations in the parity check matrix H: in other words, the equation HC=0 should be satisfied. Note that ECC codeword C may be the same for a given data, regardless of the parity check matrix H of the individual ECCs.
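The check HC=0 may be illustrated with a small binary code. The sketch below assumes arithmetic over GF(2) and uses a (7,4) Hamming code purely as an example; the matrix `H`, the codeword `CODEWORD`, and the function names are illustrative choices and are not taken from this disclosure.

```python
from typing import List

# Parity check matrix of a (7,4) Hamming code: N = 7 columns, N - K = 3 rows.
H: List[List[int]] = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

# One valid codeword of this code (every check equation sums to 0 mod 2).
CODEWORD: List[int] = [1, 1, 1, 0, 0, 0, 0]


def syndrome(h: List[List[int]], word: List[int]) -> List[int]:
    """Evaluate each check equation (one per row of h) modulo 2."""
    return [sum(a * c for a, c in zip(row, word)) % 2 for row in h]


def is_codeword(h: List[List[int]], word: List[int]) -> bool:
    """A word is a codeword when every check equation is satisfied."""
    return not any(syndrome(h, word))
```

Flipping a single bit of `CODEWORD` leaves one or more rows of the syndrome nonzero, and each nonzero row corresponds to one unsatisfied check equation.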
As matrix H includes N-K rows, the equation HC=0 may include N-K linear equations (which may also be termed “check equations”, as they check whether the codeword may have errors). The encoded codeword C may be stored on a track, in a NAND cell, or using some other technique during the write stage. There may be some errors when the codeword is read out from its storage location: a codeword read with such errors may be denoted as the corrupted codeword C̃. In general, it may be expected that the larger the number of linear equations in HC̃ that are not satisfied (i.e., that are not equal to 0), the larger the number of errors in the corrupted codeword C̃. The ECC decoder may attempt to use all decoding algorithms to try to find a codeword
In storage devices without the data quality metric, either the equation HC=0 is satisfied and the data may be successfully recovered, or H
The Data Quality Metric Estimator table is shown in Table 2 below.
In general, the storage device may provide a data quality metric based on the number of violated linear equations (V) relative to the number of rows in the parity check matrix H (this number may be labeled “M”). Thus, the data quality metric may be represented as (V/M) %. Because the relationship between the number of violated linear equations and the bit error rate may be non-linear, an example data quality table, such as that shown in Table 2 above, may be built based on the number of violated linear equations for a certain ECC decoder.
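The (V/M) % calculation and a table lookup of this kind may be sketched as follows. The thresholds and level names are hypothetical placeholders: the actual values would depend on the ECC decoder, as noted above, and are not taken from Table 2.

```python
from bisect import bisect_left
from typing import Sequence


def violation_percent(syndrome_bits: Sequence[int]) -> float:
    """Return (V / M) * 100, where V counts the nonzero syndrome rows."""
    v = sum(1 for bit in syndrome_bits if bit)
    m = len(syndrome_bits)
    return 100.0 * v / m


# Hypothetical inclusive upper bounds (in percent) for each level but the last.
THRESHOLDS = [0.0, 5.0, 20.0]
LEVELS = ["perfect", "high", "medium", "poor"]


def quality_level(percent: float) -> str:
    """Map a violation percentage onto a level using the thresholds."""
    return LEVELS[bisect_left(THRESHOLDS, percent)]
```

Because the mapping is a lookup rather than a formula, the non-linear relationship between violated equations and bit error rate may be captured simply by choosing uneven thresholds.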
Processor 110 may be coupled to memory 115. Memory 115 may be any variety of memory, such as flash memory, Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Persistent Random Access Memory, Ferroelectric Random Access Memory (FRAM), or Non-Volatile Random Access Memory (NVRAM), such as Magnetoresistive Random Access Memory (MRAM) etc. Memory 115 may also be any desired combination of different memory types, and may be managed by memory controller 125. Memory 115 may be used to store data that may be termed “short-term”: that is, data not expected to be stored for extended periods of time. Examples of short-term data may include temporary files, data being used locally by applications (which may have been copied from other storage locations), and the like.
Processor 110 and memory 115 may also support an operating system under which various applications may be running. These applications may issue requests (which may also be termed commands) to read data from or write data to either memory 115 or storage device 120. Storage device 120 may be used, for example, to store initial parameters (or ranges of values for initial parameters, along with what types of behaviors the ranges of values represent) used to initialize the simulation. Storage device 120 may be accessed using device driver 130. While
SSD 120 may also include host interface layer 310, which may manage interface 305. If SSD 120 includes more than one interface 305, a single host interface layer 310 may manage all interfaces, SSD 120 may include a host interface layer for each interface, or some combination thereof may be used.
SSD 120 may also include SSD controller 315, various channels 320-1, 320-2, 320-3, and 320-4, along which various flash memory chips 325-1, 325-2, 325-3, 325-4, 325-5, 325-6, 325-7, and 325-8 may be arrayed. SSD controller 315 may manage sending read requests and write requests to flash memory chips 325-1 through 325-8 along channels 320-1 through 320-4. Although
Within each flash memory chip, the space may be organized into blocks, which may be further subdivided into pages, and which may be grouped into superblocks. The page is typically the smallest unit of data that may be read or written on an SSD. Page sizes may vary as desired: for example, a page may be 4 KB of data. If less than a full page is to be written, the excess space is “unused”.
While pages may be written and read, SSDs typically do not permit data to be overwritten: that is, existing data may not be replaced “in place” with new data. Instead, when data is to be updated, the new data is written to a new page on the SSD, and the original page is invalidated (marked ready for erasure). Thus, SSD pages typically have one of three states: free (ready to be written), valid (containing valid data), and invalid (no longer containing valid data, but not usable until erased). The exact names for these states may vary.
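The three page states and the out-of-place update may be modeled in a few lines. This is a simplified sketch: the `PageState`, `PageMap`, and `update` names are assumptions made for illustration, and the model tracks only page states, not the data itself.

```python
from enum import Enum
from typing import List


class PageState(Enum):
    FREE = "free"        # ready to be written
    VALID = "valid"      # contains valid data
    INVALID = "invalid"  # stale data, not usable until erased


class PageMap:
    """Tracks the state of a fixed number of pages."""

    def __init__(self, num_pages: int) -> None:
        self.states: List[PageState] = [PageState.FREE] * num_pages

    def write(self) -> int:
        """Write to the first free page and return its index."""
        idx = self.states.index(PageState.FREE)  # ValueError if none is free
        self.states[idx] = PageState.VALID
        return idx

    def invalidate(self, idx: int) -> None:
        """Mark a page as holding stale data."""
        self.states[idx] = PageState.INVALID


def update(pages: PageMap, old_idx: int) -> int:
    """Update data out of place: write a new page, then invalidate the old."""
    new_idx = pages.write()
    pages.invalidate(old_idx)
    return new_idx
```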
But while pages may be written and read individually, the block is the basic unit of data that may be erased. That is, pages are not erased individually: all the pages in a block are typically erased at the same time. For example, if a block contains 256 pages, then all 256 pages in a block are erased at the same time. This arrangement may lead to some management issues for the SSD: if a block is selected for erasure that still contains some valid data, that valid data may need to be copied to a free page elsewhere on the SSD before the block may be erased. (In some embodiments of the inventive concept, the unit of erasure may differ from the block: for example, it may be a superblock, which may be a set of multiple blocks.)
SSD controller 315 may include flash translation layer 330 (which may be termed more generally a logical-to-physical translation layer, for storage devices that do not use flash storage), Read Recovery Level (RRL) table 335, data quality metric estimator 340, data quality metric table 345, error recovery mode selector 350, and error correction code modules 355-1 and 355-2. Flash translation layer 330 may handle translation of LBAs or other logical IDs (as used by processor 110 of
Data quality metric estimator 340 may take information about the data that was read from flash chips 325-1 through 325-8 and determine what error recovery mechanism to use. To understand how this works requires some understanding of how error correcting codes (ECCs) work.
In general, each ECC, which may be implemented using ECC modules 355-1 and 355-2, may include a parity check matrix H. The parity check matrix H may include n−k rows and n columns, where n is the length of a codeword and k is the rank of the code. Since the parity check matrix H includes n columns and a codeword C is of length n, the product HC may be a vector with n−k rows. The codeword C may be generated based on the data being written, and may be selected so that HC is 0. Thus, the product HC may represent a set of n−k linear equations. When data is written to flash chips 325-1 through 325-8, the codeword C may also be written to flash chips 325-1 through 325-8.
When data is read, and in particular when the codeword C is read, errors may occur, resulting in the codeword C being read incorrectly. In such situations, the corrupted codeword may be represented as C̃. Since C̃ may have errors (and therefore might not be a codeword itself), HC̃ may not equal 0 as would HC: some of the linear equations may not be satisfied (or, put another way, some of the linear equations may be violated). The ECC module may then attempt to find another codeword
With this understanding, data quality metric estimator 340 may be understood. Turning momentarily to
Returning to
Given the data quality metric estimated by data quality metric estimator 340, error recovery mode selector 350 may then select an error recovery mode to be applied to the data. SSD 120 may implement a number of different ECC modules, as demonstrated by ECC modules 355-1 and 355-2. While
ECC modules 355-1 and 355-2 (as well as data quality metric estimator 340 and error recovery mode selector 350) may be implemented using an Application-Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a central processing unit (CPU), a graphics processing unit (GPU), or a general purpose GPU (GPGPU). Depending on the implementation, firmware or software may also be used to implement ECC modules 355-1 and 355-2 (as well as data quality metric estimator 340 and error recovery mode selector 350).
One approach to performing ECC, which may be termed the brute force approach, is to attempt to apply each ECC module in turn, until either the error is corrected or no ECC modules remain to be tried. A second approach is to attempt to apply each ECC module in turn until one of three conditions is met: the error is corrected, no ECC modules remain to be tried, or the time spent processing a read request (including both the time spent receiving the read request and the time spent attempting to perform error correction) exceeds some maximum latency. In either case, if the error is successfully corrected, then the corrected data is returned to host 105 of
But these approaches have some drawbacks. First, because these approaches apply ECC modules 355-1 and 355-2 each in turn, the time required to perform error correction includes the time required to execute each ECC module in turn, which may be greater than the time required to execute a single ECC module. In addition, executing ECC modules requires power. Like time, executing multiple ECC modules may consume more power than just executing a single ECC module.
It may also happen that host 105 of
To address these drawbacks, error recovery mode selector 350 may select a single ECC module 355-1 or 355-2 to execute, based on the data quality metric for the data. If the selected ECC module corrects the data, then the corrected data may be returned more rapidly than by trying multiple ECC modules in turn, and with lower power consumption (since only one ECC module is executed).
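The latency difference between trying modules in turn and selecting a single module may be worked through with a small example. The module names, latencies, and correction capabilities below are invented for illustration, and the selection rule shown (the fastest module expected to correct the error) is only one possible rule.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EccModule:
    name: str
    latency_us: float    # hypothetical time to run this module
    max_violations: int  # hypothetical: most violated equations it may correct


# Hypothetical modules, listed from fastest to slowest.
MODULES = [
    EccModule("fast", 2.0, 4),
    EccModule("medium", 20.0, 16),
    EccModule("strong", 200.0, 64),
]


def sequential_latency(violations: int, modules: Sequence[EccModule]) -> float:
    """Total time when modules are tried in turn until one may succeed."""
    total = 0.0
    for module in modules:
        total += module.latency_us
        if violations <= module.max_violations:
            break
    return total


def select_single(
    violations: int, modules: Sequence[EccModule]
) -> Optional[EccModule]:
    """Pick the fastest module expected to correct this many violations."""
    for module in sorted(modules, key=lambda m: m.latency_us):
        if violations <= module.max_violations:
            return module
    return None  # no module is expected to succeed
```

With 10 violated equations, trying the modules in turn costs 2 + 20 = 22 microseconds under these assumed figures, while selecting the "medium" module directly costs 20 microseconds. The gap widens when the data quality is poor enough that only the slowest module, or no module, is expected to succeed.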
In some embodiments of the inventive concept, error recovery mode selector 350 may select the ECC module that is the best fit for the data quality metric. For example, error recovery mode selector 350 may select the fastest executing ECC module that is expected to be able to correct the error based on the data quality metric. In other embodiments of the inventive concept, error recovery mode selector 350 may select the most thorough ECC module that is expected to complete within the latency host 105 of
There are also options that may be used when error recovery is not expected to succeed. In some embodiments of the inventive concept, if the data quality metric indicates that the time required to correct the error is greater than the latency host 105 of
In yet other embodiments, even if error correction is not performed, SSD 120 might return the data instead of an error. There are applications where some data errors may be tolerated. For example, playback of audio or video data may tolerate some errors in the data, since most of the data may be successfully presented to the user. Host 105 of
As mentioned above, SSD controller 315 might include only one ECC module 355-1 or 355-2. But even if SSD controller 315 only includes one ECC module 355-1 or 355-2, it is possible to use the data quality metric to determine whether or not to apply the ECC module. For example, the latency host 105 of
For example, entry 505 shows that namespace 515 maps to RRL 520, accepted latency 525, and accepted data quality metric 530, while entry 510 shows that namespace 535 maps to RRL 540, accepted latency 545, and accepted data quality metric 550. Accepted latencies 525 and 545 may represent default latencies that host 105 of
In some embodiments of the inventive concept, the term “data quality metric” may also be used to include both the data quality metric and the latency. That is, in some embodiments of the inventive concept (particularly in embodiments of the inventive concept where RRL table 335 does not include a column for both accepted latency and accepted data quality metric), the term “data quality metric” may be understood to mean either the data quality metric itself or a latency. But in embodiments of the inventive concept where both terms “data quality metric” and “latency” are described, the term “data quality metric” is not intended to mean latency.
As discussed with reference to
For example, entry 605 shows that number of errors 615 maps to BER 620, error recovery latency 625, and data quality metric 630, while entry 610 shows that number of errors 635 maps to BER 640, error recovery latency 645, and data quality metric 650. Error recovery latencies 625 and 645 may represent the estimated time required to perform error recovery for the identified number of errors. Data quality metrics 630 and 650 may represent the data quality metrics assigned to data with the identified number of errors. Thus, for example, entry 605 reflects that if there are no violated linear equations, then there are no errors in the data, which means that no time is needed to perform error correction and the data quality is considered perfect. In contrast, entry 610 reflects that there are a few violated linear equations, which would take an estimated 2 microseconds (μs) to correct: with such a number of errors, the data quality is high, but not necessarily perfect.
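A table with these four columns may be represented as a list of rows searched by the number of violated linear equations. In the sketch below, the first row mirrors entry 605 and the 2 microsecond latency in the second row mirrors entry 610; every other number, the third row, and all names are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    max_violations: int         # upper bound on violated equations for this row
    ber: float                  # estimated bit error rate
    recovery_latency_us: float  # estimated error recovery latency
    metric: str                 # data quality metric


TABLE = [
    Entry(0, 0.0, 0.0, "perfect"),  # no violations: no errors, no latency
    Entry(8, 1e-4, 2.0, "high"),    # bound of 8 and BER are hypothetical
    Entry(64, 1e-2, 500.0, "low"),  # entirely hypothetical row
]


def lookup(violations: int) -> Optional[Entry]:
    """Return the first row whose bound covers the violation count."""
    for entry in TABLE:
        if violations <= entry.max_violations:
            return entry
    return None  # beyond the range covered by the table


def within_accepted_latency(violations: int, accepted_us: float) -> bool:
    """Check the estimated recovery latency against an accepted latency."""
    entry = lookup(violations)
    return entry is not None and entry.recovery_latency_us <= accepted_us
```

A check such as `within_accepted_latency` is one way the estimated latency might be compared against the latency accepted for a read request before any error correction is attempted.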
Using the information in data quality metric table 345 (in combination with the accepted latency and/or accepted data quality metric for the read request), error recovery mode selector 350 of
At some later point in time, host 105 may send read request 720 to read data 710 back out from storage device 120. Storage device 120 may then attempt to read data 710 from storage, along with the stored codeword, and may use the codeword in combination with the parity check matrix H to determine whether the data is perfect or has errors: if the data has errors, storage device 120 may also use the codeword and parity check matrix H to determine the number of violated linear equations: storage device 120 may then use the number of violated linear equations and data quality metric table 345 of
Host 105 may then use the data quality metric to decide what action to take. For example, host 105 may simply treat response 725 as indicating that there was an error reading the data from storage device 120, and may proceed based on that information. But in some embodiments, host 105 may resend the read request, shown as read request 730. Read request 730 may specify target data quality level 735 and/or target latency 740, which storage device 120 may then use in deciding what ECC module to select (or to select no ECC module), and whether to return the data with errors (if the data could not be corrected). Once storage device 120 decides what correction (if any) to perform, storage device 120 may perform any selected correction and may send response 745 back to host 105. Response 745 may include corrected data 750, may include the uncorrected data, or may return an error, depending on what error correction was (or was not) performed and what information was chosen to send back to host 105.
In
At block 825 (
At block 850 (
Alternatively, at block 910, storage device 120 of
Alternatively, at block 920, storage device 120 of
Alternatively, at block 925, storage device 120 of
In
Embodiments of the inventive concept offer technical advantages over the prior art. The storage device may determine a data quality metric for data being read from the storage device. This data quality metric may be determined based on a number of violated linear equations based on a parity check matrix H and a codeword for the data. This data quality metric may then be used to select an error correcting code module to apply to the data. By selecting one error correcting code module, latency may be reduced when compared with trying multiple error correcting code modules in sequence. Similarly, by selecting one error correcting code module, power consumption may be reduced when compared with trying multiple error correcting code modules in sequence. In addition, the data quality metric may be compared with a data quality metric for the data as determined by the host, which may reflect that the host may tolerate some errors in the data: if the data has a sufficiently high data quality, the data may be returned to the host without correction (i.e., with errors).
The following discussion is intended to provide a brief, general description of a suitable machine or machines in which certain aspects of the inventive concept may be implemented. The machine or machines may be controlled, at least in part, by input from conventional input devices, such as keyboards, mice, etc., as well as by directives received from another machine, interaction with a virtual reality (VR) environment, biometric feedback, or other input signal. As used herein, the term “machine” is intended to broadly encompass a single machine, a virtual machine, or a system of communicatively coupled machines, virtual machines, or devices operating together. Exemplary machines include computing devices such as personal computers, workstations, servers, portable computers, handheld devices, telephones, tablets, etc., as well as transportation devices, such as private or public transportation, e.g., automobiles, trains, cabs, etc.
The machine or machines may include embedded controllers, such as programmable or non-programmable logic devices or arrays, Application Specific Integrated Circuits (ASICs), embedded computers, smart cards, and the like. The machine or machines may utilize one or more connections to one or more remote machines, such as through a network interface, modem, or other communicative coupling. Machines may be interconnected by way of a physical and/or logical network, such as an intranet, the Internet, local area networks, wide area networks, etc. One skilled in the art will appreciate that network communication may utilize various wired and/or wireless short range or long range carriers and protocols, including radio frequency (RF), satellite, microwave, Institute of Electrical and Electronics Engineers (IEEE) 802.11, Bluetooth®, optical, infrared, cable, laser, etc.
Embodiments of the present inventive concept may be described by reference to or in conjunction with associated data including functions, procedures, data structures, application programs, etc. which when accessed by a machine results in the machine performing tasks or defining abstract data types or low-level hardware contexts. Associated data may be stored in, for example, the volatile and/or non-volatile memory, e.g., RAM, ROM, etc., or in other storage devices and their associated storage media, including hard-drives, floppy-disks, optical storage, tapes, flash memory, memory sticks, digital video disks, biological storage, etc. Associated data may be delivered over transmission environments, including the physical and/or logical network, in the form of packets, serial data, parallel data, propagated signals, etc., and may be used in a compressed or encrypted format. Associated data may be used in a distributed environment, and stored locally and/or remotely for machine access.
Embodiments of the inventive concept may include a tangible, non-transitory machine-readable medium comprising instructions executable by one or more processors, the instructions comprising instructions to perform the elements of the inventive concepts as described herein.
The various operations of methods described above may be performed by any suitable means capable of performing the operations, such as various hardware and/or software component(s), circuits, and/or module(s). The software may comprise an ordered listing of executable instructions for implementing logical functions, and may be embodied in any “processor-readable medium” for use by or in connection with an instruction execution system, apparatus, or device, such as a single or multiple-core processor or processor-containing system.
The blocks or steps of a method or algorithm and functions described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a tangible, non-transitory computer-readable medium. A software module may reside in Random Access Memory (RAM), flash memory, Read Only Memory (ROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), registers, hard disk, a removable disk, a CD ROM, or any other form of storage medium known in the art.
Having described and illustrated the principles of the inventive concept with reference to illustrated embodiments, it will be recognized that the illustrated embodiments may be modified in arrangement and detail without departing from such principles, and may be combined in any desired manner. And, although the foregoing discussion has focused on particular embodiments, other configurations are contemplated. In particular, even though expressions such as “according to embodiments of the inventive concept” or the like are used herein, these phrases are meant to generally reference embodiment possibilities, and are not intended to limit the inventive concept to particular embodiment configurations. As used herein, these terms may reference the same or different embodiments that are combinable into other embodiments.
The foregoing illustrative embodiments are not to be construed as limiting the inventive concept thereof. Although a few embodiments have been described, those skilled in the art will readily appreciate that many modifications are possible to those embodiments without materially departing from the novel teachings and advantages of the present disclosure. Accordingly, all such modifications are intended to be included within the scope of this inventive concept as defined in the claims.
Embodiments of the inventive concept may extend to the following statements, without limitation:
Statement 1. Embodiments of the inventive concept include a storage device, comprising:
storage for data;
a controller to manage writing the data to the storage and reading the data from the storage;
a data quality metric table to map a first number of errors to a first data quality metric and map a second number of errors to a second data quality metric; and
a transmitter to return the data quality metric table to a host.
Statement 2. Embodiments of the inventive concept include the storage device according to statement 1, wherein the storage device includes a Solid State Drive (SSD).
Statement 3. Embodiments of the inventive concept include the storage device according to statement 1, wherein:
the controller is configured to receive from the host a read request to read a first data from the storage; and
the storage device further comprises a data quality metric estimator to estimate the first number of errors in the first data.
Statement 4. Embodiments of the inventive concept include the storage device according to statement 3, wherein the transmitter is configured to return the first number of errors in the first data to the host.
Statement 5. Embodiments of the inventive concept include the storage device according to statement 3, wherein the transmitter is operative to return the first data quality metric to the host based at least in part on the first number of errors and the data quality metric table.
Statement 6. Embodiments of the inventive concept include the storage device according to statement 3, further comprising:
a first error correcting code (ECC) module;
a second ECC module; and
an error recovery mode selector to select the first ECC module based at least in part on the first number of errors and the data quality metric table.
Statement 7. Embodiments of the inventive concept include the storage device according to statement 6, wherein:
the controller is operative to receive from the host a target data quality metric for the read request; and
the error recovery mode selector is configured to select the first ECC module based at least in part on the first number of errors, the data quality metric table, and the target data quality metric.
Statement 8. Embodiments of the inventive concept include the storage device according to statement 7, wherein the error recovery mode selector is configured to not select either the first ECC module or the second ECC module based at least in part on the target data quality metric exceeding the first data quality metric.
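As a hedged sketch of statements 6 through 8, an error recovery mode selector might compare the data quality metric for the read against the target received from the host. The enum `EccModule`, the threshold `FIRST_MODULE_MAX_METRIC`, the convention that a larger metric means more errors, and the handling of equal values are assumptions of this example; the statements do not specify them.

```python
from enum import Enum
from typing import Optional


class EccModule(Enum):
    FIRST = "first"    # assumed: the faster, weaker recovery scheme
    SECOND = "second"  # assumed: the slower, stronger recovery scheme


# Hypothetical cut-off: metrics at or below this value go to the first module.
FIRST_MODULE_MAX_METRIC = 3


def select_ecc_module(data_quality_metric: int,
                      target_metric: int) -> Optional[EccModule]:
    """Pick an ECC module, or None when no correction is selected.

    Assumes a larger metric means more errors in the data.
    """
    # Statement 8: the target exceeding the data's metric means neither module is selected.
    if target_metric > data_quality_metric:
        return None
    # Equal values are not addressed by the statements; this sketch corrects them.
    if data_quality_metric <= FIRST_MODULE_MAX_METRIC:
        return EccModule.FIRST
    return EccModule.SECOND
```

For example, under these assumptions `select_ecc_module(2, 3)` selects no module, while `select_ecc_module(5, 2)` selects the second module.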
Statement 9. Embodiments of the inventive concept include the storage device according to statement 6, wherein:
the storage device further comprises a read recovery level (RRL) table mapping a namespace to an RRL and a target data quality metric; and
the controller is configured to access the target data quality metric from the RRL table.
Statement 10. Embodiments of the inventive concept include the storage device according to statement 6, wherein:
the data quality metric table is configured to map the first number of errors to a first read latency and to map the second number of errors to a second read latency;
the controller is operative to receive from the host a target read latency for the read request; and
the error recovery mode selector is configured to select the first ECC module based at least in part on the first number of errors, the data quality metric table, and the target read latency.
Statement 11. Embodiments of the inventive concept include the storage device according to statement 10, wherein the error recovery mode selector is configured to not select either the first ECC module or the second ECC module based at least in part on the first read latency exceeding the target read latency.
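The latency-based selection of statements 10 and 11 could be sketched along the following lines. The table rows, the microsecond unit, the module labels, and the treatment of error counts beyond the table are hypothetical choices for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical rows: (largest number of errors, estimated read latency in
# microseconds including correction, label of the ECC module that would be used).
LATENCY_TABLE: List[Tuple[int, int, str]] = [
    (10, 80, "first"),
    (200, 400, "second"),
]


def select_by_latency(num_errors: int, target_latency_us: int) -> Optional[str]:
    """Return the label of the selected ECC module, or None if none is selected."""
    for max_errors, latency_us, module in LATENCY_TABLE:
        if num_errors <= max_errors:
            # Statement 11: the read latency exceeding the target means
            # neither ECC module is selected.
            if latency_us > target_latency_us:
                return None
            return module
    # Error counts beyond the table are treated as not recoverable in this sketch.
    return None
```

Under these assumed values, a read with 50 estimated errors and a 100 microsecond target selects no module, whereas the same read with a 500 microsecond target selects the second module.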
Statement 12. Embodiments of the inventive concept include the storage device according to statement 6, wherein:
the storage device further comprises an RRL table mapping a namespace to an RRL and a target read latency; and
the controller is configured to access the target read latency from the RRL table.
Statement 13. Embodiments of the inventive concept include the storage device according to statement 3, wherein the data quality metric estimator includes a counter for a number of linear equations that are violated, the number of linear equations using a parity check matrix and a codeword for the first data.
Statement 14. Embodiments of the inventive concept include the storage device according to statement 13, wherein the data quality metric estimator is configured to determine a bit error rate (BER) for the first data from the number of linear equations that are violated.
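A minimal sketch of the counter of statement 13 and the BER determination of statement 14 follows. Each row of the parity check matrix is one linear equation over GF(2), and a row is violated when its product with the codeword is 1. The Hamming(7,4) matrix is used only as a small stand-in, and the BER formula (inverting P(violated) = (1 - (1 - 2p)^w) / 2 under independent bit errors and a uniform row weight w) is an assumption of this example, not a method stated in the disclosure.

```python
from typing import List

# Stand-in parity check matrix: Hamming(7,4); each row has weight 4.
H_HAMMING: List[List[int]] = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def count_violated(H: List[List[int]], codeword: List[int]) -> int:
    """Count the linear equations (rows of H) violated by the codeword over GF(2)."""
    violated = 0
    for row in H:
        if len(row) != len(codeword):
            raise ValueError("row length must match codeword length")
        violated += sum(h & c for h, c in zip(row, codeword)) % 2
    return violated


def estimate_ber(violated: int, total_checks: int, row_weight: int) -> float:
    """Estimate the bit error rate from the fraction of violated equations.

    Assumes independent bit errors and that every row has the same weight.
    """
    if total_checks <= 0 or row_weight <= 0:
        raise ValueError("total_checks and row_weight must be positive")
    # The model is only invertible for fractions up to 1/2, so clamp.
    fraction = min(violated / total_checks, 0.5)
    return (1.0 - (1.0 - 2.0 * fraction) ** (1.0 / row_weight)) / 2.0
```

With this matrix, an all-zero codeword violates no equations, and flipping only the last bit violates all three, since that bit participates in every row.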
Statement 15. Embodiments of the inventive concept include the storage device according to statement 1, wherein the controller is configured to write the first data to the storage based at least in part on a write request from the host.
Statement 16. Embodiments of the inventive concept include the storage device according to statement 15, wherein the controller is further configured to write a codeword to the storage based on the first data.
Statement 17. Embodiments of the inventive concept include the storage device according to statement 1, further comprising an RRL table mapping a namespace to an RRL and a target data quality metric.
Statement 18. Embodiments of the inventive concept include the storage device according to statement 17, wherein the RRL table maps the namespace to a target read latency.
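The RRL table of statements 9, 12, 17, and 18 might be represented as a mapping keyed by namespace, as in the sketch below. The field names, the namespace identifiers, the numeric values, and the default entry for unknown namespaces are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RrlEntry:
    rrl: int                                      # read recovery level
    target_quality_metric: int                    # target data quality metric
    target_read_latency_us: Optional[int] = None  # optional target read latency


# Hypothetical contents: namespace identifier -> entry.
RRL_TABLE: Dict[int, RrlEntry] = {
    1: RrlEntry(rrl=0, target_quality_metric=1, target_read_latency_us=500),
    2: RrlEntry(rrl=4, target_quality_metric=3, target_read_latency_us=100),
}
# Assumed fallback for namespaces without an entry.
DEFAULT_ENTRY = RrlEntry(rrl=0, target_quality_metric=1)


def targets_for_namespace(namespace_id: int) -> RrlEntry:
    """Look up the RRL and targets that apply to a read request's namespace."""
    return RRL_TABLE.get(namespace_id, DEFAULT_ENTRY)
```

Keeping the targets per namespace lets the controller obtain them without the host supplying a target on every read request.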
Statement 19. Embodiments of the inventive concept include a method, comprising:
receiving a read request for a data from a host at a storage device;
determining a target data quality metric for the read request by the storage device;
reading the data from the storage device;
determining a number of errors in the data by the storage device;
selecting a first error correcting code (ECC) module from the first ECC module and a second ECC module based at least in part on the number of errors in the data by the storage device;
applying the first ECC module to the data, producing a corrected data by the storage device; and
returning the corrected data to the host from the storage device.
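The steps of statement 19 can be sketched end to end as below. Everything concrete here is an assumption made so that the example runs: the dictionary used as storage, the three-fold repetition code, the majority-vote corrector standing in for an ECC module, the threshold `first_ecc_max_metric`, and the early return that skips correction when the target exceeds the metric.

```python
from typing import Callable, Dict, List

Bits = List[int]


def handle_read(storage: Dict[int, Bits], lba: int, target_metric: int,
                count_errors: Callable[[Bits], int],
                metric_for: Callable[[int], int],
                first_ecc: Callable[[Bits], Bits],
                second_ecc: Callable[[Bits], Bits],
                first_ecc_max_metric: int) -> Bits:
    """Serve one read request; a larger metric is assumed to mean more errors."""
    raw = storage[lba]               # reading the data
    num_errors = count_errors(raw)   # determining a number of errors
    metric = metric_for(num_errors)  # data quality metric for this read
    if target_metric > metric:
        return raw                   # assumed: no ECC module is selected
    # Selecting between the two ECC modules, then applying the selection.
    module = first_ecc if metric <= first_ecc_max_metric else second_ecc
    return module(raw)               # returning the corrected data


def count_disagreeing_triples(bits: Bits) -> int:
    """Toy error estimate: triples of a 3x repetition code that are not unanimous."""
    return sum(1 for i in range(0, len(bits), 3) if len(set(bits[i:i + 3])) > 1)


def majority_correct(bits: Bits) -> Bits:
    """Toy ECC module: restore each triple to its majority value."""
    out: Bits = []
    for i in range(0, len(bits), 3):
        bit = 1 if sum(bits[i:i + 3]) >= 2 else 0
        out.extend([bit] * 3)
    return out
```

In this toy setting, a stored word `[1, 1, 0, 0, 0, 0]` has one disagreeing triple, so a low target leads to correction and a high target returns the data as read.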
Statement 20. Embodiments of the inventive concept include the method according to statement 19, wherein the storage device includes a Solid State Drive (SSD).
Statement 21. Embodiments of the inventive concept include the method according to statement 19, wherein:
the method further comprises determining a data quality metric based at least in part on the number of errors in the data by the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric.
Statement 22. Embodiments of the inventive concept include the method according to statement 21, wherein determining the data quality metric based at least in part on the number of errors in the data by the storage device includes determining the data quality metric based at least in part on the number of errors in the data by the storage device using a data quality metric table.
Statement 23. Embodiments of the inventive concept include the method according to statement 21, wherein:
determining the target data quality metric for the read request by the storage device includes receiving the target data quality metric from the host at the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric exceeding the target data quality metric.
Statement 24. Embodiments of the inventive concept include the method according to statement 21, wherein:
receiving the read request for the data from the host at the storage device includes accessing the target data quality metric from a read recovery level (RRL) table by the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric exceeding the target data quality metric.
Statement 25. Embodiments of the inventive concept include the method according to statement 24, wherein accessing the target data quality metric from the RRL table by the storage device includes:
determining a namespace for the read request; and
mapping the namespace to the target data quality metric using the RRL table.
Statement 26. Embodiments of the inventive concept include the method according to statement 19, wherein:
the method further comprises determining a read latency based at least in part on the number of errors in the data by the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency.
Statement 27. Embodiments of the inventive concept include the method according to statement 26, wherein determining the read latency based at least in part on the number of errors in the data by the storage device includes determining the read latency based at least in part on the number of errors in the data by the storage device using a data quality metric table.
Statement 28. Embodiments of the inventive concept include the method according to statement 26, wherein:
determining the target data quality metric for the read request by the storage device includes receiving a target read latency from the host at the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the target read latency exceeding the read latency.
Statement 29. Embodiments of the inventive concept include the method according to statement 26, wherein:
determining the target data quality metric for the read request by the storage device includes accessing the target read latency from an RRL table by the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the target read latency exceeding the read latency.
Statement 30. Embodiments of the inventive concept include the method according to statement 29, wherein accessing the target read latency from the RRL table by the storage device includes:
determining a namespace for the read request; and
mapping the namespace to the target read latency using the RRL table.
Statement 31. Embodiments of the inventive concept include the method according to statement 19, further comprising:
determining a data quality metric based at least in part on the number of errors in the data by the storage device; and
returning the data quality metric to the host.
Statement 32. Embodiments of the inventive concept include the method according to statement 19, wherein determining the number of errors in the data by the storage device includes:
determining a first number of linear equations; and
determining a second number of the linear equations that are violated.
Statement 33. Embodiments of the inventive concept include the method according to statement 32, wherein determining the first number of linear equations includes:
determining a codeword for the data; and
determining the first number of linear equations based at least in part on a parity check matrix and the codeword.
Statement 34. Embodiments of the inventive concept include the method according to statement 33, further comprising:
receiving a write request for the data from the host at the storage device;
writing the data to the storage device;
generating the codeword based at least in part on the data; and
writing the codeword to the storage device.
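The write path of statement 34 might look like the following sketch, in which a codeword is generated from the data and both are stored. The Hamming(7,4) encoder is a small stand-in chosen for illustration; the disclosure does not state which code is used, and the dictionary-based storage layout and the function names are hypothetical.

```python
from typing import Dict, List

Bits = List[int]


def hamming74_encode(data: Bits) -> Bits:
    """Encode 4 data bits as a 7-bit codeword with parity bits at positions 1, 2, and 4."""
    if len(data) != 4 or any(bit not in (0, 1) for bit in data):
        raise ValueError("data must be exactly 4 bits")
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def handle_write(storage: Dict[int, Dict[str, Bits]], lba: int, data: Bits) -> None:
    """Write the data and the codeword generated from it to the (toy) storage."""
    codeword = hamming74_encode(data)
    storage[lba] = {"data": list(data), "codeword": codeword}
```

A later read could then check the stored codeword against the matching parity check matrix to count violated equations.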
Statement 35. Embodiments of the inventive concept include an article, comprising a non-transitory storage medium, the non-transitory storage medium having stored thereon instructions that, when executed by a machine, result in:
receiving a read request for a data from a host at a storage device;
determining a target data quality metric for the read request by the storage device;
reading the data from the storage device;
determining a number of errors in the data by the storage device;
selecting a first error correcting code (ECC) module from the first ECC module and a second ECC module based at least in part on the number of errors in the data by the storage device;
applying the first ECC module to the data, producing a corrected data by the storage device; and
returning the corrected data to the host from the storage device.
Statement 36. Embodiments of the inventive concept include the article according to statement 35, wherein the storage device includes a Solid State Drive (SSD).
Statement 37. Embodiments of the inventive concept include the article according to statement 35, wherein:
the non-transitory storage medium has stored thereon further instructions that, when executed by the machine, result in determining a data quality metric based at least in part on the number of errors in the data by the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric.
Statement 38. Embodiments of the inventive concept include the article according to statement 37, wherein determining the data quality metric based at least in part on the number of errors in the data by the storage device includes determining the data quality metric based at least in part on the number of errors in the data by the storage device using a data quality metric table.
Statement 39. Embodiments of the inventive concept include the article according to statement 37, wherein:
determining the target data quality metric for the read request by the storage device includes receiving the target data quality metric from the host at the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric exceeding the target data quality metric.
Statement 40. Embodiments of the inventive concept include the article according to statement 37, wherein:
receiving the read request for the data from the host at the storage device includes accessing the target data quality metric from a read recovery level (RRL) table by the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric exceeding the target data quality metric.
Statement 41. Embodiments of the inventive concept include the article according to statement 40, wherein accessing the target data quality metric from the RRL table by the storage device includes:
determining a namespace for the read request; and
mapping the namespace to the target data quality metric using the RRL table.
Statement 42. Embodiments of the inventive concept include the article according to statement 35, wherein:
the non-transitory storage medium has stored thereon further instructions that, when executed by the machine, result in determining a read latency based at least in part on the number of errors in the data by the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency.
Statement 43. Embodiments of the inventive concept include the article according to statement 42, wherein determining the read latency based at least in part on the number of errors in the data by the storage device includes determining the read latency based at least in part on the number of errors in the data by the storage device using a data quality metric table.
Statement 44. Embodiments of the inventive concept include the article according to statement 42, wherein:
determining the target data quality metric for the read request by the storage device includes receiving a target read latency from the host at the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the target read latency exceeding the read latency.
Statement 45. Embodiments of the inventive concept include the article according to statement 42, wherein:
determining the target data quality metric for the read request by the storage device includes accessing the target read latency from an RRL table by the storage device; and
selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the target read latency exceeding the read latency.
Statement 46. Embodiments of the inventive concept include the article according to statement 45, wherein accessing the target read latency from the RRL table by the storage device includes:
determining a namespace for the read request; and
mapping the namespace to the target read latency using the RRL table.
Statement 47. Embodiments of the inventive concept include the article according to statement 35, the non-transitory storage medium having stored thereon further instructions that, when executed by the machine, result in:
determining a data quality metric based at least in part on the number of errors in the data by the storage device; and
returning the data quality metric to the host.
Statement 48. Embodiments of the inventive concept include the article according to statement 35, wherein determining the number of errors in the data by the storage device includes:
determining a first number of linear equations; and
determining a second number of the linear equations that are violated.
Statement 49. Embodiments of the inventive concept include the article according to statement 48, wherein determining the first number of linear equations includes:
determining a codeword for the data; and
determining the first number of linear equations based at least in part on a parity check matrix and the codeword.
Statement 50. Embodiments of the inventive concept include the article according to statement 49, the non-transitory storage medium having stored thereon further instructions that, when executed by the machine, result in:
receiving a write request for the data from the host at the storage device;
writing the data to the storage device;
generating the codeword based at least in part on the data; and
writing the codeword to the storage device.
Consequently, in view of the wide variety of permutations to the embodiments described herein, this detailed description and accompanying material is intended to be illustrative only, and should not be taken as limiting the scope of the inventive concept. What is claimed as the inventive concept, therefore, is all such modifications as may come within the scope and spirit of the following claims and equivalents thereto.
Claims
1. A storage device, comprising:
- storage for data;
- a controller to manage writing the data to the storage and reading the data from the storage;
- a data quality metric table to map a first number of errors to a first data quality metric and map a second number of errors to a second data quality metric; and
- a transmitter to return the data quality metric table to a host,
- wherein the storage device uses the data quality metric table to determine a data quality metric for a read request based at least in part on a number of errors based at least in part on the read request.
2. The storage device according to claim 1, wherein:
- the controller is configured to receive from the host a read request to read a first data from the storage; and
- the storage device further comprises a data quality metric estimator to estimate the first number of errors in the first data.
3. The storage device according to claim 2, further comprising:
- a first error correcting code (ECC) module;
- a second ECC module; and
- an error recovery mode selector to select the first ECC module based at least in part on the first number of errors and the data quality metric table.
4. The storage device according to claim 3, wherein:
- the controller is operative to receive from the host a target data quality metric for the read request; and
- the error recovery mode selector is configured to select the first ECC module based at least in part on the first number of errors, the data quality metric table, and the target data quality metric.
5. The storage device according to claim 3, wherein:
- the storage device further comprises a read recovery level (RRL) table mapping a namespace to an RRL and a target data quality metric; and
- the controller is configured to access the target data quality metric from the RRL table.
6. The storage device according to claim 3, wherein:
- the data quality metric table is configured to map the first number of errors to a first read latency and to map the second number of errors to a second read latency;
- the controller is operative to receive from the host a target read latency for the read request; and
- the error recovery mode selector is configured to select the first ECC module based at least in part on the first number of errors, the data quality metric table, and the target read latency.
7. The storage device according to claim 3, wherein:
- the storage device further comprises an RRL table mapping a namespace to an RRL and a target read latency; and
- the controller is configured to access the target read latency from the RRL table.
8. The storage device according to claim 2, wherein the data quality metric estimator includes a counter for a number of linear equations that are violated, the number of linear equations using a parity check matrix and a codeword for the first data.
9. The storage device according to claim 1, further comprising an RRL table mapping a namespace to an RRL and a target data quality metric.
10. The storage device according to claim 9, wherein the RRL table maps the namespace to a target read latency.
11. A method, comprising:
- receiving a read request for a data from a host at a storage device;
- determining a target data quality metric for the read request by the storage device;
- reading the data from the storage device;
- determining a number of errors in the data by the storage device;
- selecting a first error correcting code (ECC) module from the first ECC module and a second ECC module based at least in part on the number of errors in the data by the storage device and the target data quality metric for the read request;
- applying the first ECC module to the data, producing a corrected data by the storage device; and
- returning the corrected data to the host from the storage device.
12. The method according to claim 11, wherein:
- the method further comprises determining a data quality metric based at least in part on the number of errors in the data by the storage device; and
- selecting the first ECC module from the first ECC module and the second ECC module includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric and the target data quality metric for the read request.
13. The method according to claim 12, wherein:
- determining the target data quality metric for the read request by the storage device includes receiving the target data quality metric from the host at the storage device; and
- selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric and the target data quality metric for the read request includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric exceeding the target data quality metric.
14. The method according to claim 12, wherein:
- receiving the read request for the data from the host at the storage device includes accessing the target data quality metric from a read recovery level (RRL) table by the storage device; and
- selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric and the target data quality metric for the read request includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric exceeding the target data quality metric.
15. The method according to claim 11, wherein:
- the method further comprises determining a read latency based at least in part on the number of errors in the data by the storage device; and
- selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the number of errors in the data by the storage device and the target data quality metric for the read request includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency and the target data quality metric for the read request.
16. The method according to claim 15, wherein:
- determining the target data quality metric for the read request by the storage device includes receiving a target read latency from the host at the storage device; and
- selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency and the target data quality metric for the read request includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the target read latency exceeding the read latency.
17. The method according to claim 15, wherein:
- determining the target data quality metric for the read request by the storage device includes accessing the target read latency from an RRL table by the storage device; and
- selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the read latency and the target data quality metric for the read request includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the target read latency exceeding the read latency.
18. The method according to claim 11, further comprising:
- determining a data quality metric based at least in part on the number of errors in the data by the storage device; and
- returning the data quality metric to the host.
19. An article, comprising a non-transitory storage medium, the non-transitory storage medium having stored thereon instructions that, when executed by a machine, result in:
- receiving a read request for a data from a host at a storage device;
- determining a target data quality metric for the read request by the storage device;
- reading the data from the storage device;
- determining a number of errors in the data by the storage device;
- selecting a first error correcting code (ECC) module from the first ECC module and a second ECC module based at least in part on the number of errors in the data by the storage device and the target data quality metric for the read request;
- applying the first ECC module to the data, producing a corrected data by the storage device; and
- returning the corrected data to the host from the storage device.
20. The article according to claim 19, wherein:
- the non-transitory storage medium has stored thereon further instructions that, when executed by the machine, result in determining a data quality metric based at least in part on the number of errors in the data by the storage device; and
- selecting the first ECC module from the first ECC module and the second ECC module includes selecting the first ECC module from the first ECC module and the second ECC module based at least in part on the data quality metric and the target data quality metric for the read request.
Type: Grant
Filed: Mar 19, 2021
Date of Patent: Feb 28, 2023
Patent Publication Number: 20220197741
Assignee:
Inventors: Rekha Pitchumani (Oak Hill, VA), Zongwang Li (Dublin, CA)
Primary Examiner: Mujtaba M Chaudry
Application Number: 17/207,600
International Classification: G06F 11/30 (20060101); G06F 11/10 (20060101); G06F 11/14 (20060101); G06F 11/07 (20060101);