STORAGE ARRAY CONTROLLER FOR SOLID-STATE STORAGE DEVICES
A storage array controller provides a method and system for autonomously issuing trim commands to one or more solid-state storage devices in a storage array. The storage array controller is separate from any operating system running on a host system and separate from any controller in the solid-state storage device(s). The trim commands allow the solid-state storage device to operate more efficiently.
If any definitions, information, etc., from any parent or related application that are used for claim interpretation or any other purpose conflict with this description, then the definitions, information, etc., in this description shall apply.
BACKGROUND OF THE INVENTION

1. Field of the Invention
The present invention relates generally to US Classification 711/216. The present invention relates to storage array controllers and more particularly to storage array controllers for storage arrays that include solid-state storage devices.
2. Description of the Related Art
U.S. Pat. No. 6,480,936 describes a cache control unit for a storage apparatus.
U.S. Pat. No. 7,574,556 and U.S. Pat. No. 7,500,050 describe destaging of writes in a non-volatile cache.
U.S. Pat. No. 7,253,981 describes the re-ordering of writes in a disk controller.
U.S. Pat. No. 6,957,302 describes the use of a write stack drive in combination with a normal drive.
U.S. Pat. No. 5,893,164 describes a method of tracking incomplete writes in a disk array.
U.S. Pat. No. 6,219,289 describes a data writing apparatus for a tester to write data to a plurality of electric devices.
U.S. Pat. No. 7,318,118 describes a disk drive controller that completes some writes to flash memory of a hard disk drive for subsequent de-staging to the disk, whereas for other writes the data is written directly to disk.
U.S. Pat. No. 6,427,184 describes a disk controller that detects a sequential I/O stream from a host computer.
U.S. Pat. No. 7,216,199 describes a storage controller that continuously writes write-requested data to a stripe on a disk without using a write buffer.
US Publication 2008/0307192 describes storage address re-mapping.
BRIEF SUMMARY OF THE INVENTION

The invention includes improvements to a storage array controller for storage arrays that include solid-state storage devices. The improvements include the ability of a storage array controller to autonomously issue disk trim commands to one or more solid-state storage devices.
So that the features of the present invention can be understood, a more detailed description of the invention, briefly summarized above, may be had by reference to typical embodiments, some of which are illustrated in the accompanying drawings. It is to be noted, however, that the accompanying drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of the scope of the invention, for the invention may admit to other equally effective embodiments. The following detailed description makes reference to the accompanying drawings, which are now briefly described.
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the accompanying drawings and detailed description are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the accompanying claims.
DETAILED DESCRIPTION OF THE INVENTION

In the following detailed description and in the accompanying drawings, specific terminology and images are used to provide a thorough understanding. In some instances, the terminology and images may imply specific details that are not required to practice all embodiments. Similarly, the embodiments described and illustrated are representative and should not be construed as precise representations, as there are prospective variations on what is disclosed that will be obvious to someone with skill in the art. Thus this disclosure is not limited to the specific embodiments described and shown but embraces all prospective variations that fall within its scope. For brevity, not all steps may be detailed, where such details will be known to someone with skill in the art having benefit of this disclosure.
This invention focuses on storage arrays that include solid-state storage devices. The solid-state storage device will typically be a solid-state disk (SSD) and we will use an SSD in our examples, but the solid-state storage device does not have to be an SSD. An SSD may, for example, comprise flash devices, but could also comprise other forms of solid-state memory components or devices (SRAM, DRAM, MRAM; volatile or non-volatile; etc.), a combination of different types of solid-state memory components, or a combination of solid-state memory with other types of storage devices (often called a hybrid disk). Such storage arrays may additionally include hard-disk drives (HD or HDD).
This invention allows a storage array controller to autonomously issue a disk trim command. The disk trim command allows an OS to tell an SSD that the sectors specified in the disk trim command are no longer required and may be deleted. The disk trim command allows an SSD to increase performance by executing housekeeping functions, such as erasing flash blocks, that the SSD could not otherwise execute without the information in the disk trim command. The algorithms of this invention allow a storage array controller to autonomously issue disk trim commands, even though an operating system may not support the trim command. The storage array controller is logically located between the host system and one or more SSDs. An SSD contains its own SSD controller, but a storage array controller may have more resources than an SSD controller. This invention allows a storage array controller to use resources, such as larger memory size, non-volatile memory, etc. as well as unique information (because a storage array controller is higher than the SSD controller in the storage array hierarchy, i.e. further from the storage devices) in order to manage and control a storage array as well as provide information to the SSD controller.
GLOSSARY AND CONVENTIONS

Terms that are special to this field of invention or specific to this invention are defined in this description, and the first use (and usually the definition) of such special terms is highlighted in italics for the convenience of the reader. Table 1 shows a glossary for the convenience of the reader. If any information from Table 1 used for claim interpretation or any other purpose conflicts with the description text, figures, or other tables, then the information in the description shall apply.
In this description there are several figures that depict similar structures with similar parts or components. For example, several figures show a disk command. Even though disk commands may be similar in several figures, the disk commands are not necessarily identical. Thus, as an example, to avoid confusion a disk command in one figure may be labeled differently from a similar disk command in another figure.
Other topologies for Computer System 150 are possible: CPU 104 may connect or be coupled to the IO Bus 106 via a chipset; IO Bus 106 may use a serial point-to-point topology and bus technology (such as PCI Express, InfiniBand, HyperTransport, QPI, etc.), but may also use a parallel and/or multi-drop topology and bus technology (such as PCI, etc.); Storage Bus 114 may use a parallel and/or multi-drop topology and bus technology (such as SCSI, etc.), may use a serial point-to-point topology and bus technology (such as SATA, SAS, FC, USB, Light Peak, etc.), or may use a networked protocol (such as iSCSI, FCoE, etc.); the various bus technologies used may be standard or proprietary; the various bus technologies used may be electrical, optical or wireless etc.; portions of the system may be integrated together in a single chip or integrated package, and/or portions of the system may be in different enclosures etc. Many uses for Computer System 150 are possible: a mass storage system, embedded device, etc. Since solid-state storage is widely used in portable electronic devices, the ideas presented here also apply when Computer System 150 is a cell phone, PDA, tablet, camera, videocamera, portable music player, other portable electronic device, or similar.
An operating system (OS) sees a storage array as a collection of disk sectors or just sectors (and sectors may also be called blocks). An SSD in a storage array may have a capacity of more than 100 Gbytes and contain tens of NAND flash memory chips. A typical 1 Gbit NAND flash memory chip may contain 1024 flash blocks, with each flash block containing 64 flash pages and each flash page containing 2 kbytes. The numbers of disk sectors, flash pages, and flash blocks shown in the figures are much smaller than in a real device in order to simplify the description.
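As a quick sanity check on these example numbers, the quoted chip geometry works out as follows (a Python sketch only, using the illustrative values above rather than the parameters of any real device):

flash_blocks_per_chip = 1024
flash_pages_per_block = 64
bytes_per_flash_page = 2 * 1024                      # 2 kbytes
bytes_per_chip = flash_blocks_per_chip * flash_pages_per_block * bytes_per_flash_page
print(bytes_per_chip)                                # 134217728 bytes = 128 Mbytes
print(bytes_per_chip * 8 / 2**30)                    # 1.0, i.e. exactly 1 Gbit
# A 100-Gbyte SSD holds roughly 100e9 / 512, or about 195 million, 512-byte
# disk sectors; NAND chips of the 2010 timeframe are much larger than this
# 1-Gbit example, which is why only tens of chips are needed per SSD.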
Disk sectors may be 512 bytes in length (and typically are in the 2010 timeframe).
We now explain the algorithms of the Storage Array Controller 108.
Algorithm 1: Storage Array Controller that Issues a Trim Command
The sectors or blocks of a storage device are addressed as logical blocks using a logical block address (LBA). To avoid confusion, we will use host block address (HBA) for the LBA used to address a storage array controller. Unless we explicitly state otherwise, we assume that the host block size (HBS) is equal to the disk block size (DBS). The HBA may be a composite or union of a logical unit number (LUN) that identifies a logical portion of the storage array or disk or other device in the storage array; an LBA; the virtual machine (VM), if any; a UserID that identifies the user application; a VolumeID that identifies a logical target volume; and other data that may be used for logical access or management purposes. Note that to simplify the description, clarify the figures, and in particular to make it clear that operations may be performed on different LUNs, the LUN may be shown separately from the HBA in some of the figures and examples.
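As a concrete illustration of the composite nature of an HBA, the following Python sketch models an HBA as a record of the identifiers listed above plus the LBA (the field names are hypothetical; the description only says an HBA may be a composite of these values):

from dataclasses import dataclass

@dataclass(frozen=True)
class HostBlockAddress:
    """Illustrative composite host block address (HBA); field names are hypothetical."""
    lun: int            # logical unit number
    lba: int            # logical block address within the LUN
    vm_id: int = 0      # virtual machine, if any
    user_id: int = 0    # identifies the user application
    volume_id: int = 0  # identifies a logical target volume

# Example: a host write to LUN 0, LBA 1 issued from virtual machine 2.
hba = HostBlockAddress(lun=0, lba=1, vm_id=2)
print(hba)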
A disk controller for an HDD or SSD maintains the relationship between an ABA (or the DBA portion of the ABA) and the disk sectors that are physically part of a storage device (often called the physical disk sectors or physical sectors). In exactly the same way the Solid-State Disk Logic 120 maintains the relationship between an ABA and the physical block number (PBN) of an SSD. The PBN of an SSD is analogous to the physical disk sector of an HDD. Due to resource constraints SSDs often manage the PBNs at a coarser granularity than disk sectors. Normally a disk command contains an LBA provided by the host, but in the presence of a storage array controller the disk command contains an ABA provided by the storage array controller.
Because the terms just described can be confusing we summarize the above again briefly. With just a single disk, the host provides an LBA directly to the disk; the disk controller converts the LBA to the physical disk sector (for an HDD) or to the PBN (for an SSD). In the presence of a storage array controller the host still provides an LBA, but now to the storage array controller (and thus we call the LBA an HBA to avoid confusion); the storage array controller then maps this HBA to an ABA and provides the ABA (or possibly just the DBA portion of the ABA) to the disk; the disk (HDD or SSD) then converts this DBA or ABA (treating the DBA portion of the ABA as though it were just an LBA, which it is) to a physical disk address: either the physical disk sector (for an HDD) or PBN (for an SSD).
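To make the layering concrete, here is a toy Python sketch of the two translations (the mapping tables and sizes are made up for illustration; real controllers maintain these structures in hardware or firmware):

# Toy two-level address translation (illustrative mappings only).
hr_map = {0: 7, 1: 3, 2: 12}                        # level 1: HBA -> ABA (storage array controller)
ssd_flash_translation = {7: 101, 3: 88, 12: 102}    # level 2: DBA -> PBN (SSD controller)

def host_read(hba):
    aba = hr_map[hba]                   # storage array controller maps HBA to ABA
    dba = aba                           # DBA portion of the ABA (equal here: single-disk array)
    pbn = ssd_flash_translation[dba]    # SSD controller maps DBA to a physical block number
    return pbn

print(host_read(1))                     # -> 88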
It is important to understand the additional layer of hierarchy that a storage array controller introduces. The storage hierarchy now runs from the host system, through the storage array controller, to the SSD controller(s) and the storage devices; the storage array controller thus sits above the SSD controller, i.e. further from the storage devices.
We will define structures and their functions, operations, and algorithms in terms of software operations, code, and pseudo-code; it should be noted, however, that the algorithms may be performed in hardware; software; firmware; microcode; a combination of hardware, software, firmware or microcode; or in any other manner that performs the same function and/or has the same effect. The data structures, or parts of them, may be stored in the storage array controller in SRAM, DRAM, embedded flash, or other memory. The data structures, or parts of them, may also be stored outside the storage array controller, for example on any of the storage devices of a storage array (the local storage or remote storage, i.e. remote from the storage array connected to the storage array controller) or on a host system (the local host or a remote host, i.e. remote from the host connected to the storage array controller).
We will now define the data structures (including the map and the freelist) that we will use. A map hr_map is defined between the HBAs and ABAs as hr_map[hba]->aba. Thus hr_map takes an HBA as input and returns an ABA. We say that the HBA maps to that ABA (we can also say that the storage array controller maps or re-maps data from the operating system). A special symbol or bit (for example, we have used X in the Map (1) 136 of the accompanying figures) may be used to mark map entries that do not currently correspond to a valid mapping. A freelist aba_free holds the ABAs that are free, i.e. not currently mapped to by any valid HBA.
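A minimal Python sketch of these structures might look as follows (the unmapped marker, the class, and the helper names are illustrative assumptions; the description fixes only the names hr_map and aba_free):

UNMAPPED = None          # plays the role of the special marker in the map figures

class ArrayMap:
    def __init__(self, num_hbas, all_abas):
        self.hr_map = {hba: UNMAPPED for hba in range(num_hbas)}   # hr_map[hba] -> aba
        self.aba_free = list(all_abas)                             # freelist of unused ABAs

    def map(self, hba):
        """Map an HBA to a fresh ABA taken from the freelist."""
        aba = self.aba_free.pop(0)
        self.hr_map[hba] = aba
        return aba

    def lookup(self, hba):
        return self.hr_map[hba]          # returns an ABA, or UNMAPPED

m = ArrayMap(num_hbas=4, all_abas=range(8))
print(m.map(1), m.lookup(1), m.aba_free)   # 0 0 [1, 2, 3, 4, 5, 6, 7]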
We have used the term storage array controller throughout this description rather than storage controller or disk controller, to emphasize that it is separate from the disk controller(s) in the storage devices themselves.
A storage command is directed to a storage device and specifies an operation, such as read, write, etc. A storage command is more commonly called a disk command or just command, a term we will avoid using in isolation to avoid confusion. To avoid such confusion we will use storage command when we are talking about commands in general; but we will save disk command (or disk write, etc.) for the command as it arrives at (or is received by) the disk (either SSD or HDD, usually via a standard interface or storage bus, such as SATA); we will use the term host command (or host write, etc.) for the command as it leaves (or is transmitted by) the OS. A disk command may be the same as a host command when there is a direct connection between the OS on a host system and a single disk.
The algorithms and operations described below use a disk trim command (trim command or just trim are also commonly used). A disk trim command was proposed to the disk drive industry in the 2007 timeframe and introduced in the 2009 timeframe. One such disk trim command is a standard storage command, part of the ATA interface standard, and is intended for use with an SSD. A disk trim command is issued to the SSD; the disk trim command specifies a number of disk sectors on the SSD using data ranges and LBAs (or, as we have explained already, using ABAs or the DBAs contained in ABAs in the presence of a storage array controller); and the disk trim command is directed to the specified disk sectors. The disk trim command allows an OS to tell an SSD that the disk sectors specified in the trim command are no longer required and may be deleted or erased. The disk trim command allows the SSD to increase performance by executing housekeeping functions, such as erasing flash blocks, that the SSD could not otherwise execute without the information in the disk trim command.
It should be noted from the above explanation and our earlier discussion of ABAs that, for example, when we say “place an ABA in a disk trim command,” the disk trim command may actually require an LBA (if it is a standard ATA command for example), and that LBA is the DBA portion of the ABA. To simplify the description we may thus refer to an LBA, DBA and ABA as referring to the same block address, and thus mean the same thing, at the disk level.
Although the disk trim command and other storage commands have fixed and well-specified formats, in practice they may be complicated, with many long fields and a complex appearance. Storage commands may also vary in format depending on the type of storage bus, for example. We will simplify storage commands and other commands in the figures in order to simplify the description (and the format of the storage commands may also vary between different figures and different examples). The algorithms described here are intended to work with any standard or proprietary command set, even though a command shown in a figure in this description may not exactly follow any one standard format.
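Because the figures use a simplified command format, the sketches in this description can model a disk trim command simply as a list of (starting block address, sector count) ranges; the Python sketch below captures only this information content and is not the on-the-wire ATA format:

from dataclasses import dataclass

@dataclass
class DiskTrimCommand:
    """Simplified trim command: ranges of (starting_dba, sector_count) that may be erased."""
    ranges: list

def make_trim(abas):
    """Coalesce a list of ABAs into contiguous (start, count) ranges."""
    ranges, start, count = [], None, 0
    for aba in sorted(abas):
        if start is not None and aba == start + count:
            count += 1                      # extend the current contiguous range
        else:
            if start is not None:
                ranges.append((start, count))
            start, count = aba, 1           # begin a new range
    if start is not None:
        ranges.append((start, count))
    return DiskTrimCommand(ranges)

print(make_trim([4, 5, 6, 10]))             # DiskTrimCommand(ranges=[(4, 3), (10, 1)])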
We now describe Algorithm 1, which allows the Storage Array Controller 108 of Computer System 150 to autonomously issue a disk trim command to Solid-State Disk (1) 116.
We say the Storage Array Controller 108 autonomously issues the disk trim command or issues the disk trim command in an autonomous fashion or in an autonomous manner, or issues autonomous disk trim commands. We use the term autonomous or autonomously here to describe the fact that it is the Storage Array Controller 108 that initiates, originates, or instigates the disk trim command and generates or creates the contents of all (or part) of the disk trim command rather than, for example, Operating System 158 on Host System 102.
Algorithm 1 may be used in a situation where Operating System 158 on Host System 102 does not support the disk trim command (or does not support the disk trim operation). Algorithm 1 may also be used in a situation where Operating System 158 on Host System 102 is unaware of the physical details of the Storage Array 148. Algorithm 1 may be used, for example, in the situation where the sum capacity of the LUNs presented to Operating System 158 on Host System 102 is smaller than the sum capacity of the Storage Array 148. This situation may occur, as an example, because an OS is in a virtual machine and the storage array is being shared by multiple virtual machines. There are, however, many reasons (including the use of storage management; the use of a guest OS; virtualization of machines; remote, NAS, and SAN storage arrays; storage virtualization; and other datacenter functions) that may cause Operating System 158 on Host System 102 to be unable to, or unaware that it can, issue a disk trim command to a Solid-State Disk (1) 116 in the attached Storage Array 148.
Algorithm 1: trim_aba
Step 1. Assume valid HBAs map to a fixed subset of ABAs in hr_map
Step 2. Issue a disk trim command to ABAs in aba_free that are not mapped to by valid HBAs
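As a concrete illustration, the following short Python sketch (illustrative only; the fixed subset of ABAs and the function name are assumptions) shows the two steps of Algorithm 1: valid HBAs are assumed to map into a fixed subset of ABAs, and the remaining ABAs are the targets of the autonomously issued trim command(s).

def algorithm1_trim_targets(valid_abas, all_abas):
    """Step 1: valid HBAs map only into the fixed subset valid_abas.
    Step 2: every other ABA is a target of the autonomous disk trim command."""
    return [aba for aba in all_abas if aba not in valid_abas]

# Example: an 8-block SSD where valid HBAs are confined to ABAs 0..3,
# so the storage array controller trims the set-aside ABAs 4..7.
print(algorithm1_trim_targets({0, 1, 2, 3}, range(8)))   # [4, 5, 6, 7]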
Note that Disk Trim Command (1) 142 shows the same information content that an industry-standard disk trim command contains, but it is not necessarily in the exact format used, for example, by the ATA industry standard.
Note that alternative implementations for Algorithm 1 may include the following: (i) multiple disk trim commands may be combined; (ii) if Operating System 158 in Host System 102 does support the disk trim command, the storage array controller may merge one or more host trim commands into the disk trim command(s) that it issues.
One feature of Algorithm 1 is for a storage array controller to set aside, as unused, a portion (or portions) of an SSD (or SSDs) in a storage array. Thus the sum of the capacities of the LUNs presented to the host system is smaller than the capacity of the storage array. The storage array controller may then autonomously issue disk trim command(s) to the unused portion(s) of an SSD (or SSDs). An SSD may then use the information in the disk trim command to erase or delete flash blocks. The ability to erase or delete flash blocks improves the SSD performance and improves the SSD reliability.
It is important to note that the Storage Array Controller Logic 112 is (i) separate from the Solid-State Disk Logic 120 typically used by the Solid-State Disk Controller Chip 118 and (ii) separate from Operating System 158.
A storage array controller performs certain functions instead of (or in addition to) an OS running on a host system; and a storage array controller also performs certain functions instead of (or in addition to) the SSD controller(s) in a storage array. A storage array controller is logically located between a host system and an SSD. An SSD contains its own SSD controller, but a storage array controller may have more resources than an SSD controller. The algorithms described here allow a storage array controller to use resources, such as larger memory size, non-volatile memory, etc. as well as unique information (because a storage array controller is higher than an SSD controller in a storage array hierarchy, i.e. further from the storage devices) in order to manage and control a storage array as well as provide information to an SSD controller. For example, a storage array controller is aware of LUNs but an SSD controller is not. This hierarchical management approach has other advantages and potential uses that are explained throughout this description in the form of various algorithms that may be employed by themselves or in combination.
Algorithm 1 illustrates the operation of the Storage Array Controller Logic 112 in the Storage Array Controller 108. The description of Algorithm 1 is useful before we describe more complex algorithms that include host write commands and other storage array functions. These more complex algorithms show how structures such as Freelist (1) 138 and Map (1) 136 are maintained and used.
Note that the various storage-array configuration alternatives as well as other various possibilities for the storage array configuration(s), storage bus(es), and various storage device(s) will not necessarily be shown in all of the figures in order to simplify the description.
Note that the Device Driver 228 (and thus Device Driver Logic 236) and Storage Array Controller 108 (and thus Storage Array Controller Chip 110 and Storage Array Controller Logic 112) are: (i) separate from the Solid-State Disk Logic 120 used by the Solid-State Disk Controller Chip 118 and (ii) separate from Operating System 158 (or storage-driver software that may be considered part of Operating System 158).
Note that in the following examples and implementations we may simplify descriptions by showing Storage Array Controller 108 (with Storage Array Controller Chip 110 and Storage Array Controller Logic 112) as issuing the autonomous disk trim command (just as we described above), but the autonomous disk trim command may equally be generated by Device Driver 228 (and thus by Device Driver Logic 236).
Algorithm 2: Storage Array Controller that Maintains a Map and a Freelist
We will now describe Algorithm 2 that builds on Algorithm 1 and that shows how a freelist and map are used.
Alternative implementations of Algorithm 2 may include some or all of the following: (i) an asynchronous disk trim command (i.e. the disk trim command is generated at a different time to that described above and to other events); (ii) a disk trim command may specify multiple disk sectors (using multiple data ranges); (iii) any type of storage array including one or more SSDs; (iv) any of the alternative implementations of the other algorithms in this description; (v) ordering the freelist to increase the likelihood that writes to the SSD are to sequential ABAs (even though the HBAs may be to random addresses).
An old array block address (old ABA) is thus an ABA that is no longer required, containing data that is no longer useful or required; and a new ABA is an ABA, taken from a freelist, that replaces an old ABA and does contain data that is useful or required.
Typically an erase of Flash Memory 122 is performed a block at a time, as shown by E in the Erased Flash Block 312 in the corresponding figure.
One feature of Algorithm 2 is for a storage array controller to maintain a map (i.e. map or re-map data) between host and disk(s) and to autonomously issue disk trim commands to the SSD(s) directed to old ABAs.
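A minimal Python sketch of this Algorithm 2 write path follows (the helper names and the tiny 8-block SSD are assumptions for illustration); the key point is that the old ABA is autonomously trimmed and returned to the freelist:

from collections import deque

hr_map = {}                      # hba -> aba (empty map: nothing written yet)
aba_free = deque(range(8))       # freelist of ABAs on a tiny example SSD

def issue_disk_write(aba, data):
    print(f"WRITE aba={aba} ({len(data)} bytes)")

def issue_disk_trim(abas):
    print(f"TRIM  abas={abas}")

def host_write(hba, data):
    """Handle one host write command in the storage array controller."""
    new_aba = aba_free.popleft()         # take a new ABA from the freelist
    old_aba = hr_map.get(hba)            # ABA previously mapped to this HBA, if any
    hr_map[hba] = new_aba                # re-map the HBA to the new ABA
    issue_disk_write(new_aba, data)      # write the data to the new location
    if old_aba is not None:
        issue_disk_trim([old_aba])       # autonomously trim the old ABA ...
        aba_free.append(old_aba)         # ... and return it to the freelist

host_write(1, b"a" * 512)
host_write(1, b"b" * 512)                # a second write to HBA 1 trims the old ABA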
Algorithm 3: Storage Array Controller that Performs Garbage Collection
We will now describe Algorithm 3, which is based on Algorithm 2 and operates on large groups of sectors called superblocks.
First we describe garbage collection. In the context of solid-state storage, typically flash memory, when a flash page (or some other portion) of a storage device is no longer required (i.e. it is obsolete, no longer valid, or is invalid) that flash page is marked as dirty. When an entire flash block (typically between 16 and 256 flash pages) is dirty, the entire flash block is erased and free space reclaimed. If free space on the device is low, a flash block is chosen that has some dirty flash pages and some clean (i.e. not dirty, good, or valid) flash pages. The clean flash pages are transferred (i.e. written, moved or copied) to a new flash block. All the original clean flash pages are marked as dirty and the old flash block is erased. In the context of solid-state storage, this process of transferring flash pages to new flash blocks and erasing old flash blocks is called garbage collection. The exact technique used for garbage collection, well-known to someone skilled in the art, is not a key part of the algorithms described here. One key idea is that garbage collection is being performed by the storage array controller. We present Algorithm 3 first and then describe each of the steps.
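For readers unfamiliar with flash garbage collection, the following toy Python sketch illustrates the transfer-and-erase process just described (page and block counts are arbitrary and the victim-selection policy is only an example; it is not the algorithm of any particular SSD):

DIRTY = None                                      # marker for a page that is no longer required

def garbage_collect(blocks, free_blocks):
    """Copy the clean pages of a partly dirty block to a free block, then erase the old block."""
    victim = min(blocks, key=lambda b: sum(p is not DIRTY for p in b))   # fewest clean pages
    blocks.remove(victim)
    target = free_blocks.pop()
    for page in victim:
        if page is not DIRTY:
            target[target.index(DIRTY)] = page    # transfer (copy) each clean page
    victim[:] = [DIRTY] * len(victim)             # the old block is now entirely dirty ...
    free_blocks.append(victim)                    # ... and can be erased and reused
    blocks.append(target)

blocks = [["a", DIRTY, "b", DIRTY]]               # one block of four pages, two of them dirty
free_blocks = [[DIRTY] * 4]
garbage_collect(blocks, free_blocks)
print(blocks, free_blocks)                        # [['a', 'b', None, None]] [[None, None, None, None]]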
Algorithm 3: get_write_aba_with_GC(hba)
Step 3.0: Write Loop. Process input host write commands. Go to Step 3.1.
Step 3.1. Host write command arrives at storage array controller. Storage array controller adds the host write command fields (HBA plus HDATA) to a superblock write buffer. Go to Step 3.2.
Step 3.2. Check if the superblock write buffer is full. No: Go to Step 3.1. Yes: Go to Step 3.3.
Step 3.3. Check if we have enough ABAs in the freelist to fill a free superblock. No: Go to Step 3.4. Yes: Go to Step 3.5.
Step 3.4. Perform freelist_tidy to create a free superblock. Go to Step 3.5.
Step 3.5. Update hr_map. Go to Step 3.6.//Similar to Algorithm 2 or equivalent
Step 3.6. Write the entire superblock to disk. Go to Step 3.7.
Step 3.7. End of Write Loop. Go to Step 3.0.
We will now describe the steps in Algorithm 3 and the data structures shown in the accompanying figures.
Step 3.1 details: in the corresponding figure, a host write command (HBA plus HDATA) is shown being added to Superblock Write Buffer 406.
Step 3.2 details: the figure shows the check of whether Superblock Write Buffer 406 is full.
Step 3.3 details: the figure shows the check of whether there are enough ABAs in the freelist to fill a free superblock.
Step 3.4 details: freelist_tidy performs garbage collection to produce a free superblock. In Map (4) 412 HBA 04 is marked for garbage collection with S=G. The garbage collection process in freelist_tidy can thus add ABA 05 to Freelist (4) 416 (as shown by the arrow labeled Step 3.4a in the corresponding figure).
Step 3.5 details: to describe how we update map hr_map, we focus on the first entry in Superblock Write Buffer 406 (corresponding to Host Write Command (4) 404 to HBA=01) shown in the corresponding figure.
Step 3.6 details: in the corresponding figure, the entire superblock assembled in Superblock Write Buffer 406 is written to the SSD.
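Putting the steps together, a minimal Python sketch of Algorithm 3's write loop might look as follows (the four-entry superblock, the helper names, and the stubbed freelist_tidy are assumptions for illustration):

SUPERBLOCK_ENTRIES = 4                   # tiny superblock for illustration

hr_map = {}                              # hba -> aba
aba_free = list(range(16))               # freelist of ABAs
write_buffer = []                        # (hba, hdata) pairs awaiting a full superblock
reclaimable = []                         # old ABAs awaiting freelist_tidy

def freelist_tidy():
    """Garbage-collect old ABAs back onto the freelist (Step 3.4, stubbed)."""
    aba_free.extend(sorted(reclaimable))
    reclaimable.clear()

def host_write(hba, hdata):
    write_buffer.append((hba, hdata))                 # Step 3.1: buffer HBA plus HDATA
    if len(write_buffer) < SUPERBLOCK_ENTRIES:        # Step 3.2: is the buffer full?
        return
    if len(aba_free) < SUPERBLOCK_ENTRIES:            # Step 3.3: enough free ABAs?
        freelist_tidy()                               # Step 3.4: make a free superblock
    new_abas = [aba_free.pop(0) for _ in range(SUPERBLOCK_ENTRIES)]
    for (h, _), aba in zip(write_buffer, new_abas):   # Step 3.5: update hr_map
        if h in hr_map:
            reclaimable.append(hr_map[h])             # the old ABA becomes reclaimable
        hr_map[h] = aba
    print("write superblock to ABAs", new_abas)       # Step 3.6: one large sequential write
    write_buffer.clear()

for i in range(8):                                    # Steps 3.0/3.7: the write loop
    host_write(hba=i % 5, hdata=b"x" * 512)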
Alternative implementations for Algorithm 3 may include one or more of the following: (i) Step 3.4 freelist_tidy may be performed asynchronously (i.e. at a different time) to any write commands so that at most times (and preferably at all times) there is at least one free superblock; (ii) in practice a superblock (and free superblock) will be much larger than the disk sector size, flash block size, or flash page size and could be 32 Mbytes, or more, for example; (iii) if the SSD capacity is 100 Gbyte and a superblock is 1 Gbyte, then to avoid filling the disk we might inform the OS that the SSD capacity is 99 Gbyte for example; (iv) a superblock may contain elements at any granularity or size: for example an element may be a disk sector (512 bytes, for example); but an element may be larger or smaller than 512 bytes, and an element may be larger or smaller than a disk sector; (v) any type of storage array containing one or more SSDs; (vi) any of the alternative implementations of the other algorithms in this description.
As a side note the reader is cautioned that superblock is used in other contexts (filesystems and NAND flash being examples), but that the contexts are close enough that confusion might result if not for this warning. The superblock described here is a collection of disk sectors (block being a common alternative term for disk sector).
The ideas of Algorithm 3 include that a storage array controller: (i) maintains a map between host and disk (i.e. maps or re-maps data), (ii) performs garbage collection, and (iii) autonomously issues disk trim commands directed to superblocks. The storage array controller presents all write and erase operations (including disk trim commands) to an SSD at the granularity of a superblock and this greatly helps the SSD perform its functions, including the garbage collection process of the SSD. Other implementations of Algorithm 3, with other features, are possible without altering these ideas.
Storage Array Controller with Asynchronous Garbage Collection
We will now describe Algorithm 4, which is based on Algorithm 3 and contains the majority of the logic required by a storage array controller. Algorithm 4 includes a detailed implementation of an example garbage collection process. Note that many (or indeed any) garbage collection algorithms may be used. Each major step below is a separate stage of operation: Steps 4.1, 4.2, 4.3, 4.4, 4.5, and 4.6 correspond to: (i) initialization of the storage device or array; (ii) creation of LUNs; (iii) handling of write commands; (iv) deletion of LUNs; (v) increasing LUN size; and (vi) decreasing LUN size.
Algorithm 4: Storage_Controller_1
Step 4.1: Initialization: issue disk trim commands to all ABAs on all disks//Nothing on disk(s)
Step 4.2: LUN creation: set LUN_size=C2
Step 4.3: Write Loop: while there are write commands:
Step 4.3.1: get_write_aba(hba)//pop from aba_free_1 & push to aba_free_2
Step 4.3.2: if threshold_reached( ) go to Step 4.3.3 else go to Step 4.3.1
Step 4.3.3: update aba_free_1( ); go to Step 4.3.1//start using An+3
Step 4.4: LUN deletion:
Step 4.4.1. Issue disk trim commands to all ABAs that are mapped to the LUN
Step 4.4.2. Remove all ABA mappings for the LUN and add the ABAs to the freelist aba_free_1
Step 4.5: LUN increase size: no action required
Step 4.6: LUN decrease size:
Step 4.6.1. Issue a disk trim command specifying all ABAs that are mapped to the LUN region being removed
Step 4.6.2. Remove all ABA mappings for the LUN region being removed and add the ABAs to the freelist aba_free_1
Next, assume that threshold_reached is now true in Step 4.3.2. For example, we can count the ABAs used and set a threshold at four.
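The rotating-area behavior of Step 4.3 can be sketched in Python as follows (four areas, an example area size, a threshold of four ABAs, and the two freelists aba_free_1 and aba_free_2; the sizes are illustrative values only):

NUM_AREAS = 4
AREA_SIZE = 4                            # ABAs per area (example value)
THRESHOLD = 4                            # Step 4.3.2: switch areas after four ABAs are used

def area_abas(n):
    """ABAs belonging to area A(n)."""
    return list(range(n * AREA_SIZE, (n + 1) * AREA_SIZE))

current_area = 0
aba_free_1 = area_abas(current_area)     # ABAs handed out for new writes
aba_free_2 = []                          # ABAs used since the last area switch
abas_used = 0

def update_aba_free_1():
    """Step 4.3.3: start using area A(n+3), modulo the number of areas."""
    global current_area, aba_free_1, abas_used
    current_area = (current_area + 3) % NUM_AREAS
    aba_free_1 = area_abas(current_area)
    abas_used = 0
    print("now allocating from area", current_area)

def get_write_aba(hba):
    """Step 4.3.1: pop from aba_free_1 and push to aba_free_2."""
    global abas_used
    aba = aba_free_1.pop(0)
    aba_free_2.append(aba)
    abas_used += 1
    if abas_used >= THRESHOLD:           # Step 4.3.2: threshold_reached()
        update_aba_free_1()
    return aba

for hba in range(6):
    print("hba", hba, "-> aba", get_write_aba(hba))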
One idea of Algorithm 4 is to allow the storage array controller to manage writing to a large and rotating pool of dirty sectors. The result is that an SSD controller (under or below the storage array controller hierarchically, i.e. closer to the storage devices) may perform its own more efficient garbage collection and clean large dirty areas of flash blocks and flash pages.
Alternative implementations for Algorithm 4 may include one or more of the following: (i) the capacities, the numbers of disk sectors, and sizes of the pools and areas described are many orders of magnitude higher in practice: C1 may be 100 GB and C2 may be 80 GB for example; (ii) instead of a single LUN C2 we can use multiple LUNs: C2, C3, ..., Ci, and then Step 4.2 will check that the sum of Ci is less than C1; (iii) other algorithms may be used to set the area of dirty sectors: a fixed pool (rather than rotating), or multiple pools, might be used for example; (iv) other algorithms may be used to set the threshold(s), pool size(s), and location(s); (v) the freelist(s) may be of various relative sizes, split, and maintained in different ways that may improve the efficiency and speed of the algorithm; (vi) in Step 4.3.3 we change to use area An+3 (modulo 4 or the number of areas: thus if we were using Area 0 (A0), change to Area 3 (A3); from Area 2 (A2) we change to Area 1 (A1), etc.) and this example assumes we have four areas, but the algorithm may use any number of areas; (vii) set the threshold of the test in Step 4.3.2 by using number of writes performed, by number of ABAs used, or any other method; (viii) Step 4.1 may autonomously issue a standard ATA secure erase command to all disks (this will typically mark all ABAs as free, but may also erase SSD wear-leveling and other housekeeping data); (ix) Step 4.1 may autonomously issue a secure erase command that does not erase wear-leveling data; (x) any of the alternative implementations of the other algorithms in this description.
Storage Array Controller for Large Capacity SSDs

We have presented Algorithms 1, 2, 3, and 4 using small disks as examples and correspondingly small numbers to simplify the descriptions. We now describe Algorithm 5 as an example of a storage array controller for use with one or more solid-state disks using components typical of the 2010 timeframe. Algorithm 5 described below may be viewed as a combination of previously described algorithms. This implementation will thus illustrate ideas already described, but in a more realistic and contemporary context.
Algorithm 5: Storage_Controller_2//Combination of Algorithms 3 & 4
Step 5.1: Initialization: issue a disk trim command to all ABAs on all disks//Nothing on disk
Step 5.2: LUN creation: set LUN_size=C2//C2<C1=disk capacity
Step 5.3: get_write_aba_with_GC(hba)//Use Algorithm 3 or equivalent
Step 5.3.0: Write Loop. Process input host write commands. Go to Step 5.3.1.
Step 5.3.1. Host write command arrives at storage array controller. Storage array controller adds the host write command (HBA plus HDATA) to a write buffer. Go to Step 5.3.2.
Step 5.3.2. Check if the superblock write buffer is full. No: Go to Step 5.3.1. Yes: Go to Step 5.3.3.
Step 5.3.3. Check if we have enough ABAs in the freelist to fill a free superblock. No: Go to Step 5.3.4. Yes: Go to Step 5.3.5.
Step 5.3.4. Perform freelist_tidy to create a free superblock. Go to Step 5.3.5.
Step 5.3.5. Update hr_map. Go to Step 5.3.6.
Step 5.3.6. Transmit a disk write command from the superblock write buffer. Go to Step 5.3.7.
Step 5.3.7. End of Write Loop. Go to Step 5.3.0.
Step 5.4: LUN deletion:
Step 5.4.1. Issue a disk trim command to all ABAs that are mapped to the LUN
Step 5.4.2. Remove all ABA mappings for the LUN and add the ABAs to the freelist aba_free_1
Step 5.5: LUN increase size: no action required
Step 5.6: LUN decrease size:
Step 5.6.1. Issue a disk trim command specifying all ABAs that are mapped to the LUN region being removed
Step 5.6.2. Remove all ABA mappings for the LUN region being removed and add the ABAs to the freelist aba_free_1
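As a final sketch, the LUN deletion and LUN-shrink steps (5.4 and 5.6) reduce to the same pattern: collect the ABAs mapped to the affected LUN or LUN region, autonomously issue a disk trim command for them, remove the mappings, and return the ABAs to the freelist. The Python sketch below assumes a hypothetical (LUN, LBA) keyed map purely for illustration:

hr_map = {(0, 0): 10, (0, 1): 11, (1, 0): 12}     # hypothetical (lun, lba) -> aba map
aba_free_1 = []

def delete_lun(lun):
    """Steps 5.4.1 and 5.4.2: trim and unmap every ABA belonging to a LUN."""
    doomed = {hba: aba for hba, aba in hr_map.items() if hba[0] == lun}
    print("TRIM abas", sorted(doomed.values()))    # Step 5.4.1: autonomous disk trim command
    for hba in doomed:
        del hr_map[hba]                            # Step 5.4.2: remove the mappings ...
    aba_free_1.extend(sorted(doomed.values()))     # ... and return the ABAs to the freelist

delete_lun(0)
print(hr_map, aba_free_1)                          # {(1, 0): 12} [10, 11]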
Alternative implementations for Algorithm 5 may include one or more of the following: (i) other sizes of superblock; (ii) multiple superblock sizes; (iii) any type of storage array containing one or more SSDs; (iv) any of the alternative implementations of the other algorithms in this description.
Numerous variations and modifications based on the above description will become apparent to someone with skill in the art once the above description is fully understood. It is intended that the claims that follow be interpreted to embrace all such variations and modifications.
Claims
1. A method of managing a storage array comprising:
- a storage array controller that is operable to receive one or more host commands from an operating system;
- wherein the one or more host commands are directed to one or more solid-state storage devices in the storage array;
- wherein the storage array controller is operable to generate one or more disk trim commands in response to the one or more host commands;
- wherein the generating one or more disk trim commands is performed in an autonomous manner; and
- wherein the one or more disk trim commands are directed to at least one of the one or more solid-state storage devices.
2. The method of claim 1 wherein the operating system is not operable for generating the one or more disk trim commands.
3. The method of claim 1 wherein the generating one or more disk trim commands further comprises merging one or more host trim commands into the one or more disk trim commands.
4. The method of claim 1 wherein the receiving host commands further comprises: updating a map from a plurality of host block addresses to a plurality of array block addresses; and placing one or more old array block addresses in the one or more disk trim commands.
5. The method of claim 1 wherein the managing a storage array is performed in software.
6. The method of claim 1 wherein the managing a storage array is performed in software in a hypervisor.
7. The method of claim 1 wherein the managing a storage array further comprises:
- maintaining one or more maps and one or more freelists;
- performing garbage collection on at least one of the one or more maps and one or more freelists as a result of the receiving of the one or more host commands;
- generating one or more superblocks; and
- placing one or more superblock addresses of the one or more superblocks in the one or more disk trim commands.
8. A storage array controller operable to be coupled to a host system and a storage array; wherein the storage array includes a plurality of storage devices; wherein the plurality of storage devices includes at least one solid-state storage device; wherein the storage array controller is operable to receive host commands from the host system; and wherein the storage array controller is operable to autonomously issue a disk trim command to the at least one solid-state storage device.
9. The storage array controller of claim 8 wherein the storage array controller maintains a map and a freelist; wherein the map converts host block addresses to array block addresses; and wherein the freelist includes a plurality of free array block addresses.
10. The storage array controller of claim 9 wherein the storage array controller is operable to place one or more of the plurality of free array block addresses in the disk trim command.
11. The storage array controller of claim 9 wherein the storage array controller issues a disk trim command to array block addresses that are not in the map.
12. The storage array controller of claim 9 wherein the storage array controller creates one or more old array block addresses; and wherein the storage array controller issues disk trim commands to the one or more old array block addresses.
13. The storage array controller of claim 9 wherein the storage array controller performs garbage collection.
14. The storage array controller of claim 9 wherein the storage array controller collects write commands into one or more superblocks; and wherein the storage array controller writes to one or more of the at least one solid-state disks using the one or more superblocks.
15. The storage array controller of claim 8 wherein the disk trim command is generated in a device driver.
16. The storage array controller of claim 15 wherein the device driver is part of a host system.
17. The storage array controller of claim 15 wherein the device driver is part of a hypervisor.
18. The storage array controller of claim 8 wherein the storage capacity presented to the host system (C1) is less than the storage array capacity (C2); wherein the storage array capacity (C2) minus the storage capacity presented to the host system (C1) is a portion of storage capacity (C2−C1); and wherein the storage array controller autonomously issues a trim command to the portion of storage capacity (C2−C1).
19. The storage array controller of claim 8 wherein the storage array controller issues a disk trim command during an operation selected from the following: storage array initialization, storage array creation, storage array resizing, LUN creation, LUN removal, LUN resizing, LUN deletion.
20. A computer system for storing and providing data; the computer system operable to be coupled to a storage array controller; the storage array controller operable to be coupled to a storage array; the storage array including a plurality of storage devices; the plurality of storage devices including at least one solid-state storage device; and wherein the storage array controller is operable to autonomously issue a disk trim command to one or more of the at least one solid-state storage devices.
Type: Application
Filed: Sep 7, 2010
Publication Date: Mar 8, 2012
Applicant: (Cambridge, MA)
Inventors: Daniel L. Rosenband (Cambridge, MA), Michael John Sebastian Smith (Palo Alto, CA)
Application Number: 12/876,393
International Classification: G06F 12/00 (20060101); G06F 12/02 (20060101);