Managing multiple snapshot copies of data
A method for providing multiple, different point-in-time, read- and write-accessible snapshot copies of a base disk volume in storage arrays is disclosed. The method improves the performance of multiple snapshots by linking them together and sharing only one copy of each unique data block. The method also saves snapshot disk space by dynamically allocating additional space according to actual usage. Additionally, only one copy-on-write procedure needs to be performed for multiple snapshot volumes during access to either the base disk volume or any of the snapshots attached to the base disk. When a snapshot volume is deleted, the disk space and data structures dedicated to that snapshot volume are also deleted, so that storage space and memory resources within the snapshots may be reused for subsequent applications. Additionally, multiple snapshots can be managed such that multiple, different point-in-time copies of the base disk are maintained and updated automatically.
Current high-capacity computerized data storage systems typically involve a storage area network (SAN) within which one or more storage arrays store data on behalf of one or more host devices, which in turn typically service data storage requirements of several client devices. Within such a storage system, various techniques are employed to make an image or copy of the data. One such technique involves the making of “snapshot” or point-in-time copies of volumes of data within the storage arrays without taking the original data “offline,” or making the data temporarily unavailable. Generally, a snapshot volume represents the state of the original, or base, volume at a particular point in time.
Thus, the snapshot volume is said to contain a copy or picture, i.e. “snapshot,” of the base volume.
Snapshot volumes are formed to preserve the state of the base volume for various purposes. For example, daily snapshot volumes may be formed in order to show and compare daily changes to the data. Also, a business or enterprise may want to upgrade its software that uses the base volume from an old version of the software to a new version. Before making the upgrade, however, the user, or operator, of the software can form a snapshot volume of the base volume and concurrently run the new untested version of the software on the snapshot volume and the older known stable version of the software on the base volume. The user can then compare the results of both versions, thereby testing the new version for errors and efficiency before actually switching to using the new version of the software with the base volume. Also, the user can make a snapshot volume from the base volume in order to run the data in the snapshot volume through various different scenarios (e.g. financial data manipulated according to various different economic scenarios) without changing or corrupting the original data in the base volume. Additionally, backup volumes (e.g. tape backups) of the base volume can be formed from a snapshot volume of the base volume, so that the base volume does not have to be taken offline, or made unavailable, for an extended period of time to perform the backup, since the formation of the snapshot volume takes considerably less time than does the formation of the backup volume.
The first time that data is written to a data block in the base volume after forming a snapshot volume, a copy-on-write procedure is performed to copy the original data block from the base volume to the snapshot before writing the new data to the base volume. Afterwards, it is not necessary to copy the data block to the snapshot volume upon subsequent writes to the same data block in the base volume.
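The first-write behavior described above can be sketched in Python; this is a minimal illustration, and the class and method names are hypothetical rather than taken from the disclosure:

```python
# Minimal copy-on-write sketch (hypothetical names): the original data block
# is copied to the snapshot only on the first write after snapshot creation;
# subsequent writes to the same block require no further copying.
class BaseVolume:
    def __init__(self, blocks):
        self.blocks = list(blocks)          # base disk data blocks
        self.snapshot = None                # attached snapshot, if any

    def create_snapshot(self):
        self.snapshot = {}                  # block index -> preserved data
        return self.snapshot

    def write(self, index, data):
        # Copy-on-write: preserve the original block once, then write freely.
        if self.snapshot is not None and index not in self.snapshot:
            self.snapshot[index] = self.blocks[index]
        self.blocks[index] = data

    def snapshot_read(self, index):
        # A snapshot read is satisfied from a preserved block if one exists,
        # otherwise it falls through to the (unchanged) base block.
        if self.snapshot is not None and index in self.snapshot:
            return self.snapshot[index]
        return self.blocks[index]

base = BaseVolume(["A", "B", "C"])
base.create_snapshot()
base.write(1, "B2")     # first write to block 1: original "B" is preserved
base.write(1, "B3")     # subsequent write: no further copy is made
```

A read directed at the snapshot still returns the original block "B", while the base volume holds the newest data.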
When multiple snapshot volumes have been formed, with every write procedure to a previously unchanged data block of the base volume, a copy-on-write procedure must occur for every affected snapshot volume to copy the prior data from the base volume to each of the snapshot volumes. Therefore, with several snapshot volumes, the copying process can take up a considerable amount of the storage array's processing time, and the snapshot volumes can take up a considerable amount of the storage array's storage capacity.
SUMMARY
A method for providing a plurality of different point-in-time, read- and write-accessible snapshot copies of a base disk volume in storage arrays is disclosed. The method improves the performance of multiple snapshots by linking them together and sharing only one copy of each unique data block. The method also saves snapshot disk space by dynamically allocating additional space according to actual usage. Additionally, only one copy-on-write procedure needs to be performed for multiple snapshot volumes during access to either the base disk volume or any of the snapshots attached to the base disk. When a snapshot volume is deleted, the disk space and data structures dedicated to that snapshot volume are also deleted, so that storage space and memory resources within the snapshots may be reused for subsequent applications. Additionally, multiple snapshots can be managed such that multiple, different point-in-time copies of the base disk are maintained and updated automatically.
BRIEF DESCRIPTION OF THE DRAWING
A storage environment, such as a storage area network (SAN) 100 shown in
The storage array typically has more than one conventional multi-host channel RAID storage controller (a.k.a. array controller) 122 and 124, as shown in storage array 114. The array controllers 122 and 124 work in concert to manage the storage array 114, to create the logical volumes 130 and 136 (
The logical volumes 130 and 136 generally include base volumes 130, snapshot volumes 136, and SAN file systems (SANFS) 132, as shown in
The logical volumes 130 and 136 are shown in the storage controllers 122 and 124, since it is within the storage controllers 122 and 124 that the logical volumes perform their functions and are managed. The storage devices 103 provide the actual storage space for the logical volumes 130 and 136.
The primary logical volume for storing data in the storage array 114 (
The base volumes 130 and the snapshot volumes 136 are addressable, or accessible, by the host devices 104-108 (
Before the snapshot volume 136 is created, the SAN file systems 132 corresponding to the snapshot volume 136 must already have been created. The snapshot volume 136 contains copies of data blocks (not shown) from the corresponding base volume 130. Each data block is copied to the snapshot volume 136 the first time that the data stored within the base volume 130 is changed after the point in time at which the snapshot volume 136 is created. The SAN file systems 132 also contain software code for performing certain functions, such as searching for data blocks within the SAN file systems 132 and saving data blocks to the SAN file systems 132 (functions described below). Since the SAN file systems 132 are “internal” to the storage controllers 122 and 124, they respond only to commands from the corresponding base volume 130 and snapshot volume 136, transparent to the host devices 104-108 (
The snapshot volume 136 represents the state of the data in the corresponding base volume 130 at the point in time when the snapshot volume 136 was created. A data access request that is directed to the snapshot volume 136 will be satisfied by data either in snapshot volume 136 or in the base volume 130. Thus, the snapshot volume 136 may not contain all of the data to be accessed. Rather, the snapshot volume 136 includes actual data and identifiers to the corresponding data in base volume 130 and/or additional instances of snapshot volume 136 within the SAN file systems 132. The snapshot volume 136 also includes software code for performing certain functions, such as data read and write functions (described below), on the corresponding base volume 130 and SAN file systems 132. In other words, the snapshot volume 136 issues commands to “call” the corresponding base volume 130 and SAN file systems 132 to perform these functions. Additionally, it is possible to reconstruct, or rollback, the corresponding base volume 130 to the state at the point in time when the snapshot volume 136 was created by copying the data blocks in the snapshot volume 136 back to the base volume 130 by issuing a data read request to the snapshot volume 136.
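The rollback operation described above, in which preserved snapshot blocks are copied back to the base volume, can be sketched as follows; the function name and data layout are hypothetical illustrations, not the disclosure's own structures:

```python
# Hypothetical rollback sketch: copying the blocks preserved by copy-on-write
# back into the base volume restores the base to its state at the point in
# time when the snapshot was created.
def rollback(base_blocks, snapshot):
    """base_blocks: list of current base data blocks (modified in place).
    snapshot: dict mapping block index -> original block preserved by COW."""
    for index, original in snapshot.items():
        base_blocks[index] = original
    return base_blocks

blocks = ["A", "B3", "C"]   # base volume after some writes since the snapshot
snap = {1: "B"}             # block 1 was preserved when it was first changed
rollback(blocks, snap)      # base volume is restored to its snapshot-time state
```

Only the blocks that were actually changed (and therefore preserved) need to be copied back, which is why rollback cost scales with the amount of change rather than with volume size.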
The SAN file systems 132 intercept the data access requests directed to the base volume 130, transparent to the host devices 104-108 (
A SAN file system 132 (a software program labeled SANFS) executes on each of the storage controllers 122 and 124 to receive and process data access commands directed to the base volume 130 and the snapshot volume 136. Thus, the SAN file system 132 “calls,” or issues commands to, the base volume 130 and the snapshot volume 136 to perform the data read and write functions and other functions.
Additionally, the SAN file system 132 executes on each of the storage controllers 122 and 124, respectively, to manage the creation and deletion of the snapshot volumes 136 and the base volumes 130 (described below). Thus, the SAN file systems 132 create all of the desired snapshot volumes 136 from the base volume 130, typically in response to commands to the SAN file system 132 (
The technique for storing the data for the snapshot volume 136 using multiple point-in-time images is illustrated in
Furthermore, the base disk node 148 points to its first (most ancient) snapshot, shown as snap1 150 in
In-memory relationships shown in
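The linked chain of point-in-time images described above, in which the base disk node points to its most ancient snapshot and each snapshot links onward, can be sketched as a simple singly linked structure; all names here are hypothetical:

```python
# Sketch of the linked snapshot chain (hypothetical structure): the base node
# points at its oldest snapshot, and each snapshot points at the next newer
# one, so a single shared copy of each preserved block can be located by
# walking the chain toward the base instead of duplicating the block.
class SnapNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}        # block index -> data this snapshot stores
        self.newer = None       # next (more recent) snapshot, or None

def chain_read(snap, index, base_blocks):
    # Search this snapshot, then each newer snapshot, then the base volume.
    node = snap
    while node is not None:
        if index in node.blocks:
            return node.blocks[index]
        node = node.newer
    return base_blocks[index]

snap1, snap2 = SnapNode("snap1"), SnapNode("snap2")
snap1.newer = snap2             # snap1 is older; snap2 is more recent
snap2.blocks[0] = "A"           # the single shared copy lives in snap2 only
base = ["A'", "B"]              # block 0 on the base has since been changed
```

A read of block 0 through snap1 is satisfied by the copy held in snap2, and a read of block 1 falls through to the base volume, so only one copy of each unique block exists anywhere in the chain.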
A procedure 180 for the SAN file system 132 (
A procedure 200 for the SAN file system 132 (
Procedure 224 for a base volume to respond to a data read or write request is shown in
The base write procedure 224 starts at step 234 in
Procedures 250 and 270, by which a snapshot volume responds to a data read or write request, are shown in
The snapshot read procedure 250 begins at step 254 in
The snapshot write procedure 270 begins at step 272 in
The snapshot disk COW table lookup procedure 282 begins at step 286 in
The COW table structure 300 in
The COW table status flag 318 indicates one of three states of a COW table entry: 1) unused; 2) the snapshot data blocks chunk is the original base disk data blocks chunk; 3) the snapshot data blocks chunk is a modified copy of the original base disk data blocks chunk. Each COW table entry operates on a chunk of snapshot data blocks, whose block length is user definable but not required to be specified. Although every snapshot has its own COW table, the actual snapshot data blocks chunk is not necessarily stored in that snapshot's own disk space. The snapshot disk pointer 316 links a COW table entry to the actual snapshot disk volume where the snapshot data blocks chunk is stored. By way of example, if a data block on the base disk, having snapshot 1 and snapshot 2, is changed for the first time, a new entry is added in the COW tables of both snapshot 1 and snapshot 2, but the pointer 316 in the COW table of snapshot 1 points to snapshot 2, which is the most recent snapshot and stores the original base data block changed on the base volume. If a later write to snapshot 1 addresses the same data blocks chunk, the actual snapshot blocks chunk is first copied from snapshot 2 to snapshot 1, then pointer 316 is updated to point to snapshot 1, and finally the write to the snapshot proceeds.
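The COW table entry and the snapshot-write path just described can be sketched as follows; the field and function names are hypothetical, and the three status values mirror the three states listed above:

```python
# Hypothetical sketch of a COW table entry and the snapshot-write path: each
# entry carries a status flag (the three states above) and a pointer naming
# the snapshot volume that actually stores the chunk. A write to an older
# snapshot first copies the shared chunk into that snapshot's own space,
# then repoints the entry, and only then performs the write.
UNUSED, ORIGINAL, MODIFIED = 0, 1, 2        # the three status-flag states

class CowEntry:
    def __init__(self, status=UNUSED, owner=None, data=None):
        self.status = status                # one of the three states above
        self.owner = owner                  # snapshot volume storing the chunk
        self.data = data                    # chunk contents, once owned locally

def snapshot_write(entry, snap_name, new_data, fetch_chunk):
    """Write new_data to a snapshot's chunk. fetch_chunk(owner) retrieves the
    shared chunk from the snapshot volume currently storing it."""
    if entry.owner != snap_name:
        # Chunk is shared: copy it into this snapshot's space first, then
        # update the pointer (316 in the description above) to this snapshot.
        entry.data = fetch_chunk(entry.owner)
        entry.owner = snap_name
    entry.data = new_data                   # finally, the write proceeds
    entry.status = MODIFIED

# Usage: snap1's entry points at snap2, which stores the shared chunk "B".
entry = CowEntry(status=ORIGINAL, owner="snap2")
snapshot_write(entry, "snap1", "B-new", fetch_chunk=lambda owner: "B")
```

The copy-before-write step matters when writes cover only part of a chunk: the rest of the chunk must still hold the data previously shared from the newer snapshot.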
The procedure 322 shown in
The procedure 344 for calculating the snapshot disk size is shown in
It should be further noted that numerous changes in details of construction, combination, and arrangement of elements may be resorted to without departing from the true spirit and scope of the invention as hereinafter claimed.
Claims
1. A method for managing multiple snapshot copies of data in a storage area network, comprising:
- providing a plurality of different point-in-time read and write accessible snapshot copies of a base disk volume in a storage array wherein said plurality of snapshot copies are all linked together sharing only one copy of a unique data block.
2. The method according to claim 1 for managing multiple snapshot copies of data in a storage area network, comprising:
- saving snapshot disk space by dynamically allocating additional space required according to actual usage.
3. The method according to claim 1 for managing multiple snapshot copies of data in a storage area network, comprising:
- performing only one copy-on-write procedure for said plurality of snapshot copies during access to said base disk volume.
4. The method according to claim 1 for managing multiple snapshot copies of data in a storage area network, comprising:
- performing only one copy-on-write procedure for said plurality of snapshot copies during access to any of said plurality of snapshot copies that are attached to said base disk volume.
5. The method according to claim 1 for managing multiple snapshot copies of data in a storage area network, comprising:
- deleting a snapshot copy wherein disk space and data structure dedicated to that snapshot copy are also deleted such that storage space and memory resource within said plurality of snapshot copies may be reused for subsequent applications.
6. The method according to claim 1 for managing multiple snapshot copies of data in a storage area network, comprising:
- maintaining and updating different point-in-time snapshot copies of said base disk volume.
7. The method according to claim 1 for managing multiple snapshot copies of data in a storage area network, comprising:
- managing said plurality of snapshot copies and said base disk volume by a storage area network file system located within an array controller.
8. The method according to claim 1 for managing multiple snapshot copies of data in a storage area network, comprising:
- adding a snapshot copy to said plurality of snapshot copies by adding it after a last snapshot copy, thereby continuing said link of said plurality of snapshot copies.
9. The method according to claim 1 for managing multiple snapshot copies of data in a storage area network, comprising:
- deleting a snapshot copy from said plurality of snapshot copies by deleting a first snapshot copy wherein a second snapshot copy becomes the first snapshot copy, thereby continuing said link of said plurality of snapshot copies.
10. A storage area network system, comprising:
- a storage array having one or more storage controllers;
- a storage area network file system located within said one or more storage controllers for controlling a base volume and one or more snapshot volumes wherein said snapshot volumes are a plurality of different point-in-time read and write accessible snapshot copies of said base volume and said plurality of snapshot copies are all linked together sharing only one copy of a unique data block.
11. The storage area network system according to claim 10 wherein said one or more storage controllers separately connect to storage devices across dedicated buses.
12. The storage area network system according to claim 10 wherein snapshot disk space of said snapshot volumes is saved by dynamically allocating additional space required according to actual usage.
13. The storage area network system according to claim 10 wherein only one copy-on-write procedure needs to be performed for said plurality of snapshot copies during access to said base volume by said storage area network file system.
14. The storage area network system according to claim 10 wherein only one copy-on-write procedure needs to be performed for said plurality of snapshot copies during access to any of said plurality of snapshot copies by said storage area network file system.
15. The storage area network system according to claim 10 wherein a snapshot copy that is deleted has its disk space and data structure dedicated to that snapshot copy also deleted such that storage space and memory resource within said plurality of snapshot copies may be reused for subsequent applications.
16. The storage area network system according to claim 10 wherein point-in-time snapshot copies of said base disk volume are maintained and updated by said storage area network file system.
17. The storage area network system according to claim 10 wherein said plurality of snapshot copies and said base disk volume are managed by a storage area network file system located within an array controller and further managed by said base disk volume.
18. The storage area network system according to claim 10 wherein a snapshot copy is added to said plurality of snapshot copies by adding it after a last snapshot copy, thereby continuing said link of said plurality of snapshot copies.
19. The storage area network system according to claim 10 wherein a snapshot copy is deleted from said plurality of snapshot copies by deleting a first snapshot copy wherein a second snapshot copy becomes the first snapshot copy, thereby continuing said link of said plurality of snapshot copies.
20. A storage area network system comprising:
- means for providing a plurality of different point-in-time read and write accessible snapshot copies of a base disk volume in a storage array wherein said plurality of snapshot copies are all linked together sharing only one copy of a unique data block;
- means for saving snapshot disk space by dynamically allocating additional space required according to actual usage; and
- means for performing only one copy-on-write procedure for said plurality of snapshot copies during access to said base disk volume.
Type: Application
Filed: Aug 25, 2004
Publication Date: Mar 2, 2006
Inventor: Calvin Zheng (Camarillo, CA)
Application Number: 10/925,803
International Classification: G06F 12/16 (20060101);