SYSTEMS AND METHODS FOR RECOVERING ELECTRONIC INFORMATION FROM A STORAGE MEDIUM

In one embodiment of the invention, a method is provided for retrieving certain electronic information previously stored on certain storage media after a threshold set in the storage retention criteria has been exceeded, in an electronic information storage system that stores electronic information on storage media in accordance with storage retention criteria. The method includes storing a record in a memory associated with a system manager that assigns the storage retention criteria to the certain electronic data, designating the storage media available for overwrite after the threshold set in the storage retention policy has been exceeded, identifying the certain storage media available for overwrite, and retrieving information from the certain media after the threshold set in the storage retention policy has been exceeded.

Description
CROSS-REFERENCE TO RELATED APPLICATION(S)

Priority Claim

This application is a divisional of U.S. application Ser. No. 12/276,868 titled SYSTEMS AND METHODS FOR RECOVERING ELECTRONIC INFORMATION FROM A STORAGE MEDIUM, filed Nov. 24, 2008, which is a continuation of U.S. application Ser. No. 11/269,515 titled SYSTEMS AND METHODS FOR RECOVERING ELECTRONIC INFORMATION FROM A STORAGE MEDIUM, filed Nov. 7, 2005, now U.S. Pat. No. 7,472,238, which claims the benefit of U.S. Provisional Application No. 60/626,076 titled SYSTEM AND METHOD FOR PERFORMING STORAGE OPERATIONS IN A COMPUTER NETWORK, filed Nov. 8, 2004, and U.S. Provisional Application No. 60/625,746 titled STORAGE MANAGEMENT SYSTEM filed Nov. 5, 2004, each of which is incorporated herein by reference in its entirety.

RELATED APPLICATIONS

This application is also related to the following patents and pending applications, each of which is hereby incorporated by reference in its entirety:

    • U.S. Pat. No. 6,418,478, titled PIPELINED HIGH SPEED DATA TRANSFER MECHANISM, issued Jul. 9, 2002;
    • application Ser. No. 09/610,738, titled MODULAR BACKUP AND RETRIEVAL SYSTEM USED IN CONJUNCTION WITH A STORAGE AREA NETWORK, filed Jul. 6, 2000, now U.S. Pat. No. 7,035,880;
    • application Ser. No. 09/774,268, titled LOGICAL VIEW AND ACCESS TO PHYSICAL STORAGE IN MODULAR DATA AND STORAGE MANAGEMENT SYSTEM, filed Jan. 30, 2001, now U.S. Pat. No. 6,542,972;
    • application Ser. No. 60/409,183, titled DYNAMIC STORAGE DEVICE POOLING IN A COMPUTER SYSTEM, filed Sep. 9, 2002;
    • application Ser. No. 11/269,520, titled SYSTEM AND METHOD FOR PERFORMING MULTISTREAM STORAGE OPERATIONS, filed Nov. 7, 2005;
    • application Ser. No. 11/269,512, titled SYSTEM AND METHOD TO SUPPORT SINGLE INSTANCE STORAGE OPERATIONS, filed Nov. 7, 2005;
    • application Ser. No. 11/269,514, titled METHOD AND SYSTEM OF POOLING STORAGE DEVICES, filed Nov. 7, 2005, now U.S. Pat. No. 7,809,914;
    • application Ser. No. 11/269,521, titled METHOD AND SYSTEM FOR SELECTIVELY DELETING STORED DATA, filed Nov. 7, 2005, now U.S. Pat. No. 7,765,369;
    • application Ser. No. 11/269,519, titled METHOD AND SYSTEM FOR GROUPING STORAGE SYSTEM COMPONENTS, filed Nov. 7, 2005, now U.S. Pat. No. 7,500,053; and
    • application Ser. No. 11/269,513, titled METHOD AND SYSTEM FOR MONITORING A STORAGE NETWORK, filed Nov. 7, 2005.

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosures, as it appears in the Patent and Trademark Office patent files or records, but otherwise expressly reserves all other rights to copyright protection.

BACKGROUND

The present invention generally relates to the storage and retrieval of electronic data used in computer systems. More particularly, the present invention relates to systems and methods for managing the storage of electronic data on a recordable medium in a manner that extends the amount of time the electronic data may be retrieved from the recordable medium before the medium is reused in another storage application.

The storage of electronic data has evolved over time. During the early development of the computer, storage of electronic data was limited to individual computers. Electronic data was stored in the Random Access Memory (RAM) or some other storage medium such as a magnetic tape or hard drive that was a part of the computer itself.

Later, with the advent of network computing, the storage of electronic data gradually migrated from the individual computer to stand-alone storage devices accessible via a network. These individual network storage devices soon evolved into networked tape drives, optical libraries, Redundant Arrays of Inexpensive Disks (RAID), CD-ROM jukeboxes, and other devices. Common architectures included drive pools, which generally are logical collections of drives with associated media groups including magnetic tapes or other storage media used by a given drive pool.

Storage systems, such as some of the systems described above, typically employ certain high capacity data storage mediums, which may include magnetic tapes, optical disks and the like to store electronic information. At some point in time, however, it is often no longer necessary or desirable to retain the electronic information stored on these media. When this point is reached, the media on which such electronic information is stored may be reused or recycled by the system for use in other storage jobs rather than simply discarding the media or maintaining the information in perpetuity.

For example, in a tape-based system, a storage tape with unwanted or outdated information may be designated within the storage management system for reuse in a subsequent storage operation in a spare media pool. Such a spare media pool may contain media that is available for storage use in subsequent storage operations and may include new media or media designated for reuse within the storage system. When storage media are assigned to the spare media pool, any information in the storage management system regarding the old data on the tape may be discarded, erased or designated for overwrite and replaced with a simple designation indicating that the tape is available for use in another storage operation. For example, an index entry used by the storage management system that includes information about the old data may be overwritten or renamed after the data retention period has expired.

In many storage systems, however, the reused storage media continues to contain the data from the previous storage operation, which typically remains on the media until it is overwritten by a new storage process. Thus, in many storage systems, the media designated for reuse continues to contain old information for a significant period of time past any established retention date. Nevertheless, because records are not typically retained or retrievable by storage management systems regarding the media designated for reuse (and any old information contained thereon), it is difficult to recover or restore any of this old information, absent the use of cumbersome, uncommon restore procedures, despite the fact that such information still exists on media designated for reuse within the system.

Accordingly, what is needed are systems and methods that overcome this and other deficiencies.

SUMMARY

In one embodiment of the invention, a method is provided for retrieving certain electronic information previously stored on certain storage media after a threshold set in the storage retention criteria has been exceeded, in an electronic information storage system that stores electronic information on storage media in accordance with storage retention criteria. The method includes storing a record in a memory associated with a system manager that assigns the storage retention criteria to the certain electronic data, designating the storage media available for overwrite after the threshold set in the storage retention policy has been exceeded, identifying the certain storage media available for overwrite, and retrieving information from the certain media after the threshold set in the storage retention policy has been exceeded.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other objects and advantages of the present invention will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:

FIG. 1 is a block diagram of a network architecture for a system that performs storage and retrieval operations on electronic data in a computer network in accordance with the principles of the present invention;

FIG. 2 is a block diagram of an exemplary media library storage device for a system to perform storage and retrieval operations in accordance with an embodiment of the invention;

FIG. 3 is a flow chart illustrating some of the steps for performing storage and retrieval operations on electronic data in a computer network according to an embodiment of the invention; and

FIG. 4 is a flow chart illustrating some of the steps involved with selecting media for reuse in accordance with an embodiment of the invention.

DETAILED DESCRIPTION

An embodiment of the system constructed in accordance with the principles of the present invention is shown in FIG. 1. As shown, the system may include a client 50, a data agent 60, a data store 70, a storage management component (SMC) 80, a storage manager index 90, one or more media management components 100 (sometimes referred to as media agents), one or more media management component indexes 110, and one or more storage devices 120. Although FIG. 1 depicts a system having two media management components 100, there may be one media management component, or a plurality of media management components providing communication between the client 50, storage manager 80 and the storage devices 120. In addition, the system can include one or a plurality of storage devices 120. Moreover, in some embodiments, media management components 100 may be removed, omitted or otherwise bypassed, with storage manager 80 directly controlling storage devices 120.

Client 50 can be any networked client 50 and may include at least one attached data store 70. Data store 70 may be any memory device or local data storage device known in the art, such as a hard drive, CD-ROM drive, tape drive, RAM, or other types of magnetic, optical, digital and/or analog local storage. In some embodiments of the invention, client 50 includes at least one data agent 60, which is a software module that is generally responsible for storing, archiving, migrating, and recovering data of a client 50 stored in data store 70 or other memory location.

Storage operations may include, but are not limited to, creation, storage, retrieval, migration, deletion, and tracking of primary or production volume data, secondary volume data, primary copies, secondary copies, auxiliary copies, snapshot copies, backup copies, incremental copies, differential copies, synthetic copies, HSM copies, archive copies, Information Lifecycle Management (“ILM”) copies, and other types of copies and versions of electronic data.

In some embodiments of the invention, the system of FIG. 1 provides at least one, and typically a plurality of, data agents 60 for each client. Each data agent 60 is intended to store, back up, migrate, and recover data associated with a different application. For example, a client 50 may have different individual data agents 60 designed to handle Microsoft Exchange data, Lotus Notes data, Microsoft Windows file system data, Microsoft Active Directory Objects data, and other types of data known in the art.

Storage manager 80 is generally a software module or application that coordinates and controls the system. For example, storage manager 80 may manage and control storage operations performed by the system shown in FIG. 1. Storage manager 80 may communicate with some or all components of the system including clients 50, data agents 60, media management components 100, and storage devices 120 to initiate and manage storage operations. Storage manager 80 may include an index 90 for storing data related to storage operations (described in more detail below). Generally speaking, storage manager 80 communicates with storage devices 120 via a media management component 100. In some embodiments, storage manager 80 may communicate directly with storage devices 120.

The system shown in FIG. 1 may include one or more media management components, such as media management component 100. Media management component 100 may be implemented as a software module that conveys data, as directed by the storage manager 80, between the client 50 and one or more storage devices 120, which can be storage devices such as a tape library, a hard drive, a magnetic media storage device, an optical media storage device, or other storage device. Media management component 100 is communicatively coupled with and may control storage device 120. For example, media management component 100 might instruct a storage device 120 to store, archive, migrate, or restore application specific data. Media management component 100 generally communicates with the storage device 120 via a local bus such as a SCSI adaptor or a host bus adaptor (HBA).

Each media management component 100 may maintain an index cache 110 which stores index data that the system generates during storage operations. For example, storage operations for Microsoft Exchange data generate index data. Index data may include, for example, information regarding the location of the stored data on a particular media (e.g., a location offset value), information regarding the content of the data stored such as file names, sizes, creation dates, formats, application types, and other file-related criteria, information regarding one or more clients associated with the data stored, information regarding one or more storage policies, storage criteria, or storage preferences associated with the data stored, compression information, retention-related information, encryption-related information, stream-related information, and other types of information. Index data thus provides the system with an efficient mechanism for performing storage operations including locating user files for recovery operations and for managing and tracking stored data.

The system of FIG. 1 may maintain multiple copies of the index data regarding particular stored data. A first copy may be stored with the data copied to a storage device 120. Thus, a tape may contain the stored data as well as index information related to the stored data. In the event of a system restore, the index data stored with the stored data can be used to rebuild a media management component index 110 or other index useful in performing storage operations. In addition, the media management component 100 that controls the storage operation also may generally write an additional copy of the index data to its index cache 110. The data in the media management component index cache 110 is generally stored on faster media, such as magnetic media, and is thus readily available to the system for use in storage operations and other activities without having to be first retrieved from the storage device 120.

Storage manager 80 may also maintain an index cache 90. Storage manager index data may be used to indicate, track, and associate logical relationships and associations between components of the system, user preferences, management tasks, and other useful data. For example, storage manager 80 might use its index cache 90 to track logical associations between media management components 100 and storage devices 120. Storage manager 80 may also use its index cache 90 to track the status of storage operations to be performed, storage patterns associated with the system components such as media use, storage growth, network bandwidth, service level agreement (“SLA”) compliance levels, data protection levels, storage policy information, storage criteria associated with user preferences, retention criteria, storage operation preferences, and other storage-related information.

Index caches 90 and 110 typically reside on their corresponding storage component's hard disk or other fixed storage device. For example, jobs agent 85 of a storage manager component 80 may retrieve storage manager index 90 data regarding a storage policy and storage operation to be performed or scheduled for a particular client 50. Jobs agent 85, either directly or via another system module, may communicate with data agent 60 regarding the storage operation. In some embodiments, jobs agent 85 may also retrieve from index cache 90 a storage policy associated with the client 50 and use information from the storage policy to communicate to data agent 60 one or more media management components 100 associated with performing storage operations for that particular client 50 as well as other information regarding the storage operation to be performed such as retention criteria, encryption criteria, streaming criteria, etc. Data agent 60 then packages or otherwise manipulates the client data stored in client data store 70 in accordance with the storage policy information and/or according to a user preference, and communicates this client data to the appropriate media management component(s) 100 for processing. Media management component(s) 100 store the data according to storage preferences associated with the storage policy including storing the generated index data with the stored data, as well as storing a copy of the generated index data in the media management component index cache 110. Data may be stored in accordance with any suitable storage policy or preference including those disclosed in U.S. patent application Ser. No. 10/818,749, which is hereby incorporated by reference in its entirety.

In some embodiments, components of the system may reside and execute on the same computer. In some embodiments, a client component such as a data agent 60, a media management component 100, or a storage manager 80 coordinates and directs local archiving, migration, and retrieval application functions as further described in U.S. patent application Ser. No. 09/610,738, which is hereby incorporated by reference in its entirety. These client components can function independently or together with other similar client components.

Storage device 120 may be any conventional storage device capable of storing data. Some storage devices 120 may include a robotic arm (not shown) that may be used to insert and remove storage media 145 contained in the storage device. The type of storage media used in storage device 120 is not critical and can be a magnetic tape or optical disk, such as that generally depicted in FIG. 2. For example, storage device 120 may include any suitable storage media such as storage tapes 145, but some embodiments may also include other optical and magnetic media such as CDRW, DVDRW, etc., (not shown). Storage device 120 may also include drives 125, 130, and 135 for reading information from and writing information to such media. Tapes 145 may store electronic data containing backups of application data, user preferences, system information, and other useful information known in the art.

In operation, the system shown in FIG. 1 may store electronic data on storage media 145 as described above. Generally speaking, the information stored on media 145 may be maintained in accordance with particular storage policy or retention preference that may be predefined or updated periodically. Such policies may be user defined or may be one of several available predefined default settings (e.g., as directed by a storage manager index). A storage policy is generally a data structure or other information that includes a set of preferences and other storage criteria for performing a storage operation. The preferences and storage criteria may include, but are not limited to: a storage location, relationships between system components, network pathway to utilize, retention policies, data characteristics, compression or encryption requirements, preferred system components to utilize in a storage operation, and other criteria relating to a storage operation. A storage policy may be stored to a storage manager index, to archive media as metadata for use in restore operations or other storage operations, or to other locations or components of the system.

In the case where information is retrieved from media 145, storage manager 80 and/or media management components 100 may cooperate with one another and interact with storage device 120 to locate a particular media 145 and retrieve the desired data. Media 145 may be located using any suitable means including index information that specifies the physical location of the media within storage device 120 and may also utilize external or internal labels or other indicia identifying the media and data stored thereon. Such media identifiers may include on media labels (OMLs), bar codes, RFIDs, etc.

Furthermore, during normal operation, storage device 120 may reuse or recycle storage media 145 as appropriate to provide the system with the storage resources necessary to perform future storage operations and to promote the efficient use of spare media within the system. One benefit of this reuse type system is that it reduces the amount of media required by the storage system, thereby eliminating the need for large amounts of unnecessary storage media 145.

For example, storage manager 80 and/or media management component 100 may monitor the retention preferences or storage policies of data stored on media 145. When certain data exceeds one or more predetermined thresholds (e.g., exceeds an age, size or other specified parameter), storage manager 80 and/or media management component 100 may designate the media on which that data is stored available for current use (i.e., may be overwritten). This allows storage device 120 to use that media, which still contains old data that has passed its retention period, for new storage tasks. For example, after certain data on a particular media 145 has exceeded a threshold parameter, media management component 100 may designate that media for reuse. The information regarding the old data, however, still exists and may be retained (e.g., in an index or backup index). This information, which may include descriptive metadata, is useful in future restore operations where spare media tape 145 has not yet been overwritten and it is desired to retrieve some of that data stored thereon. Such information may be retained until the media 145 is completely overwritten with new data. After media 145 is overwritten with new data in a subsequent storage operation, the old data previously stored on the overwritten portion of the media is usually unrecoverable.
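By way of illustration only, the threshold check described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any embodiment: the `MediaRecord` type, its fields, and the `designate_expired` function are hypothetical names introduced here, and age is used as the only threshold parameter.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class MediaRecord:
    """Hypothetical index record for one piece of media (e.g., a tape)."""
    media_id: str
    written_on: date
    retention_days: int          # assumed age threshold from the retention criteria
    contents: list = field(default_factory=list)  # descriptive metadata
    available_for_overwrite: bool = False


def designate_expired(records, today):
    """Mark media whose data has aged past retention as available for overwrite.

    The descriptive metadata in `contents` is deliberately left untouched so
    that the old data can still be located until the media is overwritten.
    """
    designated = []
    for rec in records:
        expires = rec.written_on + timedelta(days=rec.retention_days)
        if today > expires and not rec.available_for_overwrite:
            rec.available_for_overwrite = True
            designated.append(rec.media_id)
    return designated


records = [
    MediaRecord("TAPE-001", date(2005, 1, 1), 30, ["mail.pst"]),
    MediaRecord("TAPE-002", date(2005, 3, 1), 365, ["ledger.db"]),
]
expired = designate_expired(records, today=date(2005, 4, 1))
```

In this sketch only the first record is designated for reuse, and its metadata remains in place so that a later restore request could still find the old data.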

In some embodiments, media 145 may be managed by assigning the media to one or more “media pools.” Media 145 may be assigned to a particular media pool by storage manager 80 based on certain attributes of the data stored on the media. For example, one type of media pool may be referred to as a “save pool.” Media assigned to a save pool may be designated by storage manager 80 and/or media management components 100 as “write protected” or “unavailable” or “in storage.” Certain media 145 may be assigned to such a save pool in the case where the data stored therein is to remain in storage and accessible pursuant to a storage policy and therefore cannot be overwritten or reused at this point in time. Storage manager 80 may retain records and other information relating to the data stored on each media 145 in the save pool such as its physical location and the relationship between the data, media ID, and storage policy in order to coordinate access and management of storage resources and the stored data.

Another type of media pool may be referred to as a “scratch pool.” Media assigned to a scratch pool may be designated by storage manager 80 as “writeable” or “available” or “spare” or “spare media pool.” Media assigned to the scratch pool is generally available for storage operations and is generally not write protected or otherwise restricted from use within a storage device. Thus, when the system of FIG. 1 requires additional media 145 for new storage operations, the spare media pool is where such media may be located and made available to the system. Moreover, media 145 may be assigned to such a scratch pool in the case where the media is newly added to the system or where the data stored on a previously used media no longer needs to be retained or has exceeded a limitation set forth in its storage policy and therefore may be overwritten or reused at this point in time. For example, certain data may have exceeded its age criteria. In this case, the media 145 on which that data is stored may be designated for reuse and assigned to the spare media pool. The metadata and other information describing the data may also be retained in a spare media index that tracks such information (not shown). The spare media index may be substantially similar to or the same as index 90 used to track media 145 and may be stored in or be part of index 90.

Thus, the system of FIG. 1 has the ability to keep track of what previously used media is available for new storage operations by consulting an index of data records that indicate which media 145 are members of which media pool. In other embodiments, the status of a particular media may be determined by consulting records that are maintained on a media by media basis. For example, when a certain media is available for new storage operations, a flag may be set in that media's profile record. With this system, storage manager 80 may quickly determine system capacity, availability and degree of utilization of spare media.
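One way such a pool membership index could be represented is sketched below. The `Pool` enumeration, the `PoolIndex` class, and its methods are hypothetical and are offered only to make the idea concrete; utilization is assumed here to mean the fraction of tracked media that is not spare.

```python
from enum import Enum


class Pool(Enum):
    SAVE = "save"        # write protected, still under retention
    SCRATCH = "scratch"  # spare media, available for new storage operations


class PoolIndex:
    """Hypothetical index mapping media identifiers to their media pool."""

    def __init__(self):
        self._pool_of = {}

    def assign(self, media_id, pool):
        # Reassignment is purely logical; no physical movement is implied.
        self._pool_of[media_id] = pool

    def members(self, pool):
        return sorted(m for m, p in self._pool_of.items() if p is pool)

    def utilization(self):
        """Fraction of tracked media that is not spare (0.0 when empty)."""
        total = len(self._pool_of)
        if total == 0:
            return 0.0
        return len(self.members(Pool.SAVE)) / total


index = PoolIndex()
index.assign("TAPE-001", Pool.SCRATCH)
index.assign("TAPE-002", Pool.SAVE)
index.assign("TAPE-003", Pool.SAVE)
index.assign("TAPE-004", Pool.SCRATCH)
```

A call such as `index.members(Pool.SCRATCH)` then lists the spare media, and `index.utilization()` gives a rough measure of how much of the media is in use.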

Moreover, in some embodiments, data may be overwritten on spare media (and the media may be reused) based on a classification scheme or according to certain preferences. For example, data may be assigned to various retention levels and may be overwritten based on those retention levels, with the highest priority data being overwritten last. Thus, for example, low priority data may be overwritten first, intermediate priority data may be overwritten next, and high priority data overwritten last. Such a hierarchy extends the lifecycle of data on a sliding scale, providing additional flexibility in retrieving data based on retention level, while making storage media available within the system.
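The retention-level hierarchy described above can be illustrated with a short sketch. The three-level `RetentionLevel` enumeration and the `overwrite_order` function are assumptions made for the example; an actual classification scheme may use any number of levels.

```python
from enum import IntEnum


class RetentionLevel(IntEnum):
    LOW = 0
    INTERMEDIATE = 1
    HIGH = 2


def overwrite_order(spare_media):
    """Return media ids ordered so lower retention levels are overwritten first.

    `spare_media` is an iterable of (media_id, RetentionLevel) pairs. sorted()
    is stable, so media at the same level keep their incoming order.
    """
    return [media_id for media_id, _ in sorted(spare_media, key=lambda m: m[1])]


spare = [
    ("TAPE-010", RetentionLevel.HIGH),
    ("TAPE-011", RetentionLevel.LOW),
    ("TAPE-012", RetentionLevel.INTERMEDIATE),
    ("TAPE-013", RetentionLevel.LOW),
]
order = overwrite_order(spare)
```

Here the two low priority tapes are selected first and the high priority tape last, so the highest priority data stays recoverable longest.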

Unlike prior art systems, a preferred embodiment of the present invention continues to retain records and other information relating to media assigned to the scratch pool (or simply for the media designated for reuse in general irrespective of whether a scratch pool or save pool concept is actually implemented). For example, media management component 100 and/or storage manager 80 may store or retain records relating to each media 145 in the scratch pool including its physical location within storage device 120, the data stored on that media, as well as information useful in indexing that data, media identification information and storage policy, etc. (e.g., in a spare media pool index). This allows the present invention to identify and retrieve previously stored information from scratch pool media that has exceeded its retention date, thus accommodating the need for the reuse of storage media and promoting system efficiency while succeeding in extending the storage period of previously stored data past its retention date by leveraging description data already present within the system. This ability represents an improvement over prior art systems which typically cannot access old information from recycled media despite the fact such information continues to remain on spare media within the storage system prior to overwrite.

In some embodiments, the index or other information retained for the scratch pool media (i.e., spare media pool index) may be the same as or substantially similar to the information retained for save pool media. In this case, when a media 145 is assigned to the scratch pool from the save pool, the associated records may be simply copied or redesignated as scratch pool records. Using this approach, little or no additional processing of existing media management information need be performed to obtain detailed and accurate information regarding scratch pool media. The redesignated information may be used by storage manager 80 or other management systems (not shown) to retrieve old information that remains on scratch pool media (prior to reuse).
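A minimal sketch of such a redesignation follows, assuming for illustration that each pool's records are held in a dictionary keyed by media identifier. The `redesignate` function and the record fields (`slot`, `contents`, `policy`) are hypothetical.

```python
def redesignate(save_index, scratch_index, media_id):
    """Move a media record from the save pool index to the scratch pool index.

    The record itself (location, contents, policy) is carried over unchanged;
    only its pool designation differs afterwards.
    """
    record = save_index.pop(media_id)
    scratch_index[media_id] = record
    return record


save_index = {
    "TAPE-001": {"slot": 14, "contents": ["mail.pst"], "policy": "30-day"},
}
scratch_index = {}
moved = redesignate(save_index, scratch_index, "TAPE-001")
```

Because the same record object is reused, no additional processing of the media management information is needed to describe the scratch pool media.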

Although media may be reassigned from one storage pool to the other as described above, it will be understood that this does not necessarily require any physical movement of the storage media from one location to another. Rather, media may remain at one location with that media being reassigned to the scratch pool within management software resident on storage manager 80.

Furthermore, in some embodiments of the invention, storage manager 80 may monitor the reuse of media from the scratch pool such that the system keeps track of the storage space and/or data overwritten by subsequent storage operations. This may involve updating the scratch pool media records so that the records reflect how much of the old data remains on that particular media. For example, a certain media 145 designated for reuse may be partially overwritten such that it includes both new data and old data. This may involve keeping track of certain files, chunks, and/or blocks of data including any location offset on the media or any description. Storage manager 80 may update the records associated with that media so it may be readily determined how much old information still may be recovered. Such updating may be automated and triggered by reuse of a previously used media 145 and/or in accordance with any classification or retention scheme such that description information or metadata may be updated, deleted or otherwise modified when corresponding portions of data are overwritten on media 145. Any suitable data monitoring and updating procedure or program may be used to achieve this objective. This feature permits the present invention to identify and retrieve (or partially retrieve) old information from media already in reuse.
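The record update triggered by a partial overwrite might be sketched as follows. The `Chunk` type and `prune_overwritten` function are hypothetical, and the sketch assumes that old data is tracked as chunks with a byte offset and length and that a chunk becomes unrecoverable as soon as any part of it is overwritten.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical description of one chunk of old data on a media."""
    name: str
    offset: int   # location offset on the media, in bytes
    length: int


def prune_overwritten(chunks, write_start, write_end):
    """Return the old chunks that survive a new write to [write_start, write_end).

    A chunk is treated as unrecoverable if any part of it overlaps the newly
    written range.
    """
    surviving = []
    for c in chunks:
        chunk_end = c.offset + c.length
        overlaps = c.offset < write_end and write_start < chunk_end
        if not overlaps:
            surviving.append(c)
    return surviving


old = [Chunk("a.doc", 0, 100), Chunk("b.xls", 100, 50), Chunk("c.ppt", 150, 200)]
remaining = prune_overwritten(old, write_start=0, write_end=120)
```

After a new write covering the first 120 bytes, only the chunk beginning at offset 150 is still listed as recoverable in this example.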

Another aspect of the invention involves the management, organization and display of save pool and scratch pool information. In some embodiments, both save pool and scratch pool information may be organized and displayed using a graphical user interface with familiar pull down menus and a folder/file organization structure. For example, a user may browse information in either pool by merely clicking on a particular folder (such as save or scratch) and select a particular media (which may be represented as a file within the folder) to view the information stored on that particular media. This allows a user media level access to the information stored in the system. In other embodiments, browse features associated with the system may locate and display for a user a graphical view of all save pool media in one display and a different display that shows all the scratch pool media. For example, by searching for all available spare pool media, the system of FIG. 1 may populate a table, list or other graphical display showing the available spare pool media, the records of the data stored on that media, and any other useful information (e.g., a spare pool media display). The same or similar may also be done for save pool media. In some embodiments, access to such information may be password protected within the system and available to only users with the appropriate privileges. For example, a user may only have privileges to the save pool and not the scratch pool, or may have access to high level data such as the available or used media, but not to any index information, etc.

Additionally, management software may include a search engine and command functions that allow the user to quickly search save system media to determine if particular data exists or to observe the status of certain media. For example, if a user wants to determine if certain data which has passed its retention cycle still exists on media within the scratch pool, a Boolean word search or other searching method may determine whether that data still exists or not. Moreover, the system may generate summaries that include general information such as listing the oldest data in the scratch pool, the current contents of the scratch pool, remaining unused system storage capacity, etc. Command functions may allow users to modify or otherwise direct manipulation of media outside of normal automated operation. These summary, command, and search functions may be user configurable and arranged according to the needs or desires of a particular user.
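
As a non-limiting sketch, a Boolean word search of the kind mentioned above might operate on free-text descriptions kept in the scratch pool records. The function name search_scratch_pool, its parameters, and the sample media identifiers and descriptions are assumptions made only for this example.

```python
def search_scratch_pool(records, all_of=(), any_of=(), none_of=()):
    """Return the media IDs whose description satisfies a simple Boolean query.

    records: mapping of media ID -> free-text description (assumed layout).
    all_of:  every term must appear (AND).
    any_of:  at least one term must appear (OR); ignored when empty.
    none_of: no term may appear (NOT).
    """
    hits = []
    for media_id, description in records.items():
        words = set(description.lower().split())
        if not all(term.lower() in words for term in all_of):
            continue
        if any_of and not any(term.lower() in words for term in any_of):
            continue
        if any(term.lower() in words for term in none_of):
            continue
        hits.append(media_id)
    return sorted(hits)


# Hypothetical scratch pool contents.
scratch_records = {
    "tape-01": "marketing email backup 2003",
    "tape-02": "financial ledger backup 2003",
    "tape-03": "marketing brochure drafts",
}
marketing_not_email = search_scratch_pool(
    scratch_records, all_of=["marketing"], none_of=["email"])
email_or_ledger = search_scratch_pool(
    scratch_records, any_of=["email", "ledger"])
```

Because the search runs against records rather than the media itself, it reports what the index believes is still present; how current that is depends on the records being updated as media are overwritten.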

The system of FIG. 1 may select media for reuse employing a number of different selection criteria. For example, media that contains data past its retention cycle may be designated for reuse immediately after (or some time after) the retention period expires. However, the order in which those media are overwritten may vary according to default or user-specified preferences. For example, a default preference may specify that the media containing the oldest data be overwritten first. Other default scenarios may include specifying reuse preference based on data type. For example, all marketing data may be overwritten before any financial data is overwritten, system backup files may have priority over email backups, etc. System users may customize their system with reuse procedures and policies that best reflect the needs of a particular business or enterprise. Nonetheless, it will be understood that any suitable reuse or recycle policy may be used if desired.
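
Purely as an example, the default "oldest first" preference and a user-specified preference by data type might be expressed as sort orders. The function reuse_order, the dictionary keys, and the ranking table below are hypothetical, and the sample media are invented for illustration.

```python
from datetime import date


def reuse_order(media, type_rank=None):
    """Order expired media for overwrite.

    With no type_rank, the media holding the oldest data comes first.
    With a type_rank (lower rank is overwritten sooner), media are ordered
    by data type first and by age within each type. Unranked data types
    sort after all ranked ones.
    """
    if type_rank is None:
        return sorted(media, key=lambda m: m["written"])
    unranked = len(type_rank)
    return sorted(
        media,
        key=lambda m: (type_rank.get(m["data_type"], unranked), m["written"]),
    )


# Hypothetical media whose retention periods have all expired.
expired = [
    {"id": "A", "data_type": "financial", "written": date(2003, 1, 1)},
    {"id": "B", "data_type": "marketing", "written": date(2004, 6, 1)},
    {"id": "C", "data_type": "marketing", "written": date(2003, 9, 1)},
]
default_order = [m["id"] for m in reuse_order(expired)]
by_type_order = [
    m["id"] for m in reuse_order(expired, {"marketing": 0, "financial": 1})
]
```

Under the type-based policy all marketing media (C, then B) precede the financial media A, even though A holds the oldest data.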

Some of the steps involved in recovering electronic information from a storage medium in accordance with the present invention are illustrated in flow chart 300 shown in FIG. 3. As shown, at step 302, the system of FIG. 1 may determine what data is to be stored, and which retention policy should govern the storage operation. This step is preferably automated and may be accomplished, at least in part, by system management software resident on storage manager 80 which oversees the storage of information on a particular media 145. At this point, a certain storage media may be assigned to a save pool and be designated as restricted. After these decisions have been made and the data stored, a record may be created at step 304 that may be maintained in index 90 and/or media management component 100. Next, at step 306, jobs agent 85 or other management agent monitors the retention policies of data stored within the save pool.

At step 308, when certain data exceeds its retention threshold, jobs agent 85 or other selection logic may selectively “prune” or remove certain media from the save pool by releasing its associated index entry and designating it available (i.e., placing it in the scratch pool) while retaining its record profile. Next, at step 310, storage manager 80 and/or media management component 100 may select a media 145 for overwrite based on default or other criteria described above (the “reused media”) and update that media's record profile accordingly. At step 312, a user may optionally search for and retrieve information from the reused media assigned to the scratch pool using the indexing and location information stored at step 308. This may be accomplished, for example, by invoking the media pool display screen described above, and populating that display with the desired information. A user may then retrieve or otherwise access data stored on the identified media. Afterwards, at step 314, reused media may be partially overwritten in a new storage operation. At this point, the media in use may have record profiles that belong to both the scratch and save pools and both new and old data may be retrieved from the media. For example, a certain media 145 may have index entries in both the save and scratch pool with offset data defining the location of the old or new data on that media. Furthermore, media 145 used in this type of dual role may be organized in any suitable way, as desired, such as by overwriting large contiguous sections, or by selectively overwriting old data of lesser importance, etc. Media 145 containing both old and new information may sometimes be referred to as hybrid media.

Next, at step 316, media management component 100 and/or storage manager 80 may update the record profile or index entry associated with the reused media to reflect the extent to which the reused media has been overwritten and to indicate how much old data still remains. The record profile may also be updated to reflect the newly added information. At step 318, a user may optionally retrieve any old data remaining on the reused media, and finally, at step 320 the reused media may be completely overwritten and its associated record profile may be updated to reflect this change. At this point, the reused media may be assigned back to the save pool, and the records in the scratch pool regarding this media may be deleted.

Although the steps shown above are illustrative of a general embodiment of the invention, it will be understood these steps are not intended to be comprehensive or necessarily performed in the order shown. For example, steps 314 to 318 may be performed on an iterative basis until the media in use is completely overwritten or designated to the save pool. For example, steps 314 to 318 may be performed until a threshold is reached, such as media capacity, in which case the index data may be deleted.
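
For illustration only, the pool transitions of steps 308 through 320 might be modeled as shown below. The sketch assumes that pool membership is tracked as a set of pool names and that old data is measured as a single byte count; the class Media and the functions prune and overwrite are hypothetical names, not components of the system of FIG. 1.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Media:
    """Hypothetical media with its pool membership and scratch pool record."""
    media_id: str
    old_bytes_remaining: int
    pools: set = field(default_factory=lambda: {"save"})
    scratch_record: Optional[dict] = None


def prune(m: Media) -> None:
    """Step 308: release the media to the scratch pool, keeping its record."""
    m.pools = {"scratch"}
    m.scratch_record = {"old_bytes_remaining": m.old_bytes_remaining}


def overwrite(m: Media, nbytes: int) -> None:
    """Steps 314 to 320: overwrite old data and update the record profile."""
    if "scratch" not in m.pools:
        raise ValueError("media has not been released to the scratch pool")
    m.old_bytes_remaining = max(0, m.old_bytes_remaining - nbytes)
    if m.old_bytes_remaining == 0:
        # Step 320: fully overwritten, so the scratch pool record is deleted.
        m.pools = {"save"}
        m.scratch_record = None
    else:
        # Hybrid media: entries in both pools while old data remains.
        m.pools = {"save", "scratch"}
        m.scratch_record["old_bytes_remaining"] = m.old_bytes_remaining


# Illustrative walk through the flow for one media.
media = Media("media-145", old_bytes_remaining=1000)
prune(media)
pools_after_prune = set(media.pools)
overwrite(media, 400)
pools_when_hybrid = set(media.pools)
remaining_when_hybrid = media.scratch_record["old_bytes_remaining"]
overwrite(media, 600)
```

Calling overwrite repeatedly corresponds to performing steps 314 to 318 on an iterative basis until no old data remains.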

Some of the steps involved in selecting media assigned to the scratch pool for overwrite in accordance with the present invention are illustrated in flow chart 400 shown in FIG. 4. As shown, at step 402, the system of FIG. 1 may determine what media is to be overwritten first according to certain defined or default criteria as described above. Available media may be tracked in a data structure by storage manager 80 and/or media management component 100 such that the most appropriate media is readily identifiable and available when the need arises for spare media. For example, a data structure representing a virtual queue or other arrangement may be used to track and order media according to retention criteria or other preferences such as first in, first out, by data type or subject matter, etc.
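
One possible form of such a virtual queue, offered only as a sketch, is a priority queue keyed on the date the retention period expired, with insertion order used to break ties so that media with equal keys leave in first in, first out order. The class ScratchQueue and its methods are hypothetical; Python's standard heapq module is used for the ordering.

```python
import heapq
from datetime import date
from itertools import count


class ScratchQueue:
    """Hypothetical virtual queue of spare media awaiting overwrite."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker: first in, first out

    def add(self, media_id, sort_key):
        """Track a media; a smaller sort_key is selected sooner."""
        heapq.heappush(self._heap, (sort_key, next(self._arrival), media_id))

    def next_media(self):
        """Remove and return the most appropriate media, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


# Illustrative use with retention expiry dates as the sort key.
queue = ScratchQueue()
queue.add("tape-03", date(2004, 5, 1))
queue.add("tape-01", date(2003, 1, 15))
queue.add("tape-02", date(2003, 1, 15))
selection_order = [queue.next_media() for _ in range(3)]
```

A different preference, such as ordering by data type, could be accommodated by supplying a different sort key without changing the queue itself.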

Next, at step 404, a certain media identified in the data structure may be retrieved for an overwrite operation, which would overwrite portions of data previously stored on that media. Storage device 120 may retrieve this media and confirm it is the correct one by verifying its identity via an OML, a header file, or other marking indicia at step 406 to ensure the correct media has been selected for overwrite. If the media identity is verified, the media may be overwritten at step 408 and tracked according to the applicable retention preference or policy at step 410. If the media identity is not verified, the system of FIG. 1 may perform the appropriate discovery steps to locate the media in question at step 412 (e.g., by systematically searching through various media libraries). If the media is located through discovery, the verification procedure may be performed from step 404 going forward. If found, and the media is the correct one, it can continue on to step 408. If not, step 412 may be repeated several times, and if the media is still not found, it may be determined as lost at step 414 (e.g., by setting a flag in the media index profile, or assigning the media to a “lost pool”).
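
The verification and discovery loop of steps 404 through 414 might, as one non-limiting sketch, be written as follows. The callables retrieve, discover and overwrite stand in for storage device and library operations that are not specified here, the "label" key stands in for an OML or header file, and the limit of three discovery attempts is an arbitrary assumption.

```python
def overwrite_verified(expected_id, retrieve, discover, overwrite,
                       max_attempts=3):
    """Overwrite a media only after its identity has been verified.

    retrieve(expected_id) -> media dict or None   (step 404)
    discover(expected_id) -> media dict or None   (step 412)
    overwrite(media)      -> performs the write   (step 408)
    Returns "overwritten" or "lost" (step 414).
    """
    media = retrieve(expected_id)
    attempts = 0
    while True:
        # Step 406: compare the label read from the media with the expected ID.
        if media is not None and media.get("label") == expected_id:
            overwrite(media)
            return "overwritten"
        if attempts >= max_attempts:
            return "lost"
        attempts += 1
        media = discover(expected_id)


# Illustrative library in which the first media retrieved is the wrong one.
slots = {"slot-1": {"label": "tape-09"}, "slot-2": {"label": "tape-07"}}
written = []


def search_library(media_id):
    """Hypothetical discovery: scan every slot for a matching label."""
    return next((m for m in slots.values() if m["label"] == media_id), None)


found_result = overwrite_verified(
    "tape-07", lambda _id: slots["slot-1"], search_library, written.append)
lost_result = overwrite_verified(
    "tape-99", lambda _id: None, search_library, written.append)
```

A caller receiving "lost" could then set a flag in the media index profile or assign the media to a lost pool, as described above.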

Although the steps shown above are illustrative of a general embodiment of the invention, it will be understood these steps are not intended to be comprehensive or necessarily performed in the order shown.

Thus, systems and methods for recovering electronic information from a storage medium are provided. It will be understood that the foregoing is merely illustrative of the principles of the present invention and that various modifications can be made by those skilled in the art without departing from the scope and spirit of the invention. Accordingly, such embodiments will be recognized as within the scope of the present invention.

Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein. Software and other modules may reside on servers, workstations, personal computers, computerized tablets, PDAs, and other devices suitable for the purposes described herein. Software and other modules may be accessible via local memory, via a network, via a browser or other application in an ASP context, or via other means suitable for the purposes described herein. Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein. User interface elements described herein may comprise elements from graphical user interfaces, command line interfaces, and other interfaces suitable for the purposes described herein. Screenshots presented and described herein can be displayed differently as known in the art to input, access, change, manipulate, modify, alter, and work with information.

While the invention has been described and illustrated in connection with preferred embodiments, many variations and modifications as will be evident to those skilled in this art may be made without departing from the spirit and scope of the invention, and the invention is thus not to be limited to the precise details of methodology or construction set forth above as such variations and modifications are intended to be included within the scope of the invention.

Persons skilled in the art will appreciate that the present invention can be practiced by other than the described embodiments, which are presented for purposes of illustration rather than of limitation and that the present invention is limited only by the claims that follow.

Claims

1. A computer-implemented method for selecting a spare storage medium to be used in a data storage operation, wherein the spare storage medium includes data designated to be overwritten, the method comprising:

accessing at least one index that stores— first index information regarding a first spare storage medium, and second index information regarding a second spare storage medium;
identifying the first spare storage medium associated with the first index information, wherein the first index information identifies data stored on the first spare storage medium, and wherein the first spare storage medium includes data designated to be overwritten;
identifying the second spare storage medium associated with the second index information, wherein the second index information identifies data stored on the second spare storage medium, and wherein the second spare storage medium includes data designated to be overwritten; and
selecting the first spare storage medium for use in a data storage operation based on the accessing of the first index information and the second index information.

2. The method of claim 1, wherein the first index information identifies data having a lower priority of preservation than the data identified by the second index information.

3. The method of claim 1, wherein the first index information identifies data older than the data identified by the second index information.

4. The method of claim 1, further comprising:

overwriting the data stored on the first storage medium during the data storage operation; and
deleting the first index information only after the data stored on the first storage medium is partially overwritten.

5. The method of claim 1, further comprising:

overwriting the data stored on the first storage medium during the data storage operation; and
deleting the first index information only after the data stored on the first storage medium is substantially overwritten.

6. A data storage system, comprising:

at least one client computing device; and
at least one data storage device coupled to the client computing device via a network, wherein the data storage device includes at least a first spare storage medium and a second spare storage medium;
wherein the client computing device is programmed to:
access at least one index having— first index information regarding the first spare storage medium of the data storage device, and second index information regarding the second spare storage medium of the data storage device;
identifying the first spare storage medium associated with the first index information, wherein the first index information identifies data stored on the first spare storage medium, and wherein the first spare storage medium includes data designated to be overwritten;
identifying the second spare storage medium associated with the second index information, wherein the second index information identifies data stored on the second spare storage medium, and wherein the second spare storage medium includes data designated to be overwritten; and
selecting the first spare storage medium for use in a data storage operation based on the accessing of the first index information and the second index information.

7. The data storage system of claim 6, wherein the first index information identifies data having a lower priority of preservation than the data identified by the second index information.

8. The data storage system of claim 6, wherein the first index information identifies data older than the data identified by the second index information.

9. The data storage system of claim 6, further comprising:

overwriting the data stored on the first storage medium during the data storage operation; and
deleting the first index information only after the data stored on the first storage medium is partially overwritten.

10. The data storage system of claim 6, further comprising:

overwriting the data stored on the first storage medium during the data storage operation; and
deleting the first index information only after the data stored on the first storage medium is substantially overwritten.

11. A method for selecting a spare storage medium to be used in a data storage operation, wherein the spare storage medium includes data designated to be overwritten, the method comprising:

identifying a first spare storage medium associated with first index information that identifies data stored on the first spare storage medium;
identifying a second spare storage medium associated with second index information that identifies data stored on the second spare storage medium;
selecting the first spare storage medium for use in a data storage operation based on a review of the first index information and the second index information, wherein the first spare storage medium includes data designated to be overwritten;
retrieving a storage medium;
verifying that the retrieved storage medium is the first spare storage medium, wherein the verifying includes automatically reading data from the retrieved storage medium; and,
when the retrieved storage medium is verified to be the first spare storage medium, then overwriting, with new data, the data designated to be overwritten on the first spare storage medium.

12. The method of claim 11, wherein the first index information identifies data having a lower priority of preservation than the data identified by the second index information.

13. The method of claim 11, wherein the first index information identifies data older than the data identified by the second index information.

14. The method of claim 11, further comprising:

overwriting the data stored on the first storage medium during the data storage operation; and
deleting the first index information only after the data stored on the first storage medium is partially overwritten.

15. The method of claim 11, further comprising:

overwriting the data stored on the first storage medium during the data storage operation; and
deleting the first index information only after the data stored on the first storage medium is substantially overwritten.

16. The method of claim 11, wherein the automatically reading data from the retrieved storage medium includes reading an on media label (OML) on the retrieved storage medium.

17. The method of claim 11, wherein the automatically reading data from the retrieved storage medium includes reading a header file on the retrieved storage medium.

18. The method of claim 11, further comprising systematically searching media libraries to locate the first spare storage medium when the retrieved storage medium is verified not to be the first spare storage medium.

Patent History
Publication number: 20110093672
Type: Application
Filed: Dec 16, 2010
Publication Date: Apr 21, 2011
Inventors: Parag Gokhale (Ocean, NJ), Jun Lu (Ocean, NJ), Yanhui Lu (Acton, MA), Yu Wang (Edison, NJ), Rajiv Kottomtharayil (Marlboro, NJ)
Application Number: 12/970,536
Classifications
Current U.S. Class: Entry Replacement Strategy (711/159); Addressing Or Allocation; Relocation (epo) (711/E12.002)
International Classification: G06F 12/02 (20060101);