System and method for automated data storage management
A system, method and program product for assigning management classes to data objects. The disclosed system includes a class assignment system for assigning a management class to an inputted data object, wherein the class assignment system analyzes historical usage characteristics of existing data objects that have attributes similar to the inputted data object; and a data analysis system that analyzes existing data objects to generate a knowledge base of historical usage characteristics. The historical usage characteristics are updated as existing objects make their way through the storage lifecycle.
The present invention relates generally to data storage management, and more specifically relates to a system and method for assigning management classes to data objects.
BACKGROUND OF THE INVENTION
In today's large scale storage systems, a tremendous amount of effort can be expended in assigning management criteria to individual or groups of data objects. In many modern operating systems, such as OS/390 and z/OS, “management classes” are used to define the criteria under which data elements will be administered. Namely, a defined management class will determine how an object will be managed over a period of time. For example, a management class may dictate that a given data object reside on DASD for x days, then be moved to a compressed format on DASD for y days, then be migrated to tape and stored for z days, and then be discarded.
Unfortunately, the process of assigning management classes is often done by a storage administrator who has to make various assumptions about the data. Often, however, the storage administrator is well removed from the application side of the environment that generated or utilizes the data. To address this, prescribed policies and procedures are often put into place for application implementers. Unfortunately, such procedures are often not followed or completely understood, owing to a lack of training, oversight, etc. Moreover, once storage management criteria are assigned to a group of objects, the criteria often do not remain up to date with the application requirements.
Such problems are further exacerbated by the use of storage area networks (SANs), in which data may be distributed over a disparate network. In such cases, it is not feasible for a storage administrator to know how a given set of data should be managed. Furthermore, in such scenarios, disparate groups of users often commingle data, further complicating the process. Additional complications arise when one organization assumes responsibility for another organization's data.
Accordingly, a need exists for a system and method that can effectively assign management classes to data objects in an automated manner.
SUMMARY OF THE INVENTION
The present invention addresses the above-mentioned problems, as well as others, by providing a system and method for automatically assigning management classes to data objects. In a first aspect, the invention provides a management class processing system, comprising: a class assignment system for assigning a management class to an inputted data object, wherein the class assignment system identifies historical usage characteristics of existing data objects that have attributes similar to the inputted data object; and a data analysis system that analyzes existing data objects to generate a knowledge base of historical usage data.
In a second aspect, the invention provides a program product stored on a recordable medium for processing management classes, comprising: program code for assigning a management class to an inputted data object by analyzing historical usage characteristics of existing data objects that have attributes similar to the inputted data object; and program code that analyzes existing data objects to generate a knowledge base of historical usage data.
In a third aspect, the invention provides a method for assigning management classes, comprising: analyzing existing data objects in a storage system to determine historical usage characteristics; inputting a new data object having at least one attribute; and assigning the new data object to a management class by analyzing historical usage characteristics of similarly attributed existing data objects.
In a fourth aspect, the invention provides a system for deploying an application for assigning management classes to data objects, comprising: a computer infrastructure being operable to: analyze existing data objects in a storage system to determine historical usage characteristics; input a new data object having at least one attribute; and assign the new data object to a management class by analyzing historical usage characteristics of similarly attributed existing data objects.
In a fifth aspect, the invention provides computer software embodied in a propagated signal for assigning management classes to data objects, the computer software comprising instructions to cause a computer system to perform the following functions: analyze existing data objects in a storage system to determine historical usage characteristics; input a new data object having at least one attribute; and assign the new data object to a management class by analyzing historical usage characteristics of similarly attributed existing data objects.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other features of this invention will be more readily understood from the following detailed description of the various aspects of the invention taken in conjunction with the accompanying drawings in which:
The drawings are not necessarily to scale. The drawings are merely schematic representations, not intended to portray specific parameters of the invention. The drawings are intended to depict only typical embodiments of the invention, and therefore should not be considered as limiting the scope of the invention. In the drawings, like numbering represents like elements.
BEST MODE FOR CARRYING OUT THE INVENTION
Referring now to the drawings,
In one illustrative embodiment, object attributes 11 may include information such as name, data object type, size, creation information, source node information, etc., relating to object 10. Thus, attributes of inputted data object 10 may include that the object was created at node X by application Y with a size Z. Historical usage characteristics 18 generally comprise usage information, i.e., how the existing data objects were used or managed over time, e.g., that an object was stored on DASD for x days, then compressed for y days, then moved to storage for z days, etc.
Then, based on a set of rules or logic, class assignment system 14 would assign a management class to data object 10. For instance, if class assignment system 14 found a historical record of data objects with similar name, attribute, creating node and application metadata, the class assignment system 14 would base the selection of the management class for data object 10 upon the actual usage traits of its predecessors. It should be understood that class assignment system 14 could utilize any logic for selecting management classes based on its analysis of the data object attributes 11 and historical usage characteristics 18.
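The matching step described above can be sketched as follows. This is only an illustration under stated assumptions: the record fields, the scoring rule (one point per matching attribute, with names compared on their first dot-separated qualifier), the min_score threshold, and the use of a median are all hypothetical choices, since the description leaves the assignment logic open.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class HistoricalRecord:
    """One hypothetical knowledge-base entry: attributes plus an observed usage trait."""
    name: str
    source_node: str
    application: str
    days_active: int  # days the object was actually in use before going idle


def similarity(name, source_node, application, record):
    """Count how many attributes of the new object match a historical record."""
    score = 0
    # Assumption: names are compared on their first dot-separated qualifier.
    if name.split(".")[0] == record.name.split(".")[0]:
        score += 1
    if source_node == record.source_node:
        score += 1
    if application == record.application:
        score += 1
    return score


def typical_active_days(name, source_node, application, knowledge_base, min_score=2):
    """Return the median active period of similar predecessors, or None if none match."""
    similar = [
        r for r in knowledge_base
        if similarity(name, source_node, application, r) >= min_score
    ]
    if not similar:
        return None
    return median(r.days_active for r in similar)


knowledge_base = [
    HistoricalRecord("PAYROLL.JAN", "nodeX", "appY", 30),
    HistoricalRecord("PAYROLL.FEB", "nodeX", "appY", 40),
    HistoricalRecord("LOGS.JAN", "nodeZ", "appQ", 5),
]

payroll_estimate = typical_active_days("PAYROLL.MAR", "nodeX", "appY", knowledge_base)
unknown_estimate = typical_active_days("TEMP.SCRATCH", "nodeQ", "appR", knowledge_base)
```

The returned figure would then feed whatever rule selects the management class; a result of None indicates that no similar predecessors were found.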
Historical usage characteristics 18 may be stored in a knowledge base as metadata that is managed by a data management system 20. Data management system 20 would for instance be responsible for storing, updating, grouping, searching, etc., historical usage characteristics 18. Historical usage characteristics 18 could be extracted, collected, and/or processed (i.e., “analyzed”) from a storage system 26 by a data analysis system 22. Storage system 26 represents a storage environment for any type of enterprise, system, or subsystem. Storage system 26 may include, e.g., hardware, software, an operating system, etc., necessary to manage data stored therein. In one illustrative embodiment, data analysis system 22 would periodically gather usage information about the objects stored in storage system 26. Alternatively, data analysis system 22 would obtain usage information whenever a triggering event (e.g., allocate, close, recall, etc.) was detected in the storage system 26. This information would then be passed to data management system 20, which would then store it in the knowledge base of historical usage characteristics 18. Thus, historical usage characteristics 18 can be collected for each existing data object in storage system 26 as the data object makes its way through the storage lifecycle. For example, if an object gets recalled to DASD, goes to tape, gets compressed, gets deleted, etc., historical usage characteristics 18 get updated.
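The event-triggered collection path can be illustrated with a minimal sketch. The StorageEvent and KnowledgeBase types, the on_storage_event handler, and the integer day stamps are hypothetical names introduced here for illustration; the description does not specify how events are represented or stored.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageEvent:
    """Hypothetical notification emitted by the storage system."""
    object_name: str
    event_type: str  # e.g. "allocate", "close", "recall", "migrate", "delete"
    day: int         # assumed day stamp, counted from some arbitrary origin


class KnowledgeBase:
    """Stand-in for the data management system: keeps per-object usage history."""

    def __init__(self):
        self._history = defaultdict(list)

    def record(self, event):
        self._history[event.object_name].append((event.day, event.event_type))

    def history(self, object_name):
        # Return events in chronological order without creating missing keys.
        return sorted(self._history.get(object_name, []))


# Assumed set of event types that should update the knowledge base.
TRIGGERING_EVENTS = {"allocate", "close", "recall", "migrate", "delete"}


def on_storage_event(kb, event):
    """Data analysis step: forward triggering events to the knowledge base."""
    if event.event_type in TRIGGERING_EVENTS:
        kb.record(event)
        return True
    return False


kb = KnowledgeBase()
events = [
    StorageEvent("PAYROLL.JAN", "allocate", 0),
    StorageEvent("PAYROLL.JAN", "read", 3),       # not a triggering event here
    StorageEvent("PAYROLL.JAN", "migrate", 30),
    StorageEvent("PAYROLL.JAN", "recall", 45),
]
accepted = [on_storage_event(kb, e) for e in events]
payroll_history = kb.history("PAYROLL.JAN")
```

A periodic collector could reuse the same record method by scanning the storage system and replaying what it finds as events.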
Moreover, after a data object 10 is assigned to a management class 24, it can be stored in the storage system 26 such that its usage characteristics can eventually become part of the knowledge base of historical usage characteristics 18. Thus, as more and more data objects are added to storage system 26, the knowledge base of historical usage characteristics 18 will grow, providing increased efficacy in assigning management classes to new data objects.
It should be recognized that any or all of the various functions described herein could be integrated within the facilities of the storage system 26. Such facilities, which perform any pertinent data management functions, e.g., allocate, close, recall, delete, migrate, etc., may trigger the appropriate update to usage characteristic information 18. Thus, usage characteristic information 18 could be initially populated with an initialization routine, and then be updated automatically any time a triggering event occurs.
Referring now to
After receiving the results 38, assignment logic 32 applies a set of rules to the returned information and selects a management class 42 from the set of management classes 34. Alternatively, class assignment system 14 could dynamically create a management class for the inputted data object 40. Moreover, in some instances, assignment logic 32 may simply assign a default management class if no results 38 were found in the knowledge base. In the example shown in
Each of the possible management classes 34 has an associated storage scheme (i.e., schemes a, b and c) to be used for managing the data object to which it is assigned. For example, a management class may dictate that the data object remain on DASD for six months, then be stored on DASD in a compressed form for two months, then be transferred to tape for one year, then be destroyed.
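As an illustration of how such a storage scheme might be represented, the sketch below encodes the example above as an ordered list of stages and looks up where an object of a given age should reside. The stage names, the conversion of months to days (six months as 180 days, two months as 60 days, one year as 365 days), and the function name are assumptions made for this example.

```python
# Hypothetical encoding of the example scheme: (medium, duration in days).
EXAMPLE_SCHEME = [
    ("dasd", 180),            # six months on DASD (assumed 30-day months)
    ("dasd_compressed", 60),  # two months compressed on DASD
    ("tape", 365),            # one year on tape
]


def stage_for_age(scheme, age_days):
    """Return the medium an object of the given age should occupy under the scheme."""
    elapsed = 0
    for medium, duration in scheme:
        elapsed += duration
        if age_days < elapsed:
            return medium
    # Past the final stage, the scheme calls for the object to be destroyed.
    return "destroyed"


stage_at_creation = stage_for_age(EXAMPLE_SCHEME, 0)
stage_at_200_days = stage_for_age(EXAMPLE_SCHEME, 200)
stage_at_two_years = stage_for_age(EXAMPLE_SCHEME, 730)
```

Keeping the scheme as plain data means a set of management classes can be a simple mapping from class name to scheme, which also makes it straightforward to build new classes as needed.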
In further illustrative embodiments involving, for instance, OS/390 or z/OS operating systems, data could be analyzed from System Management Facilities (SMF), which would provide a complete historical view regarding how data objects were used in the past. SMF collects and records system and job-related information that an installation can use for, e.g., billing users, reporting reliability, analyzing configurations, scheduling jobs, summarizing direct access volume activity, evaluating dataset activity, profiling system resource use, maintaining system security, etc. Data could also be analyzed from DFSMShsm, which would provide details of instances where data was recalled back from another medium, potentially signaling an anomaly in the current management class. DFSMShsm is a facility that automatically performs space management and availability management in a storage device hierarchy. DFSMShsm makes sure that space is available on a DASD volume so that one can extend old datasets and allocate new ones. DFSMShsm also makes sure that backup copies of datasets are always available in case working copies are lost or corrupted. Moreover, data could be analyzed from the data-using processes themselves, thus giving a rough idea regarding how frequently and in what manner the data object was created and used.
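The idea that recalls may signal a poorly fitting management class can be sketched as a simple rate check. The example below does not parse real SMF or DFSMShsm records; it assumes migration and recall activity has already been reduced to hypothetical (object name, management class) pairs, and the 0.5 threshold is an arbitrary illustrative value.

```python
from collections import Counter


def recall_rates(migrations, recalls):
    """Compute, per management class, the fraction of migrations later recalled.

    Both arguments are lists of (object_name, management_class) pairs; this
    simplified input shape is an assumption, not an actual record layout.
    """
    migrated = Counter(cls for _, cls in migrations)
    recalled = Counter(cls for _, cls in recalls)
    return {cls: recalled[cls] / count for cls, count in migrated.items()}


def flag_anomalies(rates, threshold=0.5):
    """Return classes whose recall rate exceeds the (assumed) threshold."""
    return sorted(cls for cls, rate in rates.items() if rate > threshold)


migrations = [
    ("PAYROLL.JAN", "A"), ("PAYROLL.FEB", "A"),
    ("PAYROLL.MAR", "A"), ("PAYROLL.APR", "A"),
    ("LOGS.JAN", "B"), ("LOGS.FEB", "B"),
]
recalls = [("PAYROLL.JAN", "A"), ("PAYROLL.FEB", "A"), ("PAYROLL.MAR", "A")]

rates = recall_rates(migrations, recalls)
suspect_classes = flag_anomalies(rates)
```

A class flagged this way would be a candidate for review, for example by lengthening the period its objects remain on DASD before migration.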
Further features of the invention may include the ability to suggest a management class to the data object, the ability to override the class assignment system 14, and the ability to build new management classes as needed.
Referring to
I/O interfaces 46 may comprise any system for exchanging information to/from an external resource. External devices/resources 48 may comprise any known type of external device, including speakers, a CRT, LED screen, hand-held device, keyboard, mouse, voice recognition system, speech output system, printer, monitor/display, facsimile, pager, etc. A bus 50 may be included to provide a communication link between each of the components in the computer system 40 and likewise may comprise any known type of transmission link, including electrical, optical, wireless, etc. Although not shown, additional components, such as cache memory, communication systems, system software, etc., may be incorporated into computer system 40.
Storage system 26 and the knowledge base of historical usage characteristics 18 may be embodied in any type of storage 52 (e.g., a relational database, etc.) and may include one or more storage devices, such as RAM, ROM, a magnetic disk drive and/or an optical disk drive. Data storage can also be distributed across, for example, a local area network (LAN), wide area network (WAN) or a storage area network (SAN) (not shown). Thus, storage system 26 and/or knowledge base of historical usage characteristics 18 could have some or all of their data stored remotely over a distributed network, thereby allowing for the pooling of resources and information.
Such a network 54 can be any type of network such as the Internet, a local area network (LAN), a wide area network (WAN), a virtual private network (VPN), etc. Communication could occur via a direct hardwired connection (e.g., serial port), or via an addressable connection that may utilize any combination of wireline and/or wireless transmission methods. Moreover, conventional network connectivity, such as Token Ring, Ethernet, WiFi or other conventional communications standards could be used. Still yet, connectivity could be provided by conventional TCP/IP sockets-based protocol. In this instance, an Internet service provider could be used to establish interconnectivity. Further, as indicated above, communication could occur in a client-server or server-server environment.
It should be appreciated that the teachings of the present invention could be offered as a business method on a subscription or fee basis. For example, a computer system 40 comprising management class processing system 12 could be created, maintained and/or deployed by a service provider that offers the functions described herein for customers. That is, a service provider could offer the service of generating management classes, e.g., as an application service provider.
It should also be understood that the present invention can be realized in hardware, software, a propagated signal, or any combination thereof. Any kind of computer/server system(s)—or other apparatus adapted for carrying out the methods described herein—is suited. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when loaded and executed, carries out the respective methods described herein. Alternatively, a specific use computer, containing specialized hardware for carrying out one or more of the functional tasks of the invention, could be utilized. The present invention can also be embedded in a computer program product or a propagated signal, which comprises all the respective features enabling the implementation of the methods described herein, and which—when loaded in a computer system—is able to carry out these methods. Computer program, propagated signal, software program, program, or software, in the present context mean any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form.
The foregoing description of the preferred embodiments of this invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously, many modifications and variations are possible. Such modifications and variations that may be apparent to a person skilled in the art are intended to be included within the scope of this invention as defined by the accompanying claims.
Claims
1. A management class processing system, comprising:
- a class assignment system for assigning a management class to an inputted data object, wherein the class assignment system identifies historical usage characteristics of existing data objects that have attributes similar to the inputted data object; and
- a data analysis system that analyzes existing data objects to generate a knowledge base of historical usage characteristics.
2. The management class processing system of claim 1, wherein the class assignment system includes assignment logic for selecting a management class from a set of management classes for the inputted data object.
3. The management class processing system of claim 1, wherein the management class dictates a storage scheme for the data object.
4. The management processing system of claim 1, further comprising a data management system for managing historical usage characteristics in the knowledge base of historical usage characteristics.
5. The management class processing system of claim 1, wherein the class assignment system includes a system for searching the knowledge base of historical usage characteristics.
6. The management processing system of claim 1, wherein the existing data objects analyzed by the data analysis system are stored in a storage system.
7. The management processing system of claim 6, wherein the inputted data object is stored in the storage system with an assigned management class.
8. The management processing system of claim 7, wherein the storage system is distributed over a network.
9. The management processing system of claim 7, wherein the knowledge base of historical usage characteristics is distributed over a network.
10. A program product stored on a recordable medium for processing management classes, comprising:
- program code for assigning a management class to an inputted data object by identifying historical usage characteristics of existing data objects that have attributes similar to the inputted data object; and
- program code that analyzes existing data objects to generate a knowledge base of historical usage characteristics.
11. The program product of claim 10, wherein the program code for assigning a management class to an inputted data object includes assignment logic for selecting a management class from a set of management classes.
12. The program product of claim 10, wherein the management class dictates a storage scheme for the data object.
13. The program product of claim 10, further comprising program code for managing historical usage characteristics in the knowledge base of historical usage characteristics.
14. The program product of claim 10, further comprising program code for searching the knowledge base of historical usage characteristics.
15. The program product of claim 10, wherein the knowledge base of historical usage characteristics is distributed over a network.
16. A method for assigning management classes, comprising:
- analyzing existing data objects in a storage system to determine historical usage characteristics;
- inputting a new data object having at least one attribute; and
- assigning the new data object to a management class by analyzing historical usage characteristics of similarly attributed existing data objects.
17. The method of claim 16, wherein the historical usage characteristics are stored in a knowledge base.
18. The method of claim 16, wherein the assigning step includes the step of selecting a management class from a set of management classes.
19. The method of claim 16, wherein the assigned management class dictates a storage scheme for the data object.
20. The method of claim 17, wherein the assigning step includes the step of searching the knowledge base of historical usage characteristics.
21. The method of claim 16, comprising the further step of storing the new data object in the storage system with an assigned management class.
22. A system for deploying an application for assigning management classes to data objects, comprising:
- a computer infrastructure being operable to: analyze existing data objects in a storage system to determine historical usage characteristics; input a new data object having at least one attribute; and assign the new data object to a management class by analyzing historical usage characteristics of similarly attributed existing data objects.
23. Computer software embodied in a propagated signal for assigning management classes to data objects, the computer software comprising instructions to cause a computer system to perform the following functions:
- analyze existing data objects in a storage system to determine historical usage characteristics;
- input a new data object having at least one attribute; and
- assign the new data object to a management class by analyzing historical usage characteristics of similarly attributed existing data objects.
Type: Application
Filed: Aug 10, 2004
Publication Date: Feb 16, 2006
Applicant: International Business Machines Corporation (Armonk, NY)
Inventors: Quyen Dao (Dillsburg, PA), William Reeves (Mechanicsburg, PA), Paul Snyder (Mechanicsburg, PA)
Application Number: 10/915,993
International Classification: G06F 17/00 (20060101);