SYSTEM AND METHOD FOR SERVER-TO-SERVER DATA STORAGE IN A NETWORK ENVIRONMENT

A system and method for storing data in a network computing environment. The network includes a source server that receives data to be stored from a client, and target servers that have locally attached physical storage media. A server-to-server protocol is used to establish a communication connection between the source server and a target server, and programming allows the data from the source server to be stored on the physical storage at the target server while also creating a virtual volume at the source server on which the data is also stored. From the perspective of the client, the data appears to be stored at the source server on locally attached storage media. The present invention eliminates the requirement for actual physical media locally attached to the source server.

Description
FIELD OF THE INVENTION

[0001] The present invention pertains to the field of data storage. More particularly, this invention pertains to a method and apparatus that allow the storage of a data set at a source server on a virtual volume, while facilitating the storage of the actual data from the data set at a physical volume at a target server using a server-to-server protocol.

BACKGROUND OF THE INVENTION

[0002] In a typical network computing environment, a hierarchy of servers often exists that are networked together. While this hierarchy of servers may be as few as two servers, it can also involve many servers. These servers are often located in physically unique locations. In a network, a server represents the application code (hosted on a server computer) that runs on an operating system, and a client (hosted on a client computer) represents the code run in any number of different applications that can run on many different operating systems. A client application can interface with the server to back up or archive data on the server machine.

[0003] In a network computing environment, copying the data is performed in order to protect the data files from corruption on the local client computer's hard drive, accidental deletion of a file, and other problems. A storage system between the servers on the network can back-up and store the data, and can also manage the data stored in the volumes. In standard networks having a number of levels of interconnected servers, the volumes are typically physical volumes, such as disk drives or tape drives, that are locally attached to each server. These locally attached storage devices must be maintained and managed at each unique location.

[0004] This traditional model of a storage system in a computer network is limited in that it is administratively burdensome to have operators at each server site to maintain the physical volumes. While the cost per megabyte of storing data in physical tape libraries is relatively low compared to other storage media, the maintenance and administration of tape libraries may be costly. Examples of administrative tasks include managing the inventory of tapes within the library (including removing full tapes from the library), cleaning drives, adding scratch tapes to the library, and other routine maintenance tasks. Reducing the number and locations of physical storage media would reduce both the complexity and cost of this administration and maintenance.

SUMMARY OF THE INVENTION

[0005] The present invention provides an improved data storage system between servers in a network that substantially eliminates or reduces disadvantages and problems associated with previously developed systems and methods used for network data storage.

[0006] In one embodiment, the present invention provides a system for storing data in a computing environment network using virtual volumes, network communications and a server-to-server protocol. The system includes source servers (or local client servers) that have data that needs to be stored. The network also includes target servers at a single physical location (though the target servers could be located at multiple sites) that have locally attached physical storage media. The system uses a server-to-server protocol layered on the network protocol to store the data from the source server on one or more storage volumes at the target server, while also creating a virtual volume for “storing” the data at the source server. The target server and its physical storage appear, due to the creation of a virtual volume for storing the data, in all respects to be a locally attached storage media from the perspective of a client storing data at the source server. The present invention eliminates the requirement for actual physical sequential media attached to the source server.

[0007] The present invention provides an important technical advantage by allowing the consolidation and/or sharing of data storage resources.

[0008] The present invention provides an important technical advantage by storing data from a source server at both a virtual volume at the source server and a physical volume at a target server so that to all outward appearances the data is stored in storage media locally attached to the source server.

[0009] The present invention provides another technical advantage by improving disaster recovery when a branch office source server and its locally attached physical storage volumes are destroyed or damaged because the backed up data is recoverable from a physical storage volume at the target server.

[0010] The present invention provides yet another technical advantage by reducing the amount of maintenance of physical storage media, such as tape libraries, at branch office locations.

[0011] The present invention provides another technical advantage by reducing the risk of lost or damaged data and storage devices and easing the maintenance requirements due to storing data at a central location.

[0012] The present invention provides the capability for administration of all real physical storage devices at a central location in a network while maintaining the functional characteristics, convenience, and capabilities of locally attached sequential storage media at each source server.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] For a more complete understanding of the present invention and the advantages thereof, reference is now made to the following description taken in conjunction with the accompanying drawings in which like reference numerals indicate like features and wherein:

[0014] FIG. 1 shows a network of interconnected nodes (or computers);

[0015] FIG. 2 shows a hierarchy of storage media;

[0016] FIG. 3 shows a storage network that utilizes locally attached physical storage media;

[0017] FIG. 4 shows a storage network utilizing local virtual storage media and a central physical storage facility according to the teaching of the present invention;

[0018] FIG. 5 shows an embodiment of the data storage system of the present invention; and

[0019] FIG. 6 shows an embodiment of the data storage method from a source server to a target server according to the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0020] In a network computing environment, there can be a hierarchy of servers. For example, in a typical corporate environment, there may be corporate level servers at the computing headquarters of the corporation and regional offices of the corporation that also have local servers. The hierarchy may extend further to branch offices underneath the regional offices that also have local servers. All of these servers are connected by means of a network. In such a network, there is usually a need to store data at each server site. The present invention provides a general scheme for taking data that needs to be stored at any of the local (or source) servers and storing the data on a target server's physical storage volume (for example, at the corporate computing headquarters), while also “storing” the data on a virtual volume at the source server. Thus, to all outward appearances, the data is stored at local storage attached to the source server, when in reality the actual data is stored on a physical volume remote from the source server. The present invention uses a server-to-server storage protocol layered onto the network protocol to implement the virtual volume(s) on the source server and to actually store the data on the target server. In one embodiment, the present invention is applicable to a network that includes an Adstar Distributed Storage Manager (ADSM) server, an IBM product, for managing the storage and back up of data in physical storage media. It should be understood that the present invention is applicable to any computing environment that interconnects servers and stores data between these interconnected servers.

[0021] FIG. 1 shows the interaction of a subset of nodes, or computers, on a network 10. Each node, at any point in time, is running either a server application or a client application depending on the particular operating system and how the network 10 is configured. The different nodes pass data back and forth to each other. In FIG. 1, storage node 19 is used to back up and archive data from the other 10 nodes, which are clients 22. As shown in FIG. 1, storage node 19 has two servers 11, 13 installed, with each server supporting five of the clients 22. The storage node 19 also has a tape library 40 attached to it. In some circumstances, the tape library 40 may only be able to be used by a single server, for example server 11, on the storage node 19. The present invention provides a means for the second server 13 to use tape library 40 by designating the second server 13 as a source server and the first server 11 as a target server. After the necessary configuration is completed between the source and target servers, as will be described more fully herein, the source server will be able to store data on the target server and thus utilize the tape library. In this way, the present invention provides a means both for consolidating resources (only one server has to be set up to directly access the tape library 40) and for sharing resources so that both server 11 and server 13 on the storage node 19 can access the tape library 40.

[0022] Each server has a storage hierarchy that includes different types of physical storage media that are categorized based on speed and cost. FIG. 2 shows a storage hierarchy 12 where the upper level 14 represents the fastest and the most expensive storage media, such as local disk storage and locally attached hard drives. The next storage level 16 generally represents a slightly slower and less expensive optical tape or tape drive. The lowest level 18 is the least expensive and slowest optical tape, such as that used in optical tape libraries containing sequential media.

[0023] Network storage management servers are used to store data and allow users to configure this storage hierarchy in any number of ways. The user can provide some parameters, such as resources including number and type of tape drives, space on a particular disk drive, etc. and the network storage management server will manage where the data is stored and how to most efficiently move data from one media to another in order to maintain enough space on various storage media. The data stored can include a client back-up of a locally attached hard drive, which can be an image of the entire hard drive, or alternatively, specific directories.

[0024] FIG. 3 shows an example of a “branch office model” network 100 having a hierarchy of servers including a corporate level of servers 32, a regional level of servers 34, and a local level of servers 36. Each level of servers may have multiple servers 20 (for example, corporate level of servers 32 is shown having three servers 20). A physical volume 26, for example a tape drive or a tape library, is shown locally attached to each individual server 20.

[0025] Each server 20 in FIG. 3 services some number of clients 30 (shown attached to one server 20 at the regional level 34 and the local level 36). Each server 20 may have different storage needs, and in order to have enough storage to service the storage environments of all of its clients, storage devices 26 are generally locally attached to each server 20. While locally attached storage devices 26, such as tape drives, can work well in that they are relatively inexpensive storage media and can store significant amounts of data, they are administratively cumbersome. In the storage environment of FIG. 3 having locally attached tape libraries 26 at each local and regional server 20, an administrator must manage each server's tape library at each server site. In the branch office network 100 of FIG. 3, the management of the data storage facilities requires resources and people with the expertise to actually manage all the tape drives in addition to keeping track of the tapes, the actual media, across all the different servers 20. This includes, for example, tasks ranging from basic maintenance, such as making sure the tapes are not damaged environmentally, to more sophisticated maintenance and tracking, such as determining when the tape libraries are becoming full.

[0026] The present invention uses virtual volumes and a server-to-server protocol to allow the various branch office servers to define various attributes, including device class, for a locally attached virtual volume. The server-to-server protocol of the present invention manages the transfer of information and data between servers in the network. This server-to-server protocol is another “layer” above the actual network protocol used to connect the servers in the computer network. In one embodiment, the server-to-server protocol is implemented using TCP/IP as the network protocol. However, because the server-to-server protocol is at the level of the application, rather than the network level, the present invention can implement the server-to-server protocol in networks other than those using TCP/IP.

[0027] FIG. 4 shows a branch office model network 200 that is similar to that of FIG. 3, but incorporates the storage server 80 and virtual volume storage to overcome limitations presented in the FIG. 3 network. The storage server 80 of the present invention includes a network communications manager 21 that controls communication to and from defined and available network interfaces, a meta-data storage manager 23 that controls the management and storage of metadata for server operations (including the storage of metadata describing client data) and a data storage manager 25 that controls the management and storage of actual data from clients defined to the storage server 80. The storage server 80 can also include other functional components 27 that provide standard storage server functionality.

[0028] As shown in FIG. 4, the network 200 once again includes corporate level 32, regional level 34, and local level 36 with servers 80 at each level. However, unlike FIG. 3, locally attached physical storage devices 26 (such as tape drives or tape libraries) are replaced with virtual volumes 50 locally attached to each server 80 at the regional and local levels. During operation, rather than storing data at the locally attached physical storage media 26 (as in FIG. 3), the data is actually stored in a tape library 40 at the corporate level 32, while the data appears to be stored locally using a virtual volume 50.

[0029] For the present invention, the attributes defined for the virtual volume can be the same attributes as would be defined for the previously locally attached physical storage devices such as the tape drives and tape libraries. In this definition process, instead of storing data at a locally attached physical storage volume, the present invention actually sends the data to be stored to another server (the target server). The virtual volumes 50 can have the same storage characteristics as the physical storage media 40.

[0030] With reference to FIG. 4, if a server 80 at the local level 36 needs to store data, the data would be sent through a regional server 80 at level 34 to corporate level 32 and physically stored in tape library 40. It should be understood that the data could also be stored at the regional level 34 in a tape library. One purpose of the invention is to physically store data in fewer locations while maintaining the appearance of local storage at each server. Using the present invention, instead of having numerous different branch offices having their own locally attached tape libraries, the data can be centralized at one or a few locations. However, even though the data is stored remotely at a central location, each branch office server logically acts as though it has a locally attached tape drive. The branch office servers may be going to regional office servers, which may or may not have locally attached tape drives or tape libraries, or the regional servers may also just be pointing to the corporate level servers and the locally attached tape library at the corporate level.

[0031] FIG. 5 illustrates a client 30 using the data storage method of the present invention to store data at a source server 82 at local level 36. The present invention sends the data through a regional server (not shown) to target server 86 at corporate level 32. It should be understood that at any target server receiving data from a source server, the target server may store the data on a physical storage media attached to that target server, on actual sequential media devices attached to that target server, or it may store the data in virtual volumes. In the latter case, the target server would then become a source server, and would send and physically store the data to yet another target server.
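By way of illustration only, the chaining described in paragraph [0031], in which a target server whose own volumes are virtual becomes a source server for the next hop, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the class `StorageServer`, its fields, and the method `store` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StorageServer:
    """Hypothetical model of a server that stores locally or forwards upstream."""
    name: str
    # If set, this server's volumes are virtual and data is forwarded to `target`.
    target: Optional["StorageServer"] = None
    # In-memory stand-in for physical media attached to this server.
    physical: Dict[str, bytes] = field(default_factory=dict)

    def store(self, key: str, data: bytes) -> str:
        """Store data and return the name of the server that physically holds it."""
        if self.target is None:
            # Physical volume attached here: the data comes to rest on this server.
            self.physical[key] = data
            return self.name
        # Virtual volume: this server acts as a source server for the next hop.
        return self.target.store(key, data)


# Three-level hierarchy loosely mirroring FIG. 4 (local -> regional -> corporate).
corporate = StorageServer("corporate")
regional = StorageServer("regional", target=corporate)
local = StorageServer("local", target=regional)

holder = local.store("client30/file1", b"payload")
print(holder)
```

In this sketch the client only ever calls `local.store`, so the location of the physical copy is hidden from it, which mirrors the appearance of locally attached storage described above.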

[0032] When client computer 30 at local level 36 requires the back-up of its local hard drive, the client application software 42 communicates that need to source server 82. Source server 82 has virtual volume allocation software program 43 that sets up a communication connection to target server 86 at corporate level 32. In order to store the data from the client 30, an allotment of storage space at the target server 86 must be made. The source server 82 will request an allotment of space from the target server 86 for a defined size of a virtual volume 50. The size of the virtual volume 50 for the present invention is an attribute of the device class defined on the source server 82 (the size of the requested allotment is based upon a configuration parameter on the source server 82 relating to the virtual volumes 50). The target server 86 will reply that the space does or does not exist, and if it does, the client 30 data gets stored at the target server 86. The target server 86 may or may not store the data directly to tape library 40 depending on the storage management policies at the target server 86. For example, the data may first be stored to disk, and then later migrated to tape on the tape library 40. It should be understood that when a client 30 connects to a source server 82, the steps of requesting space at the target server 86, responding that space is available at the target server 86, and the client 30 transferring data all happen on a file by file basis. Thus, the client 30 data storage occurs on a file by file basis.

[0033] At the same time, the data is “stored” on virtual volume 50. In order to retrieve data that was stored during this operation at a later time, the client application will contact the source server 82 to recover the files. The source server 82 will open a communication channel to target server 86 that will retrieve the files from their location in tape library 40 and send this data back to source server 82. In this way, the present invention provides storage at the source server 82 on a virtual volume 50 while the data is actually stored in tape library 40 at target server 86. Virtual volume 50 is a logical volume that appears to the client 30 as any other physical sequential storage media. The virtual volumes 50 are represented as file objects to the target server 86. The virtual volume 50 is created and maintained in the metadata stored on the source server 82. It should be understood that both the source and target servers can provide data storage services to other clients in the network using protocols other than the server-to-server virtual volume protocol of the present invention.

[0034] The present invention transfers the data from the source server 82 to the target server 86 over the network using a network protocol such as TCP/IP. It should be understood that other communication protocols can be used to create and store data in the source server virtual volumes as taught by the present invention. The present invention simply requires implementation using a communications protocol at the source and target servers that will allow a connection to be established between the source and target servers. When recovering the data from target server 86, the source server 82 will establish the connection to target server 86, and using a server-to-server protocol, will send a request defining the attributes of the data to be recovered. The attributes of the data are stored in a metadata file on the source and target servers. The metadata is selected information, such as which source server sent the data and what version of the data is currently being stored, that is used to track the actual file data at the target server. Thus, when the client 30 sends the data to be stored to source server 82, the client passes a “verb” that contains a number of attributes, or metadata, with the actual file data to be stored. A database of verbs or metadata is maintained at the source server 82 that tracks where the data was stored.

[0035] The following example further illustrates the metadata used according to the present invention. Client “A” backs up data to source storage server “SUPER”. Source storage server SUPER stores the data in virtual volumes, while physically storing the data on target storage server “FRED”. Target server FRED allows source server SUPER to contact it as “CLIENT_SUPER” for data storage purposes. Source server SUPER stores metadata that tracks which files are stored for client A. Source server SUPER also stores information about the virtual volume, which includes: (i) the virtual volume name; (ii) the layout of the actual client data in the virtual volume; and (iii) where the virtual volume is stored. Target server FRED stores metadata tracking which files are stored on behalf of CLIENT_SUPER. As a result, target server FRED knows that the data for CLIENT_SUPER represents virtual volume data; however, server FRED does not know what client data server SUPER stored in the virtual volume(s) at source server SUPER.
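By way of illustration only, the division of metadata between SUPER and FRED in the foregoing example can be sketched in Python as follows. The server and client names come from the example; the file name `report.txt`, the volume name `VV0001`, the (offset, length) layout, and the function `locate` are hypothetical and are introduced solely for this sketch.

```python
from typing import Dict, List, Tuple

# Metadata held by source server SUPER: which files are stored for each client.
SUPER_CLIENT_FILES: Dict[str, List[str]] = {"A": ["report.txt"]}

# Metadata held by SUPER about each virtual volume:
# (i) its name (the key), (ii) the layout of client data, (iii) where it is stored.
SUPER_VIRTUAL_VOLUMES: Dict[str, dict] = {
    "VV0001": {
        "layout": {"report.txt": (0, 2048)},  # file -> (offset, length), assumed form
        "stored_at": "FRED",
    },
}

# Metadata held by target server FRED: objects stored on behalf of each client.
# FRED sees only volume objects, not the client files inside them.
FRED_CLIENT_OBJECTS: Dict[str, List[str]] = {"CLIENT_SUPER": ["VV0001"]}


def locate(client: str, filename: str) -> Tuple[str, str, int, int]:
    """Resolve a client file to (target server, volume, offset, length) on SUPER."""
    if filename not in SUPER_CLIENT_FILES.get(client, []):
        raise KeyError(f"{filename} is not stored for client {client}")
    for volume, info in SUPER_VIRTUAL_VOLUMES.items():
        if filename in info["layout"]:
            offset, length = info["layout"][filename]
            return info["stored_at"], volume, offset, length
    raise KeyError(f"{filename} is not present in any virtual volume")


print(locate("A", "report.txt"))
```

Only SUPER can answer the `locate` query, because the mapping from client files to positions within a virtual volume never leaves the source server.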

[0036] FIG. 6 is a flow chart of one embodiment of the authentication and data transfer protocol 70 of the present invention at the point the data storage request has been received at the source server 82. The authentication and data transfer protocol is contained in software programming 44 contained in a computer readable medium at both the source and target servers. The authentication and data transfer protocol controls the interaction between the servers, manages the data transfer between the servers, and deletes the data when no longer needed.

[0037] At step 46, a write operation request is received at the source server 82 from the client 30 that requests the storage of some data on sequential storage media at the source server 82. The write operation request will contain the source server communication attributes and other virtual volume attributes to use to write the data successfully. In an ADSM storage management system, the write operation will include the device class and other ADSM-specific processing overhead for managing storage volumes. At step 48, from the device class information (or metadata information), the source server 82 will initiate the process of opening the sequential storage volume in order to store the data. At step 52, the source server 82 determines whether the locally attached storage medium is a virtual or a physical volume. If the storage medium locally attached to the source server 82 is a physical volume, the process flows to step 54, which represents a standard physical volume storage process (that can include the steps of mounting the local physical volume, writing the data to the physical media, and flushing the volume and forcing the flush of buffers to the physical media as necessary to store the data on the physical volume). If the storage volume locally attached to the source server 82 is a virtual volume, the present invention will, at step 56, open the communication connection between the source server 82 and the target server 86. Step 56 includes requesting a certain amount of storage space at the target server 86. If that amount of space does not exist at the target server 86, then the communication channel is not opened and it will appear to the client as if the source server did not have enough available space for the data.
At step 56, the source server 82 logically opens a virtual storage volume while at the network level a communication connection is being opened between the source server 82 and the target server 86 using the appropriate server-to-server protocol. If at step 56 the communication channel is successfully opened to the target server 86, the virtual volume at the source server 82 would also be successfully opened. After the connection has been established, a file object is created and opened at the target server 86 in order to store the actual data at step 58. This file object also includes the metadata or verb (for the virtual volume) that tracks where the files to be stored came from and what the data is, while at the same time allocating space on the physical storage 40 attached to the target server 86 in order to store the data. The metadata created will contain the appropriate pointers in the overhead to designate where the data is stored both on the virtual volume and the physical volume and correlates these so that the data can be recovered. The source server 82 then writes some portion of data to the created file object at the target server 86 at step 62. Depending on the size of the data files to be stored, the source server will periodically perform a flush volume at step 64 in order to ensure the data that has been written from the source server 82 and stored in buffers is written to the storage device. If a flush volume has not occurred, then the data will continue to be written to the same file object. If a flush volume is performed, then at step 66, the created file object is closed on the target server 86. By doing a flush volume, at the source server 82 the transaction is closed. In order to ensure the data gets written and the information needed to track the data is maintained in the overhead information, the current file object must be closed. 
If there is more data to process, at step 68, then another file object will be opened at step 58 from the sequence of files that represents the virtual volume. At the point that there is no more data to process (i.e., all of the data to be stored has been written from the source server 82 to the target server 86), then the close processing step 72 is performed. In the case of virtual volume storage according to the present invention, at the close sequential media volume step 72, the communication channel that was opened at step 56 will be closed.
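By way of illustration only, the virtual volume branch of FIG. 6 (steps 56 through 72) can be sketched in Python as follows. The sketch assumes that a flush volume occurs after a fixed number of writes, which is a simplification chosen here and not a rule stated in this description; the names `TargetServer`, `write_virtual_volume`, and the object naming scheme are likewise hypothetical.

```python
from typing import Dict, Iterable, List


class TargetServer:
    """Hypothetical in-memory stand-in for target server 86."""

    def __init__(self, free_space: int) -> None:
        self.free_space = free_space         # bytes available (assumed unit)
        self.objects: Dict[str, bytes] = {}  # closed file objects by name

    def has_space(self, size: int) -> bool:
        return size <= self.free_space

    def close_object(self, name: str, data: bytes) -> None:
        """Commit one file object to storage."""
        self.objects[name] = data
        self.free_space -= len(data)


def write_virtual_volume(target: TargetServer, volume: str,
                         chunks: Iterable[bytes], requested_space: int,
                         flush_every: int) -> List[str]:
    """Write chunks to the target as a sequence of file objects."""
    # Step 56: open the connection and request space at the target.
    if not target.has_space(requested_space):
        raise OSError("not enough space available for the virtual volume")
    names: List[str] = []
    buffer = bytearray()  # data written to the currently open file object
    writes_since_flush = 0
    for chunk in chunks:
        # Steps 58 and 62: with a file object open, write a portion of data.
        buffer.extend(chunk)
        writes_since_flush += 1
        # Step 64: flush volume at an (assumed) transaction boundary.
        if writes_since_flush == flush_every:
            # Step 66: close the current file object on the target.
            name = f"{volume}.{len(names) + 1:04d}"
            target.close_object(name, bytes(buffer))
            names.append(name)
            buffer.clear()
            writes_since_flush = 0
    if buffer:
        # Step 68 found no more data: close the last partially filled object.
        name = f"{volume}.{len(names) + 1:04d}"
        target.close_object(name, bytes(buffer))
        names.append(name)
    # Step 72: close the volume and, with it, the communication channel.
    return names


fig6_target = TargetServer(free_space=100)
fig6_names = write_virtual_volume(
    fig6_target, "VOL1", [b"aa", b"bb", b"cc", b"dd", b"ee"],
    requested_space=50, flush_every=2)
print(fig6_names)
```

The returned list of object names is the sequence of files that represents the virtual volume, so one virtual volume on the source corresponds to many file objects on the target.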

[0038] The present invention uses a server-to-server virtual volume command interface protocol within a storage server to accomplish the data transfer and storage as described herein. For certain storage management systems, for example ADSM, the present invention simply adds certain functionality to the existing application protocol. The server-to-server virtual volume command interface can be implemented as a software program that resides on the source and target servers. The server-to-server virtual volume command interface provides the interface to allow a user to define various attributes to enable the data storage at a virtual volume locally and at a physical volume at the target server. The server-to-server command interface can include a source server command interface, a target server command interface, an authentication and data transfer software program, a reconciliation software program, and a security and access control program. These software programs can reside at one or both of the source and target servers.

[0039] The source server command interface can be executed by the user to define the virtual volume device to the source server. This source server command interface allows the issuing of configuration commands, including a define server command that will initially create the communication attributes necessary to allow the source server to communicate with the target server. The source server command interface will then allow the user to define the virtual volume to include the size of the virtual volume, how many virtual volumes exist at the source server, as well as referring to the server connection definitions required to establish the communication connection. The source server command interface can also provide a virtual volume naming convention to allow for the validation of volume names for both user defined volume names and server generated names for volumes that are scratch allocated.

[0040] The target server command interface is used to define the source server as a special type of network client to the target server. The target server command interface allows the user to designate the number and names of any source servers allowed to use a particular server as a target server. The source server will be registered by the target server as a client when the source server contacts the target server to store the data. The target server command interface also establishes the storage space at the target server where file data from the source server(s) should be stored, including the specification of a physical volume storage pool where the data should reside and the type of device(s) the data will reside upon. Furthermore, the definition of the data storage space can provide storage space for specific archive type data from the specialized network (ADSM) client.

[0041] The authentication and data transfer software program is used to establish the communications connection between the source server and the target server. This authentication and data transfer software program provides an interface to the specific network communications method that may be used to transfer the data between servers and the data transfer definitions for sending and receiving data between the two servers. For the source server, authentication and data transfer software program provides for the storing of the data to a virtual volume which has the characteristics and behavior of a sequential storage device to the source server, while sending the data to the target server in a one to many file format. Specifically, as shown in FIG. 6, a file object is logically opened at the beginning of a data transfer operation between the servers, and when the source server reaches a transaction processing boundary, the current file object is closed and the next file object is opened. In this way, the data is aggregated on the source server into files for transmission to the receiving target server using the server-to-server protocol. For the target server, the authentication and data transfer software program stores the data in the server's storage as a file or files on the target server. The deletion of these files is controlled by the source server.

[0042] The reconciliation software program synchronizes the virtual volume definitions on the source server with the actual data storage location on the target server. The reconciliation software program provides a record of where on the source virtual volume the data is represented as being stored for use by the client application. This record is correlated to the actual location of the data on a physical storage media at the target server. When the client application attempts to retrieve the stored data from the virtual volume, the source server uses this record to find the actual data stored at the target server in order to retrieve the data.
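By way of illustration only, one plausible form of the reconciliation of paragraph [0042] is a comparison of the volume names recorded on the source server against the file objects actually held on the target server. The description does not specify the reconciliation algorithm, so the set comparison below, the function name `reconcile`, and the result keys are assumptions made for this sketch.

```python
from typing import Dict, Iterable, List


def reconcile(source_volumes: Iterable[str],
              target_objects: Iterable[str]) -> Dict[str, List[str]]:
    """Compare source-side volume records with target-side stored objects."""
    source_names = set(source_volumes)
    target_names = set(target_objects)
    return {
        # Recorded on the source, but no data exists on the target.
        "missing_on_target": sorted(source_names - target_names),
        # Held on the target, but no longer referenced by the source.
        "orphaned_on_target": sorted(target_names - source_names),
    }


report = reconcile(["VV0001", "VV0002"], ["VV0002", "VV0003"])
print(report)
```

Because deletion of the stored files is controlled by the source server, entries reported as orphaned would be candidates for deletion at the request of the source, while missing entries indicate records that can no longer be satisfied.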

[0043] A two-level security and access control negotiation program can be used to control access using both an access verification key and password authentication. An access verification key identifies the source server to the target server in order to control the transfer of data between the source server and the target server. The verification key is managed by the source and target servers. If the source server does not report a valid verification key to the target server, the source server is not allowed to store data on the target server. In addition, a password authentication step provides administrative control of the data for server administrators by allowing the setting of passwords between the source server and target servers.
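
The two-level check can be illustrated with the following minimal Python sketch. The names SourceRegistration and may_store are hypothetical, and holding the password as plain bytes is a simplification of the sketch; the description does not specify how keys or passwords are represented or exchanged.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRegistration:
    """What the target server holds for one source server (hypothetical layout)."""
    verification_key: bytes
    password: bytes  # simplification: a real system would not keep this in the clear


def may_store(
    registered: SourceRegistration,
    presented_key: bytes,
    presented_password: bytes,
) -> bool:
    """Allow storage only if both the verification key and the password match."""
    # compare_digest compares in constant time to avoid leaking match length.
    key_ok = hmac.compare_digest(registered.verification_key, presented_key)
    password_ok = hmac.compare_digest(registered.password, presented_password)
    return key_ok and password_ok
```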

[0044] In one embodiment, the present invention can be utilized in conjunction with an ADSM system having an ADSM storage server. The ADSM server provides backup, archive, and space management services to ADSM clients or the ADSM client API in a distributed computing environment. The ADSM storage server allows the user to define the server by specifying a name for the definition that references a set of attributes for the source server. In a TCP/IP scheme, these attributes include a high-level qualifier (IP address or host name) and a low-level qualifier (port). The definition also includes an alias that will be used by the server to communicate with the ADSM server. The user can also supply a password. The source server will contain an ADSM database of metadata, which includes information about clients and the data known to this server. The ADSM database can also store administrative information for the ADSM server (such as administrators allowed access, procedures for managing data, and storage devices that are locally attached). The define server command writes this information about the server into one of the ADSM database tables. This step of defining the server simply creates a set of attributes that will be used during the data storage function (these attributes allow the source server to establish connectivity to the target server at some point in the future).
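
For illustration, the attributes written by the define server step could be modeled as below. This is a sketch under stated assumptions: ServerDefinition, define_server, the field names, and the in-memory dictionary standing in for an ADSM database table are all hypothetical and do not reproduce the ADSM command syntax.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ServerDefinition:
    """Attributes later used to reach the target server (hypothetical names)."""
    name: str
    high_level_address: str  # IP address or host name
    low_level_address: int   # TCP port
    password: str = ""


def define_server(table: Dict[str, ServerDefinition], definition: ServerDefinition) -> None:
    """Record the definition; no connection is opened at this point."""
    if not 0 < definition.low_level_address < 65536:
        raise ValueError("port must be between 1 and 65535")
    if definition.name in table:
        raise ValueError(f"server {definition.name!r} is already defined")
    table[definition.name] = definition
```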

[0045] The ADSM server then allows the user to define the virtual device class. For the source server to be able to use a storage device on the ADSM network system, the device class of the storage device must be defined. The device class will indicate the type of storage device and potentially other attributes associated with that storage media (such as how many drives it has, the size, and other defining attributes). In one embodiment, the device class for the source server would be a “server” to distinguish it from a tape drive or other storage media. The “server” virtual storage media can appear as one big storage media, where the underlying physical storage media may be a plurality of physical disks supporting the virtual volume. The size of the virtual volume is actually constrained by the size of the available storage at the target server. Thus, the user may define any number of virtual devices, and can make those virtual volumes any size they want.

[0046] The user can now define the ADSM storage pool, where a storage pool is a collection of identical device class storage devices. When storing data, a storage pool can be used to allow a larger amount of data to be stored. The storage pool can be comprised of any device class; it may be a device class that is a virtual volume or a device class that is a physical volume. When using the present invention for client back-up of data, the user can define a storage pool on the source server which is basically a set of constructs in the source server that define which client is allowed to store data in the storage pool. For the present invention, a storage pool is created at the source server having a device class that is a virtual volume device class. The creation of a storage pool allows the ADSM server to manage the stored data and to implement a storage hierarchy between different storage pools. Various attributes are associated with each particular storage pool that allow a user to manage which storage pool collects each set of data (e.g., a storage pool may have a size attribute that requires any file larger than a certain size to be forced down to the next storage pool regardless of whether there is adequate space in the first pool).
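
The size attribute example above can be sketched as follows, purely by way of illustration. StoragePool, select_pool, and the field names are assumptions of the sketch; the actual pool attributes and selection rules of an ADSM server are not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoragePool:
    """One pool in a storage hierarchy (hypothetical attribute names)."""
    name: str
    device_class: str
    capacity: int
    used: int = 0
    max_file_size: Optional[int] = None        # larger files go to the next pool
    next_pool: Optional["StoragePool"] = None  # next pool in the hierarchy


def select_pool(first: StoragePool, file_size: int) -> StoragePool:
    """Walk the hierarchy and return the first pool that accepts the file."""
    pool: Optional[StoragePool] = first
    while pool is not None:
        # The size attribute applies even when the pool has adequate space.
        too_large = pool.max_file_size is not None and file_size > pool.max_file_size
        has_space = pool.capacity - pool.used >= file_size
        if not too_large and has_space:
            return pool
        pool = pool.next_pool
    raise RuntimeError("no storage pool in the hierarchy can hold the file")
```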

[0047] In an alternative embodiment that is independent of a client application, a storage pool may not be required. For example, to accomplish database backups, after the ADSM server performs define server and define device class operations, the database can simply be backed up in the specified device class. The method described in FIG. 6 is then used to store the database without use of a storage pool. Thus, for storage that is not related to client data back-up, the present invention does not require the establishing of a storage pool.

[0048] In the ADSM storage network embodiment, actual data storage begins as described earlier in FIG. 6. A client connects to and begins writing data to the source server. The source server will make a determination to write that data to a storage pool of a particular device class. If that disk storage pool fills up, the source server must start moving data in order to free up space in the disk pool. The source server may then start writing client data to an alternative storage pool (which may be a different device class) while flushing the disk space in the original disk pool. At the point the source server starts writing to the device class, the present invention queries the device class and, when using the present invention, the device class is a virtual volume. For a virtual volume, the source server reads the communication attributes previously established and establishes the communication connection to the target server using the ADSM protocol. After the connection is established using the ADSM protocol, a file object is opened. This step basically consists of a query stating that a certain volume of data needs to be sent and asking whether the target volume has space to store that data.
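
A minimal sketch of this write path is given below, with an in-memory object standing in for the target server. The names TargetServer, open_file_object, and store are hypothetical, the dictionary lookup merely stands in for establishing the connection, and the real ADSM protocol exchange is not reproduced.

```python
from typing import Dict


class TargetServer:
    """In-memory stand-in for a target server (hypothetical)."""

    def __init__(self, free_bytes: int) -> None:
        self.free_bytes = free_bytes
        self.files: Dict[str, bytearray] = {}

    def open_file_object(self, name: str, size: int) -> bool:
        """Answer the space query; reserve space and open the file if it fits."""
        if size > self.free_bytes:
            return False
        self.free_bytes -= size
        self.files[name] = bytearray()
        return True

    def write(self, name: str, data: bytes) -> None:
        self.files[name].extend(data)


def store(
    device_class: str,
    server_name: str,
    network: Dict[str, TargetServer],
    file_name: str,
    data: bytes,
) -> str:
    """Return where the data went: 'local', 'remote', or 'no-space'."""
    if device_class != "server":
        return "local"  # not a virtual volume; local storage is not modeled here
    target = network[server_name]  # stands in for establishing the connection
    if not target.open_file_object(file_name, len(data)):
        return "no-space"
    target.write(file_name, data)
    return "remote"
```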

[0049] If space is available at the target, the present invention moves to the data storage stages at both the target and source servers. The source server uses the reconciliation software to track where the data has been put on the virtual volume and sends the data to the target servers. Two distinct types of reconciliation occur using the present invention which can be performed by the server reconciliation software. Initially, during data storage from the source server to the target server, the source server is tracking 1) where the data is placed based upon the source server metadata that is maintained for the storage operation and 2) the success of the write requests to the target server. This metadata maintains the mapping of where the data is stored based upon how data storage is assigned and tracked on the source server as well as the naming conventions used to create the file objects on the target server on behalf of the source server. The reconciliation software can also include a specific reconciliation algorithm that is executed on the source server to reconcile the source server's metadata for virtual volumes with the actual files stored on the target server. This reconciliation algorithm processing verifies that the data files are stored on the target server, while also verifying that the metadata attributes for the virtual volume on the source are viable and complete.
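
The second type of reconciliation can be sketched, under stated assumptions, as a comparison between the source server's metadata and a listing of the files held by the target server. The function reconcile, the ReconcileReport fields, and the use of file sizes as the verified attribute are assumptions of this sketch rather than the algorithm actually used.

```python
from typing import Dict, List, NamedTuple


class ReconcileReport(NamedTuple):
    missing_on_target: List[str]  # in source metadata, absent at the target
    unknown_on_target: List[str]  # at the target, absent from source metadata
    size_mismatch: List[str]      # present on both sides with different sizes


def reconcile(
    source_metadata: Dict[str, int],
    target_files: Dict[str, int],
) -> ReconcileReport:
    """Compare file name -> size maps from the source and the target."""
    source_names = set(source_metadata)
    target_names = set(target_files)
    mismatched = sorted(
        name
        for name in source_names & target_names
        if source_metadata[name] != target_files[name]
    )
    return ReconcileReport(
        missing_on_target=sorted(source_names - target_names),
        unknown_on_target=sorted(target_names - source_names),
        size_mismatch=mismatched,
    )
```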

[0050] The target server then performs a write operation to a device class. If the device class to which the target server will write is a virtual volume, then the data storage will follow steps 56 through 72 of FIG. 6 as described above. If the device class to which the target server will write is a physical volume, the data storage follows a sequence as described in step 54 of FIG. 6. For physical volume storage, the target server will determine the type of device and then store the data on the physical storage device while tracking where the data has been stored physically in the media.
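
Because a target server may itself write to a virtual volume, a write can pass through a chain of servers before reaching physical storage. The following sketch illustrates that dispatch only; the Server class and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Server:
    """A server whose device class is either virtual or physical (hypothetical)."""
    name: str
    next_server: Optional["Server"] = None  # set when the device class is a virtual volume
    physical: Dict[str, bytes] = field(default_factory=dict)

    def write(self, file_name: str, data: bytes) -> str:
        """Store the data and return the name of the server that holds it physically."""
        if self.next_server is not None:
            # Virtual volume: act as a source server toward the next target.
            return self.next_server.write(file_name, data)
        # Physical volume: store the data on locally attached media.
        self.physical[file_name] = data
        return self.name
```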

[0051] The reconciliation program tracks the correlation between the source server data location and the target server data location on request of a user at the source server. In one embodiment of the present invention, the data is aggregated at the source server and sent to the target server as one object. The tracking of the individual file locations is done on the virtual volume at the source server. Thus, the target server will receive one file from the source server that may represent a number of files from the client. In order to retrieve one of the client files, the client will request that file from the source server. The source server will then access a program that will retrieve portions of the object that was sent to the target server. The retrieve program will identify from the position of the requested file on the virtual volume the position and number of bytes to retrieve from the target server and make that request. The source server will establish a communication connection and perform a set of queries to correlate the position of the data on the virtual volume to the position of the data on the target volume. Thus, the present invention will relate positions and files between the source and target servers. Essentially, the virtual volume from the source server is represented on the target server as some number of files that aggregate the data.
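
The aggregation and partial retrieval described above can be illustrated as follows. The functions aggregate and retrieve and the (offset, length) index layout are assumptions made for this sketch, and the aggregated object is held in memory rather than fetched from a target server.

```python
from typing import Dict, Iterable, List, Tuple


def aggregate(
    files: Iterable[Tuple[str, bytes]],
) -> Tuple[bytes, Dict[str, Tuple[int, int]]]:
    """Join client files into one object and record each file's position."""
    index: Dict[str, Tuple[int, int]] = {}
    parts: List[bytes] = []
    offset = 0
    for name, data in files:
        index[name] = (offset, len(data))  # (position, number of bytes)
        parts.append(data)
        offset += len(data)
    return b"".join(parts), index


def retrieve(
    aggregated: bytes,
    index: Dict[str, Tuple[int, int]],
    name: str,
) -> bytes:
    """Fetch only the bytes of one client file from the aggregated object."""
    offset, length = index[name]
    return aggregated[offset:offset + length]
```

In this sketch the index plays the role of the tracking kept on the virtual volume at the source server, while the aggregated bytes play the role of the single object received by the target server.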

[0052] In summary, the present invention provides a data storage system and method for use in a network to improve the management of data storage. The source server can act as a specialized network client with respect to the target server. The data from a client is stored in a virtual volume on the source server, and in a one to many file format at the target server. The data from the source server may be stored by the target server on any physical storage device that the target server supports. During operation, a client application writes data to the source server as if the source server will store the data on locally attached sequential media, but the source server simply maps the data to make it look and feel like sequential media to the client application, while actually writing the data to the target server over the network. The present invention collects the data at the source server, surrounds it with identifiers, and sends it to the target server. It should be understood that the source server could send the data through several target servers before the data is actually stored in physical storage media.

[0053] Although the present invention has been described in detail, it should be understood that various changes, substitutions and alterations can be made hereto without departing from the spirit and scope of the invention as described by the appended claims.

Claims

1. A system for storing data in a computer network, comprising:

a target server stored on a computer readable medium at a target server computer;
a source server stored on a computer readable medium at a source server computer; and
a means for representing a virtual volume as a sequential storage media using a set of metadata stored on the source server computer,
wherein the source server computer executes the source server to:
open a communication channel between the source server and the target server using a server-to-server protocol;
transfer the data from the source server to the target server; and
represent the data as being stored on the virtual volume.

2. The system of claim 1, wherein the source server computer further executes the source server to:
(a) create and open a file object at the target server;
(b) write a portion of data from the data to the file object;
(c) close the file object upon the occurrence of a transaction processing boundary; and
repeat (a)-(c) as necessary until the data has been entirely transferred in a one to many file format.

3. The system of claim 2, wherein the source server computer further executes the source server to:
use communication protocol specific attributes to establish the communication connection between the source server and the target server;
define the attributes used to manage the data as the data is sent to the target server;
establish the virtual volume characteristics; and
manage how the source server will use the virtual volume to store the data.

4. The system of claim 2, wherein the data is stored on a physical storage volume locally supported by the target server.

5. The system of claim 2, wherein the source server and the target server each further comprises:
a network communications manager that controls communication to and from defined and available network interfaces;
a meta-data storage manager that controls the management and storage of metadata for server operations; and
a data storage manager that controls the management and storage of the data.

6. The system of claim 2, wherein the target server and the source server reside on a single computer.

7. A storage server contained on a computer readable medium and executable by a processor in a computer, for use in storing data in a computer network, comprising:

a network communications manager operable to control communication to and from a set of defined network interfaces;
a metadata storage manager operable to control the management and storage of a set of metadata for the storage server; and
a data storage manager operable to control the management and storage of data from a client.

8. The storage server of claim 7, wherein the metadata describes the data from the client.

9. The storage server of claim 7, wherein the storage server is further operable to:
open a communication channel to a second storage server using a server-to-server protocol;
transfer the data to the second storage server;
represent the data as being stored on a virtual volume that is created and maintained in the metadata stored on the storage server;
use communication protocol specific attributes to establish the communication connection between the storage server and the second storage server;
define a set of attributes used to manage the data as the data is sent to the second storage server;
establish the virtual volume characteristics; and
manage how the storage server will use the virtual volume to store the data.

10. The storage server of claim 9, wherein the storage server is further operable to:
(a) create and open a file object at the second storage server;
(b) write a portion of data from the data to the file object;
(c) close the file object upon the occurrence of a transaction processing boundary; and
repeat (a)-(c) as necessary until all of the data has been transferred in a one to many file format.

11. A system for storing data in a computer network, comprising:

a target server computer having a target server;
a source server computer having a source server;
a means for representing a virtual volume as a sequential storage media within a set of metadata stored on the source server computer;
a server application stored on the source server computer in a computer usable medium and on the target server computer in a computer usable medium;
wherein the source server computer executes the server application to:
open a communication channel between the source server and the target server using a server-to-server protocol;
transfer the data from the source server to the target server; and
represent the data as being stored on a virtual volume locally attached to the source server.

12. The system of claim 11, wherein the source server computer further executes the server application to:
(a) create and open a file object at the target server;
(b) write a portion of data from the data to the file object;
(c) close the file object upon the occurrence of a transaction processing boundary; and
repeat (a)-(c) as necessary until all the data has been transferred in a one to many file format.

13. The system of claim 12, wherein the source server computer further executes the server application to:
use communication protocol specific attributes to establish the communication connection between the source server and the target server;
define the attributes used to manage the data as the data is sent to the target server;
establish the virtual volume characteristics; and
manage how the source server will use the virtual volume to store the data.

14. The system of claim 12, wherein the data is stored on a physical storage volume locally supported by the target server.

15. The system of claim 12, wherein the source server and the target server each further comprises:
a network communications manager that controls communication to and from defined and available network interfaces;
a meta-data storage manager that controls the management and storage of metadata for server operations; and
a data storage manager that controls the management and storage of the data.

16. A method for managing data storage interaction between a source server and a target server in a computer network, comprising:

receiving a write operation request at the source server from a client to request the storage of a data file on sequential storage media at the source server;
opening a communication connection between the source server and the target server using a server-to-server protocol;
opening a sequential virtual storage volume at the source server;
creating a file object at the target server that includes a virtual volume verb;
opening the file object on the target server for storing the data file; and
writing data from the data file to the created file object at the target server to store the data file.

17. The method of claim 16, further comprising:
(a) storing a portion of the data file in at least one buffer at the target server;
(b) performing a flush volume of the at least one buffer;
(c) closing the created file object on the target server;
(d) if there is more data in the data file to process, opening another file object at the target server;
(e) repeating steps (a) through (d) as necessary in order to store all data from the data file at the target server; and
closing the communication channel between the source server and target server.

18. The method of claim 16, wherein the write operation request includes a set of metadata information for the data file.

19. The method of claim 16, further comprising storing the data file on a physical storage volume locally attached to the target server.

20. The method of claim 16, further comprising embedding a set of data file characteristics in the virtual volume verb, including a set of pointers to correlate a location where the data file is stored on the virtual volume to a location where the data file is stored at the target server.

21. The method of claim 16, further comprising:
defining a set of server-to-server protocol specific attributes used to establish a communication connection between the source server and the target server;
defining the virtual volume at the source server when establishing a communication connection to the target server, the virtual volume having a set of characteristics emulating a physical storage device at the source server;
defining the source server as a client to the target server;
establishing the communication connection and transferring the data file between the source server and the target server; and
synchronizing the location of the data file on the virtual volume to the location of the data file on the target server.

22. The method of claim 16, further comprising:
defining the virtual volume within a device class to include a set of virtual characteristics for the virtual volume and to provide a naming convention for the virtual volume; and
identifying a set of source servers operable to store data at the target server and designation of a storage pool at the target server for storing data from the source server.

23. The method of claim 16, further comprising:
establishing a set of data transfer definitions for sending and receiving data between the source server and the target server;
storing the data file to the virtual volume;
facilitating the transfer of the data file to the target server, further comprising:
(a) creating and opening a file object at the target server;
(b) writing a portion of data from the data file to the file object;
(c) closing the file object upon the occurrence of a transaction processing boundary; and
repeating (a)-(c) as necessary until the entire data file has been transferred in a one to many file format; and
storing the data file at the target server; and
defining a set of deletion attributes at the source server to control deleting the data file at the target server.

24. The method of claim 16, wherein the source server and the target server each further comprises:
a network communications manager that controls communication to and from defined and available network interfaces;
a meta-data storage manager that controls the management and storage of metadata for server operations; and
a data storage manager that controls the management and storage of the data.

25. In a computing network environment, a system for storing data, comprising:

a source server having data that needs to be stored;
a target server in communication with a target storage media;
a virtual volume created and maintained in a set of metadata stored on the source server; and
a command interface operable to configure the source server to enable a transfer of data from the source server to the target server over the network using a server-to-server protocol while also representing the data as stored at a location on the virtual volume.

26. The system of claim 25, wherein the command interface allows a user to:
define the virtual volume at the source server;
define a set of communications attributes to establish a communication connection between the source server and target server; and
define a storage pool for storing the data.

27. The system of claim 26, wherein the command interface is stored on a computer-readable medium and further comprises:
a source server command interface that resides on the source server and is operable to:
define a set of server-to-server protocol specific attributes used to establish a communication connection between the source server and the target server; and
define the virtual volume at the source server when establishing a communication connection to the target server, the virtual volume having a set of characteristics emulating a physical storage device at the source server;
a target server command interface operable to define the source server as a client to the target server;
an authentication and data transfer software program for establishing the communication connection and transferring the data between the source server and the target server; and
a reconciliation software program to synchronize the location of the data on the virtual volume to the location of the data on the target server.

28. The system of claim 27, wherein the source server command interface is further operable to define the virtual volume within a device class to include a set of virtual characteristics for the virtual volume and to provide a naming convention for the virtual volume.

29. The system of claim 27, wherein the target server command interface is further operable to identify a set of source servers operable to store data at the target server and designation of a storage pool at the target server for storing data from the source server.

30. The system of claim 27, wherein the authentication and data transfer software program is further operable to interface with the server-to-server protocol required to communicate between the source server and target server and further to establish a set of data transfer definitions for sending and receiving data between the source server and the target server.

31. The system of claim 30, wherein the authentication and data transfer software program is further operable to:
store the data to the virtual volume; and
facilitate the transfer of the data to the target server, further comprising:
(a) creating and opening a file object at the target server;
(b) writing a portion of data from the data to the file object;
(c) closing the file object upon the occurrence of a transaction processing boundary; and
repeating (a)-(c) as necessary until all of the data has been transferred in a one to many file format.

32. The system of claim 31, wherein the authentication and data transfer software program is further operable to:
store the data at the target server as archive files; and
define a set of deletion attributes at the source server to control deleting the archive files.

33. The system of claim 26, wherein the data is stored on a physical storage volume locally supported by the target server.

34. The system of claim 25, wherein the source server and the target server each further comprises:
a network communications manager that controls communication to and from defined and available network interfaces;
a meta-data storage manager that controls the management and storage of metadata for server operations; and
a data storage manager that controls the management and storage of the data.
Patent History
Publication number: 20010013059
Type: Application
Filed: Feb 2, 1999
Publication Date: Aug 9, 2001
Inventors: COLIN SCOTT DAWSON (TUCSON, AZ), BARRY FRUCHTMAN (TUCSON, AZ), HARRY CLAYTON HUSFELT (TUCSON, AZ), MICHAEL ALLEN KACZMARSKI (TUCSON, AZ), DON PAUL WARREN JR. (TUCSON, AZ)
Application Number: 09247576
Classifications
Current U.S. Class: Remote Data Accessing (709/217)
International Classification: G06F015/16;