SYSTEM AND METHOD FOR STORAGE OPERATION ACCESS SECURITY
A method and system for controlling access to stored data is provided. The storage access control system leverages a preexisting security infrastructure of a system to inform the proper access control that should be applied to data stored outside of its original location, such as a data backup. The storage access control system may place similar access control restrictions on the backup files that existed on the original files. In this way, the backed up data is given similar protection as that of the original data.
CROSS-REFERENCE TO RELATED APPLICATIONS
The present application is a continuation of U.S. application Ser. No. 11/694,784 (Attorney Docket No. 60692-8042.US00) entitled “SYSTEM AND METHOD FOR STORAGE OPERATION ACCESS SECURITY” and filed on Mar. 30, 2007, which claims priority to U.S. Provisional Application No. 60/852,584 (Attorney Docket No. 60692-8047.US00) entitled “METHOD AND SYSTEM FOR COLLABORATIVE SEARCHING,” and filed on Oct. 17, 2006, each of which is hereby incorporated by reference.
Traditional security systems operate on the principle of limiting access to data. Each user of the system is generally identified with a user name, and access rights are assigned to each user. For example, users may be permitted or prevented from accessing certain files or adding new hardware to a computer system. Users may also be assigned to groups where each member of the group is given common access rights. Often a great amount of administrative effort has been put into creating users and groups and assigning them appropriate access rights in a traditional computer security system. For example, Microsoft Windows provides Active Directory for creating users and groups and assigning access to resources throughout a computer network. File systems also often provide access control. For example, the NT File System (NTFS) provides folder and file access based on user and group identifiers and the type of access requested such as read, write, execute, and other operations. An organization may have an extensive scheme of groups and access rights. For example, there may be a group of accounting department users that have different rights than engineering department users. The organization may also have identified certain users as administrators that have additional rights to administer the system.
Computer systems contain large amounts of personal data, such as financial data, names, addresses, telephone numbers, bank account information, photographs and much more. Corporate computer systems often contain confidential information, such as trade secrets, manufacturing processes, business strategy, and so on. With the increased reliance on computer systems to store critical information, the importance of protecting this data against loss has grown. For example, traditional storage management systems receive an identification of a file location of an original file and then create one or more secondary copies, such as backup files, containing the contents of the original file. These secondary copies can then later be used to restore the original data should anything happen to the original data. Secondary copies of data are often stored in a publicly accessible location for quick restoration of data in the event of a disaster or other data loss event. For example, backup files may be stored on a widely accessible server, and tapes and other media used for storing backup files may be physically accessible to many users.
Backed up data may contain sensitive information that is more widely accessible than the original data. Backing up data often removes the data from the well-planned security environment in which it was originally stored. Even though a system administrator may have gone to great lengths to properly limit access to data throughout a network, once the data is stored as one or more secondary copies it is often more accessible than originally intended. For example, the CEO of a company may have many sensitive files on a computer system that only he can access, but if that computer system is backed up, then the backup files may allow unauthorized users to have access to data that they would not normally be able to access. In addition, some systems provide searches based on backup data in which the backup data is indexed. Indexed content does not have the protections imposed on the original files.
There is a need for a system that overcomes the above problems, as well as providing additional benefits.
BRIEF DESCRIPTION OF THE DRAWINGS
In the drawings, the same reference numbers and acronyms identify elements or acts with the same or similar functionality for ease of understanding and convenience. To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the Figure number in which that element is first introduced (e.g., element 1104 is first introduced and discussed with respect to
The headings provided herein are for convenience only and do not necessarily affect the scope or meaning of the claimed invention.
A method and system for controlling access to stored data described below leverages a preexisting security infrastructure to inform proper access control that should be applied to data stored outside of its original location, such as a data backup. In one embodiment, the storage access control system receives a request to perform a storage operation that makes data at a source location available at a destination location. For example, the request may indicate that data stored on one computer should be copied and stored on a second computer. A storage operation may include many types of operations such as backup, migration, replication, snapshot, hierarchical storage management (HSM), and so on. For example, the storage operation may be a request to make a snapshot copy of data at the source location. The source location may contain electronic information such as file system data objects, application data objects, or other types of storage data objects. Upon receiving the request, the storage access control system queries the source or other location for access control information. For example, if the data includes one or more files, then the storage access control system may examine the file system to determine what access control scheme is currently in place for the data. The file system may contain access information that identifies the users and groups that have access to the data. One manner in which the access information may be associated with the data is by storing the access information along with the file. Then, the storage access control system applies the access control information to the data stored at the destination location. For example, the storage access control system may associate the access control information with the data stored at the destination location in a different manner, such as by storing metadata describing the access control information in a content indexing system. 
In the example of backing up files, the storage access control system may place similar access control restrictions on the backup files that existed on the original files. In this way, the backed up data is given similar protection as that of the original data.
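By way of illustration only, the flow described above (receive a storage operation request, query the source for access control information, and apply that information at the destination) might be sketched as follows. The names `DataObject`, `Location`, and `perform_storage_operation` are hypothetical and do not appear in the described system, and the in-memory dictionaries merely stand in for real source and destination storage.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A stored object plus the access control information attached to it."""
    name: str
    content: bytes
    allowed: frozenset = frozenset()  # users/groups permitted to access it


@dataclass
class Location:
    """Hypothetical stand-in for a source or destination store."""
    objects: dict = field(default_factory=dict)


def perform_storage_operation(source: Location, destination: Location, name: str) -> DataObject:
    """Copy one object and carry its access control information along."""
    original = source.objects[name]
    # Query the source for the access control scheme currently in place.
    access_info = original.allowed
    # Apply the same restrictions to the copy made at the destination.
    copy = DataObject(name=original.name, content=original.content, allowed=access_info)
    destination.objects[name] = copy
    return copy


source = Location({"plan.doc": DataObject("plan.doc", b"Q3 plan", frozenset({"ceo"}))})
backup = Location()
copied = perform_storage_operation(source, backup, "plan.doc")
```

In this sketch the access information travels with the copied object; an embodiment could equally record it elsewhere, such as in a content indexing system, as noted above.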
The invention will now be described with respect to various embodiments. The following description provides specific details for a thorough understanding of, and enabling description for, these embodiments of the invention. However, one skilled in the art will understand that the invention may be practiced without these details. In other instances, well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the invention.
The terminology used in the description presented below is intended to be interpreted in its broadest reasonable manner, even though it is being used in conjunction with a detailed description of certain specific embodiments of the invention. Certain terms may even be emphasized below; however, any terminology intended to be interpreted in any restricted manner will be overtly and specifically defined as such in this Detailed Description section.
Improving Security with ACLs and Active Directory
In some embodiments, the storage access control system determines the access control information stored by the preexisting security infrastructure based on an offline or secondary copy of the data. An offline copy can be a backup, snapshot, or other copy of the data that is not actively being used by a live data server or other computer system. By using a secondary copy, the storage access control system can avoid interrupting user access to the live data by not consuming additional resources on the server or other computer system storing the live copy of the data.
When a live or production copy of the source data is used to create a secondary copy, the preexisting security information associated with the data may also be associated with the secondary copy. For example, if the source data is a file, then the security information associated with the file may be captured when the secondary copy is created and stored with the file or in another location that is associated with the secondary copy. For example, many file systems contain hierarchical security schemes such that access control information applied to a parent file system object (e.g., a folder) is applied to each of the child file system objects (e.g., files in the folder). The storage access control system captures this information so that the access control information applied to source data can also be applied to secondary copies of the source data. For example, if the user later performs a search and the storage access control system searches offline copies of data, then the storage access control system can ensure that the user has similar access (both permitting allowed operations and denying excluded operations) to the offline data that the user had to the original live data from which the offline data was created. For example, if the user could not browse particular source data, then the storage access control system may exclude references to secondary copies of the source data from search results. Similarly, if the user could browse and read a file but not write to it, then the storage access control system may permit the user to receive the file in search results, read from the file, but not make modifications to the file.
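As a minimal sketch of capturing hierarchical access control information for a secondary copy, the following assumes a simplified inheritance model in which a child object without its own entries inherits from its nearest ancestor. Actual file systems have richer inheritance rules. The function names and the operation names ("browse", "read", "write") are illustrative assumptions.

```python
from pathlib import PurePosixPath


def effective_access(path: str, acl_by_path: dict) -> dict:
    """Return the access entries of the nearest object on the path that has any.

    acl_by_path maps a path to {principal: set of permitted operations}.
    """
    current = PurePosixPath(path)
    while True:
        if str(current) in acl_by_path:
            return acl_by_path[str(current)]
        if current == current.parent:  # reached the top without a match
            return {}
        current = current.parent


def capture_for_secondary_copy(path: str, acl_by_path: dict) -> dict:
    """Snapshot the effective access info so it can be stored with the copy."""
    effective = effective_access(path, acl_by_path)
    return {principal: set(ops) for principal, ops in effective.items()}


def may(user_groups: set, operation: str, captured_acl: dict) -> bool:
    """Check an operation against the access info captured for a copy."""
    return any(operation in captured_acl.get(group, set()) for group in user_groups)


# The folder grants browse and read, but not write, to the accounting group.
source_acls = {"/finance": {"accounting": {"browse", "read"}}}
captured = capture_for_secondary_copy("/finance/2006/ledger.xls", source_acls)
```

With the captured entries, a member of the accounting group could receive and read the secondary copy of the file but not modify it, while a user outside that group would not see it at all.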
In some embodiments, the storage access control system stores access control information as metadata that identifies users or groups authorized to perform storage operations. For example, backup files may contain metadata that lists the users that can access the data contained in the backup file. Alternatively or additionally, backup data that is indexed for searching may be associated with metadata stored with the index to apply access control information in response to search queries. For example, a user that does not have permission to access a particular backup data object may be prevented from receiving that data object in a list of results from a search query, even though the backup data object may satisfy the search criteria. Alternatively or additionally, the user may be able to receive the data object in a list of search results, but not be able to open or view the data object. A data object could be a file system object (e.g., a file or folder), an application data object (e.g., an email mailbox, word processing document, etc.), or other object containing data.
In some embodiments, the storage access control system stores access control information as an Access Control List (ACL) containing Access Control Entries (ACE). The ACL contains a list of users and/or groups that are allowed to access a data object, type of data object, or resource containing a data object. Each ACE may specify a user, group, or other entity that has access to the data object associated with the ACL. In some embodiments, an ACL may contain a list of users or groups that are specifically denied access to a data object. In this way, administrators can apply access control rights in the manner that is most logical for their organization. For example, if everyone in the accounting department except User A should have access to a particular data object, then an administrator may create an ACL associated with the data object containing an ACE that allows access to the accounting department group, and another ACE that denies access to User A. The ACL may also contain Boolean operators that describe combinations of permissions and users that should be applied to a data object.
When a user, system, or process attempts to access a data object, such as to perform a storage operation on the data object, the storage access control system accesses the ACL and associated ACEs related to the data object to determine whether the user has the appropriate access to perform the operation on the data object. If the user has the appropriate access, then permission to perform the operation is granted, and the operation proceeds. If the user does not have the appropriate access, then the storage access control system denies permission to perform the operation, and an error or other information may be conveyed to the user indicating that the operation was not performed.
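The ACL check described above can be illustrated with the following sketch, which reproduces the accounting department example. It assumes, for illustration only, that a matching deny entry overrides any allow entry and that access is denied when no entry matches; the description does not mandate a particular precedence. The `ACE`, `Effect`, and `is_permitted` names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Effect(Enum):
    ALLOW = "allow"
    DENY = "deny"


@dataclass(frozen=True)
class ACE:
    principal: str         # user or group name
    effect: Effect
    operations: frozenset  # e.g. {"read", "restore"}


def is_permitted(acl: list, user: str, groups: set, operation: str) -> bool:
    """Grant only if some ACE allows the operation and no ACE denies it."""
    principals = {user} | set(groups)
    matching = [
        ace for ace in acl
        if ace.principal in principals and operation in ace.operations
    ]
    if any(ace.effect is Effect.DENY for ace in matching):
        return False  # assumed precedence: deny overrides allow
    return any(ace.effect is Effect.ALLOW for ace in matching)


# Everyone in accounting except User A may read and restore the object.
acl = [
    ACE("accounting", Effect.ALLOW, frozenset({"read", "restore"})),
    ACE("user_a", Effect.DENY, frozenset({"read", "restore"})),
]
```

When `is_permitted` returns False, the system would refuse the operation and could convey an error to the requesting user, as described above.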
In some embodiments, the storage access control system further protects secondary copies of data, such as by encrypting the data. This may be useful when the backup data is expected to be stored offsite, such as by a public remote backup provider. The data may be encrypted such that it can only be decrypted by those users or groups with access to the original data. For example, the data may be encrypted using a key that is associated with a particular group of users that has access to the data. Users that are not part of the group will not know or be associated with the key and therefore will not be able to decrypt the data, while users within the group will know the key and can decrypt and access the data.
In some embodiments, the storage access control system assigns access rights based on the content of or metadata associated with a data object, such as by querying a content or metadata indexing system. For example, some users may be denied access to files that contain the word “confidential.” An access group of company executives can be granted exclusive access to files that contain the term “board of directors.” The system may apply such content filtering to the data directly, or the system can filter searches for data objects such that the search results do not contain content to which the searching user has not been granted the right to access.
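One possible, simplified way to express such content-based rules is sketched below. Each rule pairs a term with the one group that is granted exclusive access to objects containing that term. The `ContentRule` type, the group names "cleared" and "executives", and the case-insensitive substring match are assumptions made for illustration, not features of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentRule:
    term: str             # term to look for in the object's content
    exclusive_group: str  # only members of this group may access matches


def may_access(content: str, user_groups: set, rules: list) -> bool:
    """Deny access if the content matches a rule whose group the user lacks."""
    lowered = content.lower()
    for rule in rules:
        if rule.term.lower() in lowered and rule.exclusive_group not in user_groups:
            return False
    return True


def filter_results(results: dict, user_groups: set, rules: list) -> list:
    """Drop search results whose content the user may not access."""
    return [name for name, content in results.items()
            if may_access(content, user_groups, rules)]


rules = [
    ContentRule("confidential", "cleared"),
    ContentRule("board of directors", "executives"),
]
results = {
    "memo.txt": "Lunch menu for Friday",
    "minutes.txt": "Minutes of the Board of Directors meeting",
    "plan.txt": "CONFIDENTIAL: product roadmap",
}
```

The same check can be applied either when a data object is opened directly or, as in `filter_results`, when search results are prepared for a user.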
Active Directory Integration for User Creation
In some embodiments, the storage access control system provides a separate security infrastructure, but recognizes users and groups created in the preexisting security infrastructure. For example, server systems running Microsoft Windows often use Active Directory or other systems to create users and groups and assign access rights to those users and groups. The storage access control system may allow creating a separate set of users and groups that are assigned various storage operation rights. However, rather than recreating each user from the Active Directory in the storage access control system, the storage access control system may allow adding an Active Directory user or group to a storage access control system group. For example, when an Active Directory user is added to a storage access control system group, the storage access control system may query the Active Directory to determine information about the user and the access rights associated with the user. Thus, it is not necessary to give storage system operators permissions to create new storage access control system users, and it is not necessary to duplicate the users in both security systems. Similarly, other preexisting security infrastructures could be used with the storage access control system.
The storage access control system may also retrieve other information from the preexisting security system. For example, the preexisting security system may maintain a list of computers associated with a particular user, and the storage access control system can grant the user access, for example, to backup computers in that list. The preexisting security system may contain other supplemental information, such as the user's email address that the storage access control system may use, for example, to email the user if a storage operation fails. The integration and connection of the storage access control system with the preexisting security system allows the storage access control system to provide a system administrator with additional value in the administrator's investment of time and resources in the preexisting security system and reduces the need for a redundant investment of time and resources in another security system.
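The following sketch illustrates, under stated assumptions, how a storage group might reference users and groups that exist only in a preexisting directory rather than duplicating them. `ExternalDirectory` is a hypothetical in-memory stand-in and does not represent the Active Directory API; the user, group, and right names are likewise invented for the example, and nested groups are not handled.

```python
class ExternalDirectory:
    """Hypothetical stand-in for a preexisting directory service."""

    def __init__(self, group_members: dict, emails: dict):
        self._group_members = group_members
        self._emails = emails

    def is_member(self, user: str, principal: str) -> bool:
        """True if principal is the user, or a group that contains the user."""
        return user == principal or user in self._group_members.get(principal, set())

    def email_of(self, user: str):
        """Supplemental information, e.g. for failure notifications."""
        return self._emails.get(user)


class StorageGroup:
    """A group defined in the storage system that holds external principals."""

    def __init__(self, name: str, rights: set):
        self.name = name
        self.rights = set(rights)
        self.external_principals = set()

    def add_external(self, principal: str) -> None:
        # Only a reference is stored; the user is not recreated locally.
        self.external_principals.add(principal)


def rights_for(user: str, groups: list, directory: ExternalDirectory) -> set:
    """Collect storage rights by asking the directory about membership."""
    granted = set()
    for group in groups:
        if any(directory.is_member(user, p) for p in group.external_principals):
            granted |= group.rights
    return granted


directory = ExternalDirectory(
    group_members={"IT Staff": {"alice"}},
    emails={"bob.jones": "bob.jones@example.com"},
)
admins = StorageGroup("System Administrators", {"backup", "restore"})
admins.add_external("bob.jones")  # an individual external user
admins.add_external("IT Staff")   # an entire external group
```

Because membership is resolved by querying the directory at the time of the check, changes made in the preexisting security system take effect in the storage system without a second round of administration.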
Security-Based Queries and Access Filtering
In some embodiments, the storage access control system provides an indexing and search facility that allows searching based on keywords within backed up documents. The storage access control system stores access control information for indexed files and applies access control to search queries initiated by a user, system, or process. For example, an administrator may be able to search backup data for all users, whereas another user may only be able to search her own backup data. Likewise, an executive of a company may be able to search for and view content containing sensitive business plans or trade secrets, but other employees may not.
Such access control may be applied using ACLs and Active Directory groups as described above. For example, a user with an ACL on an original file that allows the user to view the file can also view search results containing the file, whereas a user without access to the original file cannot view the file by opening it from a list of search results. Similarly, a user that is a member of an Active Directory group that has access to a file will have access to view search results containing the file. In this way, an organization can leverage the investment in an existing security infrastructure to provide similar security for content accessible via a search facility.
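A minimal sketch of access-filtered searching over an index is given below. It assumes that each index entry stores, as metadata, the principals that could read the original file. The `IndexEntry` and `SearchHit` types and the `hide_restricted` flag are hypothetical; the flag selects between the two behaviors described earlier, namely omitting inaccessible objects or listing them without allowing them to be opened.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    name: str
    keywords: frozenset
    readers: frozenset  # principals captured from the original file's access info


@dataclass(frozen=True)
class SearchHit:
    name: str
    restricted: bool  # True if listed but not openable by the searcher


def search(index: list, keyword: str, principals: set, hide_restricted: bool = True) -> list:
    """Return matching entries, filtered by the searcher's user and group names."""
    hits = []
    for entry in index:
        if keyword not in entry.keywords:
            continue
        if entry.readers & principals:
            hits.append(SearchHit(entry.name, restricted=False))
        elif not hide_restricted:
            hits.append(SearchHit(entry.name, restricted=True))
    return hits


index = [
    IndexEntry("strategy.doc", frozenset({"budget", "plan"}), frozenset({"executives"})),
    IndexEntry("expenses.xls", frozenset({"budget"}), frozenset({"accounting", "executives"})),
]
```

Here an executive searching for "budget" would receive both documents, while a member of the accounting group would receive only the expense sheet, or both with the strategy document marked as restricted when `hide_restricted` is False.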
Unless described otherwise below, aspects of the invention may be practiced with conventional systems. Thus, the construction and operation of the various blocks shown in
The set destination security component 130 applies access control information identified from an external security provider to data managed by the storage access control system. For example, during a backup operation, access control information from a source file is associated with secondary copies that store information from the source file, such that a user has similar access rights to the source file and the backup data. The index storage data component 140 creates an index of storage data managed by the storage access control system. For example, the system 100 can maintain an index of data present in a set of files that have been backed up.
The search storage data component 150 performs searches of indexed storage data to identify matching data objects. The provide search results component 160 prepares identified matching data objects for display to a user. For example, data objects for which the searching user does not have access rights may be removed from the search results before the results are returned to the user. The apply content security component 170 applies security to a data object based on the content of the data object. For example, if a user has not been granted access to documents containing the word “confidential,” then the apply content security component 170 prevents the user from accessing a document containing “confidential.”
Aspects of the invention can be embodied in a special purpose computer or data processor that is specifically programmed, configured, or constructed to perform one or more of the computer-executable instructions explained in detail herein. Aspects of the invention can also be practiced in distributed computing environments where tasks or modules are performed by remote processing devices, which are linked through a communications network, such as a Local Area Network (LAN), Wide Area Network (WAN), or the Internet. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
Aspects of the invention may be stored or distributed on computer-readable media, including magnetically or optically readable computer discs, hard-wired or preprogrammed chips (e.g., EEPROM semiconductor chips), nanotechnology memory, biological memory, or other data storage media. Indeed, computer implemented instructions, data structures, screen displays, and other data under aspects of the invention may be distributed over the Internet or over other networks (including wireless networks), on a propagated signal on a propagation medium (e.g., an electromagnetic wave(s), a sound wave, etc.) over a period of time, or they may be provided on any analog or digital network (packet switched, circuit switched, or other scheme). Those skilled in the relevant art will recognize that portions of the invention reside on a server computer, while corresponding portions reside on a client computer such as a mobile or portable device, and thus, while certain hardware platforms are described herein, aspects of the invention are equally applicable to nodes on a network.
In step 430, the system adds the external users to the storage application group, such that the storage application group contains users that were not created using the storage application. For example, a user “Bob Jones” created in the Active Directory may be added to a group “System Administrators” within the storage application. The external users may also be user groups, such that a group previously created by the administrator using Windows Active Directory is added to the storage application group. In step 440, the system applies the access control rights of the storage application group to the added external users. The system is more secure than traditional systems because each administrator is not given access to create new users within the storage application. By allowing an administrator to add external users to the storage application, the system does not need to allow most administrators to have the access rights necessary to create new users within the storage application. For example, an administrator may only be able to add existing users or groups to the storage application. Thus, an administrator of the preexisting security system can restrict the entities to which an administrator of the storage application can assign rights. Storage system administrators often have access to some of a corporation's most important data, so the ability to control which users can perform storage operations can significantly enhance data security. After step 440, these steps conclude.
From the foregoing, it will be appreciated that specific embodiments of the storage access control system have been described herein for purposes of illustration, but that various modifications may be made without deviating from the spirit and scope of the invention. For example, although certain preexisting security systems have been described, the storage access control system is compatible with any preexisting security system, such as Linux Kerberos, Lightweight Directory Access Protocol (LDAP)-based systems, and others. Although backups have been described, the storage access control system can be applied to other storage operations such as migrating data from one system to another. Accordingly, the invention is not limited except as by the appended claims.
Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to.” The word “coupled”, as generally used herein, refers to two or more elements that may be either directly connected, or connected by way of one or more intermediate elements. Additionally, the words “herein,” “above,” “below,” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of this application. Where the context permits, words in the above Detailed Description using the singular or plural number may also include the plural or singular number respectively. The word “or,” in reference to a list of two or more items, covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.
The above detailed description of embodiments of the invention is not intended to be exhaustive or to limit the invention to the precise form disclosed above. While specific embodiments of, and examples for, the invention are described above for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize. For example, while processes or blocks are presented in a given order, alternative embodiments may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed in parallel, or may be performed at different times.
The teachings of the invention provided herein can be applied to other systems, not necessarily the system described above. The elements and acts of the various embodiments described above can be combined to provide further embodiments.
These and other changes can be made to the invention in light of the above Detailed Description. While the above description details certain embodiments of the invention and describes the best mode contemplated, no matter how detailed the above appears in text, the invention can be practiced in many ways. The system may vary considerably in its implementation details, while still being encompassed by the invention disclosed herein. As noted above, particular terminology used when describing certain features or aspects of the invention should not be taken to imply that the terminology is being redefined herein to be restricted to any specific characteristics, features, or aspects of the invention with which that terminology is associated. In general, the terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification, unless the above Detailed Description section explicitly defines such terms. Accordingly, the actual scope of the invention encompasses not only the disclosed embodiments, but also all equivalent ways of practicing or implementing the invention under the claims.
While certain aspects of the invention are presented below in certain claim forms, the inventors contemplate the various aspects of the invention in any number of claim forms. For example, while only one aspect of the invention is recited as embodied in a computer-readable medium, other aspects may likewise be embodied in a computer-readable medium. Accordingly, the inventors reserve the right to add additional claims after filing the application to pursue such additional claim forms for other aspects of the invention.
1. A method of searching for data objects in a data management system, the method comprising:
- receiving one or more criteria describing at least one data object to be located within the data management system;
- identifying one or more data objects stored within the data management system that satisfy the received one or more criteria;
- determining one or more access rights associated with the identified one or more data objects stored within the data management system; and
- providing a filtered list of results that contains the identified one or more data objects, wherein the list is filtered based on the determined one or more access rights.
2. The method of claim 1 wherein determining one or more access rights comprises determining access rights based on an identity of a user from which the one or more criteria are received.
3. The method of claim 1 wherein determining one or more access rights comprises determining access rights based on an identity of a process from which the one or more criteria are received.
4. The method of claim 1 wherein determining one or more access rights comprises determining access rights based on an identity of the identified one or more data objects.
5. The method of claim 1 wherein the identified one or more data objects include textual content, and wherein determining one or more access rights comprises determining access rights based on the included textual content.
6. The method of claim 1 wherein providing a filtered list of results comprises removing identified data objects from the results to which the one or more access rights do not grant access.
7. The method of claim 1 wherein providing a filtered list of results comprises providing an indication that access to the results is restricted to results to which the one or more access rights grant access.
8. The method of claim 1 wherein identifying one or more data objects stored within the data management system comprises querying a database that maintains an index of data objects stored within the data management system and access control information associated with the data objects to determine data objects that satisfy the one or more received criteria.
9. The method of claim 1 wherein the access rights associated with the identified one or more data objects are based on access control information associated with source data used to create the one or more data objects.
10. The method of claim 1 wherein the data management system contains multiple copies of certain data objects, and wherein similar access rights are associated with each of the copies of the certain data objects.
11. The method of claim 1 wherein the access rights of a user from which the one or more criteria are received are determined by the membership of the user in one or more Microsoft® Windows Active Directory groups.
12. A computer-readable medium containing instructions for controlling a computer system to restrict access to data objects stored within a storage management system, by a method comprising:
- receiving a request identifying a particular copy of a data object for which access rights are to be determined, wherein the data object has multiple copies;
- identifying the entity requesting access to the particular copy of the data object;
- querying access control information for the particular copy of the data object from the storage management system, wherein the storage management system determines access control information with each data object when a first instance of the data object is encountered and associates the access control information with each subsequent copy of the data object that is created; and,
- indicating whether the identified entity requesting access to the data object is granted access to the data object based on the access control information associated with the data object by the storage management system, wherein the indication is the same regardless of which of the multiple copies of the data object the request identifies.
13. The computer-readable medium of claim 12 wherein identifying the entity requesting access comprises determining the access rights assigned to the entity by a security system.
14. The computer-readable medium of claim 12 wherein different storage operations have been performed on each of the copies of the data object having multiple copies.
15. The computer-readable medium of claim 12 wherein storing access control information for each data object when a first instance of the data object is encountered comprises retrieving access control information associated with a file system in which the data object is stored.
16. The computer-readable medium of claim 12 wherein querying access control information for the particular copy of the data object from the storage management system comprises accessing an index that stores information about each copy of the data object and access control information associated with the data object.
17. The computer-readable medium of claim 12 wherein the entity requesting access is a member of an external security group and identifying the entity requesting access to the particular copy of the data object comprises determining the access rights assigned to members of the external security group.
18. A system for filtering data objects provided in response to a search in a data management system based on access rights associated with the data objects, the system comprising:
- a network security component that provides access control information for data objects stored by one or more computers within the data management system, wherein the access control information is based on access control information associated with source data used to create each data object;
- an entity identification component that identifies an entity requesting access to a data object stored within the data management system;
- a storage search component that receives criteria and performs searches for data objects within the data management system that satisfy at least one of the criteria; and
- a data object access component that determines whether the entity identified by the entity identification component has access to the data objects discovered by the storage search component based on the access control information.
19. The system of claim 18 wherein the network security component manages storage operations associated with data objects and when a storage operation creates a copy of a data object, migrates access control information associated with the source of the data object to the copy of the data object.
20. The system of claim 18, further comprising a web server component that provides access to data objects and the storage search component through a web browser interface.
International Classification: G06F 17/30 (20060101); G06F 7/00 (20060101);