Data tagging

- VARONIS SYSTEMS, INC.

A method for characterizing data elements in an enterprise including ascertaining at least one of an access metric and a data identifier for each of a plurality of data elements and employing the at least one of an access metric and a data identifier to automatically apply a metatag to ones of the plurality of data elements.

Description
REFERENCE TO RELATED APPLICATIONS

Reference is made to U.S. Provisional Patent Application Ser. No. 61/348,829, filed May 27, 2010 and entitled “DATA MANAGEMENT USING DATA TAGGING”, the disclosure of which is hereby incorporated by reference and priority of which is hereby claimed pursuant to 37 CFR 1.78(a) (4) and (5)(i).

Reference is also made to U.S. patent application Ser. No. 13/014,762, filed Jan. 27, 2011, and entitled “AUTOMATIC RESOURCE OWNERSHIP ASSIGNMENT SYSTEMS AND METHODS”, the disclosure of which is hereby incorporated by reference and priority of which is hereby claimed pursuant to 37 CFR 1.78(a) (1) and (2)(i).

Reference is also made to U.S. patent application Ser. No. 13/106,023, filed May 12, 2011, and entitled “AUTOMATIC RESOURCE OWNERSHIP ASSIGNMENT SYSTEM AND METHOD”, the disclosure of which is hereby incorporated by reference and priority of which is hereby claimed pursuant to 37 CFR 1.78(a) (1) and (2)(i).

Reference is also made to the following patents and patent applications, owned by assignee, the disclosures of which are hereby incorporated by reference:

U.S. Pat. Nos. 7,555,482 and 7,606,801;

U.S. Published Patent Application Nos. 2007/0244899; 2008/0271157; 2009/0100058; 2009/0119298; 2009/0265780; 2011/0060916 and 2011/0061111; and

U.S. patent application Ser. No. 12/673,691.

FIELD OF THE INVENTION

The present invention relates to improved systems and methodologies for data tagging.

BACKGROUND OF THE INVENTION

The following patent publications are believed to represent the current state of the art:

U.S. Pat. Nos. 5,465,387; 5,899,991; 6,338,082; 6,393,468; 6,928,439; 7,031,984; 7,068,592; 7,403,925; 7,421,740; 7,555,482; 7,606,801 and 7,743,420; and

U.S. Published Patent Application Nos.: 2003/0051026; 2004/0249847; 2005/0108206; 2005/0203881; 2005/0086529; 2006/0064313; 2006/0184530; 2006/0184459; 2007/0203872; 2007/0244899; 2008/0271157; 2009/0100058; 2009/0119298 and 2009/0265780.

SUMMARY OF THE INVENTION

The present invention provides improved systems and methodologies for data tagging.

There is thus provided in accordance with a preferred embodiment of the present invention a method for characterizing data elements in an enterprise including ascertaining at least one of an access metric and a data identifier for each of a plurality of data elements and employing the at least one of an access metric and a data identifier to automatically apply a metatag to ones of the plurality of data elements.

Preferably, the method for characterizing data elements in an enterprise also includes ascertaining an owner for each of the plurality of data elements and requiring the owner to review and validate the metatags automatically applied to ones of the plurality of data elements of which he is the owner.

In accordance with a preferred embodiment of the present invention the employing includes automatically applying specific ones of a plurality of different metatags to specific ones of the plurality of data elements. Additionally or alternatively, the employing includes automatically applying to each one of the plurality of data elements a metatag previously applied to a parent folder thereof.

Preferably, the data identifier is one of file type, author, category and language.

In accordance with a preferred embodiment of the present invention the method for characterizing data elements in an enterprise also includes maintaining a database of access metrics for each of the plurality of data elements. Additionally or alternatively, the method for characterizing data elements in an enterprise also includes maintaining a database of data identifiers for each of the plurality of data elements.

Preferably, the employing includes employing an access metric and a data identifier to automatically apply a metatag to ones of the plurality of data elements. Alternatively, the employing includes employing an access metric to automatically apply a metatag to ones of the plurality of data elements. In another alternative embodiment the employing includes employing a data identifier to automatically apply a metatag to ones of the plurality of data elements.

There is also provided in accordance with another preferred embodiment of the present invention a method for characterizing data elements in an enterprise including ascertaining at least one of an access metric and a data identifier for each of a plurality of data elements and employing the at least one of an access metric and a data identifier to automatically recommend application of metatags to the plurality of data elements.

Preferably, the employing includes automatically recommending application of specific ones of a plurality of different metatags to specific ones of the plurality of data elements. Additionally or alternatively, the employing includes automatically recommending application to each of the plurality of data elements a metatag previously applied to a parent folder thereof.

In accordance with a preferred embodiment of the present invention the method for characterizing data elements in an enterprise also includes ascertaining an owner for each of the plurality of data elements and requiring the owner to review and validate application of the recommended metatags to ones of the plurality of data elements of which he is the owner.

Preferably, the data identifier is one of file type, author, category and language.

In accordance with a preferred embodiment of the present invention the method for characterizing data elements in an enterprise also includes maintaining a database of access metrics for each of the plurality of data elements. Additionally or alternatively, the method for characterizing data elements in an enterprise also includes maintaining a database of data identifiers for each of the plurality of data elements.

Preferably, the employing includes employing an access metric and a data identifier to automatically recommend application of a metatag to ones of the plurality of data elements. Alternatively, the employing includes employing an access metric to automatically recommend application of metatags to the plurality of data elements. In another alternative embodiment, the employing includes employing a data identifier to automatically recommend application of metatags to the plurality of data elements.

There is yet further provided in accordance with still another preferred embodiment of the present invention a method for characterizing data elements in an enterprise including ascertaining an owner for each of a plurality of data elements and requiring the owner to apply at least one metatag to ones of the plurality of data elements of which he is the owner.

In accordance with a preferred embodiment of the present invention the method for characterizing data elements in an enterprise also includes maintaining a database of access metrics for each of the plurality of data elements. Additionally or alternatively, the method for characterizing data elements in an enterprise also includes maintaining a database of data identifiers for each of the plurality of data elements.

There is even further provided in accordance with a further preferred embodiment of the present invention a method for characterizing data elements in an enterprise including ascertaining an owner for each of a plurality of data elements and requiring the owner to review and validate metatags applied to ones of the plurality of data elements of which he is the owner.

In accordance with a preferred embodiment of the present invention the method for characterizing data elements in an enterprise also includes maintaining a database of access metrics for each of the plurality of data elements. Additionally or alternatively, the method for characterizing data elements in an enterprise also includes maintaining a database of data identifiers for each of the plurality of data elements.

There is also provided in accordance with yet another preferred embodiment of the present invention a method for characterizing data elements in an enterprise including ascertaining an owner for each of a plurality of data elements and automatically recommending application of metatags by the owner to the plurality of data elements of which he is the owner.

In accordance with a preferred embodiment of the present invention the method for characterizing data elements in an enterprise also includes maintaining a database of access metrics for each of the plurality of data elements. Additionally or alternatively, the method for characterizing data elements in an enterprise also includes maintaining a database of data identifiers for each of the plurality of data elements.

There is further provided in accordance with still another preferred embodiment of the present invention a method of operating a file system including maintaining a data owner/administrator accessible database of metatags assigned by data owners/administrators to a plurality of data elements, applying the metatags to the plurality of data elements in a storage platform and automatically synchronizing the metatags applied to the plurality of data elements and the database.

There is still further provided in accordance with another preferred embodiment of the present invention a system for characterizing data elements in an enterprise including access metrics collection functionality operative to collect access metrics associated with a plurality of data elements, metadata collection functionality operative to collect metadata associated with the plurality of data elements and metatag application functionality operative to utilize the access metrics collection functionality and the metadata collection functionality to automatically employ at least one of an access metric and a data identifier to automatically apply a metatag to ones of the plurality of data elements.

Preferably, the system for characterizing data elements in an enterprise also includes metatag owner validation functionality operative to ascertain owners of each of the plurality of data elements and to require each of the owners to review and validate the metatags automatically applied to ones of the plurality of data elements of which he is the owner.

In accordance with a preferred embodiment of the present invention the metatag application functionality is also operative to automatically apply specific ones of a plurality of different metatags to specific ones of the plurality of data elements. Additionally or alternatively, the metatag application functionality is also operative to automatically apply to each one of the plurality of data elements a metatag previously applied to a parent folder thereof.

Preferably, the data identifier is one of file type, author, category and language.

In accordance with a preferred embodiment of the present invention the system for characterizing data elements in an enterprise also includes an access metrics database which stores the access metrics collected by the access metrics collection functionality. Additionally or alternatively, the system for characterizing data elements in an enterprise also includes a metadata database which stores the metadata collected by the metadata collection functionality.

Preferably, the metatag application functionality is also operative to utilize the access metrics collection functionality and the metadata collection functionality to automatically employ an access metric and a data identifier to automatically apply a metatag to ones of the plurality of data elements. Alternatively, the metatag application functionality is also operative to utilize the access metrics collection functionality to automatically employ an access metric to automatically apply a metatag to ones of the plurality of data elements. In another alternative embodiment, the metatag application functionality is also operative to utilize the metadata collection functionality to automatically employ a data identifier to automatically apply a metatag to ones of the plurality of data elements.

There is yet further provided in accordance with still another preferred embodiment of the present invention a system for characterizing data elements in an enterprise including access metrics collection functionality operative to collect access metrics associated with a plurality of data elements, metadata collection functionality operative to collect metadata associated with the plurality of data elements and metatag recommendation functionality operative to utilize the access metrics collection functionality and the metadata collection functionality to automatically employ at least one of an access metric and a data identifier to automatically recommend application of a metatag to ones of the plurality of data elements.

Preferably, the metatag recommendation functionality is also operative to automatically recommend application of specific ones of a plurality of different metatags to specific ones of the plurality of data elements. Additionally or alternatively, the metatag recommendation functionality is also operative to automatically recommend applying to each one of the plurality of data elements a metatag previously applied to a parent folder thereof.

In accordance with a preferred embodiment of the present invention the system for characterizing data elements in an enterprise also includes metatag owner validation functionality operative to ascertain owners of each of the plurality of data elements and to require each of the owners to review and validate application of the recommended metatags to ones of the plurality of data elements of which he is the owner.

Preferably, the data identifier is one of file type, author, category and language.

In accordance with a preferred embodiment of the present invention the system for characterizing data elements in an enterprise also includes an access metrics database which stores the access metrics collected by the access metrics collection functionality. Additionally or alternatively, the system for characterizing data elements in an enterprise also includes a metadata database which stores the metadata collected by the metadata collection functionality.

Preferably, the metatag recommendation functionality is also operative to utilize the access metrics collection functionality and the metadata collection functionality to automatically employ an access metric and a data identifier to automatically recommend application of a metatag to ones of the plurality of data elements. Alternatively, the metatag recommendation functionality is also operative to utilize the access metrics collection functionality to automatically employ an access metric to automatically recommend application of a metatag to ones of the plurality of data elements. In another alternative embodiment the metatag recommendation functionality is also operative to utilize the metadata collection functionality to automatically employ a data identifier to automatically recommend application of a metatag to ones of the plurality of data elements.

There is even further provided in accordance with yet another preferred embodiment of the present invention a system for characterizing data elements in an enterprise including metatag owner validation functionality operative to ascertain owners of each of a plurality of data elements and to require each of the owners to apply at least one metatag to ones of the plurality of data elements of which he is the owner.

Preferably, the system for characterizing data elements in an enterprise also includes an access metrics database which stores access metrics associated with the plurality of data elements. Additionally or alternatively, the system for characterizing data elements in an enterprise also includes a metadata database which stores metadata associated with the plurality of data elements.

There is also provided in accordance with still another preferred embodiment of the present invention a system for characterizing data elements in an enterprise including metatag owner validation functionality operative to ascertain owners of each of a plurality of data elements and to require each of the owners to review and validate application of metatags to ones of the plurality of data elements of which he is the owner.

In accordance with a preferred embodiment of the present invention the system for characterizing data elements in an enterprise also includes an access metrics database which stores access metrics associated with the plurality of data elements. Additionally or alternatively, the system for characterizing data elements in an enterprise also includes a metadata database which stores metadata associated with the plurality of data elements.

There is yet further provided in accordance with yet another preferred embodiment of the present invention a system for characterizing data elements in an enterprise including metatag owner validation functionality operative to ascertain owners of each of a plurality of data elements and to recommend application of metatags by each of the owners to ones of the plurality of data elements of which he is the owner.

Preferably, the system for characterizing data elements in an enterprise also includes an access metrics database which stores access metrics associated with the plurality of data elements. Additionally or alternatively, the system for characterizing data elements in an enterprise also includes a metadata database which stores metadata associated with the plurality of data elements.

There is still further provided in accordance with another preferred embodiment of the present invention a system of operating a file system including a data owner/administrator accessible database of metatags assigned by data owners/administrators to a plurality of data elements, metatag application functionality operative to apply the metatags to the plurality of data elements in a storage platform and synchronizing functionality operative to automatically synchronize the metatags applied to the plurality of data elements and the database.
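The synchronizing functionality is described only at a high level. The following is a minimal sketch of one way such a reconciliation might look, assuming that the owner/administrator accessible database is treated as authoritative and that both sides can be viewed as mappings from a path to a set of metatags. The function names `plan_sync` and `apply_sync` and the dictionary layout are hypothetical and are not taken from the specification.

```python
from typing import Dict, Set, Tuple

TagMap = Dict[str, Set[str]]  # path of a data element -> metatags (assumed layout)


def plan_sync(database: TagMap, platform: TagMap) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """Return path -> (tags_to_add, tags_to_remove) that would make the
    storage platform match the database (database assumed authoritative)."""
    plan = {}
    for path in database.keys() | platform.keys():
        wanted = database.get(path, set())
        present = platform.get(path, set())
        to_add, to_remove = wanted - present, present - wanted
        if to_add or to_remove:
            plan[path] = (to_add, to_remove)
    return plan


def apply_sync(platform: TagMap, plan: Dict[str, Tuple[Set[str], Set[str]]]) -> None:
    """Apply a plan produced by plan_sync to the platform mapping in place."""
    for path, (to_add, to_remove) in plan.items():
        tags = (platform.get(path, set()) | to_add) - to_remove
        if tags:
            platform[path] = tags
        else:
            # No metatags left for this element, so drop its entry entirely.
            platform.pop(path, None)
```

Separating the plan from its application lets the differences be inspected or logged before anything is written to the storage platform. A deployment in which the platform is authoritative, or in which changes flow in both directions, would need a different reconciliation rule.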

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be understood and appreciated more fully from the following detailed description, taken in conjunction with the drawings in which:

FIG. 1 is a simplified pictorial illustration of an example of the operation of an automatic data tagging system constructed and operative in accordance with a preferred embodiment of the present invention;

FIG. 2 is a simplified pictorial illustration of an example of the operation of an automatic data tagging system constructed and operative in accordance with another preferred embodiment of the present invention;

FIG. 3 is a simplified pictorial illustration of an example of the operation of an automatic data tagging system constructed and operative in accordance with yet another preferred embodiment of the present invention;

FIG. 4 is a simplified flowchart indicating steps in the operation of an automatic data tagging system constructed and operative in accordance with a preferred embodiment of the present invention;

FIG. 5 is a simplified flowchart indicating steps in the operation of an automatic data tagging system constructed and operative in accordance with another preferred embodiment of the present invention;

FIG. 6 is a simplified flowchart indicating steps in the operation of an automatic data tagging system constructed and operative in accordance with yet another preferred embodiment of the present invention; and

FIG. 7 is a simplified block diagram illustration of the automatic data tagging system whose functionality is illustrated in FIGS. 1-6.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Reference is now made to FIG. 1, which is a simplified pictorial illustration of an example of the operation of an automatic data tagging system constructed and operative in accordance with a preferred embodiment of the present invention. The automatic data tagging system of FIG. 1 is preferably suitable for operating in an enterprise computer network including multiple disparate clients, data elements, computer hardware resources and computer software resources.

The operation of the automatic data tagging system of FIG. 1 preferably includes characterizing data elements in an enterprise by ascertaining at least one of an access metric and a data identifier for each of a plurality of data elements, and employing the at least one of an access metric and a data identifier to automatically apply a metatag to ones of the plurality of data elements. The operation of the automatic data tagging system also preferably includes ascertaining an owner for each of the plurality of data elements and requiring the owner to review and validate the metatags automatically applied to ones of the plurality of data elements of which he is the owner.

The term “data identifier” is used throughout to refer to metadata associated with a data element. The data identifier may be a content-based data identifier or a non content-based data identifier. A content-based data identifier associated with a data element preferably includes, for example, keywords or an abstract of the content of the data element. A non content-based data identifier associated with a data element preferably includes characteristics associated with the data element such as, for example, file type, author, category and language. A non content-based data identifier associated with a data element may also include one or more non content-based data identifiers associated with a parent folder of the data element. It is appreciated that the metadata may comprise predefined characteristics provided by the system which hosts the data elements or user-defined characteristics.
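As an illustration of the definition above, a data identifier might be modeled as a record with content-based and non content-based fields. This is a minimal sketch only: the class name `DataIdentifier`, its field names and the helper `with_parent_defaults` are hypothetical, and the choice to let a data element fall back to its parent folder's values only where its own are missing is an assumption rather than something the specification prescribes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class DataIdentifier:
    """Metadata associated with one data element (illustrative field names)."""
    # Content-based identifiers
    keywords: Set[str] = field(default_factory=set)
    abstract: Optional[str] = None
    # Non content-based identifiers
    file_type: Optional[str] = None
    author: Optional[str] = None
    category: Optional[str] = None
    language: Optional[str] = None
    # User-defined characteristics
    user_defined: Dict[str, str] = field(default_factory=dict)


def with_parent_defaults(child: DataIdentifier, parent: DataIdentifier) -> DataIdentifier:
    """Return a copy of `child` whose missing non content-based fields are
    filled from the parent folder's identifier (assumed fallback policy)."""
    return DataIdentifier(
        keywords=set(child.keywords),
        abstract=child.abstract,
        file_type=child.file_type,  # a folder has no file type to inherit
        author=child.author or parent.author,
        category=child.category or parent.category,
        language=child.language or parent.language,
        # The child's own user-defined values win over the folder's.
        user_defined={**parent.user_defined, **child.user_defined},
    )
```

The content-based fields are copied unchanged, since the passage extends only non content-based identifiers from the parent folder.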

The term “metatag” is used throughout to refer to a metadata tag which is associated with a data element. Metatags are useful, for example, for automating data management tasks and for identifying data elements which may be grouped or categorized together for purposes of automatic or manual data management tasks.

The automatic data tagging system of FIG. 1 typically resides on a server 100 that is connected to an enterprise computer network 102 which preferably includes multiple disparate clients 104, servers 106 and data storage resources 108. Typically, data elements, such as computer files, reside on servers 106 and on data storage resources 108 and are accessible to users of the network 102 in accordance with access permissions defined by an owner of each data element or each data element folder. It is appreciated that the data elements may reside on any suitable data storage system or platform, such as a file system or a data collaboration system, which may reside on any suitable computer operating system or infrastructure.

Preferably, the system continuously maintains a database of actual access and access permissions of every user to every data element in the enterprise. This functionality is described in U.S. Pat. No. 7,606,801, in U.S. Published Patent Application 2009/0265780 and in U.S. patent application Ser. No. 12/673,691 owned by assignee, the disclosures of which are hereby incorporated by reference. Access permissions and/or actual access are together designated as access metrics and may be used to designate subsets of all of the data elements in the enterprise.
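To make the notion of access metrics concrete, the following minimal sketch assumes an in-memory stand-in for the database, keyed by data element path. The class `AccessMetrics`, its fields and the two selection helpers are hypothetical names introduced for illustration. The incorporated references describe how such data is actually collected, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AccessMetrics:
    """Access metrics for one data element (assumed layout)."""
    # Users or groups holding access permissions to the element.
    permitted: Set[str] = field(default_factory=set)
    # User -> number of recorded actual accesses.
    access_counts: Dict[str, int] = field(default_factory=dict)


def select_by_permission(db: Dict[str, AccessMetrics], principal: str) -> List[str]:
    """Designate the subset of data elements that `principal` is permitted to access."""
    return sorted(path for path, m in db.items() if principal in m.permitted)


def select_by_actual_access(db: Dict[str, AccessMetrics], user: str) -> List[str]:
    """Designate the subset of data elements that `user` has actually accessed."""
    return sorted(path for path, m in db.items() if m.access_counts.get(user, 0) > 0)
```

Either selection, or a combination of the two, can serve as the subset of data elements to which a metatag is later applied.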

Preferably, the system also continuously crawls over at least a subset of all data elements in the enterprise and maintains a database of metadata associated with each of the subset of data elements.

As shown in FIG. 1, an IT Administrator of enterprise network 102 decides to utilize the automatic data tagging system residing on server 100 to automatically tag, as vulnerable files, a subset of files which grant access permissions to the ‘Everyone’ group and contain the term ‘confidential’. The Administrator then decides to send a list of the vulnerable files to their respective owners for access permissions remediation. In the example of FIG. 1, access permissions remediation may include, for example, modification of the access permissions of a file so that access is granted only to trusted individuals who require ongoing access to the file.
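The rule used in the example of FIG. 1 combines an access metric (permissions granted to the ‘Everyone’ group) with a content-based data identifier (the term ‘confidential’). A minimal sketch of such a rule follows. The record type `FileRecord`, the function `tag_vulnerable` and the case-insensitive text match are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class FileRecord:
    """One file with the attributes the rule needs (hypothetical layout)."""
    path: str
    owner: str
    permitted_groups: Set[str]
    text: str
    metatags: Set[str] = field(default_factory=set)


def tag_vulnerable(files: List[FileRecord]) -> Dict[str, List[str]]:
    """Tag files open to 'Everyone' that contain 'confidential' as 'vulnerable'
    and return owner -> paths, i.e. the list to send to each owner."""
    by_owner = defaultdict(list)
    for f in files:
        # Case-insensitive match is an assumption; the example only names the term.
        if "Everyone" in f.permitted_groups and "confidential" in f.text.lower():
            f.metatags.add("vulnerable")
            by_owner[f.owner].append(f.path)
    return dict(by_owner)
```

The returned mapping corresponds to the per-owner lists that the Administrator sends out for access permissions remediation.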

Reference is now made to FIG. 2, which is a simplified pictorial illustration of an example of the operation of an automatic data tagging system constructed and operative in accordance with another preferred embodiment of the present invention. The automatic data tagging system of FIG. 2 is preferably suitable for operating in an enterprise computer network including multiple disparate clients, data elements, computer hardware resources and computer software resources.

The operation of the automatic data tagging system of FIG. 2 preferably includes characterizing data elements in an enterprise by ascertaining at least one of an access metric and a data identifier for each of a plurality of data elements, and employing the at least one of an access metric and a data identifier to automatically apply a metatag to ones of the plurality of data elements. The operation of the automatic data tagging system also preferably includes ascertaining an owner for each of the plurality of data elements, and requiring the owner to review and validate the metatags automatically applied to ones of the plurality of data elements of which he is the owner.

The automatic data tagging system of FIG. 2 typically resides on a server 200 that is connected to an enterprise computer network 202 which preferably includes multiple disparate clients 204, servers 206 and data storage resources 208. Typically, data elements, such as computer files, reside on servers 206 and on data storage resources 208 and are accessible to users of the network 202 in accordance with access permissions defined by an owner of each data element or each data element folder. It is appreciated that the data elements may reside on any suitable data storage system or platform, such as a file system or a data collaboration system, which may reside on any suitable computer operating system or infrastructure.

Preferably, the system continuously maintains a database of actual access and access permissions of every user to every data element in the enterprise. This functionality is described in U.S. Pat. No. 7,606,801, in U.S. Published Patent Application 2009/0265780 and in U.S. patent application Ser. No. 12/673,691 owned by assignee, the disclosures of which are hereby incorporated by reference. Access permissions and/or actual access are together designated as access metrics and may be used to designate subsets of all of the data elements in the enterprise.

Preferably, the system also continuously crawls over at least a subset of all data elements in the enterprise and maintains a database of metadata associated with each of the subset of data elements.

As shown in FIG. 2, an IT Administrator of enterprise network 202 decides to utilize the automatic data tagging system residing on server 200 to automatically tag, as ‘legal’ files, a subset of files which are owned by Dave, the company attorney. The Administrator then decides to send a list of the legal files to Dave, requesting Dave to ascertain and confirm that the files tagged as ‘legal’ are actually legal-related files. As seen in FIG. 2, Dave ascertains and confirms that the files Contract1.doc and Agreement2.doc are actually legal-related files, while Resume5.doc is not legal-related, and therefore should not be tagged as ‘legal’.
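The tag-then-validate flow of FIG. 2 can be sketched as two steps: provisional tagging by owner, followed by the owner's review. In the minimal sketch below, the function names `tag_by_owner` and `apply_review` are hypothetical, and leaving a tag in place when the owner has given no decision for a file is an assumption.

```python
from typing import Dict, Set


def tag_by_owner(owners: Dict[str, str], owner: str, tag: str) -> Dict[str, Set[str]]:
    """Provisionally apply `tag` to every file owned by `owner`.
    `owners` maps file name -> owner name."""
    return {path: {tag} for path, o in owners.items() if o == owner}


def apply_review(tags: Dict[str, Set[str]], tag: str,
                 decisions: Dict[str, bool]) -> Dict[str, Set[str]]:
    """Return the tags after owner review: `tag` is removed from files the
    owner rejected. Files without a decision are left unchanged (assumption)."""
    reviewed = {}
    for path, current in tags.items():
        if decisions.get(path, True):
            reviewed[path] = set(current)
        else:
            reviewed[path] = current - {tag}
    return reviewed
```

With the three files named in FIG. 2, a review that confirms Contract1.doc and Agreement2.doc and rejects Resume5.doc leaves ‘legal’ on the first two files and removes it from the third.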

Reference is now made to FIG. 3, which is a simplified pictorial illustration of an example of the operation of an automatic data tagging system constructed and operative in accordance with yet another preferred embodiment of the present invention. The automatic data tagging system of FIG. 3 is preferably suitable for operating in an enterprise computer network including multiple disparate clients, data elements, computer hardware resources and computer software resources.

The operation of the automatic data tagging system of FIG. 3 preferably includes characterizing data elements in an enterprise by ascertaining an owner for each of a plurality of data elements, and requiring the owner to apply metatags to ones of the plurality of data elements of which he is the owner.

The automatic data tagging system of FIG. 3 typically resides on a server 300 that is connected to an enterprise computer network 302 which preferably includes multiple disparate clients 304, servers 306 and data storage resources 308. Typically, data elements, such as computer files, reside on servers 306 and on data storage resources 308 and are accessible to users of the network 302 in accordance with access permissions defined by an owner of each data element or each data element folder. It is appreciated that the data elements may reside on any suitable data storage system or platform, such as a file system or a data collaboration system, which may reside on any suitable computer operating system or infrastructure.

Preferably, the system continuously maintains a database of actual access and access permissions of every user to every data element in the enterprise. This functionality is described in U.S. Pat. No. 7,606,801, in U.S. Published Patent Application 2009/0265780 and in U.S. patent application Ser. No. 12/673,691 owned by assignee, the disclosures of which are hereby incorporated by reference. Access permissions and/or actual access are together designated as access metrics and may be used to designate subsets of all of the data elements in the enterprise.

Preferably, the system also continuously crawls over at least a subset of all data elements in the enterprise and maintains a database of metadata associated with each of the subset of data elements.

As shown in FIG. 3, an IT Administrator of enterprise network 302 decides to request that all owners of a subset of files manually tag the files. The Administrator utilizes the system residing on server 300 to automatically ascertain the owners of the files and to send a request to each owner to tag their respectively owned files. As seen in FIG. 3, upon receiving the request, each file owner tags their respectively owned files.
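The owner-driven flow of FIG. 3 amounts to grouping files by owner, issuing one request per owner and recording the tags that come back. The following minimal sketch assumes ownership has already been ascertained and is available as a mapping from file name to owner. The function names and the rule that an owner may tag only files that he owns are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_tagging_requests(owners: Dict[str, str]) -> Dict[str, List[str]]:
    """Group files by owner: each entry is one request listing the files
    that owner is asked to tag."""
    by_owner = defaultdict(list)
    for path, owner in sorted(owners.items()):
        by_owner[owner].append(path)
    return dict(by_owner)


def record_owner_tags(owners: Dict[str, str], store: Dict[str, Set[str]],
                      owner: str, submitted: Dict[str, Set[str]]) -> None:
    """Record tags submitted by `owner` into `store`, rejecting the whole
    submission if it touches a file the owner does not own (assumed policy)."""
    for path in submitted:
        if owners.get(path) != owner:
            raise PermissionError(f"{owner} does not own {path}")
    for path, tags in submitted.items():
        store.setdefault(path, set()).update(tags)
```

Validating the whole submission before writing anything keeps the tag store unchanged when a submission is rejected.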

Reference is now made to FIG. 4, which is a simplified flowchart indicating steps in the operation of an automatic data tagging system constructed and operative in accordance with a preferred embodiment of the present invention. The automatic data tagging system of FIG. 4 is preferably suitable for operating in an enterprise computer network including multiple disparate clients, data elements, computer hardware resources and computer software resources.

The operation of the automatic data tagging system of FIG. 4 preferably includes characterizing data elements in an enterprise by ascertaining at least one of an access metric and a data identifier for each of a plurality of data elements, and employing the at least one of an access metric and a data identifier to automatically apply a metatag to ones of the plurality of data elements. The operation of the automatic data tagging system also preferably includes ascertaining an owner for each of the plurality of data elements, and requiring the owner to review and validate the metatags automatically applied to ones of the plurality of data elements of which he is the owner.

The automatic data tagging system of FIG. 4 typically resides on a server that is connected to an enterprise computer network which preferably includes multiple disparate clients, servers and data storage resources. Typically, data elements, such as computer files, reside on servers and on data storage resources and are accessible to users of the network in accordance with access permissions defined by an owner of each data element or each data element folder. It is appreciated that the data elements may reside on any suitable data storage system or platform, such as a file system or a data collaboration system, which may reside on any suitable computer operating system or infrastructure.

Preferably, the system continuously maintains a database of actual access and access permissions of every user to every data element in the enterprise. This functionality is described in U.S. Pat. No. 7,606,801, in U.S. Published Patent Application 2009/0265780 and in U.S. patent application Ser. No. 12/673,691 owned by assignee, the disclosures of which are hereby incorporated by reference. Access permissions and/or actual access are together designated as access metrics and may be used to designate subsets of all of the data elements in the enterprise.

Preferably, the system also continuously crawls over at least a subset of all data elements in the enterprise and maintains a database of metadata associated with each of the subset of data elements.

As shown in FIG. 4, the system preferably continuously maintains a database of access metrics which include actual access and access permissions of every user to every data element in the enterprise (400). The system also preferably continuously crawls over at least a subset of all data elements in the enterprise and maintains a database of metadata associated with each of the subset of data elements (402). Preferably, the system utilizes the database of stored access metrics and the database of metadata to automatically apply a metatag to each of the subset of data elements (404). Alternatively, the system may automatically apply the metatag assigned to the parent folder of each of the subset of data elements to the data element.

Additionally, the system preferably ascertains an owner for each of the subset of data elements (406), and requires the owner of each of the subset of data elements to review and validate the metatag automatically applied to the data element (408).
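
By way of non-limiting illustration only, steps 404 through 408 of FIG. 4 may be sketched as follows. The sketch assumes that the access metrics and metadata databases have already been populated and that the owner of each data element is already known. The names `DataElement`, `TagRule`, `apply_metatags` and `review_requests` are hypothetical and do not appear in the foregoing description, and the rule format shown is only one possible way to combine an access permission with a data identifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DataElement:
    path: str
    owner: str                   # assumed already ascertained (step 406)
    permissions: Set[str]        # access metric: groups permitted to access
    metadata: Dict[str, str]     # data identifiers, e.g. file type or author
    metatags: Set[str] = field(default_factory=set)


@dataclass
class TagRule:
    """Hypothetical rule: apply `metatag` when both conditions hold."""
    permitted_group: str
    identifier_key: str
    identifier_value: str
    metatag: str


def apply_metatags(elements: List[DataElement], rules: List[TagRule]) -> List[DataElement]:
    """Step 404: tag elements whose access metrics and metadata match a rule."""
    tagged = []
    for element in elements:
        matched = False
        for rule in rules:
            has_permission = rule.permitted_group in element.permissions
            has_identifier = element.metadata.get(rule.identifier_key) == rule.identifier_value
            if has_permission and has_identifier:
                element.metatags.add(rule.metatag)
                matched = True
        if matched:
            tagged.append(element)
    return tagged


def review_requests(tagged: List[DataElement]) -> Dict[str, List[str]]:
    """Steps 406-408: group tagged elements by owner for review and validation."""
    requests: Dict[str, List[str]] = {}
    for element in tagged:
        requests.setdefault(element.owner, []).append(element.path)
    return requests
```

In this sketch each owner receives a single list of the data elements awaiting review, rather than one request per data element.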

Reference is now made to FIG. 5, which is a simplified flowchart indicating steps in the operation of an automatic data tagging system constructed and operative in accordance with another preferred embodiment of the present invention. The automatic data tagging system of FIG. 5 is preferably suitable for operating in an enterprise computer network including multiple disparate clients, data elements, computer hardware resources and computer software resources.

The operation of the automatic data tagging system of FIG. 5 preferably includes characterizing data elements in an enterprise by ascertaining at least one of an access metric and a data identifier for each of a plurality of data elements, and employing the at least one of an access metric and a data identifier to automatically recommend application of metatags to the plurality of data elements. Preferably, the recommending of application of metatags to the plurality of data elements includes automatically recommending application of specific ones of a plurality of different metatags to specific ones of the plurality of data elements.

The automatic data tagging system of FIG. 5 typically resides on a server that is connected to an enterprise computer network which preferably includes multiple disparate clients, servers and data storage resources. Typically, data elements, such as computer files, reside on servers and on data storage resources and are accessible to users of the network in accordance with access permissions defined by an owner of each data element or each data element folder. It is appreciated that the data elements may reside on any suitable data storage system or platform, such as a file system or a data collaboration system, which may reside on any suitable computer operating system or infrastructure.

Preferably, the system continuously maintains a database of actual access and access permissions of every user to every data element in the enterprise. This functionality is described in U.S. Pat. No. 7,606,801, in U.S. Published Patent Application 2009/0265780 and in U.S. patent application Ser. No. 12/673,691 owned by assignee, the disclosures of which are hereby incorporated by reference. Access permissions and/or actual access are together designated as access metrics and may be used to designate subsets of all of the data elements in the enterprise.

Preferably, the system also continuously crawls over at least a subset of all data elements in the enterprise and maintains a database of metadata associated with each of the subset of data elements.

As shown in FIG. 5, the system preferably continuously maintains a database of access metrics which include actual access and access permissions of every user to every data element in the enterprise (500). The system also preferably continuously crawls over at least a subset of all data elements in the enterprise and maintains a database of metadata associated with each of the subset of data elements (502). Preferably, the system utilizes the database of stored access metrics and the database of metadata to recommend applying at least one metatag to each of the subset of data elements (504).

Additionally, the system preferably ascertains an owner for each of the subset of data elements (506), and requires the owner of each of the subset of data elements to review and approve the recommendation to apply the at least one metatag to the data element (508).
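
By way of non-limiting illustration only, the recommendation and approval steps 504 through 508 of FIG. 5 may be sketched as follows. Unlike the flow of FIG. 4, no metatag is applied until the owner approves the recommendation. The dictionary layout of each data element, the `Recommendation` class and the functions `recommend_metatags` and `approve` are assumptions made for the purpose of the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Recommendation:
    path: str
    metatag: str
    owner: str
    approved: bool = False


def recommend_metatags(elements: List[dict], accessing_user: str,
                       category: str, metatag: str) -> List[Recommendation]:
    """Step 504: recommend a metatag where actual access and metadata both match."""
    return [
        Recommendation(e["path"], metatag, e["owner"])
        for e in elements
        if accessing_user in e["accessed_by"]
        and e["metadata"].get("category") == category
    ]


def approve(rec: Recommendation, reviewer: str, applied: Dict[str, Set[str]]) -> None:
    """Step 508: only the owner may approve; approval applies the metatag."""
    if reviewer != rec.owner:
        raise PermissionError(f"{reviewer} does not own {rec.path}")
    rec.approved = True
    applied.setdefault(rec.path, set()).add(rec.metatag)
```

Keeping the applied metatags in a separate mapping makes it explicit that a pending recommendation has no effect on a data element until it has been approved.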

Reference is now made to FIG. 6, which is a simplified flowchart indicating steps in the operation of an automatic data tagging system constructed and operative in accordance with yet another preferred embodiment of the present invention. The automatic data tagging system of FIG. 6 is preferably suitable for operating in an enterprise computer network including multiple disparate clients, data elements, computer hardware resources and computer software resources.

The operation of the automatic data tagging system of FIG. 6 preferably includes characterizing data elements in an enterprise by ascertaining an owner for each of a plurality of data elements, and requiring the owner to apply metatags to ones of the plurality of data elements of which he is the owner.

The automatic data tagging system of FIG. 6 typically resides on a server that is connected to an enterprise computer network which preferably includes multiple disparate clients, servers and data storage resources. Typically, data elements, such as computer files, reside on servers and on data storage resources and are accessible to users of the network in accordance with access permissions defined by an owner of each data element or each data element folder. It is appreciated that the data elements may reside on any suitable data storage system or platform, such as a file system or a data collaboration system, which may reside on any suitable computer operating system or infrastructure.

Preferably, the system continuously maintains a database of actual access and access permissions of every user to every data element in the enterprise. This functionality is described in U.S. Pat. No. 7,606,801, in U.S. Published Patent Application 2009/0265780 and in U.S. patent application Ser. No. 12/673,691 owned by assignee, the disclosures of which are hereby incorporated by reference. Access permissions and/or actual access are together designated as access metrics and may be used to designate subsets of all of the data elements in the enterprise.

Preferably, the system also continuously crawls over at least a subset of all data elements in the enterprise and maintains a database of metadata associated with each of the subset of data elements.

As shown in FIG. 6, the system preferably continuously maintains a database of access metrics which include actual access and access permissions of every user to every data element in the enterprise (600). Preferably, the system utilizes the database of access metrics to ascertain an owner for each of the data elements (602), and requires the owner of each of the data elements to apply at least one metatag to each of the data elements of which he is the owner (604).
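
By way of non-limiting illustration only, steps 602 and 604 of FIG. 6 may be sketched as follows. The foregoing description does not specify how an owner is derived from the access metrics. The sketch therefore assumes, purely for the purpose of the example, that the owner of a data element is the user with the greatest number of recorded access events. The function names `ascertain_owners` and `tagging_requests` are likewise hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def ascertain_owners(access_events: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Step 602 (assumed heuristic): the most frequent accessor is treated as owner.

    `access_events` is an iterable of (user, path) pairs.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for user, path in access_events:
        counts[path][user] += 1
    return {path: users.most_common(1)[0][0] for path, users in counts.items()}


def tagging_requests(owners: Dict[str, str]) -> Dict[str, List[str]]:
    """Step 604: build, for each owner, the list of data elements to be tagged."""
    requests: Dict[str, List[str]] = defaultdict(list)
    for path, owner in sorted(owners.items()):
        requests[owner].append(path)
    return dict(requests)
```

Any other ownership criterion could be substituted for the heuristic in `ascertain_owners` without changing `tagging_requests`.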

Reference is now made to FIG. 7, which is a simplified block diagram illustration of the automatic data tagging system whose functionality is illustrated in FIGS. 1-6. The automatic data tagging system 700 typically resides on a server 702 that is connected to an enterprise computer network 704 which preferably includes multiple disparate clients 706, servers 708 and data storage resources 710. Typically, data elements, such as computer files, reside on servers 708 and on data storage resources 710 and are accessible to users of the network in accordance with access permissions defined by an owner of each data element or each data element folder. It is appreciated that the data elements may reside on any suitable data storage system or platform, such as a file system or a data collaboration system, which may reside on any suitable computer operating system or infrastructure.

As shown in FIG. 7, the automatic data tagging system 700 comprises access metrics collection functionality 720 and metadata collection functionality 722. As described hereinabove regarding FIGS. 1-6, access metrics collection functionality 720 preferably stores access metrics in an access metrics database 724 and metadata collection functionality 722 preferably stores data element metadata in metadata database 726.

Metatag functionality 730 is preferably provided to utilize databases 724 and 726 to automatically apply metatags to data elements residing anywhere on network 704, as described hereinabove with regard to FIGS. 1-6. Metatag functionality 730 preferably includes metatag application functionality 732, which is operative to apply metatags to data elements, and metatag recommendation functionality 734, which is operative to recommend application of metatags to data elements. Metatag functionality 730 also preferably includes metatag owner validation functionality 736, which is operative to ascertain owners of data elements and to require the owners of the data elements to assign metatags to data elements or to validate metatag assignments recommended by metatag recommendation functionality 734.
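
By way of non-limiting illustration only, the block structure of FIG. 7 may be sketched as the following composition of classes. The reference numerals in the comments correspond to FIG. 7. The class names, method names and in-memory dictionaries standing in for databases 724 and 726 are assumptions made for the purpose of the example, as is the `owners` mapping, which stands in for the owner ascertaining part of functionality 736.

```python
from typing import Dict, List, Set, Tuple


class AccessMetricsCollection:
    """Functionality 720; `database` stands in for access metrics database 724."""

    def __init__(self) -> None:
        self.database: Dict[str, Set[str]] = {}

    def record_permission(self, path: str, user: str) -> None:
        self.database.setdefault(path, set()).add(user)


class MetadataCollection:
    """Functionality 722; `database` stands in for metadata database 726."""

    def __init__(self) -> None:
        self.database: Dict[str, Dict[str, str]] = {}

    def crawl(self, path: str, metadata: Dict[str, str]) -> None:
        self.database[path] = dict(metadata)


class MetatagFunctionality:
    """Functionality 730, combining 732 (apply), 734 (recommend) and 736 (validate)."""

    def __init__(self, access: AccessMetricsCollection,
                 metadata: MetadataCollection, owners: Dict[str, str]) -> None:
        self.access = access
        self.metadata = metadata
        self.owners = owners                      # hypothetical path -> owner mapping
        self.applied: Dict[str, Set[str]] = {}

    def _matches(self, user: str, key: str, value: str) -> List[str]:
        # Elements that the user may access and that carry the given identifier.
        return sorted(
            path for path, users in self.access.database.items()
            if user in users
            and self.metadata.database.get(path, {}).get(key) == value
        )

    def apply(self, user: str, key: str, value: str, metatag: str) -> None:
        """732: apply the metatag to every matching data element."""
        for path in self._matches(user, key, value):
            self.applied.setdefault(path, set()).add(metatag)

    def recommend(self, user: str, key: str, value: str, metatag: str) -> List[Tuple[str, str]]:
        """734: return (path, metatag) recommendations without applying them."""
        return [(path, metatag) for path in self._matches(user, key, value)]

    def validation_requests(self) -> Dict[str, List[str]]:
        """736: group tagged data elements by owner for validation."""
        requests: Dict[str, List[str]] = {}
        for path in sorted(self.applied):
            requests.setdefault(self.owners[path], []).append(path)
        return requests
```

In this sketch the application and recommendation paths share a single matching routine, so that a recommendation and an automatic application based on the same criteria always select the same data elements.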

It will be appreciated by persons skilled in the art that the present invention is not limited by what has been particularly shown and described hereinabove. Rather, the invention also includes various combinations and subcombinations of the features described hereinabove as well as modifications and variations thereof, which would occur to persons skilled in the art upon reading the foregoing and which are not in the prior art.

Claims

1. A method for characterizing data elements, each of said data elements being accessible to users of an enterprise computer network in accordance with access permissions explicitly assigned by an assigned owner thereof, said method comprising:

continuously maintaining a database of said access permissions explicitly assigned by said assigned owner;
continuously maintaining a database of data identifiers associated with said plurality of data elements;
specifying, by an administrator, at least one access permission explicitly assigned by said assigned owner and at least one data identifier;
ascertaining which of said plurality of said data elements have both said at least one access permission explicitly assigned by said assigned owner and said at least one data identifier associated therewith;
specifying, by an administrator, administrator defined metatags to be associated with each of said data elements ascertained to have both said at least one access permission explicitly assigned by said assigned owner and said at least one data identifier associated therewith;
automatically applying a metatag from among said administrator defined metatags to ones of said plurality of data elements ascertained to have said at least one access permission explicitly assigned by said assigned owner and said at least one specific data identifier associated therewith;
ascertaining an assigned owner for each one of said plurality of data elements ascertained to have said at least one access permission explicitly assigned by said assigned owner and said at least one data identifier associated therewith, said assigned owner having authority and accountability with respect to said one data element within said enterprise; and
requiring said assigned owner to review and validate said metatags automatically applied to said ones of said plurality of data elements owned thereby.

2. The method for characterizing data elements in an enterprise according to claim 1 and wherein said automatically applying a metatag comprises automatically applying specific ones of a plurality of different metatags to specific ones of said plurality of data elements.

3. The method for characterizing data elements in an enterprise according to claim 1 and wherein said automatically applying a metatag comprises automatically applying to each one of said plurality of data elements a metatag previously applied to a parent folder thereof.

4. The method for characterizing data elements in an enterprise according to claim 1 and wherein said data identifier is one of file type, author, category and language.

5. The method for characterizing data elements in an enterprise according to claim 1 and wherein said automatically applying a metatag comprises automatically applying a metatag to ones of said plurality of data elements.

6. A system having a computer comprising a processor, a memory and a non-transitory, tangible computer-readable medium in which computer program instructions are stored, which instructions, when executed by said processor, cause the computer to characterize data elements, each of said data elements being accessible to users of an enterprise computer network in accordance with access permissions explicitly assigned by an assigned owner thereof, said system comprising:

access metrics collection functionality operative to collect said access permissions explicitly assigned by said assigned owner thereof;
metadata collection functionality operative to collect data identifiers associated with said plurality of data elements;
metatag specification functionality operative to facilitate specifying, by an administrator, administrator defined metatags to be associated with each of said data elements having both at least one access metric permission explicitly assigned by said assigned owner and at least one data identifier associated therewith;
metatag application functionality operative to utilize said access metrics collection functionality and said metadata collection functionality to ascertain which of said plurality of said data elements have both said at least one access permission explicitly assigned by said assigned owner and said at least one specific data identifier associated therewith and to automatically apply a metatag from among said administrator defined metatags to ones of said plurality of data elements ascertained to have said at least one access permission explicitly assigned by said assigned owner and said at least one data identifier associated therewith; and
metatag owner validation functionality operative to ascertain an assigned owner of each one of said plurality of data elements ascertained to have said at least one access permission explicitly assigned by said assigned owner and said at least one data identifier associated therewith, said assigned owner having authority and accountability with respect to said one data element within said enterprise, and to require each of said assigned owners to review and validate said metatags automatically applied to said ones of said plurality of data elements owned thereby.

7. The system according to claim 6 and wherein said metatag application functionality is also operative to automatically apply specific ones of a plurality of different metatags to specific ones of said plurality of data elements.

8. The system according to claim 6 and wherein said metatag application functionality is also operative to automatically apply to each one of said plurality of data elements a metatag previously applied to a parent folder thereof.

9. The system according to claim 6 and wherein said data identifier is one of file type, author, category and language.

10. The system according to claim 6 and also comprising a metadata database which stores said metadata collected by said metadata collection functionality.

11. The system according to claim 6 and wherein said metatag application functionality is also operative to utilize said access metrics collection functionality and said metadata collection functionality to automatically employ said access metric and said data identifier to automatically apply a metatag to ones of said plurality of data elements.

12. The system according to claim 6 and wherein said metatag application functionality is also operative to utilize said access metrics collection functionality to automatically employ said access metric to automatically apply a metatag to ones of said plurality of data elements.

13. The system according to claim 6 and wherein said metatag application functionality is also operative to utilize said metadata collection functionality to automatically employ said data identifier to automatically apply a metatag to ones of said plurality of data elements.

References Cited
U.S. Patent Documents
5465387 November 7, 1995 Mukherjee
5761669 June 2, 1998 Montague et al.
5889952 March 30, 1999 Hunnicutt et al.
5899991 May 4, 1999 Karch
6308173 October 23, 2001 Glasser et al.
6338082 January 8, 2002 Schneider
6393468 May 21, 2002 McGee
6772350 August 3, 2004 Belani et al.
6928439 August 9, 2005 Satoh
7007032 February 28, 2006 Chen et al.
7017183 March 21, 2006 Frey et al.
7031984 April 18, 2006 Kawamura et al.
7068592 June 27, 2006 Duvaut et al.
7401087 July 15, 2008 Copperman et al.
7403925 July 22, 2008 Schlesinger et al.
7421740 September 2, 2008 Fey et al.
7555482 June 30, 2009 Korkus
7568230 July 28, 2009 Lieberman et al.
7606801 October 20, 2009 Faitelson et al.
7716240 May 11, 2010 Lim
7720858 May 18, 2010 Dettinger et al.
7743420 June 22, 2010 Shulman et al.
7797335 September 14, 2010 Stern et al.
7797337 September 14, 2010 Fry
7801894 September 21, 2010 Bone
7844582 November 30, 2010 Arbilla
7882098 February 1, 2011 Prahlad et al.
7890530 February 15, 2011 Bilger et al.
8171050 May 1, 2012 O'Halloran et al.
8250048 August 21, 2012 Yalamanchi et al.
8285748 October 9, 2012 Thoms et al.
8306999 November 6, 2012 Gass et al.
8417678 April 9, 2013 Bone
8438124 May 7, 2013 Spivack et al.
8463815 June 11, 2013 Zoellner
8521766 August 27, 2013 Hoarty
8612404 December 17, 2013 Bone et al.
8626803 January 7, 2014 Hsu
8799225 August 5, 2014 Vaitzblit et al.
20020002557 January 3, 2002 Straube et al.
20030051026 March 13, 2003 Carter et al.
20030188198 October 2, 2003 Holdsworth et al.
20040030915 February 12, 2004 Sameshima et al.
20040186809 September 23, 2004 Schlesinger et al.
20040249847 December 9, 2004 Wang et al.
20040254919 December 16, 2004 Giuseppini
20050044399 February 24, 2005 Dorey
20050065823 March 24, 2005 Ramraj et al.
20050086268 April 21, 2005 Rogers
20050086529 April 21, 2005 Buchsbaum
20050108206 May 19, 2005 Lam et al.
20050120054 June 2, 2005 Shulman et al.
20050172126 August 4, 2005 Lange et al.
20050203881 September 15, 2005 Sakamoto et al.
20050246762 November 3, 2005 Girouard et al.
20050278334 December 15, 2005 Fey et al.
20050278785 December 15, 2005 Lieberman
20060064313 March 23, 2006 Steinbarth et al.
20060075503 April 6, 2006 Bunker, V et al.
20060090208 April 27, 2006 Smith
20060184459 August 17, 2006 Parida
20060184530 August 17, 2006 Song et al.
20060271523 November 30, 2006 Brookler et al.
20060277184 December 7, 2006 Faitelson et al.
20060294578 December 28, 2006 Burke et al.
20070033340 February 8, 2007 Tulskie et al.
20070050366 March 1, 2007 Bugir et al.
20070073698 March 29, 2007 Kanayama et al.
20070094265 April 26, 2007 Korkus
20070101387 May 3, 2007 Hua et al.
20070112743 May 17, 2007 Giampaolo et al.
20070156659 July 5, 2007 Lim
20070156693 July 5, 2007 Soin et al.
20070198608 August 23, 2007 Prahlad et al.
20070203872 August 30, 2007 Flinn et al.
20070214497 September 13, 2007 Montgomery et al.
20070244899 October 18, 2007 Faitelson et al.
20070261121 November 8, 2007 Jacobson
20070266006 November 15, 2007 Buss
20070276823 November 29, 2007 Borden et al.
20070282855 December 6, 2007 Chen et al.
20080031447 February 7, 2008 Geshwind et al.
20080034402 February 7, 2008 Botz et al.
20080162707 July 3, 2008 Beck et al.
20080172720 July 17, 2008 Botz et al.
20080201348 August 21, 2008 Edmonds
20080270462 October 30, 2008 Thomsen
20080271157 October 30, 2008 Faitelson et al.
20090037558 February 5, 2009 Stone et al.
20090077124 March 19, 2009 Spivack et al.
20090100058 April 16, 2009 Faitelson et al.
20090119298 May 7, 2009 Faitelson et al.
20090150981 June 11, 2009 Amies et al.
20090163183 June 25, 2009 O'Donoghue et al.
20090198892 August 6, 2009 Alvarez
20090249446 October 1, 2009 Jenkins et al.
20090265780 October 22, 2009 Korkus et al.
20090292930 November 26, 2009 Marano et al.
20090320088 December 24, 2009 Gill et al.
20100037324 February 11, 2010 Grant et al.
20100057815 March 4, 2010 Spivack et al.
20100070881 March 18, 2010 Hanson et al.
20100114977 May 6, 2010 Bacher
20100185650 July 22, 2010 Topatan et al.
20100299763 November 25, 2010 Marcus et al.
20110040793 February 17, 2011 Davidson
20110060916 March 10, 2011 Faitelson et al.
20110061093 March 10, 2011 Korkus et al.
20110061111 March 10, 2011 Faitelson et al.
20110184989 July 28, 2011 Faitelson et al.
20110219028 September 8, 2011 Dove
20110247074 October 6, 2011 Manring et al.
20110296490 December 1, 2011 Faitelson et al.
20120054283 March 1, 2012 Korkus et al.
20120173583 July 5, 2012 Faiteson
20120271853 October 25, 2012 Faitelson et al.
20120271855 October 25, 2012 Faitelson et al.
20120291100 November 15, 2012 Faitelson et al.
Foreign Patent Documents
1588889 March 2005 CN
101226537 July 2008 CN
1 248 178 October 2002 EP
2011/030324 March 2011 WO
2011/148364 December 2011 WO
2011/148375 December 2011 WO
2011/148376 December 2011 WO
2011/148377 December 2011 WO
2012/101620 August 2012 WO
2012/143920 October 2012 WO
Other references
  • U.S. Appl. No. 60/688,486, filed Jun. 7, 2005.
  • U.S. Appl. No. 12/673,691, filed Feb. 16, 2010.
  • Findutils; GNU Project-Free Software Foundation (FSF), 3 pages, Nov. 2006.
  • S.R. Kleiman; “Vnodes: An Architecture for Multiple File System Types in Sun UNIX”, USENIX Association, Summer Conference Proceeding, Atlanta 1986, 10 pages.
  • GENUNIX; Writing Filesystems VFS and Vnode Interfaces, 5 pages, Oct. 2007.
  • Sahadeb DE, et al; “Secure Access Control in a Multi-user Geodatabase”, available on the Internet at URL http://www10.giscafe.com 2005.
  • Sara C. Madeira; “Clustering, Fuzzy Clustering and Biclustering: An Overview”, pp. 31 to 53, Jun. 27, 2003.
  • Sara C. Madeira, et al; “Biclustering Algorithms for Biological Data Analysis: A Survey”, IEEE Transactions on Computational Biology and Bioinformatics, vol. 1, No. 1, Jan.-Mar. 2004, 22 pages; http://www.cs.princeton.edu/courses/archive/spr05/cos598E/bib/bicluster.pdf.
  • Federico Stagni; “On Usage Control for Data Grids: Models, Architectures, and Specifications”, Mar. 18, 2009; Thesis (PhD Thesis), 177 pages, [Retrieved on Oct. 15, 2011].
  • Tamas Suto; “Augmenting the Core Functionality of an e-Science Grid Multi-Tier Front-End: GridSphere-based Reengineering of EPIC”, 208 pages, Submitted to Imperial College London in partial fulfillment of the requirements for the degree of Master of Engineering, 2004 [retrieved on Oct. 15, 2011].
  • Edgar Weippl, et al; “Content-based Management of Document Access Control”, 14th International Conference on Applications of Prolog (INAP), 2001, 9 pages.
  • Alex Woodie; “Varonis Prevents Unauthorized Access to Unstructured Data”, 3 pages; Four Hundred Stuff, Published Jul. 31, 2007.
  • Varonis; A List of database tables in DatAdvantage 2.7, Feb. 6, 2007, 1 page.
  • Varonis, A List of database tables in DatAdvantage 3.0, Jun. 20, 2007.
  • Varonis; “The business Case for Data Governance”, dated Mar. 27, 2007, 8 pages.
  • Varonis; “Accelerating Audits with Automation: Understanding Who's Accessing Your Unstructured Data”, Oct. 8, 2007, 7 pages; Copyright 2007 by Varonis Systems.
  • Varonis; Entitlement Reviews: A Practitioner's Guide, 16 pages, Copyright 2007 by Varonis Systems.
  • Varonis; DatAdvantage User Guide, Version 1.0, Aug. 30, 2005, 71 pages.
  • Varonis; DatAdvantage User Guide, Version 2.0, Aug. 24, 2006, 118 pages.
  • Varonis; DatAdvantage User Guide, Version 2.5, Nov. 27, 2006, 124 pages.
  • Varonis; DatAdvantage User Guide, Version 2.6, Dec. 15, 2006, 127 pages.
  • Varonis; DatAdvantage User Guide, Version 2.7, Feb. 6, 2007, 131 pages.
  • Varonis; DatAdvantage User Guide, Version 3.0, Jun. 20, 2007, 153 pages.
  • German Office Action, dated Sep. 14, 2012, German Appln. No. 11 2006 001 378.5.
  • USPTO NFOA mailed Feb. 12, 2008 in connection with U.S. Appl. No. 11/258,256.
  • USPTO FOA mailed Aug. 1, 2008 in connection with U.S. Appl. No. 11/258,256.
  • USPTO NFOA mailed Oct. 31, 2008 in connection with U.S. Appl. No. 11/635,736.
  • USPTO NFOA mailed Dec. 14, 2010 in connection with U.S. Appl. No. 11/786,522.
  • USPTO NFOA mailed Jul. 9, 2010 in connection with U.S. Appl. No. 11/789,884.
  • USPTO FOA mailed Dec. 14, 2010 in connection with U.S. Appl. No. 11/789,884.
  • USPTO NOA mailed Apr. 12, 2012 in connection with U.S. Appl. No. 11/789,884.
  • USPTO NFOA dated Sep. 16, 2010 in connection with U.S. Appl. No. 11/871,028.
  • USPTO FOA dated Apr. 28, 2011 in connection with U.S. Appl. No. 11/871,028.
  • USPTO NFOA dated Jul. 10, 2012 in connection with U.S. Appl. No. 12/861,059.
  • USPTO FOA dated Dec. 24, 2012 in connection with U.S. Appl. No. 12/861,059.
  • USPTO NFOA dated Sep. 14, 2012 in connection with U.S. Appl. No. 12/861,967.
  • USPTO NFOA dated Jul. 11, 2012 in connection with U.S. Appl. No. 13/014,762.
  • USPTO RR dated Nov. 21, 2012 in connection with U.S. Appl. No. 13/106,023.
  • USPTO NFOA dated Jan. 15, 2013 in connection with U.S. Appl. No. 13/159,903.
  • USPTO NFOA dated Sep. 19, 2012 in connection with U.S. Appl. No. 13/303,826.
  • IPRP dated Nov. 27, 2012, PCT/IL2011/000076.
  • IPRP dated Nov. 27, 2012, PCT/IL2011/000407.
  • IPRP dated Nov. 27, 2012, PCT/IL2011/000409.
  • ISR dated May 23, 2011; PCT/IL11/00065.
  • ISR and Written Opinion dated May 20, 2010; PCT/IL10/00069.
  • ISR and Written Opinion dated Jun. 14, 2011 PCT/IL11/00066.
  • ISR and Written Opinion dated Jun. 13, 2011 PCT/IL11/00076.
  • ISR and Written Opinion dated May 24, 2011, PCT/IL11/00077.
  • ISR and Written Opinion dated Nov. 2, 2011; PCT/IL11/00407.
  • ISR and Written Opinion dated Nov. 15, 2011; PCT/IL11/00408.
  • ISR and Written Opinion dated Nov. 3, 2011; PCT/IL11/00409.
  • ISR and Written Opinion dated Apr. 13, 2012; PCT/IL11/00902.
  • ISR and Written Opinion dated Aug. 31, 2012; PCT/IL2012/000163.
  • USPTO FOA dated Jul. 2, 2013 in connection with U.S. Appl. No. 13/413,748.
  • First Chinese Office Action dated Mar. 4, 2015; Appln. No. 2011800361521.
  • An Office Action dated Nov. 18, 2014, which issued during the prosecution of U.S. Appl. No. 13/384,459.
  • Third Chinese Office Action dated Apr. 11, 2016; Appln. No. 2011800381521.
Patent History
Patent number: 10296596
Type: Grant
Filed: May 26, 2011
Date of Patent: May 21, 2019
Patent Publication Number: 20120191646
Assignee: VARONIS SYSTEMS, INC. (New York, NY)
Inventors: Yakov Faitelson (Elkana), Ohad Korkus (Herzilla), Ophir Kretzer-Katzir (Reut), David Bass (Carmei Yoseph)
Primary Examiner: Dangelino N Gortayo
Application Number: 13/384,465
Classifications
Current U.S. Class: Distributed Search And Retrieval (707/770)
International Classification: G06F 17/30 (20060101); G06F 16/907 (20190101); G06F 16/16 (20190101); G06F 16/9535 (20190101); G06F 16/36 (20190101); G06F 16/2457 (20190101); G06F 16/93 (20190101);