Patents by Inventor Roger C. Raphael
Roger C. Raphael has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11829424
Abstract: Discovering second-order documents and latent custodians in an e-discovery system is provided. A list of first-order documents and document custodians within a base state of the e-discovery system are identified based on a plurality of terms corresponding to a meet and confer practice for a legal matter instance. The plurality of terms is masked within the first-order documents. The first-order documents having the plurality of terms masked are divided into groups. A list of second-order documents is generated from a group of documents. A list of second-order document custodians is generated based on corresponding custodian relationships to second-order documents. Finally, each second-order document custodian in the list of second-order document custodians that has a corresponding rank exceeding a defined rank threshold level is identified as an official document custodian in the e-discovery system.
Type: Grant
Filed: February 20, 2020
Date of Patent: November 28, 2023
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Rajesh M. Desai, Nazrul Islam, Magesh Jayapandian, Jojo Joseph
-
Patent number: 11741258
Abstract: Dynamic data dissemination is provided. A resolved data subject identifier corresponding to a data subject is selected from a set of resolved data subject identifiers existing in rows of a data asset. In response to determining that the resolved data subject identifier does not correspond to a right to forget list, it is determined that the resolved data subject identifier corresponds to a data subject request list. The rows are transformed to anonymize existing pseudo and personal identifiers in cells of the rows that are tied to columns associated with data classes for which specific consent dimensions have been indicated as revoked by the data subject.
Type: Grant
Filed: April 16, 2021
Date of Patent: August 29, 2023
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Rajesh M. Desai, Scott Schumacher, Angineh Aghakiant
-
Patent number: 11663159
Abstract: A method, apparatus, system, and computer program code for policy-based enforcement in a data virtualization system is provided. Responsive to receiving a query, a computer identifies a virtual object among a set of connected objects that is represented by a set of data assets and their hierarchical relationships. The virtual object corresponds to a subset of the data assets. The computer identifies a subset of objects according to a cumulative transitive closure of the virtual object over the set of connected objects. The computer identifies a set of policies for the subset of objects. For each object in the subset of objects, the computer determines an intermediate decision according to set of policies, whereby a set of intermediate decisions is formed. The computer deterministically reconciles the set of intermediate decisions to generate a resolved decision. The computer provides access to the queried virtual objects based on the resolved decision.
Type: Grant
Filed: August 31, 2021
Date of Patent: May 30, 2023
Assignee: International Business Machines Corporation
Inventors: Maxim Neaga, Roger C. Raphael, Shantanu Sadanand Mundkur, Hebert Walter Pereyra, Yaxian Wang
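The flow in this abstract (take the transitive closure of a virtual object, collect one intermediate decision per object, then reconcile them deterministically) can be illustrated with a small sketch. Everything below is a hypothetical illustration rather than the patented method: the object names, the `EDGES` and `POLICIES` tables, and the "deny overrides allow" reconciliation rule are all assumptions chosen for brevity.

```python
from collections import deque

# Hypothetical hierarchy: each object maps to the objects it is connected to.
EDGES = {
    "virtual_view": ["table_a", "table_b"],
    "table_a": ["column_ssn"],
    "table_b": [],
    "column_ssn": [],
}

# Hypothetical policies: object name -> intermediate decision.
POLICIES = {"table_a": "allow", "column_ssn": "deny"}


def transitive_closure(start, edges):
    """Return every object reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def resolve(start, edges, policies, default="allow"):
    """Collect one decision per object in the closure, then reconcile them.

    Assumed reconciliation rule: any "deny" wins. The result does not depend
    on traversal order, which keeps the outcome deterministic.
    """
    closure = transitive_closure(start, edges)
    decisions = [policies.get(obj, default) for obj in sorted(closure)]
    return "deny" if "deny" in decisions else "allow"
```

With these sample tables, `resolve("virtual_view", EDGES, POLICIES)` is denied because `column_ssn` lies inside the closure, while `resolve("table_b", EDGES, POLICIES)` is allowed.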
-
Patent number: 11647004
Abstract: Preserving distributions of data values of a data asset in a data anonymization operation is provided. Anonymizing data values is performed by transforming sensitive data in a set of columns over rows of the data asset while preserving distribution of the data values in the set of transformed columns to a defined degree using a set of autoencoders and loss function. The autoencoders are base trained from preexisting data in a data assets catalog and actively trained during data dissemination. Parametric coefficients of the loss function are configured and the threshold is generated using policies from an enforcement decision for the data asset and data consumer. The loss function value of a selected row is compared to the threshold. Transformed data values of the selected row are transcribed to an output row when the loss function value is greater than the threshold and disseminated to the data consumer.
Type: Grant
Filed: March 24, 2021
Date of Patent: May 9, 2023
Assignee: International Business Machines Corporation
Inventors: Arjun Natarajan, Ashish Kundu, Roger C. Raphael, Aniya Aggarwal, Rajesh M. Desai, Joshua F. Payne, Mu Qiao
-
Publication number: 20230067938
Abstract: A method, apparatus, system, and computer program code for policy-based enforcement in a data virtualization system is provided. Responsive to receiving a query, a computer identifies a virtual object among a set of connected objects that is represented by a set of data assets and their hierarchical relationships. The virtual object corresponds to a subset of the data assets. The computer identifies a subset of objects according to a cumulative transitive closure of the virtual object over the set of connected objects. The computer identifies a set of policies for the subset of objects. For each object in the subset of objects, the computer determines an intermediate decision according to set of policies, whereby a set of intermediate decisions is formed. The computer deterministically reconciles the set of intermediate decisions to generate a resolved decision. The computer provides access to the queried virtual objects based on the resolved decision.
Type: Application
Filed: August 31, 2021
Publication date: March 2, 2023
Inventors: Maxim Neaga, Roger C. Raphael, Shantanu Sadanand Mundkur, Hebert Walter Pereyra, Yaxian Wang
-
Patent number: 11556514
Abstract: Provided is a method, computer program product, and system for automatically predicting unknown semantic data types in a rectangular dataset using a holistic knowledge of said dataset. A processor may receive one or more rectangular datasets, the one or more rectangular datasets comprising a plurality of columns having a set of known semantic data types. The processor may extract a set of features from the plurality of columns, where the set of features is used to determine a relationship among each column of the plurality of columns. The processor may construct a set of training data based on the extracted set of features. Using the training data, the processor may train a machine learning model to predict a semantic data type of a target column in a rectangular dataset having an unknown semantic data type.
Type: Grant
Filed: February 24, 2021
Date of Patent: January 17, 2023
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Mu Qiao, Scott Schumacher, Angineh Aghakiant
-
Publication number: 20220335156
Abstract: Dynamic data dissemination is provided. A resolved data subject identifier corresponding to a data subject is selected from a set of resolved data subject identifiers existing in rows of a data asset. In response to determining that the resolved data subject identifier does not correspond to a right to forget list, it is determined that the resolved data subject identifier corresponds to a data subject request list. The rows are transformed to anonymize existing pseudo and personal identifiers in cells of the rows that are tied to columns associated with data classes for which specific consent dimensions have been indicated as revoked by the data subject.
Type: Application
Filed: April 16, 2021
Publication date: October 20, 2022
Inventors: Roger C. Raphael, Rajesh M. Desai, Scott Schumacher, Angineh Aghakiant
-
Publication number: 20220311749
Abstract: Preserving distributions of data values of a data asset in a data anonymization operation is provided. Anonymizing data values is performed by transforming sensitive data in a set of columns over rows of the data asset while preserving distribution of the data values in the set of transformed columns to a defined degree using a set of autoencoders and loss function. The autoencoders are base trained from preexisting data in a data assets catalog and actively trained during data dissemination. Parametric coefficients of the loss function are configured and the threshold is generated using policies from an enforcement decision for the data asset and data consumer. The loss function value of a selected row is compared to the threshold. Transformed data values of the selected row are transcribed to an output row when the loss function value is greater than the threshold and disseminated to the data consumer.
Type: Application
Filed: March 24, 2021
Publication date: September 29, 2022
Inventors: Arjun Natarajan, Ashish Kundu, Roger C. Raphael, Aniya Aggarwal, Rajesh M. Desai, Joshua F. Payne, Mu Qiao
-
Publication number: 20220309155
Abstract: An apparatus and related method defend against adversarial queries. A policy enforcement hypergraph is constructed to express a set of security policies. Then, the hypergraph is repeatedly traversed to determine whether a user behavior is changing over time. The user behavior is measured by reference to a vertex or an edge in the hypergraph. If it is determined that the user behavior has changed over time an enforcement action is taken based on a security policy.
Type: Application
Filed: March 24, 2021
Publication date: September 29, 2022
Inventors: Joshua F. Payne, Ashish Kundu, Arjun Natarajan, Roger C. Raphael, Scott Schumacher
-
Publication number: 20220269663
Abstract: Provided is a method, computer program product, and system for automatically predicting unknown semantic data types in a rectangular dataset using a holistic knowledge of said dataset. A processor may receive one or more rectangular datasets, the one or more rectangular datasets comprising a plurality of columns having a set of known semantic data types. The processor may extract a set of features from the plurality of columns, where the set of features is used to determine a relationship among each column of the plurality of columns. The processor may construct a set of training data based on the extracted set of features. Using the training data, the processor may train a machine learning model to predict a semantic data type of a target column in a rectangular dataset having an unknown semantic data type.
Type: Application
Filed: February 24, 2021
Publication date: August 25, 2022
Inventors: Roger C. Raphael, Mu Qiao, Scott Schumacher, Angineh Aghakiant
-
Patent number: 11372831
Abstract: Processing a database query for sets of data includes assigning a unique identifier from an integer space to each entity within data and creating one or more sets of entities each pertaining to a corresponding entity within the data. A representation is then generated on disk for each set of entities, wherein each representation encompasses and is suited for a range of the unique identifiers of entities within a corresponding set and indicates a presence of an entity within that corresponding set. Finally, a query is processed based on the representation for each set of entities to retrieve data satisfying the query, wherein the representation provides a constant time for association and dissociation operations that are append-only operations with deferred merge and automatic filtering of deleted and duplicate entities at query time.
Type: Grant
Filed: July 29, 2019
Date of Patent: June 28, 2022
Assignee: International Business Machines Corporation
Inventors: Rajesh M. Desai, Magesh Jayapandian, Iun V. Leong, Justo L. Perez, Roger C. Raphael, Gabriel Valencia
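The general idea of append-only association and dissociation with a deferred merge can be sketched as follows. This is a simplified in-memory illustration under stated assumptions, not the on-disk representation the abstract describes: the class name `AppendOnlySet`, its methods, and the use of a Python `set` plus a list as the log are all hypothetical.

```python
class AppendOnlySet:
    """A set of integer entity ids with append-only updates and deferred merge."""

    def __init__(self):
        self.base = set()  # merged state
        self.log = []      # pending ("add" | "del", entity_id) records

    def associate(self, entity_id):
        # Constant-time append; nothing is rewritten in place.
        self.log.append(("add", entity_id))

    def dissociate(self, entity_id):
        # Deletion is also recorded as an append.
        self.log.append(("del", entity_id))

    def members(self):
        """Replay the log over the base at query time.

        Duplicate additions collapse and deleted ids are filtered out here,
        so the write path never has to check for them.
        """
        result = set(self.base)
        for op, entity_id in self.log:
            if op == "add":
                result.add(entity_id)
            else:
                result.discard(entity_id)
        return result

    def merge(self):
        """Deferred merge: fold the pending log into the base state."""
        self.base = self.members()
        self.log.clear()
```

In this sketch writes stay cheap because they only append, and the cost of reconciling duplicates and deletions is paid when the set is queried or merged.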
-
Patent number: 11362997
Abstract: A method, apparatus, system, and computer program product evaluate an information asset with a corpus of policies in conjunction with the context of access including a specific user. A large corresponding set of rules in the policy corpus are identified by computer system. A continuous process of rule evaluation occurs against information asset metadata wherein a series of processing including set of common subexpressions between the predicates of all active rules, pre-evaluation, compaction and storage are identified by the computer system in the policy and rule corpus. Metadata for the information asset is applied by the computer system to the set of common subexpressions to form partially evaluated rules for the policy. The partially evaluated rules henceforth compacted are stored by the computer system in association with the information asset.
Type: Grant
Filed: October 16, 2019
Date of Patent: June 14, 2022
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Rajesh M. Desai, Iun Veng Leong, Brian Joseph Owings
-
Patent number: 11347891
Abstract: Disclosed is a computer-implemented method to identify and anonymize personal information, the method comprising analyzing a first corpus with a personal information sniffer, wherein the first corpus includes unstructured text, wherein the personal information sniffer is configured to detect a set of types of personal information, and wherein the personal information sniffer produces a first set of results. The method comprises analyzing the first corpus with a set of annotators, wherein each annotator is configured to identify all instances of a type of personal information in the corpus, and wherein the set of annotators produces a second set of results. The method comprises comparing the first set of results and the second set of results, determining, the first set of results does not match the second set of results, and updating, based on the determining, the personal information sniffer.
Type: Grant
Filed: June 19, 2019
Date of Patent: May 31, 2022
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Rajesh M. Desai, Iun Veng Leong, Ramakanta Samal, Ansel Blume
-
Patent number: 11321479
Abstract: Enforcement of policies for tabular data access as a collection of columns over a plurality of different information assets is provided. In an enforcement knowledge graph, information asset-assigned terms are found that correspond to information assets in a virtual information asset that references a set of tabular data. Transitive closures of the information asset-assigned terms are found in a business glossary to form a table of business glossary terms. Term intersection is determined between a hash table of any column-assigned terms and the table of business glossary terms. The information assets are assigned to the virtual information asset when the term intersection is not empty. A set of policy rules associated with the set of tabular data and a context of a user making a data access request to the set of tabular data is applied to the virtual information asset to determine an access enforcement decision.
Type: Grant
Filed: December 6, 2019
Date of Patent: May 3, 2022
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Ety Khaitzin, Scott Schumacher, Arjun Natarajan
-
Patent number: 11308235
Abstract: A method, system and computer program product for detecting sensitive personal information in a storage device. A block delta list containing a list of changed blocks in the storage device is processed. After identifying the changed blocks from the block delta list, a search is performed on those identified changed blocks for sensitive personal information using a character scanning technique. After identifying a changed block deemed to contain sensitive personal information, the changed block is translated from the block level to the file level using a hierarchical reverse mapping technique. By only analyzing the changed blocks to determine if they contain sensitive personal information, a lesser quantity of blocks needs to be processed in order to detect sensitive personal information in the storage device in near real-time. In this manner, sensitive personal information is detected in the storage device using fewer computing resources in a shorter amount of time.
Type: Grant
Filed: March 6, 2020
Date of Patent: April 19, 2022
Assignee: International Business Machines Corporation
Inventors: Rajesh M. Desai, Mu Qiao, Roger C. Raphael, Ramani Routray
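A minimal sketch of the changed-block idea: scan only the blocks named in a delta list, then map any hits back to files. The block size, the sample data, the SSN-like regular expression standing in for the "character scanning technique", and the flat `block_owner` dictionary standing in for the hierarchical reverse mapping are all assumptions made for illustration.

```python
import re

BLOCK_SIZE = 16  # hypothetical block size, kept tiny for illustration

# Assumed stand-in for a sensitive-data scanner: a US SSN-like byte pattern.
SSN_PATTERN = re.compile(rb"\d{3}-\d{2}-\d{4}")


def scan_changed_blocks(device, delta_list):
    """Return the indices of changed blocks that contain a pattern match."""
    hits = []
    for idx in delta_list:
        block = device[idx * BLOCK_SIZE:(idx + 1) * BLOCK_SIZE]
        if SSN_PATTERN.search(block):
            hits.append(idx)
    return hits


def blocks_to_files(hits, block_owner):
    """Translate block indices to file paths using a flat block-to-file map."""
    return sorted({block_owner[i] for i in hits if i in block_owner})


# Three sample blocks, each padded to BLOCK_SIZE bytes.
blocks = [b"hello world", b"id 123-45-6789", b"555-12-3456 old"]
device = b"".join(b.ljust(BLOCK_SIZE, b".") for b in blocks)

delta_list = [0, 1]  # only blocks 0 and 1 changed since the last scan
block_owner = {0: "/a.txt", 1: "/b.txt", 2: "/c.txt"}

hits = scan_changed_blocks(device, delta_list)
flagged_files = blocks_to_files(hits, block_owner)
```

Block 2 also contains a matching string but is never read because it is not in the delta list, which is the source of the savings the abstract describes. A match that straddles a block boundary would be missed by this simple version.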
-
Patent number: 11283839
Abstract: Predicting access impact of a plurality of rule changes on a corpus of information assets is provided. A set of affected rules in a new rule space for controlling access to the corpus of information assets is received. The set of affected rules is shredded to identify right-hand side terms contained in predication blocks of the set of affected rules. An enforcement knowledge graph is traversed to identify a set of hot information assets having same terms as the right-hand side terms of the set of affected rules. The set of hot information assets having the same terms as the right-hand side terms of the set of affected rules is added to a hash table of hot information assets.
Type: Grant
Filed: December 6, 2019
Date of Patent: March 22, 2022
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Iun Veng Leong, Angineh Aghakiant, Immalla Grace Chen, Scott Schumacher
-
Patent number: 11250527
Abstract: Embodiments generally relate to providing litigation management for multiple remote content systems using asynchronous bi-directional replication pipelines. In some embodiments, a method includes retrieving, at one or more inbound replicators of one or more respective bi-directional pipelines, metadata associated with documents stored in one or more content repositories. The method further includes resolving, at a governance control hub, conflicts associated with legal holds on one or more of the documents based on the metadata. The method further includes sending conflict resolution results from one or more outbound applicators of the bi-directional pipelines to the content repositories, where the content repositories enforce legal holds on the documents.
Type: Grant
Filed: June 18, 2019
Date of Patent: February 15, 2022
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Ronald L. Rathgeber, Rajesh M. Desai, Gabriel Valencia, Justo Perez, William Russell Belknap, Sudhakar Basireddy
-
Patent number: 11210410
Abstract: Serving data assets based on security policies is provided. A request to access an asset received from a user having a particular context is evaluated based on a set of asset access enforcement policies. An asset access policy enforcement decision is generated based on evaluating the request. It is determined whether the asset access policy enforcement decision is to transform particular data of the asset prior to allowing access. In response to determining that the asset access policy enforcement decision is to transform the particular data of the asset prior to allowing access, a transformation specification that includes an ordered subset of unit transformations for transforming the particular data of the asset is generated based on the particular context of the user and the set of asset access enforcement policies. A transformed asset is generated by applying the transformation specification to the asset transforming the particular data of the asset.
Type: Grant
Filed: September 17, 2019
Date of Patent: December 28, 2021
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Hani Talal Jamjoom, Rajesh M. Desai, Iun Veng Leong, Uttama Shakya, Arjun Natarajan
-
Patent number: 11184402
Abstract: A method trains a neural network to recognize whether a resource is authorized to be returned to a requester. One or more processors train a neural network to traverse a policy enforcement hypergraph in order to identify a security policy to be used for a resource request and to authorize a use of a requested resource by a requester. The policy enforcement hypergraph is derived from a policy enforcement graph that expresses a set of security profiles for resources and requesters. The processor(s) receive a resource request for a requested resource from a requester, where the resource request includes a description of the requester. A system/user inputs a description of the received resource request and a description of the policy enforcement hypergraph into the trained neural network in order to selectively return the requested resource to the requester.
Type: Grant
Filed: March 25, 2020
Date of Patent: November 23, 2021
Assignee: International Business Machines Corporation
Inventors: Ashish Kundu, Joshua Payne, Arjun Natarajan, Roger C. Raphael, Scott Schumacher
-
Patent number: 11178186
Abstract: A method, apparatus, system, and computer program product for evaluating enforcement decisions on an asset using a policy. Rules in the policy are applied by a computer system to the asset taking into account a context for a request to access the asset in response receiving to the request to access the asset, and wherein the rules in the policy determine whether access to the asset is allowed. A determination is made by the computer system as to whether a conflict is present in an initial decision made using the rules in the policy. A set of conflict resolution processes are applied by the computer system when the conflict is present such that a final decision is made on the request to access the asset.
Type: Grant
Filed: March 19, 2020
Date of Patent: November 16, 2021
Assignee: International Business Machines Corporation
Inventors: Roger C. Raphael, Rajesh M. Desai, Ety Khaitzin, Shalu Agrawal, Angineh Aghakiant