Patents by Inventor Roger C. Raphael

Roger C. Raphael has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11829424
    Abstract: Discovering second-order documents and latent custodians in an e-discovery system is provided. A list of first-order documents and document custodians within a base state of the e-discovery system are identified based on a plurality of terms corresponding to a meet and confer practice for a legal matter instance. The plurality of terms is masked within the first-order documents. The first-order documents having the plurality of terms masked are divided into groups. A list of second-order documents is generated from a group of documents. A list of second-order document custodians is generated based on corresponding custodian relationships to second-order documents. Finally, each second-order document custodian in the list of second-order document custodians that has a corresponding rank exceeding a defined rank threshold level is identified as an official document custodian in the e-discovery system.
    Type: Grant
    Filed: February 20, 2020
    Date of Patent: November 28, 2023
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Nazrul Islam, Magesh Jayapandian, Jojo Joseph
  • Patent number: 11741258
    Abstract: Dynamic data dissemination is provided. A resolved data subject identifier corresponding to a data subject is selected from a set of resolved data subject identifiers existing in rows of a data asset. In response to determining that the resolved data subject identifier does not correspond to a right to forget list, it is determined that the resolved data subject identifier corresponds to a data subject request list. The rows are transformed to anonymize existing pseudo and personal identifiers in cells of the rows that are tied to columns associated with data classes for which specific consent dimensions have been indicated as revoked by the data subject.
    Type: Grant
    Filed: April 16, 2021
    Date of Patent: August 29, 2023
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Scott Schumacher, Angineh Aghakiant
  • Patent number: 11663159
    Abstract: A method, apparatus, system, and computer program code for policy-based enforcement in a data virtualization system is provided. Responsive to receiving a query, a computer identifies a virtual object among a set of connected objects that is represented by a set of data assets and their hierarchical relationships. The virtual object corresponds to a subset of the data assets. The computer identifies a subset of objects according to a cumulative transitive closure of the virtual object over the set of connected objects. The computer identifies a set of policies for the subset of objects. For each object in the subset of objects, the computer determines an intermediate decision according to the set of policies, whereby a set of intermediate decisions is formed. The computer deterministically reconciles the set of intermediate decisions to generate a resolved decision. The computer provides access to the queried virtual objects based on the resolved decision.
    Type: Grant
    Filed: August 31, 2021
    Date of Patent: May 30, 2023
    Assignee: International Business Machines Corporation
    Inventors: Maxim Neaga, Roger C. Raphael, Shantanu Sadanand Mundkur, Hebert Walter Pereyra, Yaxian Wang
  • Patent number: 11647004
    Abstract: Preserving distributions of data values of a data asset in a data anonymization operation is provided. Anonymizing data values is performed by transforming sensitive data in a set of columns over rows of the data asset while preserving distribution of the data values in the set of transformed columns to a defined degree using a set of autoencoders and loss function. The autoencoders are base trained from preexisting data in a data assets catalog and actively trained during data dissemination. Parametric coefficients of the loss function are configured and the threshold is generated using policies from an enforcement decision for the data asset and data consumer. The loss function value of a selected row is compared to the threshold. Transformed data values of the selected row are transcribed to an output row when the loss function value is greater than the threshold and disseminated to the data consumer.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Arjun Natarajan, Ashish Kundu, Roger C. Raphael, Aniya Aggarwal, Rajesh M. Desai, Joshua F. Payne, Mu Qiao
  • Publication number: 20230067938
    Abstract: A method, apparatus, system, and computer program code for policy-based enforcement in a data virtualization system is provided. Responsive to receiving a query, a computer identifies a virtual object among a set of connected objects that is represented by a set of data assets and their hierarchical relationships. The virtual object corresponds to a subset of the data assets. The computer identifies a subset of objects according to a cumulative transitive closure of the virtual object over the set of connected objects. The computer identifies a set of policies for the subset of objects. For each object in the subset of objects, the computer determines an intermediate decision according to the set of policies, whereby a set of intermediate decisions is formed. The computer deterministically reconciles the set of intermediate decisions to generate a resolved decision. The computer provides access to the queried virtual objects based on the resolved decision.
    Type: Application
    Filed: August 31, 2021
    Publication date: March 2, 2023
    Inventors: Maxim Neaga, Roger C. Raphael, Shantanu Sadanand Mundkur, Hebert Walter Pereyra, Yaxian Wang
  • Patent number: 11556514
    Abstract: Provided is a method, computer program product, and system for automatically predicting unknown semantic data types in a rectangular dataset using a holistic knowledge of said dataset. A processor may receive one or more rectangular datasets, the one or more rectangular datasets comprising a plurality of columns having a set of known semantic data types. The processor may extract a set of features from the plurality of columns, where the set of features is used to determine a relationship among each column of the plurality of columns. The processor may construct a set of training data based on the extracted set of features. Using the training data, the processor may train a machine learning model to predict a semantic data type of a target column in a rectangular dataset having an unknown semantic data type.
    Type: Grant
    Filed: February 24, 2021
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Mu Qiao, Scott Schumacher, Angineh Aghakiant
  • Publication number: 20220335156
    Abstract: Dynamic data dissemination is provided. A resolved data subject identifier corresponding to a data subject is selected from a set of resolved data subject identifiers existing in rows of a data asset. In response to determining that the resolved data subject identifier does not correspond to a right to forget list, it is determined that the resolved data subject identifier corresponds to a data subject request list. The rows are transformed to anonymize existing pseudo and personal identifiers in cells of the rows that are tied to columns associated with data classes for which specific consent dimensions have been indicated as revoked by the data subject.
    Type: Application
    Filed: April 16, 2021
    Publication date: October 20, 2022
    Inventors: Roger C. Raphael, Rajesh M. Desai, Scott Schumacher, Angineh Aghakiant
  • Publication number: 20220311749
    Abstract: Preserving distributions of data values of a data asset in a data anonymization operation is provided. Anonymizing data values is performed by transforming sensitive data in a set of columns over rows of the data asset while preserving distribution of the data values in the set of transformed columns to a defined degree using a set of autoencoders and loss function. The autoencoders are base trained from preexisting data in a data assets catalog and actively trained during data dissemination. Parametric coefficients of the loss function are configured and the threshold is generated using policies from an enforcement decision for the data asset and data consumer. The loss function value of a selected row is compared to the threshold. Transformed data values of the selected row are transcribed to an output row when the loss function value is greater than the threshold and disseminated to the data consumer.
    Type: Application
    Filed: March 24, 2021
    Publication date: September 29, 2022
    Inventors: Arjun Natarajan, Ashish Kundu, Roger C. Raphael, Aniya Aggarwal, Rajesh M. Desai, Joshua F. Payne, Mu Qiao
  • Publication number: 20220309155
    Abstract: An apparatus and related method defend against adversarial queries. A policy enforcement hypergraph is constructed to express a set of security policies. Then, the hypergraph is repeatedly traversed to determine whether a user behavior is changing over time. The user behavior is measured by reference to a vertex or an edge in the hypergraph. If it is determined that the user behavior has changed over time, an enforcement action is taken based on a security policy.
    Type: Application
    Filed: March 24, 2021
    Publication date: September 29, 2022
    Inventors: Joshua F. Payne, Ashish Kundu, Arjun Natarajan, Roger C. Raphael, Scott Schumacher
  • Publication number: 20220269663
    Abstract: Provided is a method, computer program product, and system for automatically predicting unknown semantic data types in a rectangular dataset using a holistic knowledge of said dataset. A processor may receive one or more rectangular datasets, the one or more rectangular datasets comprising a plurality of columns having a set of known semantic data types. The processor may extract a set of features from the plurality of columns, where the set of features is used to determine a relationship among each column of the plurality of columns. The processor may construct a set of training data based on the extracted set of features. Using the training data, the processor may train a machine learning model to predict a semantic data type of a target column in a rectangular dataset having an unknown semantic data type.
    Type: Application
    Filed: February 24, 2021
    Publication date: August 25, 2022
    Inventors: Roger C. Raphael, Mu Qiao, Scott Schumacher, Angineh Aghakiant
  • Patent number: 11372831
    Abstract: Processing a database query for sets of data includes assigning a unique identifier from an integer space to each entity within data and creating one or more sets of entities each pertaining to a corresponding entity within the data. A representation is then generated on disk for each set of entities, wherein each representation encompasses and is suited for a range of the unique identifiers of entities within a corresponding set and indicates a presence of an entity within that corresponding set. Finally, a query is processed based on the representation for each set of entities to retrieve data satisfying the query, wherein the representation provides a constant time for association and dissociation operations that are append-only operations with deferred merge and automatic filtering of deleted and duplicate entities at query time.
    Type: Grant
    Filed: July 29, 2019
    Date of Patent: June 28, 2022
    Assignee: International Business Machines Corporation
    Inventors: Rajesh M. Desai, Magesh Jayapandian, Iun V. Leong, Justo L. Perez, Roger C. Raphael, Gabriel Valencia
  • Patent number: 11362997
    Abstract: A method, apparatus, system, and computer program product evaluate an information asset with a corpus of policies in conjunction with the context of access including a specific user. A large corresponding set of rules in the policy corpus is identified by a computer system. A continuous process of rule evaluation occurs against information asset metadata wherein a series of processing including a set of common subexpressions between the predicates of all active rules, pre-evaluation, compaction and storage are identified by the computer system in the policy and rule corpus. Metadata for the information asset is applied by the computer system to the set of common subexpressions to form partially evaluated rules for the policy. The partially evaluated rules henceforth compacted are stored by the computer system in association with the information asset.
    Type: Grant
    Filed: October 16, 2019
    Date of Patent: June 14, 2022
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Iun Veng Leong, Brian Joseph Owings
  • Patent number: 11347891
    Abstract: Disclosed is a computer-implemented method to identify and anonymize personal information, the method comprising analyzing a first corpus with a personal information sniffer, wherein the first corpus includes unstructured text, wherein the personal information sniffer is configured to detect a set of types of personal information, and wherein the personal information sniffer produces a first set of results. The method comprises analyzing the first corpus with a set of annotators, wherein each annotator is configured to identify all instances of a type of personal information in the corpus, and wherein the set of annotators produces a second set of results. The method comprises comparing the first set of results and the second set of results, determining, the first set of results does not match the second set of results, and updating, based on the determining, the personal information sniffer.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: May 31, 2022
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Iun Veng Leong, Ramakanta Samal, Ansel Blume
  • Patent number: 11321479
    Abstract: Enforcement of policies for tabular data access as a collection of columns over a plurality of different information assets is provided. In an enforcement knowledge graph, information asset-assigned terms are found that correspond to information assets in a virtual information asset that references a set of tabular data. Transitive closures of the information asset-assigned terms are found in a business glossary to form a table of business glossary terms. Term intersection is determined between a hash table of any column-assigned terms and the table of business glossary terms. The information assets are assigned to the virtual information asset when the term intersection is not empty. A set of policy rules associated with the set of tabular data and a context of a user making a data access request to the set of tabular data is applied to the virtual information asset to determine an access enforcement decision.
    Type: Grant
    Filed: December 6, 2019
    Date of Patent: May 3, 2022
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Ety Khaitzin, Scott Schumacher, Arjun Natarajan
  • Patent number: 11308235
    Abstract: A method, system and computer program product for detecting sensitive personal information in a storage device. A block delta list containing a list of changed blocks in the storage device is processed. After identifying the changed blocks from the block delta list, a search is performed on those identified changed blocks for sensitive personal information using a character scanning technique. After identifying a changed block deemed to contain sensitive personal information, the changed block is translated from the block level to the file level using a hierarchical reverse mapping technique. By only analyzing the changed blocks to determine if they contain sensitive personal information, a lesser quantity of blocks needs to be processed in order to detect sensitive personal information in the storage device in near real-time. In this manner, sensitive personal information is detected in the storage device using fewer computing resources in a shorter amount of time.
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: April 19, 2022
    Assignee: International Business Machines Corporation
    Inventors: Rajesh M. Desai, Mu Qiao, Roger C. Raphael, Ramani Routray
  • Patent number: 11283839
    Abstract: Predicting access impact of a plurality of rule changes on a corpus of information assets is provided. A set of affected rules in a new rule space for controlling access to the corpus of information assets is received. The set of affected rules is shredded to identify right-hand side terms contained in predication blocks of the set of affected rules. An enforcement knowledge graph is traversed to identify a set of hot information assets having same terms as the right-hand side terms of the set of affected rules. The set of hot information assets having the same terms as the right-hand side terms of the set of affected rules is added to a hash table of hot information assets.
    Type: Grant
    Filed: December 6, 2019
    Date of Patent: March 22, 2022
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Iun Veng Leong, Angineh Aghakiant, Immalla Grace Chen, Scott Schumacher
  • Patent number: 11250527
    Abstract: Embodiments generally relate to providing litigation management for multiple remote content systems using asynchronous bi-directional replication pipelines. In some embodiments, a method includes retrieving, at one or more inbound replicators of one or more respective bi-directional pipelines, metadata associated with documents stored in one or more content repositories. The method further includes resolving, at a governance control hub, conflicts associated with legal holds on one or more of the documents based on the metadata. The method further includes sending conflict resolution results from one or more outbound applicators of the bi-directional pipelines to the content repositories, where the content repositories enforce legal holds on the documents.
    Type: Grant
    Filed: June 18, 2019
    Date of Patent: February 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Ronald L. Rathgeber, Rajesh M. Desai, Gabriel Valencia, Justo Perez, William Russell Belknap, Sudhakar Basireddy
  • Patent number: 11210410
    Abstract: Serving data assets based on security policies is provided. A request to access an asset received from a user having a particular context is evaluated based on a set of asset access enforcement policies. An asset access policy enforcement decision is generated based on evaluating the request. It is determined whether the asset access policy enforcement decision is to transform particular data of the asset prior to allowing access. In response to determining that the asset access policy enforcement decision is to transform the particular data of the asset prior to allowing access, a transformation specification that includes an ordered subset of unit transformations for transforming the particular data of the asset is generated based on the particular context of the user and the set of asset access enforcement policies. A transformed asset is generated by applying the transformation specification to the asset transforming the particular data of the asset.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: December 28, 2021
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Hani Talal Jamjoom, Rajesh M. Desai, Iun Veng Leong, Uttama Shakya, Arjun Natarajan
  • Patent number: 11184402
    Abstract: A method trains a neural network to recognize whether a resource is authorized to be returned to a requester. One or more processors train a neural network to traverse a policy enforcement hypergraph in order to identify a security policy to be used for a resource request and to authorize a use of a requested resource by a requester. The policy enforcement hypergraph is derived from a policy enforcement graph that expresses a set of security profiles for resources and requesters. The processor(s) receive a resource request for a requested resource from a requester, where the resource request includes a description of the requester. A system/user inputs a description of the received resource request and a description of the policy enforcement hypergraph into the trained neural network in order to selectively return the requested resource to the requester.
    Type: Grant
    Filed: March 25, 2020
    Date of Patent: November 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Ashish Kundu, Joshua Payne, Arjun Natarajan, Roger C. Raphael, Scott Schumacher
  • Patent number: 11178186
    Abstract: A method, apparatus, system, and computer program product for evaluating enforcement decisions on an asset using a policy are provided. Rules in the policy are applied by a computer system to the asset, taking into account a context for a request to access the asset, in response to receiving the request to access the asset, wherein the rules in the policy determine whether access to the asset is allowed. A determination is made by the computer system as to whether a conflict is present in an initial decision made using the rules in the policy. A set of conflict resolution processes are applied by the computer system when the conflict is present such that a final decision is made on the request to access the asset.
    Type: Grant
    Filed: March 19, 2020
    Date of Patent: November 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Ety Khaitzin, Shalu Agrawal, Angineh Aghakiant
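
Illustrative note on patent number 11178186: the abstract outlines three steps (apply the policy rules to an asset in the context of a request, check whether the initial decision contains a conflict, and apply conflict resolution processes to reach a final decision). The Python sketch below is only one possible reading of that flow and is not taken from the patent. The names `Rule`, `evaluate`, and `resolve_deny_overrides`, the dictionary-based asset and context, the "deny overrides" resolver, and the default of denying when no rule applies are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence

ALLOW, DENY = "allow", "deny"


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a predicate over (asset, context) plus a decision."""
    name: str
    applies: Callable[[Dict[str, str], Dict[str, str]], bool]
    decision: str


def initial_decisions(rules: Sequence[Rule], asset: Dict[str, str],
                      context: Dict[str, str]) -> List[str]:
    """Collect the decision of every rule whose predicate matches."""
    return [r.decision for r in rules if r.applies(asset, context)]


def has_conflict(decisions: Sequence[str]) -> bool:
    """A conflict is present when matching rules disagree."""
    return len(set(decisions)) > 1


def resolve_deny_overrides(decisions: Sequence[str]) -> Optional[str]:
    """Assumed resolver: any DENY among the decisions wins."""
    return DENY if DENY in decisions else None


def evaluate(rules: Sequence[Rule], asset: Dict[str, str],
             context: Dict[str, str],
             resolvers: Sequence[Callable[[Sequence[str]], Optional[str]]] = (
                 resolve_deny_overrides,),
             default: str = DENY) -> str:
    """Return a final decision for one access request."""
    decisions = initial_decisions(rules, asset, context)
    if not decisions:
        return default            # assumption: deny when no rule applies
    if not has_conflict(decisions):
        return decisions[0]       # all matching rules agree
    for resolve in resolvers:     # try each resolution process in order
        final = resolve(decisions)
        if final is not None:
            return final
    return default


# Hypothetical rules: analysts may read assets, but PII assets are denied.
rules = [
    Rule("analysts-allowed", lambda a, c: c.get("role") == "analyst", ALLOW),
    Rule("pii-denied", lambda a, c: a.get("class") == "pii", DENY),
]

pii_result = evaluate(rules, {"class": "pii"}, {"role": "analyst"})
public_result = evaluate(rules, {"class": "public"}, {"role": "analyst"})
no_rule_result = evaluate(rules, {"class": "public"}, {"role": "guest"})
```

In this sketch the analyst's request for the PII asset matches both rules, so the assumed resolver settles the conflict as a denial, while the request for the public asset matches only the allow rule. Other resolution strategies (for example, most-specific-rule-wins) could be passed in through `resolvers`.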