Patents by Inventor Roger C. Raphael

Roger C. Raphael has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11157477
    Abstract: A method, computer system, and computer program product for segment differential-based document text-index modeling are provided. The embodiment may include receiving, by a processor, a document with a valid document ID and version ID tuple. The embodiment may also include determining the received document is a new version of a previously stored document and consequently multiplexing versions of the document into a single indexed document. The embodiment may further include segmenting the received document and building a token vector. The embodiment may also include calculating a difference between the received new version of the document and the previously stored document using information obtained from the segmentation. The embodiment may further include in response to the calculated difference being below a pre-configured threshold value, discarding the received new version.
    Type: Grant
    Filed: November 28, 2018
    Date of Patent: October 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Fumihiko Terui, Justo L. Perez, Thomas Hampp
  • Patent number: 11151132
    Abstract: Provided are a computer program product, system, and method for distributed processing of a query with distributed posting lists. A dispatch map has entries, wherein each entry identifies one of a plurality of terms in a dictionary, wherein for each of the terms there is a posting list identifying zero or more objects including the term, wherein at least one of the dispatch map entries indicates at least one distributed processing element including the posting list for the term. The dispatch map is used to dispatch sub-expressions comprising portions of a query to distributed processing elements having the posting lists for terms in the sub-expressions, wherein the distributed processing elements receiving the sub-expressions execute the sub-expressions on the posting lists for the terms in the sub-expressions.
    Type: Grant
    Filed: June 13, 2019
    Date of Patent: October 19, 2021
    Assignee: International Business Machines Corporation
    Inventors: Rajesh M. Desai, Alon S. Housfater, Roger C. Raphael, Paul S. Taylor
  • Publication number: 20210306377
    Abstract: A method trains a neural network to recognize whether a resource is authorized to be returned to a requester. One or more processors train a neural network to traverse a policy enforcement hypergraph in order to identify a security policy to be used for a resource request and to authorize a use of a requested resource by a requester. The policy enforcement hypergraph is derived from a policy enforcement graph that expresses a set of security profiles for resources and requesters. The processor(s) receive a resource request for a requested resource from a requester, where the resource request includes a description of the requester. A system/user inputs a description of the received resource request and a description of the policy enforcement hypergraph into the trained neural network in order to selectively return the requested resource to the requester.
    Type: Application
    Filed: March 25, 2020
    Publication date: September 30, 2021
    Inventors: Ashish Kundu, Joshua Payne, Arjun Natarajan, Roger C. Raphael, Scott Schumacher
  • Patent number: 11132755
    Abstract: Provided are techniques for extracting, deriving, and using legal matter semantics to generate e-discovery queries in an e-discovery system. A semantic knowledge graph is iteratively built by receiving meet and confer document instances, legal matter types, historical e-discovery queries for different legal matters, and legal semantic types extracted from the historical e-discovery queries. The legal semantic types are added to the semantic knowledge graph, and a list of terms that serve as a basis of an initial query are identified. An e-discovery query is generated for an e-discovery system. The e-discovery query is modified using the semantic knowledge graph and additional input by receiving a legal matter type and meet and confer information, obtaining the legal semantic types that are relevant to the legal matter type and the meet and confer information, and modifying the e-discovery query. The modified e-discovery query is provided. Then, the modified e-discovery query is executed.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: September 28, 2021
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Nazrul Islam, Satwik Hebbar
  • Publication number: 20210297451
    Abstract: A method, apparatus, system, and computer program product for evaluating enforcement decisions on an asset using a policy. Rules in the policy are applied by a computer system to the asset, taking into account a context for a request to access the asset, in response to receiving the request to access the asset, wherein the rules in the policy determine whether access to the asset is allowed. A determination is made by the computer system as to whether a conflict is present in an initial decision made using the rules in the policy. A set of conflict resolution processes is applied by the computer system when the conflict is present such that a final decision is made on the request to access the asset.
    Type: Application
    Filed: March 19, 2020
    Publication date: September 23, 2021
    Inventors: Roger C. Raphael, Rajesh M. Desai, Ety Khaitzin, Shalu Agrawal, Angineh Aghakiant
  • Publication number: 20210263977
    Abstract: Discovering second-order documents and latent custodians in an e-discovery system is provided. A list of first-order documents and document custodians within a base state of the e-discovery system are identified based on a plurality of terms corresponding to a meet and confer practice for a legal matter instance. The plurality of terms is masked within the first-order documents. The first-order documents having the plurality of terms masked are divided into groups. A list of second-order documents is generated from a group of documents. A list of second-order document custodians is generated based on corresponding custodian relationships to second-order documents. Finally, each second-order document custodian in the list of second-order document custodians that has a corresponding rank exceeding a defined rank threshold level is identified as an official document custodian in the e-discovery system.
    Type: Application
    Filed: February 20, 2020
    Publication date: August 26, 2021
    Inventors: Roger C. Raphael, Rajesh M. Desai, Nazrul Islam, Magesh Jayapandian, Jojo Joseph
  • Patent number: 11100598
    Abstract: Embodiments generally relate to providing litigation management for multiple remote content systems using asynchronous bi-directional replication pipelines. In some embodiments, a method includes retrieving, at one or more inbound replicators of one or more respective bi-directional pipelines, metadata associated with documents stored in one or more content repositories. The method further includes resolving, at a governance control hub, conflicts associated with legal holds on one or more of the documents based on the metadata. The method further includes sending conflict resolution results from one or more outbound applicators of the bi-directional pipelines to the content repositories, where the content repositories enforce legal holds on the documents.
    Type: Grant
    Filed: January 23, 2018
    Date of Patent: August 24, 2021
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Ronald L. Rathgeber, Rajesh M. Desai, Gabriel Valencia, Justo Perez, William Russell Belknap, Sudhakar Basireddy
  • Publication number: 20210176279
    Abstract: Predicting access impact of a plurality of rule changes on a corpus of information assets is provided. A set of affected rules in a new rule space for controlling access to the corpus of information assets is received. The set of affected rules is shredded to identify right-hand side terms contained in predication blocks of the set of affected rules. An enforcement knowledge graph is traversed to identify a set of hot information assets having same terms as the right-hand side terms of the set of affected rules. The set of hot information assets having the same terms as the right-hand side terms of the set of affected rules is added to a hash table of hot information assets.
    Type: Application
    Filed: December 6, 2019
    Publication date: June 10, 2021
    Inventors: Roger C. Raphael, Iun Veng Leong, Angineh Aghakiant, Immalla Grace Chen, Scott Schumacher
  • Publication number: 20210173952
    Abstract: Enforcement of policies for tabular data access as a collection of columns over a plurality of different information assets is provided. In an enforcement knowledge graph, information asset-assigned terms are found that correspond to information assets in a virtual information asset that references a set of tabular data. Transitive closures of the information asset-assigned terms are found in a business glossary to form a table of business glossary terms. Term intersection is determined between a hash table of any column-assigned terms and the table of business glossary terms. The information assets are assigned to the virtual information asset when the term intersection is not empty. A set of policy rules associated with the set of tabular data and a context of a user making a data access request to the set of tabular data is applied to the virtual information asset to determine an access enforcement decision.
    Type: Application
    Filed: December 6, 2019
    Publication date: June 10, 2021
    Inventors: Roger C. Raphael, Ety Khaitzin, Scott Schumacher, Arjun Natarajan
  • Publication number: 20210119970
    Abstract: A method, apparatus, system, and computer program product evaluate an information asset with a corpus of policies in conjunction with the context of access including a specific user. A large corresponding set of rules in the policy corpus is identified by the computer system. A continuous process of rule evaluation occurs against information asset metadata wherein a series of processing including a set of common subexpressions between the predicates of all active rules, pre-evaluation, compaction and storage are identified by the computer system in the policy and rule corpus. Metadata for the information asset is applied by the computer system to the set of common subexpressions to form partially evaluated rules for the policy. The partially evaluated rules, henceforth compacted, are stored by the computer system in association with the information asset.
    Type: Application
    Filed: October 16, 2019
    Publication date: April 22, 2021
    Inventors: Roger C. Raphael, Rajesh M. Desai, Iun Veng Leong, Brian Joseph Owings
  • Patent number: 10984316
    Abstract: A method loads training samples and forms a training data set from the training samples. The method uses a bidirectional LSTM recurrent neural network that includes one or more input cells and one or more output cells and trains it with the training data set. The method determines sensitive information and confidence values based on analyzing a text with the trained neural network. The method selects predicted samples from the text, where the sensitive information confidence value corresponding to one or more predicted samples is above a threshold value, based on determining that a sensitive information accuracy has improved. The method forms a new training data set, where the new training data set comprises the samples and the verified one or more predicted samples based on the verified one or more predicted samples, and trains the previously trained neural network with the new training data set.
    Type: Grant
    Filed: June 19, 2017
    Date of Patent: April 20, 2021
    Assignee: International Business Machines Corporation
    Inventors: Mu Qiao, Yuya J. Ong, Ramani Routray, Roger C. Raphael
  • Publication number: 20210109724
    Abstract: A method adapts a dataflow instance. A set of source data nodes, a set of terminal data nodes, and a set of computation nodes in the dataflow are identified from a directed graph. The set of computation nodes performs operations on data flowing from the set of source data nodes through computation nodes and onwards to terminal data nodes in the dataflow. The data nodes are evaluated with policies and a user context. A number of transformation compute nodes is computed from the policy decisions and added downstream of the set of source data nodes and optionally upstream of the set of terminal data nodes when the data, the dataflow or system does not meet the declared policies without the necessary computed number of transformation compute nodes. The number of transformation compute nodes are an additional portion of the overall set of computation nodes to enforce the declared policy.
    Type: Application
    Filed: October 11, 2019
    Publication date: April 15, 2021
    Inventors: Roger C. Raphael, Rajesh M. Desai, Sonali Surange, Hani Talal Jamjoon
  • Patent number: 10956135
    Abstract: A method adapts a dataflow instance. A set of source data nodes, a set of terminal data nodes, and a set of computation nodes in the dataflow are identified from a directed graph. The set of computation nodes performs operations on data flowing from the set of source data nodes through computation nodes and onwards to terminal data nodes in the dataflow. The data nodes are evaluated with policies and a user context. A number of transformation compute nodes is computed from the policy decisions and added downstream of the set of source data nodes and optionally upstream of the set of terminal data nodes when the data, the dataflow or system does not meet the declared policies without the necessary computed number of transformation compute nodes. The number of transformation compute nodes are an additional portion of the overall set of computation nodes to enforce the declared policy.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: March 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Sonali Surange, Hani Talal Jamjoon
  • Publication number: 20210081550
    Abstract: Serving data assets based on security policies is provided. A request to access an asset received from a user having a particular context is evaluated based on a set of asset access enforcement policies. An asset access policy enforcement decision is generated based on evaluating the request. It is determined whether the asset access policy enforcement decision is to transform particular data of the asset prior to allowing access. In response to determining that the asset access policy enforcement decision is to transform the particular data of the asset prior to allowing access, a transformation specification that includes an ordered subset of unit transformations for transforming the particular data of the asset is generated based on the particular context of the user and the set of asset access enforcement policies. A transformed asset is generated by applying the transformation specification to the asset transforming the particular data of the asset.
    Type: Application
    Filed: September 17, 2019
    Publication date: March 18, 2021
    Inventors: Roger C. Raphael, Hani Talal Jamjoom, Rajesh M. Desai, Iun Veng Leong, Uttama Shakya, Arjun Natarajan
  • Publication number: 20210064781
    Abstract: Disclosed is a computer-implemented method to identify and anonymize personal information, the method comprising analyzing a first corpus with a personal information sniffer, wherein the first corpus includes unstructured text, wherein the personal information sniffer is configured to detect a set of types of personal information, and wherein the personal information sniffer produces a first set of results. The method comprises analyzing the first corpus with a set of annotators, wherein each annotator is configured to identify all instances of a type of personal information in the corpus, and wherein the set of annotators produces a second set of results. The method comprises comparing the first set of results and the second set of results, determining, the first set of results does not match the second set of results, and updating, based on the determining, the personal information sniffer.
    Type: Application
    Filed: June 19, 2019
    Publication date: March 4, 2021
    Inventors: Roger C. Raphael, Rajesh M. Desai, Iun Veng Leong, Ramakanta Samal, Ansel Blume
  • Publication number: 20210028951
    Abstract: A data protection policy enforcement operation is provided for enforcing data protection policies in collaborative framework environments which permit a plurality of collaborators to jointly work on projects requiring access to project data assets. For this purpose, a method includes establishing, by a computer device, a plurality of rules for evaluating actions performed in a collaborative environment, the collaborative environment including a plurality of collaborators and a plurality of data assets associated with collaboration between the collaborators; in response to a request to perform an action in the collaborative environment, applying the rules to the plurality of data assets related to the data assets to create a plurality of determinations; in response to each of the plurality of determinations being allowed, allowing the action to be performed; and, in response to at least one of the plurality of determinations being denied, preventing the action from being performed.
    Type: Application
    Filed: July 26, 2019
    Publication date: January 28, 2021
    Inventors: Roger C. Raphael, Rajesh M. Desai, Olena Woolf, Arron La
  • Patent number: 10897367
    Abstract: A data protection policy enforcement operation is provided for enforcing data protection policies in collaborative framework environments which permit a plurality of collaborators to jointly work on projects requiring access to project data assets. For this purpose, a method includes establishing, by a computer device, a plurality of rules for evaluating actions performed in a collaborative environment, the collaborative environment including a plurality of collaborators and a plurality of data assets associated with collaboration between the collaborators; in response to a request to perform an action in the collaborative environment, applying the rules to the plurality of data assets related to the data assets to create a plurality of determinations; in response to each of the plurality of determinations being allowed, allowing the action to be performed; and, in response to at least one of the plurality of determinations being denied, preventing the action from being performed.
    Type: Grant
    Filed: July 26, 2019
    Date of Patent: January 19, 2021
    Assignee: International Business Machines Corporation
    Inventors: Roger C. Raphael, Rajesh M. Desai, Olena Woolf, Arron La
  • Patent number: 10803254
    Abstract: A data structure is generated containing enumerators for data types of a domain, text forms of the enumerators and context patterns for the text forms. The data structure also includes information extraction rules that are associated with the enumerators. The data structure is updated with additional context patterns and text forms that are identified within a set of documents to which text analytic annotators are to be tuned. The set of documents are analyzed against the updated data structure and additional extraction rules are generated based on the analysis.
    Type: Grant
    Filed: July 13, 2018
    Date of Patent: October 13, 2020
    Assignee: International Business Machines Corporation
    Inventors: Harish Deshmukh, Philip E. Parker, Roger C. Raphael, Paul S. Taylor, Gabriel Valencia
  • Patent number: 10783112
    Abstract: Provided are techniques for a high performance compliance mechanism for structured and unstructured data in an enterprise. A record to represent a collection of structured objects is generated. The record is stored in a file plan container associated with a disposition schedule. The collection of the structured objects represented by the record is disposed in accordance with the disposition schedule.
    Type: Grant
    Filed: March 27, 2017
    Date of Patent: September 22, 2020
    Assignee: International Business Machines Corporation
    Inventors: William R. Belknap, Rajesh M. Desai, Roger C. Raphael, Ronald L. Rathgeber
  • Patent number: 10713218
    Abstract: An electronic-discovery system and method, wherein content items and hold anchors are stored in a repository, tracking objects and representational anchor objects are stored in a database system, and the tracking objects represent the content items and the representational anchor objects represent the hold anchors. A first hold anchor is used for placing a hold on the content items for a first defined period of time, and a first representational anchor object and one or more of the tracking objects are used for representing and tracking the holds for the first defined period of time. When the first defined period of time expires, a second hold anchor is used for placing the hold on the content items for a second defined period of time, and a second representational anchor object and the tracking objects are used for representing and tracking the holds for the second defined period of time.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: July 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Rajesh M. Desai, Aidon P. Jennery, Lijing E. Lin, Roger C. Raphael