Patents by Inventor Berthold Reinwald

Berthold Reinwald has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240070522
    Abstract: Providing a representative dataset from an initial dataset by accessing a dataset associated with a machine learning model, receiving input parameters associated with the representative dataset selection, the input parameters including an evaluation metric, determining a density of a plurality of datapoints associated with the dataset, training a first iteration of a machine learning model using a first data point selected according to the density, determining a first value of the evaluation metric for the first iteration of the machine learning model, generating a representative subset based on the first value of the evaluation metric, and providing the representative dataset and a final machine learning model trained using the representative dataset.
    Type: Application
    Filed: August 23, 2022
    Publication date: February 29, 2024
    Inventors: Shaikh Shahriar Quader, Aindrila Basak, Adrian Mahjour, Petr Novotny, Carlo Appugliese, Berthold Reinwald, Dheeraj Arremsetty
  • Patent number: 11520986
    Abstract: Aspects of the present disclosure relate to neural-based ontology generation and refinement. A set of input data can be received. A set of entities can be extracted from the set of input data using a named-entity recognition (NER) process, each entity having a corresponding label, the corresponding labels making up a label set. The label set can be compared to concepts in a set of reference ontologies. Labels that match to concepts in the set of reference ontologies can be selected as a candidate concept set. Relations associated with the candidate concepts within the set of reference ontologies can be identified as a candidate relation set. An ontology can then be generated using the candidate concept set and candidate relation set.
    Type: Grant
    Filed: July 24, 2020
    Date of Patent: December 6, 2022
    Assignee: International Business Machines Corporation
    Inventors: Balaji Ganesan, Riddhiman Dasgupta, Akshay Parekh, Hima Patel, Berthold Reinwald, Sameep Mehta
  • Publication number: 20220245425
    Abstract: A knowledge graph embedding method, system, and computer program product using a computing device to embed a knowledge graph using a graph convolutional network, the method including learning, by the computing device, an embedding of the knowledge graph that includes entities, relations, and edges, weighing, by the computing device, initial feature vectors of nodes and a convolutional layer output to compute a weight and modifying the embedding based on the weight, and using, by the computing device, the modified embedding to perform a task related to the knowledge graph.
    Type: Application
    Filed: January 29, 2021
    Publication date: August 4, 2022
    Inventors: Nasrullah Sheikh, Xiao Qin, Berthold Reinwald, Christoph Adrian Miksovic Czasch, Thomas Gschwind, Paolo Scotton
  • Publication number: 20220245460
    Abstract: A graph neural network (GNN) training method, system, and computer program product in a graph, include generating, by the computing device, one or more hypothetical edges between two or more nodes of a plurality of nodes of a graph neural network, testing, by the computing device, to determine whether the one or more generated hypothetical edges should be connected by using negative sampling, and permanently connecting, by the computing device, the one or more tested hypothetical edges if the negative sampling indicates the connectivity.
    Type: Application
    Filed: January 29, 2021
    Publication date: August 4, 2022
    Inventors: Xiao Qin, Nasrullah Sheikh, Berthold Reinwald, Lingfei Wu
  • Publication number: 20220197977
    Abstract: A computer-implemented method is provided for predicting future data values or target labels of multivariate time series data. The method includes receiving the multivariate time series data having present values, systematic missing values, and random missing values. The method further includes masking the present values, the systematic missing values, and the random missing values using triplet encodings. The method also includes determining time intervals between current missing values, from among the systematic missing values and the random missing values, and immediately preceding ones of the present values. The method additionally includes training, by a computing device, at least one recurrent neural network with the triplet encodings, the time intervals, and multivariate time series data to perform a feedforward pass on the recurrent neural network predicting the future data values or the target labels.
    Type: Application
    Filed: December 22, 2020
    Publication date: June 23, 2022
    Inventors: Mu Qiao, Yuya Jeremy Ong, Prithviraj Sen, Berthold Reinwald
  • Publication number: 20220188567
    Abstract: One embodiment provides a computer implemented method, including: obtaining an information document corresponding to an entity, wherein the information document includes redacted information spans; identifying an entity type for each of the redacted information spans, wherein the entity type identifies a relationship between a redacted information span and at least one other entity within the information document; replacing the redacted information spans with replacement entities corresponding to the entity type of a given redacted information span, wherein the replacing is performed in view of a frequency distribution of actual information and wherein the replacing includes maintaining relationships of the redacted information spans; and controlling bias within the replacement entities, wherein the controlling includes detecting bias within the replacement entities.
    Type: Application
    Filed: December 11, 2020
    Publication date: June 16, 2022
    Inventors: Balaji Ganesan, Kalapriya Kannan, Neeraj Ramkrishna Singh, Shettigar Parkala Srinivas, Hima Patel, Soma Shekar Naganna, Berthold Reinwald, Sameep Mehta
  • Publication number: 20220092427
    Abstract: A method, a computer program product, and a system for non-obvious relationship detection. The method includes receiving a knowledge graph and inputting a first node and a second node from the knowledge graph into a twin neural network. The method also includes embedding the first node and the second node, aggregating neighborhood information and position information into the node embeddings. The method further includes concatenating the neighborhood information and the position information of the first node and the second node to produce a first output vector and a second output vector. The method also includes generating a final score by comparing the first output vector with the second output vector. The final score indicates a probability of a non-obvious relationship between the first node and the second node.
    Type: Application
    Filed: September 21, 2020
    Publication date: March 24, 2022
    Inventors: Phillipp Müller, Xiao Qin, Balaji Ganesan, Berthold Reinwald, Nasrullah Sheikh
  • Publication number: 20220058465
    Abstract: In an approach for forecasting in multivariate irregularly sampled time series, a processor receives time series data having one or more missing values. A processor determines, from the time series data, non-missing values present in the time series data. A processor determines, from the time series data, zero or more mask values for the time series data. A processor determines time interval values. A processor inputs the one or more missing values, the non-missing values, the zero or more mask values, and the time interval values into a recurrent neural network. A processor determines a predicted value for the one or more missing values.
    Type: Application
    Filed: August 24, 2020
    Publication date: February 24, 2022
    Inventors: Prithviraj Sen, Berthold Reinwald, Shivam Srivastava
  • Publication number: 20220027561
    Abstract: Aspects of the present disclosure relate to neural-based ontology generation and refinement. A set of input data can be received. A set of entities can be extracted from the set of input data using a named-entity recognition (NER) process, each entity having a corresponding label, the corresponding labels making up a label set. The label set can be compared to concepts in a set of reference ontologies. Labels that match to concepts in the set of reference ontologies can be selected as a candidate concept set. Relations associated with the candidate concepts within the set of reference ontologies can be identified as a candidate relation set. An ontology can then be generated using the candidate concept set and candidate relation set.
    Type: Application
    Filed: July 24, 2020
    Publication date: January 27, 2022
    Inventors: Balaji Ganesan, Riddhiman Dasgupta, Akshay Parekh, Hima Patel, Berthold Reinwald, Sameep Mehta
  • Patent number: 11194826
    Abstract: A computer-implemented method is provided that includes identifying an input dataset formatted as an input matrix, the input matrix including a plurality of rows and a plurality of columns. The computer-implemented method also includes dividing the input matrix into a plurality of input matrix blocks. Further, the computer-implemented method includes distributing the input matrix blocks to a plurality of different machines across a distributed filesystem, and sampling, by at least two of the different machines in parallel, at least two of the input matrix blocks. Finally, the computer-implemented method includes generating at least one sample matrix based on the sampling of the at least two of the input matrix blocks.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: December 7, 2021
    Assignee: International Business Machines Corporation
    Inventors: Douglas R. Burdick, Alexandre V. Evfimievski, Berthold Reinwald, Sebastian Schelter
  • Patent number: 10534590
    Abstract: The embodiments described herein relate to recompiling an execution plan of a machine-learning program during runtime. An execution plan of a machine-learning program is compiled. In response to identifying a directed acyclic graph of high-level operations (HOP DAG) for recompilation during runtime, the execution plan is dynamically recompiled. The dynamic recompilation includes updating statistics and dynamically rewriting one or more operators of the identified HOP DAG, recomputing memory estimates of operators of the rewritten HOP DAG based on the updated statistics and rewritten operators, constructing a directed acyclic graph of low-level operations (LOP DAG) corresponding to the rewritten HOP DAG based in part on the recomputed memory estimates, and generating runtime instructions based on the LOP DAG.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: January 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Matthias Boehm, Berthold Reinwald, Shirish Tatikonda
  • Patent number: 10521435
    Abstract: A method that includes generating, in a query pre-processor, a set of pre-computed materialized sub-graphs by executing a pre-processing dynamic random-walk based search for a bin of terms. The method also includes receiving, in a query processor, a search query having at least one search query term. In response to receiving the search query, the method includes accessing the set of pre-computed materialized sub-graphs. The accessing includes accessing a text index based on the search query term to retrieve a corresponding term group identifier and accessing the corresponding pre-computed materialized sub-graph based on the term group identifier. The method also includes executing a dynamic random-walk based search on only the corresponding pre-computed materialized sub-graph and based on the executing, retrieving nodes in the dataset and transmitting the nodes as results of the query.
    Type: Grant
    Filed: September 21, 2015
    Date of Patent: December 31, 2019
    Assignee: International Business Machines Corporation
    Inventors: Andrey Balmin, Heasoo Hwang, Erik Nijkamp, Berthold Reinwald
  • Publication number: 20190171641
    Abstract: A computer-implemented method is provided that includes identifying an input dataset formatted as an input matrix, the input matrix including a plurality of rows and a plurality of columns. The computer-implemented method also includes dividing the input matrix into a plurality of input matrix blocks. Further, the computer-implemented method includes distributing the input matrix blocks to a plurality of different machines across a distributed filesystem, and sampling, by at least two of the different machines in parallel, at least two of the input matrix blocks. Finally, the computer-implemented method includes generating at least one sample matrix based on the sampling of the at least two of the input matrix blocks.
    Type: Application
    Filed: February 8, 2019
    Publication date: June 6, 2019
    Inventors: Douglas R. Burdick, Alexandre V. Evfimievski, Berthold Reinwald, Sebastian Schelter
  • Patent number: 10268461
    Abstract: A method for global data flow optimization for machine learning (ML) programs. The method includes receiving, by a storage device, an initial plan for an ML program. A processor builds a nested global data flow graph representation using the initial plan. Operator directed acyclic graphs (DAGs) are connected using crossblock operators according to inter-block data dependencies. The initial plan for the ML program is re-written resulting in an optimized plan for the ML program with respect to its global data flow properties. The re-writing includes re-writes of: configuration dataflow properties, operator selection and structural changes.
    Type: Grant
    Filed: November 23, 2015
    Date of Patent: April 23, 2019
    Assignee: International Business Machines Corporation
    Inventors: Matthias Boehm, Mathias Peters, Berthold Reinwald, Shirish Tatikonda
  • Patent number: 10228922
    Abstract: Parallel execution of machine learning programs is provided. Program code is received. The program code contains at least one parallel for statement having a plurality of iterations. A parallel execution plan is determined for the program code. According to the parallel execution plan, the plurality of iterations is partitioned into a plurality of tasks. Each task comprises at least one iteration. The iterations of each task are independent. Data required by the plurality of tasks is determined. An access pattern by the plurality of tasks of the data is determined. The data is partitioned based on the access pattern.
    Type: Grant
    Filed: January 12, 2016
    Date of Patent: March 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Matthias Boehm, Douglas Burdick, Berthold Reinwald, Prithviraj Sen, Shirish Tatikonda, Yuanyuan Tian, Shivakumar Vaithyanathan
  • Patent number: 10229168
    Abstract: A computer-implemented method is provided that includes identifying an input dataset formatted as an input matrix, the input matrix including a plurality of rows and a plurality of columns. The computer-implemented method also includes dividing the input matrix into a plurality of input matrix blocks. Further, the computer-implemented method includes distributing the input matrix blocks to a plurality of different machines across a distributed filesystem, and sampling, by at least two of the different machines in parallel, at least two of the input matrix blocks. Finally, the computer-implemented method includes generating at least one sample matrix based on the sampling of the at least two of the input matrix blocks.
    Type: Grant
    Filed: November 20, 2015
    Date of Patent: March 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Douglas R. Burdick, Alexandre V. Evfimievski, Berthold Reinwald, Sebastian Schelter
  • Patent number: 10223762
    Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of partial output vector results on GPU global memory.
    Type: Grant
    Filed: March 16, 2018
    Date of Patent: March 5, 2019
    Assignee: International Business Machines Corporation
    Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda
  • Patent number: 10198291
    Abstract: One embodiment provides a method for runtime piggybacking of concurrent data-parallel jobs in task-parallel machine learning (ML) programs including intercepting, by a processor, executable jobs including executable map reduce (MR) jobs and looped jobs in a job stream. The processor queues the executable jobs, and applies runtime piggybacking of multiple jobs by processing workers of different types. Runtime piggybacking for a ParFOR (parallel for) ML program is optimized including configuring the runtime piggybacking based on processing worker type, degree of parallelism and minimum time thresholds.
    Type: Grant
    Filed: March 7, 2017
    Date of Patent: February 5, 2019
    Assignee: International Business Machines Corporation
    Inventors: Matthias Boehm, Berthold Reinwald, Shirish Tatikonda
  • Publication number: 20180260246
    Abstract: One embodiment provides a method for runtime piggybacking of concurrent data-parallel jobs in task-parallel machine learning (ML) programs including intercepting, by a processor, executable jobs including executable map reduce (MR) jobs and looped jobs in a job stream. The processor queues the executable jobs, and applies runtime piggybacking of multiple jobs by processing workers of different types. Runtime piggybacking for a ParFOR (parallel for) ML program is optimized including configuring the runtime piggybacking based on processing worker type, degree of parallelism and minimum time thresholds.
    Type: Application
    Filed: March 7, 2017
    Publication date: September 13, 2018
    Inventors: Matthias Boehm, Berthold Reinwald, Shirish Tatikonda
  • Publication number: 20180211357
    Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of partial output vector results on GPU global memory.
    Type: Application
    Filed: March 16, 2018
    Publication date: July 26, 2018
    Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda
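
The hierarchical aggregation described in the last abstract can be illustrated with a small CPU-side sketch. This is not the patented method and not GPU code: it only shows the general two-level idea, where each block of rows first accumulates its own partial output vector (standing in for per-block shared memory) and the partial vectors are then combined into a single result (standing in for global memory). The function names, the weighted column-sum computation, and the `block_size` parameter are all assumptions chosen for illustration.

```python
from typing import List


def block_partial(rows: List[List[float]], weights: List[float]) -> List[float]:
    """Accumulate one block's partial output vector: sum_i weights[i] * rows[i].

    Hypothetical helper; stands in for the per-block (shared memory) stage.
    """
    n_cols = len(rows[0]) if rows else 0
    partial = [0.0] * n_cols
    for row, w in zip(rows, weights):
        for j, x in enumerate(row):
            partial[j] += w * x
    return partial


def hierarchical_aggregate(
    matrix: List[List[float]], weights: List[float], block_size: int
) -> List[float]:
    """Two-level aggregation: per-block partial vectors, then a final combine.

    Hypothetical helper; the final loop stands in for the global memory stage.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n_cols = len(matrix[0]) if matrix else 0
    result = [0.0] * n_cols
    for start in range(0, len(matrix), block_size):
        stop = start + block_size
        # Level 1: each block reduces its own rows independently.
        partial = block_partial(matrix[start:stop], weights[start:stop])
        # Level 2: fold the block's partial vector into the overall result.
        for j, value in enumerate(partial):
            result[j] += value
    return result


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(hierarchical_aggregate(data, [1.0, 1.0, 1.0], block_size=2))
```

On a GPU the first level would typically run in parallel across thread blocks; here the blocks are processed one after another purely to keep the sketch short and dependency-free.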