Patents by Inventor Berthold Reinwald

Berthold Reinwald has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DISTRIBUTED TRAINING OF GRAPH NEURAL NETWORKS (GNN) BASED KNOWLEDGE GRAPH EMBEDDING MODELS

Publication number: 20240338551

Abstract: Aspects of the invention include techniques for scaling the training of graph neural network (GNN)-based knowledge graph embedding models for link prediction. A non-limiting example method includes receiving a knowledge graph of a data set and partitioning the knowledge graph into a plurality of partitions. At least one partition of the plurality of partitions is expanded. The method includes launching a training process for each partition of the plurality of partitions such that, during a training epoch, a respective training process samples positive and negative samples from a respective partition. An edge mini batch is formed for each training process and a computational graph is generated for each edge mini batch.

Type: Application

Filed: April 4, 2023

Publication date: October 10, 2024

Inventors: Nasrullah Sheikh, Xiao Qin, Berthold Reinwald
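
The partition, expand, and sample steps named in the abstract above can be illustrated with a small sketch. It is not the claimed method: the hash-based partitioning rule, the one-hop expansion, and the tail-corruption negative sampling are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import random
import zlib


def partition_edges(triples, num_partitions):
    """Assign each (head, relation, tail) triple to a partition.

    Assumption: partition by a stable hash of the head entity.
    """
    parts = [[] for _ in range(num_partitions)]
    for h, r, t in triples:
        parts[zlib.crc32(h.encode()) % num_partitions].append((h, r, t))
    return parts


def expand_partition(part, all_triples):
    """Add every edge that touches an entity already in the partition.

    Assumption: "expansion" means a one-hop neighborhood expansion.
    """
    nodes = {e for h, _, t in part for e in (h, t)}
    seen = set(part)
    extra = [tr for tr in all_triples
             if tr not in seen and (tr[0] in nodes or tr[2] in nodes)]
    return part + extra


def edge_mini_batch(part, batch_size, rng):
    """Sample positive edges from one partition and one negative per positive.

    Assumption: a negative is a positive whose tail is replaced by another
    entity of the same partition so that the result is not a known edge.
    """
    known = set(part)
    entities = sorted({e for h, _, t in part for e in (h, t)})
    positives = rng.sample(part, min(batch_size, len(part)))
    negatives = []
    for h, r, _ in positives:
        candidates = [e for e in entities if (h, r, e) not in known]
        if candidates:
            negatives.append((h, r, rng.choice(candidates)))
    return positives, negatives
```

In this sketch each training process would call `edge_mini_batch` on its own partition only, so no process needs to hold the full graph in memory.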

SCALABLE EVOLVING INCEPTION GRAPH NEURAL NETWORKS FOR DYNAMIC GRAPHS

Publication number: 20240330650

Abstract: Aspects include techniques for predicting system behaviors using a trained machine learning model. Aspects include receiving a sequence of snapshots of discrete-time dynamic graphs (DTDGs), each including a plurality of nodes, and generating node embeddings and transformation weight matrices for each of the plurality of nodes using a multi-hop parameter-free message passing operation. Aspects also include applying graph filters for each snapshot based on the plurality of node embeddings and a plurality of weight matrices for each of the plurality of nodes of the snapshot and concatenating the graph filters for each of the sequence of snapshots to create a final graph embedding for each snapshot. Aspects further include processing, by a self-attention layer, the final graph embedding for each snapshot as a sequence to obtain a final embedding for each node, and predicting a node value for a node of a next DTDG according to the final embedding for each node.

Type: Application

Filed: March 28, 2023

Publication date: October 3, 2024

Inventors: Xiao Qin, Nasrullah Sheikh, Berthold Reinwald

ACTIVE LEARNING FOR GRAPH NEURAL NETWORK BASED SEMANTIC SCHEMA ALIGNMENT

Publication number: 20240330693

Abstract: Embodiments are related to a technique for active learning for graph neural network based semantic schema alignment. The technique includes generating, by a first machine learning model executed on a processor, node embeddings having node pairs of a first schema and a second schema. The technique includes predicting, by a second machine learning model executed on the processor, a label output for the node pairs. The technique includes clustering the node pairs into a cluster output, determining that the label output and the cluster output are in a disagreement for at least one node pair of the node pairs, and in response to displaying the at least one node pair to a subject matter expert to generate a label for the at least one node pair, using the label for the at least one node pair as training data to further train the second machine learning model.

Type: Application

Filed: March 28, 2023

Publication date: October 3, 2024

Inventors: Abdul H. Quamar, Xiao Qin, Berthold Reinwald, Venkata Vamsikrishna Meduri
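
The disagreement check in the abstract above can be sketched as follows. The abstract does not say how a label output is compared with a cluster output, so this sketch assumes that each cluster takes the majority label of its members and that a node pair "disagrees" when its predicted label differs from that majority. The function name and data layout are hypothetical.

```python
from collections import Counter, defaultdict


def find_disagreements(labels, clusters):
    """Return node pairs whose predicted label differs from their cluster's majority.

    labels:   {node_pair: predicted label}, e.g. 1 = match, 0 = no match
    clusters: {node_pair: cluster id}
    Assumption: the majority label of a cluster stands in for the cluster output.
    """
    by_cluster = defaultdict(list)
    for pair, cluster_id in clusters.items():
        by_cluster[cluster_id].append(labels[pair])
    majority = {cid: Counter(ls).most_common(1)[0][0]
                for cid, ls in by_cluster.items()}
    return [pair for pair, cid in clusters.items()
            if labels[pair] != majority[cid]]
```

The returned pairs are the ones that would be shown to a subject matter expert, whose labels then become additional training data for the second model.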

Forecasting in multivariate irregularly sampled time series with missing values

Patent number: 12050980

Abstract: In an approach for forecasting in multivariate irregularly sampled time series, a processor receives time series data having one or more missing values. A processor determines, from the time series data, non-missing values present in the time series data. A processor determines, from the time series data, zero or more mask values for the time series data. A processor determines time interval values. A processor inputs the one or more missing values, the non-missing values, the zero or more mask values, and the time interval values into a recurrent neural network. A processor determines a predicted value for the one or more missing values.

Type: Grant

Filed: August 24, 2020

Date of Patent: July 30, 2024

Assignee: International Business Machines Corporation

Inventors: Prithviraj Sen, Berthold Reinwald, Shivam Srivastava
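
As an illustration of the mask values and time interval values mentioned in the abstract above, the following sketch derives both for a single variable. The conventions are assumptions, not taken from the patent: a mask of 1 marks an observed value, and the interval at each step is the time elapsed since the most recent observed value (0 before any observation). The function name is hypothetical.

```python
def mask_and_intervals(timestamps, values):
    """Return (mask, deltas) for one variable; None marks a missing value.

    mask[i]   is 1 if values[i] is observed, else 0.
    deltas[i] is the time since the last observed value before step i,
              or 0.0 if nothing has been observed yet (assumed convention).
    """
    mask = [0 if v is None else 1 for v in values]
    deltas = []
    last_seen = None  # timestamp of the most recent observed value
    for t, m in zip(timestamps, mask):
        deltas.append(0.0 if last_seen is None else t - last_seen)
        if m:
            last_seen = t
    return mask, deltas
```

For timestamps `[0.0, 1.0, 3.0, 7.0]` and values `[1.0, None, None, 4.0]`, the sketch yields the mask `[1, 0, 0, 1]` and the intervals `[0.0, 1.0, 3.0, 7.0]`. These two sequences, together with the values, would form the input features of a recurrent network.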

Predicting multivariate time series with systematic and random missing values

Patent number: 12039002

Abstract: A computer-implemented method is provided for predicting future data values or target labels of multivariate time series data. The method includes receiving the multivariate time series data having present values, systematic missing values, and random missing values. The method further includes masking the present values, the systematic missing values, and the random missing values using triplet encodings. The method also includes determining time intervals between current missing values, from among the systematic missing values and the random missing values, and immediately preceding ones of the present values. The method additionally includes training, by a computing device, at least one recurrent neural network with the triplet encodings, the time intervals, and multivariate time series data to perform a feedforward pass on the recurrent neural network predicting the future data values or the target labels.

Type: Grant

Filed: December 22, 2020

Date of Patent: July 16, 2024

Assignee: International Business Machines Corporation

Inventors: Mu Qiao, Yuya Jeremy Ong, Prithviraj Sen, Berthold Reinwald

SELECTING A HIGH COVERAGE DATASET

Publication number: 20240070522

Abstract: Providing a representative dataset from an initial dataset by accessing a dataset associated with a machine learning model, receiving input parameters associated with the representative dataset selection, the input parameters including an evaluation metric, determining a density of a plurality of datapoints associated with the dataset, training a first iteration of a machine learning model using a first data point selected according to the density, determining a first value of the evaluation metric for the first iteration of the machine learning model, generating a representative subset based on the first value of the evaluation metric, and providing the representative dataset and a final machine learning model trained using the representative dataset.

Type: Application

Filed: August 23, 2022

Publication date: February 29, 2024

Inventors: Shaikh Shahriar Quader, Aindrila Basak, Adrian Mahjour, Petr Novotny, Carlo Appugliese, Berthold Reinwald, Dheeraj Arremsetty

Neural-based ontology generation and refinement

Patent number: 11520986

Abstract: Aspects of the present disclosure relate to neural-based ontology generation and refinement. A set of input data can be received. A set of entities can be extracted from the set of input data using a named-entity recognition (NER) process, each entity having a corresponding label, the corresponding labels making up a label set. The label set can be compared to concepts in a set of reference ontologies. Labels that match to concepts in the set of reference ontologies can be selected as a candidate concept set. Relations associated with the candidate concepts within the set of reference ontologies can be identified as a candidate relation set. An ontology can then be generated using the candidate concept set and candidate relation set.

Type: Grant

Filed: July 24, 2020

Date of Patent: December 6, 2022

Assignee: International Business Machines Corporation

Inventors: Balaji Ganesan, Riddhiman Dasgupta, Akshay Parekh, Hima Patel, Berthold Reinwald, Sameep Mehta
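
The matching steps in the abstract above can be sketched in a few lines. The sketch starts after the named-entity recognition step and takes the extracted labels as given. The case-insensitive exact match, the dictionary layout of a reference ontology, and the rule that a relation is kept only when both of its endpoints are candidate concepts are assumptions made for illustration.

```python
def build_ontology(entity_labels, reference_ontologies):
    """Build a small ontology from NER labels and reference ontologies.

    entity_labels:        labels produced by an NER step (assumed given)
    reference_ontologies: list of {"concepts": [...], "relations": [(s, p, o), ...]}
    Assumption: labels match concepts by case-insensitive string equality.
    """
    label_set = {label.lower() for label in entity_labels}

    # Candidate concept set: labels that match a concept in any reference ontology.
    concepts = set()
    for onto in reference_ontologies:
        concepts |= label_set & {c.lower() for c in onto["concepts"]}

    # Candidate relation set: reference relations between two candidate concepts.
    relations = set()
    for onto in reference_ontologies:
        for s, p, o in onto["relations"]:
            if s.lower() in concepts and o.lower() in concepts:
                relations.add((s.lower(), p, o.lower()))

    return {"concepts": concepts, "relations": relations}
```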

KNOWLEDGE GRAPH EMBEDDING USING GRAPH CONVOLUTIONAL NETWORKS WITH RELATION-AWARE ATTENTION

Publication number: 20220245425

Abstract: A knowledge graph embedding method, system, and computer program product using a computing device to embed a knowledge graph using a graph convolutional network, the method including learning, by the computing device, an embedding of the knowledge graph that includes entities, relations, and edges, weighing, by the computing device, initial feature vectors of nodes and a convolutional layer output to compute a weight and modifying the embedding based on the weight, and using, by the computing device, the modified embedding to perform a task related to the knowledge graph.

Type: Application

Filed: January 29, 2021

Publication date: August 4, 2022

Inventors: Nasrullah Sheikh, Xiao Qin, Berthold Reinwald, Christoph Adrian Miksovic Czasch, Thomas Gschwind, Paolo Scotton

ADAPTIVE SELF-ADVERSARIAL NEGATIVE SAMPLING FOR GRAPH NEURAL NETWORK TRAINING

Publication number: 20220245460

Abstract: A graph neural network (GNN) training method, system, and computer program product in a graph include generating, by a computing device, one or more hypothetical edges between two or more nodes of a plurality of nodes of a graph neural network, testing, by the computing device, to determine whether the one or more generated hypothetical edges should be connected by using negative sampling, and permanently connecting, by the computing device, the one or more tested hypothetical edges if the negative sampling indicates the connectivity.

Type: Application

Filed: January 29, 2021

Publication date: August 4, 2022

Inventors: Xiao Qin, Nasrullah Sheikh, Berthold Reinwald, Lingfei Wu

PREDICTING MULTIVARIATE TIME SERIES WITH SYSTEMATIC AND RANDOM MISSING VALUES

Publication number: 20220197977

Abstract: A computer-implemented method is provided for predicting future data values or target labels of multivariate time series data. The method includes receiving the multivariate time series data having present values, systematic missing values, and random missing values. The method further includes masking the present values, the systematic missing values, and the random missing values using triplet encodings. The method also includes determining time intervals between current missing values, from among the systematic missing values and the random missing values, and immediately preceding ones of the present values. The method additionally includes training, by a computing device, at least one recurrent neural network with the triplet encodings, the time intervals, and multivariate time series data to perform a feedforward pass on the recurrent neural network predicting the future data values or the target labels.

Type: Application

Filed: December 22, 2020

Publication date: June 23, 2022

Inventors: Mu Qiao, Yuya Jeremy Ong, Prithviraj Sen, Berthold Reinwald

GENERATION OF TRAINING DATA FROM REDACTED INFORMATION

Publication number: 20220188567

Abstract: One embodiment provides a computer implemented method, including: obtaining an information document corresponding to an entity, wherein the information document includes redacted information spans; identifying an entity type for each of the redacted information spans, wherein the entity type identifies a relationship between a redacted information span and at least one other entity within the information document; replacing the redacted information spans with replacement entities corresponding to the entity type of a given redacted information span, wherein the replacing is performed in view of a frequency distribution of actual information and wherein the replacing includes maintaining relationships of the redacted information spans; and controlling bias within the replacement entities, wherein the controlling includes detecting bias within the replacement entities.

Type: Application

Filed: December 11, 2020

Publication date: June 16, 2022

Inventors: Balaji Ganesan, Kalapriya Kannan, Neeraj Ramkrishna Singh, Shettigar Parkala Srinivas, Hima Patel, Soma Shekar Naganna, Berthold Reinwald, Sameep Mehta

INTEGRATED GRAPH NEURAL NETWORK FOR SUPERVISED NON-OBVIOUS RELATIONSHIP DETECTION

Publication number: 20220092427

Abstract: A method, a computer program product, and a system for non-obvious relationship detection. The method includes receiving a knowledge graph and inputting a first node and a second node from the knowledge graph into a twin neural network. The method also includes embedding the first node and the second node, and aggregating neighborhood information and position information into the node embeddings. The method further includes concatenating the neighborhood information and the position information of the first node and the second node to produce a first output vector and a second output vector. The method also includes generating a final score by comparing the first output vector with the second output vector. The final score indicates a probability of a non-obvious relationship between the first node and the second node.

Type: Application

Filed: September 21, 2020

Publication date: March 24, 2022

Inventors: Phillipp Müller, Xiao Qin, Balaji Ganesan, Berthold Reinwald, Nasrullah Sheikh

FORECASTING IN MULTIVARIATE IRREGULARLY SAMPLED TIME SERIES WITH MISSING VALUES

Publication number: 20220058465

Abstract: In an approach for forecasting in multivariate irregularly sampled time series, a processor receives time series data having one or more missing values. A processor determines, from the time series data, non-missing values present in the time series data. A processor determines, from the time series data, zero or more mask values for the time series data. A processor determines time interval values. A processor inputs the one or more missing values, the non-missing values, the zero or more mask values, and the time interval values into a recurrent neural network. A processor determines a predicted value for the one or more missing values.

Type: Application

Filed: August 24, 2020

Publication date: February 24, 2022

Inventors: Prithviraj Sen, Berthold Reinwald, Shivam Srivastava

NEURAL-BASED ONTOLOGY GENERATION AND REFINEMENT

Publication number: 20220027561

Abstract: Aspects of the present disclosure relate to neural-based ontology generation and refinement. A set of input data can be received. A set of entities can be extracted from the set of input data using a named-entity recognition (NER) process, each entity having a corresponding label, the corresponding labels making up a label set. The label set can be compared to concepts in a set of reference ontologies. Labels that match to concepts in the set of reference ontologies can be selected as a candidate concept set. Relations associated with the candidate concepts within the set of reference ontologies can be identified as a candidate relation set. An ontology can then be generated using the candidate concept set and candidate relation set.

Type: Application

Filed: July 24, 2020

Publication date: January 27, 2022

Inventors: Balaji Ganesan, Riddhiman Dasgupta, Akshay Parekh, Hima Patel, Berthold Reinwald, Sameep Mehta

Single-pass distributed sampling from block-partitioned matrices

Patent number: 11194826

Abstract: A computer-implemented method is provided that includes identifying an input dataset formatted as an input matrix, the input matrix including a plurality of rows and a plurality of columns. The computer-implemented method also includes dividing the input matrix into a plurality of input matrix blocks. Further, the computer-implemented method includes distributing the input matrix blocks to a plurality of different machines across a distributed filesystem, and sampling, by at least two of the different machines in parallel, at least two of the input matrix blocks. Finally, the computer-implemented method includes generating at least one sample matrix based on the sampling of the at least two of the input matrix blocks.

Type: Grant

Filed: February 8, 2019

Date of Patent: December 7, 2021

Assignee: International Business Machines Corporation

Inventors: Douglas R. Burdick, Alexandre V. Evfimievski, Berthold Reinwald, Sebastian Schelter
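
The flow in the abstract above (divide the matrix into blocks, sample the blocks in parallel, assemble a sample matrix) can be sketched as below. The abstract does not describe the sampling scheme, so the sketch assumes independent Bernoulli row sampling per block. A thread pool stands in for the different machines, and all names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(matrix, rows_per_block):
    """Divide a row-major matrix (list of rows) into consecutive row blocks."""
    return [matrix[i:i + rows_per_block]
            for i in range(0, len(matrix), rows_per_block)]


def sample_block(job):
    """Sample rows of one block; each block uses its own random stream."""
    block_id, block, fraction, seed = job
    rng = random.Random(seed * 1_000_003 + block_id)
    return [row for row in block if rng.random() < fraction]


def sample_matrix(matrix, rows_per_block, fraction, seed=0):
    """Sample all blocks in parallel and concatenate the results in block order."""
    blocks = split_into_blocks(matrix, rows_per_block)
    jobs = [(i, block, fraction, seed) for i, block in enumerate(blocks)]
    with ThreadPoolExecutor() as pool:
        sampled = list(pool.map(sample_block, jobs))
    return [row for block in sampled for row in block]
```

Because each block draws from its own seeded stream, every block is read once and no coordination between workers is needed, which is the sense in which this sketch is single-pass.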

Dynamic recompilation techniques for machine learning programs

Patent number: 10534590

Abstract: The embodiments described herein relate to recompiling an execution plan of a machine-learning program during runtime. An execution plan of a machine-learning program is compiled. In response to identifying a directed acyclic graph of high-level operations (HOP DAG) for recompilation during runtime, the execution plan is dynamically recompiled. The dynamic recompilation includes updating statistics and dynamically rewriting one or more operators of the identified HOP DAG, recomputing memory estimates of operators of the rewritten HOP DAG based on the updated statistics and rewritten operators, constructing a directed acyclic graph of low-level operations (LOP DAG) corresponding to the rewritten HOP DAG based in part on the recomputed memory estimates, and generating runtime instructions based on the LOP DAG.

Type: Grant

Filed: April 28, 2017

Date of Patent: January 14, 2020

Assignee: International Business Machines Corporation

Inventors: Matthias Boehm, Berthold Reinwald, Shirish Tatikonda

Scaling dynamic authority-based search using materialized subgraphs

Patent number: 10521435

Abstract: A method that includes generating, in a query pre-processor, a set of pre-computed materialized sub-graphs by executing a pre-processing dynamic random-walk based search for a bin of terms. The method also includes receiving, in a query processor, a search query having at least one search query term. In response to receiving the search query, the method includes accessing the set of pre-computed materialized sub-graphs. The accessing includes accessing a text index based on the search query term to retrieve a corresponding term group identifier and accessing the corresponding pre-computed materialized sub-graph based on the term group identifier. The method also includes executing a dynamic random-walk based search on only the corresponding pre-computed materialized sub-graph and based on the executing, retrieving nodes in the dataset and transmitting the nodes as results of the query.

Type: Grant

Filed: September 21, 2015

Date of Patent: December 31, 2019

Assignee: International Business Machines Corporation

Inventors: Andrey Balmin, Heasoo Hwang, Erik Nijkamp, Berthold Reinwald

SINGLE-PASS DISTRIBUTED SAMPLING FROM BLOCK-PARTITIONED MATRICES

Publication number: 20190171641

Abstract: A computer-implemented method is provided that includes identifying an input dataset formatted as an input matrix, the input matrix including a plurality of rows and a plurality of columns. The computer-implemented method also includes dividing the input matrix into a plurality of input matrix blocks. Further, the computer-implemented method includes distributing the input matrix blocks to a plurality of different machines across a distributed filesystem, and sampling, by at least two of the different machines in parallel, at least two of the input matrix blocks. Finally, the computer-implemented method includes generating at least one sample matrix based on the sampling of the at least two of the input matrix blocks.

Type: Application

Filed: February 8, 2019

Publication date: June 6, 2019

Inventors: Douglas R. Burdick, Alexandre V. Evfimievski, Berthold Reinwald, Sebastian Schelter

Global data flow optimization for machine learning programs

Patent number: 10268461

Abstract: A method for global data flow optimization for machine learning (ML) programs. The method includes receiving, by a storage device, an initial plan for an ML program. A processor builds a nested global data flow graph representation using the initial plan. Operator directed acyclic graphs (DAGs) are connected using crossblock operators according to inter-block data dependencies. The initial plan for the ML program is re-written resulting in an optimized plan for the ML program with respect to its global data flow properties. The re-writing includes re-writes of: configuration dataflow properties, operator selection and structural changes.

Type: Grant

Filed: November 23, 2015

Date of Patent: April 23, 2019

Assignee: International Business Machines Corporation

Inventors: Matthias Boehm, Mathias Peters, Berthold Reinwald, Shirish Tatikonda

Hybrid parallelization strategies for machine learning programs on top of mapreduce

Patent number: 10228922

Abstract: Parallel execution of machine learning programs is provided. Program code is received. The program code contains at least one parallel for statement having a plurality of iterations. A parallel execution plan is determined for the program code. According to the parallel execution plan, the plurality of iterations is partitioned into a plurality of tasks. Each task comprises at least one iteration. The iterations of each task are independent. Data required by the plurality of tasks is determined. An access pattern by the plurality of tasks of the data is determined. The data is partitioned based on the access pattern.

Type: Grant

Filed: January 12, 2016

Date of Patent: March 12, 2019

Assignee: International Business Machines Corporation

Inventors: Matthias Boehm, Douglas Burdick, Berthold Reinwald, Prithviraj Sen, Shirish Tatikonda, Yuanyuan Tian, Shivakumar Vaithyanathan
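
A rough sketch of the two partitioning steps in the abstract above: the iterations of a parallel for loop are split into tasks, and the data is then split according to what each task reads. The contiguous, near-equal task sizes and the access pattern (iteration j reads only column j) are assumptions chosen for illustration. The function names are hypothetical.

```python
def partition_iterations(num_iterations, num_tasks):
    """Split iteration indices 0..n-1 into contiguous tasks of near-equal size."""
    base, extra = divmod(num_iterations, num_tasks)
    tasks, start = [], 0
    for k in range(num_tasks):
        size = base + (1 if k < extra else 0)
        if size:  # skip empty tasks when there are more tasks than iterations
            tasks.append(range(start, start + size))
        start += size
    return tasks


def partition_columns(matrix, tasks):
    """Give each task only the columns its iterations read.

    Assumed access pattern: iteration j reads column j of the matrix.
    """
    return [[[row[j] for j in task] for row in matrix] for task in tasks]
```

With this column-wise access pattern each task receives a column slice of the matrix, so a worker never has to load data that its iterations do not touch.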
