Patents by Inventor Youssef Mroueh
Youssef Mroueh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12254390
Abstract: A method, system and apparatus of ensembling, including inputting a set of models that predict different sets of attributes, determining a source set of attributes and a target set of attributes using a barycenter with an optimal transport metric, and determining a consensus among the set of models whose predictions are defined on the source set of attributes.
Type: Grant
Filed: April 29, 2019
Date of Patent: March 18, 2025
Assignee: International Business Machines Corporation
Inventors: Youssef Mroueh, Pierre L. Dognin, Igor Melnyk, Jarret Ross, Tom Sercu, Cicero Nogueira Dos Santos
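A minimal sketch of the barycenter-based consensus this abstract describes, assuming the POT (Python Optimal Transport) package is available and that the models' predictions have already been mapped onto a shared target attribute set with a toy ground cost; the patent's handling of models defined on differing source attribute sets is not reproduced here.

```python
import numpy as np
import ot  # POT: Python Optimal Transport (assumed available)

np.random.seed(0)
n_attrs = 5
# Toy setup: 3 models predict distributions over the same 5 target attributes,
# one histogram per column.
predictions = np.random.dirichlet(np.ones(n_attrs), size=3).T  # shape (5, 3)

# Ground cost between attributes: a 1-D grid distance standing in for a
# semantic distance between attribute embeddings.
grid = np.arange(n_attrs, dtype=float).reshape(-1, 1)
M = ot.dist(grid, grid)       # squared Euclidean cost matrix
M /= M.max()

# Entropic-regularized Wasserstein barycenter used as the consensus distribution.
consensus = ot.bregman.barycenter(predictions, M, reg=0.1)
print("consensus over target attributes:", np.round(consensus, 3))
```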
-
Publication number: 20240193411
Abstract: An embodiment for generating causal association rankings for candidate events within a window of candidate events using dynamic deep neural network generated embeddings. The embodiment may automatically receive a window of candidate events including events of a first type preceding one or more target events of interest. The embodiment may automatically generate contrastive windows of candidate events, each corresponding to a different dropped candidate event from the received window. The embodiment may automatically identify matching historical windows of events whose embeddings are close in distance to the embeddings of the contrastive windows and calculate a first score for each match. The embodiment may automatically identify matching incident windows and calculate a corresponding second score.
Type: Application
Filed: December 7, 2022
Publication date: June 13, 2024
Inventors: Jiri Navratil, Karthikeyan Shanmugam, Naoki Abe, Youssef Mroueh, Mattia Rigotti, Inkit Padhi
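A highly simplified sketch of the leave-one-out ("contrastive window") scoring idea above. Mean pooling of per-event vectors stands in for the dynamic deep-network window embedding, and the event vectors, historical windows, and scoring rule are illustrative assumptions rather than the claimed method.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_window(events):
    """Stand-in for the deep-network window embedding: mean-pool event vectors."""
    return np.mean(events, axis=0)

# Toy data: a window of 4 candidate events (8-dim vectors) preceding a target event,
# plus a bank of historical window embeddings.
window = rng.normal(size=(4, 8))
historical = rng.normal(size=(50, 8))

full_emb = embed_window(window)
d_full = np.min(np.linalg.norm(historical - full_emb, axis=1))

scores = {}
for i in range(len(window)):
    # Contrastive window: the received window with candidate event i dropped.
    contrastive = np.delete(window, i, axis=0)
    emb = embed_window(contrastive)
    # Distance to the closest matching historical window embedding.
    d_contrastive = np.min(np.linalg.norm(historical - emb, axis=1))
    # Candidates whose removal changes the best historical match the most
    # receive a higher (illustrative) causal-association score.
    scores[i] = d_contrastive - d_full

ranking = sorted(scores, key=scores.get, reverse=True)
print("candidate events ranked by illustrative score:", ranking)
```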
-
Publication number: 20240152669
Abstract: Surrogate training can include receiving a parameterization of a physical system, where the physical system includes real physical components and the parameterization has a corresponding target property in the physical system. The parameterization can be input into a neural network, where the neural network generates a different dimensional parameterization based on the input parameterization. The different dimensional parameterization can be input to a physical model that approximates the physical system. The physical model can be run using the different dimensional parameterization, where the physical model generates an output solution based on the different dimensional parameterization input to the physical model. Based on the output solution and the target property, the neural network can be trained to generate the different dimensional parameterization.
Type: Application
Filed: November 8, 2022
Publication date: May 9, 2024
Inventors: Raphael Pestourie, Youssef Mroueh, Payel Das, Steven Glenn Johnson, Christopher Vincent Rackauckas
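A minimal sketch of the training loop this abstract outlines, assuming PyTorch and a toy differentiable function standing in for the physical model; the dimensions, the physical model, and the target property are all placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "physical model": maps a 16-dim parameterization to a scalar output solution.
# It stands in for the solver that approximates the real physical system.
def physical_model(q):
    return torch.sin(q).sum(dim=-1, keepdim=True)

# Neural network that lifts a 4-dim input parameterization to the 16-dim
# (different dimensional) parameterization consumed by the physical model.
lift = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 16))
opt = torch.optim.Adam(lift.parameters(), lr=1e-3)

params = torch.randn(256, 4)          # input parameterizations of the system
target = torch.full((256, 1), 2.0)    # corresponding target property

for step in range(500):
    q = lift(params)                  # different dimensional parameterization
    solution = physical_model(q)      # run the physical model on it
    loss = ((solution - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final surrogate-training loss:", loss.item())
```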
-
Patent number: 11836220
Abstract: Systems, computer-implemented methods, and computer program products to facilitate updating, such as averaging and/or training, of one or more statistical sets are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a computing component that averages a statistical set, provided by the system, with an additional statistical set, that is compatible with the statistical set, to compute an averaged statistical set, where the additional statistical set is obtained from a selected additional system of a plurality of additional systems. The computer executable components also can include a selecting component that selects the selected additional system according to a randomization pattern.
Type: Grant
Filed: March 1, 2023
Date of Patent: December 5, 2023
Assignee: International Business Machines Corporation
Inventors: Xiaodong Cui, Wei Zhang, Mingrui Liu, Abdullah Kayi, Youssef Mroueh, Alper Buyuktosunoglu
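A minimal single-process sketch of the randomized pairwise averaging described in this family of filings (this abstract also appears in publications 20230205843 and 20220245397 and patent 11636280 below). Uniform random partner selection stands in for the claimed randomization pattern, and a plain weight vector stands in for the statistical set.

```python
import numpy as np

rng = np.random.default_rng(0)

n_systems, dim = 8, 10
# Each system holds a "statistical set" (e.g., model weights), here a random vector.
weights = [rng.normal(size=dim) for _ in range(n_systems)]

for step in range(100):
    for i in range(n_systems):
        # Randomization pattern: pick one of the other systems uniformly at random.
        j = rng.choice([k for k in range(n_systems) if k != i])
        # Average the local statistical set with the selected system's compatible set.
        weights[i] = 0.5 * (weights[i] + weights[j])

spread = np.std(np.stack(weights), axis=0).mean()
print("average spread across systems after randomized averaging:", spread)
```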
-
Publication number: 20230205843
Abstract: Systems, computer-implemented methods, and computer program products to facilitate updating, such as averaging and/or training, of one or more statistical sets are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a computing component that averages a statistical set, provided by the system, with an additional statistical set, that is compatible with the statistical set, to compute an averaged statistical set, where the additional statistical set is obtained from a selected additional system of a plurality of additional systems. The computer executable components also can include a selecting component that selects the selected additional system according to a randomization pattern.
Type: Application
Filed: March 1, 2023
Publication date: June 29, 2023
Inventors: Xiaodong Cui, Wei Zhang, Mingrui Liu, Abdullah Kayi, Youssef Mroueh, Alper Buyuktosunoglu
-
Patent number: 11645555
Abstract: A machine learning system that implements Sobolev Independence Criterion (SIC) for feature selection is provided. The system receives a dataset including pairings of stimuli and responses. Each stimulus includes multiple features. The system generates a correctly paired sample of stimuli and responses from the dataset by pairing stimuli and responses according to the pairings of stimuli and responses in the dataset. The system generates an alternatively paired sample of stimuli and responses from the dataset by pairing stimuli and responses differently than the pairings of stimuli and responses in the dataset. The system determines a witness function and a feature importance distribution across the features that optimizes a cost function that is evaluated based on the correctly paired and alternatively paired samples of the dataset. The system selects one or more features based on the computed feature importance distribution.
Type: Grant
Filed: October 12, 2019
Date of Patent: May 9, 2023
Assignee: International Business Machines Corporation
Inventors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira Dos Santos
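A loose sketch of the flavor of this approach (also published as 20210110285 below): a witness network is trained to separate correctly paired from permuted (alternatively paired) samples under a per-feature gradient penalty, and feature importance is read off from gradient magnitudes. The actual Sobolev Independence Criterion uses a specific sparsity-inducing penalty and a learned importance distribution, which this PyTorch toy does not reproduce.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: only the first 2 of 6 stimulus features actually drive the response.
n, d = 512, 6
x = torch.randn(n, d)
y = x[:, :2].sum(dim=1, keepdim=True) + 0.1 * torch.randn(n, 1)

witness = nn.Sequential(nn.Linear(d + 1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(witness.parameters(), lr=1e-3)

for step in range(300):
    perm = torch.randperm(n)
    x_req = x.clone().requires_grad_(True)
    out_correct = witness(torch.cat([x_req, y], dim=1))      # correctly paired sample
    correct = out_correct.mean()
    wrong = witness(torch.cat([x, y[perm]], dim=1)).mean()   # alternatively paired sample
    # Per-feature gradient penalty: a crude stand-in for the Sobolev constraint.
    grads = torch.autograd.grad(out_correct.sum(), x_req, create_graph=True)[0]
    penalty = (grads ** 2).mean()
    loss = -(correct - wrong) + 0.1 * penalty
    opt.zero_grad()
    loss.backward()
    opt.step()

# Illustrative feature importance: mean squared gradient of the witness per feature.
x_eval = x.clone().requires_grad_(True)
out = witness(torch.cat([x_eval, y], dim=1)).sum()
g = torch.autograd.grad(out, x_eval)[0]
importance = (g ** 2).mean(dim=0)
print("per-feature importance:", importance / importance.sum())
```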
-
Patent number: 11636280
Abstract: Systems, computer-implemented methods, and computer program products to facilitate updating, such as averaging and/or training, of one or more statistical sets are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a computing component that averages a statistical set, provided by the system, with an additional statistical set, that is compatible with the statistical set, to compute an averaged statistical set, where the additional statistical set is obtained from a selected additional system of a plurality of additional systems. The computer executable components also can include a selecting component that selects the selected additional system according to a randomization pattern.
Type: Grant
Filed: January 27, 2021
Date of Patent: April 25, 2023
Assignee: International Business Machines Corporation
Inventors: Xiaodong Cui, Wei Zhang, Mingrui Liu, Abdullah Kayi, Youssef Mroueh, Alper Buyuktosunoglu
-
Patent number: 11630989
Abstract: A computing device receives data X and Y, each having N samples. A function f(x,y) is defined to be a trainable neural network based on the data X and the data Y. A permuted version of the data Y is created. A loss mean is computed based on the trainable neural network f(x,y), the permuted version of the data Y, and a trainable scalar variable. A loss with respect to the scalar variable and the trainable neural network is minimized. Upon determining that the loss is at or below a predetermined threshold, a mutual information (MI) between test data X_T and Y_T is estimated. If the estimated MI is above a predetermined threshold, the test data X_T and Y_T are deemed dependent; otherwise, they are deemed independent.
Type: Grant
Filed: March 9, 2020
Date of Patent: April 18, 2023
Assignee: International Business Machines Corporation
Inventors: Youssef Mroueh, Pierre L. Dognin, Igor Melnyk, Jarret Ross, Tom D. J. Sercu
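A sketch of the permutation-based neural estimator in the spirit of this abstract (also published as 20210287099 below), using a standard variational lower bound on mutual information with a trainable scalar normalizer; the patent's exact loss, threshold values, and the toy data are not taken from the filing.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy dependent data: Y is a noisy function of X.
n = 1024
x = torch.randn(n, 1)
y = x + 0.5 * torch.randn(n, 1)

f = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
log_a = torch.zeros(1, requires_grad=True)        # trainable scalar (illustrative)
opt = torch.optim.Adam(list(f.parameters()) + [log_a], lr=1e-3)

for step in range(1000):
    perm = torch.randperm(n)                      # permuted version of the data Y
    joint = f(torch.cat([x, y], dim=1))
    marg = f(torch.cat([x, y[perm]], dim=1))
    # Variational lower bound on MI with a trainable scalar normalizer
    # (TUBA-style; the patent's exact loss may differ).
    mi_lb = joint.mean() - torch.exp(marg - log_a).mean() - log_a + 1.0
    loss = -mi_lb
    opt.zero_grad()
    loss.backward()
    opt.step()

mi_estimate = mi_lb.item()
print("estimated MI lower bound (nats):", mi_estimate)
print("dependent?", mi_estimate > 0.05)           # threshold is arbitrary here
```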
-
Publication number: 20230071046
Abstract: Embodiments of the present invention provide computer-implemented methods, computer program products and computer systems. Embodiments of the present invention can, in response to receiving parameters associated with a problem, train at least one generated data model to evaluate an estimation of a solution for the problem. Embodiments of the present invention can then generate an uncertainty quantification measure associated with an estimation of error for the at least one generated data model.
Type: Application
Filed: August 18, 2021
Publication date: March 9, 2023
Inventors: Raphaël Pestourie, Youssef Mroueh, Payel Das, Steven Glenn Johnson
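One common way to obtain such an uncertainty quantification measure is the spread of an ensemble of independently trained data models; the sketch below uses that as an illustration and is not necessarily the measure this application describes. The toy problem, model class, and bootstrap scheme are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem parameters -> solution: a toy 1-D function standing in for an expensive solver.
def true_solution(p):
    return np.sin(3 * p) + 0.3 * p

# Training data generated for the problem.
p_train = rng.uniform(-1, 1, size=40)
s_train = true_solution(p_train) + 0.05 * rng.normal(size=40)

# Ensemble of generated data models: polynomial fits on bootstrap resamples.
models = []
for _ in range(20):
    idx = rng.integers(0, len(p_train), len(p_train))
    models.append(np.polyfit(p_train[idx], s_train[idx], deg=5))

p_test = np.linspace(-1, 1, 5)
preds = np.stack([np.polyval(c, p_test) for c in models])
estimate = preds.mean(axis=0)        # estimated solution
uncertainty = preds.std(axis=0)      # illustrative UQ measure (estimation-error proxy)
for p, e, u in zip(p_test, estimate, uncertainty):
    print(f"p={p:+.2f}  estimate={e:+.3f}  uncertainty={u:.3f}")
```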
-
Publication number: 20220245397
Abstract: Systems, computer-implemented methods, and computer program products to facilitate updating, such as averaging and/or training, of one or more statistical sets are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a computing component that averages a statistical set, provided by the system, with an additional statistical set, that is compatible with the statistical set, to compute an averaged statistical set, where the additional statistical set is obtained from a selected additional system of a plurality of additional systems. The computer executable components also can include a selecting component that selects the selected additional system according to a randomization pattern.
Type: Application
Filed: January 27, 2021
Publication date: August 4, 2022
Inventors: Xiaodong Cui, Wei Zhang, Mingrui Liu, Abdullah Kayi, Youssef Mroueh, Alper Buyuktosunoglu
-
Patent number: 11373760
Abstract: A machine learning system receives a witness function that is determined based on an initial sample of a dataset comprising multiple pairs of stimuli and responses. Each stimulus includes multiple features. The system receives a holdout sample of the dataset comprising one or more pairs of stimuli and responses that are not used to determine the witness function. The system generates a simulated sample based on the holdout sample. Values of a particular feature of the stimuli of the simulated sample are predicted based on values of features other than the particular feature of the stimuli of the simulated sample. The system applies the holdout sample to the witness function to obtain a first result. The system applies the simulated sample to the witness function to obtain a second result. The system determines whether to select the particular feature based on a comparison between the first result and the second result.
Type: Grant
Filed: October 12, 2019
Date of Patent: June 28, 2022
Assignee: International Business Machines Corporation
Inventors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira Dos Santos
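A toy sketch of the holdout-versus-simulated-sample comparison this abstract describes (also published as 20210110409 below). The fixed witness function, the linear predictor used to simulate a feature, and the selection threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 5
# Toy witness function (in the patent it is determined on an initial sample):
# here it simply reacts to feature 0 together with the response.
def witness(x, y):
    return x[:, 0] * y

# Holdout sample not used to determine the witness function.
x_hold = rng.normal(size=(200, d))
y_hold = x_hold[:, 0] + 0.1 * rng.normal(size=200)

def test_feature(j):
    # Simulated sample: predict feature j from the other features (linear fit here)
    # and substitute the prediction, removing feature j's own information.
    others = np.delete(x_hold, j, axis=1)
    coef, *_ = np.linalg.lstsq(others, x_hold[:, j], rcond=None)
    x_sim = x_hold.copy()
    x_sim[:, j] = others @ coef
    first = witness(x_hold, y_hold).mean()    # witness applied to the holdout sample
    second = witness(x_sim, y_hold).mean()    # witness applied to the simulated sample
    return abs(first - second)

for j in range(d):
    gap = test_feature(j)
    print(f"feature {j}: |first - second| = {gap:.3f} ->", "select" if gap > 0.1 else "skip")
```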
-
Publication number: 20220129746
Abstract: Techniques are provided for decentralized parallel min/max optimizations. In one embodiment, the techniques involve generating gradients based on a first set of weights associated with a first node of a neural network, exchanging the first set of weights with a second set of weights associated with a second node, generating an average weight based on the first set of weights and the second set of weights, and updating the first set of weights and the second set of weights via a decentralized parallel optimistic stochastic gradient (DPOSG) algorithm based on the gradients and the average weight.
Type: Application
Filed: October 27, 2020
Publication date: April 28, 2022
Inventors: Mingrui Liu, Wei Zhang, Youssef Mroueh, Xiaodong Cui, Jarret Ross, Payel Das
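A toy two-node simulation of the optimistic step plus neighbor averaging on a bilinear saddle problem. The step size, topology, and noise model are assumptions, and the full DPOSG algorithm, its communication schedule, and convergence guarantees are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy min-max problem per node: min_u max_v  u * v  (saddle point at u = v = 0).
def stoch_grads(u, v):
    noise = rng.normal(scale=0.01, size=2)
    # Joint "descent field": [df/du, -df/dv] for min over u, max over v.
    return np.array([v + noise[0], -(u + noise[1])])

n_nodes, lr = 2, 0.1
w = [rng.normal(size=2) for _ in range(n_nodes)]        # per-node weights (u, v)
prev_g = [np.zeros(2) for _ in range(n_nodes)]

for step in range(3000):
    new_w = []
    for i in range(n_nodes):
        g = stoch_grads(*w[i])                          # gradients from local weights
        # Exchange weights with the other node and average.
        j = (i + 1) % n_nodes
        avg = 0.5 * (w[i] + w[j])
        # Optimistic stochastic gradient step: uses the previous gradient as a prediction.
        new_w.append(avg - lr * (2 * g - prev_g[i]))
        prev_g[i] = g
    w = new_w

print("node weights near the saddle point:", [np.round(x, 3) for x in w])
```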
-
Publication number: 20220076130
Abstract: Run a computerized numerical partial differential equation solver on at least one partial differential equation representing at least one physical constraint of a physical system, to generate a training data set. A true potential corresponds to an exact solution to the at least one partial differential equation. Using a computerized machine learning system, learn, from the training data set, a surrogate of a gradient of the true potential. Using the computerized machine learning system, apply Langevin sampling to the learned surrogate of the gradient, to obtain a plurality of samples corresponding to candidate designs for the physical system. Make the plurality of samples available to a fabrication entity.
Type: Application
Filed: August 31, 2020
Publication date: March 10, 2022
Inventors: Thanh Van Nguyen, Youssef Mroueh, Samuel Chung Hoffman, Payel Das, Pierre L. Dognin, Giuseppe Romano, Chinmay Hegde
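A PyTorch sketch of the two stages in this abstract, with a closed-form quadratic potential standing in for the PDE-derived training data; the solver, the true potential, and the network are placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for PDE-solver training data: designs x and the gradient of a "true
# potential" U(x) = ||x||^2 / 2 at those designs (the real gradient would come
# from solutions of the governing partial differential equations).
x_train = torch.randn(2048, 2) * 2.0
grad_train = x_train.clone()                 # grad U(x) = x for this toy potential

# Learn a surrogate of the gradient of the true potential.
surrogate = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
for step in range(1000):
    loss = ((surrogate(x_train) - grad_train) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Langevin sampling with the learned gradient surrogate:
#   x <- x - eps * g(x) + sqrt(2 * eps) * noise
eps = 0.01
x = torch.randn(512, 2) * 3.0                # initial candidate designs
with torch.no_grad():
    for step in range(2000):
        x = x - eps * surrogate(x) + (2 * eps) ** 0.5 * torch.randn_like(x)

# Samples should approximately follow exp(-U), i.e. a standard normal here.
print("sample mean:", x.mean(dim=0).numpy(), " sample std:", x.std(dim=0).numpy())
```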
-
Publication number: 20210287099
Abstract: A computing device receives data X and Y, each having N samples. A function f(x,y) is defined to be a trainable neural network based on the data X and the data Y. A permuted version of the data Y is created. A loss mean is computed based on the trainable neural network f(x,y), the permuted version of the data Y, and a trainable scalar variable. A loss with respect to the scalar variable and the trainable neural network is minimized. Upon determining that the loss is at or below a predetermined threshold, a mutual information (MI) between test data X_T and Y_T is estimated. If the estimated MI is above a predetermined threshold, the test data X_T and Y_T are deemed dependent; otherwise, they are deemed independent.
Type: Application
Filed: March 9, 2020
Publication date: September 16, 2021
Inventors: Youssef Mroueh, Pierre L. Dognin, Igor Melnyk, Jarret Ross, Tom D. J. Sercu
-
Publication number: 20210110285
Abstract: A machine learning system that implements Sobolev Independence Criterion (SIC) for feature selection is provided. The system receives a dataset including pairings of stimuli and responses. Each stimulus includes multiple features. The system generates a correctly paired sample of stimuli and responses from the dataset by pairing stimuli and responses according to the pairings of stimuli and responses in the dataset. The system generates an alternatively paired sample of stimuli and responses from the dataset by pairing stimuli and responses differently than the pairings of stimuli and responses in the dataset. The system determines a witness function and a feature importance distribution across the features that optimizes a cost function that is evaluated based on the correctly paired and alternatively paired samples of the dataset. The system selects one or more features based on the computed feature importance distribution.
Type: Application
Filed: October 12, 2019
Publication date: April 15, 2021
Inventors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira Dos Santos
-
Publication number: 20210110409
Abstract: A machine learning system receives a witness function that is determined based on an initial sample of a dataset comprising multiple pairs of stimuli and responses. Each stimulus includes multiple features. The system receives a holdout sample of the dataset comprising one or more pairs of stimuli and responses that are not used to determine the witness function. The system generates a simulated sample based on the holdout sample. Values of a particular feature of the stimuli of the simulated sample are predicted based on values of features other than the particular feature of the stimuli of the simulated sample. The system applies the holdout sample to the witness function to obtain a first result. The system applies the simulated sample to the witness function to obtain a second result. The system determines whether to select the particular feature based on a comparison between the first result and the second result.
Type: Application
Filed: October 12, 2019
Publication date: April 15, 2021
Inventors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira Dos Santos
-
Patent number: 10860900
Abstract: Systems, computer-implemented methods, and computer program products for transforming a source distribution to a target distribution. A system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a sampling component that receives a source distribution having a source sample and a target distribution having a target sample. The computer executable components can further comprise an optimizer component that employs a neural network to find a critic that dynamically discriminates between the source sample and the target sample, while constraining a gradient of the neural network. The computer executable components can further comprise a morphing component that generates a first product distribution by morphing the source distribution along the gradient of the neural network to the target distribution.
Type: Grant
Filed: October 30, 2018
Date of Patent: December 8, 2020
Assignee: International Business Machines Corporation
Inventors: Youssef Mroueh, Tom Sercu
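A rough PyTorch sketch in the spirit of this abstract (also published as 20200134399 below): a critic with a gradient-norm penalty discriminates source particles from a target sample, and the particles are morphed along the critic's gradient. The distributions, step sizes, penalty weight, and update schedule are illustrative and do not reproduce the patent's exact procedure.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Source and target distributions (2-D Gaussians with different means).
source = torch.randn(512, 2)
target = torch.randn(512, 2) + torch.tensor([4.0, 0.0])

critic = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

particles = source.clone()
for outer in range(30):
    # Find a critic that discriminates the source particles from the target sample,
    # with a penalty constraining the critic's gradient norm at the particles.
    for inner in range(30):
        x = particles.detach().requires_grad_(True)
        f_src = critic(x)
        f_tgt = critic(target)
        grad = torch.autograd.grad(f_src.sum(), x, create_graph=True)[0]
        gp = ((grad.norm(dim=1) - 1.0) ** 2).mean()
        loss = f_src.mean() - f_tgt.mean() + 10.0 * gp
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Morph the source particles along the critic's gradient toward the target.
    x = particles.detach().requires_grad_(True)
    grad = torch.autograd.grad(critic(x).sum(), x)[0]
    particles = (x + 0.2 * grad).detach()

print("morphed mean (should move toward [4, 0]):", particles.mean(dim=0).numpy())
```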
-
Publication number: 20200342361
Abstract: A method, system and apparatus of ensembling, including inputting a set of models that predict different sets of attributes, determining a source set of attributes and a target set of attributes using a barycenter with an optimal transport metric, and determining a consensus among the set of models whose predictions are defined on the source set of attributes.
Type: Application
Filed: April 29, 2019
Publication date: October 29, 2020
Inventors: Youssef Mroueh, Pierre L. Dognin, Igor Melnyk, Jarret Ross, Tom Sercu, Cicero Nogueira Dos Santos
-
Publication number: 20200134399
Abstract: Systems, computer-implemented methods, and computer program products for transforming a source distribution to a target distribution. A system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a sampling component that receives a source distribution having a source sample and a target distribution having a target sample. The computer executable components can further comprise an optimizer component that employs a neural network to find a critic that dynamically discriminates between the source sample and the target sample, while constraining a gradient of the neural network. The computer executable components can further comprise a morphing component that generates a first product distribution by morphing the source distribution along the gradient of the neural network to the target distribution.
Type: Application
Filed: October 30, 2018
Publication date: April 30, 2020
Inventors: Youssef Mroueh, Tom Sercu
-
Publication number: 20190147355
Abstract: Machine logic for: (i) selecting a sampled word for use as a next word in a text stream; (ii) determining, by an algorithm, an expected future reward value for the sampled word using a test policy including a training policy and a test-time inference procedure; and (iii) normalizing a set of expected future reward estimate(s) using the expected future reward value for the sampled word.
Type: Application
Filed: November 14, 2017
Publication date: May 16, 2019
Inventors: Steven J. Rennie, Etienne Marcheret, Youssef Mroueh, Vaibhava Goel, Jarret Ross, Pierre L. Dognin
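A tiny PyTorch sketch of reward normalization against a test-time-inference baseline, in the spirit of the three steps above. The vocabulary, the reward function, the use of greedy decoding as the test-time inference procedure, and the REINFORCE-with-baseline update are illustrative assumptions, not the claimed machine logic.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

vocab = ["the", "cat", "sat", "mat"]
logits = torch.randn(len(vocab), requires_grad=True)   # toy next-word scores

def reward(word):
    # Stand-in for a sequence-level reward on the completed text stream.
    return 1.0 if word == "cat" else 0.0

probs = F.softmax(logits, dim=0)

# (i) sample the next word in the text stream from the training policy
sampled_idx = torch.multinomial(probs, 1).item()

# (ii) expected future reward for the word produced by the test-time inference
#      procedure (greedy decoding), used here as the normalizing value
greedy_idx = int(torch.argmax(probs))
baseline = reward(vocab[greedy_idx])

# (iii) normalize the sampled word's reward estimate by that value
advantage = reward(vocab[sampled_idx]) - baseline

# Policy-gradient update with the normalized reward (REINFORCE with baseline).
loss = -advantage * torch.log(probs[sampled_idx])
loss.backward()
print("sampled:", vocab[sampled_idx], " advantage:", advantage)
```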