Patents by Inventor Mattia Rigotti

Mattia Rigotti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

LAYER NORMALIZATION FOR CALIBRATED UNCERTAINTY IN DEEP LEARNING

Publication number: 20250094784

Abstract: Layer normalization in machine learning applications that includes sampling a random set of activations corresponding to a fixed fraction of the overall activations to provide a plurality of subsampled activations; computing the average across the plurality of subsampled activations; and computing the standard deviation across the plurality of subsampled activations. The layer normalization further includes employing two statistics including the average of the subsampled activations and the standard deviation across the plurality of subsampled activations to normalize all of the activations as a layer normalization.

Type: Application

Filed: September 20, 2023

Publication date: March 20, 2025

Inventors: Thomas Frick, Mattia Rigotti, Diego Matteo Antognini, Ioana Giurgiu, Adelmo Cristiano Innocenza Malossi
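
The normalization described in the abstract above can be sketched as follows. This is a minimal illustration in plain Python, not the patented implementation: the function name `subsampled_layer_norm`, the `fraction` and `eps` parameters, and the use of the population standard deviation are all assumptions made for the example.

```python
import random
import statistics


def subsampled_layer_norm(activations, fraction=0.5, eps=1e-5, rng=None):
    """Normalize every activation using statistics from a random subsample.

    `fraction`, `eps`, and `rng` are illustrative parameters (assumptions),
    not names taken from the patent application.
    """
    rng = rng or random.Random()
    # Pick a fixed fraction of the activations (at least two where possible,
    # so that the standard deviation is meaningful).
    k = max(2, int(round(fraction * len(activations))))
    k = min(k, len(activations))
    subsample = rng.sample(activations, k)
    # The two statistics come from the subsample only ...
    mean = statistics.fmean(subsample)
    std = statistics.pstdev(subsample)
    # ... but they are applied to all activations.
    return [(a - mean) / (std + eps) for a in activations]


acts = [0.5, -1.2, 3.3, 0.0, 2.1, -0.7, 1.8, 0.9]
normed = subsampled_layer_norm(acts, fraction=0.5, rng=random.Random(0))
```

Because the subsample changes from call to call, the normalized output is stochastic. With `fraction=1.0` the sketch reduces to ordinary layer normalization without learned scale and shift.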

GENERATING STRONG LABELS FOR EXAMPLES LABELLED WITH WEAK LABELS

Publication number: 20240265676

Abstract: A method for generating strong labels for examples labelled with weak labels leverages an artificial neural network, or ANN, which is assumed to have been trained on a training set of examples labelled according to weak labels (e.g., classes of structural defects in images of civil engineering structures). The method processes each example of a set of test examples by performing the following operations. The trained ANN is first executed on each example to infer a weak label. Then, the method extracts explanatory features from the ANN as executed on the example. The method generates a strong label (e.g., a region boundary of the structural defect), based on the extracted explanatory features. The method subsequently prompts a user to react to one or each of the inferred weak label and the generated strong label. The response obtained is then interpreted by the method to obtain a further weak label.

Type: Application

Filed: February 8, 2023

Publication date: August 8, 2024

Inventors: Klára Janousková, Ioana Giurgiu, Mattia Rigotti, Adelmo Cristiano Innocenza Malossi

GENERATING CAUSAL ASSOCIATION RANKINGS USING DYNAMIC EMBEDDINGS

Publication number: 20240193411

Abstract: An embodiment for generating causal association rankings for candidate events within a window of candidate events using dynamic deep neural network generated embeddings. The embodiment may automatically receive a window of candidate events including events of a first type preceding one or more target events of interest. The embodiment may automatically generate contrastive windows of candidate events, each of the contrastive windows of candidate events of the first type corresponding to a different dropped candidate event from the received window of candidate events. The embodiment may automatically identify matching historical windows of events having resulting embeddings that are close in distance to the embeddings of the contrastive windows and calculate a first score for each match. The embodiment may automatically identify matching incident windows and calculate a corresponding second score.

Type: Application

Filed: December 7, 2022

Publication date: June 13, 2024

Inventors: Jiri Navratil, Karthikeyan Shanmugam, Naoki Abe, Youssef Mroueh, Mattia Rigotti, Inkit Padhi
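
The contrastive-window step in the abstract above (one window per dropped candidate event) can be illustrated with a short sketch. The function name `contrastive_windows` and the representation of a window as a list of event labels are assumptions for the example; the embedding, matching, and scoring steps are not shown.

```python
def contrastive_windows(window):
    """Return one copy of `window` per candidate event, each with that event dropped."""
    return [window[:i] + window[i + 1:] for i in range(len(window))]


# Hypothetical window of candidate events preceding a target event.
events = ["disk_warning", "cpu_spike", "config_change"]
for dropped, win in zip(events, contrastive_windows(events)):
    print(f"without {dropped}: {win}")
```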

EXPLAINABLE PREDICTION MODELS BASED ON CONCEPTS

Publication number: 20240119276

Abstract: Generating a neural network model for producing explainable prediction outputs for input data samples is provided. A training dataset of data samples is provided, each sample having a prediction label indicating a desired prediction output from the model for that sample, and a set of concept vectors is defined, comprising a plurality of concept vectors which are associated with respective predefined concepts characterizing information content of the data samples. A set of input vectors is produced from each data sample. A neural network model is trained that includes a cross-attention module for producing a sample embedding for a data sample and a prediction module for producing a prediction output from the sample embedding.

Type: Application

Filed: September 30, 2022

Publication date: April 11, 2024

Inventors: Mattia Rigotti, Ioana Giurgiu, Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton

Feature selection using Sobolev Independence Criterion

Patent number: 11645555

Abstract: A machine learning system that implements Sobolev Independence Criterion (SIC) for feature selection is provided. The system receives a dataset including pairings of stimuli and responses. Each stimulus includes multiple features. The system generates a correctly paired sample of stimuli and responses from the dataset by pairing stimuli and responses according to the pairings of stimuli and responses in the dataset. The system generates an alternatively paired sample of stimuli and responses from the dataset by pairing stimuli and responses differently than the pairings of stimuli and responses in the dataset. The system determines a witness function and a feature importance distribution across the features that optimizes a cost function that is evaluated based on the correctly paired and alternatively paired samples of the dataset. The system selects one or more features based on the computed feature importance distribution.

Type: Grant

Filed: October 12, 2019

Date of Patent: May 9, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira Dos Santos
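
The two samples described in the abstract above can be sketched as follows. This covers only the sampling step; the witness function, cost function, and feature importance distribution are not shown. The function name `build_samples` and the use of a random permutation of the responses are assumptions for illustration, and a random permutation may leave some individual pairs unchanged.

```python
import random


def build_samples(stimuli, responses, rng=None):
    """Return (correctly paired sample, alternatively paired sample)."""
    rng = rng or random.Random()
    # Correct pairing: keep each stimulus with its own response.
    paired = list(zip(stimuli, responses))
    # Alternative pairing: keep the stimuli in order, permute the responses.
    shuffled = list(responses)
    rng.shuffle(shuffled)
    alternative = list(zip(stimuli, shuffled))
    return paired, alternative


# Hypothetical toy dataset: each stimulus has two features.
stimuli = [[0.1, 1.0], [0.4, 0.2], [0.9, 0.7], [0.3, 0.5]]
responses = [0, 1, 1, 0]
paired, alternative = build_samples(stimuli, responses, rng=random.Random(42))
```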

NEURAL NETWORKS WITH ANALOG AND DIGITAL MODULES

Publication number: 20220414445

Abstract: A neural network includes a plurality of analog arrays that comprise all synaptic weights of the neural network. The neural network also includes digital modules that are co-trained along with the plurality of analog arrays. The digital modules are intermittently connected and intermittently activated when the neural network is in production. When activated and connected, the digital modules may correct weights of the analog arrays.

Type: Application

Filed: June 29, 2021

Publication date: December 29, 2022

Inventors: Malte Johannes Rasch, Mattia Rigotti

LEVERAGING EXPLANATIONS FOR TRAINING OF AN AI SYSTEM

Publication number: 20220374703

Abstract: Computer-implemented methods, computer program products, and computer systems for training of an explaining machine-learning model are disclosed. The computer-implemented method may include one or more processors configured for providing an untrained machine-learning model, providing training data for the machine-learning model comprising training input data elements, wherein each of the training input data elements relates to a prediction label representing an expected prediction value as well as to a concept label, wherein the concept label relates to a reason why the expected prediction label is expected given the training input data elements, and simultaneously updating, during a supervised training of the machine-learning model, prediction parameter values as well as concept parameter values, thereby building the explaining machine-learning model.

Type: Application

Filed: May 18, 2021

Publication date: November 24, 2022

Inventors: Mattia Rigotti, Christoph Adrian Miksovic Czasch, Paolo Scotton, Thomas Gschwind, Adelmo Cristiano Innocenza Malossi, Thomas Frick, Filip Michal Janicki

Layer-wise distillation for protecting pre-trained neural network models

Patent number: 11494637

Abstract: Neural network protection mechanisms are provided. The neural network protection engine receives a pre-trained neural network computer model and forward propagates a dataset through layers of the pre-trained neural network computer model to compute, for each layer of the pre-trained neural network computer model, inputs and outputs of the layer. For at least one layer of the pre-trained neural network computer model, a differentially private distillation operation is performed on the inputs and outputs of the at least one layer to generate modified operational parameters of the at least one layer. The modified operational parameters of the at least one layer obfuscate aspects of an original training dataset used to train the pre-trained neural network computer model, present in original operational parameters of the at least one layer. The neural network protection engine generates a privatized trained neural network model based on the modified operational parameters.

Type: Grant

Filed: March 28, 2019

Date of Patent: November 8, 2022

Assignee: International Business Machines Corporation

Inventors: Supriyo Chakraborty, Mattia Rigotti

Acceleration of convolutional neural networks on analog arrays

Patent number: 11443176

Abstract: Mechanisms are provided for acceleration of convolutional neural networks on analog arrays. Input ports receive image signals from frames in an input image. Input memory arrays store the image signals received from the input ports into a respective input memory location to create a plurality of image sub-regions in input memory arrays. A distributor associates each of a set of analog array tiles in an analog array to a part of image sub-regions of the input memory arrays, so that one or more of a set of analog memory components is associated with the image signals in a distribution order to create a respective output signal. An assembler stores each of the respective output signals into one of a set of memory outputs in an output order that is determined by the distribution order.

Type: Grant

Filed: March 22, 2019

Date of Patent: September 13, 2022

Assignee: International Business Machines Corporation

Inventors: Malte Rasch, Tayfun Gokmen, Mattia Rigotti, Wilfried Haensch

False detection rate control with null-hypothesis

Patent number: 11373760

Abstract: A machine learning system receives a witness function that is determined based on an initial sample of a dataset comprising multiple pairs of stimuli and responses. Each stimulus includes multiple features. The system receives a holdout sample of the dataset comprising one or more pairs of stimuli and responses that are not used to determine the witness function. The system generates a simulated sample based on the holdout sample. Values of a particular feature of the stimuli of the simulated sample are predicted based on values of features other than the particular feature of the stimuli of the simulated sample. The system applies the holdout sample to the witness function to obtain a first result. The system applies the simulated sample to the witness function to obtain a second result. The system determines whether to select the particular feature based on a comparison between the first result and the second result.

Type: Grant

Filed: October 12, 2019

Date of Patent: June 28, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira Dos Santos
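
The holdout-versus-simulated comparison in the abstract above can be sketched as follows. The names `should_select_feature`, `witness`, `predict_feature`, and `threshold` are hypothetical, and the decision rule used here (difference of mean witness values compared against a threshold) is an assumption for the example, not the rule claimed in the patent.

```python
def should_select_feature(witness, predict_feature, holdout, j, threshold=0.0):
    """Compare witness values on the holdout sample and on a simulated sample.

    holdout: list of (x, y) pairs, where x is a list of feature values.
    predict_feature: callable mapping the other features to a value for feature j.
    """
    simulated = []
    for x, y in holdout:
        x_sim = list(x)
        # Replace feature j by a prediction made from the remaining features.
        others = x[:j] + x[j + 1:]
        x_sim[j] = predict_feature(others)
        simulated.append((x_sim, y))
    first = sum(witness(x, y) for x, y in holdout) / len(holdout)
    second = sum(witness(x, y) for x, y in simulated) / len(simulated)
    # Assumed rule: select the feature if simulating it lowers the witness value.
    return (first - second) > threshold, first, second


# Toy witness that depends only on feature 0; toy predictor that returns 0.0.
toy_witness = lambda x, y: x[0] * y
toy_predictor = lambda others: 0.0
holdout = [([1.0, 2.0], 1.0), ([-1.0, 0.5], -1.0)]
selected, first, second = should_select_feature(toy_witness, toy_predictor, holdout, j=0)
```

In this toy setup, feature 0 is selected because replacing it changes the witness value, while the same call with `j=1` would not select feature 1, since the toy witness ignores it.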

FEATURE SELECTION USING SOBOLEV INDEPENDENCE CRITERION

Publication number: 20210110285

Abstract: A machine learning system that implements Sobolev Independence Criterion (SIC) for feature selection is provided. The system receives a dataset including pairings of stimuli and responses. Each stimulus includes multiple features. The system generates a correctly paired sample of stimuli and responses from the dataset by pairing stimuli and responses according to the pairings of stimuli and responses in the dataset. The system generates an alternatively paired sample of stimuli and responses from the dataset by pairing stimuli and responses differently than the pairings of stimuli and responses in the dataset. The system determines a witness function and a feature importance distribution across the features that optimizes a cost function that is evaluated based on the correctly paired and alternatively paired samples of the dataset. The system selects one or more features based on the computed feature importance distribution.

Type: Application

Filed: October 12, 2019

Publication date: April 15, 2021

Inventors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira Dos Santos

FALSE DETECTION RATE CONTROL WITH NULL-HYPOTHESIS

Publication number: 20210110409

Abstract: A machine learning system receives a witness function that is determined based on an initial sample of a dataset comprising multiple pairs of stimuli and responses. Each stimulus includes multiple features. The system receives a holdout sample of the dataset comprising one or more pairs of stimuli and responses that are not used to determine the witness function. The system generates a simulated sample based on the holdout sample. Values of a particular feature of the stimuli of the simulated sample are predicted based on values of features other than the particular feature of the stimuli of the simulated sample. The system applies the holdout sample to the witness function to obtain a first result. The system applies the simulated sample to the witness function to obtain a second result. The system determines whether to select the particular feature based on a comparison between the first result and the second result.

Type: Application

Filed: October 12, 2019

Publication date: April 15, 2021

Inventors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira Dos Santos

Layer-Wise Distillation for Protecting Pre-Trained Neural Network Models

Publication number: 20200311540

Abstract: Neural network protection mechanisms are provided. The neural network protection engine receives a pre-trained neural network computer model and forward propagates a dataset through layers of the pre-trained neural network computer model to compute, for each layer of the pre-trained neural network computer model, inputs and outputs of the layer. For at least one layer of the pre-trained neural network computer model, a differentially private distillation operation is performed on the inputs and outputs of the at least one layer to generate modified operational parameters of the at least one layer. The modified operational parameters of the at least one layer obfuscate aspects of an original training dataset used to train the pre-trained neural network computer model, present in original operational parameters of the at least one layer. The neural network protection engine generates a privatized trained neural network model based on the modified operational parameters.

Type: Application

Filed: March 28, 2019

Publication date: October 1, 2020

Inventors: Supriyo Chakraborty, Mattia Rigotti

Acceleration of Convolutional Neural Networks on Analog Arrays

Publication number: 20190354847

Abstract: Mechanisms are provided for acceleration of convolutional neural networks on analog arrays. Input ports receive image signals from frames in an input image. Input memory arrays store the image signals received from the input ports into a respective input memory location to create a plurality of image sub-regions in input memory arrays. A distributor associates each of a set of analog array tiles in an analog array to a part of image sub-regions of the input memory arrays, so that one or more of a set of analog memory components is associated with the image signals in a distribution order to create a respective output signal. An assembler stores each of the respective output signals into one of a set of memory outputs in an output order that is determined by the distribution order.

Type: Application

Filed: March 22, 2019

Publication date: November 21, 2019

Inventors: Malte Rasch, Tayfun Gokmen, Mattia Rigotti, Wilfried Haensch