Patents by Inventor Nicolo Fusi

Nicolo Fusi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230385247
    Abstract: Generally discussed herein are devices, systems, and methods for machine learning (ML) by flowing a dataset towards a target dataset. A method can include receiving a request to operate on a first dataset including first feature-label pairs; identifying a second dataset from multiple datasets, the second dataset including second feature-label pairs; determining a distance between the first feature-label pairs and the second feature-label pairs; and flowing the first dataset, using a dataset objective that operates based on the determined distance, to generate an optimized dataset.
    Type: Application
    Filed: June 1, 2023
    Publication date: November 30, 2023
    Inventors: David ALVAREZ-MELIS, Nicolo FUSI
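The dataset-flowing idea described in the abstract above can be sketched in a few lines. The version below is an illustrative stand-in, not the claimed method: it matches each source point to its nearest target point in a joint feature-label space and steps the features toward the match; the `lr` and `label_weight` parameters and the nearest-neighbor matching are assumptions for the sketch.

```python
import numpy as np

def flow_step(src_X, src_y, tgt_X, tgt_y, lr=0.2, label_weight=1.0):
    """One toy 'flow' step: move each source point toward its nearest
    target point, with nearness measured jointly over features and
    labels. Illustrative stand-in for the patented dataset objective."""
    # Joint (feature, label) representation used for the distance.
    src = np.hstack([src_X, label_weight * src_y[:, None]])
    tgt = np.hstack([tgt_X, label_weight * tgt_y[:, None]])
    # Pairwise squared Euclidean distances, source rows vs. target rows.
    d2 = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=1)
    mean_dist = d2[np.arange(len(src_X)), nearest].mean()
    # Step the features toward the matched target points.
    return src_X + lr * (tgt_X[nearest] - src_X), mean_dist
```

Iterating `flow_step` shrinks the feature-label distance between the datasets, which is the qualitative behavior the abstract describes.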
  • Patent number: 11709806
    Abstract: Generally discussed herein are devices, systems, and methods for machine learning (ML) by flowing a dataset towards a target dataset. A method can include receiving a request to operate on a first dataset including first feature-label pairs; identifying a second dataset from multiple datasets, the second dataset including second feature-label pairs; determining a distance between the first feature-label pairs and the second feature-label pairs; and flowing the first dataset, using a dataset objective that operates based on the determined distance, to generate an optimized dataset.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: July 25, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: David Alvarez-Melis, Nicolo Fusi
  • Publication number: 20230186150
    Abstract: Generally discussed herein are devices, systems, and methods for identifying optimal hyperparameter values within a pre-defined budget. A method can include, while training a model for a number of iterations using values of a hyperparameter vector, recording objective function values of the objective function and cost function values of a cost function, fitting a function model to the objective function values and a cost model to the cost function values resulting in a fitted function model and a fitted cost model, selecting a second hyperparameter vector, determining an optimal number of iterations to perform and after which to stop training using the second hyperparameter vector, re-training the model of the type of model for the optimal number of iterations using the second hyperparameter vector, and providing hyperparameter values, of the hyperparameter vector or the second hyperparameter vector, that maximize an objective defined by the objective function.
    Type: Application
    Filed: December 15, 2021
    Publication date: June 15, 2023
    Inventors: Syrine Belakaria, Rishit Sheth, Nicolo Fusi
  • Publication number: 20230186094
    Abstract: Examples of the present disclosure describe systems and methods for probabilistic neural network architecture generation. In an example, an underlying distribution over neural network architectures based on various parameters is sampled using probabilistic modeling. Training data is evaluated in order to iteratively update the underlying distribution, thereby generating a probability distribution over the neural network architectures. The distribution is iteratively trained until the parameters associated with the neural network architecture converge. Once it is determined that the parameters have converged, the resulting probability distribution may be used to generate a resulting neural network architecture. As a result, intermediate architectures need not be fully trained, which dramatically reduces memory usage and/or processing time.
    Type: Application
    Filed: February 9, 2023
    Publication date: June 15, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Nicolo FUSI, Francesco Paolo CASALE, Jonathan GORDON
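The sample-evaluate-update loop in the abstract above can be sketched with a distribution over a handful of candidate operations. The REINFORCE-style logit update and running-mean baseline below are assumptions for the sketch; the filing's probabilistic model is more general.

```python
import numpy as np

def search_distribution(reward_fn, n_ops, steps=400, lr=0.3, seed=0):
    """Keep a softmax distribution over candidate operations, sample
    one per step, and nudge its logit by (reward - running baseline).
    A toy stand-in for learning a distribution over architectures;
    `reward_fn` is a hypothetical scorer (e.g. proxy validation
    accuracy) for a sampled operation."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(n_ops)
    baseline = 0.0
    for t in range(steps):
        p = np.exp(logits - logits.max())
        p /= p.sum()
        op = int(rng.choice(n_ops, p=p))
        r = reward_fn(op)                     # evaluate the sampled choice
        baseline += (r - baseline) / (t + 1)  # running mean of rewards
        logits[op] += lr * (r - baseline)
    return logits
```

Because only sampled candidates are evaluated, no intermediate architecture has to be fully trained, matching the memory/time argument in the abstract.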
  • Patent number: 11604992
    Abstract: Examples of the present disclosure describe systems and methods for probabilistic neural network architecture generation. In an example, an underlying distribution over neural network architectures based on various parameters is sampled using probabilistic modeling. Training data is evaluated in order to iteratively update the underlying distribution, thereby generating a probability distribution over the neural network architectures. The distribution is iteratively trained until the parameters associated with the neural network architecture converge. Once it is determined that the parameters have converged, the resulting probability distribution may be used to generate a resulting neural network architecture. As a result, intermediate architectures need not be fully trained, which dramatically reduces memory usage and/or processing time.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: March 14, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nicolo Fusi, Francesco Paolo Casale, Jonathan Gordon
  • Publication number: 20220245444
    Abstract: Embodiments of the present disclosure include a system for optimizing an artificial neural network by configuring a model, based on a plurality of training parameters, to execute a training process, monitoring a plurality of statistics produced upon execution of the training process, and adjusting one or more of the training parameters, based on one or more of the statistics, to maintain at least one of the statistics within a predetermined range. In some embodiments, artificial intelligence (AI) processors may execute a training process on a model, the training process having an associated set of training parameters. Execution of the training process may produce a plurality of statistics. Control processor(s) coupled to the AI processor(s) may receive the statistics, and in accordance therewith, adjust one or more of the training parameters to maintain at least one of the statistics within a predetermined range during execution of the training process.
    Type: Application
    Filed: January 29, 2021
    Publication date: August 4, 2022
    Inventors: Maximilian Golub, Ritchie Zhao, Eric Chung, Douglas Burger, Bita Darvish Rouhani, Ge Yang, Nicolo Fusi
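The monitor-and-adjust loop in the abstract above amounts to a control rule on a training parameter. The halve/double rule and the loss-scale-proportional statistic below are hypothetical stand-ins, not the claimed method.

```python
def adjust_param(stat, value, low, high, factor=2.0):
    """Keep a monitored statistic inside [low, high] by scaling a
    training parameter: shrink on overshoot, grow on undershoot.
    A hypothetical control rule in the spirit of the abstract."""
    if stat > high:
        return value / factor
    if stat < low:
        return value * factor
    return value

def run_training(loss_scale, steps, low, high):
    """Toy training loop whose monitored statistic (say, a gradient
    norm) is proportional to the controlled parameter."""
    history = []
    for _ in range(steps):
        stat = 0.5 * loss_scale  # stand-in for a measured statistic
        loss_scale = adjust_param(stat, loss_scale, low, high)
        history.append((stat, loss_scale))
    return loss_scale, history
```

In the patent's framing, the control processors would run `adjust_param` concurrently with the AI processors executing the training process.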
  • Publication number: 20220108168
    Abstract: Aspects of the present disclosure relate to factorized neural network techniques. In examples, a layer of a machine learning model is factorized and initialized using spectral initialization. For example, an initial layer parameterized using an initial matrix is processed such that it is instead parameterized by the product of two or more matrices, thereby resulting in a factorized machine learning model. An optimizer associated with the machine learning model may also be processed to adapt a regularizer accordingly. For example, a regularizer using a weight decay function may be adapted to instead use a Frobenius decay function with respect to the factorized model layer. The factorized machine learning model may be trained using the processed optimizer and subsequently used to generate inferences.
    Type: Application
    Filed: December 2, 2020
    Publication date: April 7, 2022
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Nicolo FUSI, Mikhail KHODAK, Neil Arturo TENENHOLTZ, Lester Wayne MACKEY, II
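The two ingredients named in the abstract above, spectral initialization and Frobenius decay, can be sketched directly. Splitting the square root of the singular values between the two factors is one common convention, assumed here.

```python
import numpy as np

def spectral_init(W, rank):
    """Spectral initialization: factorize W ~ U @ V from the top
    `rank` singular directions, splitting sqrt of the singular
    values between the two factors."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:rank])
    return U[:, :rank] * root, root[:, None] * Vt[:rank]

def frobenius_decay(U, V, lam):
    """Frobenius decay: penalize ||U @ V||_F^2 (the norm of the
    product), rather than weight decay's ||U||_F^2 + ||V||_F^2."""
    P = U @ V
    return lam * float(np.sum(P * P))
```

At full rank the factorization reproduces the original layer exactly, and the Frobenius-decay penalty on the factors equals weight decay on the unfactorized matrix, which is the sense in which the regularizer is "adapted".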
  • Publication number: 20220092037
    Abstract: Generally discussed herein are devices, systems, and methods for machine learning (ML) by flowing a dataset towards a target dataset. A method can include receiving a request to operate on a first dataset including first feature-label pairs; identifying a second dataset from multiple datasets, the second dataset including second feature-label pairs; determining a distance between the first feature-label pairs and the second feature-label pairs; and flowing the first dataset, using a dataset objective that operates based on the determined distance, to generate an optimized dataset.
    Type: Application
    Filed: November 24, 2020
    Publication date: March 24, 2022
    Inventors: David Alvarez-Melis, Nicolo Fusi
  • Publication number: 20210210168
    Abstract: In embodiments of latent space harmonization (LSH) for predictive modeling, different training data sets are obtained from different measurement methods, where input data among the training data sets is quantifiable in a common space but a mapping between output data among the training data sets is unknown. A LSH module receives the training data sets and maps a common supervised target variable of the output data to a shared latent space where the output data can be jointly yielded. Mappings from the shared latent space back to the output training data of each training data set are determined and used to generate a trained predictive model. The trained predictive model is usable to predict output data from new input data with improved predictive power from the training data obtained using various, otherwise incongruent, measurement techniques.
    Type: Application
    Filed: January 7, 2021
    Publication date: July 8, 2021
    Inventors: Nicolo Fusi, Jennifer Listgarten, Gregory Byer Darnell
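The shared-latent-space idea in the abstract above can be illustrated crudely: z-score each data set's outputs so incongruent measurement scales land in one latent space, and keep the affine map back to each original scale. This is a toy stand-in for the LSH module's learned mappings, not the claimed method.

```python
import numpy as np

def harmonize(outputs_by_set):
    """Toy harmonization: standardize each data set's outputs into a
    shared latent space; record the affine map back to each scale."""
    latents, back_maps = [], []
    for y in outputs_by_set:
        mu, sd = y.mean(), y.std()
        latents.append((y - mu) / sd)
        back_maps.append((mu, sd))
    return latents, back_maps

def to_original_scale(z, back_map):
    """Map a latent-space value back to a data set's measurement scale."""
    mu, sd = back_map
    return z * sd + mu
```

When two measurement methods are affine transforms of the same underlying quantity, their latent representations agree up to sign, so a single model can be trained jointly on both.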
  • Patent number: 10923213
    Abstract: In embodiments of latent space harmonization (LSH) for predictive modeling, different training data sets are obtained from different measurement methods, where input data among the training data sets is quantifiable in a common space but a mapping between output data among the training data sets is unknown. A LSH module receives the training data sets and maps a common supervised target variable of the output data to a shared latent space where the output data can be jointly yielded. Mappings from the shared latent space back to the output training data of each training data set are determined and used to generate a trained predictive model. The trained predictive model is usable to predict output data from new input data with improved predictive power from the training data obtained using various, otherwise incongruent, measurement techniques.
    Type: Grant
    Filed: December 2, 2016
    Date of Patent: February 16, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nicolo Fusi, Jennifer Listgarten, Gregory Byer Darnell
  • Patent number: 10762163
    Abstract: In embodiments of probabilistic matrix factorization for automated machine learning, a computing system memory maintains different workflows that each include preprocessing steps for a machine learning model, the machine learning model, and one or more parameters for the machine learning model. The computing system memory additionally maintains different data sets, upon which the different workflows can be trained and tested. A matrix is generated from the different workflows and different data sets, where cells of the matrix are populated with performance metrics that each indicate a measure of performance for a workflow applied to a data set. A low-rank decomposition of the matrix with populated performance metrics is then determined. Based on the low-rank decomposition, an optimum workflow for a new data set can be determined. The optimum workflow can be one of the different workflows or a hybrid of at least two of the different workflows.
    Type: Grant
    Filed: December 5, 2016
    Date of Patent: September 1, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Nicolo Fusi
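The workflow-by-dataset performance matrix in the abstract above can be completed with a low-rank approximation. The hard-impute loop below (mean-impute, truncate the SVD, re-impose observed entries) is a deterministic sketch, simpler than the probabilistic matrix factorization in the filing.

```python
import numpy as np

def complete_low_rank(M, rank, iters=100):
    """Fill missing entries (NaN) of a workflow-by-dataset performance
    matrix with a rank-`rank` approximation (hard-impute sketch)."""
    mask = ~np.isnan(M)
    X = np.where(mask, M, np.nanmean(M))
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X = np.where(mask, M, low)   # keep observed performance metrics
    return X

def best_workflow(M_completed, dataset_idx):
    """Index of the workflow with the best predicted performance."""
    return int(np.argmax(M_completed[:, dataset_idx]))
```

A new data set corresponds to a mostly-empty column; completing it predicts how every workflow would perform without running them all.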
  • Publication number: 20200143231
    Abstract: Examples of the present disclosure describe systems and methods for probabilistic neural network architecture generation. In an example, an underlying distribution over neural network architectures based on various parameters is sampled using probabilistic modeling. Training data is evaluated in order to iteratively update the underlying distribution, thereby generating a probability distribution over the neural network architectures. The distribution is iteratively trained until the parameters associated with the neural network architecture converge. Once it is determined that the parameters have converged, the resulting probability distribution may be used to generate a resulting neural network architecture. As a result, intermediate architectures need not be fully trained, which dramatically reduces memory usage and/or processing time.
    Type: Application
    Filed: November 2, 2018
    Publication date: May 7, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Nicolo FUSI, Francesco Paolo CASALE, Jonathan GORDON
  • Publication number: 20190347548
    Abstract: Systems and methods for selecting a neural network for a machine learning problem are disclosed. A method includes accessing an input matrix. The method includes accessing a machine learning problem space associated with a machine learning problem and multiple untrained candidate neural networks for solving the machine learning problem. The method includes computing, for each untrained candidate neural network, at least one expressivity measure capturing an expressivity of the candidate neural network with respect to the machine learning problem. The method includes computing, for each untrained candidate neural network, at least one trainability measure capturing a trainability of the candidate neural network with respect to the machine learning problem. The method includes selecting, based on the at least one expressivity measure and the at least one trainability measure, at least one candidate neural network for solving the machine learning problem.
    Type: Application
    Filed: May 10, 2018
    Publication date: November 14, 2019
    Inventors: Saeed Amizadeh, Ge Yang, Nicolo Fusi, Francesco Paolo Casale
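The abstract above scores untrained candidate networks by expressivity and trainability. The proxies below (output variance of the untrained net, and the norm of a finite-difference gradient through the first layer) are hypothetical measures chosen for this sketch; the filing does not specify these particular formulas.

```python
import numpy as np

def make_net(widths, seed):
    """Random untrained ReLU network given layer widths (He-style init)."""
    rng = np.random.default_rng(seed)
    return [rng.normal(scale=np.sqrt(2.0 / a), size=(a, b))
            for a, b in zip(widths[:-1], widths[1:])]

def forward(net, X):
    for i, W in enumerate(net):
        X = X @ W
        if i < len(net) - 1:
            X = np.maximum(X, 0.0)   # ReLU on hidden layers
    return X

def expressivity_measure(net, X):
    """Hypothetical proxy: output variance over a batch of inputs."""
    return float(forward(net, X).var())

def trainability_measure(net, X, eps=1e-5):
    """Hypothetical proxy: norm of a finite-difference gradient of the
    mean output w.r.t. the first-layer weights (gradient signal)."""
    g = np.zeros_like(net[0])
    base = forward(net, X).mean()
    for idx in np.ndindex(net[0].shape):
        net[0][idx] += eps
        g[idx] = (forward(net, X).mean() - base) / eps
        net[0][idx] -= eps
    return float(np.linalg.norm(g))

def select_network(candidates, X):
    """Score each untrained candidate and return the best index."""
    scores = [expressivity_measure(n, X) * trainability_measure(n, X)
              for n in candidates]
    return int(np.argmax(scores))
```

The point mirrored from the abstract is that both measures are computed without training any candidate.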
  • Patent number: 10296709
    Abstract: The techniques and/or systems described herein are directed to improvements in genomic prediction using homomorphic encryption. For example, a genomic model can be generated by a prediction service provider to predict a risk of a disease or a presence of genetic traits. Genomic data corresponding to a genetic profile of an individual can be batch encoded into a plurality of polynomials, homomorphically encrypted, and provided to a service provider for evaluation. The genomic model can be batch encoded as well, and the genetic prediction may be determined by evaluating a dot product of the genomic model data and the genomic data. A genomic prediction result value can be provided to a computing device associated with a user for subsequent decrypting and decoding. Homomorphic encoding and encryption can be used such that the genomic data may be applied to the prediction model and a result can be obtained without revealing any information about the model, the genomic data, or any genomic prediction.
    Type: Grant
    Filed: June 10, 2016
    Date of Patent: May 21, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Kim Laine, Nicolo Fusi, Ran Gilad-Bachrach, Kristin E. Lauter
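The abstract above batch-encodes data into polynomials, which points at a lattice-based scheme; as a compact, runnable illustration of the core trick (an encrypted dot product), here is a toy Paillier cryptosystem instead. Paillier is additively homomorphic: multiplying ciphertexts adds plaintexts, and raising a ciphertext to a plaintext weight multiplies the plaintext. The primes are tiny and insecure; this is a sketch of the idea, not the patented scheme.

```python
import math
import random

def keygen(p=1000003, q=1000033):
    """Toy Paillier keypair (small primes, illustration only)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)          # modular inverse of lambda mod n
    return (n,), (n, lam, mu)

def encrypt(pk, m):
    (n,) = pk
    n2 = n * n
    r = random.randrange(2, n)    # random blinding factor
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(sk, c):
    n, lam, mu = sk
    n2 = n * n
    u = pow(c, lam, n2)
    return (((u - 1) // n) * mu) % n

def encrypted_dot(pk, enc_x, weights):
    """Homomorphic dot product: Enc(x_i)^w_i = Enc(w_i * x_i), and
    multiplying ciphertexts adds plaintexts, so the running product
    is Enc(sum_i w_i * x_i). The model weights stay in plaintext on
    the server; the genomic data stays encrypted throughout."""
    (n,) = pk
    n2 = n * n
    acc = 1
    for c, w in zip(enc_x, weights):
        acc = (acc * pow(c, w, n2)) % n2
    return acc
```

The client encrypts its data, the server evaluates `encrypted_dot` against its model without seeing the data, and only the client can decrypt the prediction, matching the privacy property the abstract describes.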
  • Patent number: 10120975
    Abstract: This disclosure presents a model for identifying correlations in genome-wide association studies (GWAS) with function-valued traits that provides increased power and computational efficiency by use of a Gaussian process regression with radial basis function (RBF) kernels to model the function-valued traits and specialized factorizations to achieve speed. A Gaussian Process is assigned to each partition for each allele of a given single nucleotide polymorphism (SNP), which yields flexible alternative models and handles a large number of data points in a way that is statistically and computationally efficient. This model provides techniques for handling missing and unaligned function values, such as would occur when not all individuals are measured at the same time points. If the data is complete, algebraic re-factorization by decomposition into Kronecker products reduces the time complexity of this model, thereby increasing processing speed and reducing memory usage as compared to a naive implementation.
    Type: Grant
    Filed: March 30, 2016
    Date of Patent: November 6, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nicolo Fusi, Jennifer Listgarten
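The Kronecker re-factorization mentioned in the abstract above rests on a standard identity: with a row-major vec, (A ⊗ B) vec(X) = vec(A X Bᵀ), so products and solves against a Kronecker-structured kernel matrix reduce to operations on the two small factors. The helpers below are a generic illustration of that identity, not the full GP model.

```python
import numpy as np

def kron_matvec(A, B, x):
    """Apply (A kron B) to a vector without forming the Kronecker
    product: reshape x to a matrix X and use
    (A kron B) vec(X) = vec(A @ X @ B.T) under row-major vec."""
    n, m = A.shape[0], B.shape[0]
    X = x.reshape(n, m)
    return (A @ X @ B.T).reshape(-1)

def kron_solve(A, B, y):
    """Solve (A kron B) x = y with two small solves instead of one
    big one: X = A^{-1} Y B^{-T}."""
    n, m = A.shape[0], B.shape[0]
    Y = y.reshape(n, m)
    return np.linalg.solve(A, np.linalg.solve(B, Y.T).T).reshape(-1)
```

For kernel matrices of sizes n and m, this turns an O((nm)³) solve into two solves of O(n³) and O(m³), which is the complexity reduction the abstract claims for complete data.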
  • Publication number: 20180157971
    Abstract: In embodiments of probabilistic matrix factorization for automated machine learning, a computing system memory maintains different workflows that each include preprocessing steps for a machine learning model, the machine learning model, and one or more parameters for the machine learning model. The computing system memory additionally maintains different data sets, upon which the different workflows can be trained and tested. A matrix is generated from the different workflows and different data sets, where cells of the matrix are populated with performance metrics that each indicate a measure of performance for a workflow applied to a data set. A low-rank decomposition of the matrix with populated performance metrics is then determined. Based on the low-rank decomposition, an optimum workflow for a new data set can be determined. The optimum workflow can be one of the different workflows or a hybrid of at least two of the different workflows.
    Type: Application
    Filed: December 5, 2016
    Publication date: June 7, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventor: Nicolo Fusi
  • Publication number: 20180157794
    Abstract: In embodiments of latent space harmonization (LSH) for predictive modeling, different training data sets are obtained from different measurement methods, where input data among the training data sets is quantifiable in a common space but a mapping between output data among the training data sets is unknown. A LSH module receives the training data sets and maps a common supervised target variable of the output data to a shared latent space where the output data can be jointly yielded. Mappings from the shared latent space back to the output training data of each training data set are determined and used to generate a trained predictive model. The trained predictive model is usable to predict output data from new input data with improved predictive power from the training data obtained using various, otherwise incongruent, measurement techniques.
    Type: Application
    Filed: December 2, 2016
    Publication date: June 7, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Nicolo Fusi, Jennifer Listgarten, Gregory Byer Darnell
  • Publication number: 20170357749
    Abstract: The techniques and/or systems described herein are directed to improvements in genomic prediction using homomorphic encryption. For example, a genomic model can be generated by a prediction service provider to predict a risk of a disease or a presence of genetic traits. Genomic data corresponding to a genetic profile of an individual can be batch encoded into a plurality of polynomials, homomorphically encrypted, and provided to a service provider for evaluation. The genomic model can be batch encoded as well, and the genetic prediction may be determined by evaluating a dot product of the genomic model data and the genomic data. A genomic prediction result value can be provided to a computing device associated with a user for subsequent decrypting and decoding. Homomorphic encoding and encryption can be used such that the genomic data may be applied to the prediction model and a result can be obtained without revealing any information about the model, the genomic data, or any genomic prediction.
    Type: Application
    Filed: June 10, 2016
    Publication date: December 14, 2017
    Inventors: Kim Laine, Nicolo Fusi, Ran Gilad-Bachrach, Kristin E. Lauter
  • Publication number: 20170286593
    Abstract: This disclosure presents a model for identifying correlations in genome-wide association studies (GWAS) with function-valued traits that provides increased power and computational efficiency by use of a Gaussian process regression with radial basis function (RBF) kernels to model the function-valued traits and specialized factorizations to achieve speed. A Gaussian Process is assigned to each partition for each allele of a given single nucleotide polymorphism (SNP), which yields flexible alternative models and handles a large number of data points in a way that is statistically and computationally efficient. This model provides techniques for handling missing and unaligned function values, such as would occur when not all individuals are measured at the same time points. If the data is complete, algebraic re-factorization by decomposition into Kronecker products reduces the time complexity of this model, thereby increasing processing speed and reducing memory usage as compared to a naive implementation.
    Type: Application
    Filed: March 30, 2016
    Publication date: October 5, 2017
    Inventors: Nicolo Fusi, Jennifer Listgarten
  • Publication number: 20170176956
    Abstract: A control system comprises an input configured to receive sensor data sensed from a target system to be controlled by the control system. The control system has an input-aware stacker, the input-aware stacker being a predictor; and a plurality of base predictors configured to compute base outputs from features of the sensor data. The input-aware stacker is input-aware in that it is configured to take as input the features as well as the base outputs to compute a prediction. The input-aware stacker is configured to compute the prediction from uncertainty data about the base outputs and/or from at least some combinations of the features of the sensor data. The control system has an output configured to send instructions to the target system on the basis of the computed prediction.
    Type: Application
    Filed: December 17, 2015
    Publication date: June 22, 2017
    Inventors: Nicolo Fusi, Jennifer Listgarten, Miriam Huntley
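The "input-aware" part of the abstract above is that the stacker sees the raw features alongside the base predictors' outputs. A minimal linear sketch, assuming least-squares fitting as the stacker (the filing leaves the stacker's predictor family open):

```python
import numpy as np

def stacker_design(X, base_outputs):
    """Input-aware design matrix: raw sensor features concatenated
    with the base predictors' outputs (uncertainty columns could be
    appended the same way), plus a bias column."""
    cols = [X] + [np.asarray(p).reshape(-1, 1) for p in base_outputs]
    Z = np.hstack(cols)
    return np.hstack([Z, np.ones((len(Z), 1))])

def fit_stacker(X, base_outputs, y):
    """Fit a linear input-aware stacker by least squares (a simple
    stand-in for whatever predictor the control system uses)."""
    w, *_ = np.linalg.lstsq(stacker_design(X, base_outputs), y, rcond=None)
    return w

def stacker_predict(w, X, base_outputs):
    return stacker_design(X, base_outputs) @ w
```

Because the features enter the design matrix directly, the stacker can down-weight a base predictor in regions of input space where it is unreliable, which is what distinguishes it from plain output-only stacking.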