Patents by Inventor Nicolo Fusi
Nicolo Fusi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230385247
Abstract: Generally discussed herein are devices, systems, and methods for machine learning (ML) by flowing a dataset towards a target dataset. A method can include receiving a request to operate on a first dataset including first (feature, label) pairs; identifying a second dataset from multiple datasets, the second dataset including second (feature, label) pairs; determining a distance between the first (feature, label) pairs and the second (feature, label) pairs; and flowing the first dataset using a dataset objective that operates based on the determined distance to generate an optimized dataset.
Type: Application
Filed: June 1, 2023
Publication date: November 30, 2023
Inventors: David ALVAREZ-MELIS, Nicolo FUSI
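The "flow" described in the abstract can be pictured as an iterative update that moves each point of the first dataset along a distance-based objective toward the second dataset. The abstract does not specify the distance or the objective, so the sketch below is an illustrative assumption: the function name `flow_dataset` is hypothetical, and a nearest-neighbor Euclidean objective over features stands in for whatever distance the patent actually uses.

```python
import numpy as np

def flow_dataset(X_src, X_tgt, steps=100, lr=0.1):
    """Illustrative 'flow': nudge each source point toward its nearest
    target point, shrinking the summed source-to-target distance."""
    X = X_src.astype(float).copy()
    for _ in range(steps):
        # pairwise squared distances between current points and target points
        d2 = ((X[:, None, :] - X_tgt[None, :, :]) ** 2).sum(-1)
        nearest = X_tgt[d2.argmin(axis=1)]
        X += lr * (nearest - X)   # gradient step on 0.5 * ||x - nearest||^2
    return X

rng = np.random.default_rng(0)
X_src = rng.normal(0.0, 1.0, size=(50, 2))   # dataset to be flowed
X_tgt = rng.normal(5.0, 1.0, size=(40, 2))   # target dataset
X_flowed = flow_dataset(X_src, X_tgt)
```

After enough steps the flowed points sit close to the target set, which is the intended effect of optimizing the dataset toward the target distribution.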
-
Patent number: 11709806
Abstract: Generally discussed herein are devices, systems, and methods for machine learning (ML) by flowing a dataset towards a target dataset. A method can include receiving a request to operate on a first dataset including first (feature, label) pairs; identifying a second dataset from multiple datasets, the second dataset including second (feature, label) pairs; determining a distance between the first (feature, label) pairs and the second (feature, label) pairs; and flowing the first dataset using a dataset objective that operates based on the determined distance to generate an optimized dataset.
Type: Grant
Filed: November 24, 2020
Date of Patent: July 25, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: David Alvarez-Melis, Nicolo Fusi
-
Publication number: 20230186150
Abstract: Generally discussed herein are devices, systems, and methods for identifying optimal hyperparameter values within a pre-defined budget. A method can include, while training a model for a number of iterations using values of a hyperparameter vector, recording objective function values of the objective function and cost function values of a cost function; fitting a function model to the objective function values and a cost model to the cost function values, resulting in a fitted function model and a fitted cost model; selecting a second hyperparameter vector; determining an optimal number of iterations to perform and after which to stop training using the second hyperparameter vector; re-training the model of the type of model for the optimal number of iterations using the second hyperparameter vector; and providing hyperparameter values, of the hyperparameter vector or the second hyperparameter vector, that maximize an objective defined by the objective function.
Type: Application
Filed: December 15, 2021
Publication date: June 15, 2023
Inventors: Syrine Belakaria, Rishit Sheth, Nicolo Fusi
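The core loop of the abstract, fitting surrogate models to recorded objective and cost values and then choosing how long to train under a budget, can be sketched minimally. The surrogates here (polynomial fits) and the function name are assumptions for illustration; the patent's actual function and cost models are not specified in the abstract.

```python
import numpy as np

def best_config_within_budget(configs, observe_obj, observe_cost, budget):
    """For each hyperparameter config, record objective/cost values over
    some iterations, fit simple surrogate models to both curves, and pick
    the (config, iteration) pair with the best predicted objective whose
    predicted cost stays within the budget."""
    best = None
    iters = np.arange(1, 21)
    for cfg in configs:
        obj = np.array([observe_obj(cfg, t) for t in iters])
        cost = np.array([observe_cost(cfg, t) for t in iters])
        obj_fit = np.poly1d(np.polyfit(iters, obj, 2))    # fitted function model
        cost_fit = np.poly1d(np.polyfit(iters, cost, 1))  # fitted cost model
        for t in iters:
            if cost_fit(t) <= budget and (best is None or obj_fit(t) > best[0]):
                best = (obj_fit(t), cfg, int(t))
    return best  # (predicted objective, config, optimal number of iterations)

# toy curves: objective saturates with iterations, cost grows linearly
obj_curve = lambda lr, t: 1.0 - np.exp(-lr * t)
cost_curve = lambda lr, t: 2.0 * t
choice = best_config_within_budget([0.1, 0.5], obj_curve, cost_curve, budget=20.0)
```

The returned iteration count is the "optimal number of iterations after which to stop training" for the winning hyperparameter vector.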
-
Publication number: 20230186094
Abstract: Examples of the present disclosure describe systems and methods for probabilistic neural network architecture generation. In an example, an underlying distribution over neural network architectures based on various parameters is sampled using probabilistic modeling. Training data is evaluated in order to iteratively update the underlying distribution, thereby generating a probability distribution over the neural network architectures. The distribution is iteratively trained until the parameters associated with the neural network architecture converge. Once it is determined that the parameters have converged, the resulting probability distribution may be used to generate a resulting neural network architecture. As a result, intermediate architectures need not be fully trained, which dramatically reduces memory usage and/or processing time.
Type: Application
Filed: February 9, 2023
Publication date: June 15, 2023
Applicant: Microsoft Technology Licensing, LLC
Inventors: Nicolo FUSI, Francesco Paolo CASALE, Jonathan GORDON
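A distribution over architectures that is sampled and iteratively updated, without fully training intermediate candidates, can be illustrated with a simple reinforcement-style update over a categorical distribution. Everything below (the softmax parameterization, the baseline, the reward model) is an assumed stand-in, not the patent's actual update rule.

```python
import numpy as np

def search(arch_scores, rounds=300, lr=0.3, seed=0):
    """Keep a softmax distribution over candidate architectures, sample one
    per round, observe a noisy validation reward for it, and reinforce it
    relative to a running baseline so probability mass concentrates on
    good candidates without fully training each one."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(len(arch_scores))
    baseline = 0.0
    for _ in range(rounds):
        p = np.exp(logits - logits.max())
        p /= p.sum()
        i = rng.choice(len(p), p=p)                    # sample an architecture
        reward = arch_scores[i] + rng.normal(0.0, 0.05)  # cheap, noisy evaluation
        logits[i] += lr * (reward - baseline)          # reinforce vs. baseline
        baseline += 0.1 * (reward - baseline)          # running average reward
    p = np.exp(logits - logits.max())
    return p / p.sum()

# hidden 'true' quality of three candidate architectures
probs = search([0.2, 0.9, 0.5])
```

Once the distribution has converged, the final architecture is read off from it (for example, its mode), matching the abstract's "resulting probability distribution may be used to generate a resulting neural network architecture".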
-
Patent number: 11604992
Abstract: Examples of the present disclosure describe systems and methods for probabilistic neural network architecture generation. In an example, an underlying distribution over neural network architectures based on various parameters is sampled using probabilistic modeling. Training data is evaluated in order to iteratively update the underlying distribution, thereby generating a probability distribution over the neural network architectures. The distribution is iteratively trained until the parameters associated with the neural network architecture converge. Once it is determined that the parameters have converged, the resulting probability distribution may be used to generate a resulting neural network architecture. As a result, intermediate architectures need not be fully trained, which dramatically reduces memory usage and/or processing time.
Type: Grant
Filed: November 2, 2018
Date of Patent: March 14, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Nicolo Fusi, Francesco Paolo Casale, Jonathan Gordon
-
Publication number: 20220245444
Abstract: Embodiments of the present disclosure include a system for optimizing an artificial neural network by configuring a model, based on a plurality of training parameters, to execute a training process, monitoring a plurality of statistics produced upon execution of the training process, and adjusting one or more of the training parameters, based on one or more of the statistics, to maintain at least one of the statistics within a predetermined range. In some embodiments, artificial intelligence (AI) processors may execute a training process on a model, the training process having an associated set of training parameters. Execution of the training process may produce a plurality of statistics. Control processor(s) coupled to the AI processor(s) may receive the statistics, and in accordance therewith, adjust one or more of the training parameters to maintain at least one of the statistics within a predetermined range during execution of the training process.
Type: Application
Filed: January 29, 2021
Publication date: August 4, 2022
Inventors: Maximilian Golub, Ritchie Zhao, Eric Chung, Douglas Burger, Bita Darvish Rouhani, Ge Yang, Nicolo Fusi
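The control loop in the abstract, monitor a statistic produced by training and adjust a training parameter to keep it in a predetermined range, reduces to a simple feedback controller. The stand-in "gradient norm" statistic and the halve/double adjustment rule below are illustrative assumptions, not the patent's actual controller.

```python
def run_training(steps=30, lr=1.0):
    """Toy control loop: each step produces a statistic (a stand-in
    'gradient norm' that scales with the learning rate); the controller
    adjusts the learning rate to keep the statistic inside [lo, hi]."""
    lo, hi = 0.5, 2.0
    history = []
    for _ in range(steps):
        grad_norm = 10.0 * lr      # statistic produced by one training step
        history.append(grad_norm)
        if grad_norm > hi:
            lr *= 0.5              # statistic too high: damp the parameter
        elif grad_norm < lo:
            lr *= 2.0              # statistic too low: boost the parameter
    return lr, history

final_lr, history = run_training()
```

Starting from lr = 1.0 the statistic is 10, so the controller halves the learning rate three times until the statistic (1.25) lands inside the target range and stays there.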
-
Publication number: 20220108168
Abstract: Aspects of the present disclosure relate to factorized neural network techniques. In examples, a layer of a machine learning model is factorized and initialized using spectral initialization. For example, an initial layer parameterized using an initial matrix is processed such that it is instead parameterized by the product of two or more matrices, thereby resulting in a factorized machine learning model. An optimizer associated with the machine learning model may also be processed to adapt a regularizer accordingly. For example, a regularizer using a weight decay function may be adapted to instead use a Frobenius decay function with respect to the factorized model layer. The factorized machine learning model may be trained using the processed optimizer and subsequently used to generate inferences.
Type: Application
Filed: December 2, 2020
Publication date: April 7, 2022
Applicant: Microsoft Technology Licensing, LLC
Inventors: Nicolo FUSI, Mikhail KHODAK, Neil Arturo TENENHOLTZ, Lester Wayne MACKEY, II
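The two ingredients named in the abstract, spectral initialization of a factorized layer and a Frobenius-decay regularizer on the product of the factors, can be sketched directly. The function names and the balanced square-root split of the singular values are illustrative choices.

```python
import numpy as np

def spectral_init(W, rank):
    """Factorize a layer's weight matrix W ~= U @ V using a truncated SVD
    (spectral initialization), splitting the singular values between the
    two factors."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    r = np.sqrt(s[:rank])
    return U[:, :rank] * r, (Vt[:rank, :].T * r).T

def frobenius_decay(U, V, lam=1e-2):
    """Frobenius decay: penalize the squared Frobenius norm of the
    *product* U @ V, rather than penalizing each factor separately as
    plain weight decay would."""
    return lam * np.sum((U @ V) ** 2)

W = np.random.default_rng(0).normal(size=(8, 6))
U, V = spectral_init(W, rank=6)    # full rank: exact reconstruction
U2, V2 = spectral_init(W, rank=2)  # compressed factorized layer
```

At full rank the product reproduces the original layer exactly; at lower rank it gives the compressed factorized parameterization the abstract describes.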
-
Publication number: 20220092037
Abstract: Generally discussed herein are devices, systems, and methods for machine learning (ML) by flowing a dataset towards a target dataset. A method can include receiving a request to operate on a first dataset including first (feature, label) pairs; identifying a second dataset from multiple datasets, the second dataset including second (feature, label) pairs; determining a distance between the first (feature, label) pairs and the second (feature, label) pairs; and flowing the first dataset using a dataset objective that operates based on the determined distance to generate an optimized dataset.
Type: Application
Filed: November 24, 2020
Publication date: March 24, 2022
Inventors: David Alvarez-Melis, Nicolo Fusi
-
Publication number: 20210210168
Abstract: In embodiments of latent space harmonization (LSH) for predictive modeling, different training data sets are obtained from different measurement methods, where input data among the training data sets is quantifiable in a common space but a mapping between output data among the training data sets is unknown. A LSH module receives the training data sets and maps a common supervised target variable of the output data to a shared latent space where the output data can be jointly yielded. Mappings from the shared latent space back to the output training data of each training data set are determined and used to generate a trained predictive model. The trained predictive model is useable to predict output data from new input data with improved predictive power from the training data obtained using various, otherwise incongruent, measurement techniques.
Type: Application
Filed: January 7, 2021
Publication date: July 8, 2021
Inventors: Nicolo Fusi, Jennifer Listgarten, Gregory Byer Darnell
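The LSH idea, pool data sets whose outputs live on incongruent scales by mapping them into a shared latent space, fitting one model there, and keeping inverse mappings back to each output scale, can be sketched with a deliberate simplification: the patent learns the latent mappings jointly, while this sketch assumes affine, sign-consistent measurement maps and uses per-dataset standardization as a stand-in.

```python
import numpy as np

def harmonize_and_fit(X_a, y_a, X_b, y_b):
    """Standardize each data set's outputs to a shared latent scale, pool,
    fit one predictive model on the latent target, and return the inverse
    standardizations that map latent predictions back to each scale."""
    za = (y_a - y_a.mean()) / y_a.std()
    zb = (y_b - y_b.mean()) / y_b.std()
    X = np.vstack([X_a, X_b])
    z = np.concatenate([za, zb])
    A = np.c_[X, np.ones(len(X))]
    w, *_ = np.linalg.lstsq(A, z, rcond=None)   # model: inputs -> shared latent
    back_a = lambda v: v * y_a.std() + y_a.mean()  # latent -> scale of method A
    back_b = lambda v: v * y_b.std() + y_b.mean()  # latent -> scale of method B
    return w, back_a, back_b

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
z_true = X @ np.array([1.0, -2.0, 0.5])
# two 'measurement methods' report the same latent trait on different scales
X_a, y_a = X[:100], 2.0 * z_true[:100] + 1.0
X_b, y_b = X[100:], 3.0 * z_true[100:] + 5.0
w, back_a, back_b = harmonize_and_fit(X_a, y_a, X_b, y_b)
pred = np.c_[X, np.ones(len(X))] @ w
```

Because both data sets contribute to one latent-space fit, the model gains predictive power from measurements that would otherwise be incongruent.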
-
Patent number: 10923213
Abstract: In embodiments of latent space harmonization (LSH) for predictive modeling, different training data sets are obtained from different measurement methods, where input data among the training data sets is quantifiable in a common space but a mapping between output data among the training data sets is unknown. A LSH module receives the training data sets and maps a common supervised target variable of the output data to a shared latent space where the output data can be jointly yielded. Mappings from the shared latent space back to the output training data of each training data set are determined and used to generate a trained predictive model. The trained predictive model is useable to predict output data from new input data with improved predictive power from the training data obtained using various, otherwise incongruent, measurement techniques.
Type: Grant
Filed: December 2, 2016
Date of Patent: February 16, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Nicolo Fusi, Jennifer Listgarten, Gregory Byer Darnell
-
Patent number: 10762163
Abstract: In embodiments of probabilistic matrix factorization for automated machine learning, a computing system memory maintains different workflows that each include preprocessing steps for a machine learning model, the machine learning model, and one or more parameters for the machine learning model. The computing system memory additionally maintains different data sets, upon which the different workflows can be trained and tested. A matrix is generated from the different workflows and different data sets, where cells of the matrix are populated with performance metrics that each indicate a measure of performance for a workflow applied to a data set. A low-rank decomposition of the matrix with populated performance metrics is then determined. Based on the low-rank decomposition, an optimum workflow for a new data set can be determined. The optimum workflow can be one of the different workflows or a hybrid of at least two of the different workflows.
Type: Grant
Filed: December 5, 2016
Date of Patent: September 1, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventor: Nicolo Fusi
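The central object here is a workflow-by-dataset performance matrix with missing cells (not every workflow has been run on every data set), completed via a low-rank decomposition. A minimal sketch, assuming plain gradient-descent matrix factorization as the decomposition method (the patent's probabilistic formulation would put distributions over the factors):

```python
import numpy as np

def complete_matrix(M, mask, rank=1, steps=3000, lr=0.1, seed=0):
    """Low-rank completion of a (workflow x dataset) performance matrix:
    fit M ~= U @ V on the observed cells only, then read off predictions
    for the unobserved cells from the reconstructed matrix."""
    rng = np.random.default_rng(seed)
    n, m = M.shape
    U = 0.1 * rng.normal(size=(n, rank))
    V = 0.1 * rng.normal(size=(rank, m))
    for _ in range(steps):
        E = mask * (U @ V - M)     # reconstruction error on observed cells only
        U -= lr * (E @ V.T)
        V -= lr * (U.T @ E)
    return U @ V

u = np.array([0.2, 0.4, 0.6, 0.8])
M = np.outer(u, u)                 # a rank-1 'performance' matrix
mask = np.ones_like(M)
mask[3, 3] = 0.0                   # this workflow/dataset pair was never run
pred = complete_matrix(M * mask, mask)
```

The completed cell `pred[3, 3]` predicts how the unseen workflow/dataset pairing would perform, which is exactly what lets the system recommend an optimum workflow for a new data set.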
-
Publication number: 20200143231
Abstract: Examples of the present disclosure describe systems and methods for probabilistic neural network architecture generation. In an example, an underlying distribution over neural network architectures based on various parameters is sampled using probabilistic modeling. Training data is evaluated in order to iteratively update the underlying distribution, thereby generating a probability distribution over the neural network architectures. The distribution is iteratively trained until the parameters associated with the neural network architecture converge. Once it is determined that the parameters have converged, the resulting probability distribution may be used to generate a resulting neural network architecture. As a result, intermediate architectures need not be fully trained, which dramatically reduces memory usage and/or processing time.
Type: Application
Filed: November 2, 2018
Publication date: May 7, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Nicolo FUSI, Francesco Paolo CASALE, Jonathan GORDON
-
Publication number: 20190347548
Abstract: Systems and methods for selecting a neural network for a machine learning problem are disclosed. A method includes accessing an input matrix. The method includes accessing a machine learning problem space associated with a machine learning problem and multiple untrained candidate neural networks for solving the machine learning problem. The method includes computing, for each untrained candidate neural network, at least one expressivity measure capturing an expressivity of the candidate neural network with respect to the machine learning problem. The method includes computing, for each untrained candidate neural network, at least one trainability measure capturing a trainability of the candidate neural network with respect to the machine learning problem. The method includes selecting, based on the at least one expressivity measure and the at least one trainability measure, at least one candidate neural network for solving the machine learning problem.
Type: Application
Filed: May 10, 2018
Publication date: November 14, 2019
Inventors: Saeed Amizadeh, Ge Yang, Nicolo Fusi, Francesco Paolo Casale
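The abstract does not define its expressivity and trainability measures, so the proxies below are purely hypothetical stand-ins: output variance over random probes for expressivity, and a finite-difference input sensitivity for trainability, both computed on untrained ReLU networks, followed by selection on a combined score.

```python
import numpy as np

def measures(widths, n_probe=64, seed=0):
    """Hypothetical proxy measures on an untrained ReLU MLP: expressivity
    as the standard deviation of outputs over random probe inputs, and
    trainability as a finite-difference input-sensitivity estimate."""
    rng = np.random.default_rng(seed)
    Ws = [rng.normal(0, np.sqrt(2.0 / a), size=(a, b))
          for a, b in zip(widths[:-1], widths[1:])]
    X = rng.normal(size=(n_probe, widths[0]))

    def forward(H):
        for W in Ws[:-1]:
            H = np.maximum(H @ W, 0)   # ReLU hidden layers
        return H @ Ws[-1]              # linear output layer

    out = forward(X)
    eps = 1e-4
    sensitivity = np.abs((forward(X + eps) - out) / eps).mean()
    return out.std(), sensitivity

candidates = {"narrow": [4, 8, 1], "wide": [4, 64, 64, 1]}
scores = {k: measures(w) for k, w in candidates.items()}
best = max(scores, key=lambda k: scores[k][0] * scores[k][1])  # combined score
```

Because the candidates are scored untrained, selection is cheap; any monotone combination of the two measures could replace the product used here.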
-
Patent number: 10296709
Abstract: The techniques and/or systems described herein are directed to improvements in genomic prediction using homomorphic encryption. For example, a genomic model can be generated by a prediction service provider to predict a risk of a disease or a presence of genetic traits. Genomic data corresponding to a genetic profile of an individual can be batch encoded into a plurality of polynomials, homomorphically encrypted, and provided to a service provider for evaluation. The genomic model can be batch encoded as well, and the genetic prediction may be determined by evaluating a dot product of the genomic model data and the genomic data. A genomic prediction result value can be provided to a computing device associated with a user for subsequent decrypting and decoding. Homomorphic encoding and encryption can be used such that the genomic data may be applied to the prediction model and a result can be obtained without revealing any information about the model, the genomic data, or any genomic prediction.
Type: Grant
Filed: June 10, 2016
Date of Patent: May 21, 2019
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Kim Laine, Nicolo Fusi, Ran Gilad-Bachrach, Kristin E. Lauter
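The abstract's polynomial batch encoding points to a lattice-based scheme; as a dependency-free stand-in, the sketch below uses a toy Paillier cryptosystem (deliberately insecure key size) to show the same protocol shape: the server evaluates a model dot product on encrypted genotype values it cannot read, and only the client can decrypt the prediction.

```python
import math
import random

# Toy Paillier cryptosystem: ciphertexts support addition and plaintext
# scalar multiplication, which is enough to evaluate a dot product.
p, q = 293, 433                                # toy primes: NOT secure
n = p * q
n2 = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)

def enc(m):
    """Encrypt plaintext m (client side)."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    """Decrypt ciphertext c (client side, holds the secret key)."""
    return (((pow(c, lam, n2) - 1) // n) * mu) % n

def add(c1, c2):
    return (c1 * c2) % n2        # Enc(m1) * Enc(m2) = Enc(m1 + m2)

def smul(c, k):
    return pow(c, k, n2)         # Enc(m)^k = Enc(k * m)

# Server-side evaluation: encrypted genotype, plaintext model weights.
genotype = [0, 1, 2, 1]          # minor-allele counts (client data)
weights = [3, 5, 2, 7]           # genomic model (server data)
cts = [enc(x) for x in genotype]
acc = enc(0)
for c, w in zip(cts, weights):
    acc = add(acc, smul(c, w))   # accumulate encrypted w * x terms
```

The server returns `acc`; decrypting it yields the dot product 0*3 + 1*5 + 2*2 + 1*7 = 16 without the server ever seeing the genotype or the client seeing the model weights applied per-term.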
-
Patent number: 10120975
Abstract: This disclosure presents a model for identifying correlations in genome-wide association studies (GWAS) with function-valued traits that provides increased power and computational efficiency by use of a Gaussian process regression with radial basis function (RBF) kernels to model the function-valued traits and specialized factorizations to achieve speed. A Gaussian process is assigned to each partition for each allele of a given single nucleotide polymorphism (SNP), which yields flexible alternative models and handles a large number of data points in a way that is statistically and computationally efficient. This model provides techniques for handling missing and unaligned function values, such as would occur when not all individuals are measured at the same time points. If the data is complete, algebraic re-factorization by decomposition into Kronecker products reduces the time complexity of this model, thereby increasing processing speed and reducing memory usage as compared to a naive implementation.
Type: Grant
Filed: March 30, 2016
Date of Patent: November 6, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Nicolo Fusi, Jennifer Listgarten
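The building block assigned to each allele partition is GP regression with an RBF kernel over measurement times, which also explains how missing time points are handled: the posterior mean is defined at any time, measured or not. A minimal sketch of that building block (the Kronecker-product speedups for complete data are omitted):

```python
import numpy as np

def rbf(t1, t2, ls=1.0):
    """Radial basis function kernel over 1-D measurement times."""
    return np.exp(-0.5 * (t1[:, None] - t2[None, :]) ** 2 / ls ** 2)

def gp_predict(t_train, y_train, t_test, noise=1e-2):
    """Posterior mean of Gaussian process regression with an RBF kernel:
    predicts the function-valued trait at arbitrary (possibly unmeasured)
    time points."""
    K = rbf(t_train, t_train) + noise * np.eye(len(t_train))
    return rbf(t_test, t_train) @ np.linalg.solve(K, y_train)

t_train = np.linspace(0.0, 6.0, 25)    # times at which a trait was measured
y_train = np.sin(t_train)              # the observed function-valued trait
t_test = np.array([1.5, 3.0, 4.5])     # times with missing measurements
pred = gp_predict(t_train, y_train, t_test)
```

Because the kernel compares raw time values, individuals measured at unaligned time points can share one model without any interpolation step.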
-
Publication number: 20180157971
Abstract: In embodiments of probabilistic matrix factorization for automated machine learning, a computing system memory maintains different workflows that each include preprocessing steps for a machine learning model, the machine learning model, and one or more parameters for the machine learning model. The computing system memory additionally maintains different data sets, upon which the different workflows can be trained and tested. A matrix is generated from the different workflows and different data sets, where cells of the matrix are populated with performance metrics that each indicate a measure of performance for a workflow applied to a data set. A low-rank decomposition of the matrix with populated performance metrics is then determined. Based on the low-rank decomposition, an optimum workflow for a new data set can be determined. The optimum workflow can be one of the different workflows or a hybrid of at least two of the different workflows.
Type: Application
Filed: December 5, 2016
Publication date: June 7, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventor: Nicolo Fusi
-
Publication number: 20180157794
Abstract: In embodiments of latent space harmonization (LSH) for predictive modeling, different training data sets are obtained from different measurement methods, where input data among the training data sets is quantifiable in a common space but a mapping between output data among the training data sets is unknown. A LSH module receives the training data sets and maps a common supervised target variable of the output data to a shared latent space where the output data can be jointly yielded. Mappings from the shared latent space back to the output training data of each training data set are determined and used to generate a trained predictive model. The trained predictive model is useable to predict output data from new input data with improved predictive power from the training data obtained using various, otherwise incongruent, measurement techniques.
Type: Application
Filed: December 2, 2016
Publication date: June 7, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Nicolo Fusi, Jennifer Listgarten, Gregory Byer Darnell
-
Publication number: 20170357749
Abstract: The techniques and/or systems described herein are directed to improvements in genomic prediction using homomorphic encryption. For example, a genomic model can be generated by a prediction service provider to predict a risk of a disease or a presence of genetic traits. Genomic data corresponding to a genetic profile of an individual can be batch encoded into a plurality of polynomials, homomorphically encrypted, and provided to a service provider for evaluation. The genomic model can be batch encoded as well, and the genetic prediction may be determined by evaluating a dot product of the genomic model data and the genomic data. A genomic prediction result value can be provided to a computing device associated with a user for subsequent decrypting and decoding. Homomorphic encoding and encryption can be used such that the genomic data may be applied to the prediction model and a result can be obtained without revealing any information about the model, the genomic data, or any genomic prediction.
Type: Application
Filed: June 10, 2016
Publication date: December 14, 2017
Inventors: Kim Laine, Nicolo Fusi, Ran Gilad-Bachrach, Kristin E. Lauter
-
Publication number: 20170286593
Abstract: This disclosure presents a model for identifying correlations in genome-wide association studies (GWAS) with function-valued traits that provides increased power and computational efficiency by use of a Gaussian process regression with radial basis function (RBF) kernels to model the function-valued traits and specialized factorizations to achieve speed. A Gaussian process is assigned to each partition for each allele of a given single nucleotide polymorphism (SNP), which yields flexible alternative models and handles a large number of data points in a way that is statistically and computationally efficient. This model provides techniques for handling missing and unaligned function values, such as would occur when not all individuals are measured at the same time points. If the data is complete, algebraic re-factorization by decomposition into Kronecker products reduces the time complexity of this model, thereby increasing processing speed and reducing memory usage as compared to a naive implementation.
Type: Application
Filed: March 30, 2016
Publication date: October 5, 2017
Inventors: Nicolo Fusi, Jennifer Listgarten
-
Publication number: 20170176956
Abstract: A control system comprises an input configured to receive sensor data sensed from a target system to be controlled by the control system. The control system has an input-aware stacker, the input-aware stacker being a predictor; and a plurality of base predictors configured to compute base outputs from features of the sensor data. The input-aware stacker is input-aware in that it is configured to take as input the features as well as the base outputs to compute a prediction. The input-aware stacker is configured to compute the prediction from uncertainty data about the base outputs and/or from at least some combinations of the features of the sensor data. The control system has an output configured to send instructions to the target system on the basis of the computed prediction.
Type: Application
Filed: December 17, 2015
Publication date: June 22, 2017
Inventors: Nicolo Fusi, Jennifer Listgarten, Miriam Huntley
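The defining property of the input-aware stacker, it takes the raw features as well as the base predictors' outputs as input, is easy to show concretely. A minimal sketch, assuming least-squares as the stacker (the abstract also allows using uncertainty data about the base outputs, which is omitted here):

```python
import numpy as np

def fit_input_aware_stacker(X, y, base_predictors):
    """Fit a stacker whose design matrix contains both the raw input
    features and the base predictors' outputs, via least squares."""
    def design(Xn):
        B = np.column_stack([p(Xn) for p in base_predictors])
        return np.c_[Xn, B, np.ones(len(Xn))]   # features + base outputs + bias
    w, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return lambda Xn: design(Xn) @ w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # features of the sensor data
y = X[:, 0] + 2.0 * X[:, 1]                    # quantity the stacker should predict
base_predictors = [lambda Xn: Xn[:, 0],        # base predictors, each seeing
                   lambda Xn: 0.5 * Xn[:, 1]]  # only part of the signal
stack = fit_input_aware_stacker(X, y, base_predictors)
pred = stack(X)
```

Because the stacker sees the features directly, it can correct systematic errors of the base predictors instead of merely reweighting their outputs; the resulting prediction is what the control system would turn into instructions for the target system.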