Patents by Inventor Bita Darvish Rouhani
Bita Darvish Rouhani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220405571
Abstract: Embodiments of the present disclosure include systems and methods for sparsifying narrow data formats for neural networks. A plurality of activation values in a neural network are provided to a muxing unit. A set of sparsification operations are performed on a plurality of weight values to generate a subset of the plurality of weight values and mask values associated with the plurality of weight values. The subset of the plurality of weight values is provided to a matrix multiplication unit. The muxing unit generates a subset of the plurality of activation values based on the mask values and provides the subset of the plurality of activation values to the matrix multiplication unit. The matrix multiplication unit performs a set of matrix multiplication operations on the subset of the plurality of weight values and the subset of the plurality of activation values to generate a set of outputs.
Type: Application
Filed: June 16, 2021
Publication date: December 22, 2022
Inventors: Bita Darvish Rouhani, Venmugil Elango, Eric S. Chung, Douglas C. Burger, Mattheus C. Heddes, Nishit Shah, Rasoul Shafipour, Ankit More
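The claimed dataflow pairs a pruned weight subset with a mask that the muxing unit uses to gather the matching activations. Below is a minimal NumPy sketch of that pairing; the block size, the 2-of-4 keep ratio, and top-k magnitude selection are illustrative assumptions, not details from the filing.

```python
import numpy as np

def sparsify_weights(w, block=4, keep=2):
    """Keep the `keep` largest-magnitude weights in each `block`-sized group;
    return the compacted weight subset plus a boolean mask."""
    w = w.reshape(-1, block)
    idx = np.argsort(-np.abs(w), axis=1)[:, :keep]   # top-k per block
    mask = np.zeros_like(w, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return w[mask].reshape(-1, keep), mask.reshape(-1)

def mux_activations(a, mask, keep=2):
    """'Muxing unit': select the activations paired with surviving weights."""
    return a[mask].reshape(-1, keep)

rng = np.random.default_rng(0)
w, a = rng.normal(size=16), rng.normal(size=16)
w_sub, mask = sparsify_weights(w)
a_sub = mux_activations(a, mask)
# The compacted dot product matches the dense product of the masked weights.
print(np.sum(w_sub * a_sub), np.dot(np.where(mask, w, 0.0), a))
```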
-
Patent number: 11526601
Abstract: A method for detecting and/or preventing an adversarial attack against a target machine learning model may be provided. The method may include training, based at least on training data, a defender machine learning model to enable the defender machine learning model to identify malicious input samples. The trained defender machine learning model may be deployed at the target machine learning model. The trained defender machine learning model may be coupled with the target machine learning model to at least determine whether an input sample received at the target machine learning model is a malicious input sample and/or a legitimate input sample. Related systems and articles of manufacture, including computer program products, are also provided.
Type: Grant
Filed: July 12, 2018
Date of Patent: December 13, 2022
Assignee: The Regents of the University of California
Inventors: Bita Darvish Rouhani, Tara Javidi, Farinaz Koushanfar, Mohammad Samragh Razlighi
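A minimal sketch of the defender/target coupling, assuming the defender is a simple binary classifier (here sklearn's LogisticRegression) trained on legitimate versus perturbed inputs; the patent does not prescribe a model family, and the shifted-noise "malicious" data is a crude stand-in for real adversarial examples.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
legit = rng.normal(0.0, 1.0, size=(200, 10))
malicious = legit + rng.normal(0.0, 0.5, size=legit.shape) + 1.5  # stand-in perturbations

X = np.vstack([legit, malicious])
y = np.r_[np.zeros(200), np.ones(200)]           # 1 = malicious
defender = LogisticRegression(max_iter=1000).fit(X, y)

def guarded_predict(target_model, x):
    """Route the input through the defender before the target model sees it."""
    if defender.predict(x.reshape(1, -1))[0] == 1:
        return None                               # flag / reject malicious input
    return target_model(x)

print(guarded_predict(lambda x: x.sum(), legit[0]))      # passed through
print(guarded_predict(lambda x: x.sum(), malicious[0]))  # rejected (None)
```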
-
Publication number: 20220383123
Abstract: Embodiments of the present disclosure include systems and methods for performing data-aware model pruning for neural networks. During a training phase, a neural network is trained with a first set of data. During a validation phase, inference with the neural network is performed using a second set of data that causes the neural network to generate a first set of outputs at a layer in the neural network. During the validation phase, a plurality of mean values and a plurality of variance values are calculated based on the first set of outputs. A plurality of entropy values are calculated based on the plurality of mean values and the plurality of variance values. A second set of outputs are pruned based on the plurality of entropy values. The second set of outputs are generated by the layer of the neural network using a third set of data.
Type: Application
Filed: May 28, 2021
Publication date: December 1, 2022
Inventors: Venmugil Elango, Bita Darvish Rouhani, Eric S. Chung, Douglas C. Burger, Maximilian Golub
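A sketch of the validation-phase statistics described above, assuming each channel's outputs are modeled as a Gaussian so that entropy follows from the variance as 0.5·log(2πeσ²) (under that model the mean drops out of the entropy); the 25% prune fraction is an assumption.

```python
import numpy as np

def entropy_prune_mask(val_outputs, prune_frac=0.25):
    """val_outputs: (num_samples, num_channels) layer outputs from validation."""
    mean = val_outputs.mean(axis=0)          # computed per the claim; under a
    var = val_outputs.var(axis=0) + 1e-12    # Gaussian model only var enters h
    entropy = 0.5 * np.log(2 * np.pi * np.e * var)
    k = int(prune_frac * val_outputs.shape[1])
    low = np.argsort(entropy)[:k]            # lowest-entropy channels
    mask = np.ones(val_outputs.shape[1], dtype=bool)
    mask[low] = False
    return mask

rng = np.random.default_rng(0)
acts = rng.normal(size=(512, 64)) * np.linspace(0.01, 1.0, 64)  # varied variance
mask = entropy_prune_mask(acts)
print(mask.sum(), "of", mask.size, "channels kept")
# At inference on new data, the layer's outputs are gated with `mask`.
```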
-
Publication number: 20220383092
Abstract: Embodiments of the present disclosure include systems and methods for reducing computational cost associated with training a neural network model. A neural network model is received and a neural network training process is executed in which the neural network model is trained according to a first fidelity during a first training phase. As a result of a determination that training of the neural network model during the first training phase satisfies one or more criteria, the neural network model is trained at a second fidelity during a second training phase, the second fidelity being a higher fidelity than the first fidelity.
Type: Application
Filed: May 25, 2021
Publication date: December 1, 2022
Inventors: Ritchie Zhao, Bita Darvish Rouhani, Eric S. Chung, Douglas C. Burger, Maximilian Golub
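A schematic of the two-phase switch, reading "fidelity" as numeric precision and the switch criterion as a loss plateau; both readings are illustrative assumptions. A toy quadratic stands in for the network: low-fidelity gradients stall short of the optimum, the plateau triggers the switch, and training finishes at high fidelity.

```python
import numpy as np

def quantize(x, bits):
    scale = 2.0 ** (bits - 1)
    return np.round(x * scale) / scale

def train(steps=200, patience=10):
    w, bits, history = np.array([2.0]), 4, []
    for step in range(steps):
        grad = 2 * (quantize(w, bits) - 1.0)   # low-fidelity gradient of (w-1)^2
        w = w - 0.1 * grad
        history.append(float((w[0] - 1.0) ** 2))
        plateaued = len(history) > patience and \
            abs(history[-1] - history[-patience]) < 1e-6
        if bits == 4 and plateaued:
            bits = 23                          # criterion met: raise fidelity
    return float(w[0]), bits

print(train())   # converges to ~1.0 only after the fidelity switch
```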
-
Publication number: 20220366236
Abstract: Embodiments of the present disclosure include systems and methods for reducing operations for training neural networks. A plurality of training data selected from a training data set is used as a plurality of inputs for training a neural network. The neural network includes a plurality of weights. A plurality of loss values are determined based on outputs generated by the neural network and expected output data of the plurality of training data. A subset of the plurality of loss values is determined. An average loss value is determined based on the subset of the plurality of loss values. A set of gradients is calculated based on the average loss value and the plurality of weights in the neural network. The plurality of weights in the neural network are adjusted based on the set of gradients.
Type: Application
Filed: May 17, 2021
Publication date: November 17, 2022
Inventors: Maral Mesmakhosroshahi, Bita Darvish Rouhani, Eric S. Chung, Douglas C. Burger
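A short PyTorch sketch of averaging only a subset of per-sample losses before the backward pass. Selecting the top-k largest losses is an assumption; the abstract only requires that some subset be chosen.

```python
import torch

model = torch.nn.Linear(8, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 8), torch.randint(0, 2, (32,))

losses = torch.nn.functional.cross_entropy(model(x), y, reduction="none")
subset, _ = torch.topk(losses, k=8)   # keep 8 of the 32 loss values
avg_loss = subset.mean()              # average over the subset only
avg_loss.backward()                   # gradients w.r.t. the weights
opt.step()                            # adjust the weights
print(avg_loss.item())
```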
-
Patent number: 11494614
Abstract: Perplexity scores are computed for training data samples during ANN training. Perplexity scores can be computed as a divergence between data defining a class associated with a current training data sample and a probability vector generated by the ANN model. Perplexity scores can alternately be computed by learning a probability density function ("PDF") fitting activation maps generated by an ANN model during training. A perplexity score can then be computed for a current training data sample by computing a probability for the current training data sample based on the PDF. If the perplexity score for a training data sample is lower than a threshold, the training data sample is removed from the training data set so that it will not be utilized for training during subsequent epochs. Training of the ANN model continues following the removal of training data samples from the training data set.
Type: Grant
Filed: March 20, 2019
Date of Patent: November 8, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Eric S. Chung, Douglas C. Burger, Bita Darvish Rouhani
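A sketch of the divergence-based variant: taking the score for a sample as the KL divergence between its one-hot label and the model's probability vector (which reduces to -log p[label]), then dropping low-scoring samples from later epochs. The 0.1 threshold is an assumption.

```python
import numpy as np

def perplexity_scores(probs, labels, eps=1e-9):
    """KL(one_hot(label) || probs) reduces to -log probs[label]."""
    return -np.log(probs[np.arange(len(labels)), labels] + eps)

probs = np.array([[0.98, 0.02],    # confidently correct -> low score
                  [0.60, 0.40],
                  [0.10, 0.90]])   # confidently wrong -> high score
labels = np.array([0, 0, 0])
scores = perplexity_scores(probs, labels)
keep = scores >= 0.1               # "easy" samples leave the training set
print(scores.round(3), keep)
```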
-
Publication number: 20220253281
Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.
Type: Application
Filed: June 28, 2021
Publication date: August 11, 2022
Inventors: Bita Darvish Rouhani, Venmugil Elango, Rasoul Shafipour, Jeremy Fowers, Ming Gang Liu, Jinwen Xi, Douglas C. Burger, Eric S. Chung
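A sketch of the two-level layout, assuming each shared exponent is the maximum exponent of its subgroup and the third (outer) exponent is the maximum of the two; the abstract leaves the exact reduction unspecified. Signed integer mantissas carry the sign and mantissa values together here.

```python
import numpy as np

def hierarchical_bfp(group_a, group_b, mantissa_bits=7):
    e_a = int(np.max(np.frexp(group_a)[1]))    # first shared exponent
    e_b = int(np.max(np.frexp(group_b)[1]))    # second shared exponent
    e_shared = max(e_a, e_b)                   # third (outer) shared exponent
    diffs = (e_shared - e_a, e_shared - e_b)   # first and second differences
    def pack(vals, e):
        return np.round(vals / 2.0 ** (e - mantissa_bits)).astype(int)
    # Stored structure: one outer exponent, two small differences, mantissas.
    return {"exp": e_shared, "diffs": diffs,
            "mantissas": (pack(group_a, e_a), pack(group_b, e_b))}

def decode(packed, mantissa_bits=7):
    e_a = packed["exp"] - packed["diffs"][0]   # reconstruct subgroup exponents
    e_b = packed["exp"] - packed["diffs"][1]
    m_a, m_b = packed["mantissas"]
    return m_a * 2.0 ** (e_a - mantissa_bits), m_b * 2.0 ** (e_b - mantissa_bits)

packed = hierarchical_bfp(np.array([0.5, -0.12]), np.array([3.25, 1.0]))
print(packed["exp"], packed["diffs"], decode(packed))
```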
-
Publication number: 20220245444
Abstract: Embodiments of the present disclosure include a system for optimizing an artificial neural network by configuring a model, based on a plurality of training parameters, to execute a training process, monitoring a plurality of statistics produced upon execution of the training process, and adjusting one or more of the training parameters, based on one or more of the statistics, to maintain at least one of the statistics within a predetermined range. In some embodiments, artificial intelligence (AI) processors may execute a training process on a model, the training process having an associated set of training parameters. Execution of the training process may produce a plurality of statistics. Control processor(s) coupled to the AI processor(s) may receive the statistics, and in accordance therewith, adjust one or more of the training parameters to maintain at least one of the statistics within a predetermined range during execution of the training process.
Type: Application
Filed: January 29, 2021
Publication date: August 4, 2022
Inventors: Maximilian Golub, Ritchie Zhao, Eric Chung, Douglas Burger, Bita Darvish Rouhani, Ge Yang, Nicolo Fusi
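A schematic control loop in the spirit of the claim: one training statistic (here, the fraction of saturated gradient values, a hypothetical choice) is monitored, and a training parameter (the loss scale) is nudged to hold that statistic inside a target range. The specific statistic, parameter, and range are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
loss_scale, target = 1024.0, (0.0, 0.01)     # keep saturation rate under 1%

for step in range(20):
    grads = rng.normal(scale=loss_scale * 1e-4, size=10_000)  # stand-in gradients
    sat_rate = np.mean(np.abs(grads) > 1.0)  # monitored statistic
    if sat_rate > target[1]:
        loss_scale /= 2.0                    # statistic too high: back off
    elif sat_rate == 0.0:
        loss_scale *= 1.05                   # comfortably in range: room to grow
print(round(loss_scale, 1))
```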
-
Patent number: 11386326
Abstract: A method may include transforming a trained machine learning model by replacing at least one layer of the trained machine learning model with a dictionary matrix and a coefficient matrix. The dictionary matrix and the coefficient matrix may be formed by decomposing a weight matrix associated with the at least one layer of the trained machine learning model. A product of the dictionary matrix and the coefficient matrix may form a reduced-dimension representation of the weight matrix associated with the at least one layer of the trained machine learning model. The transformed machine learning model may be deployed to a client. Related systems and computer program products are also provided.
Type: Grant
Filed: June 11, 2019
Date of Patent: July 12, 2022
Assignee: The Regents of the University of California
Inventors: Fang Lin, Mohammad Ghasemzadeh, Bita Darvish Rouhani, Farinaz Koushanfar
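A sketch of the layer transformation using truncated SVD, one standard way to produce a dictionary matrix and coefficient matrix whose product is a reduced-dimension representation of the original weights; the rank of 64 is an illustrative choice, and the patent may use a different decomposition.

```python
import numpy as np

def decompose_layer(W, rank):
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    dictionary = U[:, :rank] * s[:rank]   # (out_dim, rank)
    coeff = Vt[:rank]                     # (rank, in_dim)
    return dictionary, coeff

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512))
D, C = decompose_layer(W, rank=64)
x = rng.normal(size=512)
# The original layer y = W @ x is replaced by two smaller ones: y ~ D @ (C @ x).
print(np.linalg.norm(W @ x - D @ (C @ x)) / np.linalg.norm(W @ x))
```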
-
Publication number: 20210295166
Abstract: A system may include a processor and a memory. The memory may include program code that provides operations when executed by the processor. The operations may include: partitioning, based at least on a resource constraint of a platform, a global machine learning model into a plurality of local machine learning models; transforming training data to at least conform to the resource constraint of the platform; and training the global machine learning model by at least processing, at the platform, the transformed training data with a first of the plurality of local machine learning models.
Type: Application
Filed: February 6, 2017
Publication date: September 23, 2021
Applicant: William Marsh Rice University
Inventors: Bita Darvish Rouhani, Azalia Mirhoseini, Farinaz Koushanfar
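A loose sketch under strong assumptions: the data transformation is taken to be a random projection that shrinks inputs to fit a dimensionality budget, and the partition is modeled as per-shard local models; the filing describes the general framework, not these specific choices.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 400))
y = (X @ rng.normal(size=400) > 0).astype(int)   # synthetic labels

budget_dim = 50                                   # platform resource constraint
P = rng.normal(size=(400, budget_dim)) / np.sqrt(budget_dim)
X_small = X @ P                                   # transformed training data

local_models = [SGDClassifier(max_iter=1000) for _ in range(4)]
for i, m in enumerate(local_models):              # each local model trains on
    m.fit(X_small[i::4], y[i::4])                 # a shard of the projected data
print(round(np.mean(local_models[0].predict(X_small) == y), 3))
```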
-
Publication number: 20210019605
Abstract: A method may include embedding, in a hidden layer and/or an output layer of a first machine learning model, a first digital watermark. The first digital watermark may correspond to input samples that alter low-probability regions of an activation map associated with the hidden layer of the first machine learning model. Alternatively, the first digital watermark may correspond to input samples rarely encountered by the first machine learning model. The first digital watermark may be embedded in the first machine learning model by at least training, based on training data including the input samples, the first machine learning model. A second machine learning model may be determined to be a duplicate of the first machine learning model based on a comparison of the first digital watermark embedded in the first machine learning model and a second digital watermark extracted from the second machine learning model.
Type: Application
Filed: March 21, 2019
Publication date: January 21, 2021
Inventors: Bita Darvish Rouhani, Huili Chen, Farinaz Koushanfar
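A sketch of the rare-sample variant: out-of-distribution inputs with chosen labels are mixed into training, and ownership is later checked by how a suspect model answers on those same inputs. The 90% match threshold and the simple MLP are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 10)), rng.integers(0, 2, 500)
trigger_X = rng.normal(loc=8.0, size=(20, 10))   # rarely-encountered inputs
trigger_y = np.ones(20, dtype=int)               # watermark labels

owner = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
owner.fit(np.vstack([X, trigger_X]), np.r_[y, trigger_y])  # watermark embedded

def is_duplicate(suspect, threshold=0.9):
    """Extract the watermark from a suspect model and compare."""
    return np.mean(suspect.predict(trigger_X) == trigger_y) >= threshold

print(is_duplicate(owner))   # True; a stolen copy would also answer the triggers
```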
-
Publication number: 20210012239
Abstract: This document relates to automating the generation of machine learning models for evaluation of computer networks. Generally, the disclosed techniques can obtain network context data reflecting characteristics of a network, identify a type of evaluation to be performed on the network, and select a particular machine learning model for evaluating the network based at least on the type of evaluation. The disclosed techniques can also select one or more features to train the particular machine learning model.
Type: Application
Filed: July 12, 2019
Publication date: January 14, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Behnaz Arzani, Bita Darvish Rouhani
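A minimal sketch of the selection step: a registry maps the requested evaluation type to a model family and the features used to train it. The evaluation types, feature names, and model choices here are all illustrative.

```python
from sklearn.ensemble import IsolationForest, RandomForestClassifier

MODEL_REGISTRY = {
    "anomaly_detection": (IsolationForest, ["packet_rate", "rtt_ms"]),
    "failure_prediction": (RandomForestClassifier, ["error_rate", "link_load"]),
}

def select_model(evaluation_type):
    """Pick a model and feature set based on the type of evaluation."""
    model_cls, features = MODEL_REGISTRY[evaluation_type]
    return model_cls(), features

model, features = select_model("anomaly_detection")
print(type(model).__name__, features)
```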
-
Publication number: 20200302273
Abstract: Perplexity scores are computed for training data samples during ANN training. Perplexity scores can be computed as a divergence between data defining a class associated with a current training data sample and a probability vector generated by the ANN model. Perplexity scores can alternately be computed by learning a probability density function ("PDF") fitting activation maps generated by an ANN model during training. A perplexity score can then be computed for a current training data sample by computing a probability for the current training data sample based on the PDF. If the perplexity score for a training data sample is lower than a threshold, the training data sample is removed from the training data set so that it will not be utilized for training during subsequent epochs. Training of the ANN model continues following the removal of training data samples from the training data set.
Type: Application
Filed: March 20, 2019
Publication date: September 24, 2020
Inventors: Eric S. Chung, Douglas C. Burger, Bita Darvish Rouhani
-
Publication number: 20200264876
Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
Type: Application
Filed: February 14, 2019
Publication date: August 20, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
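A sketch of selecting the second format for stored activations, assuming a simple shared-exponent rounding with an adjustable mantissa width and taking relative reconstruction error as the performance metric; the candidate widths and the 1% bound are assumptions.

```python
import numpy as np

def to_bfp(x, mantissa_bits):
    """Share one exponent across the block; round to the given mantissa width."""
    exp = int(np.max(np.frexp(x)[1]))
    scale = 2.0 ** (exp - mantissa_bits)
    return np.round(x / scale) * scale

def pick_format(acts, max_rel_err=0.01):
    for bits in (3, 5, 7, 9):                 # candidate second formats
        err = np.linalg.norm(acts - to_bfp(acts, bits)) / np.linalg.norm(acts)
        if err <= max_rel_err:
            return bits, err                  # narrowest format meeting the bound
    return 9, err

rng = np.random.default_rng(0)
acts = rng.normal(size=1024).astype(np.float32)   # first (normal) format
print(pick_format(acts))
```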
-
Publication number: 20200265301
Abstract: Technology related to incremental training of machine learning tools is disclosed. In one example of the disclosed technology, a method can include receiving operational parameters of a machine learning tool based on a primary set of training data. The machine learning tool can be a deep neural network. Input data can be applied to the machine learning tool to generate an output of the machine learning tool. A measure of prediction quality can be generated for the output of the machine learning tool. In response to determining the measure of prediction quality is below a threshold, incremental training of the operational parameters can be initiated using the input data as training data for the machine learning tool. Operational parameters of the machine learning tool can be updated based on the incremental training. The updated operational parameters can be stored.
Type: Application
Filed: February 15, 2019
Publication date: August 20, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Douglas C. Burger, Eric S. Chung, Bita Darvish Rouhani
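A sketch of the quality-gated update, using prediction confidence (max class probability) as the quality measure and sklearn's partial_fit as the incremental step; both are illustrative stand-ins, and the label source for the incremental step is assumed to exist.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 6)), rng.integers(0, 2, 500)
model = SGDClassifier(loss="log_loss")
model.partial_fit(X, y, classes=np.array([0, 1]))   # primary training data

def maybe_update(x, label, threshold=0.7):
    quality = model.predict_proba(x.reshape(1, -1)).max()  # prediction quality
    if quality < threshold:                          # below threshold:
        model.partial_fit(x.reshape(1, -1), [label])  # incremental training step
        # updated operational parameters would be persisted here
    return quality

print(round(maybe_update(rng.normal(size=6), 1), 3))
```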
-
Publication number: 20200210840
Abstract: Apparatus and methods for training neural networks based on a performance metric, including adjusting numerical precision and topology as training progresses, are disclosed. In some examples, block floating-point formats having relatively lower accuracy are used during early stages of training. Accuracy of the floating-point format can be increased as training progresses based on a determined performance metric. In some examples, values for the neural network are transformed to normal precision floating-point formats. The performance metric can be determined based on entropy of values for the neural network, accuracy of the neural network, or by other suitable techniques. Accelerator hardware can be used to implement certain implementations, including hardware having direct support for block floating-point formats.
Type: Application
Filed: December 31, 2018
Publication date: July 2, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Bita Darvish Rouhani, Eric S. Chung, Daniel Lo, Douglas C. Burger
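A sketch of one metric the abstract names: the empirical entropy of the network's values, used here to suggest how many mantissa bits the block floating-point format should carry at a given stage of training. Mapping entropy directly to a bit width, and the histogram range, are illustrative heuristics.

```python
import numpy as np

def value_entropy_bits(values, levels=256):
    counts, _ = np.histogram(values, bins=levels, range=(-4, 4))
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))               # empirical entropy in bits

def recommend_mantissa_bits(values, lo=2, hi=8):
    return int(np.clip(np.ceil(value_entropy_bits(values)), lo, hi))

rng = np.random.default_rng(0)
early = rng.normal(scale=0.1, size=8192)   # early training: narrow spread
late = rng.normal(scale=1.0, size=8192)    # later: wider, more information
print(recommend_mantissa_bits(early), recommend_mantissa_bits(late))
```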
-
Publication number: 20200202213
Abstract: Methods and apparatus are disclosed for adjusting hyper-parameters of a neural network to compensate for noise, such as noise introduced via quantization of one or more parameters of the neural network. In some examples, the adjustment can include scaling the hyper-parameter based on at least one metric representing noise present in the neural network. The at least one metric can include a noise-to-signal ratio for weights of the neural network, such as edge weights and activation weights. In a quantized neural network, a learning rate hyper-parameter used to compute a gradient update for a layer during back propagation can be scaled based on the at least one metric. In some examples, the same scaled learning rate can be used when computing gradient updates for other layers.
Type: Application
Filed: December 19, 2018
Publication date: June 25, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Bita Darvish Rouhani, Eric S. Chung, Daniel Lo, Douglas C. Burger
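A sketch of the compensation: the learning rate is scaled by a metric built from the quantization noise-to-signal ratio of a layer's weights. The int4 quantizer and the 1/(1+NSR) scaling rule are assumptions; the filing covers a family of such adjustments.

```python
import numpy as np

def quantize(w, bits=4):
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def scaled_learning_rate(base_lr, w):
    noise = quantize(w) - w                           # quantization noise
    nsr = np.linalg.norm(noise) / np.linalg.norm(w)   # noise-to-signal ratio
    return base_lr / (1.0 + nsr), nsr                 # damp updates under noise

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
lr, nsr = scaled_learning_rate(0.1, w)
print(round(nsr, 4), round(lr, 4))
# The same scaled rate can be reused for other layers' gradient updates.
```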
-
Publication number: 20200193274
Abstract: Technology related to training a neural network accelerator using mixed precision data formats is disclosed. In one example of the disclosed technology, a neural network accelerator is configured to accelerate a given layer of a multi-layer neural network. An input tensor for the given layer can be converted from a normal-precision floating-point format to a quantized-precision floating-point format. A tensor operation can be performed using the converted input tensor. A result of the tensor operation can be converted from the quantized-precision floating-point format (a block floating-point format) back to the normal-precision floating-point format. The converted result can be used to generate an output tensor of the layer of the neural network, where the output tensor is in normal-precision floating-point format.
Type: Application
Filed: December 18, 2018
Publication date: June 18, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Bita Darvish Rouhani, Taesik Na, Eric S. Chung, Daniel Lo, Douglas C. Burger
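A sketch of the per-layer dataflow: convert the normal-precision input to a quantized format, run the tensor operation, and return the result in normal precision. The simple shared-exponent rounding and 7-bit mantissa are illustrative stand-ins for the accelerator's block floating-point format.

```python
import numpy as np

def to_bfp(x, bits=7):
    exp = int(np.max(np.frexp(x)[1]))    # one shared exponent per tensor
    scale = 2.0 ** (exp - bits)
    return np.round(x / scale) * scale

def bfp_layer(x, W, b):
    xq, Wq = to_bfp(x), to_bfp(W)        # normal -> quantized precision
    y = xq @ Wq                          # tensor operation in the BFP domain
    return (y + b).astype(np.float32)    # output tensor in normal precision

rng = np.random.default_rng(0)
x, W, b = rng.normal(size=(4, 16)), rng.normal(size=(16, 8)), np.zeros(8)
out = bfp_layer(x, W, b)
print(out.dtype, float(np.abs(out - (x @ W + b)).max()))  # small quantization error
```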
-
Publication number: 20200167471
Abstract: A method for detecting and/or preventing an adversarial attack against a target machine learning model may be provided. The method may include training, based at least on training data, a defender machine learning model to enable the defender machine learning model to identify malicious input samples. The trained defender machine learning model may be deployed at the target machine learning model. The trained defender machine learning model may be coupled with the target machine learning model to at least determine whether an input sample received at the target machine learning model is a malicious input sample and/or a legitimate input sample. Related systems and articles of manufacture, including computer program products, are also provided.
Type: Application
Filed: July 12, 2018
Publication date: May 28, 2020
Inventors: Bita Darvish Rouhani, Tara Javidi, Farinaz Koushanfar, Mohammad Samragh Razlighi
-
Publication number: 20200125960
Abstract: A method, a system, and a computer program product for fast training and/or execution of neural networks are provided. A description of a neural network architecture is received. Based on the received description, a graph representation of the neural network architecture is generated. The graph representation includes one or more nodes connected by one or more connections. At least one connection is modified. Based on the generated graph representation, a new graph representation is generated using the modified at least one connection. The new graph representation has a small-world property. The new graph representation is transformed into a new neural network architecture.
Type: Application
Filed: October 23, 2019
Publication date: April 23, 2020
Inventors: Mojan Javaheripi, Farinaz Koushanfar, Bita Darvish Rouhani
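A sketch of the rewiring idea using the classic Watts-Strogatz procedure (via networkx), which yields graphs with the small-world property: clustering stays high while average path length drops. Mapping graph nodes back onto network layers is left abstract here, as that transformation step is architecture-specific in the filing.

```python
import networkx as nx

ring = nx.watts_strogatz_graph(n=64, k=4, p=0.0, seed=0)  # regular ring lattice
small_world = nx.connected_watts_strogatz_graph(n=64, k=4, p=0.1, seed=0)  # few rewired edges

for name, g in [("ring", ring), ("small-world", small_world)]:
    print(name,
          round(nx.average_clustering(g), 3),
          round(nx.average_shortest_path_length(g), 2))
# The rewired graph would then be transformed into the new network architecture.
```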