Patents by Inventor Itay HUBARA

Itay HUBARA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240265260
    Abstract: A DNN can be compressed by pruning one or more tensors for a deep learning operation. A first pruning parameter and a second pruning parameter are determined for a tensor. A vector having a size of the second pruning parameter may be extracted from the tensor. Pruning probabilities may be determined for the elements in the vector. One or more elements in the vector are selected based on the pruning probabilities. Alternatively, a matrix, in lieu of the vector, may be extracted from the tensor. Pruning probabilities may be determined for the columns in the matrix. One or more columns are selected based on their pruning probabilities. The number of the selected element(s) or column(s) may equal the first pruning parameter. The tensor can be modified by modifying the value(s) of the selected element(s) or column(s) and setting the value(s) of one or more unselected elements or columns to zero.
    Type: Application
    Filed: February 6, 2023
    Publication date: August 8, 2024
    Applicant: Habana Labs Ltd.
    Inventors: Brian Chmiel, Itay Hubara, Ron Banner
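The entry above (publication 20240265260) describes structured, probability-driven pruning: a group of elements (whose size is the second pruning parameter) is extracted from a tensor, pruning probabilities are computed for its elements, a fixed number of elements (the first pruning parameter) is selected, and the unselected elements are set to zero. The NumPy sketch below illustrates that idea under assumed choices; the magnitude-based probability rule, the group layout, and the function names are illustrative assumptions, not the method claimed in the application.

```python
import numpy as np

def prune_tensor(tensor, n_keep=2, group_size=4, rng=None):
    """Structured pruning sketch: within every group of `group_size`
    consecutive elements (the second pruning parameter), keep `n_keep`
    elements (the first pruning parameter) chosen according to pruning
    probabilities, and set the unselected elements to zero.

    The magnitude-based probability rule below is an assumption for
    illustration, not the scoring claimed in the application."""
    if rng is None:
        rng = np.random.default_rng(0)
    groups = tensor.reshape(-1, group_size).astype(np.float64)
    pruned = np.zeros_like(groups)
    for i, group in enumerate(groups):
        scores = np.abs(group)
        total = scores.sum()
        probs = scores / total if total > 0 else np.full(group_size, 1.0 / group_size)
        keep = rng.choice(group_size, size=n_keep, replace=False, p=probs)
        pruned[i, keep] = group[keep]   # unselected elements stay zero
    return pruned.reshape(tensor.shape)

if __name__ == "__main__":
    weights = np.random.default_rng(1).normal(size=(4, 8))
    sparse = prune_tensor(weights, n_keep=2, group_size=4)
    print("non-zeros per group of 4:", (sparse.reshape(-1, 4) != 0).sum(axis=1))
```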
  • Patent number: 10831444
    Abstract: Training neural networks by constructing a neural network model having neurons each associated with a quantized activation function adapted to output a quantized activation value. The neurons are arranged in layers and connected by connections associated with quantized connection weight functions adapted to output quantized connection weight values. During a training process, a plurality of weight gradients are calculated during backpropagation sub-processes by computing neuron gradients, each of an output of a respective quantized activation function in one layer with respect to an input of that quantized activation function. Each neuron gradient is calculated such that when an absolute value of the input is smaller than a positive constant threshold value, the respective neuron gradient is set to a positive constant output value, and when the absolute value of the input is larger than the positive constant threshold value, the neuron gradient is set to zero.
    Type: Grant
    Filed: April 4, 2017
    Date of Patent: November 10, 2020
    Assignee: Technion Research & Development Foundation Limited
    Inventors: Ran El-Yaniv, Itay Hubara, Daniel Soudry
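The gradient rule in the abstract above is a straight-through-style estimator for training networks with quantized activations: the gradient passed back through a neuron is a positive constant wherever the magnitude of its input is below a threshold and zero where it is above it. The NumPy sketch below illustrates this under assumed choices (a sign activation, a threshold of 1.0, and a pass-through constant of 1.0); it is not the patented training procedure itself.

```python
import numpy as np

THRESHOLD = 1.0   # positive constant threshold from the abstract (value assumed)
GRAD_CONST = 1.0  # positive constant passed through as the gradient (value assumed)

def quantized_activation(x):
    """Forward pass: a sign (binarizing) activation, assumed here as the
    quantized activation function."""
    return np.where(x >= 0.0, 1.0, -1.0)

def activation_gradient(x, upstream_grad):
    """Neuron gradient per the abstract: a positive constant where the
    absolute value of the input is below the threshold, zero where it is
    above it."""
    pass_through = (np.abs(x) < THRESHOLD).astype(x.dtype) * GRAD_CONST
    return upstream_grad * pass_through

if __name__ == "__main__":
    pre_activations = np.linspace(-2.0, 2.0, 9)
    print("forward :", quantized_activation(pre_activations))
    print("gradient:", activation_gradient(pre_activations, np.ones(9)))
```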
  • Patent number: 10491239
    Abstract: A computational device includes an input memory, which receives a first array of input numbers having a first precision represented by N bits. An output memory stores a second array of output numbers having a second precision represented by M bits, M<N. Quantization logic reads the input numbers from the input memory, extracts from each input number a set of M bits, at a bit offset within the input number that is indicated by a quantization factor, and writes a corresponding output number based on the extracted set of bits to the second array in the output memory. A quantization controller sets the quantization factor so as to optimally fit an available range of the output numbers in the second array to an actual range of the input numbers in the first array in extraction of the M bits from the input numbers.
    Type: Grant
    Filed: January 30, 2018
    Date of Patent: November 26, 2019
    Assignee: Habana Labs Ltd.
    Inventor: Itay Hubara
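The abstract above (patent 10,491,239) describes extracting M output bits from each N-bit input at a bit offset chosen so that the available M-bit range fits the actual range of the inputs. The sketch below illustrates the idea in NumPy; the shift-selection heuristic, the saturation behavior, and the function names are assumptions for illustration, not the controller logic of the patent.

```python
import numpy as np

def choose_shift(values, m_bits):
    """Choose the bit offset (the 'quantization factor') so that the
    actual range of the inputs fits the available M-bit signed range.
    This selection heuristic is an assumption for illustration."""
    max_abs = int(np.max(np.abs(values))) or 1
    needed_bits = max_abs.bit_length() + 1   # +1 for the sign bit
    return max(0, needed_bits - m_bits)

def quantize(values, m_bits=8):
    """Extract an M-bit signed value from each input at the chosen bit
    offset, saturating to the representable M-bit range."""
    shift = choose_shift(values, m_bits)
    lo, hi = -(1 << (m_bits - 1)), (1 << (m_bits - 1)) - 1
    # Floor division by 2**shift acts as an arithmetic right shift by `shift` bits.
    return np.clip(values // (1 << shift), lo, hi).astype(np.int32), shift

if __name__ == "__main__":
    x = np.array([37000, -120000, 512, -3], dtype=np.int32)
    q, shift = quantize(x, m_bits=8)
    print("shift =", shift, "quantized =", q)
```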
  • Publication number: 20170286830
    Abstract: Training neural networks by constructing a neural network model having neurons each associated with a quantized activation function adapted to output a quantized activation value. The neurons are arranged in layers and connected by connections associated with quantized connection weight functions adapted to output quantized connection weight values. During a training process, a plurality of weight gradients are calculated during backpropagation sub-processes by computing neuron gradients, each of an output of a respective quantized activation function in one layer with respect to an input of that quantized activation function. Each neuron gradient is calculated such that when an absolute value of the input is smaller than a positive constant threshold value, the respective neuron gradient is set to a positive constant output value, and when the absolute value of the input is larger than the positive constant threshold value, the neuron gradient is set to zero.
    Type: Application
    Filed: April 4, 2017
    Publication date: October 5, 2017
    Inventors: Ran EL-YANIV, Itay HUBARA, Daniel SOUDRY