Patents by Inventor Eric Sather

Eric Sather has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Circuit for executing stateful neural network

Publication number: 20240153044

Abstract: Some embodiments provide a neural network inference circuit for executing a neural network that includes multiple nodes that use state data from previous executions of the neural network. The neural network inference circuit includes (i) a set of computation circuits configured to execute the nodes of the neural network and (ii) a set of memories configured to implement a set of one or more registers to store, while executing the neural network for a particular input, state data generated during at least two executions of the network for previous inputs. The state data is for use by the set of computation circuits when executing a set of the nodes of the neural network for the particular input.

Type: Application

Filed: January 5, 2024

Publication date: May 9, 2024

Inventors: Andrew C. Mihal, Steven L. Teig, Eric A. Sather

Removing nodes from machine-trained network based on introduction of probabilistic noise during training

Patent number: 11900238

Abstract: Some embodiments provide a method for reducing complexity of a machine-trained (MT) network that receives input data and computes output data for each input data. The MT network includes multiple computation nodes that (i) generate output values and (ii) use output values of other computation nodes as input values. During training of the MT network, the method introduces probabilistic noise to the output values of a set of the computation nodes. The method determines a subset of the computation nodes for which the introduction of the probabilistic noise to the output value does not affect the computed output data for the network. The method removes the subset of computation nodes from the trained MT network.

Type: Grant

Filed: February 3, 2020

Date of Patent: February 13, 2024

Assignee: PERCEIVE CORPORATION

Inventors: Steven L. Teig, Eric A. Sather

Circuit for executing stateful neural network

Patent number: 11868871

Abstract: Some embodiments provide a neural network inference circuit for executing a neural network that includes multiple nodes that use state data from previous executions of the neural network. The neural network inference circuit includes (i) a set of computation circuits configured to execute the nodes of the neural network and (ii) a set of memories configured to implement a set of one or more registers to store, while executing the neural network for a particular input, state data generated during at least two executions of the network for previous inputs. The state data is for use by the set of computation circuits when executing a set of the nodes of the neural network for the particular input.

Type: Grant

Filed: September 26, 2019

Date of Patent: January 9, 2024

Assignee: PERCEIVE CORPORATION

Inventors: Andrew C. Mihal, Steven L. Teig, Eric A. Sather

Using batches of training items for training a network

Publication number: 20230409918

Abstract: Some embodiments provide a method for training a machine-trained (MT) network that processes inputs using network parameters. The method propagates a set of input training items through the MT network to generate a set of output values. The set of input training items comprises multiple training items for each of multiple categories. The method identifies multiple training item groupings in the set of input training items. Each grouping includes at least two training items in a first category and at least one training item in a second category. The method calculates a value of a loss function as a summation of individual loss functions for each of the identified training item groupings. The individual loss function for each particular training item grouping is based on the output values for the training items of the grouping. The method trains the network parameters using the calculated loss function value.

Type: Application

Filed: August 27, 2023

Publication date: December 21, 2023

Inventors: Eric A. Sather, Steven L. Teig, Andrew C. Mihal
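The grouping-based loss described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the individual loss function, so the hinge-style margin loss on Euclidean distances below is an assumption, as are the names `triplet_groupings`, `batch_loss`, and `margin`. Each grouping here holds two items from one category (an anchor and a positive) and one item from another category (a negative).

```python
import math
from itertools import permutations


def distance(a, b):
    """Euclidean distance between two output vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_groupings(labels):
    """Yield (anchor, positive, negative) index triples.

    Anchor and positive share a category; the negative comes from a
    different one. This is one possible way to form groupings.
    """
    n = len(labels)
    for a, p in permutations(range(n), 2):
        if labels[a] != labels[p]:
            continue
        for neg in range(n):
            if labels[neg] != labels[a]:
                yield a, p, neg


def batch_loss(outputs, labels, margin=1.0):
    """Sum an individual (assumed hinge) loss over every grouping."""
    total = 0.0
    for a, p, neg in triplet_groupings(labels):
        d_pos = distance(outputs[a], outputs[p])
        d_neg = distance(outputs[a], outputs[neg])
        total += max(0.0, d_pos - d_neg + margin)
    return total
```

For example, `batch_loss([[0, 0], [3, 0], [1, 0]], ["a", "a", "b"])` sums the losses of the two groupings that can be formed from that batch.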

Loss-aware replication of neural network layers

Patent number: 11847567

Abstract: Some embodiments provide a method that receives a network with trained floating-point weight values. The network includes layers of nodes, each of which computes an output value based on input values and trained weight values. To replace a first layer of the trained network in a modified network with quantized weight values, the method defines multiple replica layers. Each replica layer includes nodes that correspond to nodes of the first layer, has a different set of allowed quantized weight values, and receives the same input values from a previous layer of the modified network such that groups of corresponding nodes from the replica layers operate correspondingly to the first layer. The method trains the quantized weight values of the modified network using a loss function with terms that account for effect on the loss function due to the quantization and for interactions between corresponding weight values of the replica layers.

Type: Grant

Filed: July 7, 2020

Date of Patent: December 19, 2023

Assignee: PERCEIVE CORPORATION

Inventors: Eric A. Sather, Steven L. Teig, Alexandru F. Drimbarean

Quantizing neural networks using shifting and scaling

Patent number: 11847568

Abstract: Some embodiments of the invention provide a novel method for training a quantized machine-trained network. Some embodiments provide a method of scaling a feature map of a pre-trained floating-point neural network in order to match the range of output values provided by quantized activations in a quantized neural network. A quantization function is modified, in some embodiments, to be differentiable to fix the mismatch between the loss function computed in forward propagation and the loss gradient used in backward propagation. Variational information bottleneck, in some embodiments, is incorporated to train the network to be insensitive to multiplicative noise applied to each channel. In some embodiments, channels that finish training with large noise, for example, exceeding 100%, are pruned.

Type: Grant

Filed: October 8, 2019

Date of Patent: December 19, 2023

Assignee: PERCEIVE CORPORATION

Inventors: Eric A. Sather, Steven L. Teig

Using batches of training items for training a network

Patent number: 11741369

Abstract: Some embodiments provide a method for training a machine-trained (MT) network that processes inputs using network parameters. The method propagates a set of input training items through the MT network to generate a set of output values. The set of input training items comprises multiple training items for each of multiple categories. The method identifies multiple training item groupings in the set of input training items. Each grouping includes at least two training items in a first category and at least one training item in a second category. The method calculates a value of a loss function as a summation of individual loss functions for each of the identified training item groupings. The individual loss function for each particular training item grouping is based on the output values for the training items of the grouping. The method trains the network parameters using the calculated loss function value.

Type: Grant

Filed: October 29, 2021

Date of Patent: August 29, 2023

Assignee: PERCEIVE CORPORATION

Inventors: Eric A. Sather, Steven L. Teig, Andrew C. Mihal

Neural networks with spatial and temporal features

Patent number: 11620495

Abstract: Some embodiments provide a method for executing a neural network that includes multiple nodes. The method receives an input for a particular execution of the neural network. The method receives state data that includes data generated from at least two previous executions of the neural network. The method executes the neural network to generate a set of output data for the received input. A set of the nodes performs computations using (i) data output from other nodes of the particular execution of the neural network and (ii) the received state data generated from at least two previous executions of the neural network.

Type: Grant

Filed: September 26, 2019

Date of Patent: April 4, 2023

Assignee: PERCEIVE CORPORATION

Inventors: Andrew C. Mihal, Steven L. Teig, Eric A. Sather
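As a rough illustration of executing a network with state from at least two previous executions, the sketch below keeps the two most recent state values in a fixed-length buffer. The class name `StatefulNetwork`, the single scalar node, and the mixing coefficients are all hypothetical; the abstract does not describe how the state is combined with the current input.

```python
from collections import deque


class StatefulNetwork:
    """Toy one-node network that reuses state from two prior executions."""

    def __init__(self):
        # State from the two most recent executions; zeros before any run.
        self.state = deque([0.0, 0.0], maxlen=2)

    def execute(self, x):
        # Hypothetical node: mixes the current input with both stored states.
        previous, older = self.state[-1], self.state[-2]
        hidden = 0.5 * x + 0.3 * previous + 0.2 * older
        # Appending drops the oldest entry, so only two states are kept.
        self.state.append(hidden)
        return hidden
```

Calling `execute` repeatedly on one instance feeds each result back into the next two executions.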

Preventing overfitting of hyperparameters during training of network

Patent number: 11610154

Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method uses a first set of inputs to train parameters of the MT network according to a set of hyperparameters that define aspects of the training. The method uses a second set of inputs to validate the MT network as trained by the first set of inputs. Based on the validation, the method modifies the hyperparameters for subsequent training of the MT network, wherein the hyperparameter modification is constrained to prevent overfitting of the modified hyperparameters to the second set of inputs.

Type: Grant

Filed: February 3, 2020

Date of Patent: March 21, 2023

Assignee: PERCEIVE CORPORATION

Inventors: Steven L. Teig, Eric A. Sather

Training sparse networks with discrete weight values

Publication number: 20230084673

Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method propagates multiple inputs through the MT network to generate an output for each of the inputs. Each of the inputs is associated with an expected output, the MT network uses multiple network parameters to process the inputs, and each network parameter of a set of the network parameters is defined during training as a probability distribution across a discrete set of possible values for the network parameter. The method calculates a value of a loss function for the MT network that includes (i) a first term that measures network error based on the expected outputs compared to the generated outputs and (ii) a second term that penalizes divergence of the probability distribution for each network parameter in the set of network parameters from a predefined probability distribution for the network parameter.

Type: Application

Filed: November 7, 2022

Publication date: March 16, 2023

Inventors: Steven L. Teig, Eric A. Sather
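The two-term loss in this abstract can be sketched as follows. The abstract only says the second term "penalizes divergence"; using the Kullback-Leibler divergence for that term is an assumption, and the names `kl_divergence`, `total_loss`, and the weighting factor `beta` are hypothetical.

```python
import math


def kl_divergence(q, p):
    """KL(q || p) for two distributions over the same discrete values."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def total_loss(error, weight_dists, prior, beta=0.1):
    """Network error plus a divergence penalty for every parameter.

    error        -- first term: error of generated vs. expected outputs
    weight_dists -- one probability distribution per network parameter
    prior        -- the predefined distribution each one is compared to
    """
    penalty = sum(kl_divergence(q, prior) for q in weight_dists)
    return error + beta * penalty
```

With a prior such as `[0.25, 0.5, 0.25]` over the values -1, 0, and 1, a parameter whose distribution matches the prior adds no penalty, while one that departs from it is penalized.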

Replication of neural network layers

Patent number: 11604973

Abstract: Some embodiments provide a method for training parameters of a machine-trained (MT) network. The method receives an MT network with multiple layers of nodes, each of which computes an output value based on a set of input values and a set of trained weight values. Each layer has a set of allowed weight values. For a first layer with a first set of allowed weight values, the method defines a second layer with nodes corresponding to each of the nodes of the first layer, each second-layer node receiving the same input values as the corresponding first-layer node. The second layer has a second, different set of allowed weight values, with the output values of the nodes of the first layer added with the output values of the corresponding nodes of the second layer to compute output values that are passed to a subsequent layer. The method trains the weight values.

Type: Grant

Filed: November 27, 2019

Date of Patent: March 14, 2023

Assignee: PERCEIVE CORPORATION

Inventors: Eric A. Sather, Steven L. Teig
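A minimal sketch of the replicated-layer arrangement follows: two layers receive the same inputs, each draws its weights from a different allowed set, and their outputs are added node by node. The specific allowed sets, the plain weighted-sum nodes, and the function names are assumptions made for illustration.

```python
FIRST_ALLOWED = {-1.0, 0.0, 1.0}      # hypothetical allowed set, first layer
SECOND_ALLOWED = {-0.25, 0.0, 0.25}   # hypothetical allowed set, second layer


def layer_output(inputs, weights):
    """One weighted sum per node; `weights` holds one row per node."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def uses_only(weights, allowed):
    """Check that every weight is drawn from the allowed set."""
    return all(w in allowed for row in weights for w in row)


def replicated_output(inputs, first_weights, second_weights):
    """Add the outputs of corresponding first- and second-layer nodes."""
    first = layer_output(inputs, first_weights)
    second = layer_output(inputs, second_weights)
    return [a + b for a, b in zip(first, second)]
```

In this sketch the summed layers behave like a single layer whose effective weights are sums of one value from each allowed set, for example 1.25 or 0.75.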

Training network to minimize worst case surprise

Patent number: 11586902

Abstract: Some embodiments provide a method for training a machine-trained (MT) network that processes input data using network parameters. The method maps input instances to output values by propagating the instances through the network. The input instances include instances for each of multiple categories. For a particular instance selected as an anchor instance, the method identifies each instance in a different category as a negative instance. The method calculates, for each negative instance of the anchor, a surprise function that probabilistically measures a surprise of finding an output value for an instance in the same category as the anchor that is a greater distance from the output value for the anchor instance than the output value for the negative instance. The method calculates a loss function that emphasizes a maximum surprise calculated for the anchor. The method trains the network parameters using the calculated loss function value to minimize the maximum surprise.

Type: Grant

Filed: March 14, 2018

Date of Patent: February 21, 2023

Assignee: PERCEIVE CORPORATION

Inventors: Eric A. Sather, Steven L. Teig, Andrew C. Mihal

Training sparse networks with discrete weight values

Patent number: 11537870

Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method propagates multiple inputs through the MT network to generate an output for each of the inputs. Each of the inputs is associated with an expected output, the MT network uses multiple network parameters to process the inputs, and each network parameter of a set of the network parameters is defined during training as a probability distribution across a discrete set of possible values for the network parameter. The method calculates a value of a loss function for the MT network that includes (i) a first term that measures network error based on the expected outputs compared to the generated outputs and (ii) a second term that penalizes divergence of the probability distribution for each network parameter in the set of network parameters from a predefined probability distribution for the network parameter.

Type: Grant

Filed: March 14, 2018

Date of Patent: December 27, 2022

Assignee: PERCEIVE CORPORATION

Inventors: Steven L. Teig, Eric A. Sather

Training network with discrete weight values

Publication number: 20220405591

Abstract: Some embodiments provide an electronic device that includes a set of processing units and a set of machine-readable media. The set of machine-readable media stores sets of instructions for applying a network of computation nodes to an input received by the device. The set of machine-readable media stores at least two sets of machine-trained parameters for configuring the network for different types of inputs. A first of the sets of parameters is used for applying the network to a first type of input and a second of the sets of parameters is used for applying the network to a second type of input.

Type: Application

Filed: August 24, 2022

Publication date: December 22, 2022

Inventors: Steven L. Teig, Eric A. Sather

Iterative transfer of machine-trained network inputs from validation set to training set

Patent number: 11531879

Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method uses a first set of training inputs to train parameters of the MT network. The method uses a set of validation inputs to measure error for the MT network as trained by the first set of training inputs. The method adds at least a subset of the validation inputs to the first set of training inputs to create a second set of training inputs. The method uses the second set of training inputs to train the parameters of the MT network. The error measurement is used to modify the training with the second set of training inputs.

Type: Grant

Filed: June 26, 2019

Date of Patent: December 20, 2022

Assignee: PERCEIVE CORPORATION

Inventors: Steven L. Teig, Eric A. Sather

Quantizing neural networks using approximate quantization function

Patent number: 11494657

Abstract: Some embodiments of the invention provide a novel method for training a quantized machine-trained network. Some embodiments provide a method of scaling a feature map of a pre-trained floating-point neural network in order to match the range of output values provided by quantized activations in a quantized neural network. A quantization function is modified, in some embodiments, to be differentiable to fix the mismatch between the loss function computed in forward propagation and the loss gradient used in backward propagation. Variational information bottleneck, in some embodiments, is incorporated to train the network to be insensitive to multiplicative noise applied to each channel. In some embodiments, channels that finish training with large noise, for example, exceeding 100%, are pruned.

Type: Grant

Filed: October 8, 2019

Date of Patent: November 8, 2022

Assignee: PERCEIVE CORPORATION

Inventors: Eric A. Sather, Steven L. Teig

Device storing multiple sets of parameters for machine-trained network

Patent number: 11429861

Abstract: Some embodiments provide an electronic device that includes a set of processing units and a set of machine-readable media. The set of machine-readable media stores sets of instructions for applying a network of computation nodes to an input received by the device. The set of machine-readable media stores at least two sets of machine-trained parameters for configuring the network for different types of inputs. A first of the sets of parameters is used for applying the network to a first type of input and a second of the sets of parameters is used for applying the network to a second type of input.

Type: Grant

Filed: November 16, 2017

Date of Patent: August 30, 2022

Assignee: PERCEIVE CORPORATION

Inventors: Steven L. Teig, Eric A. Sather
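The idea of one network configured by different stored parameter sets can be sketched in a few lines. The input-type names, the parameter values, and the one-node "network" below are all hypothetical and stand in for whatever the device actually stores.

```python
# Hypothetical stored parameter sets, keyed by the type of input.
PARAMETER_SETS = {
    "daylight": (0.5, 1.0),    # (scale, bias) for the first input type
    "low_light": (2.0, 0.25),  # (scale, bias) for the second input type
}


def apply_network(x, input_type):
    """Configure the same toy network with the set for this input type."""
    scale, bias = PARAMETER_SETS[input_type]
    return scale * x + bias
```

The network structure stays fixed; only the parameters looked up for the input type change between calls.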

Using batches of training items for training a network

Publication number: 20220051002

Abstract: Some embodiments provide a method for training a machine-trained (MT) network that processes inputs using network parameters. The method propagates a set of input training items through the MT network to generate a set of output values. The set of input training items comprises multiple training items for each of multiple categories. The method identifies multiple training item groupings in the set of input training items. Each grouping includes at least two training items in a first category and at least one training item in a second category. The method calculates a value of a loss function as a summation of individual loss functions for each of the identified training item groupings. The individual loss function for each particular training item grouping is based on the output values for the training items of the grouping. The method trains the network parameters using the calculated loss function value.

Type: Application

Filed: October 29, 2021

Publication date: February 17, 2022

Inventors: Eric A. Sather, Steven L. Teig, Andrew C. Mihal

Using batches of training items for training a network

Patent number: 11163986

Abstract: Some embodiments provide a method for training a machine-trained (MT) network that processes inputs using network parameters. The method propagates a set of input training items through the MT network to generate a set of output values. The set of input training items comprises multiple training items for each of multiple categories. The method identifies multiple training item groupings in the set of input training items. Each grouping includes at least two training items in a first category and at least one training item in a second category. The method calculates a value of a loss function as a summation of individual loss functions for each of the identified training item groupings. The individual loss function for each particular training item grouping is based on the output values for the training items of the grouping. The method trains the network parameters using the calculated loss function value.

Type: Grant

Filed: April 17, 2020

Date of Patent: November 2, 2021

Assignee: PERCEIVE CORPORATION

Inventors: Eric A. Sather, Steven L. Teig, Andrew C. Mihal

Video denoising using neural networks with spatial and temporal features

Patent number: 11151695

Abstract: Some embodiments provide a method for processing a video that includes a sequence of images using a neural network. The method receives a set of video images as a set of inputs to successive executions of the neural network. The method executes the neural network for each successive video image of the set of video images to reduce an amount of noise in the video image by (i) identifying spatial features of the video image and (ii) storing a set of state data representing identified spatial features for use in identifying spatial features of subsequent video images in the set of video images. Identifying spatial features of a particular video image includes using the stored sets of spatial features of video images previous to the particular video image.

Type: Grant

Filed: September 26, 2019

Date of Patent: October 19, 2021

Assignee: PERCEIVE CORPORATION

Inventors: Andrew C. Mihal, Steven L. Teig, Eric A. Sather
