Patents by Inventor Eric A. Sather

Eric A. Sather has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12596931
    Abstract: Some embodiments provide a method for training a machine-trained network that includes multiple parameters. The method propagates a batch of input training items through the network to generate output values and compute values of a loss function for each of the input training items. The method uses the computed values of the loss function for the input training items to adjust the parameters of the network. The method computes a gradient of the loss function for each of the input training items. The method selects input training items for subsequent batches of input training items based on a ratio of the value of the loss function to the gradient of the loss function for each of the input training items.
    Type: Grant
    Filed: December 26, 2022
    Date of Patent: April 7, 2026
    Assignee: Amazon Technologies, Inc.
    Inventors: Steven L. Teig, Eric A. Sather, Andrew F. Siegel, Evgeny Sorkin
  • Patent number: 12579430
    Abstract: Some embodiments provide a method for improving structural sparsity of a machine-trained (MT) network. The method receives a network having multiple layers. Each layer of a set of the layers includes multiple filters of weight values. The method replaces the filters of a particular layer of the network with (i) a first set of filters of weight values, (ii) a set of scale values for the first set of filters, and (iii) a second set of filters of weight values. Each scale value corresponds to a different one of the filters of the first set of filters. The method trains the network by applying constraints to bias at least a subset of the scale values towards zero. When a particular scale value falls below a threshold value, the particular scale value is set to zero.
    Type: Grant
    Filed: March 16, 2022
    Date of Patent: March 17, 2026
    Inventors: Eric A. Sather, Steven L. Teig
  • Patent number: 12572798
    Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method receives a network having multiple layers. Each layer of a set of the layers includes multiple weight values. The method trains the network by alternately (1) propagating inputs through the network to generate outputs and adjusting the weight values based on differences between the generated outputs and expected outputs and (2) identifying sets of the weight values for removal according to a set of constraints that accounts for (i) a total number of weight values and (ii) an amount of time required to execute the network on a particular type of integrated circuit.
    Type: Grant
    Filed: March 16, 2022
    Date of Patent: March 10, 2026
    Assignee: Amazon Technologies, Inc.
    Inventors: Eric A. Sather, Steven L. Teig
  • Patent number: 12536413
    Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method receives a network comprising a plurality of parameters. The method trains the network by iteratively (i) propagating inputs through the network to generate outputs and adjusting the parameters based on differences between the generated outputs and expected outputs to minimize a loss function with respect to the parameters, (ii) probabilistically projecting the parameters to minimize the loss function with respect to a set of constraints on the weight values, the probabilistic projection treating the parameters as probability distributions, and updating a set of variables of the loss function based on the probability distributions.
    Type: Grant
    Filed: March 16, 2022
    Date of Patent: January 27, 2026
    Assignee: Amazon Technologies, Inc.
    Inventors: Eric A. Sather, Steven L. Teig
  • Patent number: 12462350
    Abstract: Some embodiments provide a neural network inference circuit for executing a neural network that includes multiple nodes that use state data from previous executions of the neural network. The neural network inference circuit includes (i) a set of computation circuits configured to execute the nodes of the neural network and (ii) a set of memories configured to implement a set of one or more registers to store, while executing the neural network for a particular input, state data generated during at least two executions of the network for previous inputs. The state data is for use by the set of computation circuits when executing a set of the nodes of the neural network for the particular input.
    Type: Grant
    Filed: January 5, 2024
    Date of Patent: November 4, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Andrew C. Mihal, Steven L. Teig, Eric A. Sather
  • Patent number: 12367661
    Abstract: Some embodiments provide a method for training a machine-trained network that includes multiple parameters. The method propagates a batch of input training items through the network to generate output values and compute values of a loss function for each of the input training items. The method computes a weight for each input training item based on the computed loss function values for each of the input training items. The method selects input training items with larger weights more often than input training items with smaller weights for subsequent batches of input training items.
    Type: Grant
    Filed: December 26, 2022
    Date of Patent: July 22, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Steven L. Teig, Eric A. Sather, Andrew F. Siegel, Evgeny Sorkin
  • Patent number: 12299555
    Abstract: Some embodiments provide an electronic device that includes a set of processing units and a set of machine-readable media. The set of machine-readable media stores sets of instructions for applying a network of computation nodes to an input received by the device. The set of machine-readable media stores at least two sets of machine-trained parameters for configuring the network for different types of inputs. A first of the sets of parameters is used for applying the network to a first type of input and a second of the sets of parameters is used for applying the network to a second type of input.
    Type: Grant
    Filed: August 24, 2022
    Date of Patent: May 13, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Steven L. Teig, Eric A. Sather
  • Patent number: 12248880
    Abstract: Some embodiments provide a method for training a machine-trained (MT) network that processes inputs using network parameters. The method propagates a set of input training items through the MT network to generate a set of output values. The set of input training items comprises multiple training items for each of multiple categories. The method identifies multiple training item groupings in the set of input training items. Each grouping includes at least two training items in a first category and at least one training item in a second category. The method calculates a value of a loss function as a summation of individual loss functions for each of the identified training item groupings. The individual loss function for each particular training item grouping is based on the output values for the training items of the grouping. The method trains the network parameters using the calculated loss function value.
    Type: Grant
    Filed: August 27, 2023
    Date of Patent: March 11, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Eric A. Sather, Steven L. Teig, Andrew C. Mihal
  • Publication number: 20250028945
    Abstract: Some embodiments provide a method for executing a layer of a neural network, for a circuit that restricts a number of weight values used per layer. The method applies a first set of weights to a set of inputs to generate a first set of results. The first set of weights is restricted to a first set of allowed values. For each of one or more additional sets of weights, the method applies the respective additional set of weights to the same set of inputs to generate a respective additional set of results. The respective additional set of weights is restricted to a respective additional set of allowed values that is related to the first set of allowed values and the other additional sets of allowed values. The method generates outputs for the particular layer by combining the first set of results with each respective additional set of results.
    Type: Application
    Filed: May 17, 2024
    Publication date: January 23, 2025
    Inventors: Eric A. Sather, Steven L. Teig
  • Patent number: 12175368
    Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method propagates multiple inputs through the MT network to generate an output for each of the inputs. Each of the inputs is associated with an expected output, the MT network uses multiple network parameters to process the inputs, and each network parameter of a set of the network parameters is defined during training as a probability distribution across a discrete set of possible values for the network parameter. The method calculates a value of a loss function for the MT network that includes (i) a first term that measures network error based on the expected outputs compared to the generated outputs and (ii) a second term that penalizes divergence of the probability distribution for each network parameter in the set of network parameters from a predefined probability distribution for the network parameter.
    Type: Grant
    Filed: November 7, 2022
    Date of Patent: December 24, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Steven L. Teig, Eric A. Sather
  • Patent number: 12165066
    Abstract: Some embodiments provide a method for training a machine-trained (MT) network that processes input data using network parameters. The method maps a set of input instances to a set of output values by propagating the set of input instances through the MT network. The set of input instances includes input instances for each of multiple categories. For a particular input instance selected as an anchor instance, the method calculates a true positive rate (TPR) for the MT network as a function of a distance between the output value for the anchor instance and the output value for each input instance not in a same category as the anchor instance. The method calculates a loss function for the anchor instance that maximizes the TPR for the MT network at low false positive rate. The method trains the network parameters using the calculated loss function.
    Type: Grant
    Filed: March 14, 2018
    Date of Patent: December 10, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Eric A. Sather, Steven L. Teig, Andrew C. Mihal
  • Patent number: 12136039
    Abstract: Some embodiments provide a method for training multiple parameters of a machine-trained (MT) network subject to a sparsity constraint that requires a threshold portion of the parameters to be equal to zero. A first set of the parameters subject to the sparsity constraint are grouped into groups of parameters. For each parameter of a second set of the parameters subject to the sparsity constraint, the method determines an accuracy penalty associated with the parameter being set to zero. For each group of parameters in the first set of parameters, the method determines a minimum accuracy penalty for each possible number of parameters in the group being set to zero. The method uses the determined accuracy penalties to set to the value zero at least the threshold portion of the plurality of parameters.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: November 5, 2024
    Assignee: PERCEIVE CORPORATION
    Inventors: Eric A. Sather, Steven L. Teig
  • Patent number: 12112254
    Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method uses a set of training inputs to train parameters of the MT network according to an initial loss function. The method uses a set of validation inputs to compute an error measure for the MT network as trained by the first set of training inputs. The method modifies the loss function for subsequent training of the MT network based on the computed error measure. The method uses the set of training inputs to train the parameters of the MT network according to the modified loss function.
    Type: Grant
    Filed: February 3, 2020
    Date of Patent: October 8, 2024
    Assignee: PERCEIVE CORPORATION
    Inventors: Steven L. Teig, Eric A. Sather
  • Patent number: 12061988
    Abstract: Some embodiments provide a method for training parameters of a network. The method receives a network with layers of nodes. Each node of a set of the layers computes an output value based on a set of input values and a set of trained weight values. A first layer of the network includes a first number of filters. The method replaces the first layer with a second layer having a second number of filters that is less than the first number and a third layer, following the second layer, having the first number of filters. Each weight value in the filters of the second and third layers is restricted to a set of allowed quantized weight values. A total number of weight values in the filters of the second and third layers is less than a total number of weight values in the filters of the first layer.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: August 13, 2024
    Assignee: PERCEIVE CORPORATION
    Inventors: Eric A. Sather, Steven L. Teig
  • Patent number: 12061981
    Abstract: Some embodiments provide a method for training parameters of a network. The method receives a machine-trained (MT) network with multiple layers of computation nodes. Each computation node of a set of the layers computes an output value based on a set of input values and a set of trained weight values. A first layer of the MT network includes a first number of filters. The method replaces the first layer with (i) a second layer having a second number of filters that is less than the first number of filters and (ii) a third layer having the first number of filters. Output values of computation nodes of the second layer are quantized and the third layer uses the quantized output values of the second layer as input values.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: August 13, 2024
    Assignee: PERCEIVE CORPORATION
    Inventors: Eric A. Sather, Steven L. Teig
  • Patent number: 12045725
    Abstract: Some embodiments provide a method for training a network including layers that each includes multiple nodes. The method identifies a set of related layers of the network. Each node in one of the related layers has corresponding nodes in each of the other related layers. Each set of corresponding nodes receives a same set of inputs and applies different sets of weights to the inputs to generate an output. The method identifies an element-wise addition layer including nodes that each add outputs of a different set of corresponding nodes from the related layers to generate a sum. The method uses a set of outputs generated by the nodes of each related layer to determine batch normalization parameters specific to each layer of the set of related layers. The method uses data generated by the element-wise addition layer to determine batch normalization parameters for the set of related layers.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: July 23, 2024
    Assignee: PERCEIVE CORPORATION
    Inventors: Eric A. Sather, Steven L. Teig
  • Publication number: 20240193426
    Abstract: Some embodiments of the invention provide a novel method for training a quantized machine-trained network. Some embodiments provide a method of scaling a feature map of a pre-trained floating-point neural network in order to match the range of output values provided by quantized activations in a quantized neural network. A quantization function is modified, in some embodiments, to be differentiable to fix the mismatch between the loss function computed in forward propagation and the loss gradient used in backward propagation. Variational information bottleneck, in some embodiments, is incorporated to train the network to be insensitive to multiplicative noise applied to each channel. In some embodiments, channels that finish training with large noise, for example, exceeding 100%, are pruned.
    Type: Application
    Filed: December 15, 2023
    Publication date: June 13, 2024
    Inventors: Eric A. Sather, Steven L. Teig
  • Patent number: 11995537
    Abstract: Some embodiments provide a method for training a machine-trained (MT) network that processes input data using network parameters. The method maps a set of input instances to a set of output values by propagating the set of input instances through the MT network. The set of input instances includes input instances for each of multiple categories. The method selects multiple input instances as anchor instances. For each anchor instance, the method computes a loss function as a comparison between the output value for the anchor instance and each output value for an input instance in a different category than the anchor. The method computes a total loss function for the MT network as a sum of the loss function computed for each anchor instance. The method trains the network parameters using the computed total loss function.
    Type: Grant
    Filed: March 14, 2018
    Date of Patent: May 28, 2024
    Assignee: PERCEIVE CORPORATION
    Inventors: Eric A. Sather, Steven L. Teig, Andrew C. Mihal
  • Patent number: 11995533
    Abstract: Some embodiments provide a method for executing a layer of a neural network, for a circuit that restricts a number of weight values used per layer. The method applies a first set of weights to a set of inputs to generate a first set of results. The first set of weights is restricted to a first set of allowed values. For each of one or more additional sets of weights, the method applies the respective additional set of weights to the same set of inputs to generate a respective additional set of results. The respective additional set of weights is restricted to a respective additional set of allowed values that is related to the first set of allowed values and the other additional sets of allowed values. The method generates outputs for the particular layer by combining the first set of results with each respective additional set of results.
    Type: Grant
    Filed: November 14, 2019
    Date of Patent: May 28, 2024
    Assignee: PERCEIVE CORPORATION
    Inventors: Eric A. Sather, Steven L. Teig
  • Publication number: 20240153044
    Abstract: Some embodiments provide a neural network inference circuit for executing a neural network that includes multiple nodes that use state data from previous executions of the neural network. The neural network inference circuit includes (i) a set of computation circuits configured to execute the nodes of the neural network and (ii) a set of memories configured to implement a set of one or more registers to store, while executing the neural network for a particular input, state data generated during at least two executions of the network for previous inputs. The state data is for use by the set of computation circuits when executing a set of the nodes of the neural network for the particular input.
    Type: Application
    Filed: January 5, 2024
    Publication date: May 9, 2024
    Inventors: Andrew C. Mihal, Steven L. Teig, Eric A. Sather
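
Illustrative note on patent 12367661 (listed above): its abstract describes computing a weight for each training item from its loss value and then selecting items with larger weights more often for later batches. The sketch below is one possible reading of that idea, not the patented method. The function names, the rule "weight equals loss, clamped to a small floor", and the use of sampling with replacement are all assumptions made for illustration; the abstract does not specify any of them.

```python
import random


def loss_weights(losses, floor=1e-6):
    """Map per-item loss values to sampling weights.

    Assumption for illustration: the weight is simply the loss,
    clamped to a small positive floor so that every item keeps
    a nonzero chance of being selected.
    """
    return [max(float(loss), floor) for loss in losses]


def sample_batch(item_ids, losses, batch_size, rng=None):
    """Draw a batch (with replacement) so that items with larger
    weights are selected more often than items with smaller weights."""
    rng = rng or random.Random()
    weights = loss_weights(losses)
    return rng.choices(item_ids, weights=weights, k=batch_size)


if __name__ == "__main__":
    # Hypothetical per-item losses from a previous batch.
    ids = ["a", "b", "c"]
    losses = [0.1, 0.1, 5.0]
    batch = sample_batch(ids, losses, batch_size=8, rng=random.Random(0))
    print(batch)
```

In a training loop, the losses would be refreshed after each pass so that the sampling weights track the current state of the network; that loop is omitted here.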