Patents by Inventor Marinus Willem VAN BAALEN

Marinus Willem VAN BAALEN has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230376272
    Abstract: A processor-implemented method for fast floating point simulations with learnable parameters includes receiving a single precision input. An integer quantization process is performed on the input. Each element of the input is scaled based on a scaling parameter to generate an m-bit floating point output, where m is an integer.
    Type: Application
    Filed: January 27, 2023
    Publication date: November 23, 2023
    Inventors: Marinus Willem VAN BAALEN, Jorn Wilhelmus Timotheus PETERS, Markus NAGEL, Tijmen Pieter Frederik BLANKEVOORT, Andrey KUZMIN
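The quantization step this abstract describes can be sketched in a few lines. This is an illustrative reading only, not the claimed implementation: the function name, the signed-integer range, and the fixed scale are my assumptions (the abstract's scaling parameter is learnable).

```python
import numpy as np

def simulate_quant(x, scale, num_bits=8):
    """Illustrative sketch: integer-quantize x, then rescale.

    Rounds x / scale to the nearest integer, clips to the signed
    num_bits range, and multiplies back by scale, so the output is
    floating point but takes only quantized values.
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / scale), qmin, qmax)
    return q * scale

out = simulate_quant(np.array([0.24, -0.26]), scale=0.1)
```

In the described method the scale would be a learnable parameter updated by gradient descent rather than a constant.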
  • Publication number: 20230306233
    Abstract: A processor-implemented method includes bit shifting a binary representation of a neural network parameter. The neural network parameter has fewer bits, b, than a number of hardware bits, B, supported by hardware that processes the neural network parameter. The bit shifting effectively multiplies the neural network parameter by 2^(B-b). The method also includes dividing a quantization scale by 2^(B-b) to obtain an updated quantization scale. The method further includes quantizing the bit shifted binary representation with the updated quantization scale to obtain a value for the neural network parameter.
    Type: Application
    Filed: January 30, 2023
    Publication date: September 28, 2023
    Inventors: Marinus Willem VAN BAALEN, Brian KAHNE, Eric Wayne MAHURIN, Tijmen Pieter Frederik BLANKEVOORT, Andrey KUZMIN, Andrii SKLIAR, Markus NAGEL
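The identity behind this abstract is easy to check numerically: shifting the integer code left by B-b bits and dividing the scale by 2^(B-b) leaves the dequantized value unchanged. The sketch below is an illustrative reading (names and values are mine), not the claimed hardware implementation.

```python
def widen_param(q, scale, b, B):
    """Illustrative sketch: fit a b-bit integer code q into B hardware
    bits. The left shift multiplies q by 2**(B-b); dividing the
    quantization scale by the same factor cancels it, so the
    dequantized value q * scale is unchanged."""
    shift = B - b
    return q << shift, scale / (1 << shift)

q8, s8 = widen_param(5, scale=0.25, b=4, B=8)  # 4-bit code on 8-bit hardware
```

Because both factors are powers of two, the cancellation is exact in floating point.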
  • Publication number: 20230108248
    Abstract: A processor-implemented method includes retrieving, for a layer of a set of layers of an artificial neural network (ANN), a dense quantized matrix representing a codebook and a sparse quantized matrix representing linear coefficients. The dense quantized matrix and the sparse quantized matrix may be associated with a weight tensor of the layer. The processor-implemented method also includes determining, for the layer of the set of layers, the weight tensor based on a product of the dense quantized matrix and the sparse quantized matrix. The processor-implemented method further includes processing, at the layer, an input based on the weight tensor.
    Type: Application
    Filed: October 4, 2022
    Publication date: April 6, 2023
    Inventors: Andrey KUZMIN, Marinus Willem VAN BAALEN, Markus NAGEL, Arash BEHBOODI
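The weight reconstruction this abstract describes reduces to one matrix product. The sketch below is a minimal reading of the abstract; the shapes, the one-nonzero-per-column sparsity pattern, and the function name are illustrative assumptions, and quantization of the two matrices is omitted.

```python
import numpy as np

def reconstruct_weight_tensor(codebook, coeffs):
    """Illustrative sketch: the layer's weight tensor is the product of
    a small dense codebook matrix and a sparse matrix of linear
    coefficients."""
    return codebook @ coeffs

rng = np.random.default_rng(0)
codebook = rng.standard_normal((64, 8))                      # dense, small
coeffs = np.zeros((8, 128))
coeffs[rng.integers(0, 8, size=128), np.arange(128)] = 1.0   # sparse: one nonzero per column
W = reconstruct_weight_tensor(codebook, coeffs)              # 64 x 128 weight matrix
```

The layer would then process its input with W as its weight tensor, as the abstract's final step describes.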
  • Patent number: 11604987
    Abstract: Various embodiments include methods and neural network computing devices implementing the methods, for generating an approximation neural network. Various embodiments may include performing approximation operations on a weights tensor associated with a layer of a neural network to generate an approximation weights tensor, determining an expected output error of the layer in the neural network due to the approximation weights tensor, subtracting the expected output error from a bias parameter of the layer to determine an adjusted bias parameter and substituting the adjusted bias parameter for the bias parameter in the layer. Such operations may be performed for one or more layers in a neural network to produce an approximation version of the neural network for execution on a resource limited processor.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: March 14, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Marinus Willem Van Baalen, Tijmen Pieter Frederik Blankevoort, Markus Nagel
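For a linear layer, the bias-correction step in this abstract can be sketched directly: the expected output error from approximating W is (W_approx - W) @ E[x], and subtracting it from the bias removes the mean shift. The function name is mine, and how E[x] is estimated (e.g., from a small calibration set) is an assumption not taken from the abstract.

```python
import numpy as np

def correct_bias(W, W_approx, bias, expected_input):
    """Illustrative sketch of the bias-correction step: compute the
    expected output error caused by approximating W, then subtract it
    from the layer's bias so the mean output is preserved."""
    expected_error = (W_approx - W) @ expected_input
    return bias - expected_error
```

With the adjusted bias, the approximated layer's output at the expected input exactly matches the original layer's.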
  • Publication number: 20230076290
    Abstract: A method for quantizing a pre-trained neural network includes computing a loss on a training set of candidate weights of the neural network. A rounding parameter is assigned to each candidate weight. The rounding parameter is a binary random value or a multinomial value. A quantized weight value is computed based on the loss and the rounding parameter.
    Type: Application
    Filed: February 4, 2021
    Publication date: March 9, 2023
    Inventors: Rana Ali AMJAD, Markus NAGEL, Tijmen Pieter Frederik BLANKEVOORT, Marinus Willem VAN BAALEN, Christos LOUIZOS
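The rounding-parameter idea can be sketched for the binary case: each weight carries a 0/1 parameter that chooses floor or floor + 1 instead of round-to-nearest. The loss-based optimization that selects these parameters is the substance of the method and is omitted here; the function name and values are illustrative.

```python
import numpy as np

def quantize_with_learned_rounding(w, scale, round_up):
    """Illustrative sketch: quantize each weight to either floor(w/scale)
    or floor(w/scale) + 1 on the quantization grid, as selected by a
    per-weight binary rounding parameter. In the described method the
    parameters are chosen by minimizing a loss on a training set."""
    return (np.floor(w / scale) + round_up) * scale

w = np.array([0.23, 0.27])
q = quantize_with_learned_rounding(w, scale=0.1, round_up=np.array([0, 1]))
```

Note the second weight rounds up here even though nearest-rounding would also round it up; in general the learned choice can differ from round-to-nearest when that lowers the task loss.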
  • Publication number: 20230058159
    Abstract: Various embodiments include methods and devices for joint mixed-precision quantization and structured pruning. Embodiments may include determining whether a plurality of gates of quantization and pruning gates are selected for combination, and in response to determining that the plurality of gates are selected for combination, iteratively for each successive gate of the plurality of gates selected for combination quantizing a residual error of a quantized tensor to a scale of a next bit-width producing a residual error quantized tensor in which the next bit-width increases for each successive iteration, and adding the quantized tensor and the residual error quantized tensor producing a next quantized tensor in which the next quantized tensor has the next bit-width, and in which the next quantized tensor is the quantized tensor for a successive iteration.
    Type: Application
    Filed: April 29, 2021
    Publication date: February 23, 2023
    Inventors: Marinus Willem VAN BAALEN, Christos LOUIZOS, Markus NAGEL, Tijmen Pieter Frederik BLANKEVOORT, Rana Ali AMJAD
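The residual-refinement step this abstract describes can be checked numerically: quantize at a low bit-width, quantize the residual error at the next bit-width's finer scale, and add the two tensors. The sketch below is an illustrative reading (function names, ranges, and the scale relation are my assumptions), and the gate-selection logic is omitted.

```python
import numpy as np

def fake_quant(x, scale, bits):
    """Round-to-nearest quantization onto a signed `bits`-wide grid."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

def refine_with_residual(x, scale, low_bits, high_bits):
    """Illustrative sketch of the combination step: quantize x at the
    low bit-width, quantize the residual error at the scale of the
    higher bit-width, and add the two quantized tensors to obtain a
    higher-precision quantized tensor."""
    q = fake_quant(x, scale, low_bits)
    residual = x - q
    q_res = fake_quant(residual, scale / 2 ** (high_bits - low_bits), high_bits)
    return q + q_res
```

The refined tensor is never farther from the original values than the coarse one, which is what lets the method step up through bit-widths iteratively.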
  • Publication number: 20220245457
    Abstract: Various embodiments include methods and devices for neural network pruning. Embodiments may include receiving as an input a weight tensor for a neural network, increasing a level of sparsity of the weight tensor generating a sparse weight tensor, updating the neural network using the sparse weight tensor generating an updated weight tensor, decreasing a level of sparsity of the updated weight tensor generating a dense weight tensor, increasing the level of sparsity of the dense weight tensor generating a final sparse weight tensor, and using the neural network with the final sparse weight tensor to generate inferences. Some embodiments may include increasing a level of sparsity of a first sparse weight tensor generating a second sparse weight tensor, updating the neural network using the second sparse weight tensor generating a second updated weight tensor, and decreasing the level of sparsity of the second updated weight tensor.
    Type: Application
    Filed: November 23, 2021
    Publication date: August 4, 2022
    Inventors: Suraj SRINIVAS, Tijmen Pieter Frederik BLANKEVOORT, Andrey KUZMIN, Markus NAGEL, Marinus Willem VAN BAALEN, Andrii SKLIAR
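One sparsification step from this abstract can be sketched as magnitude pruning; the training updates and the partial re-densification between pruning steps are omitted. Magnitude-based selection is my assumption, as the abstract does not say how weights are chosen for removal.

```python
import numpy as np

def prune_smallest(w, sparsity):
    """Illustrative sketch of a single sparsification step: zero the
    given fraction of weights with the smallest magnitudes. The
    described procedure alternates such pruning with network updates
    and with partially re-densifying the tensor before pruning again."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

w = np.array([[0.1, -0.5], [0.8, -0.05]])
sparse_w = prune_smallest(w, sparsity=0.5)  # the two smallest entries become 0
```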
  • Publication number: 20200302299
    Abstract: Various embodiments include methods and neural network computing devices implementing the methods for performing quantization in neural networks. Various embodiments may include equalizing ranges of weight tensors or output channel weights within a first layer of the neural network by scaling each of the output channel weights of the first layer by a corresponding scaling factor, and scaling each of a second adjacent layer's corresponding input channel weights by applying an inverse of the corresponding scaling factor to the input channel weights. The corresponding scaling factor may be determined using a black-box optimizer on a quantization error metric or based on heuristics, equalization of dynamic ranges, equalization of range extrema (minima or maxima), differential learning using straight through estimator (STE) methods and a local or global loss, or using an error metric for the quantization error and a black-box optimizer that minimizes the error metric with respect to the scaling.
    Type: Application
    Filed: March 23, 2020
    Publication date: September 24, 2020
    Inventors: Markus NAGEL, Marinus Willem VAN BAALEN, Tijmen Pieter Frederik BLANKEVOORT
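The scaling identity at the heart of this abstract can be sketched for a pair of fully connected layers with a ReLU between them. The per-channel factors below use the range-equalization heuristic, one of several options the abstract lists; the function name, shapes, and the choice of absolute-maximum as the range are my assumptions.

```python
import numpy as np

def equalize_pair(W1, b1, W2):
    """Illustrative sketch of cross-layer equalization: scale output
    channel i of W1 (and its bias) by 1/s_i and input channel i of W2
    by s_i. Because ReLU is positively homogeneous, the network
    function is unchanged. Choosing s_i = sqrt(r1_i / r2_i) makes both
    layers' per-channel ranges equal to sqrt(r1_i * r2_i)."""
    r1 = np.max(np.abs(W1), axis=1)   # range of output channel i of W1
    r2 = np.max(np.abs(W2), axis=0)   # range of input channel i of W2
    s = np.sqrt(r1 / r2)
    return W1 / s[:, None], b1 / s, W2 * s[None, :]
```

After equalization the matched channel ranges of the two layers coincide, which is what makes per-tensor quantization less lossy.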
  • Publication number: 20200302298
    Abstract: Various embodiments include methods and neural network computing devices implementing the methods for generating an approximation neural network that corrects for errors due to approximation operations. Various embodiments may include performing approximation operations on a weights tensor associated with a layer of a neural network to generate an approximation weights tensor, determining an expected output error of the layer in the neural network due to the approximation weights tensor, subtracting the expected output error from a bias parameter of the layer to determine an adjusted bias parameter and substituting the adjusted bias parameter for the bias parameter in the layer. Such operations may be performed for all layers in a neural network to produce an approximation version of the neural network for execution on a resource limited processor.
    Type: Application
    Filed: March 23, 2020
    Publication date: September 24, 2020
    Inventors: Marinus Willem VAN BAALEN, Tijmen Pieter Frederik BLANKEVOORT, Markus NAGEL