Patents by Inventor Tijmen Pieter Frederik BLANKEVOORT

Tijmen Pieter Frederik BLANKEVOORT has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250245494
    Abstract: Systems and techniques are described herein for quantizing a codebook used in the context of quantizing post-training parameters (e.g., vectors of weights) of a pre-trained model. For example, a device can perform rank reduction on a tensor of a codebook associated with parameters of a layer of a pre-trained machine learning model to generate a first tensor factor having a first shape and a second tensor factor having a second shape. The device can perform an optimization technique on the first tensor factor and the second tensor factor to minimize an output reconstruction error of the layer. The device can quantize the first tensor factor to generate a reduced size codebook.
    Type: Application
    Filed: August 30, 2024
    Publication date: July 31, 2025
    Inventors: Marinus Willem VAN BAALEN, Eric Wayne MAHURIN, Paul Nicholas WHATMOUGH, Andrey KUZMIN, Markus NAGEL, Tijmen Pieter Frederik BLANKEVOORT
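As a rough sketch of the factor-then-quantize idea described above (all shapes, the rank, and the int8 target are illustrative assumptions; the patent's output-reconstruction optimization step is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((256, 64))   # illustrative codebook: 256 codewords of dim 64

# Rank reduction: factor the codebook tensor into two factors via truncated SVD.
rank = 8
U, S, Vt = np.linalg.svd(codebook, full_matrices=False)
first_factor = U[:, :rank] * S[:rank]       # first shape: (256, 8)
second_factor = Vt[:rank, :]                # second shape: (8, 64)

# Quantize the first factor to int8, yielding a reduced-size codebook.
scale = np.abs(first_factor).max() / 127.0
q_first = np.clip(np.round(first_factor / scale), -127, 127).astype(np.int8)

reconstructed = (q_first.astype(np.float64) * scale) @ second_factor
rel_err = np.linalg.norm(codebook - reconstructed) / np.linalg.norm(codebook)
```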
  • Publication number: 20250124255
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using a generative artificial intelligence model. The method generally includes receiving an input query for processing. Using a first generative artificial intelligence model, an embedding representation of the received input query is generated. The embedding representation generally includes an embedding of the received input query in a first dimensionality. The embedding representation is projected into a projected representation of the received input query. Generally, the projected representation comprises a representation in a second dimensionality. A response to the received input query is generated using a second generative artificial intelligence model and the projected representation, and the generated response is output.
    Type: Application
    Filed: October 13, 2023
    Publication date: April 17, 2025
    Inventors: Benjamin BERGNER, Andrii SKLIAR, Babak EHTESHAMI BEJNORDI, Yuki Markus ASANO, Tijmen Pieter Frederik BLANKEVOORT, Joseph Binamira SORIAGA
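The projection between the two models' embedding spaces can be sketched as a single learned linear map (the dimensions and random initialization here are illustrative stand-ins for the trained projector):

```python
import numpy as np

rng = np.random.default_rng(0)
d_first, d_second = 128, 512   # illustrative first and second dimensionalities

# Learned projection between the two models' embedding spaces
# (trained in practice; a random stand-in here).
proj = rng.standard_normal((d_first, d_second)) * 0.02

def project_query_embedding(query_embedding):
    # Embedding from the first generative model (first dimensionality),
    # projected into the second model's input space (second dimensionality).
    return query_embedding @ proj
```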
  • Patent number: 12271800
    Abstract: Various aspects provide methods for learning, such as continual learning, that support task-incremental learning using a multi-head classification architecture. Various aspects may enable conditional computing to support multi-head classification. Various aspects provide methods for learning, such as continual learning, that support class-incremental learning using a single-head classification architecture. Various aspects may enable conditional computing to support single-head classification by predicting the task associated with a given test input and selecting an associated classification head based at least in part on the task prediction.
    Type: Grant
    Filed: November 13, 2020
    Date of Patent: April 8, 2025
    Assignee: QUALCOMM Incorporated
    Inventors: Davide Abati, Babak Ehteshami Bejnordi, Jakub Mikolaj Tomczak, Tijmen Pieter Frederik Blankevoort
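A minimal sketch of the conditional single-head selection idea, assuming a nearest-prototype task predictor (the prototypes and heads below are random stand-ins for trained components):

```python
import numpy as np

rng = np.random.default_rng(0)
feat_dim, n_tasks, n_classes = 16, 3, 5

# One classification head per task (multi-head architecture).
heads = rng.standard_normal((n_tasks, feat_dim, n_classes))
# Per-task prototypes used to predict which task an input belongs to
# (e.g. a mean backbone feature per task; random stand-ins here).
task_prototypes = rng.standard_normal((n_tasks, feat_dim))

def classify(features):
    # Conditional computation: predict the task, then run only the
    # classification head selected by that prediction.
    task = int(np.argmax(task_prototypes @ features))
    logits = features @ heads[task]
    return task, int(np.argmax(logits))
```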
  • Publication number: 20250111232
    Abstract: An apparatus has one or more memories and one or more processor(s) coupled to the memories. The processor(s) is configured to estimate a local curvature of a loss landscape of a neural network. The processor(s) is also configured to dynamically allocate parameters to be removed from the neural network based on the local curvature. The processor(s) is further configured to update remaining weights of the neural network based on the parameters to be removed.
    Type: Application
    Filed: September 28, 2023
    Publication date: April 3, 2025
    Inventors: Tycho VAN DER OUDERAA, Markus NAGEL, Marinus Willem VAN BAALEN, Tijmen Pieter Frederik BLANKEVOORT
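The allocation step can be sketched with an OBD-style saliency, assuming a diagonal curvature estimate (the patent's update of the remaining weights is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal(100)
curvature = np.abs(rng.standard_normal(100)) + 1e-3  # diagonal loss-curvature estimate

# OBD-style saliency: estimated loss increase from removing each weight,
# given the local curvature of the loss landscape.
saliency = 0.5 * curvature * weights ** 2

# Dynamically allocate the removals to the lowest-saliency weights.
n_remove = 30
remove_idx = np.argsort(saliency)[:n_remove]
pruned = weights.copy()
pruned[remove_idx] = 0.0
```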
  • Publication number: 20250103882
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for efficiently adapting a machine learning model from a base task to a downstream task based on frozen matrices. An example method generally includes receiving an input for processing through a layer of a neural network. An output of the layer of the neural network is generated based on a first product, the first product being based on a first trainable scaling vector, a first frozen matrix, a second trainable scaling vector, a second frozen matrix, and the received input.
    Type: Application
    Filed: February 21, 2024
    Publication date: March 27, 2025
    Inventors: Dawid Jan KOPICZKO, Yuki Markus ASANO, Tijmen Pieter Frederik BLANKEVOORT
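A minimal sketch of such a layer, assuming an adapter form resembling the inventors' VeRA work (two frozen random matrices, two trainable scaling vectors); all names and shapes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4

W0 = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in))        # first frozen (random) matrix
B = rng.standard_normal((d_out, r))       # second frozen (random) matrix
d_vec = np.full(r, 0.1)                   # first trainable scaling vector
b_vec = np.zeros(d_out)                   # second trainable scaling vector (zero-init)

def layer(x):
    # Only the two scaling vectors are trained; the matrices stay frozen.
    adapter = b_vec * (B @ (d_vec * (A @ x)))
    return W0 @ x + adapter
```

With the second scaling vector initialized to zero, the layer starts out identical to the frozen base layer.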
  • Publication number: 20250094781
    Abstract: Systems and techniques are described herein for training and using multitask machine learning models. For example, a computing device can obtain training data for a first task in a layer in a neural network; perform, based on a determination from a first gating mechanism, the shared function on shared features of the training data using at least one shared channel to generate a shared feature map; perform, based on the determination from the first gating mechanism, the first task-specific function on first task-specific features of the training data using at least one first task-specific channel to generate a first task-specific feature map; generate an output for the first task-specific branch based on performing the shared function on the shared features and performing the first task-specific function on the first task-specific features; and update at least one parameter of the first gating mechanism based on the output.
    Type: Application
    Filed: September 18, 2023
    Publication date: March 20, 2025
    Inventors: Babak EHTESHAMI BEJNORDI, Gaurav KUMAR, Amelie Marie Estelle ROYER, Eduardo ESTEVES, Tijmen Pieter Frederik BLANKEVOORT, Mohsen GHAFOORIAN
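A toy sketch of the gating decision, assuming a simple linear gate over the input (random stand-ins for the learned gating mechanism and channels):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 6
W_shared = rng.standard_normal((dim, dim))   # shared channels
W_task = rng.standard_normal((dim, dim))     # first task-specific channels
gate_W = rng.standard_normal((2, dim))       # first gating mechanism (learned in practice)

def task_branch_output(x):
    # The gating mechanism decides, per input, whether to run the shared
    # channels, the task-specific channels, or both; the branch output
    # combines whichever feature maps were computed.
    gates = gate_W @ x > 0
    out = np.zeros(dim)
    if gates[0]:
        out = out + W_shared @ x   # shared feature map
    if gates[1]:
        out = out + W_task @ x     # task-specific feature map
    return out
```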
  • Publication number: 20250094768
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for training and inferencing using a multi-domain machine learning model. An example method generally includes extracting, using a first neural network block, a plurality of features associated with inputs in a multi-domain input data set. A confusion matrix is generated based on the extracted plurality of features. A plurality of clusters is identified from the confusion matrix. Each cluster in the plurality of clusters generally corresponds to one or more data domains in the multi-domain input data set. A first gating neural network is trained based on the multi-domain input data set and the identified plurality of clusters. A plurality of second neural network blocks is trained based on a division of the multi-domain input data set into data associated with each cluster of the plurality of clusters.
    Type: Application
    Filed: September 15, 2023
    Publication date: March 20, 2025
    Inventors: Andrii SKLIAR, Babak EHTESHAMI BEJNORDI, Amelie Marie Estelle ROYER, Tijmen Pieter Frederik BLANKEVOORT, Dushyant MEHTA
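The cluster-identification step can be sketched as connected components over high-confusion domain pairs (the threshold and the component heuristic are illustrative; the patent does not specify this particular clustering):

```python
import numpy as np

def cluster_domains(confusion, tau=0.2):
    # Domains the feature extractor confuses with each other are grouped
    # into one cluster: connected components over pairs whose symmetrized
    # confusion exceeds the threshold.
    n = confusion.shape[0]
    sim = (confusion + confusion.T) / 2.0
    adj = sim > tau
    labels = -np.ones(n, dtype=int)
    c = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack = [i]
        while stack:
            j = stack.pop()
            if labels[j] >= 0:
                continue
            labels[j] = c
            stack.extend(np.nonzero(adj[j])[0].tolist())
        c += 1
    return labels
```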
  • Patent number: 12242956
    Abstract: Various embodiments include methods and neural network computing devices implementing the methods for performing quantization in neural networks. Various embodiments may include equalizing ranges of weight tensors or output channel weights within a first layer of the neural network by scaling each of the output channel weights of the first layer by a corresponding scaling factor, and scaling each of a second adjacent layer's corresponding input channel weights by applying an inverse of the corresponding scaling factor to the input channel weights. The corresponding scaling factor may be determined using a black-box optimizer on a quantization error metric or based on heuristics, equalization of dynamic ranges, equalization of range extrema (minima or maxima), differential learning using straight through estimator (STE) methods and a local or global loss, or using an error metric for the quantization error and a black-box optimizer that minimizes the error metric with respect to the scaling.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: March 4, 2025
    Assignee: QUALCOMM Incorporated
    Inventors: Markus Nagel, Marinus Willem van Baalen, Tijmen Pieter Frederik Blankevoort
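The range-equalization case can be sketched for two ReLU-separated layers: with the scale s_i = sqrt(r1_i / r2_i), both layers end up with per-channel range sqrt(r1_i * r2_i) while the network function is preserved (sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))    # layer 1: 8 output channels
W2 = rng.standard_normal((6, 8))    # layer 2: matching 8 input channels

r1 = np.abs(W1).max(axis=1)         # per-output-channel range extrema, layer 1
r2 = np.abs(W2).max(axis=0)         # per-input-channel range extrema, layer 2

# Equalize dynamic ranges: scale each layer-1 output channel and apply the
# inverse scale to the corresponding layer-2 input channel. The function is
# preserved for positively homogeneous activations such as ReLU.
s = np.sqrt(r1 / r2)
W1_eq = W1 / s[:, None]
W2_eq = W2 * s[None, :]
```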
  • Publication number: 20250005452
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for mitigating weight oscillation during quantization-aware training. In one example, a method includes identifying oscillation of a parameter of a machine learning model during quantization-aware training of the machine learning model, and applying an oscillation mitigation procedure during the quantization-aware training of the machine learning model in response to identifying the oscillation, the oscillation mitigation procedure comprising at least one of oscillation dampening or parameter freezing.
    Type: Application
    Filed: January 24, 2023
    Publication date: January 2, 2025
    Inventors: Markus NAGEL, Marios FOURNARAKIS, Tijmen Pieter Frederik BLANKEVOORT, Yelysei BONDARENKO
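Identifying oscillation can be sketched by counting how often a weight's rounded integer value flips across training iterations (the flip-frequency threshold is an illustrative assumption):

```python
import numpy as np

def oscillating(int_weight_history, freq_threshold=0.3):
    # A weight oscillates when its rounded integer value keeps flipping
    # between adjacent grid points during quantization-aware training.
    # Rows of int_weight_history are training iterations.
    flips = np.diff(int_weight_history, axis=0) != 0
    return flips.mean(axis=0) > freq_threshold   # candidates to dampen or freeze
```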
  • Publication number: 20240386239
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for processing data using a transformer neural network. The method generally includes receiving an input for processing using a transformer neural network. An attention output is generated in the transformer neural network. Generally, the attention output may be generated such that outlier values for the attention output are attenuated in the transformer neural network. An output of the transformer neural network is generated based on the generated attention output.
    Type: Application
    Filed: October 6, 2023
    Publication date: November 21, 2024
    Inventors: Yelysei BONDARENKO, Markus NAGEL, Tijmen Pieter Frederik BLANKEVOORT
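One published way to attenuate such outliers, from work by these inventors, is a "clipped softmax" that can emit exact 0/1 attention probabilities without driving logits to outlier magnitudes (the gamma/zeta values are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def clipped_softmax(x, gamma=-0.03, zeta=1.03, axis=-1):
    # Stretch softmax outputs slightly past [0, 1], then clip back.
    # Exact zeros and ones become reachable at moderate logit values,
    # attenuating the outliers a plain softmax would need.
    return np.clip((zeta - gamma) * softmax(x, axis=axis) + gamma, 0.0, 1.0)
```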
  • Patent number: 12131258
    Abstract: A method for compressing a deep neural network includes determining a pruning ratio for a channel and a mixed-precision quantization bit-width based on an operational budget of a device implementing the deep neural network. The method further includes quantizing a weight parameter of the deep neural network and/or an activation parameter of the deep neural network based on the quantization bit-width. The method also includes pruning the channel of the deep neural network based on the pruning ratio.
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: October 29, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Yadong Lu, Ying Wang, Tijmen Pieter Frederik Blankevoort, Christos Louizos, Matthias Reisser, Jilei Hou
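The budget check behind choosing a pruning ratio and bit-width can be sketched with a deliberately simplified cost model (the patent's actual search procedure is not shown):

```python
def fits_budget(n_params, pruning_ratio, bit_width, budget_bytes):
    # Parameter memory after pruning a fraction of the channels and
    # quantizing the remaining weights to the chosen bit-width.
    kept = n_params * (1.0 - pruning_ratio)
    return kept * bit_width / 8.0 <= budget_bytes
```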
  • Publication number: 20240161487
    Abstract: Systems and techniques are described for adaptive mixed-resolution processing. According to some aspects, a device can divide an input image into first tokens having a first resolution and second tokens having a second resolution. The device can generate first token representations for token(s) from the first tokens corresponding to a first region of the input image and generate second token representations for token(s) from the second tokens corresponding to the first region of the input image. The device can process, using a neural network model, the first token representations and the second token representations to determine the first resolution or the second resolution as a scale for the first region of the input image. The device can process, using a transformer neural network model, the first region of the input image according to the scale for the first region.
    Type: Application
    Filed: September 29, 2023
    Publication date: May 16, 2024
    Inventors: Jakob DRACHMANN HAVTORN, Amelie Marie Estelle ROYER, Tijmen Pieter Frederik BLANKEVOORT, Babak EHTESHAMI BEJNORDI
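A toy stand-in for the per-region scale decision, using patch variance in place of the patent's neural network model (the sizes and threshold are illustrative):

```python
import numpy as np

def region_scale(patch, fine=4, coarse=8, tau=0.5):
    # A flat region keeps the coarse-resolution token; a detailed region
    # is represented at the finer token resolution instead.
    return fine if patch.std() > tau else coarse
```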
  • Publication number: 20230376272
    Abstract: A processor-implemented method for fast floating point simulations with learnable parameters includes receiving a single precision input. An integer quantization process is performed on the input. Each element of the input is scaled based on a scaling parameter to generate an m-bit floating point output, where m is an integer.
    Type: Application
    Filed: January 27, 2023
    Publication date: November 23, 2023
    Inventors: Marinus Willem VAN BAALEN, Jorn Wilhelmus Timotheus PETERS, Markus NAGEL, Tijmen Pieter Frederik BLANKEVOORT, Andrey KUZMIN
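A minimal sketch of simulating an m-bit float via integer quantization plus a per-element power-of-two scaling parameter (here the scale is computed directly rather than learned as in the patent):

```python
import numpy as np

def simulate_mbit_float(x, mantissa_bits=3):
    # Each element's scaling parameter snaps it to a power-of-two exponent,
    # then the mantissa is integer-quantized, simulating an m-bit float.
    x = np.asarray(x, dtype=np.float64)
    scale = 2.0 ** (np.floor(np.log2(np.abs(x) + 1e-30)) - mantissa_bits)
    return np.round(x / scale) * scale
```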
  • Publication number: 20230334324
    Abstract: A computing device may be configured to intelligently activate gating within a current layer of a neural network that includes two or more filters. The computing device may receive a layer-specific input data that is specific to the current layer of the neural network, generate statistics based on the received layer-specific input data; and use the generated statistics to assign a relevance score to each of the two or more filters. Each assigned relevance score may indicate the relevance of the corresponding filter to the received layer-specific input data. The computing device may determine an activation status of each of the two or more filters in the current layer based on the identified relevance and apply the received layer-specific input data to the activated filters in the two or more filters to generate an output activation for the current layer of the neural network.
    Type: Application
    Filed: June 20, 2023
    Publication date: October 19, 2023
    Inventors: Babak EHTESHAMI BEJNORDI, Tijmen Pieter Frederik BLANKEVOORT, Max WELLING
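A minimal sketch of relevance-scored filter gating, assuming channel means as the layer-input statistics and a top-k activation rule (the linear scorer is a random stand-in for a learned one):

```python
import numpy as np

rng = np.random.default_rng(0)
n_filters, in_ch = 4, 3
score_proj = rng.standard_normal((n_filters, in_ch))  # learned scorer (random stand-in)

def active_filters(x, k=2):
    # Statistics of the layer-specific input (channel means) give each
    # filter a relevance score; only the top-k filters are activated,
    # and only those would be applied to the input.
    stats = x.mean(axis=(1, 2))
    relevance = score_proj @ stats
    return sorted(np.argsort(relevance)[-k:].tolist())
```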
  • Publication number: 20230306233
    Abstract: A processor-implemented method includes bit shifting a binary representation of a neural network parameter. The neural network parameter has fewer bits, b, than the number of hardware bits, B, supported by hardware that processes the neural network parameter. The bit shifting effectively multiplies the neural network parameter by 2^(B-b). The method also includes dividing a quantization scale by 2^(B-b) to obtain an updated quantization scale. The method further includes quantizing the bit-shifted binary representation with the updated quantization scale to obtain a value for the neural network parameter.
    Type: Application
    Filed: January 30, 2023
    Publication date: September 28, 2023
    Inventors: Marinus Willem VAN BAALEN, Brian KAHNE, Eric Wayne MAHURIN, Tijmen Pieter Frederik BLANKEVOORT, Andrey KUZMIN, Andrii SKLIAR, Markus NAGEL
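The bit-shift trick is easy to verify numerically: the represented real value is unchanged because the scale is divided by exactly the factor the integer was multiplied by (the concrete bit counts and scale below are illustrative):

```python
B_bits, b_bits = 16, 8     # hardware bits B vs parameter bits b
param = 57                 # b-bit integer representation of the parameter
scale = 0.05               # original quantization scale

shifted = param << (B_bits - b_bits)         # multiply by 2**(B - b)
new_scale = scale / 2 ** (B_bits - b_bits)   # divide the quantization scale
# The represented real value param * scale is unchanged.
```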
  • Publication number: 20230281510
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for machine learning. In one aspect, base model output data is generated, the generating including processing input data with at least a portion of a base model of a machine learning model architecture, and the base model output data is processed with a routing model of the machine learning model architecture in order to determine a selected expert model, of a plurality of expert models, with which to process the base model output data. Expert model output data is generated, where generating the expert model output data includes processing the base model output data with the selected expert model, and final output data from the machine learning model architecture is generated, where generating the final output data includes processing the base model output data and the expert model output data with an ensemble model of the machine learning model architecture.
    Type: Application
    Filed: January 13, 2023
    Publication date: September 7, 2023
    Inventors: Amelie Marie Estelle ROYER, Ilia KARMANOV, Andrii SKLIAR, Babak EHTESHAMI BEJNORDI, Tijmen Pieter Frederik BLANKEVOORT
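A minimal sketch of the base-model / routing-model / expert / ensemble pipeline (all weights are random stand-ins; a real router and experts would be trained):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_experts = 8, 3

base_W = rng.standard_normal((dim, dim))               # base model
router_W = rng.standard_normal((n_experts, dim))       # routing model
experts = [rng.standard_normal((dim, dim)) for _ in range(n_experts)]
ensemble_W = rng.standard_normal((dim, 2 * dim))       # ensemble model

def forward(x):
    base_out = base_W @ x                              # base model output data
    expert_idx = int(np.argmax(router_W @ base_out))   # router selects one expert
    expert_out = experts[expert_idx] @ base_out        # expert model output data
    # Ensemble model combines base and expert outputs into the final output.
    return ensemble_W @ np.concatenate([base_out, expert_out])
```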
  • Patent number: 11704571
    Abstract: A method for pruning weights of an artificial neural network based on a learned threshold includes determining a pruning threshold for pruning a first set of pre-trained weights of multiple pre-trained weights based on a function of a classification loss and a regularization loss. Weights are pruned from the first set of pre-trained weights when a first value of the weight is less than the pruning threshold. A second set of pre-trained weights of the multiple pre-trained weights is fine-tuned or adjusted in response to a second value of each pre-trained weight in the second set of pre-trained weights being greater than the pruning threshold.
    Type: Grant
    Filed: October 9, 2020
    Date of Patent: July 18, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Kambiz Azarian Yazdi, Tijmen Pieter Frederik Blankevoort, Jin Won Lee, Yash Sanjay Bhalgat
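The pruning step can be sketched as masking against the learned threshold (here a fixed value stands in for the threshold learned from the classification and regularization losses):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal(1000)   # pre-trained weights (random stand-ins)
threshold = 0.5                       # stand-in for the learned pruning threshold

mask = np.abs(weights) >= threshold      # weights below the threshold are pruned
pruned = np.where(mask, weights, 0.0)    # the survivors would then be fine-tuned
sparsity = 1.0 - mask.mean()
```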
  • Publication number: 20230139347
    Abstract: A processor-implemented method for providing per-embedding-group activation quantization includes receiving sequential data at a first layer of a transformer neural network. The sequential data is processed via the first layer of the transformer neural network to generate an activation tensor. The activation tensor is split into multiple groups of embeddings. Each of the embeddings groups has a different set of quantization parameters. Each of the embedding groups is quantized separately based on the corresponding quantization parameters of the different set of quantization parameters. The quantized embedding groups are multiplied with a set of weights to generate an output.
    Type: Application
    Filed: October 28, 2022
    Publication date: May 4, 2023
    Inventors: Yelysei BONDARENKO, Markus NAGEL, Tijmen Pieter Frederik BLANKEVOORT
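A minimal sketch of per-embedding-group activation quantization, assuming a simple absmax scale as each group's quantization parameter (group count and bit-width are illustrative):

```python
import numpy as np

def quantize_per_group(activations, n_groups=4, bits=8):
    # Split the embedding axis into groups; each group gets its own
    # quantization parameters and is quantized separately.
    qmax = 2 ** (bits - 1) - 1
    out = []
    for g in np.split(activations, n_groups, axis=-1):
        scale = max(np.abs(g).max() / qmax, 1e-12)
        q = np.clip(np.round(g / scale), -qmax, qmax)
        out.append(q * scale)   # dequantized, ready to multiply with the weights
    return np.concatenate(out, axis=-1)
```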
  • Publication number: 20230090941
    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for processing a video stream using a machine learning model. An example method generally includes generating a first group of tokens from a first frame of the video stream and a second group of tokens from a second frame of the video stream. A first set of tokens associated with features to be reused from the first frame and a second set of tokens associated with features to be computed from the second frame are identified based on a comparison of tokens from the first group of tokens to corresponding tokens in the second group of tokens. A feature output is generated for portions of the second frame corresponding to the second set of tokens. Features associated with the first set of tokens are combined with the generated feature output into a representation of the second frame.
    Type: Application
    Filed: September 20, 2022
    Publication date: March 23, 2023
    Inventors: Yawei LI, Bert MOONS, Tijmen Pieter Frederik BLANKEVOORT, Amirhossein HABIBIAN, Babak EHTESHAMI BEJNORDI
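The reuse decision can be sketched as a per-token change test between frames (the L2 threshold is an illustrative stand-in for the patent's token comparison):

```python
import numpy as np

def reuse_mask(tokens_prev, tokens_cur, tau=0.1):
    # Tokens whose content barely changed between frames keep their cached
    # features; only sufficiently changed tokens are recomputed.
    diff = np.linalg.norm(tokens_cur - tokens_prev, axis=-1)
    return diff < tau   # True = reuse the feature from the previous frame

def process_frame(tokens_prev, feats_prev, tokens_cur, compute_fn, tau=0.1):
    reuse = reuse_mask(tokens_prev, tokens_cur, tau)
    feats = feats_prev.copy()
    feats[~reuse] = compute_fn(tokens_cur[~reuse])   # compute only changed tokens
    return feats   # combined representation of the second frame
```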
  • Patent number: 11604987
    Abstract: Various embodiments include methods and neural network computing devices implementing the methods, for generating an approximation neural network. Various embodiments may include performing approximation operations on a weights tensor associated with a layer of a neural network to generate an approximation weights tensor, determining an expected output error of the layer in the neural network due to the approximation weights tensor, subtracting the expected output error from a bias parameter of the layer to determine an adjusted bias parameter and substituting the adjusted bias parameter for the bias parameter in the layer. Such operations may be performed for one or more layers in a neural network to produce an approximation version of the neural network for execution on a resource limited processor.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: March 14, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Marinus Willem Van Baalen, Tijmen Pieter Frederik Blankevoort, Markus Nagel
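The bias-adjustment step can be sketched directly: the expected output error of the approximated weights, (W_q - W) E[x], is subtracted from the bias (int8 rounding stands in for the approximation operation; E[x] is a random stand-in for calibration or batch-norm statistics):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 32))    # weights tensor of one layer
bias = rng.standard_normal(16)       # bias parameter of that layer

# Approximation of the weights (here: simple int8 round-trip).
scale = np.abs(W).max() / 127.0
W_q = np.round(W / scale).clip(-127, 127) * scale

# Expected output error E[(W_q - W) x] = (W_q - W) E[x].
x_mean = rng.standard_normal(32)     # stand-in for E[x]
expected_err = (W_q - W) @ x_mean

bias_adj = bias - expected_err       # adjusted bias absorbs the expected error
```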