Patents by Inventor Prerit DAK

Prerit DAK has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12165252
    Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.
    Type: Grant
    Filed: October 3, 2023
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak
  • Publication number: 20240029336
    Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.
    Type: Application
    Filed: October 3, 2023
    Publication date: January 25, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak
  • Patent number: 11790590
    Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: October 17, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak
  • Patent number: 11573765
    Abstract: A processing unit implements a convolutional neural network (CNN) by fusing at least a portion of a convolution phase of the CNN with at least a portion of a batch normalization phase. The processing unit convolves two input matrices representing inputs and weights of a portion of the CNN to generate an output matrix. The processing unit performs the convolution via a series of multiplication operations, with each multiplication operation generating a corresponding submatrix (or “tile”) of the output matrix at an output register of the processing unit. While an output submatrix is stored at the output register, the processing unit performs a reduction phase and an update phase of the batch normalization phase for the CNN. The processing unit thus fuses at least a portion of the batch normalization phase of the CNN with a portion of the convolution.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: February 7, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Prerit Dak
  • Publication number: 20220319089
    Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak
  • Publication number: 20200192631
    Abstract: A processing unit implements a convolutional neural network (CNN) by fusing at least a portion of a convolution phase of the CNN with at least a portion of a batch normalization phase. The processing unit convolves two input matrices representing inputs and weights of a portion of the CNN to generate an output matrix. The processing unit performs the convolution via a series of multiplication operations, with each multiplication operation generating a corresponding submatrix (or “tile”) of the output matrix at an output register of the processing unit. While an output submatrix is stored at the output register, the processing unit performs a reduction phase and an update phase of the batch normalization phase for the CNN. The processing unit thus fuses at least a portion of the batch normalization phase of the CNN with a portion of the convolution.
    Type: Application
    Filed: December 13, 2018
    Publication date: June 18, 2020
    Inventors: Milind N. NEMLEKAR, Prerit DAK