Patents by Inventor Prerit DAK

Prerit DAK has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Multi-accelerator compute dispatch

Patent number: 12165252

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.

Type: Grant

Filed: October 3, 2023

Date of Patent: December 10, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak
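
The dispatch flow described in the abstract above can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the patented implementation: the `Chiplet` class, the `dispatch_packet` function, the round-robin assignment policy, and the use of squaring as stand-in "work" are all hypothetical choices made for illustration, and real chiplets would run concurrently.

```python
from dataclasses import dataclass, field


@dataclass
class Chiplet:
    chiplet_id: int
    # Ids of chiplets known (via notification) to have finished the current packet.
    peers_done: set = field(default_factory=set)

    def execute(self, workgroups):
        # Stand-in for real kernel execution: "run" a workgroup by squaring its id.
        return [wg * wg for wg in workgroups]


def dispatch_packet(num_workgroups, chiplets, notify_client):
    """Run one kernel dispatch packet across all chiplets (sequential simulation)."""
    for chiplet in chiplets:
        chiplet.peers_done.clear()

    # Assumption: round-robin assignment; the abstract does not name a policy.
    stride = len(chiplets)
    assignment = [list(range(i, num_workgroups, stride)) for i in range(stride)]

    results = {}
    for chiplet, workgroups in zip(chiplets, assignment):
        results[chiplet.chiplet_id] = chiplet.execute(workgroups)
        # Once its own share is done, the chiplet notifies every chiplet
        # (itself included here, purely for bookkeeping).
        for other in chiplets:
            other.peers_done.add(chiplet.chiplet_id)

    # The packet is complete when every chiplet has seen every completion;
    # only then is the client notified and the next packet allowed to start.
    all_ids = {chiplet.chiplet_id for chiplet in chiplets}
    if all(chiplet.peers_done == all_ids for chiplet in chiplets):
        notify_client(num_workgroups)
    return results


chiplets = [Chiplet(i) for i in range(3)]
completed_packets = []
first_results = dispatch_packet(5, chiplets, completed_packets.append)
second_results = dispatch_packet(4, chiplets, completed_packets.append)
```

Here two packets of 5 and 4 workgroups are processed in order, and `completed_packets` records one client notification per packet. Each chiplet tracks peer completions locally, so in this sketch no central coordinator decides when a packet is finished.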

MULTI-ACCELERATOR COMPUTE DISPATCH

Publication number: 20240029336

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.

Type: Application

Filed: October 3, 2023

Publication date: January 25, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak

Multi-accelerator compute dispatch

Patent number: 11790590

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.

Type: Grant

Filed: March 31, 2021

Date of Patent: October 17, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak

Fused convolution and batch normalization for neural networks

Patent number: 11573765

Abstract: A processing unit implements a convolutional neural network (CNN) by fusing at least a portion of a convolution phase of the CNN with at least a portion of a batch normalization phase. The processing unit convolves two input matrices representing inputs and weights of a portion of the CNN to generate an output matrix. The processing unit performs the convolution via a series of multiplication operations, with each multiplication operation generating a corresponding submatrix (or “tile”) of the output matrix at an output register of the processing unit. While an output submatrix is stored at the output register, the processing unit performs a reduction phase and an update phase of the batch normalization phase for the CNN. The processing unit thus fuses at least a portion of the batch normalization phase of the CNN with a portion of the convolution.

Type: Grant

Filed: December 13, 2018

Date of Patent: February 7, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Milind N. Nemlekar, Prerit Dak
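
The fusion idea in the abstract above can be sketched in plain Python. This is a simplified illustration under stated assumptions rather than the patented method: the function name `fused_matmul_batchnorm` and the `tile_rows` and `eps` parameters are hypothetical, the convolution is assumed to be already lowered to a matrix product, and only the reduction phase is fused with the per-tile multiplication while the update phase runs afterwards as a separate step.

```python
def fused_matmul_batchnorm(inputs, weights, tile_rows=2, eps=1e-5):
    """Multiply inputs (N x K) by weights (K x C) one row tile at a time,
    gathering per-column batch-norm statistics while each tile is still local."""
    n, k, c = len(inputs), len(weights), len(weights[0])
    col_sum = [0.0] * c
    col_sq_sum = [0.0] * c
    output = []

    for start in range(0, n, tile_rows):
        # One "multiplication operation": compute a tile of output rows.
        tile = [
            [sum(row[i] * weights[i][j] for i in range(k)) for j in range(c)]
            for row in inputs[start:start + tile_rows]
        ]
        # Reduction phase, fused: accumulate statistics before the tile is stored,
        # so the output never has to be re-read just to compute mean and variance.
        for out_row in tile:
            for j, value in enumerate(out_row):
                col_sum[j] += value
                col_sq_sum[j] += value * value
        output.extend(tile)

    # Update phase (not fused in this sketch): normalize each column
    # using the accumulated mean and variance.
    mean = [s / n for s in col_sum]
    var = [max(sq / n - m * m, 0.0) for sq, m in zip(col_sq_sum, mean)]
    return [
        [(value - mean[j]) / (var[j] + eps) ** 0.5 for j, value in enumerate(row)]
        for row in output
    ]


activations = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
identity_weights = [[1.0, 0.0], [0.0, 1.0]]
normalized = fused_matmul_batchnorm(activations, identity_weights)
```

With identity weights the product equals the input, so each column of `normalized` should have a mean close to zero and a variance close to one.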

MULTI-ACCELERATOR COMPUTE DISPATCH

Publication number: 20220319089

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.

Type: Application

Filed: March 31, 2021

Publication date: October 6, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak

FUSED CONVOLUTION AND BATCH NORMALIZATION FOR NEURAL NETWORKS

Publication number: 20200192631

Abstract: A processing unit implements a convolutional neural network (CNN) by fusing at least a portion of a convolution phase of the CNN with at least a portion of a batch normalization phase. The processing unit convolves two input matrices representing inputs and weights of a portion of the CNN to generate an output matrix. The processing unit performs the convolution via a series of multiplication operations, with each multiplication operation generating a corresponding submatrix (or “tile”) of the output matrix at an output register of the processing unit. While an output submatrix is stored at the output register, the processing unit performs a reduction phase and an update phase of the batch normalization phase for the CNN. The processing unit thus fuses at least a portion of the batch normalization phase of the CNN with a portion of the convolution.

Type: Application

Filed: December 13, 2018

Publication date: June 18, 2020

Inventors: Milind N. Nemlekar, Prerit Dak
