Patents by Inventor Tariq Afzal
Tariq Afzal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12169786
Abstract: Described herein is a neural network accelerator (NNA) with reconfigurable memory resources for forming a set of local memory buffers comprising at least one activation buffer, at least one weight buffer, and at least one output buffer. The NNA supports a plurality of predefined memory configurations that are optimized for maximizing throughput and reducing overall power consumption in different types of neural networks. The memory configurations differ with respect to at least one of a total amount of activation, weight, or output buffer memory, or a total number of activation, weight, or output buffers. Depending on which type of neural network is being executed and the memory behavior of the specific neural network, a memory configuration can be selected accordingly.
Type: Grant
Filed: June 27, 2019
Date of Patent: December 17, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Tariq Afzal, Arvind Mandhani, Shiva Navab
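The idea of predefined memory configurations can be illustrated with a minimal sketch. This is not the patented implementation; the configuration names, sizes, and the 256 KB budget are all invented for illustration.

```python
# Illustrative sketch only: selecting one of several predefined local-buffer
# configurations for a hypothetical NNA. All names and sizes are invented.
from dataclasses import dataclass

@dataclass(frozen=True)
class MemoryConfig:
    activation_kb: int  # total activation-buffer memory
    weight_kb: int      # total weight-buffer memory
    output_kb: int      # total output-buffer memory

# Hypothetical predefined configurations within a fixed 256 KB budget.
PREDEFINED_CONFIGS = {
    "conv_heavy": MemoryConfig(activation_kb=128, weight_kb=96, output_kb=32),
    "fc_heavy":   MemoryConfig(activation_kb=64,  weight_kb=160, output_kb=32),
    "balanced":   MemoryConfig(activation_kb=96,  weight_kb=96,  output_kb=64),
}

def select_config(network_type: str) -> MemoryConfig:
    """Pick the predefined configuration matching the network's memory behavior."""
    return PREDEFINED_CONFIGS.get(network_type, PREDEFINED_CONFIGS["balanced"])
```

The point the abstract makes is that a fully-connected network (weight-dominated) and a convolutional network (activation-dominated) want different splits of the same physical memory, so the accelerator picks among fixed, pre-optimized layouts rather than reconfiguring arbitrarily.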
-
Patent number: 11868867
Abstract: Described herein is a neural network accelerator (NNA) with a decompression unit that can be configured to perform multiple types of decompression. The decompression unit may include a separate subunit for each decompression type. The subunits can be coupled to form a pipeline in which partially decompressed results generated by one subunit are input for further decompression by another subunit. Depending on which types of compression were applied to the incoming data, any number of the subunits may be used to produce a decompressed output. In some embodiments, the decompression unit is configured to decompress data that has been compressed using a zero value compression scheme, a shared value compression scheme, or both. The NNA can also include a compression unit implemented in a manner similar to that of the decompression unit.
Type: Grant
Filed: November 17, 2022
Date of Patent: January 9, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Tariq Afzal, Arvind Mandhani
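The pipeline of per-scheme subunits can be sketched as follows. This is a software analogy, not the patented hardware design, and the encoding details (a codebook for shared values, a bitmask for zeros) are assumptions chosen to make the example concrete.

```python
# Illustrative sketch only: a two-stage decompression pipeline where each
# stage handles one scheme and a partially decompressed result feeds the
# next stage. Encoding details are invented, not taken from the patent.

def decompress_shared_value(indices, codebook):
    """Shared-value scheme (assumed form): replace each index with its codebook entry."""
    return [codebook[i] for i in indices]

def decompress_zero_value(values, mask):
    """Zero-value scheme (assumed form): re-insert zeros where the mask bit is 0."""
    it = iter(values)
    return [next(it) if bit else 0 for bit in mask]

def decompress(data, codebook=None, mask=None):
    """Run only the stages matching how the data was compressed (one, both, or neither)."""
    if codebook is not None:   # stage 1: shared-value decompression
        data = decompress_shared_value(data, codebook)
    if mask is not None:       # stage 2: zero-value decompression
        data = decompress_zero_value(data, mask)
    return data
```

As in the abstract, data compressed with both schemes passes through both subunits in sequence, while data compressed with only one scheme skips the other stage.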
-
Patent number: 11567885
Abstract: The present disclosure relates to a system and method for optimizing switching of a DRAM bus using a last-level cache (LLC). An embodiment of the disclosure includes: sending a first type request from a first type queue to the second memory via the memory bus if a direction setting of the memory bus is in a first direction corresponding to the first type request; decrementing a current direction credit count by a first type transaction decrement value; if the decremented credit count is greater than zero, sending another first type request to the second memory via the memory bus and decrementing the credit count again by the first type transaction decrement value; and if the decremented credit count is zero, switching the direction setting of the memory bus to a second direction and resetting the credit count to a second type initial value.
Type: Grant
Filed: May 12, 2017
Date of Patent: January 31, 2023
Assignee: LG Electronics Inc.
Inventors: Milan Shah, Tariq Afzal, Thomas Zou
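The credit-counting policy in the abstract can be sketched in a few lines. This is a simplified model, not the patented circuit: queue contents, credit values, and function names are invented, and real hardware would interleave this with arbitration and timing constraints.

```python
# Illustrative sketch only: drain requests of the current bus direction until
# the direction credit is exhausted, then switch direction and reset the
# credit to the other request type's initial value. Details are invented.

def drain_queue(queue, initial_credit, opposite_initial_credit, decrement=1):
    """Returns (requests_sent, credit_for_next_round).

    Models one direction of the bus: each sent request charges `decrement`
    against the credit; when the credit reaches zero the bus direction
    switches and the credit resets for the opposite request type."""
    credit = initial_credit
    sent = 0
    while queue and credit > 0:
        queue.pop(0)          # send one request over the memory bus
        credit -= decrement   # charge one transaction against the credit
        sent += 1
    if credit <= 0:
        # Credit exhausted: switch direction, reset to the other type's credit.
        return sent, opposite_initial_credit
    return sent, credit
```

The design intent the abstract describes is batching same-direction transactions: DRAM buses pay a turnaround penalty when flipping between reads and writes, so the credit count lets several same-type requests go out before a switch is allowed.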
-
Patent number: 11537853
Abstract: Described herein is a neural network accelerator (NNA) with a decompression unit that can be configured to perform multiple types of decompression. The decompression unit may include a separate subunit for each decompression type. The subunits can be coupled to form a pipeline in which partially decompressed results generated by one subunit are input for further decompression by another subunit. Depending on which types of compression were applied to the incoming data, any number of the subunits may be used to produce a decompressed output. In some embodiments, the decompression unit is configured to decompress data that has been compressed using a zero value compression scheme, a shared value compression scheme, or both. The NNA can also include a compression unit implemented in a manner similar to that of the decompression unit.
Type: Grant
Filed: June 27, 2019
Date of Patent: December 27, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Tariq Afzal, Arvind Mandhani
-
Patent number: 11520561
Abstract: Described herein is a neural network accelerator with a set of neural processing units and an instruction set for execution on the neural processing units. The instruction set is a compact instruction set including various compute and data move instructions for implementing a neural network. Among the compute instructions are an instruction for performing a fused operation comprising sequential computations, one of which involves matrix multiplication, and an instruction for performing an elementwise vector operation. The instructions in the instruction set are highly configurable and can handle data elements of variable size. The instructions also implement a synchronization mechanism that allows asynchronous execution of data move and compute operations across different components of the neural network accelerator, as well as between multiple instances of the neural network accelerator.
Type: Grant
Filed: June 27, 2019
Date of Patent: December 6, 2022
Assignee: Amazon Technologies, Inc.
Inventor: Tariq Afzal
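The two compute instructions named in the abstract can be illustrated with a functional sketch. These are not the patented instructions; the function names, signatures, and the choice of elementwise operation are invented, and the real instructions operate on hardware buffers rather than Python lists.

```python
# Illustrative sketch only: a "fused" instruction that performs a matrix
# multiply followed by a second elementwise computation in one step, plus
# a standalone elementwise vector instruction. Names are invented.

def fused_matmul(a, b, elementwise=lambda x: x):
    """Matrix multiply a (rows x inner) by b (inner x cols), then apply an
    elementwise op to each result element without a separate pass."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[elementwise(sum(a[i][k] * b[k][j] for k in range(inner)))
             for j in range(cols)] for i in range(rows)]

def vector_op(op, x, y):
    """Elementwise vector instruction, e.g. add or multiply two vectors."""
    return [op(u, v) for u, v in zip(x, y)]
```

Fusing the elementwise step (for example, a ReLU clamp) into the matrix-multiply instruction avoids writing the intermediate product out and reading it back, which is the usual motivation for fused operations in accelerators.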
-
Publication number: 20210271610
Abstract: The present disclosure relates to a system and method for optimizing switching of a DRAM bus using a last-level cache (LLC). An embodiment of the disclosure includes: sending a first type request from a first type queue to the second memory via the memory bus if a direction setting of the memory bus is in a first direction corresponding to the first type request; decrementing a current direction credit count by a first type transaction decrement value; if the decremented credit count is greater than zero, sending another first type request to the second memory via the memory bus and decrementing the credit count again by the first type transaction decrement value; and if the decremented credit count is zero, switching the direction setting of the memory bus to a second direction and resetting the credit count to a second type initial value.
Type: Application
Filed: May 12, 2017
Publication date: September 2, 2021
Applicant: LG Electronics Inc.
Inventors: Milan Shah, Tariq Afzal, Thomas Zou
-
Patent number: 10705987
Abstract: A control circuit for controlling memory prefetch requests to a system level cache (SLC). The control circuit includes a circuit identifying memory access requests received at the system level cache (SLC), where each of the memory access requests includes an address (ANEXT) of memory to be accessed. Another circuit associates a tracker with each of the memory access streams. A further circuit performs tracking for the memory access streams by: when the status is tracking and the address (ANEXT) points to an interval between the current address (ACURR) and the last prefetched address (ALAST), issuing a prefetch request to the SLC; and when the status is tracking and the distance (ADIST) between the current address (ACURR) and the last prefetched address (ALAST) is greater than a specified maximum prefetch for the associated tracker, waiting for further requests to control the prefetch process.
Type: Grant
Filed: May 12, 2017
Date of Patent: July 7, 2020
Assignee: LG Electronics Inc.
Inventors: Arkadi Avrukin, Seungyoon Song, Tariq Afzal, Yongjae Hong, Michael Frank, Thomas Zou, Hoshik Kim, Jungsook Lee
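The tracker's decision rule can be sketched as a small function. This is a simplified software model, not the patented circuit: the function name, the half-open interval test, and the priority of the "wait" condition over the "prefetch" condition are assumptions made for illustration.

```python
# Illustrative sketch only: per-stream prefetch tracking decision. Prefetch
# when the next demand address falls between the current and last prefetched
# addresses; wait when the prefetch distance exceeds the tracker's cap.

def tracker_action(a_next, a_curr, a_last, max_prefetch):
    """Decide what a tracker in the 'tracking' state does for the next request."""
    a_dist = a_last - a_curr          # outstanding prefetch distance (ADIST)
    if a_dist > max_prefetch:
        return "wait"                 # too far ahead of demand: throttle
    if a_curr < a_next <= a_last:
        return "prefetch"             # stream advancing into prefetched region
    return "ignore"                   # outside the tracked interval
```

The throttling branch is what keeps the mechanism safe: capping ADIST prevents a tracker from speculatively filling the SLC far beyond where the demand stream actually is.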
-
Publication number: 20190138452
Abstract: A control circuit for controlling memory prefetch requests to a system level cache (SLC). The control circuit includes a circuit identifying memory access requests received at the system level cache (SLC), where each of the memory access requests includes an address (ANEXT) of memory to be accessed. Another circuit associates a tracker with each of the memory access streams. A further circuit performs tracking for the memory access streams by: when the status is tracking and the address (ANEXT) points to an interval between the current address (ACURR) and the last prefetched address (ALAST), issuing a prefetch request to the SLC; and when the status is tracking and the distance (ADIST) between the current address (ACURR) and the last prefetched address (ALAST) is greater than a specified maximum prefetch for the associated tracker, waiting for further requests to control the prefetch process.
Type: Application
Filed: May 12, 2017
Publication date: May 9, 2019
Applicant: LG Electronics Inc.
Inventors: Arkadi Avrukin, Seungyoon Song, Tariq Afzal, Yongjae Hong, Michael Frank, Thomas Zou, Hoshik Kim, Jungsook Lee
-
Publication number: 20030233639
Abstract: A method for automatically compiling a computer program written in a high level programming language into a program for execution by a reconfigurable processing system. The method comprises automatically determining a set of instructions to be executed by the reconfigurable processing system that will result in optimized execution of the computer program. Next, executable code is generated for the reconfigurable processing system with the instructions. In the preferred embodiment, the high level programming language provides a development environment that utilizes concepts from both flexible and fixed hardware programming.
Type: Application
Filed: June 11, 2002
Publication date: December 18, 2003
Inventor: Tariq Afzal