Patents by Inventor Srinivasan Narayanamoorthy

Srinivasan Narayanamoorthy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Accelerator for sparse-dense matrix multiplication

Patent number: 11829440

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: April 13, 2021

Date of Patent: November 28, 2023

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
Systolic array accelerator systems and methods

Patent number: 11003619

Abstract: The present disclosure is directed to systems and methods for decomposing systolic array circuitry to provide a plurality of N×N systolic sub-array circuits, apportioning a first tensor or array into a plurality of N×M first input arrays, and apportioning a second tensor or array into a plurality of M×N second input arrays. Systolic array control circuitry transfers corresponding ones of the first input arrays and second input arrays to a respective one of the plurality of N×N systolic sub-array circuits. As the elements included in the first input array and the elements included in the second input array are transferred to the systolic sub-array, the systolic sub-array performs one or more mathematical operations using the first and the second input arrays. The systems and methods beneficially improve the usage of the systolic array circuitry thereby advantageously reducing the number of clock cycles needed to perform a given number of calculations.

Type: Grant

Filed: February 24, 2019

Date of Patent: May 11, 2021

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Jayaram Bobba, Ankit More
Accelerator for sparse-dense matrix multiplication

Patent number: 10984074

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: February 24, 2020

Date of Patent: April 20, 2021

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
Accelerator for sparse-dense matrix multiplication

Patent number: 10867009

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: July 6, 2020

Date of Patent: December 15, 2020

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
SYSTOLIC ARRAY ACCELERATOR SYSTEMS AND METHODS

Publication number: 20200272596

Abstract: The present disclosure is directed to systems and methods for decomposing systolic array circuitry to provide a plurality of N×N systolic sub-array circuits, apportioning a first tensor or array into a plurality of N×M first input arrays, and apportioning a second tensor or array into a plurality of M×N second input arrays. Systolic array control circuitry transfers corresponding ones of the first input arrays and second input arrays to a respective one of the plurality of N×N systolic sub-array circuits. As the elements included in the first input array and the elements included in the second input array are transferred to the systolic sub-array, the systolic sub-array performs one or more mathematical operations using the first and the second input arrays. The systems and methods beneficially improve the usage of the systolic array circuitry thereby advantageously reducing the number of clock cycles needed to perform a given number of calculations.

Type: Application

Filed: February 24, 2019

Publication date: August 27, 2020

Applicant: INTEL CORPORATION

Inventors: Srinivasan Narayanamoorthy, Jayaram Bobba, Ankit More
Accelerator for sparse-dense matrix multiplication

Patent number: 10572568

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: March 28, 2018

Date of Patent: February 25, 2020

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
Multiplication circuit providing dynamic truncation

Patent number: 9639328

Abstract: A fixed-point multiplier providing reduced energy usage dynamically truncates received operands according to the location of computationally important bits in the operands and provides the truncated operands to a reduced width multiplier offering reduced energy usage. Information about the location of the dynamic truncation is used to properly shift the result of the multiplier to provide an approximation of full multiplication of the operands.

Type: Grant

Filed: August 6, 2014

Date of Patent: May 2, 2017

Assignee: Wisconsin Alumni Research Foundation

Inventors: Srinivasan Narayanamoorthy, Nam Sung Kim
Multiplication Circuit Providing Dynamic Truncation

Publication number: 20160041813

Abstract: A fixed-point multiplier providing reduced energy usage dynamically truncates received operands according to the location of computationally important bits in the operands and provides the truncated operands to a reduced width multiplier offering reduced energy usage. Information about the location of the dynamic truncation is used to properly shift the result of the multiplier to provide an approximation of full multiplication of the operands.

Type: Application

Filed: August 6, 2014

Publication date: February 11, 2016

Inventors: Srinivasan Narayanamoorthy, Nam Sung Kim