Patents by Inventor Kenneth J. Janik

Kenneth J. Janik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ACCELERATOR FOR SPARSE-DENSE MATRIX MULTIPLICATION

Publication number: 20240070226

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Application

Filed: November 9, 2023

Publication date: February 29, 2024

Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
Accelerator for sparse-dense matrix multiplication

Patent number: 11829440

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: April 13, 2021

Date of Patent: November 28, 2023

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
ACCELERATOR FOR SPARSE-DENSE MATRIX MULTIPLICATION

Publication number: 20210342417

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Application

Filed: April 13, 2021

Publication date: November 4, 2021

Applicant: Intel Corporation

Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
Accelerator for sparse-dense matrix multiplication

Patent number: 10984074

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: February 24, 2020

Date of Patent: April 20, 2021

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
Accelerator for sparse-dense matrix multiplication

Patent number: 10867009

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: July 6, 2020

Date of Patent: December 15, 2020

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
ACCELERATOR FOR SPARSE-DENSE MATRIX MULTIPLICATION

Publication number: 20200334323

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Application

Filed: July 6, 2020

Publication date: October 22, 2020

Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
ACCELERATOR FOR SPARSE-DENSE MATRIX MULTIPLICATION

Publication number: 20200265107

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Application

Filed: February 24, 2020

Publication date: August 20, 2020

Applicant: Intel Corporation

Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
Accelerator for sparse-dense matrix multiplication

Patent number: 10572568

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: March 28, 2018

Date of Patent: February 25, 2020

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
ACCELERATOR FOR SPARSE-DENSE MATRIX MULTIPLICATION

Publication number: 20190042542

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Application

Filed: March 28, 2018

Publication date: February 7, 2019

Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
Method and apparatus for prefetching data to a lower level cache memory

Patent number: 7383418

Abstract: A prefetching scheme to detect when a load misses the lower level cache and hits the next level cache. Consequently, the prefetching scheme utilizes the previous information for the cache miss to the lower level cache and hit to the next higher level of cache memory that may result in initiating a sidedoor prefetch load for fetching the previous or next cache line into the lower level cache. In order to generate an address for the sidedoor prefetch, a history of cache access is maintained in a queue.

Type: Grant

Filed: September 1, 2004

Date of Patent: June 3, 2008

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, K S Venkatraman, Anwar Rohillah, Eric Sprangle, Ronak Singhal
Non-stalling circular counterflow pipeline processor with reorder buffer

Patent number: 7117345

Abstract: A method of executing more than one thread at a time in a computer system that has a plurality of threads, including a first and second thread. The method comprises providing a first and a second reorder buffer, reading first instructions and first operands associated with the first thread from the first reorder buffer, executing one of the first instructions and storing a result in the first reorder buffer which includes marking the result with a tag associating the result with the first thread, reading second instructions and second operands associated with the second thread from the second reorder buffer, and executing one of the second instructions and storing a result in the second reorder buffer which includes marking the result with a tag associating the result with the second thread.

Type: Grant

Filed: December 9, 2003

Date of Patent: October 3, 2006

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with reorder buffer

Publication number: 20040187119

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Application

Filed: December 9, 2003

Publication date: September 23, 2004

Applicant: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with recorder buffer

Patent number: 6691222

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Grant

Filed: March 18, 2003

Date of Patent: February 10, 2004

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with reorder buffer

Publication number: 20030177340

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Application

Filed: March 18, 2003

Publication date: September 18, 2003

Applicant: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with reorder buffer

Patent number: 6553485

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Grant

Filed: January 22, 2002

Date of Patent: April 22, 2003

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with reorder buffer

Publication number: 20020099928

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Application

Filed: January 22, 2002

Publication date: July 25, 2002

Applicant: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with reorder buffer

Patent number: 6351805

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Grant

Filed: February 23, 2001

Date of Patent: February 26, 2002

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with reorder buffer

Publication number: 20010010073

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Application

Filed: February 23, 2001

Publication date: July 26, 2001

Applicant: Intel Corporation.

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with reorder buffer

Patent number: 6247115

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Grant

Filed: August 15, 2000

Date of Patent: June 12, 2001

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Non-stalling circular counterflow pipeline processor with reorder buffer

Patent number: 6163839

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.

Type: Grant

Filed: September 30, 1998

Date of Patent: December 19, 2000

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller