Patents by Inventor Kenneth J. Janik
Kenneth J. Janik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240070226
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Application
Filed: November 9, 2023
Publication date: February 29, 2024
Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
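The multiply-accumulate behavior this abstract describes can be illustrated in software. The following is a minimal Python sketch, not the patented hardware: the function name `sparse_dense_matmul_accumulate` and the dictionary-of-coordinates storage for the sparse matrix are assumptions made for illustration only.

```python
def sparse_dense_matmul_accumulate(sparse, dense, out):
    """Accumulate (sparse x dense) into out, visiting only non-zero elements.

    sparse: dict mapping (m, k) -> non-zero value (hypothetical storage format)
    dense:  list of K rows, each a list of N values
    out:    list of M rows, each a list of N values, updated in place
    """
    for (m, k), value in sparse.items():
        dense_row = dense[k]   # row K of the dense source matrix
        out_row = out[m]       # row M of the dense output matrix
        for n, dense_elem in enumerate(dense_row):
            # Add the product to the previous value of the output element.
            out_row[n] += value * dense_elem
    return out


# Two non-zero elements: 2.0 at (row 0, column 1) and 3.0 at (row 1, column 0).
sparse = {(0, 1): 2.0, (1, 0): 3.0}
dense = [[1.0, 2.0], [3.0, 4.0]]
out = [[10.0, 0.0], [0.0, 0.0]]  # previous output values are preserved and added to
sparse_dense_matmul_accumulate(sparse, dense, out)
```

Zero elements of the sparse matrix are never visited, so the work scales with the number of non-zero elements rather than with M times K.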
-
Patent number: 11829440
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Grant
Filed: April 13, 2021
Date of Patent: November 28, 2023
Assignee: Intel Corporation
Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
-
Publication number: 20210342417
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Application
Filed: April 13, 2021
Publication date: November 4, 2021
Applicant: Intel Corporation
Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
-
Patent number: 10984074
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Grant
Filed: February 24, 2020
Date of Patent: April 20, 2021
Assignee: Intel Corporation
Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
-
Patent number: 10867009
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Grant
Filed: July 6, 2020
Date of Patent: December 15, 2020
Assignee: Intel Corporation
Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
-
Publication number: 20200334323
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Application
Filed: July 6, 2020
Publication date: October 22, 2020
Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
-
Publication number: 20200265107
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Application
Filed: February 24, 2020
Publication date: August 20, 2020
Applicant: Intel Corporation
Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
-
Patent number: 10572568
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Grant
Filed: March 28, 2018
Date of Patent: February 25, 2020
Assignee: Intel Corporation
Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
-
Publication number: 20190042542
Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
Type: Application
Filed: March 28, 2018
Publication date: February 7, 2019
Inventors: Srinivasan NARAYANAMOORTHY, Nadathur Rajagopalan SATISH, Alexey SUPRUN, Kenneth J. JANIK
-
Patent number: 7383418
Abstract: A prefetching scheme to detect when a load misses the lower level cache and hits the next level cache. Consequently, the prefetching scheme utilizes the previous information for the cache miss to the lower level cache and hit to the next higher level of cache memory that may result in initiating a sidedoor prefetch load for fetching the previous or next cache line into the lower level cache. In order to generate an address for the sidedoor prefetch, a history of cache access is maintained in a queue.
Type: Grant
Filed: September 1, 2004
Date of Patent: June 3, 2008
Assignee: Intel Corporation
Inventors: Kenneth J. Janik, K S Venkatraman, Anwar Rohillah, Eric Sprangle, Ronak Singhal
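As a rough illustration of the idea in this abstract, the Python sketch below keeps a bounded queue of recently accessed cache lines and proposes a neighboring line when a load misses the lower-level cache but hits the next level. The class name `SidedoorPrefetcher`, the queue depth, and the rule for choosing between the previous and next line are all assumptions; the abstract does not specify them.

```python
from collections import deque


class SidedoorPrefetcher:
    """Toy model: history queue plus a miss-lower/hit-next trigger."""

    def __init__(self, history_size=8):
        # Bounded queue of recently accessed cache-line numbers (assumed depth).
        self.history = deque(maxlen=history_size)

    def on_load(self, line, hit_lower, hit_next):
        """Record the access and return a cache line to prefetch, or None."""
        target = None
        if not hit_lower and hit_next:
            # Assumed direction rule: follow the trend seen in the history.
            if line - 1 in self.history:
                target = line + 1   # ascending accesses: fetch the next line
            elif line + 1 in self.history:
                target = line - 1   # descending accesses: fetch the previous line
        self.history.append(line)
        return target


prefetcher = SidedoorPrefetcher()
first = prefetcher.on_load(10, hit_lower=False, hit_next=True)   # no history yet
second = prefetcher.on_load(11, hit_lower=False, hit_next=True)  # line 10 was seen
```

Here `first` is `None` because the queue is empty, while `second` is `12` because line 10 is already in the history.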
-
Patent number: 7117345
Abstract: A method of executing more than one thread at a time in a computer system that has a plurality of threads, including a first and second thread. The method comprises providing a first and a second reorder buffer, reading first instructions and first operands associated with the first thread from the first reorder buffer, executing one of the first instructions and storing a result in the first reorder buffer which includes marking the result with a tag associating the result with the first thread, reading second instructions and second operands associated with the second thread from the second reorder buffer, and executing one of the second instructions and storing a result in the second reorder buffer which includes marking the result with a tag associating the result with the second thread.
Type: Grant
Filed: December 9, 2003
Date of Patent: October 3, 2006
Assignee: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
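The per-thread reorder buffers and result tagging described in this abstract can be pictured with a small software model. This Python sketch is illustrative only: the entry layout, the `OPS` opcode table, and the function `execute_next` are hypothetical names, and real hardware would do this work in parallel circuitry rather than in sequential calls.

```python
import operator

# Hypothetical opcode table for the sketch.
OPS = {"add": operator.add, "mul": operator.mul}


def execute_next(reorder_buffer, thread_id):
    """Execute the oldest pending entry of one thread's reorder buffer.

    The result is stored back into the same buffer and marked with a tag
    that associates it with the owning thread.
    """
    for entry in reorder_buffer:
        if entry["result"] is None:
            entry["result"] = OPS[entry["op"]](*entry["operands"])
            entry["tag"] = thread_id
            return entry
    return None  # nothing left to execute for this thread


# One reorder buffer per thread, each holding instructions with their operands.
rob0 = [{"op": "add", "operands": (1, 2), "result": None, "tag": None}]
rob1 = [{"op": "mul", "operands": (3, 4), "result": None, "tag": None}]
done0 = execute_next(rob0, thread_id=0)
done1 = execute_next(rob1, thread_id=1)
```

Keeping one buffer per thread means a result can never be written into another thread's entries, and the tag lets shared downstream logic tell the results apart.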
-
Publication number: 20040187119
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Application
Filed: December 9, 2003
Publication date: September 23, 2004
Applicant: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
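The wrap-around behavior in this abstract (an instruction that reaches the end of the pipeline without executing is sent down again) can be shown with a toy simulation. The Python sketch below is a heavy simplification under stated assumptions: all instructions enter at stage 0 together, each stage has one execution unit that supports a set of opcodes, and a plain set of ready registers stands in for the result pipeline. The function `run_counterflow` and the instruction tuple layout are hypothetical.

```python
def run_counterflow(instructions, stage_units, ready, max_steps=100):
    """Simulate instructions advancing through stages, wrapping at the end.

    instructions: list of (name, opcode, source_register, dest_register)
    stage_units:  list of opcode sets, one execution unit per stage
    ready:        registers whose values are already available
    Returns (execution_order, wrap_count).
    """
    in_flight = [[instr, 0] for instr in instructions]  # [instruction, stage]
    ready = set(ready)
    order, wraps = [], 0
    for _ in range(max_steps):
        if not in_flight:
            break
        remaining = []
        for instr, stage in in_flight:
            name, opcode, src, dest = instr
            if opcode in stage_units[stage] and src in ready:
                ready.add(dest)      # result becomes visible to later instructions
                order.append(name)
                continue
            stage += 1
            if stage == len(stage_units):
                stage = 0            # reached the end unexecuted: wrap around
                wraps += 1
            remaining.append([instr, stage])
        in_flight = remaining
    return order, wraps


stage_units = [{"add"}, {"mul"}]
program = [("i1", "mul", "r1", "r2"), ("i2", "add", "r2", "r3")]
order, wraps = run_counterflow(program, stage_units, ready={"r1"})
```

In this run `i2` passes its adder before `r2` is available, falls off the end of the pipeline, wraps once, and executes on its second pass, giving `order == ["i1", "i2"]` and `wraps == 1`.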
-
Patent number: 6691222
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Grant
Filed: March 18, 2003
Date of Patent: February 10, 2004
Assignee: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
-
Publication number: 20030177340
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Application
Filed: March 18, 2003
Publication date: September 18, 2003
Applicant: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
-
Patent number: 6553485
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Grant
Filed: January 22, 2002
Date of Patent: April 22, 2003
Assignee: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
-
Publication number: 20020099928
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Application
Filed: January 22, 2002
Publication date: July 25, 2002
Applicant: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
-
Patent number: 6351805
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Grant
Filed: February 23, 2001
Date of Patent: February 26, 2002
Assignee: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
-
Publication number: 20010010073
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Application
Filed: February 23, 2001
Publication date: July 26, 2001
Applicant: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
-
Patent number: 6247115
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Grant
Filed: August 15, 2000
Date of Patent: June 12, 2001
Assignee: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
-
Patent number: 6163839
Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
Type: Grant
Filed: September 30, 1998
Date of Patent: December 19, 2000
Assignee: Intel Corporation
Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller