Patents by Inventor Dhiraj KALAMKAR
Dhiraj KALAMKAR has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12379927
Abstract: Techniques for scale and reduction of BF16 data elements are described. An exemplary instruction includes fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operands, a floating point scale operation of a BF16 data element of the first packed data source by multiplying the data element by a power of 2 value, wherein a value of the exponent of the power of 2 value is a floor value of a BF16 data element of the second packed data source, and store a result of the floating point scale operation into a corresponding data element position of the packed data destination operand.
Type: Grant
Filed: August 31, 2021
Date of Patent: August 5, 2025
Assignee: Intel Corporation
Inventors: Menachem Adelman, Alexander Heinecke, Robert Valentine, Zeev Sperber, Amit Gradstein, Mark Charney, Evangelos Georganas, Dhiraj Kalamkar, Christopher Hughes, Cristina Anderson
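The scale operation in this abstract can be illustrated with a minimal Python sketch. It is not the patented circuitry: the function names `round_to_bf16` and `scale_bf16` are hypothetical, packed operands are modelled as plain lists of floats, and special values (NaN or infinite exponents, overflow) are left out.

```python
import math
import struct


def round_to_bf16(x: float) -> float:
    """Round x to the nearest BF16 value (ties to even), returned as a float.

    Assumption: the value passes through float32 first, so the result can
    differ from a true single rounding in rare cases.
    """
    if math.isnan(x):
        return x
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # keep the upper 16 bits (BF16)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def scale_bf16(src1, src2):
    """Per element: src1[i] * 2**floor(src2[i]), rounded to BF16."""
    return [
        round_to_bf16(math.ldexp(a, math.floor(b)))
        for a, b in zip(src1, src2)
    ]


result = scale_bf16([1.5, 3.0, -2.0], [2.7, -1.2, 0.0])
```

Here the exponents 2.7, -1.2 and 0.0 floor to 2, -2 and 0, so the three elements are scaled by 4, 0.25 and 1.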
-
Publication number: 20250217141
Abstract: Techniques for performing BF16 FMA in response to an instruction are described.
Type: Application
Filed: January 3, 2025
Publication date: July 3, 2025
Inventors: Alexander HEINECKE, Menachem ADELMAN, Robert VALENTINE, Zeev SPERBER, Amit GRADSTEIN, Mark CHARNEY, Evangelos GEORGANAS, Dhiraj KALAMKAR, Christopher HUGHES, Cristina ANDERSON
-
Patent number: 12229554
Abstract: Techniques for performing BF16 FMA in response to an instruction are described. In some examples, an instruction has fields for an opcode, an identification of location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, an identification of a location of a third packed data source operand, and an identification of location of a packed data source/destination operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a BF16 value fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand.
Type: Grant
Filed: August 31, 2021
Date of Patent: February 18, 2025
Assignee: Intel Corporation
Inventors: Alexander Heinecke, Menachem Adelman, Robert Valentine, Zeev Sperber, Amit Gradstein, Mark Charney, Evangelos Georganas, Dhiraj Kalamkar, Christopher Hughes, Cristina Anderson
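As a rough illustration of the per-element fused multiply-accumulate with opcode-selected operand ordering, the Python sketch below models packed operands as lists of floats. The abstract does not spell out the orderings; the "132", "213" and "231" forms used here are borrowed from the naming of existing x86 FMA instructions and are an assumption, as are the function names.

```python
import math
import struct


def round_to_bf16(x: float) -> float:
    """Round x to the nearest BF16 value (ties to even) via float32."""
    if math.isnan(x):
        return x
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fma_bf16(dst, src2, src3, order="231"):
    """Per-element multiply-accumulate; dst is both a source and the result.

    Assumed orderings (x86 FMA convention, not taken from the patent):
      "132": dst * src3 + src2
      "213": src2 * dst + src3
      "231": src2 * src3 + dst
    The product is not rounded before the add; only the final result is
    rounded to BF16. Inputs are assumed to already be BF16 values.
    """
    out = []
    for d, s2, s3 in zip(dst, src2, src3):
        if order == "132":
            r = d * s3 + s2
        elif order == "213":
            r = s2 * d + s3
        elif order == "231":
            r = s2 * s3 + d
        else:
            raise ValueError(f"unknown operand ordering: {order}")
        out.append(round_to_bf16(r))
    return out


acc = fma_bf16([1.0, 2.0], [3.0, 0.5], [4.0, 8.0], order="231")
```

With the "231" ordering the first element is 3.0 * 4.0 + 1.0 and the second is 0.5 * 8.0 + 2.0.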
-
Publication number: 20250004768
Abstract: Decoder circuitry to decode an instruction indicating a first vector register having a 128-bit lane to store a first matrix having two rows by K columns of data elements having a number of bits, a storage location having 128 bits to store a second matrix having K rows by two columns of data elements having the number of bits, and a second vector register having a 128-bit lane to store a third matrix having two rows by two columns of data elements having a greater number of bits. Execution circuitry is to perform operations for the instruction, including to generate and store a result matrix having two rows by two columns of result data elements having the greater number of bits in 128-bit lane of second vector register. The result matrix represents accumulation of the third matrix with product matrix generated from matrix multiplication using the first and second matrices.
Type: Application
Filed: June 30, 2023
Publication date: January 2, 2025
Inventors: Alexander HEINECKE, Wing Shek WONG, Stephen ROBINSON, Raanan SADE, Amit GRADSTEIN, Simon RUBANOVICH, Michael ESPIG, Dan BAUM, Evangelos GEORGANAS, Dhiraj KALAMKAR
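The arithmetic described in this abstract, a 2-by-K matrix times a K-by-2 matrix accumulated into a 2-by-2 matrix, can be sketched as follows. This is only a model of the math for one 128-bit lane: matrices are nested Python lists, the element widths and the in-register layout are not modelled, and the name `mma_2xk` is hypothetical.

```python
def mma_2xk(a, b, c):
    """Return c + a @ b for a (2 x K), b (K x 2) and c (2 x 2)."""
    k = len(a[0])
    if len(a) != 2 or len(b) != k or any(len(row) != 2 for row in b):
        raise ValueError("expected a to be 2 x K and b to be K x 2")
    return [
        [c[i][j] + sum(a[i][p] * b[p][j] for p in range(k)) for j in range(2)]
        for i in range(2)
    ]


# Assumed example: K = 4, as would fit eight 16-bit elements in 128 bits.
a = [[1, 2, 3, 4], [5, 6, 7, 8]]
b = [[1, 0], [0, 1], [1, 0], [0, 1]]
c = [[10, 20], [30, 40]]
result = mma_2xk(a, b, c)
```

In this example the product matrix is [[4, 6], [12, 14]], which is then added element by element to the accumulator c.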
-
Publication number: 20240184585
Abstract: Techniques for comparing BF16 data elements are described. An exemplary BF16 comparison instruction includes fields for an opcode, an identification of a location of a first packed data source operand, and an identification of a location of a second packed data source operand, wherein the opcode is to indicate that execution circuitry is to perform, for a particular data element position of the packed data source operands, a comparison of a data element at that position, and update a flags register based on the comparison.
Type: Application
Filed: February 8, 2024
Publication date: June 6, 2024
Inventors: Alexander HEINECKE, Menachem ADELMAN, Robert VALENTINE, Zeev SPERBER, Amit GRADSTEIN, Mark CHARNEY, Evangelos GEORGANAS, Dhiraj KALAMKAR, Christopher HUGHES, Cristina ANDERSON
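A comparison that updates a flags register might look like the Python sketch below. The abstract does not say which flags are written or how; the ZF/PF/CF encoding here is borrowed from the existing x86 COMISS scalar compare and is an assumption, as are the function name and the default element position.

```python
import math


def compare_bf16(src1, src2, position=0):
    """Compare one element position and return the resulting flag values.

    Assumed encoding (as in x86 COMISS, not taken from the patent):
      unordered (either value is NaN): ZF=1, PF=1, CF=1
      src1 > src2:                     ZF=0, PF=0, CF=0
      src1 < src2:                     ZF=0, PF=0, CF=1
      src1 == src2:                    ZF=1, PF=0, CF=0
    """
    a, b = src1[position], src2[position]
    if math.isnan(a) or math.isnan(b):
        return {"ZF": 1, "PF": 1, "CF": 1}
    if a > b:
        return {"ZF": 0, "PF": 0, "CF": 0}
    if a < b:
        return {"ZF": 0, "PF": 0, "CF": 1}
    return {"ZF": 1, "PF": 0, "CF": 0}


flags = compare_bf16([1.5, 9.0], [2.0, 9.0])
```

Only the selected position is compared; the other elements of the packed operands are ignored.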
-
Publication number: 20240037378
Abstract: Systems, apparatuses and methods may provide for technology that identifies an embedding table associated with a neural network. The neural network is associated with a plurality of compute nodes. The technology further identifies a number of entries of the embedding table, and determines whether to process gradients associated with the embedding table as dense gradients or sparse gradients based on the number of entries.
Type: Application
Filed: December 24, 2020
Publication date: February 1, 2024
Applicant: Intel Corporation
Inventors: Guokai Ma, Jiong Gong, Dhiraj Kalamkar, Rachitha Prem Seelin, Hongzhen Liu, Akshay Jain, Liangang Zhang
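The decision described in this abstract can be sketched as a simple size test. The abstract only says the choice is based on the number of entries; the direction of the test (small tables treated as dense), the threshold value and the names `gradient_mode` and `dense_threshold` are all assumptions made for illustration.

```python
def gradient_mode(num_entries: int, dense_threshold: int = 10_000) -> str:
    """Choose how to process an embedding table's gradients.

    Assumption: tables with at most dense_threshold entries use dense
    gradients, larger tables use sparse gradients. The threshold is a
    hypothetical tuning parameter, not a value from the application.
    """
    return "dense" if num_entries <= dense_threshold else "sparse"


# Hypothetical embedding tables: name -> number of entries.
tables = {"country": 250, "user_id": 50_000_000}
modes = {name: gradient_mode(n) for name, n in tables.items()}
```

In a multi-node setting, such a rule would be evaluated once per table before gradients are exchanged between compute nodes.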
-
Publication number: 20230072105
Abstract: Techniques for comparing BF16 data elements are described. An exemplary BF16 comparison instruction includes fields for an opcode, an identification of a location of a first packed data source operand, and an identification of a location of a second packed data source operand, wherein the opcode is to indicate that execution circuitry is to perform, for a particular data element position of the packed data source operands, a comparison of a data element at that position, and update a flags register based on the comparison.
Type: Application
Filed: August 31, 2021
Publication date: March 9, 2023
Inventors: Alexander HEINECKE, Menachem ADELMAN, Robert VALENTINE, Zeev SPERBER, Amit GRADSTEIN, Mark CHARNEY, Evangelos GEORGANAS, Dhiraj KALAMKAR, Christopher HUGHES, Cristina ANDERSON
-
Publication number: 20230069000
Abstract: Techniques for performing arithmetic operations on BF16 values are described. An exemplary instruction includes fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of location of a packed data destination operand, wherein the opcode is to indicate an arithmetic operation execution circuitry is to perform, for each data element position of the identified packed data source operands, the arithmetic operation on BF16 data elements in that data element position in BF16 format and store a result of each arithmetic operation into a corresponding data element position of the identified packed data destination operand.
Type: Application
Filed: August 31, 2021
Publication date: March 2, 2023
Inventors: Alexander HEINECKE, Menachem ADELMAN, Robert VALENTINE, Zeev SPERBER, Amit GRADSTEIN, Mark CHARNEY, Evangelos GEORGANAS, Dhiraj KALAMKAR, Christopher HUGHES, Cristina ANDERSON
-
Publication number: 20230060146
Abstract: Techniques for BF16 classification or manipulation using single instructions are described. An exemplary instruction includes fields for an opcode, an identification of a location of a packed data source operand, an indication of one or more classification checks to perform, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a classification according to the indicated one or more classification checks and store a result of the classification in a corresponding data element position of the destination operand.
Type: Application
Filed: August 31, 2021
Publication date: March 2, 2023
Inventors: Menachem ADELMAN, Alexander HEINECKE, Robert VALENTINE, Zeev SPERBER, Amit GRADSTEIN, Mark CHARNEY, Evangelos GEORGANAS, Dhiraj KALAMKAR, Christopher HUGHES, Cristina ANDERSON
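A per-element classification of this kind can be sketched in Python as below. The abstract does not list the available checks; the check names in `CHECKS` are hypothetical, loosely modelled on the categories used by existing x86 floating-point classify instructions, and the result is assumed to be true when any selected check matches.

```python
import math

# Hypothetical check names; the abstract does not enumerate them.
CHECKS = {
    "nan": math.isnan,
    "pos_zero": lambda x: x == 0 and math.copysign(1.0, x) > 0,
    "neg_zero": lambda x: x == 0 and math.copysign(1.0, x) < 0,
    "pos_inf": lambda x: x == math.inf,
    "neg_inf": lambda x: x == -math.inf,
    # BF16 shares float32's exponent range, so its smallest normal is 2**-126.
    "denormal": lambda x: 0 < abs(x) < 2.0 ** -126,
    "negative": lambda x: x < 0 and math.isfinite(x),
}


def classify_bf16(src, checks):
    """For each element, report whether any of the selected checks matches."""
    return [any(CHECKS[name](x) for name in checks) for x in src]


mask = classify_bf16([1.0, -0.0, math.inf, math.nan, -3.0], ["neg_zero", "pos_inf"])
```

Each result lands in the same position as the element it classifies, mirroring the destination operand in the abstract.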
-
Publication number: 20230068781
Abstract: Techniques for scale and reduction of BF16 data elements are described. An exemplary instruction includes fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operands, a floating point scale operation of a BF16 data element of the first packed data source by multiplying the data element by a power of 2 value, wherein a value of the exponent of the power of 2 value is a floor value of a BF16 data element of the second packed data source, and store a result of the floating point scale operation into a corresponding data element position of the packed data destination operand.
Type: Application
Filed: August 31, 2021
Publication date: March 2, 2023
Inventors: Menachem ADELMAN, Alexander HEINECKE, Robert VALENTINE, Zeev SPERBER, Amit GRADSTEIN, Mark CHARNEY, Evangelos GEORGANAS, Dhiraj KALAMKAR, Christopher HUGHES, Cristina ANDERSON
-
Publication number: 20230067810
Abstract: Techniques for performing BF16 FMA in response to an instruction are described.
Type: Application
Filed: August 31, 2021
Publication date: March 2, 2023
Inventors: Alexander HEINECKE, Menachem ADELMAN, Robert VALENTINE, Zeev SPERBER, Amit GRADSTEIN, Mark CHARNEY, Evangelos GEORGANAS, Dhiraj KALAMKAR, Christopher HUGHES, Cristina ANDERSON
-
Publication number: 20230061618
Abstract: Techniques for performing square root or reciprocal square root calculations on BF16 data elements in response to an instruction are described. An example of an instruction is one that includes fields for an opcode, an identification of a location of a packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a calculation of a square root value of a BF16 data element in that position and store a result of each square root into a corresponding data element position of the packed data destination operand.
Type: Application
Filed: August 31, 2021
Publication date: March 2, 2023
Inventors: Menachem ADELMAN, Alexander HEINECKE, Robert VALENTINE, Zeev SPERBER, Amit GRADSTEIN, Mark CHARNEY, Evangelos GEORGANAS, Dhiraj KALAMKAR, Christopher HUGHES, Cristina ANDERSON
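The per-element square root and reciprocal square root can be modelled with the short Python sketch below. The function names are hypothetical, the handling of negative and zero inputs is an assumption, and the rounding goes through float32, so results may differ from hardware in rare cases.

```python
import math
import struct


def round_to_bf16(x: float) -> float:
    """Round x to the nearest BF16 value (ties to even) via float32."""
    if math.isnan(x):
        return x
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def sqrt_bf16(src):
    """Per-element square root; negative inputs yield NaN (assumed)."""
    return [round_to_bf16(math.sqrt(x)) if x >= 0 else math.nan for x in src]


def rsqrt_bf16(src):
    """Per-element reciprocal square root.

    Assumed special cases: zero yields +inf, negative inputs yield NaN.
    """
    out = []
    for x in src:
        if x < 0:
            out.append(math.nan)
        elif x == 0:
            out.append(math.inf)
        else:
            out.append(round_to_bf16(1.0 / math.sqrt(x)))
    return out


roots = sqrt_bf16([4.0, 0.25, 9.0])
recips = rsqrt_bf16([4.0, 0.25, 16.0])
```

Both functions write each result to the same position as its input element, as the abstract describes for the destination operand.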