Patents by Inventor Dan Baum
Dan Baum has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240256276Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.Type: ApplicationFiled: February 5, 2024Publication date: August 1, 2024Inventors: Robert VALENTINE, Menachem ADELMAN, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Dan BAUM, Yuri GEBIL, Raanan SADE
-
Patent number: 12039332Abstract: Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.Type: GrantFiled: January 28, 2022Date of Patent: July 16, 2024Assignee: Intel CorporationInventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Jesus Corbal, Dan Baum, Alexander Heinecke, Elmoustapha Ould-Ahmed-Vall
-
Patent number: 12032485Abstract: Techniques to allow use of metadata in unused bits of virtual addresses are described. A processor of an aspect includes a decode circuit to decode a memory access instruction. The instruction to indicate one or more memory address operands that are to have address generation information and metadata. An execution circuit coupled with the decode circuit to generate a 64-bit virtual address based on the one or more memory address operands. The 64-bit virtual address having a bit 63, an X-bit address field starting at a bit 0 to store an address generated from the address generation information, and one or more metadata bits to store the metadata. The execution circuit also to perform a canonicality check on the 64-bit virtual address that does not fail due to non-canonical values of the metadata stored in the one or more metadata bits. Other processors, methods, systems, and instructions are disclosed.Type: GrantFiled: December 23, 2020Date of Patent: July 9, 2024Assignee: Intel CorporationInventors: Vedvyas Shanbhogue, Gilbert Neiger, Stephen Robinson, Dan Baum, Ron Gabor
-
Publication number: 20240192954Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.Type: ApplicationFiled: February 16, 2024Publication date: June 13, 2024Applicant: Intel CorporationInventors: Robert VALENTINE, Mark J. CHARNEY, Elmoustapha OULD-AHMED-VALL, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Bret L. TOLL, Raanan SADE, Igor YANOVER, Yuri GEBIL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Menachem ADELMAN, Simon RUBANOVICH
-
Patent number: 11977886Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.Type: GrantFiled: March 28, 2022Date of Patent: May 7, 2024Assignee: Intel CorporationInventors: Robert Valentine, Menachem Adelman, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Dan Baum, Yuri Gebil, Raanan Sade
-
Publication number: 20240134644Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.Type: ApplicationFiled: December 29, 2023Publication date: April 25, 2024Applicant: Intel CorporationInventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Barukh ZIV, Alexander HEINECKE, Milind GIRKAR, Simon RUBANOVICH
-
Publication number: 20240126551Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.Type: ApplicationFiled: December 28, 2023Publication date: April 18, 2024Inventors: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
-
Patent number: 11954489Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.Type: GrantFiled: December 13, 2021Date of Patent: April 9, 2024Assignee: Intel CorporationInventors: Bret Toll, Christopher J. Hughes, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
-
Publication number: 20240111533Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.Type: ApplicationFiled: December 8, 2023Publication date: April 4, 2024Inventors: Menachem ADELMAN, Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus CORBAL, Dan BAUM, Alexander F. HEINECKE, Elmoustaha OULD-AHMED-VALL, Yuri GEBIL, Raanan SADE
-
Publication number: 20240103867Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.Type: ApplicationFiled: November 28, 2023Publication date: March 28, 2024Inventors: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
-
Publication number: 20240078285Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE further to store an NZ element for use in a subsequent multiplications.Type: ApplicationFiled: November 6, 2023Publication date: March 7, 2024Inventors: Dan BAUM, Chen KOREN, Elmoustapha OULD-AHMED-VALL, Michael ESPIG, Christopher J. HUGHES, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
-
Publication number: 20240045690Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.Type: ApplicationFiled: September 1, 2023Publication date: February 8, 2024Inventors: Dan BAUM, Michael ESPIG, James GUILFORD, Wajdi K. FEGHALI, Raanan SADE, Christopher J. HUGHES, Robert VALENTINE, Bret TOLL, Elmoustapha OULD-AHMED-VALL, Mark J. CHARNEY, Vinodh GOPAL, Ronen ZOHAR, Alexander F. HEINECKE
-
Patent number: 11886875Abstract: Disclosed embodiments relate to systems and methods for performing nibble-sized operations on matrix elements. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction the fetched instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode to indicate the processor is to, for each pair of corresponding elements of the first and second source matrices, logically partition each element into nibble-sized partitions, perform an operation indicated by the instruction on each partition, and store execution results to a corresponding nibble-sized partition of a corresponding element of the destination matrix. The exemplary processor includes execution circuitry to execute the decoded instruction as per the opcode.Type: GrantFiled: December 26, 2018Date of Patent: January 30, 2024Assignee: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Jonathan D. Pearce, Dan Baum, Guei-Yuan Lueh, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
-
Publication number: 20240004659Abstract: Techniques for an instruction for a Runtime Call operation are described. An example apparatus comprises decoder circuitry to decode a single instruction, the single instruction to include a field for an identifier of an opcode, the opcode to indicate execution circuitry is to execute a no operation when a runtime call destination equals a predetermined value; and execute an indirect call with the runtime call destination as a destination address when the runtime call destination does not equal the predetermined value. Other examples are described and claimed.Type: ApplicationFiled: June 29, 2022Publication date: January 4, 2024Applicant: Intel CorporationInventors: Michael LeMay, Dan Baum, Joseph Cihula, Joao Batista Correa Gomes Moreira, Anjo Lucas Vahldiek-Oberwagner, Scott Constable, Andreas Kleen, Konrad Lai, Henrique de Medeiros Kawakami, David M. Durham
-
Publication number: 20230409326Abstract: Techniques and mechanisms for processor circuitry to execute a load and expand instruction of an instruction set to generate decompressed matrix data. In an embodiment, the instruction comprises a source operand which indicates a location from which compressed matrix data, and corresponding metadata, are to be accessed. A destination operand of the instruction indicates a location which is to receive decompressed metadata, which is generated, during execution of the instruction, based on the compressed matrix data and the corresponding metadata. The metadata comprises compression mask information which identifies which elements of the matrix have been masked from the compressed matrix data. In another embodiment, the instruction further comprises a count operand which identifies a total number of the unmasked matrix elements which are represented in the compressed matrix data.Type: ApplicationFiled: June 15, 2022Publication date: December 21, 2023Applicant: Intel CorporationInventors: Menachem Adelman, Amit Gradstein, Simon Rubanovich, Barukh Ziv, Uri Sherman, Dana Rip, Shahar Mizrahi, Dan Baum, Rinat Rappoport, Nilesh Jain, Zeev Sperber, Gideon Stupp, Alexander Heinecke, Christopher Hughes, Evangelos Georganas
-
Patent number: 11847185Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE further to store an NZ element for use in a subsequent multiplications.Type: GrantFiled: September 24, 2021Date of Patent: December 19, 2023Assignee: Intel CorporationInventors: Dan Baum, Chen Koren, Elmoustapha Ould-Ahmed-Vall, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
-
Patent number: 11847452Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.Type: GrantFiled: June 28, 2021Date of Patent: December 19, 2023Assignee: Intel CorporationInventors: Menachem Adelman, Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Dan Baum, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade
-
Publication number: 20230393769Abstract: A processor includes a register to store an encoded pointer for a memory address within a first memory allocation of a plurality of memory allocations in a memory region of a memory. The processor further includes circuitry to receive a memory operation request based on the encoded pointer and to obtain a first tag of a plurality of tags stored in a table in the memory. Each memory allocation of the plurality of memory allocations is associated with a respective one of the plurality of tags stored in the table. The circuitry is to further obtain pointer metadata stored in the encoded pointer and to determine whether to perform a memory operation corresponding to the memory operation request based, at least in part, on a determination of whether the first pointer metadata corresponds to the first tag.Type: ApplicationFiled: September 30, 2022Publication date: December 7, 2023Applicant: Intel CorporationInventors: David M. Durham, Michael LeMay, Sergej Deutsch, Dan Baum
-
Patent number: 11748103Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.Type: GrantFiled: February 15, 2022Date of Patent: September 5, 2023Assignee: Intel CorporationInventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
-
Patent number: 11714642Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.Type: GrantFiled: March 28, 2022Date of Patent: August 1, 2023Assignee: Intel CorporationInventors: Robert Valentine, Menachem Adelman, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Dan Baum, Yuri Gebil