Patents by Inventor Dan Baum

Dan Baum has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods of instructions to accelerate multiplication of sparse matrices using bitmasks that identify non-zero elements

Patent number: 12287843

Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and is to execute the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices and broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE is further to store an NZ element for use in subsequent multiplications.

Type: Grant

Filed: November 6, 2023

Date of Patent: April 29, 2025

Assignee: Intel Corporation

Inventors: Dan Baum, Chen Koren, Elmoustapha Ould-Ahmed-Vall, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
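The bitmask idea in the abstract of patent 12287843 can be illustrated in software. The following is a minimal Python sketch, not the patented hardware: it builds a non-zero bitmask for each row of the first matrix and each column of the second, ANDs the two masks, and multiplies only at positions where both operands are non-zero. The function names `nz_bitmask` and `sparse_matmul_accumulate` are hypothetical, and the processing-engine grid and broadcast scheduling are not modeled.

```python
def nz_bitmask(values):
    """Return an int whose bit k is set when values[k] is non-zero."""
    mask = 0
    for k, v in enumerate(values):
        if v != 0:
            mask |= 1 << k
    return mask


def sparse_matmul_accumulate(a, b, c):
    """Return c + a @ b, multiplying only where both operands are non-zero.

    a is rows x inner, b is inner x cols, c is rows x cols (lists of lists).
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    row_masks = [nz_bitmask(row) for row in a]
    col_masks = [nz_bitmask([b[k][j] for k in range(inner)]) for j in range(cols)]
    out = [list(row) for row in c]  # copy so the caller's c is not modified
    for i in range(rows):
        for j in range(cols):
            # Bits set in both masks mark index positions worth multiplying.
            matching = row_masks[i] & col_masks[j]
            k = 0
            while matching:
                if matching & 1:
                    out[i][j] += a[i][k] * b[k][j]
                matching >>= 1
                k += 1
    return out
```

When a row and a column share no non-zero positions, the ANDed mask is zero and the inner loop does no work, which is the saving the bitmasks are meant to expose.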

Systems, methods, and apparatus for tile configuration

Patent number: 12282773

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers, are discussed.

Type: Grant

Filed: December 8, 2023

Date of Patent: April 22, 2025

Assignee: Intel Corporation

Inventors: Menachem Adelman, Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Dan Baum, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade

Cryptographic computing in multitenant environments

Patent number: 12277234

Abstract: A processor, a system, a machine readable medium, and a method.

Type: Grant

Filed: December 26, 2020

Date of Patent: April 15, 2025

Assignee: Intel Corporation

Inventors: David M. Durham, Michael D. LeMay, Salmin Sultana, Karanvir S. Grewal, Michael E. Kounavis, Sergej Deutsch, Andrew James Weiler, Abhishek Basak, Dan Baum, Santosh Ghosh

SYSTEMS, METHODS, AND APPARATUSES FOR TILE MATRIX MULTIPLICATION AND ACCUMULATION

Publication number: 20250117222

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

Type: Application

Filed: October 29, 2024

Publication date: April 10, 2025

Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
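The multiply-accumulate step described in the abstract of publication 20250117222 corresponds to the arithmetic C = C + A × B. The Python sketch below shows only that arithmetic on lists of lists. The name `tile_multiply_accumulate` is hypothetical, and the negated variant and the zeroing of unconfigured columns mentioned in the abstract are not modeled.

```python
def tile_multiply_accumulate(a, b, c):
    """Return c + a @ b for small tiles given as lists of lists.

    a is rows x inner, b is inner x cols, c is rows x cols.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [list(row) for row in c]  # copy of the source/destination tile
    for i in range(rows):
        for j in range(cols):
            # Dot product of row i of a with column j of b.
            out[i][j] += sum(a[i][k] * b[k][j] for k in range(inner))
    return out
```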

SYSTEMS, METHODS, AND APPARATUSES FOR TILE TRANSPOSE

Publication number: 20250117221

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.

Type: Application

Filed: October 18, 2024

Publication date: April 10, 2025

Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Barukh ZIV, Alexander HEINECKE, Milind GIRKAR, Menachem ADELMAN, Simon RUBANOVICH
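The transpose described in the abstract of publication 20250117221 moves each source row into the corresponding destination column. A minimal Python sketch of that data movement follows. The function name `tile_transpose` is hypothetical, and the sketch says nothing about how the instruction itself is encoded or executed.

```python
def tile_transpose(src):
    """Return a new tile whose column i holds row i of src."""
    rows, cols = len(src), len(src[0])
    dst = [[0] * rows for _ in range(cols)]
    for i in range(rows):
        for j in range(cols):
            dst[j][i] = src[i][j]
    return dst
```

For a 2 × 3 source the destination is 3 × 2, so the destination tile needs as many rows as the source has columns.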

Systems for performing instructions to quickly convert and use tiles as 1D vectors

Patent number: 12265826

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Grant

Filed: December 28, 2023

Date of Patent: April 1, 2025

Assignee: Intel Corporation

Inventors: Bret Toll, Christopher J. Hughes, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke

Systems, methods, and apparatuses for matrix add, subtract, and multiply

Patent number: 12260213

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.

Type: Grant

Filed: December 10, 2021

Date of Patent: March 25, 2025

Assignee: Intel Corporation

Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Simon Rubanovich
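The element-by-element addition described in the abstract of patent 12260213 can be sketched in a few lines of Python: for each data element position, add the two source values and store the sum at the same position in the destination. The name `tile_add` and the shape check are assumptions made for this illustration.

```python
def tile_add(src1, src2):
    """Return the element-by-element sum of two equally shaped tiles."""
    same_rows = len(src1) == len(src2)
    if not same_rows or any(len(r1) != len(r2) for r1, r2 in zip(src1, src2)):
        raise ValueError("tiles must have the same shape")
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(src1, src2)]
```

Element-by-element subtraction and multiplication follow the same pattern with `-` or `*` in place of `+`.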

METHOD AND APPARATUS FOR DATA/INSTRUCTION ACCESS BASED ON PERFORMANCE HINTS

Publication number: 20250068424

Abstract: Methods, apparatus, and computer programs are disclosed for data/instruction access based on performance hints. In some embodiments, a method comprises decoding an instruction to access data or code by a core of a computer processor, the instruction to provide one or more hints on how the data or code is to be processed through a cache hierarchy of the computer processor based on the instruction, the one or more hints indicating which level of the cache hierarchy or which cache in a level of the cache hierarchy to load or store the data or code, a priority of the data or code in a cache, or how the data or code is to be shared among multiple cores of the computer processor. The method further comprises processing the data or code based on the one or more hints responsive to the decoded instruction.

Type: Application

Filed: November 8, 2024

Publication date: February 27, 2025

Inventors: Duane GALBI, Christopher J. HUGHES, Dan BAUM

METHOD AND APPARATUS FOR PARTIAL VIRTUALIZATION IN A PROCESSOR

Publication number: 20250068422

Abstract: Methods, apparatus, and computer programs are disclosed for context switching. In some embodiments, a method comprises dedicating a first subset of a plurality of vector registers to a first thread of a plurality of threads for thread execution; and responsive to a context switch from the first thread to a second thread, bypassing saving a state of the first subset of the plurality of vector registers; and saving a state of a second subset of the plurality of vector registers, wherein the second subset of the plurality of vector registers is not dedicated to the first thread, and wherein the first and second subsets are mutually exclusive.

Type: Application

Filed: November 8, 2024

Publication date: February 27, 2025

Inventors: Duane GALBI, Christopher J. HUGHES, Dan BAUM, H. Peter ANVIN, Stijn EYERMAN

APPARATUS AND METHOD FOR PREFETCHING DATA WITH HINTS

Publication number: 20250004773

Abstract: An apparatus and method are described for prefetching data with hints. For example, one embodiment of a processor comprises: a plurality of cores to process instructions; a first core of the plurality of cores comprising: decoder circuitry to decode instructions indicating memory operations including load operations of a first type with shared data hints and load operations of a second type without shared data hints; execution circuitry to execute the instructions to perform the memory operations; data prefetch circuitry to store tracking data in a tracking data structure responsive to the memory operations, a portion of the tracking data associated with the first type of load operations; and the data prefetch circuitry to detect memory access patterns using the tracking data, the data prefetch circuitry to responsively issue one or more prefetch operations using shared data hints based, at least in part, on the portion of the tracking data associated with the first type of load operations.

Type: Application

Filed: June 30, 2023

Publication date: January 2, 2025

Inventors: Christopher J. HUGHES, Zhe WANG, Dan BAUM, Venkateswara Rao MADDURI, Chen DAN, Joseph NUZMAN

VECTOR PACKED MATRIX MULTIPLICATION AND ACCUMULATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20250004768

Abstract: Decoder circuitry to decode an instruction indicating a first vector register having a 128-bit lane to store a first matrix having two rows by K columns of data elements having a number of bits, a storage location having 128 bits to store a second matrix having K rows by two columns of data elements having the number of bits, and a second vector register having a 128-bit lane to store a third matrix having two rows by two columns of data elements having a greater number of bits. Execution circuitry is to perform operations for the instruction, including to generate and store a result matrix having two rows by two columns of result data elements having the greater number of bits in 128-bit lane of second vector register. The result matrix represents accumulation of the third matrix with product matrix generated from matrix multiplication using the first and second matrices.

Type: Application

Filed: June 30, 2023

Publication date: January 2, 2025

Inventors: Alexander HEINECKE, Wing Shek WONG, Stephen ROBINSON, Raanan SADE, Amit GRADSTEIN, Simon RUBANOVICH, Michael ESPIG, Dan BAUM, Evangelos GEORGANAS, Dhiraj KALAMKAR

APPARATUS AND METHOD FOR A LOAD INSTRUCTION WITH A READ-SHARED INDICATION

Publication number: 20250004765

Abstract: Techniques for loading data with a hint related to data sharing with other cores are described. For example, one embodiment of an apparatus comprises: a plurality of cores to process instructions; a first core of the plurality of cores comprising: decoder circuitry to decode a single instruction, the single instruction having a first field for an opcode to indicate a load operation to read data from a memory, a second field to indicate a memory address for a location of the data in the memory, and a third field to store a value to indicate whether the data is expected to be shared between the first core and at least a second core of the plurality of cores; execution circuitry to execute the single instruction to read the data from the location in the memory; and cache controller circuitry to store the data in one or more caches in a state selected based on the value.

Type: Application

Filed: June 30, 2023

Publication date: January 2, 2025

Inventors: Christopher J. HUGHES, Zhe WANG, Dan BAUM, Venkateswara Rao MADDURI, Alexander HEINECKE, Evangelos GEORGANAS, Chen DAN, Joseph NUZMAN

SYSTEMS, METHODS, AND APPARATUSES FOR TILE LOAD

Publication number: 20250004716

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory is detailed. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand.

Type: Application

Filed: May 3, 2024

Publication date: January 2, 2025

Inventors: Robert VALENTINE, Menachem ADELMAN, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus Corbal, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Yuri GEBIL, Raanan SADE

MULTIPLYING AND ADDING SMALL-EXPONENT FLOATING-POINT FORMAT DATA ELEMENTS WITH INTEGER ADDITION

Publication number: 20250004721

Abstract: A method of an aspect includes multiplying pairs of corresponding small-exponent floating-point data elements to generate corresponding small-exponent floating-point products. The small-exponent floating-point data elements and the small-exponent floating-point products each have no more than six exponent bits. The method also includes converting the small-exponent floating-point products to signed fixed-point products and accumulating the signed fixed-point products, and an optional signed fixed-point accumulation value, by fixed-point addition to generate a signed fixed-point accumulation value. Other methods, processors, systems, and instructions are disclosed.

Type: Application

Filed: September 10, 2024

Publication date: January 2, 2025

Inventors: Simon Rubanovich, Amit Gradstein, Sagi Meller, Uri Reuven Tassa, Dan Baum
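The flow in the abstract of publication 20250004721 (multiply pairs, convert each product to signed fixed point, then accumulate with integer addition) can be illustrated with a short Python sketch. Ordinary Python floats stand in for the small-exponent floating-point format here, so the exponent-width limit is not modeled. The constant `FRAC_BITS` and the function names are assumptions made for this illustration.

```python
FRAC_BITS = 16  # hypothetical number of fractional bits in the fixed-point format


def to_fixed(x):
    """Convert a float to a signed fixed-point integer with FRAC_BITS fraction bits."""
    return int(round(x * (1 << FRAC_BITS)))


def dot_with_fixed_accumulate(xs, ys, acc_fixed=0):
    """Multiply corresponding elements, then accumulate the products as integers.

    acc_fixed is the optional incoming signed fixed-point accumulation value.
    The return value is also in signed fixed point.
    """
    for x, y in zip(xs, ys):
        product = x * y                  # floating-point multiply
        acc_fixed += to_fixed(product)   # integer (fixed-point) addition
    return acc_fixed
```

Dividing the result by `1 << FRAC_BITS` recovers the accumulated value as a float. Because the additions are integer additions, their result does not depend on the order in which the products are summed.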

Systems, methods, and apparatuses for tile load, multiplication and accumulation

Patent number: 12182571

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory is detailed. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand.

Type: Grant

Filed: January 23, 2023

Date of Patent: December 31, 2024

Assignee: Intel Corporation

Inventors: Robert Valentine, Menachem Adelman, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Dan Baum, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade

Systems and methods for performing matrix compress and decompress instructions

Patent number: 12175246

Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.

Type: Grant

Filed: September 1, 2023

Date of Patent: December 24, 2024

Assignee: Intel Corporation

Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
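The first compression option named in the abstract of patent 12175246, packing non-zero elements together and recording each one's matrix position in a header, can be sketched in Python as follows. The header layout used here (a list of row and column pairs plus the matrix shape) is an assumption made for illustration. The abstract does not specify the format, and the second option (representing elements with fewer bits) is not modeled.

```python
def compress(matrix):
    """Pack non-zero elements and record their (row, col) positions in a header."""
    header = []  # position of each packed element, in row-major order
    packed = []  # the non-zero values themselves
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value != 0:
                header.append((i, j))
                packed.append(value)
    shape = (len(matrix), len(matrix[0]))
    return shape, header, packed


def decompress(shape, header, packed):
    """Rebuild the full matrix, filling unlisted positions with zero."""
    rows, cols = shape
    out = [[0] * cols for _ in range(rows)]
    for (i, j), value in zip(header, packed):
        out[i][j] = value
    return out
```

A round trip through `compress` and `decompress` returns the original matrix, and the packed list shrinks as the matrix gets sparser.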

Systems, methods, and apparatuses for tile matrix multiplication and accumulation

Patent number: 12147804

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

Type: Grant

Filed: July 22, 2021

Date of Patent: November 19, 2024

Assignee: Intel Corporation

Inventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Stanislav Shwartsman, Dan Baum, Igor Yanover, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Jesus Corbal, Yuri Gebil, Simon Rubanovich

MEMORY SAFETY USING TAG CHECKING INSTRUCTIONS AND ISLANDS OF TAGS IN LINE WITH BUCKETED DATA

Publication number: 20240354108

Abstract: Techniques for implementing instructions and modified instruction encodings for checking tags and for interspersing islands of tags in line with bucketed data for locality by a processor are described. In an example, an apparatus includes decoder circuitry and execution circuitry. The decoder circuitry is to decode an instruction into a decoded instruction. The instruction has an opcode to indicate that the execution circuitry is to use metadata and instruction encodings to selectively perform a memory safety check. The execution circuitry is to execute the decoded instruction according to the opcode.

Type: Application

Filed: September 29, 2023

Publication date: October 24, 2024

Applicant: Intel Corporation

Inventors: Michael LeMay, David M. Durham, Joseph Cihula, Joseph Nuzman, Dan Baum, Jonathan Combs

Systems, methods, and apparatuses for tile transpose

Patent number: 12124847

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.

Type: Grant

Filed: July 1, 2017

Date of Patent: October 22, 2024

Assignee: Intel Corporation

Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Menachem Adelman, Simon Rubanovich

DIRECT SATURATED IN-PLACE FLOATING POINT INTO 8-BIT INTEGER DOWNCONVERT INSTRUCTION(S)

Publication number: 20240329994

Abstract: Techniques for converting floating-point to integer are described. An example of an instruction to perform such a conversion includes fields for an opcode, an identification of location of a packed data source operand, an identification of location of a packed data destination operand, an indication of a location in each packed data element of the packed data destination to store an 8-bit integer (INT8) value, wherein the opcode is to indicate that conversion circuitry is to downconvert data of each packed data element of the packed data source operand to an INT8 value and make available for storage the INT8 value in the identified location of a corresponding packed data element of the packed data destination.

Type: Application

Filed: March 30, 2023

Publication date: October 3, 2024

Inventors: Uri Sherman, Dan Baum, Menachem Adelman, Amit Gradstein
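A saturating floating-point to INT8 downconversion of the kind named in the title of publication 20240329994 can be sketched in Python as rounding followed by clamping to the signed 8-bit range. The rounding mode (Python's built-in `round`, which rounds halves to even), the restriction to finite inputs, and the function names are assumptions. The in-place byte placement described in the abstract is not modeled.

```python
INT8_MIN, INT8_MAX = -128, 127


def saturate_to_int8(x):
    """Round a finite float and clamp the result to the signed 8-bit range."""
    rounded = int(round(x))
    return max(INT8_MIN, min(INT8_MAX, rounded))


def downconvert_packed(values):
    """Apply the saturating conversion to every element of a packed list."""
    return [saturate_to_int8(v) for v in values]
```

Saturation means out-of-range inputs pin to -128 or 127 rather than wrapping around, which is the usual choice when quantizing values for low-precision arithmetic.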
