Patents by Inventor Michael Espig
Michael Espig has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12287843
Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and to execute the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices and broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE is further to store an NZ element for use in subsequent multiplications.
Type: Grant
Filed: November 6, 2023
Date of Patent: April 29, 2025
Assignee: Intel Corporation
Inventors: Dan Baum, Chen Koren, Elmoustapha Ould-Ahmed-Vall, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
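A rough software model of the skip-zero semantics this abstract describes (illustrative only — the function names and data layout are invented here, and the hardware PE grid is not modeled):

```python
# Software sketch of sparse multiply-accumulate: build non-zero (NZ)
# bitmasks for A and B, then multiply-accumulate only where a row
# element of A and a column element of B are both non-zero.
# Hypothetical model of the described semantics, not the implementation.

def nz_bitmask(vec):
    """One bit per element: 1 where the element is non-zero."""
    return [1 if x != 0 else 0 for x in vec]

def sparse_mma(A, B, C):
    """C += A @ B, skipping products where either factor is zero."""
    rows, inner, cols = len(A), len(B), len(B[0])
    for i in range(rows):
        row_mask = nz_bitmask(A[i])
        for j in range(cols):
            col = [B[k][j] for k in range(inner)]
            col_mask = nz_bitmask(col)
            for k in range(inner):
                if row_mask[k] and col_mask[k]:  # matching NZ elements only
                    C[i][j] += A[i][k] * col[k]
    return C

A = [[1, 0], [0, 2]]
B = [[3, 0], [0, 4]]
C = sparse_mma(A, B, [[0, 0], [0, 0]])
```

With sparse inputs, most candidate products are filtered out by the bitmask test before any multiplication happens, which is the source of the claimed acceleration.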
-
Publication number: 20250028533
Abstract: Techniques for zero clearing scalar moves are described. For example, one or more instructions are supported which, when executed, are to cause a scalar move of a 16-bit or 32-bit floating-point value from a source to a destination. When the destination is a vector register, all other data elements are to be zeroed.
Type: Application
Filed: July 21, 2023
Publication date: January 23, 2025
Inventors: John MORGAN, Michael ESPIG, Deepti AGGARWAL
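A minimal sketch of the zeroing behavior described above, modeling a vector register as a Python list (the function name and lane count are assumptions for illustration):

```python
# Sketch of a zero-clearing scalar move: writing one scalar into a
# vector register places the value in lane 0 and zeroes every other
# lane. Hypothetical model of the described behavior, not ISA code.

def scalar_move_zeroing(dest_lanes, value):
    """Return vector register contents after a zeroing scalar move."""
    return [value] + [0.0] * (dest_lanes - 1)

reg = scalar_move_zeroing(4, 1.5)
```

Zeroing the upper lanes removes any stale data dependence on the destination register's previous contents, which is the usual motivation for this kind of move.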
-
Publication number: 20250004764
Abstract: Techniques for providing operands of 512 bits or smaller are described. In some examples, a prefix of an instruction is utilized to define the operand (vector) length. For example, an instruction is to at least include fields for a prefix, an opcode, and operand addressing information, wherein the prefix and addressing information are to be used by decoder circuitry to determine support for a particular vector length for one or more operands of the instance of the single instruction, and the opcode is to indicate one or more operations to perform on the one or more operands.
Type: Application
Filed: July 1, 2023
Publication date: January 2, 2025
Inventors: Michael ESPIG, Menachem ADELMAN, Jonathan COMBS, Amit GRADSTEIN, Christopher J. HUGHES, Vivekananthan SANJEEPAN, Wing Shek WONG
-
Publication number: 20250004768
Abstract: Decoder circuitry to decode an instruction indicating a first vector register having a 128-bit lane to store a first matrix having two rows by K columns of data elements having a number of bits, a storage location having 128 bits to store a second matrix having K rows by two columns of data elements having the number of bits, and a second vector register having a 128-bit lane to store a third matrix having two rows by two columns of data elements having a greater number of bits. Execution circuitry is to perform operations for the instruction, including to generate and store a result matrix having two rows by two columns of result data elements having the greater number of bits in a 128-bit lane of the second vector register. The result matrix represents accumulation of the third matrix with a product matrix generated from matrix multiplication using the first and second matrices.
Type: Application
Filed: June 30, 2023
Publication date: January 2, 2025
Inventors: Alexander HEINECKE, Wing Shek WONG, Stephen ROBINSON, Raanan SADE, Amit GRADSTEIN, Simon RUBANOVICH, Michael ESPIG, Dan BAUM, Evangelos GEORGANAS, Dhiraj KALAMKAR
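The arithmetic the abstract describes — a 2-by-K times K-by-2 product accumulated into a 2-by-2 tile of wider elements — can be sketched in plain Python (register packing and element widths are not modeled; the function name is invented):

```python
# Sketch of the small-tile matrix instruction: C(2x2) += A(2xK) @ B(Kx2),
# where C's elements are wider than A's and B's. Purely illustrative.

def tile_mma_2xKx2(A, B, C):
    """Accumulate the 2xK-by-Kx2 product into the 2x2 matrix C."""
    K = len(B)
    for i in range(2):
        for j in range(2):
            acc = C[i][j]               # wider accumulator element
            for k in range(K):
                acc += A[i][k] * B[k][j]
            C[i][j] = acc
    return C

A = [[1, 2, 3], [4, 5, 6]]        # 2 x K, with K = 3
B = [[1, 0], [0, 1], [1, 1]]      # K x 2
C = tile_mma_2xKx2(A, B, [[10, 10], [10, 10]])
```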
-
Patent number: 12182570
Abstract: Systems, methods, and apparatuses to support packed data convolution instructions with shift control and width control are described.
Type: Grant
Filed: June 25, 2021
Date of Patent: December 31, 2024
Assignee: Intel Corporation
Inventors: Deepti Aggarwal, Michael Espig, Robert Valentine, Sumit Mohan, Prakaram Joshi, Richard Winterton
-
Patent number: 12175246
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
Type: Grant
Filed: September 1, 2023
Date of Patent: December 24, 2024
Assignee: Intel Corporation
Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
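The first compression path the abstract describes — packing non-zero elements together with a header recording each element's position — can be sketched as follows (the encoding details here are assumptions; the patent does not fix a concrete header format):

```python
# Sketch of the pack-non-zeros compression path: non-zero elements are
# packed contiguously and the (row, col) position of each is kept in a
# header. Hypothetical encoding, illustrative only.

def compress_pack_nz(matrix):
    """Return (header, packed): positions and values of non-zeros."""
    header, packed = [], []
    for r, row in enumerate(matrix):
        for c, v in enumerate(row):
            if v != 0:
                header.append((r, c))
                packed.append(v)
    return header, packed

def decompress_pack_nz(header, packed, shape):
    """Rebuild the dense matrix from the header and packed values."""
    rows, cols = shape
    out = [[0] * cols for _ in range(rows)]
    for (r, c), v in zip(header, packed):
        out[r][c] = v
    return out

m = [[0, 5], [7, 0]]
hdr, data = compress_pack_nz(m)
```

Round-tripping through `decompress_pack_nz` recovers the original matrix, which mirrors the paired compress/decompress instructions.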
-
Patent number: 12099838
Abstract: In an embodiment, a processor includes: a fetch circuit to fetch instructions, the instructions including a sum of squared differences (SSD) instruction; a decode circuit to decode the SSD instruction; and an execution circuit to, during an execution of the decoded SSD instruction, generate an SSD output vector based on a plurality of input vectors, the SSD output vector including a plurality of squared differences values. Other embodiments are described and claimed.
Type: Grant
Filed: December 23, 2020
Date of Patent: September 24, 2024
Assignee: Intel Corporation
Inventors: Deepti Aggarwal, Michael Espig, Chekib Nouira, Robert Valentine, Mark Charney
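A minimal sketch of the SSD output described above — a vector of squared differences computed from two input vectors (the function name is hypothetical; the real instruction's operand shapes may differ):

```python
# Sketch of sum-of-squared-differences (SSD) semantics: produce a
# vector of per-element squared differences, which can then be summed.
# Illustrative model only.

def ssd_vector(a, b):
    """Element-wise squared differences of two equal-length vectors."""
    return [(x - y) ** 2 for x, y in zip(a, b)]

diffs = ssd_vector([1, 4, 6], [2, 2, 3])
total = sum(diffs)  # the "sum" in sum of squared differences
```

SSD is a common kernel in video motion estimation and block matching, where this per-block distance is evaluated many times per frame.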
-
Publication number: 20240220248
Abstract: Techniques to restrict vector length in a processor are described. A method of an aspect that may be performed by a processor includes executing first instances of vector instructions having respective opcode values regardless of whether they specify wider vectors of a wider vector width or narrower vectors of a narrower vector width, when a control value is a first value. The method also includes executing second instances of vector instructions having the respective opcode values when they specify narrower vectors of the narrower vector width, but do not specify wider vectors of the wider vector width, when the control value is a second different value. The method also includes preventing execution of third instances of vector instructions having the respective opcode values when they specify wider vectors of the wider vector width, when the control value is the second value. Other methods, processors, and systems are disclosed.
Type: Application
Filed: December 29, 2022
Publication date: July 4, 2024
Inventors: Vivekananthan SANJEEPAN, Gilbert NEIGER, Michael ESPIG
-
Publication number: 20240103872
Abstract: Techniques for performing floating-point to integer conversion with saturation are described. In some examples, an instruction is executed to perform the conversion. In some examples, a single instruction is to include one or more fields for an opcode and one or more fields for location information for at least a first source operand and a destination operand, wherein the opcode is to indicate execution circuitry is to convert, using truncation or saturation, each floating-point data element of at least the first source operand to an integer value and store the integer value into a corresponding data element position of the destination operand, wherein truncation is to be used when a conversion is inexact and saturation is to be used when a conversion overflows.
Type: Application
Filed: March 29, 2023
Publication date: March 28, 2024
Inventors: John MORGAN, Deepti AGGARWAL, Michael ESPIG
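The truncate-when-inexact, saturate-when-overflowing rule can be sketched for a single element (the 8-bit destination width and the NaN policy below are assumptions made for illustration, not taken from the publication):

```python
# Sketch of float-to-integer conversion with truncation and saturation:
# inexact values truncate toward zero; values outside the destination
# range clamp to the nearest representable bound. Hypothetical model
# for a signed 8-bit destination element.

import math

def fp_to_int_sat(x, lo=-128, hi=127):
    """Convert one float to a signed 8-bit integer with saturation."""
    if math.isnan(x):
        return lo                # assumed NaN policy for this sketch
    t = math.trunc(x)            # truncation when conversion is inexact
    return max(lo, min(hi, t))   # saturation when conversion overflows

vals = [fp_to_int_sat(v) for v in (3.9, -3.9, 300.0, -300.0)]
```

Saturation avoids the wrap-around artifacts a plain modular conversion would produce when the float is out of range.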
-
Publication number: 20240103865
Abstract: Techniques for using and/or supporting multiplication with add and/or subtract instructions with an intermediate (after multiplication) round are described. In some examples, an instruction is supported that has at least one or more fields for an opcode and location information for three packed data source operands, wherein the opcode is to indicate execution circuitry is to perform, per packed data element position, a multiplication, a round, an addition and/or subtraction, and a round, using the three packed data source operands, with storage into a corresponding packed data element position of an identified destination location; which packed data element positions are to be added and which are to be subtracted is defined by the opcode.
Type: Application
Filed: March 30, 2023
Publication date: March 28, 2024
Inventors: Michael ESPIG, Mikko BYCKLING, Maxim LOKTYUKHIN, Dmitry Yurievich BABOKIN, Amit GRADSTEIN, Deepti AGGARWAL
-
Publication number: 20240103866
Abstract: Detailed herein are examples of instructions and their hardware support for floating-point comparison that make use of the distinction between signed integer comparison and unsigned integer comparison to make an analogous distinction between floating-point relationships that include the unordered case and those that do not. These instructions may reduce the number of instructions required to compare and conditionally execute operations in a program, including instructions to load values and instructions to explicitly test for the unordered condition.
Type: Application
Filed: July 1, 2023
Publication date: March 28, 2024
Inventors: John MORGAN, Deepti AGGARWAL, Michael ESPIG, H. Peter ANVIN
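The ordered/unordered distinction the abstract relies on comes from IEEE 754: a comparison is unordered when either operand is NaN. A predicate that folds the unordered case into the comparison result, so no separate NaN test is needed, can be sketched as (the function name and the particular predicate chosen are illustrative):

```python
# Sketch of an "unordered-inclusive" floating-point predicate: true if
# a < b, or if the pair is unordered (either operand is NaN). Folding
# the NaN case into the predicate removes the explicit unordered test.

import math

def fp_lt_or_unordered(a, b):
    """True when a < b or when the comparison is unordered."""
    return math.isnan(a) or math.isnan(b) or a < b

nan = float("nan")
```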
-
Publication number: 20240078285
Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and to execute the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices and broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE is further to store an NZ element for use in subsequent multiplications.
Type: Application
Filed: November 6, 2023
Publication date: March 7, 2024
Inventors: Dan BAUM, Chen KOREN, Elmoustapha OULD-AHMED-VALL, Michael ESPIG, Christopher J. HUGHES, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
-
Publication number: 20240045690
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
Type: Application
Filed: September 1, 2023
Publication date: February 8, 2024
Inventors: Dan BAUM, Michael ESPIG, James GUILFORD, Wajdi K. FEGHALI, Raanan SADE, Christopher J. HUGHES, Robert VALENTINE, Bret TOLL, Elmoustapha OULD-AHMED-VALL, Mark J. CHARNEY, Vinodh GOPAL, Ronen ZOHAR, Alexander F. HEINECKE
-
Patent number: 11886875
Abstract: Disclosed embodiments relate to systems and methods for performing nibble-sized operations on matrix elements. In one example, a processor includes fetch circuitry to fetch an instruction, and decode circuitry to decode the fetched instruction, the fetched instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode to indicate the processor is to, for each pair of corresponding elements of the first and second source matrices, logically partition each element into nibble-sized partitions, perform an operation indicated by the instruction on each partition, and store execution results to a corresponding nibble-sized partition of a corresponding element of the destination matrix. The exemplary processor includes execution circuitry to execute the decoded instruction as per the opcode.
Type: Grant
Filed: December 26, 2018
Date of Patent: January 30, 2024
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Jonathan D. Pearce, Dan Baum, Guei-Yuan Lueh, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
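For one byte-sized element, the nibble partitioning described above can be sketched as follows (the choice of addition modulo 16 as the per-partition operation is an assumption for illustration; the patent covers operations generally):

```python
# Sketch of a nibble-sized operation: split each byte element into two
# 4-bit nibbles, apply the operation (here: addition mod 16) to each
# nibble independently, and repack the results. Illustrative only.

def nibble_add(a_byte, b_byte):
    """Add two bytes nibble-wise; each 4-bit lane wraps independently."""
    lo = ((a_byte & 0x0F) + (b_byte & 0x0F)) & 0x0F
    hi = (((a_byte >> 4) + (b_byte >> 4)) & 0x0F) << 4
    return hi | lo

r = nibble_add(0x1F, 0x12)
```

Note the low-nibble sum wraps within its own 4 bits instead of carrying into the high nibble, which is what distinguishes this from an ordinary byte addition.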
-
Publication number: 20240004648
Abstract: Techniques for vector unpacking are described. In some examples a single instruction is executed to perform vector unpacking.
Type: Application
Filed: July 2, 2022
Publication date: January 4, 2024
Inventors: Venkateswara Rao MADDURI, Jason BRANDT, Jeff WIEDEMEIER, Michael ESPIG
-
Publication number: 20240004662
Abstract: Techniques for performing horizontal reductions are described. In some examples, an instance of a horizontal instruction is to include at least one field for an opcode, one or more fields to reference a first source operand, and one or more fields to reference a destination operand, wherein the opcode is to indicate that execution circuitry is, in response to a decoded instance of the single instruction, to at least perform a horizontal reduction using at least one data element of a non-masked data element position of at least the first source operand and store a result of the horizontal reduction in the destination operand.
Type: Application
Filed: July 2, 2022
Publication date: January 4, 2024
Inventors: Menachem ADELMAN, Amit GRADSTEIN, Regev SHEMY, Chitra NATARAJAN, Leonardo BORGES, Chytra SHIVASWAMY, Igor ERMOLAEV, Michael ESPIG, Or BEIT AHARON, Jeff WIEDEMEIER
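A masked horizontal reduction as described above can be sketched in a few lines (the choice of sum as the reduction and the list-based mask are assumptions for illustration; the publication covers reductions generally):

```python
# Sketch of a masked horizontal reduction: combine only the data
# elements at non-masked positions of the source vector into a single
# scalar result. Illustrative model; names are hypothetical.

def horizontal_reduce_sum(src, mask):
    """Sum src[i] wherever mask[i] is set (non-masked positions)."""
    return sum(v for v, m in zip(src, mask) if m)

result = horizontal_reduce_sum([1, 2, 3, 4], [1, 0, 1, 1])
```

A "horizontal" operation combines elements across one vector, in contrast to the usual "vertical" operations that combine corresponding lanes of two vectors.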
-
Patent number: 11847185
Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and to execute the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices and broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE is further to store an NZ element for use in subsequent multiplications.
Type: Grant
Filed: September 24, 2021
Date of Patent: December 19, 2023
Assignee: Intel Corporation
Inventors: Dan Baum, Chen Koren, Elmoustapha Ould-Ahmed-Vall, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
-
Patent number: 11836464
Abstract: An apparatus and method for efficiently performing a multiply add or multiply accumulate operation. For example, one embodiment of a processor comprises: a decoder to decode an instruction specifying an operation, the instruction comprising a first operand identifying a multiplier and a second operand identifying a multiplicand; and fused multiply-add (FMA) execution circuitry comprising first multiplication circuitry to perform a multiplication using the multiplicand and multiplier to generate a result for multipliers and multiplicands falling within a first precision range, and second multiplication circuitry to be used instead of the first multiplication circuitry for multipliers and multiplicands falling within a second precision range.
Type: Grant
Filed: June 14, 2022
Date of Patent: December 5, 2023
Assignee: Intel Corporation
Inventors: Aditya Varma, Michael Espig
-
Patent number: 11748103
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
Type: Grant
Filed: February 15, 2022
Date of Patent: September 5, 2023
Assignee: Intel Corporation
Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
-
Publication number: 20230185873
Abstract: Methods and apparatus relating to separable convolution filter operations on matrix multiplication arrays are described. In an embodiment, logic circuitry generates a first convolution kernel and a second convolution kernel based on a two-dimensional convolution kernel. A matrix processing array comprising a plurality of Fused Multiply-Add (FMA) blocks applies the first convolution kernel to input data during a first pass to generate intermediate data, and the matrix processing array applies the second convolution kernel to the intermediate data to generate output data. Other embodiments are also disclosed and claimed.
Type: Application
Filed: December 10, 2021
Publication date: June 15, 2023
Applicant: Intel Corporation
Inventors: Michael Espig, Deepti Aggarwal
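The two-pass structure described above rests on kernel separability: a rank-1 2-D kernel factors into a row kernel and a column kernel, and applying the two 1-D passes in sequence matches the single 2-D convolution. A plain-Python sketch (the FMA-array mapping is not modeled; function names are invented):

```python
# Sketch of separable convolution: a 2-D kernel that factors as an
# outer product of two 1-D kernels can be applied as a row pass
# followed by a column pass over the intermediate data.
# Illustrative only; 'valid' boundary handling, no padding.

def conv1d_rows(img, k):
    """Valid 1-D convolution of kernel k along each row."""
    n = len(k)
    return [[sum(row[j + t] * k[t] for t in range(n))
             for j in range(len(row) - n + 1)] for row in img]

def conv1d_cols(img, k):
    """Valid 1-D convolution of kernel k along each column."""
    n = len(k)
    return [[sum(img[i + t][j] * k[t] for t in range(n))
             for j in range(len(img[0]))] for i in range(len(img) - n + 1)]

# A 3x3 box kernel is the outer product of [1,1,1] with itself, so the
# two 1-D passes below reproduce the full 2-D box convolution.
img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
out = conv1d_cols(conv1d_rows(img, [1, 1, 1]), [1, 1, 1])
```

For an N-by-N kernel, separability cuts the per-output multiply count from N² to 2N, which is why mapping the two 1-D passes onto a matrix multiplication array is attractive.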