Patents by Inventor Mark J. Charney

Mark J. Charney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods to store a tile register pair to memory

Patent number: 12293186

Abstract: Embodiments detailed herein relate to systems and methods to store a tile register pair to memory. In one example, a processor includes: decode circuitry to decode a store matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded store matrix pair instruction to store every element of left and right tiles of the identified source matrix to corresponding element positions of left and right tiles of the identified destination matrix, respectively, wherein the executing stores a chunk of C elements of one row of the identified source matrix at a time.

Type: Grant

Filed: November 2, 2023

Date of Patent: May 6, 2025

Assignee: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
Systems and methods of instructions to accelerate multiplication of sparse matrices using bitmasks that identify non-zero elements

Patent number: 12287843

Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE further to store an NZ element for use in a subsequent multiplications.

Type: Grant

Filed: November 6, 2023

Date of Patent: April 29, 2025

Assignee: Intel Corporation

Inventors: Dan Baum, Chen Koren, Elmoustapha Ould-Ahmed-Vall, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
Systems, methods, and apparatuses for matrix operations

Patent number: 12282525

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address, and execution circuitry to execute the decoded instruction to store configuration information about usage of storage for two-dimensional data structures at the memory address.

Type: Grant

Filed: November 3, 2023

Date of Patent: April 22, 2025

Assignee: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
Systems, methods, and apparatus for tile configuration

Patent number: 12282773

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.

Type: Grant

Filed: December 8, 2023

Date of Patent: April 22, 2025

Assignee: Intel Corporation

Inventors: Menachem Adelman, Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Dan Baum, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade
Apparatuses, methods, and systems for instructions to convert 16-bit floating-point formats

Patent number: 12277419

Abstract: Systems, methods, and apparatuses relating to instructions to convert 16-bit floating-point formats are described. In one embodiment, a processor includes fetch circuitry to fetch a single instruction having fields to specify an opcode and locations of a source vector comprising N plurality of 16-bit half-precision floating-point elements, and a destination vector to store N plurality of 16-bit bfloat floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the source vector from 16-bit half-precision floating-point format to 16-bit bfloat floating-point format and store each converted element into a corresponding location of the destination vector, decode circuitry to decode the fetched single instruction into a decoded single instruction, and the execution circuitry to respond to the decoded single instruction as specified by the opcode.

Type: Grant

Filed: December 24, 2020

Date of Patent: April 15, 2025

Assignee: Intel Corporation

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Menachem Adelman, Christopher J. Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich
Systems for performing instructions to quickly convert and use tiles as 1D vectors

Patent number: 12265826

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Grant

Filed: December 28, 2023

Date of Patent: April 1, 2025

Assignee: Intel Corporation

Inventors: Bret Toll, Christopher J. Hughes, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
Systems, methods, and apparatuses for matrix add, subtract, and multiply

Patent number: 12260213

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.

Type: Grant

Filed: December 10, 2021

Date of Patent: March 25, 2025

Assignee: Intel Corporation

Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Simon Rubanovich
Systems and methods to load a tile register pair

Patent number: 12236242

Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.

Type: Grant

Filed: March 20, 2023

Date of Patent: February 25, 2025

Assignee: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
Interruptible and restartable matrix multiplication instructions, processors, methods, and systems

Patent number: 12204898

Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.

Type: Grant

Filed: August 30, 2023

Date of Patent: January 21, 2025

Assignee: Intel Corporation

Inventors: Edward T. Grochowski, Asit K. Mishra, Robert Valentine, Mark J. Charney, Simon C. Steely, Jr.
Instruction execution that broadcasts and masks data values at different levels of granularity

Patent number: 12197617

Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure.

Type: Grant

Filed: July 21, 2023

Date of Patent: January 14, 2025

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L Toll, Mark J. Charney
Apparatus and method for vector packed dual complex-by-complex and dual complex-by-complex conjugate multiplication

Patent number: 12190111

Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers and complex conjugates. For example, one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction. The execution circuitry includes multiplier circuitry to multiply select real and imaginary data elements in the first and second source registers to generate a plurality of real and imaginary products; adder circuitry to add/subtract various real and imaginary products, scale the results according to an immediate of the instruction, round the scaled results; and saturation circuitry to saturate the rounded results.

Type: Grant

Filed: June 26, 2021

Date of Patent: January 7, 2025

Assignee: Intel Corporation

Inventors: Venkateswara Rao Madduri, Robert Valentine, Mark J. Charney
SYSTEMS, APPARATUSES, AND METHODS FOR ADDITION OF PARTIAL PRODUCTS

Publication number: 20250004763

Abstract: Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand.

Type: Application

Filed: September 16, 2024

Publication date: January 2, 2025

Inventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein
Systems, methods, and apparatuses for tile load, multiplication and accumulation

Patent number: 12182571

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand to memory.

Type: Grant

Filed: January 23, 2023

Date of Patent: December 31, 2024

Assignee: Intel Corporation

Inventors: Robert Valentine, Menachem Adelman, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Dan Baum, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade
Systems and methods for computing dot products of nibbles in two tile operands

Patent number: 12182568

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).

Type: Grant

Filed: August 14, 2023

Date of Patent: December 31, 2024

Assignee: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall
Systems and methods for performing matrix compress and decompress instructions

Patent number: 12175246

Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.

Type: Grant

Filed: September 1, 2023

Date of Patent: December 24, 2024

Assignee: Intel Corporation

Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
Systems, methods, and apparatuses for tile matrix multiplication and accumulation

Patent number: 12147804

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

Type: Grant

Filed: July 22, 2021

Date of Patent: November 19, 2024

Assignee: Intel Corporation

Inventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Stanislav Shwartsman, Dan Baum, Igor Yanover, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Jesus Corbal, Yuri Gebil, Simon Rubanovich
Systems, methods, and apparatuses for heterogeneous computing

Patent number: 12135981

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

Type: Grant

Filed: June 9, 2023

Date of Patent: November 5, 2024

Assignee: Intel Corporation

Inventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
Systems and methods for performing instructions to convert to 16-bit floating-point format

Patent number: 12131154

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

Type: Grant

Filed: June 28, 2022

Date of Patent: October 29, 2024

Assignee: Intel Corporation

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
Systems, apparatuses, and methods for addition of partial products

Patent number: 12124846

Abstract: Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand.

Type: Grant

Filed: August 28, 2023

Date of Patent: October 22, 2024

Assignee: Intel Corporation

Inventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein
Systems, methods, and apparatuses for tile transpose

Patent number: 12124847

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.

Type: Grant

Filed: July 1, 2017

Date of Patent: October 22, 2024

Assignee: Intel Corporation

Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Menachem Adelman, Simon Rubanovich

1 2 3 4 5 … next