Patents by Inventor Christopher J. Hughes

Christopher J. Hughes has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

MATRIX TRANSPOSE AND MULTIPLY

Publication number: 20240329938

Abstract: Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.

Type: Application

Filed: March 15, 2024

Publication date: October 3, 2024

Applicant: Intel Corporation

Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark J. Charney, Christopher J. Hughes, Alexander F. Heinecke, Evangelos Georganas, Binh Pham
Processor instructions for data compression and decompression

Patent number: 12106104

Abstract: A processor that includes compression instructions to compress multiple adjacent data blocks of uncompressed read-only data stored in memory into one compressed read-only data block and store the compressed read-only data block in multiple adjacent blocks in the memory is provided. During execution of an application to operate on the read-only data, one of the multiple adjacent blocks storing the compressed read-only block is read from memory, stored in a prefetch buffer and decompressed in the memory controller. In response to a subsequent request during execution of the application for an adjacent data block in the compressed read-only data block, the uncompressed adjacent block is read directly from the prefetch buffer.

Type: Grant

Filed: December 23, 2020

Date of Patent: October 1, 2024

Assignee: Intel Corporation

Inventors: Zhe Wang, Alaa R. Alameldeen, Christopher J. Hughes
DYNAMIC CACHE COHERENCE PROTOCOL BASED ON RUNTIME INTERCONNECT UTILIZATION

Publication number: 20240303195

Abstract: In one embodiment, a processor includes interconnect circuitry, processing circuitry, a first cache, and cache controller circuitry. The interconnect circuitry communicates over a processor interconnect with a second processor that includes a second cache. The processing circuitry generates a memory read request for a corresponding memory address of a memory. Based on the memory read request, the cache controller circuitry detects a cache miss in the first cache, which indicates that the first cache does not contain a valid copy of data for the corresponding memory address. Based on the cache miss, the cache controller circuitry requests the data from the second cache or the memory based on a current bandwidth utilization of the processor interconnect.

Type: Application

Filed: December 15, 2021

Publication date: September 12, 2024

Applicant: Intel Corporation

Inventors: Keqiang Wu, Lingxiang Xiang, Heidi Pan, Christopher J. Hughes, Zhe Wang
Apparatuses, methods, and systems for 8-bit floating-point matrix dot product instructions

Patent number: 12056489

Abstract: Systems, methods, and apparatuses relating to 8-bit floating-point matrix dot product instructions are described.

Type: Grant

Filed: May 5, 2023

Date of Patent: August 6, 2024

Assignee: Intel Corporation

Inventors: Naveen Mellempudi, Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich
APPARATUSES, METHODS, AND SYSTEMS FOR 8-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS

Publication number: 20240241722

Abstract: Systems, methods, and apparatuses relating to 8-bit floating-point matrix dot product instructions are described.

Type: Application

Filed: March 28, 2024

Publication date: July 18, 2024

Applicant: Intel Corporation

Inventors: Naveen MELLEMPUDI, Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Christopher J. HUGHES, Evangelos GEORGANAS, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
Apparatuses, methods, and systems for 8-bit floating-point matrix dot product instructions

Patent number: 12020028

Abstract: Systems, methods, and apparatuses relating to 8-bit floating-point matrix dot product instructions are described.

Type: Grant

Filed: December 26, 2020

Date of Patent: June 25, 2024

Assignee: Intel Corporation

Inventors: Naveen Mellempudi, Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich
Instructions for remote atomic operations

Patent number: 11989555

Abstract: Disclosed embodiments relate to atomic memory operations. In one example, a method of executing an instruction atomically and with weak order includes: fetching, by fetch circuitry, the instruction from code storage, the instruction including an opcode, a source identifier, and a destination identifier, decoding, by decode circuitry, the fetched instruction, selecting, by a scheduling circuit, an execution circuit among multiple circuits in a system, scheduling, by the scheduling circuit, execution of the decoded instruction out of order with respect to other instructions, with an order selected to optimize at least one of latency, throughput, power, and performance, and executing the decoded instruction, by the execution circuit, to: atomically read a datum from a location identified by the destination identifier, perform an operation on the datum as specified by the opcode, the operation to use a source operand identified by the source identifier, and write a result back to the location.

Type: Grant

Filed: June 29, 2017

Date of Patent: May 21, 2024

Assignee: Intel Corporation

Inventors: Doddaballapur N. Jayasimha, Jonas Svennebring, Samantika S. Sury, Christopher J. Hughes, Jong Soo Park, Lingxiang Xiang
INTER-CLUSTER SHARED DATA MANAGEMENT IN SUB-NUMA CLUSTER

Publication number: 20240152448

Abstract: An embodiment of an integrated circuit may comprise circuitry communicatively coupled to two or more sub-non-uniform memory access clusters (SNCs) to allocate a specified memory space in the two or more SNCs in accordance with a SNC memory allocation policy indicated from a request to initialize the specified memory space. An embodiment of an apparatus may comprise decode circuitry to decode a single instruction, the single instruction to include a field for an opcode, and execution circuitry to execute the decoded instruction according to the opcode to provide an indicated SNC memory allocation policy (e.g., a SNC policy hint). Other embodiments are disclosed and claimed.

Type: Application

Filed: June 21, 2021

Publication date: May 9, 2024

Applicant: Intel Corporation

Inventors: Zhe Wang, Lingxiang Xiang, Christopher J. Hughes
Matrix transpose and multiply

Patent number: 11972230

Abstract: Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.

Type: Grant

Filed: June 27, 2020

Date of Patent: April 30, 2024

Assignee: Intel Corporation

Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark J. Charney, Christopher J. Hughes, Alexander F. Heinecke, Evangelos Georganas, Binh Pham
CONFIGURING AND DYNAMICALLY RECONFIGURING CHAINS OF ACCELERATORS

Publication number: 20240126555

Abstract: A method of an aspect includes receiving a request for a chained accelerator operation, and configuring a chain of accelerators to perform the chained accelerator operation. This may include configuring a first accelerator to access an input data from a source memory location in system memory, process the input data, and generate first intermediate data. This may also include configuring a second accelerator to receive the first intermediate data, without the first intermediate data having been sent to the system memory, process the first intermediate data, and generate additional data. Other apparatus, methods, systems, and machine-readable medium are disclosed.

Type: Application

Filed: October 17, 2022

Publication date: April 18, 2024

Inventors: Saurabh GAYEN, Christopher J. HUGHES, Utkarsh Y. KAKAIYA, Alexander F. HEINECKE
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS

Publication number: 20240126551

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Application

Filed: December 28, 2023

Publication date: April 18, 2024

Inventors: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
CHAINED ACCELERATOR OPERATIONS WITH STORAGE FOR INTERMEDIATE RESULTS

Publication number: 20240127392

Abstract: A chip or other apparatus of an aspect includes a first accelerator and a second accelerator. The first accelerator has support for a chained accelerator operation. The first accelerator is to be controlled as part of the chained accelerator operation to access an input data from a source memory location in system memory, process the input data, generate first intermediate data, and store the first intermediate data to a storage. The second accelerator also has support for the chained accelerator operation. The second accelerator is to be controlled as part of the chained accelerator operation to receive the first intermediate data from the storage, without the first intermediate data having been sent to the system memory, process the first intermediate data, and generate additional data. Other apparatus, methods, systems, and machine-readable medium are disclosed.

Type: Application

Filed: October 17, 2022

Publication date: April 18, 2024

Inventors: Christopher J. HUGHES, Saurabh GAYEN, Utkarsh Y. KAKAIYA, Alexander F. HEINECKE
CHAINED ACCELERATOR OPERATIONS

Publication number: 20240126613

Abstract: A chip or other apparatus of an aspect includes a first accelerator and a second accelerator. The first accelerator has support for a chained accelerator operation. The first accelerator is to be controlled as part of the chained accelerator operation to access an input data from a source memory location in system memory, process the input data, and generate first intermediate data. The second accelerator also has support for the chained accelerator operation. The second accelerator is to be controlled as part of the chained accelerator operation to receive the first intermediate data, without the first intermediate data having been sent to the system memory, process the first intermediate data, and generate additional data. Other apparatus, methods, systems, and machine-readable medium are disclosed.

Type: Application

Filed: October 17, 2022

Publication date: April 18, 2024

Inventors: Saurabh GAYEN, Christopher J. HUGHES, Utkarsh Y. KAKAIYA, Alexander F. HEINECKE
Systems and methods for performing instructions to transform matrices into row-interleaved format

Patent number: 11954490

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.

Type: Grant

Filed: April 28, 2023

Date of Patent: April 9, 2024

Assignee: Intel Corporation

Inventors: Raanan Sade, Robert Valentine, Bret Toll, Christopher J. Hughes, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
Systems for performing instructions to quickly convert and use tiles as 1D vectors

Patent number: 11954489

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Grant

Filed: December 13, 2021

Date of Patent: April 9, 2024

Assignee: Intel Corporation

Inventors: Bret Toll, Christopher J. Hughes, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS

Publication number: 20240103867

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Application

Filed: November 28, 2023

Publication date: March 28, 2024

Inventors: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
Apparatuses, methods, and systems for instructions for 16-bit floating-point matrix dot product instructions

Patent number: 11941395

Abstract: Systems, methods, and apparatuses relating to 16-bit floating-point matrix dot product instructions are described.

Type: Grant

Filed: December 24, 2020

Date of Patent: March 26, 2024

Assignee: Intel Corporation

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Menachem Adelman, Christopher J. Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich
Data element rearrangement, processors, methods, systems, and instructions

Patent number: 11941394

Abstract: A processor includes a decode unit to decode an instruction indicating a source packed data operand having source data elements and indicating a destination storage location. Each of the source data elements has a source data element value and a source data element position. An execution unit, in response to the instruction, stores a result packed data operand having result data elements each having a result data element value and a result data element position. Each result data element value is one of: (1) equal to a source data element position of a source data element, closest to one end of the source operand, having a source data element value equal to the result data element position of the result data element; and (2) a replacement value, when no source data element has a source data element value equal to the result data element position of the result data element.

Type: Grant

Filed: December 9, 2019

Date of Patent: March 26, 2024

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Jong Soo Park
Method and apparatus for data-ready memory operations

Patent number: 11934830

Abstract: Disclosed embodiments relate to a new instruction for performing data-ready memory access operations.

Type: Grant

Filed: June 13, 2022

Date of Patent: March 19, 2024

Assignee: Intel Corporation

Inventors: William M. Brown, Mikhail Plotnikov, Christopher J. Hughes
SYSTEMS AND METHODS OF INSTRUCTIONS TO ACCELERATE MULTIPLICATION OF SPARSE MATRICES USING BITMASKS THAT IDENTIFY NON-ZERO ELEMENTS

Publication number: 20240078285

Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE further to store an NZ element for use in a subsequent multiplications.

Type: Application

Filed: November 6, 2023

Publication date: March 7, 2024

Inventors: Dan BAUM, Chen KOREN, Elmoustapha OULD-AHMED-VALL, Michael ESPIG, Christopher J. HUGHES, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE

prev 1 2 3 4 5 6 … next