Patents by Inventor Christopher J. Hughes

Christopher J. Hughes has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS

Publication number: 20200104135

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Application

Filed: September 27, 2018

Publication date: April 2, 2020

Inventors: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
APPARATUS AND METHOD FOR PROCESSING STRUCTURE OF ARRAYS (SOA) AND ARRAY OF STRUCTURES (AOS) DATA

Publication number: 20200097298

Abstract: An apparatus and method for processing array of structures (AoS) and structure of arrays (SoA) data. For example, one embodiment of a processor comprises: a destination tile register to store data elements in a structure of arrays (SoA) format; a first source tile register to store indices associated with the data elements; instruction fetch circuitry to fetch an array of structures (AoS) gather instruction comprising operands identifying the first source tile register and the destination tile register; a decoder to decode the AoS gather instruction; and execution circuitry to determine a plurality of system memory addresses based on the indices from the first source tile register, to read data elements from the system memory addresses in an AoS format, and to load the data elements to the destination tile register in an SoA format.

Type: Application

Filed: September 24, 2018

Publication date: March 26, 2020

Inventors: CHRISTOPHER J. HUGHES, BRET TOLL, ALEXANDER HEINECKE, DAN BAUM, ELMOUSTAPHA OULD-AHMED-VALL, RAANAN SADE, ROBERT VALENTINE, MARK CHARNEY
APPARATUS AND METHOD FOR TILE GATHER AND TILE SCATTER

Publication number: 20200097291

Abstract: An apparatus and method for tile-based gather and scatter operations. For example, one embodiment of a processor comprises: a destination tile register to store a 2-D arrangement of data elements; a first source tile register to store indices associated with the data elements; instruction fetch circuitry to fetch a tile gather instruction comprising operands identifying the first source tile register and the destination tile register; a decoder to decode the tile gather instruction; and execution circuitry to determine a plurality of system memory addresses based on the indices from the first source tile register and to load the data elements from the system memory addresses to the destination tile register.

Type: Application

Filed: September 24, 2018

Publication date: March 26, 2020

Inventors: CHRISTOPHER J. HUGHES, BRET TOLL, ALEXANDER HEINECKE, DAN BAUM, ELMOUSTAPHA OULD-AHMED-VALL, RAANAN SADE, ROBERT VALENTINE, MARK CHARNEY
DATA ELEMENT COMPARISON PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20200089494

Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.

Type: Application

Filed: September 23, 2019

Publication date: March 19, 2020

Inventors: Asit K. MISHRA, Edward T. GROCHOWSKI, Jonathan D. PEARCE, Deborah T. MARR, Ehud COHEN, Elmoustapha OULD-AHMED-VALL, Jesus Corbal SAN ADRIAN, Robert VALENTINE, Mark J. CHARNEY, Christopher J. HUGHES, Milind B. GIRKAR
Spatial and temporal merging of remote atomic operations

Patent number: 10572260

Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations.

Type: Grant

Filed: December 29, 2017

Date of Patent: February 25, 2020

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Joseph Nuzman, Jonas Svennebring, Doddaballapur N. Jayasimha, Samantika S. Sury, David A. Koufaty, Niall D. McDonnell, Yen-Cheng Liu, Stephen R. Van Doren, Stephen J. Robinson
Methods, apparatus, instructions and logic to provide permute controls with leading zero count functionality

Patent number: 10545761

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: December 20, 2018

Date of Patent: January 28, 2020

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS OF A MATRIX OPERATIONS ACCELERATOR

Publication number: 20200026745

Abstract: Systems, methods, and apparatuses relating to a matrix operations accelerator are described.

Type: Application

Filed: September 27, 2019

Publication date: January 23, 2020

Inventors: Kamlesh R. Pillai, Christopher J. Hughes, Alexander Heinecke
Data element rearrangement, processors, methods, systems, and instructions

Patent number: 10503502

Abstract: A processor includes a decode unit to decode an instruction indicating a source packed data operand having source data elements and indicating a destination storage location. Each of the source data elements has a source data element value and a source data element position. An execution unit, in response to the instruction, stores a result packed data operand having result data elements each having a result data element value and a result data element position. Each result data element value is one of: (1) equal to a source data element position of a source data element, closest to one end of the source operand, having a source data element value equal to the result data element position of the result data element; and (2) a replacement value, when no source data element has a source data element value equal to the result data element position of the result data element.

Type: Grant

Filed: September 25, 2015

Date of Patent: December 10, 2019

Assignee: INTEL CORPORATION

Inventors: Christopher J. Hughes, Jong Soo Park
Read and write masks update instruction for vectorization of recursive computations over independent data

Patent number: 10503505

Abstract: A processor executes a mask update instruction to perform updates to a first mask register and a second mask register. A register file within the processor includes the first mask register and the second mask register. The processor includes execution circuitry to execute the mask update instruction. In response to the mask update instruction, the execution circuitry is to invert a given number of mask bits in the first mask register, and also to invert the given number of mask bits in the second mask register.

Type: Grant

Filed: April 2, 2018

Date of Patent: December 10, 2019

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikin, Christopher J. Hughes
System, Apparatus And Method For Selective Enabling Of Locality-Based Instruction Handling

Publication number: 20190370180

Abstract: In an embodiment, a processor includes a sparse access buffer having a plurality of entries each to store for a memory access instruction to a particular address, address information and count information; and a memory controller to issue read requests to a memory, the memory controller including a locality controller to receive a memory access instruction having a no-locality hint and override the no-locality hint based at least in part on the count information stored in an entry of the sparse access buffer. Other embodiments are described and claimed.

Type: Application

Filed: August 14, 2019

Publication date: December 5, 2019

Inventors: Berkin Akin, Rajat Agarwal, Jong Soo Park, Christopher J. Hughes, Chiachen Chou
Instruction and logic for suppression of hardware prefetchers

Patent number: 10496410

Abstract: A processor includes a core, a hardware prefetcher, and a prefetcher control module. The hardware prefetcher includes logic to make speculative prefetch requests, through a memory subsystem, for elements for execution by the core, and logic to store prefetched elements in a cache. The prefetcher control module includes logic to selectively suppress, based on a hardware-prefetch suppression instruction executed by the core, a speculative prefetch request to be made by the hardware prefetcher.

Type: Grant

Filed: December 23, 2014

Date of Patent: December 3, 2019

Assignee: Intel Corporation

Inventors: Alexander F. Heinecke, Christopher J. Hughes, Daehyun Kim, Jong Soo Park
No-locality hint vector memory access processors, methods, systems, and instructions

Patent number: 10467144

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory indices. The no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, is to access the data elements at memory locations that are based on the memory indices.

Type: Grant

Filed: March 30, 2018

Date of Patent: November 5, 2019

Assignee: Intel Corporation

Inventor: Christopher J. Hughes
No-locality hint vector memory access processors, methods, systems, and instructions

Patent number: 10452555

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory indices. The no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, is to access the data elements at memory locations that are based on the memory indices.

Type: Grant

Filed: March 30, 2018

Date of Patent: October 22, 2019

Assignee: Intel Corporation

Inventor: Christopher J. Hughes
Methods, apparatus, instructions and logic to provide permute controls with leading zero count functionality

Patent number: 10452398

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: December 20, 2018

Date of Patent: October 22, 2019

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
SYSTEMS AND METHODS FOR IMPLEMENTING CHAINED TILE OPERATIONS

Publication number: 20190303167

Abstract: Disclosed embodiments relate to systems and methods for implementing chained tile operations. In one example, a processor includes fetch circuitry to fetch one or more instructions until a plurality of instructions has been fetched, each instruction to specify source and destination tile operands, decode circuitry to decode the fetched instructions, and execution circuitry, responsive to the decoded instructions, to: identify first and second decoded instructions belonging to a chain of instructions, dynamically select and configure a SIMD path comprising first and second processing engines (PE) to execute the first and second decoded instructions, and set aside the specified destination of the first decoded instruction, and instead route a result of the first decoded instruction from the first PE to be used by the second PE to perform the second decoded instruction.

Type: Application

Filed: March 30, 2018

Publication date: October 3, 2019

Inventors: Christopher J. HUGHES, Alexander F. HEINECKE, Robert Valentine, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall
APPARATUS AND METHOD FOR PROCESSING EFFICIENT MULTICAST OPERATION

Publication number: 20190303152

Abstract: An apparatus and method for processing efficient multicast operation.

Type: Application

Filed: March 30, 2018

Publication date: October 3, 2019

Inventors: CHRISTOPHER J. HUGHES, DAN BAUM
Data element comparison processors, methods, systems, and instructions

Patent number: 10423411

Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.

Type: Grant

Filed: September 26, 2015

Date of Patent: September 24, 2019

Assignee: Intel Corporation

Inventors: Asit K. Mishra, Edward T. Grochowski, Jonathan D. Pearce, Deborah T. Marr, Ehud Cohen, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal San Adrian, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Milind B. Girkar
System, apparatus and method for selective enabling of locality-based instruction handling

Patent number: 10409727

Abstract: In an embodiment, a processor includes a sparse access buffer having a plurality of entries each to store for a memory access instruction to a particular address, address information and count information; and a memory controller to issue read requests to a memory, the memory controller including a locality controller to receive a memory access instruction having a no-locality hint and override the no-locality hint based at least in part on the count information stored in an entry of the sparse access buffer. Other embodiments are described and claimed.

Type: Grant

Filed: March 31, 2017

Date of Patent: September 10, 2019

Assignee: Intel Corporation

Inventors: Berkin Akin, Rajat Agarwal, Jong Soo Park, Christopher J. Hughes, Chiachen Chou
System, apparatus and method for overriding of non-locality-based instruction handling

Patent number: 10402336

Abstract: In one embodiment, a processor includes: a core including a decode unit to decode a memory access instruction having a no-locality hint to indicate that data associated with the memory access instruction has at least one of non-spatial locality and non-temporal locality; and a locality controller to determine whether to override the no-locality hint based at least in part on one or more performance monitoring values. Other embodiments are described and claimed.

Type: Grant

Filed: March 31, 2017

Date of Patent: September 3, 2019

Assignee: Intel Corporation

Inventors: Berkin Akin, Rajat Agarwal, Jong Soo Park, Christopher J. Hughes
Systems, apparatuses, and methods for data speculation execution

Patent number: 10387156

Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode, and execution hardware to execute the decoded instruction to continue a data speculative execution (DSX) and to determine that a DSX loop iteration is to be committed, commit speculative stores associated with the DSX loop iteration, and start a new DSX loop iteration.

Type: Grant

Filed: December 24, 2014

Date of Patent: August 20, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar

prev … 3 4 5 6 7 8 9 10 11 … next