Patents by Inventor Christopher J. Hughes

Christopher J. Hughes has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Coalescing adjacent gather/scatter operations

Patent number: 9626193

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 21, 2015

Date of Patent: April 18, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Coalescing adjacent gather/scatter operations

Patent number: 9626192

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 18, 2015

Date of Patent: April 18, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Coalescing adjacent gather/scatter operations

Patent number: 9612842

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 21, 2015

Date of Patent: April 4, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
DATA ELEMENT COMPARISON PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20170090924

Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.

Type: Application

Filed: September 26, 2015

Publication date: March 30, 2017

Applicant: Intel Corporation

Inventors: Asit K. Mishra, Edward T. Grochowski, Jonathan D. Pearce, Deborah T. Marr, Ehud Cohen, Elmoustapha OuId-Ahmed-Vall, Jesus Corbal San Adrian, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Milind B. Girkar
SCATTER BY INDICES TO REGISTER, AND DATA ELEMENT REARRANGEMENT, PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20170090921

Abstract: A processor includes a decode unit to decode an instruction indicating a source packed data operand having source data elements and indicating a destination storage location. Each of the source data elements has a source data element value and a source data element position. An execution unit, in response to the instruction, stores a result packed data operand having result data elements each having a result data element value and a result data element position. Each result data element value is one of: (1) equal to a source data element position of a source data element, closest to one end of the source operand, having a source data element value equal to the result data element position of the result data element; and (2) a replacement value, when no source data element has a source data element value equal to the result data element position of the result data element.

Type: Application

Filed: September 25, 2015

Publication date: March 30, 2017

Applicant: INTEL CORPORATION

Inventors: Christopher J. Hughes, Jong Soo Park
No-locality hint vector memory access processors, methods, systems, and instructions

Patent number: 9600442

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory indices. The no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, is to access the data elements at memory locations that are based on the memory indices.

Type: Grant

Filed: July 18, 2014

Date of Patent: March 21, 2017

Assignee: Intel Corporation

Inventor: Christopher J Hughes
Hardware prefetcher for indirect access patterns

Patent number: 9582422

Abstract: Two techniques address bottlenecking in processors. The first is indirect prefetching. The technique can be especially useful for graph analytics and sparse matrix applications. For graph analytics and sparse matrix applications, the addresses of most random memory accesses come from an index array B which is sequentially scanned by an application. The random accesses are actually indirect accesses in the form A[B[i]]. A hardware component is introduced to detect this pattern. The hardware can then read B a certain distance ahead, and prefetch the corresponding element in A. For example, if the “prefetch distance” is k, when B[i] is accessed, the hardware reads B[i+k], and then A[B[i+k]. For partial cacheline accessing, the indirect accesses are usually accessing random memory locations and only accessing a small portion of a cacheline. Instead of loading the whole cacheline into L1 cache, the second technique only loads a part of the cacheline.

Type: Grant

Filed: December 24, 2014

Date of Patent: February 28, 2017

Assignee: INTEL CORPORATION

Inventors: Xiangyao Yu, Christopher J. Hughes, Nadathur Rajagopalan Satish
Coalescing adjacent gather/scatter operations

Patent number: 9575765

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 18, 2015

Date of Patent: February 21, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
HYBRID HARDWARE AND SOFTWARE IMPLEMENTATION OF TRANSACTIONAL MEMORY ACCESS

Publication number: 20170039068

Abstract: Embodiments of the invention relate a hybrid hardware and software implementation of transactional memory accesses in a computer system. A processor including a transactional cache and a regular cache is utilized in a computer system that includes a policy manager to select one of a first mode (a hardware mode) or a second mode (a software mode) to implement transactional memory accesses. In the hardware mode the transactional cache is utilized to perform read and write memory operations and in the software mode the regular cache is utilized to perform read and write memory operations.

Type: Application

Filed: October 20, 2016

Publication date: February 9, 2017

Inventors: Sanjeev KUMAR, Christopher J. HUGHES, Partha KUNDU, Anthony NGUYEN
Coalescing adjacent gather/scatter operations

Patent number: 9563429

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 18, 2015

Date of Patent: February 7, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Instruction and logic to provide pushing buffer copy and store functionality

Patent number: 9563425

Abstract: Instructions and logic provide pushing buffer copy and store functionality. Some embodiments include a first hardware thread or processing core, and a second hardware thread or processing core, a cache to store cache coherent data in a cache line for a shared memory address accessible by the second hardware thread or processing core. Responsive to decoding an instruction specifying a source data operand, said shared memory address as a destination operand, and one or more owner of said shared memory address, one or more execution units copy data from the source data operand to the cache coherent data in the cache line for said shared memory address accessible by said second hardware thread or processing core in the cache when said one or more owner includes said second hardware thread or processing core.

Type: Grant

Filed: November 28, 2012

Date of Patent: February 7, 2017

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Changkyu Kim, Daehyun Kim, Victor W. Lee, Jong Soo Park
APPLICATION DRIVEN HARDWARE CACHE MANAGEMENT

Publication number: 20160378651

Abstract: A processor includes a processing core to generate a memory request for an application data in an application. The processor also includes a virtual page group memory management (VPGMM) unit coupled to the processing core to specify a caching priority (CP) to the application data for the application. The caching priority identifies importance of the application data in a cache.

Type: Application

Filed: June 24, 2015

Publication date: December 29, 2016

Inventors: Subramanya R. Dulloor, Rajesh M. Sankaran, David A. Koufaty, Christopher J. Hughes, Jong Soo Park, Sheng Li
INSTRUCTION AND LOGIC FOR CHARACTERIZATION OF DATA ACCESS

Publication number: 20160378473

Abstract: A processor includes a front end to receive an instruction, a decoder to decode the instruction, a core to execute the first instruction, and a retirement unit to retire the first instruction. The core includes logic to execute the first instruction, including logic to repeatedly record a translation lookaside buffer (TLB) until a designated number of records are determined, and flush the TLB after a flush interval.

Type: Application

Filed: June 26, 2015

Publication date: December 29, 2016

Inventors: Kshitij A. Doshi, Christopher J. Hughes
Hybrid hardware and software implementation of transactional memory access

Patent number: 9529715

Abstract: Embodiments of the invention relate a hybrid hardware and software implementation of transactional memory accesses in a computer system. A processor including a transactional cache and a regular cache is utilized in a computer system that includes a policy manager to select one of a first mode (a hardware mode) or a second mode (a software mode) to implement transactional memory accesses. In the hardware mode the transactional cache is utilized to perform read and write memory operations and in the software mode the regular cache is utilized to perform read and write memory operations.

Type: Grant

Filed: March 15, 2013

Date of Patent: December 27, 2016

Assignee: Intel Corporation

Inventors: Sanjeev Kumar, Christopher J. Hughes, Partha Kundu, Anthony Nguyen
SYSTEMS, APPARATUSES, AND METHODS FOR DATA SPECULATION EXECUTION

Publication number: 20160357556

Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode, and execution hardware to execute the decoded instruction to continue a data speculative execution (DSX) and to determine that a DSX loop iteration is to be committed, commit speculative stores associated with the DSX loop iteration, and start a new DSX loop iteration.

Type: Application

Filed: December 24, 2014

Publication date: December 8, 2016

Inventors: Elmoustapha OULD-AHMED-VALL, Christopher J. HUGHES, Robert VALENTINE, Milind B. GIRKAR
Vector instructions to enable efficient synchronization and parallel reduction operations

Patent number: 9513905

Abstract: In one embodiment, a processor may include a vector unit to perform operations on multiple data elements responsive to a single instruction, and a control unit coupled to the vector unit to provide the data elements to the vector unit, where the control unit is to enable an atomic vector operation to be performed on at least some of the data elements responsive to a first vector instruction to be executed under a first mask and a second vector instruction to be executed under a second mask. Other embodiments are described and claimed.

Type: Grant

Filed: March 28, 2008

Date of Patent: December 6, 2016

Assignee: Intel Corporation

Inventors: Mikhail Smelyanskiy, Sanjeev Kumar, Daehyun Kim, Jatin Chhugani, Changkyu Kim, Christopher J. Hughes, Victor W. Lee, Anthony D. Nguyen, Yen-Kuang Chen
Read and Write Masks Update Instruction for Vectorization of Recursive Computations Over Independent Data

Publication number: 20160335086

Abstract: A processor executes a mask update instruction to perform updates to a first mask register and a second mask register. A register file within the processor includes the first mask register and the second mask register. The processor includes execution circuitry to execute the mask update instruction. In response to the mask update instruction, the execution circuitry is to invert a given number of mask bits in the first mask register, and also to invert the given number of mask bits in the second mask register.

Type: Application

Filed: July 25, 2016

Publication date: November 17, 2016

Inventors: Mikhail Plotnikov, Andrey Naraikin, Christopher J. Hughes
CONTROLLING DISPLACEMENT IN A CO-OPERATIVE AND ADAPTIVE MULTIPLE-LEVEL MEMORY SYSTEM

Publication number: 20160321185

Abstract: In one embodiment, a processor includes a control logic to determine whether to enable an incoming data block associated with a first priority to displace, in a cache memory coupled to the processor, a candidate victim data block associated with a second priority and stored in the cache memory, based at least in part on the first and second priorities, a first access history associated with the incoming data block and a second access history associated with the candidate victim data block. Other embodiments are described and claimed.

Type: Application

Filed: April 28, 2015

Publication date: November 3, 2016

Inventors: Kshitij A. Doshi, Christopher J. Hughes
Methods, Apparatus, Instructions and Logic to Provide Permute Controls With Leading Zero Count Functionality

Publication number: 20160299763

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Application

Filed: June 21, 2016

Publication date: October 13, 2016

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
Methods, apparatus, instructions, and logic to provide vector address conflict detection functionality

Patent number: 9411584

Abstract: Instructions and logic provide SIMD address conflict detection functionality. Some embodiments include processors with a register with a variable plurality of data fields, each of the data fields to store an offset for a data element in a memory. A destination register has corresponding data fields, each of these data fields to store a variable second plurality of bits to store a conflict mask having a mask bit for each offset. Responsive to decoding a vector conflict instruction, execution units compare the offset in each data field with every less significant data field to determine if they hold a matching offset, and in corresponding conflict masks in the destination register, set any mask bits corresponding to a less significant data field with a matching offset. Vector address conflict detection can be used with variable sized elements and to generate conflict masks to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: December 29, 2012

Date of Patent: August 9, 2016

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Brett L. Toll, Mark J. Charney, Milind B. Girkar

prev … 8 9 10 11 12 13 14 15 16 … next