Patents by Inventor Christopher J. Hughes

Christopher J. Hughes has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems, apparatuses, and methods for data speculation execution

Patent number: 10387158

Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode, and execution hardware to execute the decoded instruction inside a speculative execution (DSX) and rollback execution to a stored address and clear a DSX status indication in a DSX status register, and thereby abort the DSX.

Type: Grant

Filed: December 24, 2014

Date of Patent: August 20, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar

COALESCING ADJACENT GATHER/SCATTER OPERATIONS

Publication number: 20190250921

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Application

Filed: April 29, 2019

Publication date: August 15, 2019

Inventors: Andrew T. FORSYTH, Brian J. HICKMANN, Jonathan C. HALL, Christopher J. HUGHES
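
The gather behavior described in the abstract above can be illustrated with a minimal software sketch. It assumes memory is a flat Python list and that each index names the first of two adjacent elements; the function name `coalesced_gather_pair` and the two destination lists are hypothetical stand-ins for the instruction and its storage locations, not the patented hardware design.

```python
def coalesced_gather_pair(memory, indices):
    """For each index i, read memory[i] and memory[i + 1] in one contiguous access."""
    first_dest = []   # stands in for the first storage location
    second_dest = []  # stands in for the second storage location
    for i in indices:
        # A single contiguous read covers both adjacent elements.
        pair = memory[i:i + 2]
        # The two elements land in corresponding entries of the two destinations.
        first_dest.append(pair[0])
        second_dest.append(pair[1])
    return first_dest, second_dest


# Interleaved (x, y) pairs stored back to back in memory.
memory = [10, 11, 20, 21, 30, 31, 40, 41]
xs, ys = coalesced_gather_pair(memory, [4, 0, 6])
```

Splitting each contiguous pair across two destinations is what turns two separate gathers over adjacent addresses into a single pass over memory in this sketch.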

REMOTE ATOMIC OPERATIONS IN MULTI-SOCKET SYSTEMS

Publication number: 20190243761

Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.

Type: Application

Filed: April 11, 2019

Publication date: August 8, 2019

Applicant: Intel Corporation

Inventors: Doddaballapur N. Jayasimha, Samantika S. Sury, Christopher J. Hughes, Jonas Svennebring, Yen-Cheng Liu, Stephen R. Van Doren, David A. Koufaty

DYNAMIC HOME TILE MAPPING

Publication number: 20190236013

Abstract: Technologies for migration of dynamic home tile mapping are described. An apparatus includes means for receiving coherence messages from other processor cores on the die, means for recording locations from which the coherence messages originate and means for determining distances between the requested home tiles and the locations from which the coherence messages originate. The apparatus includes means for determining whether an average distance between a particular home tile, whose identifier is stored in the home tile table, exceeds a threshold. When the average distance exceeds the defined threshold, the apparatus includes means for migrating the particular home tile to another location.

Type: Application

Filed: April 12, 2019

Publication date: August 1, 2019

Inventors: Christopher J. Hughes, Daehyun Kim, Jong Soo Park, Richard M. Yoo
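
The migration decision described in the abstract above can be sketched in a few lines. The abstract does not say how distance is measured or how the new location is chosen, so this sketch assumes a 2D mesh of tiles, Manhattan distance, and a policy of moving to the tile with the lowest average distance to the recorded requesters; the names `manhattan` and `maybe_migrate` are hypothetical.

```python
from statistics import mean


def manhattan(a, b):
    """Hop count between two (x, y) tiles on a 2D mesh (assumed distance metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def maybe_migrate(home, requesters, tiles, threshold):
    """Return the home tile to use after checking the average requester distance."""
    average = mean(manhattan(home, r) for r in requesters)
    if average <= threshold:
        return home  # requesters are close enough, so keep the current home tile
    # Assumed policy: pick the tile that minimizes average distance to requesters.
    return min(tiles, key=lambda t: mean(manhattan(t, r) for r in requesters))


tiles = [(x, y) for x in range(4) for y in range(4)]
requesters = [(3, 3), (3, 2), (2, 3)]  # recorded origins of coherence messages
new_home = maybe_migrate((0, 0), requesters, tiles, threshold=2)
```

With the requesters clustered in one corner of the mesh, the home tile at (0, 0) exceeds the threshold and the sketch moves it next to them.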

DISCRETE COSINE TRANSFORM/INVERSE DISCRETE COSINE TRANSFORM (DCT/IDCT) SYSTEMS AND METHODS

Publication number: 20190228049

Abstract: The present disclosure is directed to systems and methods for performing discrete cosine transforms and inverse discrete cosine transforms (DCT/IDCT) using a CORDIC algorithm implemented in systolic array circuitry that includes a plurality of cells or nodes, each containing circuitry to implement the CORDIC algorithm. DCT/IDCT control circuitry multiplies the systolic array output matrix generated by the systolic array circuitry by a scaling factor that may include a defined scaling value or an actual cosine value. The DCT/IDCT control circuitry causes the transfer of the scaled systolic array output matrix to combination circuitry where the DCT/IDCT input matrix is combined with the scaled systolic array output matrix to provide the DCT/IDCT output matrix. The DCT/IDCT control circuitry also transfers bypass information to at least a portion of the cells or nodes in the systolic array circuitry.

Type: Application

Filed: March 30, 2019

Publication date: July 25, 2019

Applicant: Intel Corporation

Inventors: Kamlesh R. Pillai, Christopher J. Hughes
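
For readers unfamiliar with CORDIC, the following is a minimal scalar sketch of the textbook rotation-mode algorithm, which produces cosine and sine values using only shifts, adds, and a small table of angles. It is not the systolic array or the DCT/IDCT data path from the abstract above; the function name `cordic_cos_sin` and the iteration count are assumptions chosen for illustration, and floating-point multiplication by powers of two stands in for hardware shifts.

```python
import math


def cordic_cos_sin(theta, iterations=32):
    """Approximate (cos(theta), sin(theta)) for |theta| up to about 1.74 radians."""
    # Elementary rotation angles atan(2^-i) that a hardware cell would store.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-scale to cancel the total gain.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate toward the remaining angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y


c, s = cordic_cos_sin(math.pi / 8)
```

The gain correction in this sketch is one familiar example of a scaling step applied to CORDIC results; whether it corresponds to the scaling factor named in the abstract is not something the listing states.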

SPATIAL AND TEMPORAL MERGING OF REMOTE ATOMIC OPERATIONS

Publication number: 20190205139

Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations.

Type: Application

Filed: December 29, 2017

Publication date: July 4, 2019

Inventors: Christopher J. Hughes, Joseph Nuzman, Jonas Svennebring, Doddaballapur N. Jayasimha, Samantika S. Sury, David A. Koufaty, Niall D. McDonnell, Yen-Cheng Liu, Stephen R. Van Doren, Stephen J. Robinson

NO-LOCALITY HINT VECTOR MEMORY ACCESS PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20190179762

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory indices. The no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, is to access the data elements at memory locations that are based on the memory indices.

Type: Application

Filed: February 15, 2019

Publication date: June 13, 2019

Inventor: Christopher J. Hughes

Transaction end plus commit to persistence instructions, processors, methods, and systems

Patent number: 10318295

Abstract: A processor of an aspect includes a decode unit to decode a transaction end plus commit to persistence instruction. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to atomically ensure that data associated with all prior store to memory operations made to a persistent memory, which are to have been accepted to memory when performance of the instruction begins, but which are not necessarily to have been stored in the persistent memory when the performance of the instruction begins, are to be stored in the persistent memory before the instruction becomes globally visible. The execution unit, in response to the instruction, is also to atomically end a transactional memory transaction before the instruction becomes globally visible.

Type: Grant

Filed: December 22, 2015

Date of Patent: June 11, 2019

Assignee: Intel Corporation

Inventors: Kshitij A. Doshi, Christopher J. Hughes

APPLICATION DRIVEN HARDWARE CACHE MANAGEMENT

Publication number: 20190171396

Abstract: A processor includes a processing core to generate a memory request for an application data in an application. The processor also includes a virtual page group memory management (VPGMM) unit coupled to the processing core to specify a caching priority (CP) to the application data for the application. The caching priority identifies importance of the application data in a cache.

Type: Application

Filed: November 13, 2018

Publication date: June 6, 2019

Inventors: Subramanya R. Dulloor, Rajesh M. Sankaran, David A. Koufaty, Christopher J. Hughes, Jong Soo Park, Sheng Li

Systems, apparatuses, and methods for data speculation execution

Patent number: 10303525

Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode and an operand to store a portion of a fallback address, execution hardware to execute the decoded instruction to initiate a data speculative execution (DSX) region by activating DSX tracking hardware to track speculative memory accesses and detect ordering violations in the DSX region, and storing the fallback address.

Type: Grant

Filed: December 24, 2014

Date of Patent: May 28, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar, Hideki Ido, Youfeng Wu, Cheng Wang

Dynamic home tile mapping

Patent number: 10303606

Abstract: Technologies for migration of dynamic home tile mapping are described. A cache controller can receive coherence messages from other processor cores on the die. The cache controller records locations from which the coherence messages originate and determines distances between the requested home tiles and the locations from which the coherence messages originate. The cache controller determines whether an average distance between a particular home tile, whose identifier is stored in the home tile table, exceeds a threshold. When the average distance exceeds the defined threshold, the cache controller migrates the particular home tile to another location.

Type: Grant

Filed: March 21, 2017

Date of Patent: May 28, 2019

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Daehyun Kim, Jong Soo Park, Richard M. Yoo

Remote atomic operations in multi-socket systems

Patent number: 10296459

Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.

Type: Grant

Filed: December 29, 2017

Date of Patent: May 21, 2019

Assignee: Intel Corporation

Inventors: Doddaballapur N. Jayasimha, Samantika S. Sury, Christopher J. Hughes, Jonas Svennebring, Yen-Cheng Liu, Stephen R. Van Doren, David A. Koufaty

Coalescing adjacent gather/scatter operations

Patent number: 10275257

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: May 22, 2017

Date of Patent: April 30, 2019

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes

METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE PERMUTE CONTROLS WITH LEADING ZERO COUNT FUNCTIONALITY

Publication number: 20190121642

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Application

Filed: December 20, 2018

Publication date: April 25, 2019

Inventors: Christopher J. HUGHES, Mikhail PLOTNIKOV, Andrey NARAIKIN, Robert VALENTINE
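
The per-field count described in the abstract above can be modeled in software as follows. This is a minimal sketch assuming 32-bit data fields held as Python integers; the names `lzcnt` and `vector_lzcnt` are hypothetical, and the sketch covers only the counting step, not the generation of permute controls or completion masks.

```python
def lzcnt(value, width=32):
    """Count the most significant contiguous zero bits in a width-bit field."""
    value &= (1 << width) - 1  # keep only the bits that fit in the field
    return width - value.bit_length()


def vector_lzcnt(fields, width=32):
    """Apply the count independently to every data field, as a SIMD unit would."""
    return [lzcnt(v, width) for v in fields]


counts = vector_lzcnt([0, 1, 0x00FF, 0x80000000])
```

A field of all zeros yields the full field width, and a field with its top bit set yields zero.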

METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE PERMUTE CONTROLS WITH LEADING ZERO COUNT FUNCTIONALITY

Publication number: 20190121643

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Application

Filed: December 20, 2018

Publication date: April 25, 2019

Inventors: Christopher J. HUGHES, Mikhail PLOTNIKOV, Andrey NARAIKIN, Robert VALENTINE

SYSTEMS, APPARATUSES, AND METHODS FOR DATA SPECULATION EXECUTION

Publication number: 20190121644

Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for DSX comprises execution hardware to execute instructions to begin and end a data speculative execution (DSX) and speculative instructions during the DSX, and DSX tracking hardware to track speculative memory accesses and detect ordering violations in a DSX of speculative instructions using a sequence number, addresses of instruction accesses, and whether an instruction being tracked is a write, and to trigger a mis-speculation upon an ordering violation.

Type: Application

Filed: December 24, 2014

Publication date: April 25, 2019

Inventors: Elmoustapha OULD-AHMED-VALL, Christopher J. HUGHES, Robert VALENTINE, Milind B. GIRKAR

Hybrid hardware and software implementation of transactional memory access

Patent number: 10268579

Abstract: Embodiments of the invention relate to a hybrid hardware and software implementation of transactional memory accesses in a computer system. A processor including a transactional cache and a regular cache is utilized in a computer system that includes a policy manager to select one of a first mode (a hardware mode) or a second mode (a software mode) to implement transactional memory accesses. In the hardware mode the transactional cache is utilized to perform read and write memory operations and in the software mode the regular cache is utilized to perform read and write memory operations.

Type: Grant

Filed: April 1, 2017

Date of Patent: April 23, 2019

Assignee: Intel Corporation

Inventors: Sanjeev Kumar, Christopher J. Hughes, Partha Kundu, Anthony Nguyen

SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSFORM MATRICES INTO ROW-INTERLEAVED FORMAT

Publication number: 20190102196

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.

Type: Application

Filed: September 28, 2018

Publication date: April 4, 2019

Inventors: Raanan SADE, Robert VALENTINE, Bret TOLL, Christopher J. HUGHES, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL, Mark J. CHARNEY

Methods and systems to traverse graph-based networks

Patent number: 10229670

Abstract: Methods and systems to translate input labels of arcs of a network, corresponding to a sequence of states of the network, to a list of output grammar elements of the arcs, corresponding to a sequence of grammar elements. The network may include a plurality of speech recognition models combined with a weighted finite state machine transducer (WFST). Traversal may include active arc traversal, and may include active arc propagation. Arcs may be processed in parallel, including arcs originating from multiple source states and directed to a common destination state. Self-loops associated with states may be modeled within outgoing arcs of the states, which may reduce synchronization operations. Tasks may be ordered with respect to cache-data locality to associate tasks with processing threads based at least in part on whether another task associated with a corresponding data object was previously assigned to the thread.

Type: Grant

Filed: June 24, 2013

Date of Patent: March 12, 2019

Assignee: Intel Corporation

Inventors: Kisun You, Christopher J. Hughes, Yen-Kuang Chen

No-locality hint vector memory access processors, methods, systems, and instructions

Patent number: 10210091

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory indices. The no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, is to access the data elements at memory locations that are based on the memory indices.

Type: Grant

Filed: February 15, 2017

Date of Patent: February 19, 2019

Assignee: Intel Corporation

Inventor: Christopher J. Hughes
