Patents by Inventor Roger Gramunt

Roger Gramunt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEM, APPARATUS AND METHOD FOR THROTTLING FUSION OF MICRO-OPERATIONS IN A PROCESSOR

Publication number: 20230195456

Abstract: In one embodiment, an apparatus includes: a plurality of execution circuits to execute and instruct micro-operations (?ops), where a subset of the plurality of execution circuits are capable of execution of a fused ?op; a fusion circuit coupled to at least the subset of the plurality of execution circuits, wherein the fusion circuit is to fuse at least some pairs of producer-consumer ?ops into fused ?ops; and a fusion throttle circuit coupled to the fusion circuit, wherein the fusion throttle circuit is to prevent a first ?op from being fused with another ?op based at least in part on historical information associated with the first ?op. Other embodiments are described and claimed.

Type: Application

Filed: December 22, 2021

Publication date: June 22, 2023

Inventors: Sufiyan Syed, Roger Gramunt, Jayesh Gaur, Priyank Deshpande
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS

Publication number: 20220343174

Abstract: Described herein is a graphics processor including a processing resource including a multiplier configured to multiply input associated with the instruction at one of a first plurality of bit widths, an adder configured to add a product output from the multiplier with an accumulator value at one of a second plurality of bit widths, and circuitry to select a first bit width of the first plurality of bit widths for the multiplier and a second bit width of the second plurality of bit widths for the adder.

Type: Application

Filed: May 12, 2022

Publication date: October 27, 2022

Applicant: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
Optimized compute hardware for machine learning operations

Patent number: 11334796

Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.

Type: Grant

Filed: August 3, 2020

Date of Patent: May 17, 2022

Assignee: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
METHOD, SYSTEM, AND APPARATUS FOR PAGE SIZING EXTENSION

Publication number: 20210374069

Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.

Type: Application

Filed: August 17, 2021

Publication date: December 2, 2021

Applicant: Intel Corporation

Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS

Publication number: 20210019631

Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.

Type: Application

Filed: August 3, 2020

Publication date: January 21, 2021

Applicant: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
Optimized compute hardware for machine learning operations

Patent number: 10776699

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.

Type: Grant

Filed: January 12, 2018

Date of Patent: September 15, 2020

Assignee: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
METHOD, SYSTEM, AND APPARATUS FOR PAGE SIZING EXTENSION

Publication number: 20200242046

Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.

Type: Application

Filed: September 4, 2019

Publication date: July 30, 2020

Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
Power noise injection to control rate of change of current

Patent number: 10719320

Abstract: An apparatus is provided which comprises: a component; a voltage generator to supply load current to the component; first one or more circuitries to predict that the load current is to increase from a first time; and second one or more circuitries to, in anticipation of the increase in the load current from the first time, cause the component to execute first instructions during a time period that occurs prior to the first time.

Type: Grant

Filed: July 31, 2017

Date of Patent: July 21, 2020

Assignee: Intel Corporation

Inventors: Federico Ardanaz, Roger Gramunt, Jesus Corbal, Dennis R. Bradford, Jonathan M. Eastep
Method, system, and apparatus for page sizing extension

Patent number: 10445245

Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.

Type: Grant

Filed: December 19, 2016

Date of Patent: October 15, 2019

Assignee: Intel Corporation

Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
Method, system, and apparatus for page sizing extension

Patent number: 10445244

Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.

Type: Grant

Filed: December 19, 2016

Date of Patent: October 15, 2019

Assignee: Intel Corporation

Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
POWER NOISE INJECTION TO CONTROL RATE OF CHANGE OF CURRENT

Publication number: 20190034203

Abstract: An apparatus is provided which comprises: a component; a voltage generator to supply load current to the component; first one or more circuitries to predict that the load current is to increase from a first time; and second one or more circuitries to, in anticipation of the increase in the load current from the first time, cause the component to execute first instructions during a time period that occurs prior to the first time.

Type: Application

Filed: July 31, 2017

Publication date: January 31, 2019

Inventors: Federico Ardanaz, Roger Gramunt, Jesus Corbal, Dennis R. Bradford, Jonathan M. Eastep
Stateless capture of data linear addresses during precise event based sampling

Patent number: 10175986

Abstract: A processor includes a logic for stateless capture of data linear addresses (DLA) during precise event based sampling (PEBS) for an out-of-order execution engine. The engine may include a PEBS unit with logic to increment a counter each time an instance of a designated micro-op is retired a reorder buffer, capture output DLA referenced by an instance of the micro-op that executes after the counter overflows, set a captured bit associated with a reorder buffer identifier for the instance of the micro-op, and store a PEBS record in a debug storage when the instance of the micro-op is retired from the reorder buffer. The designated micro-op references a DLA of a memory accessible to the processor.

Type: Grant

Filed: May 8, 2017

Date of Patent: January 8, 2019

Assignee: Intel Corporation

Inventors: Roger Gramunt, Ramon Matas, Benjamin C. Chaffin, Neal S. Moyer, Rammohan Padmanabhan, Alexey P. Suprun, Matthew G. Smith
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS

Publication number: 20180322390

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.

Type: Application

Filed: January 12, 2018

Publication date: November 8, 2018

Applicant: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
System and method for cache replacement using conservative set dueling

Patent number: 10007620

Abstract: A processor includes a set associative cache and a cache controller. The cache controller makes an initial association between first and second groups of sampled sets in the cache and first and second cache replacement policies. Follower sets in the cache are initially associated with the more conservative of the two policies. Following cache line insertions in a first epoch, the associations between the groups of sampled sets and cache replacement policies are swapped for the next epoch. If the less conservative policy outperforms the more conservative policy during two consecutive epochs, the follower sets are associated with the less conservative policy for the next epoch. Subsequently, if the more conservative policy outperforms the less conservative policy during any epoch, the follower sets are again associated with the more conservative policy. Performance may be measured based the number of cache misses associated with each policy.

Type: Grant

Filed: September 30, 2016

Date of Patent: June 26, 2018

Assignee: Intel Corporation

Inventors: Seth H. Pugsley, Christopher B. Wilkerson, Roger Gramunt, Jonathan C. Hall, Prabhat Jain
System and Method for Cache Replacement Using Conservative Set Dueling

Publication number: 20180095895

Abstract: A processor includes a set associative cache and a cache controller. The cache controller makes an initial association between first and second groups of sampled sets in the cache and first and second cache replacement policies. Follower sets in the cache are initially associated with the more conservative of the two policies. Following cache line insertions in a first epoch, the associations between the groups of sampled sets and cache replacement policies are swapped for the next epoch. If the less conservative policy outperforms the more conservative policy during two consecutive epochs, the follower sets are associated with the less conservative policy for the next epoch. Subsequently, if the more conservative policy outperforms the less conservative policy during any epoch, the follower sets are again associated with the more conservative policy. Performance may be measured based the number of cache misses associated with each policy.

Type: Application

Filed: September 30, 2016

Publication date: April 5, 2018

Inventors: Seth H. Pugsley, Christopher B. Wilkerson, Roger Gramunt, Jonathan C. Hall, Prabhat Jain
Method, system, and apparatus for page sizing extension

Patent number: 9934155

Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.

Type: Grant

Filed: December 20, 2012

Date of Patent: April 3, 2018

Assignee: Intel Corporation

Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
Method and apparatus for performing an efficient scatter

Patent number: 9891914

Abstract: An apparatus and method for performing an efficient scatter operation. For example, one embodiment of a processor comprises: an allocator unit to receive a scatter operation comprising a number of data elements and responsively allocate resources to execute the scatter operation; a memory execution cluster comprising at least a portion of the resources to execute the scatter operation, the resources including one or more store data buffers and one or more store address buffers; and a senior store pipeline to transfer store data elements from the store data buffers to system memory using addresses from the store address buffers prior to retirement of the scatter operation.

Type: Grant

Filed: April 10, 2015

Date of Patent: February 13, 2018

Assignee: Intel Corporation

Inventors: Ramon Matas, Alexey P. Suprun, Roger Gramunt, Chung-Lun Chan, Rammohan Padmanabhan
Scalable event handling in multi-threaded processor cores

Patent number: 9886396

Abstract: In one embodiment, a processor includes a frontend unit having an instruction decoder to receive and to decode instructions of a plurality of threads, an execution unit coupled to the instruction decoder to receive and execute the decoded instructions, and an instruction retirement unit having a retirement logic to receive the instructions from the execution unit and to retire the instructions associated with one or more of the threads that have an instruction or an event pending to be retired. The instruction retirement unit includes a thread arbitration logic to select one of the threads at a time and to dispatch the selected thread to the retirement logic for retirement processing.

Type: Grant

Filed: December 23, 2014

Date of Patent: February 6, 2018

Assignee: Intel Corporation

Inventors: Roger Gramunt, Rammohan Padmanabhan, Ramon Matas, Neal S. Moyer, Benjamin C. Chaffin, Avinash Sodani, Alexey P. Suprun, Vikram S. Sundaram, Chung-Lun Chan, Gerardo A. Fernandez, Julio Gago, Michael S. Yang, Aditya Kesiraju
Method and apparatus for efficiently managing architectural register state of a processor

Patent number: 9804842

Abstract: An apparatus and method for efficiently managing the architectural state of a processor.

Type: Grant

Filed: December 23, 2014

Date of Patent: October 31, 2017

Assignee: INTEL CORPORATION

Inventors: Jesus Corbal San Adrian, Dennis R. Bradford, Benjamin C. Chaffin, Taraneh Bahrami, Jonathan C. Hall, Thomas B. Maciukenas, Roger Gramunt, Rohan Sharma
STATELESS CAPTURE OF DATA LINEAR ADDRESSES DURING PRECISE EVENT BASED SAMPLING

Publication number: 20170242698

Abstract: A processor includes a logic for stateless capture of data linear addresses (DLA) during precise event based sampling (PEBS) for an out-of-order execution engine. The engine may include a PEBS unit with logic to increment a counter each time an instance of a designated micro-op is retired a reorder buffer, capture output DLA referenced by an instance of the micro-op that executes after the counter overflows, set a captured bit associated with a reorder buffer identifier for the instance of the micro-op, and store a PEBS record in a debug storage when the instance of the micro-op is retired from the reorder buffer. The designated micro-op references a DLA of a memory accessible to the processor.

Type: Application

Filed: May 8, 2017

Publication date: August 24, 2017

Inventors: Roger Gramunt, Ramon Matas, Benjamin C. Chaffin, Neal S. Moyer, Rammohan Padmanabhan, Alexey P. Suprun, Matthew G. Smith

1 2 3 next