Patents by Inventor Roger Gramunt

Roger Gramunt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230195456
    Abstract: In one embodiment, an apparatus includes: a plurality of execution circuits to execute and instruct micro-operations (?ops), where a subset of the plurality of execution circuits are capable of execution of a fused ?op; a fusion circuit coupled to at least the subset of the plurality of execution circuits, wherein the fusion circuit is to fuse at least some pairs of producer-consumer ?ops into fused ?ops; and a fusion throttle circuit coupled to the fusion circuit, wherein the fusion throttle circuit is to prevent a first ?op from being fused with another ?op based at least in part on historical information associated with the first ?op. Other embodiments are described and claimed.
    Type: Application
    Filed: December 22, 2021
    Publication date: June 22, 2023
    Inventors: Sufiyan Syed, Roger Gramunt, Jayesh Gaur, Priyank Deshpande
  • Publication number: 20220343174
    Abstract: Described herein is a graphics processor including a processing resource including a multiplier configured to multiply input associated with the instruction at one of a first plurality of bit widths, an adder configured to add a product output from the multiplier with an accumulator value at one of a second plurality of bit widths, and circuitry to select a first bit width of the first plurality of bit widths for the multiplier and a second bit width of the second plurality of bit widths for the adder.
    Type: Application
    Filed: May 12, 2022
    Publication date: October 27, 2022
    Applicant: Intel Corporation
    Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
  • Patent number: 11334796
    Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.
    Type: Grant
    Filed: August 3, 2020
    Date of Patent: May 17, 2022
    Assignee: Intel Corporation
    Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
  • Publication number: 20210374069
    Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.
    Type: Application
    Filed: August 17, 2021
    Publication date: December 2, 2021
    Applicant: Intel Corporation
    Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
  • Publication number: 20210019631
    Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.
    Type: Application
    Filed: August 3, 2020
    Publication date: January 21, 2021
    Applicant: Intel Corporation
    Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
  • Patent number: 10776699
    Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: September 15, 2020
    Assignee: Intel Corporation
    Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
  • Publication number: 20200242046
    Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.
    Type: Application
    Filed: September 4, 2019
    Publication date: July 30, 2020
    Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
  • Patent number: 10719320
    Abstract: An apparatus is provided which comprises: a component; a voltage generator to supply load current to the component; first one or more circuitries to predict that the load current is to increase from a first time; and second one or more circuitries to, in anticipation of the increase in the load current from the first time, cause the component to execute first instructions during a time period that occurs prior to the first time.
    Type: Grant
    Filed: July 31, 2017
    Date of Patent: July 21, 2020
    Assignee: Intel Corporation
    Inventors: Federico Ardanaz, Roger Gramunt, Jesus Corbal, Dennis R. Bradford, Jonathan M. Eastep
  • Patent number: 10445245
    Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 19, 2016
    Date of Patent: October 15, 2019
    Assignee: Intel Corporation
    Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
  • Patent number: 10445244
    Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 19, 2016
    Date of Patent: October 15, 2019
    Assignee: Intel Corporation
    Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
  • Publication number: 20190034203
    Abstract: An apparatus is provided which comprises: a component; a voltage generator to supply load current to the component; first one or more circuitries to predict that the load current is to increase from a first time; and second one or more circuitries to, in anticipation of the increase in the load current from the first time, cause the component to execute first instructions during a time period that occurs prior to the first time.
    Type: Application
    Filed: July 31, 2017
    Publication date: January 31, 2019
    Inventors: Federico Ardanaz, Roger Gramunt, Jesus Corbal, Dennis R. Bradford, Jonathan M. Eastep
  • Patent number: 10175986
    Abstract: A processor includes a logic for stateless capture of data linear addresses (DLA) during precise event based sampling (PEBS) for an out-of-order execution engine. The engine may include a PEBS unit with logic to increment a counter each time an instance of a designated micro-op is retired a reorder buffer, capture output DLA referenced by an instance of the micro-op that executes after the counter overflows, set a captured bit associated with a reorder buffer identifier for the instance of the micro-op, and store a PEBS record in a debug storage when the instance of the micro-op is retired from the reorder buffer. The designated micro-op references a DLA of a memory accessible to the processor.
    Type: Grant
    Filed: May 8, 2017
    Date of Patent: January 8, 2019
    Assignee: Intel Corporation
    Inventors: Roger Gramunt, Ramon Matas, Benjamin C. Chaffin, Neal S. Moyer, Rammohan Padmanabhan, Alexey P. Suprun, Matthew G. Smith
  • Publication number: 20180322390
    Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
  • Patent number: 10007620
    Abstract: A processor includes a set associative cache and a cache controller. The cache controller makes an initial association between first and second groups of sampled sets in the cache and first and second cache replacement policies. Follower sets in the cache are initially associated with the more conservative of the two policies. Following cache line insertions in a first epoch, the associations between the groups of sampled sets and cache replacement policies are swapped for the next epoch. If the less conservative policy outperforms the more conservative policy during two consecutive epochs, the follower sets are associated with the less conservative policy for the next epoch. Subsequently, if the more conservative policy outperforms the less conservative policy during any epoch, the follower sets are again associated with the more conservative policy. Performance may be measured based the number of cache misses associated with each policy.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: June 26, 2018
    Assignee: Intel Corporation
    Inventors: Seth H. Pugsley, Christopher B. Wilkerson, Roger Gramunt, Jonathan C. Hall, Prabhat Jain
  • Publication number: 20180095895
    Abstract: A processor includes a set associative cache and a cache controller. The cache controller makes an initial association between first and second groups of sampled sets in the cache and first and second cache replacement policies. Follower sets in the cache are initially associated with the more conservative of the two policies. Following cache line insertions in a first epoch, the associations between the groups of sampled sets and cache replacement policies are swapped for the next epoch. If the less conservative policy outperforms the more conservative policy during two consecutive epochs, the follower sets are associated with the less conservative policy for the next epoch. Subsequently, if the more conservative policy outperforms the less conservative policy during any epoch, the follower sets are again associated with the more conservative policy. Performance may be measured based the number of cache misses associated with each policy.
    Type: Application
    Filed: September 30, 2016
    Publication date: April 5, 2018
    Inventors: Seth H. Pugsley, Christopher B. Wilkerson, Roger Gramunt, Jonathan C. Hall, Prabhat Jain
  • Patent number: 9934155
    Abstract: A method, system, and apparatus may initialize a fixed plurality of page table entries for a fixed plurality of pages in memory, each page having a first size, wherein a linear address for each page table entry corresponds to a physical address and the fixed plurality of pages are aligned. A bit in each of the page table entries for the aligned pages may be set to indicate whether or not the fixed plurality of pages is to be treated as one combined page having a second page size larger than the first page size. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: April 3, 2018
    Assignee: Intel Corporation
    Inventors: Edward Grochowski, Julio Gago, Roger Gramunt, Roger Espasa, Rolf Kassa
  • Patent number: 9891914
    Abstract: An apparatus and method for performing an efficient scatter operation. For example, one embodiment of a processor comprises: an allocator unit to receive a scatter operation comprising a number of data elements and responsively allocate resources to execute the scatter operation; a memory execution cluster comprising at least a portion of the resources to execute the scatter operation, the resources including one or more store data buffers and one or more store address buffers; and a senior store pipeline to transfer store data elements from the store data buffers to system memory using addresses from the store address buffers prior to retirement of the scatter operation.
    Type: Grant
    Filed: April 10, 2015
    Date of Patent: February 13, 2018
    Assignee: Intel Corporation
    Inventors: Ramon Matas, Alexey P. Suprun, Roger Gramunt, Chung-Lun Chan, Rammohan Padmanabhan
  • Patent number: 9886396
    Abstract: In one embodiment, a processor includes a frontend unit having an instruction decoder to receive and to decode instructions of a plurality of threads, an execution unit coupled to the instruction decoder to receive and execute the decoded instructions, and an instruction retirement unit having a retirement logic to receive the instructions from the execution unit and to retire the instructions associated with one or more of the threads that have an instruction or an event pending to be retired. The instruction retirement unit includes a thread arbitration logic to select one of the threads at a time and to dispatch the selected thread to the retirement logic for retirement processing.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: February 6, 2018
    Assignee: Intel Corporation
    Inventors: Roger Gramunt, Rammohan Padmanabhan, Ramon Matas, Neal S. Moyer, Benjamin C. Chaffin, Avinash Sodani, Alexey P. Suprun, Vikram S. Sundaram, Chung-Lun Chan, Gerardo A. Fernandez, Julio Gago, Michael S. Yang, Aditya Kesiraju
  • Patent number: 9804842
    Abstract: An apparatus and method for efficiently managing the architectural state of a processor.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: October 31, 2017
    Assignee: INTEL CORPORATION
    Inventors: Jesus Corbal San Adrian, Dennis R. Bradford, Benjamin C. Chaffin, Taraneh Bahrami, Jonathan C. Hall, Thomas B. Maciukenas, Roger Gramunt, Rohan Sharma
  • Publication number: 20170242698
    Abstract: A processor includes a logic for stateless capture of data linear addresses (DLA) during precise event based sampling (PEBS) for an out-of-order execution engine. The engine may include a PEBS unit with logic to increment a counter each time an instance of a designated micro-op is retired a reorder buffer, capture output DLA referenced by an instance of the micro-op that executes after the counter overflows, set a captured bit associated with a reorder buffer identifier for the instance of the micro-op, and store a PEBS record in a debug storage when the instance of the micro-op is retired from the reorder buffer. The designated micro-op references a DLA of a memory accessible to the processor.
    Type: Application
    Filed: May 8, 2017
    Publication date: August 24, 2017
    Inventors: Roger Gramunt, Ramon Matas, Benjamin C. Chaffin, Neal S. Moyer, Rammohan Padmanabhan, Alexey P. Suprun, Matthew G. Smith