Patents by Inventor Luke Yen

Luke Yen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11669333
    Abstract: In certain aspects of the disclosure, an apparatus comprises a first scheduling pool associated with a first minimum scheduling latency and a second scheduling pool associated with a second minimum scheduling latency, the second minimum scheduling latency greater than the first minimum scheduling latency. A common instruction picker is coupled to both the first scheduling pool and the second scheduling pool. The common instruction picker may be configured to select a first instruction from the first scheduling pool and a second instruction from the second scheduling pool, and then choose either the first instruction or second instruction for dispatch according to a picking policy.
    Type: Grant
    Filed: April 26, 2018
    Date of Patent: June 6, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Rodney Wayne Smith, Raghavan Madhavan, Luke Yen, Shivam Priyadarshi, Yusuf Cagatay Tekmen
  • Patent number: 10956163
    Abstract: A processing core of a plurality of processing cores is configured to execute a speculative region of code a single atomic memory transaction with respect one or more others of the plurality of processing cores. In response to determining an abort condition for issued one of the plurality of program instructions and in response to determining that the issued program instruction is not part of a mispredicted execution path, the processing core is configured to abort an attempt to execute the speculative region of code.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: March 23, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Jaewoong Chung, David S. Christie, Michael P. Hohmuth, Stephan Diestelhorst, Martin Pohlack, Luke Yen
  • Patent number: 10877895
    Abstract: A method, apparatus, and system for prefetching exclusive cache coherence state for store instructions is disclosed. An apparatus may comprise a cache and a gather buffer coupled to the cache. The gather buffer may be configured to store a plurality of cache lines, each cache line of the plurality of cache lines associated with a store instruction. The gather buffer may be further configured to determine whether a first cache line associated with a first store instruction should be allocated in the cache. If the first cache line associated with the first store instruction is to be allocated in the cache, the gather buffer is configured to issue a pre-write request to acquire exclusive cache coherency state to the first cache line associated with the first store instruction.
    Type: Grant
    Filed: August 27, 2018
    Date of Patent: December 29, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Luke Yen, Niket Choudhary, Pritha Ghoshal, Thomas Philip Speier, Brian Michael Stempel, William James McAvoy, Patrick Eibl
  • Patent number: 10860328
    Abstract: Providing late physical register allocation and early physical register release in out-of-order processor (OOP)-based devices implementing a checkpoint-based architecture is provided. In this regard, an OOP-based device provides a register management circuit that is configured to employ a combination of the checkpoint approach and the virtual register approach. The register management circuit includes a most recent table (MRT) for tracking mappings of logical register numbers (LRNs) to physical register numbers (PRNs), a physical register file (PRF) storing information for physical registers, a virtual register file (VRF) storing data for virtual registers, and a checkpoint queue for tracking active checkpoints (each of which is a snapshot of the MRT at a given time).
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: December 8, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Shivam Priyadarshi, Rodney Wayne Smith, Yusuf Cagatay Tekmen, Luke Yen
  • Publication number: 20200097296
    Abstract: Providing late physical register allocation and early physical register release in out-of-order processor (OOP)-based devices implementing a checkpoint-based architecture is provided. In this regard, an OOP-based device provides a register management circuit that is configured to employ a combination of the checkpoint approach and the virtual register approach. The register management circuit includes a most recent table (MRT) for tracking mappings of logical register numbers (LRNs) to physical register numbers (PRNs), a physical register file (PRF) storing information for physical registers, a virtual register file (VRF) storing data for virtual registers, and a checkpoint queue for tracking active checkpoints (each of which is a snapshot of the MRT at a given time).
    Type: Application
    Filed: September 21, 2018
    Publication date: March 26, 2020
    Inventors: Shivam Priyadarshi, Rodney Wayne Smith, Yusuf Cagatay Tekmen, Luke Yen
  • Publication number: 20200065006
    Abstract: A method, apparatus, and system for prefetching exclusive cache coherence state for store instructions is disclosed. An apparatus may comprise a cache and a gather buffer coupled to the cache. The gather buffer may be configured to store a plurality of cache lines, each cache line of the plurality of cache lines associated with a store instruction. The gather buffer may be further configured to determine whether a first cache line associated with a first store instruction should be allocated in the cache. If the first cache line associated with the first store instruction is to be allocated in the cache, the gather buffer is configured to issue a pre-write request to acquire exclusive cache coherency state to the first cache line associated with the first store instruction.
    Type: Application
    Filed: August 27, 2018
    Publication date: February 27, 2020
    Inventors: Luke YEN, Niket CHOUDHARY, Pritha GHOSHAL, Thomas Philip SPEIER, Brian Michael STEMPEL, William James MCAVOY, Patrick EIBL
  • Publication number: 20190332385
    Abstract: In certain aspects of the disclosure, an apparatus comprises a first scheduling pool associated with a first minimum scheduling latency and a second scheduling pool associated with a second minimum scheduling latency, the second minimum scheduling latency greater than the first minimum scheduling latency. A common instruction picker is coupled to both the first scheduling pool and the second scheduling pool. The common instruction picker may be configured to select a first instruction from the first scheduling pool and a second instruction from the second scheduling pool, and then choose either the first instruction or second instruction for dispatch according to a picking policy.
    Type: Application
    Filed: April 26, 2018
    Publication date: October 31, 2019
    Inventors: Rodney Wayne SMITH, Raghavan MADHAVAN, Luke YEN, Shivam PRIYADARSHI, Yusuf Cagatay TEKMEN
  • Publication number: 20190155608
    Abstract: Aspects of the present disclosure include a method, a device, and a computer-readable medium for restarting an instruction pipeline of a processor that includes a decoupled fetcher. A method comprises detecting, in a processor, a re-fetch event, wherein the processor includes an instruction unit (IU) configured to fetch instructions from a decoupled fetcher (DCF), and simultaneously flushing the IU and the DCF in response to detecting of the re-fetch event.
    Type: Application
    Filed: November 16, 2018
    Publication date: May 23, 2019
    Inventors: Arthur PERAIS, Michael Scott MCILVAINE, Rami Mohammad A. AL SHEIKH, Robert Douglas CLANCY, Luke YEN, Rodney Wayne SMITH
  • Publication number: 20180121204
    Abstract: A processing core of a plurality of processing cores is configured to execute a speculative region of code a single atomic memory transaction with respect one or more others of the plurality of processing cores. In response to determining an abort condition for issued one of the plurality of program instructions and in response to determining that the issued program instruction is not part of a mispredicted execution path, the processing core is configured to abort an attempt to execute the speculative region of code.
    Type: Application
    Filed: December 18, 2017
    Publication date: May 3, 2018
    Inventors: Jaewoong CHUNG, David S. CHRISTIE, Michael P. HOHMUTH, Stephan DIESTELHORST, Martin POHLACK, Luke YEN
  • Patent number: 9880848
    Abstract: A processing core of a plurality of processing cores is configured to execute a speculative region of code as a single atomic memory transaction with respect one or more others of the plurality of processing cores. In response to determining an abort condition for an issued one of the plurality of program instructions and in response to determining that the issued program instruction is not part of a mispredicted execution path, the processing core is configured to abort an attempt to execute the speculative region of code.
    Type: Grant
    Filed: June 11, 2010
    Date of Patent: January 30, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jaewoong Chung, David S. Christie, Michael P. Hohmuth, Stephan Diestelhorst, Martin T. Pohlack, Luke Yen
  • Patent number: 9710276
    Abstract: In a normal, non-loop mode a uOp buffer receives and stores for dispatch the uOps generated by a decode stage based on a received instruction sequence. In response to detecting a loop in the instruction sequence, the uOp buffer is placed into a loop mode whereby, after the uOps associated with the loop have been stored at the uOp buffer, storage of further uOps at the buffer is suspended. To execute the loop, the uOp buffer repeatedly dispatches the uOps associated with the loop's instructions until the end condition of the loop is met and the uOp buffer exits the loop mode.
    Type: Grant
    Filed: November 9, 2012
    Date of Patent: July 18, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventors: David N. Suggs, Luke Yen, Steven Beigelmacher
  • Publication number: 20170046158
    Abstract: Systems and methods for identifying candidate load instructions for prefetch operations based on at least instruction encoding of the load instructions, include an identifier based on a function of at least one or more fields of a load instruction and optionally, a subset of bits of the PC value of the load instruction, wherein the one or more fields exclude a full address or program counter (PC) value of the load instruction. Prefetch mechanisms, including a prefetch table indexed by the identifier, can determine whether the load instruction is a candidate load instruction for prefetching load data, based on the identifier. The function may be a hash, a concatenation, or a combination thereof, of one or more bits of the one or more fields. The fields include one or more of a base register, a destination register, an immediate offset, an offset register, or other bits of instruction encoding of the load instruction.
    Type: Application
    Filed: August 14, 2015
    Publication date: February 16, 2017
    Inventors: Luke YEN, Michael William MORROW, Thomas Philip SPEIER, James Norris DIEFFENDERFER
  • Publication number: 20170046167
    Abstract: Predicting memory instruction punts in a computer processor using a punt avoidance table (PAT) are disclosed. In one aspect, an instruction processing circuit accesses a PAT containing entries each comprising an address of a memory instruction. Upon detecting a memory instruction in an instruction stream, the instruction processing circuit determines whether the PAT contains an entry having an address of the memory instruction. If so, the instruction processing circuit prevents the detected memory instruction from taking effect before at least one pending memory instruction older than the detected memory instruction, to preempt a memory instruction punt. In some aspects, the instruction processing circuit may determine, upon execution of a pending memory instruction, whether a hazard associated with the detected memory instruction has occurred. If so, an entry for the detected memory instruction is generated in the PAT.
    Type: Application
    Filed: September 24, 2015
    Publication date: February 16, 2017
    Inventors: Luke Yen, Michael William Morrow, Jeffery Michael Schottmiller, James Norris Dieffenderfer
  • Patent number: 9459877
    Abstract: An apparatus, computer readable medium, and method of performing nested speculative regions are presented. The method includes responding to entering a speculative region by storing link information to an abort handler and responding to a commit command by removing link information from the abort handler. The method may include storing link information to the abort handler associated with the speculative region. When the speculative region is nested, the method may include storing link information to an abort handler associated with a previous speculative region. Removing link information may include removing link information from the abort handler associated with the corresponding speculative region. The method may include restoring link information to the abort handler associated with a previous speculative region. Responding to an abort command may include running the abort handler associated with the aborted speculative region.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: October 4, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, Luke Yen
  • Patent number: 9367310
    Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the location of the data accessed by the instruction at the processor stack relative to a base location. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.
    Type: Grant
    Filed: June 20, 2013
    Date of Patent: June 14, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kai Troester, Luke Yen
  • Patent number: 9292292
    Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the entry of the stack accessed by the instruction stack relative to a base entry. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.
    Type: Grant
    Filed: June 20, 2013
    Date of Patent: March 22, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kai Troester, Luke Yen
  • Patent number: 9152509
    Abstract: A computing device initiates a transaction, corresponding to an application, which includes operations for accessing data stored in a shared memory and buffering alterations to the data as speculative alterations to the shared memory. The computing device detects a transaction abort scenario corresponding to the transaction and notifies the application regarding the transaction abort scenario. The computing device determines whether to abort the transaction based on instructions received from the application regarding the transaction abort scenario. When the transaction is to be aborted, the computing device restores the transaction to an operation prior to accessing the data stored in the shared memory and buffering alterations to the data as speculative alterations to the shared memory. When the transaction is not to be aborted, the computing device enables the transaction to continue.
    Type: Grant
    Filed: December 13, 2011
    Date of Patent: October 6, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, Luke Yen
  • Publication number: 20140379986
    Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the entry of the stack accessed by the instruction stack relative to a base entry. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.
    Type: Application
    Filed: June 20, 2013
    Publication date: December 25, 2014
    Inventors: Kai Troester, Luke Yen
  • Publication number: 20140380022
    Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the location of the data accessed by the instruction at the processor stack relative to a base location. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.
    Type: Application
    Filed: June 20, 2013
    Publication date: December 25, 2014
    Inventors: Kai Troester, Luke Yen
  • Publication number: 20140181480
    Abstract: An apparatus, computer readable medium, and method of performing nested speculative regions are presented. The method includes responding to entering a speculative region by storing link information to an abort handler and responding to a commit command by removing link information from the abort handler. The method may include storing link information to the abort handler associated with the speculative region. When the speculative region is nested, the method may include storing link information to an abort handler associated with a previous speculative region. Removing link information may include removing link information from the abort handler associated with the corresponding speculative region. The method may include restoring link information to the abort handler associated with a previous speculative region. Responding to an abort command may include running the abort handler associated with the aborted speculative region.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 26, 2014
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, Luke Yen