Patents by Inventor Luke Yen
Luke Yen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11669333Abstract: In certain aspects of the disclosure, an apparatus comprises a first scheduling pool associated with a first minimum scheduling latency and a second scheduling pool associated with a second minimum scheduling latency, the second minimum scheduling latency greater than the first minimum scheduling latency. A common instruction picker is coupled to both the first scheduling pool and the second scheduling pool. The common instruction picker may be configured to select a first instruction from the first scheduling pool and a second instruction from the second scheduling pool, and then choose either the first instruction or second instruction for dispatch according to a picking policy.Type: GrantFiled: April 26, 2018Date of Patent: June 6, 2023Assignee: Qualcomm IncorporatedInventors: Rodney Wayne Smith, Raghavan Madhavan, Luke Yen, Shivam Priyadarshi, Yusuf Cagatay Tekmen
-
Patent number: 10956163Abstract: A processing core of a plurality of processing cores is configured to execute a speculative region of code a single atomic memory transaction with respect one or more others of the plurality of processing cores. In response to determining an abort condition for issued one of the plurality of program instructions and in response to determining that the issued program instruction is not part of a mispredicted execution path, the processing core is configured to abort an attempt to execute the speculative region of code.Type: GrantFiled: December 18, 2017Date of Patent: March 23, 2021Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Jaewoong Chung, David S. Christie, Michael P. Hohmuth, Stephan Diestelhorst, Martin Pohlack, Luke Yen
-
Method, apparatus, and system for prefetching exclusive cache coherence state for store instructions
Patent number: 10877895Abstract: A method, apparatus, and system for prefetching exclusive cache coherence state for store instructions is disclosed. An apparatus may comprise a cache and a gather buffer coupled to the cache. The gather buffer may be configured to store a plurality of cache lines, each cache line of the plurality of cache lines associated with a store instruction. The gather buffer may be further configured to determine whether a first cache line associated with a first store instruction should be allocated in the cache. If the first cache line associated with the first store instruction is to be allocated in the cache, the gather buffer is configured to issue a pre-write request to acquire exclusive cache coherency state to the first cache line associated with the first store instruction.Type: GrantFiled: August 27, 2018Date of Patent: December 29, 2020Assignee: Qualcomm IncorporatedInventors: Luke Yen, Niket Choudhary, Pritha Ghoshal, Thomas Philip Speier, Brian Michael Stempel, William James McAvoy, Patrick Eibl -
Patent number: 10860328Abstract: Providing late physical register allocation and early physical register release in out-of-order processor (OOP)-based devices implementing a checkpoint-based architecture is provided. In this regard, an OOP-based device provides a register management circuit that is configured to employ a combination of the checkpoint approach and the virtual register approach. The register management circuit includes a most recent table (MRT) for tracking mappings of logical register numbers (LRNs) to physical register numbers (PRNs), a physical register file (PRF) storing information for physical registers, a virtual register file (VRF) storing data for virtual registers, and a checkpoint queue for tracking active checkpoints (each of which is a snapshot of the MRT at a given time).Type: GrantFiled: September 21, 2018Date of Patent: December 8, 2020Assignee: Qualcomm IncorporatedInventors: Shivam Priyadarshi, Rodney Wayne Smith, Yusuf Cagatay Tekmen, Luke Yen
-
Publication number: 20200097296Abstract: Providing late physical register allocation and early physical register release in out-of-order processor (OOP)-based devices implementing a checkpoint-based architecture is provided. In this regard, an OOP-based device provides a register management circuit that is configured to employ a combination of the checkpoint approach and the virtual register approach. The register management circuit includes a most recent table (MRT) for tracking mappings of logical register numbers (LRNs) to physical register numbers (PRNs), a physical register file (PRF) storing information for physical registers, a virtual register file (VRF) storing data for virtual registers, and a checkpoint queue for tracking active checkpoints (each of which is a snapshot of the MRT at a given time).Type: ApplicationFiled: September 21, 2018Publication date: March 26, 2020Inventors: Shivam Priyadarshi, Rodney Wayne Smith, Yusuf Cagatay Tekmen, Luke Yen
-
METHOD, APPARATUS, AND SYSTEM FOR PREFETCHING EXCLUSIVE CACHE COHERENCE STATE FOR STORE INSTRUCTIONS
Publication number: 20200065006Abstract: A method, apparatus, and system for prefetching exclusive cache coherence state for store instructions is disclosed. An apparatus may comprise a cache and a gather buffer coupled to the cache. The gather buffer may be configured to store a plurality of cache lines, each cache line of the plurality of cache lines associated with a store instruction. The gather buffer may be further configured to determine whether a first cache line associated with a first store instruction should be allocated in the cache. If the first cache line associated with the first store instruction is to be allocated in the cache, the gather buffer is configured to issue a pre-write request to acquire exclusive cache coherency state to the first cache line associated with the first store instruction.Type: ApplicationFiled: August 27, 2018Publication date: February 27, 2020Inventors: Luke YEN, Niket CHOUDHARY, Pritha GHOSHAL, Thomas Philip SPEIER, Brian Michael STEMPEL, William James MCAVOY, Patrick EIBL -
Publication number: 20190332385Abstract: In certain aspects of the disclosure, an apparatus comprises a first scheduling pool associated with a first minimum scheduling latency and a second scheduling pool associated with a second minimum scheduling latency, the second minimum scheduling latency greater than the first minimum scheduling latency. A common instruction picker is coupled to both the first scheduling pool and the second scheduling pool. The common instruction picker may be configured to select a first instruction from the first scheduling pool and a second instruction from the second scheduling pool, and then choose either the first instruction or second instruction for dispatch according to a picking policy.Type: ApplicationFiled: April 26, 2018Publication date: October 31, 2019Inventors: Rodney Wayne SMITH, Raghavan MADHAVAN, Luke YEN, Shivam PRIYADARSHI, Yusuf Cagatay TEKMEN
-
Publication number: 20190155608Abstract: Aspects of the present disclosure include a method, a device, and a computer-readable medium for restarting an instruction pipeline of a processor that includes a decoupled fetcher. A method comprises detecting, in a processor, a re-fetch event, wherein the processor includes an instruction unit (IU) configured to fetch instructions from a decoupled fetcher (DCF), and simultaneously flushing the IU and the DCF in response to detecting of the re-fetch event.Type: ApplicationFiled: November 16, 2018Publication date: May 23, 2019Inventors: Arthur PERAIS, Michael Scott MCILVAINE, Rami Mohammad A. AL SHEIKH, Robert Douglas CLANCY, Luke YEN, Rodney Wayne SMITH
-
Publication number: 20180121204Abstract: A processing core of a plurality of processing cores is configured to execute a speculative region of code a single atomic memory transaction with respect one or more others of the plurality of processing cores. In response to determining an abort condition for issued one of the plurality of program instructions and in response to determining that the issued program instruction is not part of a mispredicted execution path, the processing core is configured to abort an attempt to execute the speculative region of code.Type: ApplicationFiled: December 18, 2017Publication date: May 3, 2018Inventors: Jaewoong CHUNG, David S. CHRISTIE, Michael P. HOHMUTH, Stephan DIESTELHORST, Martin POHLACK, Luke YEN
-
Patent number: 9880848Abstract: A processing core of a plurality of processing cores is configured to execute a speculative region of code as a single atomic memory transaction with respect one or more others of the plurality of processing cores. In response to determining an abort condition for an issued one of the plurality of program instructions and in response to determining that the issued program instruction is not part of a mispredicted execution path, the processing core is configured to abort an attempt to execute the speculative region of code.Type: GrantFiled: June 11, 2010Date of Patent: January 30, 2018Assignee: Advanced Micro Devices, Inc.Inventors: Jaewoong Chung, David S. Christie, Michael P. Hohmuth, Stephan Diestelhorst, Martin T. Pohlack, Luke Yen
-
Patent number: 9710276Abstract: In a normal, non-loop mode a uOp buffer receives and stores for dispatch the uOps generated by a decode stage based on a received instruction sequence. In response to detecting a loop in the instruction sequence, the uOp buffer is placed into a loop mode whereby, after the uOps associated with the loop have been stored at the uOp buffer, storage of further uOps at the buffer is suspended. To execute the loop, the uOp buffer repeatedly dispatches the uOps associated with the loop's instructions until the end condition of the loop is met and the uOp buffer exits the loop mode.Type: GrantFiled: November 9, 2012Date of Patent: July 18, 2017Assignee: Advanced Micro Devices, Inc.Inventors: David N. Suggs, Luke Yen, Steven Beigelmacher
-
Publication number: 20170046158Abstract: Systems and methods for identifying candidate load instructions for prefetch operations based on at least instruction encoding of the load instructions, include an identifier based on a function of at least one or more fields of a load instruction and optionally, a subset of bits of the PC value of the load instruction, wherein the one or more fields exclude a full address or program counter (PC) value of the load instruction. Prefetch mechanisms, including a prefetch table indexed by the identifier, can determine whether the load instruction is a candidate load instruction for prefetching load data, based on the identifier. The function may be a hash, a concatenation, or a combination thereof, of one or more bits of the one or more fields. The fields include one or more of a base register, a destination register, an immediate offset, an offset register, or other bits of instruction encoding of the load instruction.Type: ApplicationFiled: August 14, 2015Publication date: February 16, 2017Inventors: Luke YEN, Michael William MORROW, Thomas Philip SPEIER, James Norris DIEFFENDERFER
-
Publication number: 20170046167Abstract: Predicting memory instruction punts in a computer processor using a punt avoidance table (PAT) are disclosed. In one aspect, an instruction processing circuit accesses a PAT containing entries each comprising an address of a memory instruction. Upon detecting a memory instruction in an instruction stream, the instruction processing circuit determines whether the PAT contains an entry having an address of the memory instruction. If so, the instruction processing circuit prevents the detected memory instruction from taking effect before at least one pending memory instruction older than the detected memory instruction, to preempt a memory instruction punt. In some aspects, the instruction processing circuit may determine, upon execution of a pending memory instruction, whether a hazard associated with the detected memory instruction has occurred. If so, an entry for the detected memory instruction is generated in the PAT.Type: ApplicationFiled: September 24, 2015Publication date: February 16, 2017Inventors: Luke Yen, Michael William Morrow, Jeffery Michael Schottmiller, James Norris Dieffenderfer
-
Patent number: 9459877Abstract: An apparatus, computer readable medium, and method of performing nested speculative regions are presented. The method includes responding to entering a speculative region by storing link information to an abort handler and responding to a commit command by removing link information from the abort handler. The method may include storing link information to the abort handler associated with the speculative region. When the speculative region is nested, the method may include storing link information to an abort handler associated with a previous speculative region. Removing link information may include removing link information from the abort handler associated with the corresponding speculative region. The method may include restoring link information to the abort handler associated with a previous speculative region. Responding to an abort command may include running the abort handler associated with the aborted speculative region.Type: GrantFiled: December 21, 2012Date of Patent: October 4, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, Luke Yen
-
Patent number: 9367310Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the location of the data accessed by the instruction at the processor stack relative to a base location. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.Type: GrantFiled: June 20, 2013Date of Patent: June 14, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Kai Troester, Luke Yen
-
Patent number: 9292292Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the entry of the stack accessed by the instruction stack relative to a base entry. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.Type: GrantFiled: June 20, 2013Date of Patent: March 22, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Kai Troester, Luke Yen
-
Patent number: 9152509Abstract: A computing device initiates a transaction, corresponding to an application, which includes operations for accessing data stored in a shared memory and buffering alterations to the data as speculative alterations to the shared memory. The computing device detects a transaction abort scenario corresponding to the transaction and notifies the application regarding the transaction abort scenario. The computing device determines whether to abort the transaction based on instructions received from the application regarding the transaction abort scenario. When the transaction is to be aborted, the computing device restores the transaction to an operation prior to accessing the data stored in the shared memory and buffering alterations to the data as speculative alterations to the shared memory. When the transaction is not to be aborted, the computing device enables the transaction to continue.Type: GrantFiled: December 13, 2011Date of Patent: October 6, 2015Assignee: Advanced Micro Devices, Inc.Inventors: Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, Luke Yen
-
Publication number: 20140379986Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the entry of the stack accessed by the instruction stack relative to a base entry. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.Type: ApplicationFiled: June 20, 2013Publication date: December 25, 2014Inventors: Kai Troester, Luke Yen
-
Publication number: 20140380022Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the location of the data accessed by the instruction at the processor stack relative to a base location. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.Type: ApplicationFiled: June 20, 2013Publication date: December 25, 2014Inventors: Kai Troester, Luke Yen
-
Publication number: 20140181480Abstract: An apparatus, computer readable medium, and method of performing nested speculative regions are presented. The method includes responding to entering a speculative region by storing link information to an abort handler and responding to a commit command by removing link information from the abort handler. The method may include storing link information to the abort handler associated with the speculative region. When the speculative region is nested, the method may include storing link information to an abort handler associated with a previous speculative region. Removing link information may include removing link information from the abort handler associated with the corresponding speculative region. The method may include restoring link information to the abort handler associated with a previous speculative region. Responding to an abort command may include running the abort handler associated with the aborted speculative region.Type: ApplicationFiled: December 21, 2012Publication date: June 26, 2014Applicant: ADVANCED MICRO DEVICES, INC.Inventors: Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, Luke Yen