Patents by Inventor Deepankar Duggal

Deepankar Duggal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11416254
    Abstract: Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: August 16, 2022
    Assignee: Apple Inc.
    Inventors: Deepankar Duggal, Kulin N. Kothari, Conrado Blasco, Muawya M. Al-Otoom
  • Patent number: 11200062
    Abstract: Systems, apparatuses, and methods for implementing a physical register last reference scheme are described. A system includes a processor with a mapper, history file, and freelist. When an entry in the mapper is updated with a new architectural register-to-physical register mapping, the processor creates a new history file entry for the given instruction that caused the update. The processor also searches the mapper to determine if the old physical register that was previously stored in the mapper entry is referenced by any other mapper entries. If there are no other mapper entries that reference this old physical register, then a last reference indicator is stored in the new history file entry. When the given instruction retires, the processor checks the last reference indicator in the history file entry to determine whether the old physical register can be returned to the freelist of available physical registers.
    Type: Grant
    Filed: August 26, 2019
    Date of Patent: December 14, 2021
    Assignee: Apple Inc.
    Inventors: Deepankar Duggal, Conrado Blasco, Muawya M. Al-Otoom, Richard F. Russo
  • Publication number: 20210173654
    Abstract: Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.
    Type: Application
    Filed: December 5, 2019
    Publication date: June 10, 2021
    Inventors: Deepankar Duggal, Kulin N. Kothari, Conrado Blasco, Muawya M. Al-Otoom
  • Publication number: 20210064376
    Abstract: Systems, apparatuses, and methods for implementing a physical register last reference scheme are described. A system includes a processor with a mapper, history file, and freelist. When an entry in the mapper is updated with a new architectural register-to-physical register mapping, the processor creates a new history file entry for the given instruction that caused the update. The processor also searches the mapper to determine if the old physical register that was previously stored in the mapper entry is referenced by any other mapper entries. If there are no other mapper entries that reference this old physical register, then a last reference indicator is stored in the new history file entry. When the given instruction retires, the processor checks the last reference indicator in the history file entry to determine whether the old physical register can be returned to the freelist of available physical registers.
    Type: Application
    Filed: August 26, 2019
    Publication date: March 4, 2021
    Inventors: Deepankar Duggal, Conrado Blasco, Muawya M. Al-Otoom, Richard F. Russo
  • Patent number: 10846091
    Abstract: In an embodiment, a coprocessor includes multiple processing elements arranged in a grid of one or more rows and one or more columns. A given processing element includes an arithmetic/logic unit (ALU) circuit configured to perform an ALU operation specified by an instruction executable by the coprocessor, wherein the ALU circuit is configured to produce a result. The given processing element further comprises a first memory coupled to the execute circuit. The first memory is configured to store results generated by the given processing element. The first memory includes a portion of a result memory implemented by the coprocessor, wherein locations in the result memory are specifiable as destination operands of instructions executable by the coprocessor. The portion of the result memory implemented by the first memory is the portion of the result memory that the given processing element is capable of updating.
    Type: Grant
    Filed: February 26, 2019
    Date of Patent: November 24, 2020
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Deepankar Duggal, Ran A. Chachick
  • Patent number: 10838723
    Abstract: Techniques are disclosed relating to speculative writes to special-purpose registers (SPRs). In some embodiments, the disclosed techniques may reduce or avoid system instruction stalls while waiting for SPR writes, which may improve processor performance. In some embodiments, a processor includes a first storage element configured to store a non-speculative value of a special-purpose register and speculative storage circuitry configured to store one or more speculative values of the special-purpose register based on one or more speculatively-performed writes to the special-purpose register. In some embodiments, the processor includes control circuitry configured to: propagate the non-speculative value of the special-purpose register to control other circuitry and provide a youngest speculative value of the special-purpose register in the speculative storage circuitry as a speculative read of the special-purpose register.
    Type: Grant
    Filed: February 27, 2019
    Date of Patent: November 17, 2020
    Assignee: Apple Inc.
    Inventors: Christopher M. Tsay, Conrado Blasco, Deepankar Duggal, Richard F. Russo
  • Patent number: 10838729
    Abstract: A system and method for efficiently reducing the latency and power of memory access operations. A processor includes a stack pointer (SP) load-store dependence (LSD) predictor which predicts whether a memory dependence exists on a store instruction. The processor also includes a register file (RF) LSD predictor which predicts whether a memory dependence exists on a store instruction or a load instruction by a subsequent load instruction in program order. Each of the SP-LSD predictor and the RF-LSD predictor predicts and performs register renaming in a pipeline stage earlier than a renaming pipeline stage. The RF-LSD predictor also determines whether any intervening instructions between a producer memory instruction and a consumer memory instruction modify a predicted dependence.
    Type: Grant
    Filed: March 21, 2018
    Date of Patent: November 17, 2020
    Assignee: Apple Inc.
    Inventors: Muawya M. Al-Otoom, Conrado Blasco, Deepankar Duggal, Kulin N. Kothari, Richard F. Russo
  • Publication number: 20200272467
    Abstract: In an embodiment, a coprocessor includes multiple processing elements arranged in a grid of one or more rows and one or more columns. A given processing element includes an arithmetic/logic unit (ALU) circuit configured to perform an ALU operation specified by an instruction executable by the coprocessor, wherein the ALU circuit is configured to produce a result. The given processing element further comprises a first memory coupled to the execute circuit. The first memory is configured to store results generated by the given processing element. The first memory includes a portion of a result memory implemented by the coprocessor, wherein locations in the result memory are specifiable as destination operands of instructions executable by the coprocessor. The portion of the result memory implemented by the first memory is the portion of the result memory that the given processing element is capable of updating.
    Type: Application
    Filed: February 26, 2019
    Publication date: August 27, 2020
    Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Deepankar Duggal, Ran A. Chachick
  • Patent number: 10628164
    Abstract: A system and method for efficiently handling speculative execution. A load store unit (LSU) of a processor stores a commit candidate pointer, which points to a given store instruction buffered in the store queue. The given store instruction is an oldest store instruction not currently permitted to commit to the data cache. The LSU receives a first pointer from the mapping unit, which points to an oldest instruction of non-dispatched branches and unresolved system instructions. The LSU receives a second pointer from the execution unit, which points to an oldest unresolved, issued branch instruction. When the LSU determines the commit candidate pointer is older than each of the first pointer and the second pointer, the commit candidate pointer is updated to point to an oldest store instruction younger than the given store instruction stored in the store queue. The given store instruction is permitted to commit to the data cache.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: April 21, 2020
    Assignee: Apple Inc.
    Inventors: Kulin N. Kothari, Mridul Agarwal, Aditya Kesiraju, Deepankar Duggal, Sean M. Reynolds
  • Patent number: 9952863
    Abstract: Techniques are disclosed relating to capturing information related to instructions executing on in a processor. In one embodiment, an integrated circuit is disclosed that includes an execution pipeline configured to execute a sequence of instructions. The integrated circuit includes monitoring circuitry configured to monitor the execution pipeline for occurrences of an event associated with the sequence of instructions, and in response to detecting a particular number of occurrences of the event, capture a value of a program counter corresponding to an instruction of the sequence of instructions that is associated with an occurrence of the event. The monitoring circuitry stores the captured value of the program counter in a distinct capture register and signals an interrupt indicating that the captured value of the program counter is retrievable from the capture register. In some embodiments, a debugging application may retrieve the value and present it to a developer attempting perform code profiling.
    Type: Grant
    Filed: September 1, 2015
    Date of Patent: April 24, 2018
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Deepankar Duggal, Richard F. Russo