Patents by Inventor Deepankar Duggal
Deepankar Duggal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12288070
Abstract: An apparatus includes a processor core that includes an instruction decode circuit and a control circuit. The instruction decode circuit is configured to decode instructions, including a plurality of store instructions used to store information in a memory hierarchy. The control circuit is configured, after a particular store instruction is decoded, to preserve store information related to the particular store instruction, including a first program counter value for the particular store instruction. In response to decoding a subsequent load instruction with a corresponding second program counter value, the control circuit is configured to determine, using the first and second program counter values, whether a dependency has been established between the subsequent load instruction and the particular store instruction. In response to a determination that the dependency has been established, the control circuit is configured to use the preserved store information to perform the subsequent load instruction.
Type: Grant
Filed: September 9, 2022
Date of Patent: April 29, 2025
Assignee: Apple Inc.
Inventors: Muawya M. Al-Otoom, Conrado Blasco, Deepankar Duggal, Ethan R. Schuchman, Ian D. Kountanis, Kulin N. Kothari, Nikhil Gupta
-
Publication number: 20250094567
Abstract: In an embodiment, a processor includes hardware circuitry which may be used to authenticate instruction operands. The processor may execute instructions that perform operand authentication both speculatively and non-speculatively. During speculative execution of such instructions, the processor may execute authentication such that no differences in observable state of the processor, relative to authentication result, are detectable via a side channel. During speculative execution, a result of authentication may be deferred until speculative execution of the instruction, and additional instructions, may be completed. Upon resolution of a condition that indicates acceptance of the speculative execution, a speculative execution result may cause a processor exception and stalling of execution at the instruction to be performed.
Type: Application
Filed: November 15, 2023
Publication date: March 20, 2025
Applicant: Apple Inc.
Inventors: John D Pape, Deepankar Duggal, Christopher M Tsay, Andrew H Lin, Corey C Stappenbeck
-
Publication number: 20250036415
Abstract: A processor may include a conditional instruction prediction tracking circuit. During fetch of a conditional instruction from memory to an instruction cache of the processor, the conditional instruction prediction tracking circuit may predict whether the conditional instruction is biased. Responsive to a prediction that the conditional instruction is biased, the conditional instruction prediction tracking circuit may cause the conditional instruction to be executed according to the predicted bias. Sometimes the conditional prediction tracking circuit may cause the conditional instruction to be re-coded such that it may be executed as an unconditional instruction.
Type: Application
Filed: July 25, 2023
Publication date: January 30, 2025
Applicant: Apple Inc.
Inventors: Deepankar Duggal, Pruthivi Vuyyuru, Ian D. Kountanis
-
Patent number: 12175248
Abstract: Disclosed techniques relate to re-use of speculative results from an incorrect execution path. In some embodiments, when a control transfer instruction is mispredicted, a load instruction may have been executed on the wrong path. In disclosed embodiments, result storage circuitry records information that indicates destination registers of speculatively-executed load instructions including a first load instruction. Control flow tracker circuitry may store information indicating a reconvergence point for the control transfer instruction.
Type: Grant
Filed: April 21, 2023
Date of Patent: December 24, 2024
Assignee: Apple Inc.
Inventors: Yuan C. Chou, Deepankar Duggal, Debasish Chandra, Niket K Choudhary, Richard F. Russo
-
Patent number: 12159142
Abstract: Techniques are disclosed relating to predicting values for load operations. In some embodiments, front-end circuitry is configured to predict values of load operations based on multiple value tagged geometric length predictor (VTAGE) prediction tables (based on program counter information and branch history information). Training circuitry may adjust multiple VTAGE learning tables based on completed load operations. Control circuitry may pre-compute access information (e.g., an index) for a VTAGE learning table for a load based on branch history information that is available to the front-end circuitry but that is unavailable to the training circuitry, store the pre-computed access information, and provide the pre-computed access information from the first storage circuitry to the training circuitry to access the VTAGE learning table based on completion of the load. This may facilitate VTAGE training without pipelining the branch history information.
Type: Grant
Filed: May 2, 2023
Date of Patent: December 3, 2024
Assignee: Apple Inc.
Inventors: Yuan C. Chou, Chang Xu, Deepankar Duggal, Debasish Chandra
-
Publication number: 20240354109
Abstract: Disclosed techniques relate to re-use of speculative results from an incorrect execution path. In some embodiments, when a control transfer instruction is mispredicted, a load instruction may have been executed on the wrong path. In disclosed embodiments, result storage circuitry records information that indicates destination registers of speculatively-executed load instructions including a first load instruction. Control flow tracker circuitry may store information indicating a reconvergence point for the control transfer instruction.
Type: Application
Filed: April 21, 2023
Publication date: October 24, 2024
Inventors: Yuan C. Chou, Deepankar Duggal, Debasish Chandra, Niket K. Choudhary, Richard F. Russo
-
Publication number: 20240354111
Abstract: Disclosed techniques relate to re-use of speculative results from an incorrect execution path. In some embodiments, when a first control transfer instruction is mispredicted, a second control transfer instruction may have been executed on the wrong path because of the misprediction. Result storage circuitry may record information indicating a determined direction for the second control transfer instruction. Control flow tracker circuitry may store, for the first control transfer instruction, information indicating a reconvergence point. Re-use control circuitry may track registers written by instructions prior to the reconvergence point, determine, based on the tracked registers, that the second control transfer instruction does not depend on data from any instruction between the first control transfer instruction and the reconvergence point, and use the recorded determined direction for the second control transfer instruction, notwithstanding the misprediction of the first control transfer instruction.
Type: Application
Filed: April 21, 2023
Publication date: October 24, 2024
Inventors: Yuan C. Chou, Deepankar Duggal, Debasish Chandra, Niket K. Choudhary, Richard F. Russo
-
Publication number: 20240329990
Abstract: A system, e.g., a system on a chip (SOC), may include one or more processors. A processor may execute an instruction synchronization barrier (ISB) instruction to enforce an ordering constraint on instructions. To execute the ISB instruction, the processor may determine whether contexts of the processor required for execution of instructions older than the ISB instruction are consumed for the older instructions. Responsive to determining that the contexts are consumed for the older instructions, the processor may initiate fetching of an instruction younger than the ISB instruction, without waiting for the older instructions to retire.
Type: Application
Filed: June 11, 2024
Publication date: October 3, 2024
Applicant: Apple Inc.
Inventors: Deepankar Duggal, Kulin N Kothari, Mridul Agarwal, Chang Xu, Yanran Yang, Richard F Russo, Yuan C Chou, Douglas C Holman
-
Patent number: 12045615
Abstract: A system, e.g., a system on a chip (SOC), may include one or more processors. A processor may execute an instruction synchronization barrier (ISB) instruction to enforce an ordering constraint on instructions. To execute the ISB instruction, the processor may determine whether contexts of the processor required for execution of instructions older than the ISB instruction are consumed for the older instructions. Responsive to determining that the contexts are consumed for the older instructions, the processor may initiate fetching of an instruction younger than the ISB instruction, without waiting for the older instructions to retire.
Type: Grant
Filed: September 16, 2022
Date of Patent: July 23, 2024
Assignee: Apple Inc.
Inventors: Deepankar Duggal, Kulin N Kothari, Mridul Agarwal, Chang Xu, Yanran Yang, Richard F Russo, Yuan C Chou, Douglas C Holman
-
Patent number: 11416254
Abstract: Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.
Type: Grant
Filed: December 5, 2019
Date of Patent: August 16, 2022
Assignee: Apple Inc.
Inventors: Deepankar Duggal, Kulin N. Kothari, Conrado Blasco, Muawya M. Al-Otoom
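The renaming step described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `rename_decode_group`, the tuple layout of an operation, and the use of a matching (base register, offset) pair as the "same address" test are all hypothetical choices made for illustration. Access sizes and partial overlaps are ignored.

```python
def rename_decode_group(group, mapper, freelist):
    """Rename one decode group (hypothetical model, not the real hardware).

    group:    list of (kind, reg, base, offset) tuples in program order.
              For a "store", reg is the data source register.
              For a "load", reg is the destination register.
    mapper:   dict mapping architectural register -> physical register.
    freelist: list of available physical registers.
    Returns a list of (kind, physical_register, bypassed) tuples.
    """
    # Assumption: two accesses hit the same address when their base
    # register and offset match and the base was not rewritten in between.
    older_stores = {}  # (base, offset) -> physical register holding store data
    renamed = []
    for kind, reg, base, offset in group:
        key = (base, offset)
        if kind == "store":
            older_stores[key] = mapper[reg]
            renamed.append(("store", mapper[reg], False))
            continue
        if key in older_stores:
            # Zero cycle load bypass: reuse the store's source physical
            # register and skip the free list lookup entirely.
            phys = older_stores[key]
            bypassed = True
        else:
            # Ordinary load: allocate a fresh physical register.
            phys = freelist.pop(0)
            bypassed = False
        mapper[reg] = phys
        # The load overwrote reg, so stores addressed through it are stale.
        older_stores = {k: v for k, v in older_stores.items() if k[0] != reg}
        renamed.append(("load", phys, bypassed))
    return renamed
```

In this model a bypassed load leaves the free list untouched, which mirrors the abstract's statement that the free list lookup is prevented for a zero cycle load bypass operation.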
-
Patent number: 11200062
Abstract: Systems, apparatuses, and methods for implementing a physical register last reference scheme are described. A system includes a processor with a mapper, history file, and freelist. When an entry in the mapper is updated with a new architectural register-to-physical register mapping, the processor creates a new history file entry for the given instruction that caused the update. The processor also searches the mapper to determine if the old physical register that was previously stored in the mapper entry is referenced by any other mapper entries. If there are no other mapper entries that reference this old physical register, then a last reference indicator is stored in the new history file entry. When the given instruction retires, the processor checks the last reference indicator in the history file entry to determine whether the old physical register can be returned to the freelist of available physical registers.
Type: Grant
Filed: August 26, 2019
Date of Patent: December 14, 2021
Assignee: Apple Inc.
Inventors: Deepankar Duggal, Conrado Blasco, Muawya M. Al-Otoom, Richard F. Russo
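The last reference scheme in this abstract can be modeled in a few lines of software. The following is a minimal sketch, assuming a mapper held as a simple list and a caller that supplies the new physical register; the names `Renamer`, `HistoryEntry`, `update`, and `retire` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    arch_reg: int         # architectural register whose mapping changed
    old_phys: int         # physical register previously held by the mapper entry
    last_reference: bool  # True if no other mapper entry referenced old_phys


class Renamer:
    def __init__(self, num_arch, num_phys):
        # Initially architectural register i maps to physical register i.
        self.mapper = list(range(num_arch))
        self.freelist = list(range(num_arch, num_phys))
        self.history = []

    def update(self, arch_reg, new_phys):
        """Install a new mapping and record a history file entry."""
        old_phys = self.mapper[arch_reg]
        self.mapper[arch_reg] = new_phys
        # Search the mapper: is the old physical register still referenced?
        last_reference = old_phys not in self.mapper
        entry = HistoryEntry(arch_reg, old_phys, last_reference)
        self.history.append(entry)
        return entry

    def retire(self, entry):
        """At retirement, free the old register only on its last reference."""
        if entry.last_reference:
            self.freelist.append(entry.old_phys)
```

The scheme matters when several architectural registers can share one physical register, for example after a register-to-register move that is handled by renaming. In that case the old physical register may be freed only when the final mapping to it is replaced.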
-
Publication number: 20210173654
Abstract: Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.
Type: Application
Filed: December 5, 2019
Publication date: June 10, 2021
Inventors: Deepankar Duggal, Kulin N. Kothari, Conrado Blasco, Muawya M. Al-Otoom
-
Publication number: 20210064376
Abstract: Systems, apparatuses, and methods for implementing a physical register last reference scheme are described. A system includes a processor with a mapper, history file, and freelist. When an entry in the mapper is updated with a new architectural register-to-physical register mapping, the processor creates a new history file entry for the given instruction that caused the update. The processor also searches the mapper to determine if the old physical register that was previously stored in the mapper entry is referenced by any other mapper entries. If there are no other mapper entries that reference this old physical register, then a last reference indicator is stored in the new history file entry. When the given instruction retires, the processor checks the last reference indicator in the history file entry to determine whether the old physical register can be returned to the freelist of available physical registers.
Type: Application
Filed: August 26, 2019
Publication date: March 4, 2021
Inventors: Deepankar Duggal, Conrado Blasco, Muawya M. Al-Otoom, Richard F. Russo
-
Patent number: 10846091
Abstract: In an embodiment, a coprocessor includes multiple processing elements arranged in a grid of one or more rows and one or more columns. A given processing element includes an arithmetic/logic unit (ALU) circuit configured to perform an ALU operation specified by an instruction executable by the coprocessor, wherein the ALU circuit is configured to produce a result. The given processing element further comprises a first memory coupled to the execute circuit. The first memory is configured to store results generated by the given processing element. The first memory includes a portion of a result memory implemented by the coprocessor, wherein locations in the result memory are specifiable as destination operands of instructions executable by the coprocessor. The portion of the result memory implemented by the first memory is the portion of the result memory that the given processing element is capable of updating.
Type: Grant
Filed: February 26, 2019
Date of Patent: November 24, 2020
Assignee: Apple Inc.
Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Deepankar Duggal, Ran A. Chachick
-
Patent number: 10838723
Abstract: Techniques are disclosed relating to speculative writes to special-purpose registers (SPRs). In some embodiments, the disclosed techniques may reduce or avoid system instruction stalls while waiting for SPR writes, which may improve processor performance. In some embodiments, a processor includes a first storage element configured to store a non-speculative value of a special-purpose register and speculative storage circuitry configured to store one or more speculative values of the special-purpose register based on one or more speculatively-performed writes to the special-purpose register. In some embodiments, the processor includes control circuitry configured to: propagate the non-speculative value of the special-purpose register to control other circuitry and provide a youngest speculative value of the special-purpose register in the speculative storage circuitry as a speculative read of the special-purpose register.
Type: Grant
Filed: February 27, 2019
Date of Patent: November 17, 2020
Assignee: Apple Inc.
Inventors: Christopher M. Tsay, Conrado Blasco, Deepankar Duggal, Richard F. Russo
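The split between a non-speculative value and a queue of speculative values can be pictured with a short software model. This is a sketch only: the class name `SpeculativeSPR` and its methods are hypothetical, and the `retire_oldest` and `flush` operations are assumptions about how speculative values might be committed or discarded, since the abstract does not describe them.

```python
class SpeculativeSPR:
    """Toy model of one special-purpose register with speculative writes."""

    def __init__(self, value):
        self.non_speculative = value  # value that drives other circuitry
        self.speculative = []         # speculative writes, oldest first

    def write_speculative(self, value):
        # A speculatively performed write is buffered instead of stalling.
        self.speculative.append(value)

    def read_speculative(self):
        # A speculative read sees the youngest speculative value, if any.
        if self.speculative:
            return self.speculative[-1]
        return self.non_speculative

    def control_value(self):
        # Other circuitry is controlled by the non-speculative value only.
        return self.non_speculative

    def retire_oldest(self):
        # Assumption: the oldest speculative write becomes architectural.
        self.non_speculative = self.speculative.pop(0)

    def flush(self):
        # Assumption: a misprediction discards all speculative writes.
        self.speculative.clear()
```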
-
Patent number: 10838729
Abstract: A system and method for efficiently reducing the latency and power of memory access operations. A processor includes a stack pointer (SP) load-store dependence (LSD) predictor which predicts whether a memory dependence exists on a store instruction. The processor also includes a register file (RF) LSD predictor which predicts whether a memory dependence exists on a store instruction or a load instruction by a subsequent load instruction in program order. Each of the SP-LSD predictor and the RF-LSD predictor predicts and performs register renaming in a pipeline stage earlier than a renaming pipeline stage. The RF-LSD predictor also determines whether any intervening instructions between a producer memory instruction and a consumer memory instruction modify a predicted dependence.
Type: Grant
Filed: March 21, 2018
Date of Patent: November 17, 2020
Assignee: Apple Inc.
Inventors: Muawya M. Al-Otoom, Conrado Blasco, Deepankar Duggal, Kulin N. Kothari, Richard F. Russo
-
Publication number: 20200272467
Abstract: In an embodiment, a coprocessor includes multiple processing elements arranged in a grid of one or more rows and one or more columns. A given processing element includes an arithmetic/logic unit (ALU) circuit configured to perform an ALU operation specified by an instruction executable by the coprocessor, wherein the ALU circuit is configured to produce a result. The given processing element further comprises a first memory coupled to the execute circuit. The first memory is configured to store results generated by the given processing element. The first memory includes a portion of a result memory implemented by the coprocessor, wherein locations in the result memory are specifiable as destination operands of instructions executable by the coprocessor. The portion of the result memory implemented by the first memory is the portion of the result memory that the given processing element is capable of updating.
Type: Application
Filed: February 26, 2019
Publication date: August 27, 2020
Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Deepankar Duggal, Ran A. Chachick
-
Patent number: 10628164
Abstract: A system and method for efficiently handling speculative execution. A load store unit (LSU) of a processor stores a commit candidate pointer, which points to a given store instruction buffered in the store queue. The given store instruction is an oldest store instruction not currently permitted to commit to the data cache. The LSU receives a first pointer from the mapping unit, which points to an oldest instruction of non-dispatched branches and unresolved system instructions. The LSU receives a second pointer from the execution unit, which points to an oldest unresolved, issued branch instruction. When the LSU determines the commit candidate pointer is older than each of the first pointer and the second pointer, the commit candidate pointer is updated to point to an oldest store instruction younger than the given store instruction stored in the store queue. The given store instruction is permitted to commit to the data cache.
Type: Grant
Filed: July 30, 2018
Date of Patent: April 21, 2020
Assignee: Apple Inc.
Inventors: Kulin N. Kothari, Mridul Agarwal, Aditya Kesiraju, Deepankar Duggal, Sean M. Reynolds
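The pointer comparison in this abstract can be sketched as a small function. The model below assumes that instruction age is represented by a sequence number where a smaller number means an older instruction, and it repeats the comparison in a loop so that several stores can be released at once; both the age encoding and the loop are illustrative assumptions, as is the name `advance_commit_candidate`.

```python
def advance_commit_candidate(store_queue, candidate_idx, first_ptr, second_ptr):
    """Advance the commit candidate pointer through the store queue.

    store_queue:   sequence numbers of buffered stores in program order
                   (assumption: smaller number = older instruction).
    candidate_idx: index of the oldest store not yet permitted to commit.
    first_ptr:     sequence number of the oldest non-dispatched branch or
                   unresolved system instruction (from the mapping unit).
    second_ptr:    sequence number of the oldest unresolved, issued branch
                   (from the execution unit).
    Returns (new_candidate_idx, stores_now_permitted_to_commit).
    """
    permitted = []
    while candidate_idx < len(store_queue):
        seq = store_queue[candidate_idx]
        # The store may commit only if it is older than both pointers,
        # meaning no unresolved speculation precedes it.
        if seq < first_ptr and seq < second_ptr:
            permitted.append(seq)
            candidate_idx += 1  # point at the next younger store
        else:
            break
    return candidate_idx, permitted
```

For example, with stores at sequence numbers 3, 7, and 12, a first pointer of 10, and a second pointer of 9, the stores at 3 and 7 are released and the candidate pointer stops at the store numbered 12.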
-
Patent number: 9952863
Abstract: Techniques are disclosed relating to capturing information related to instructions executing in a processor. In one embodiment, an integrated circuit is disclosed that includes an execution pipeline configured to execute a sequence of instructions. The integrated circuit includes monitoring circuitry configured to monitor the execution pipeline for occurrences of an event associated with the sequence of instructions, and in response to detecting a particular number of occurrences of the event, capture a value of a program counter corresponding to an instruction of the sequence of instructions that is associated with an occurrence of the event. The monitoring circuitry stores the captured value of the program counter in a distinct capture register and signals an interrupt indicating that the captured value of the program counter is retrievable from the capture register. In some embodiments, a debugging application may retrieve the value and present it to a developer attempting to perform code profiling.
Type: Grant
Filed: September 1, 2015
Date of Patent: April 24, 2018
Assignee: Apple Inc.
Inventors: Conrado Blasco, Deepankar Duggal, Richard F. Russo
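The counting and capture behavior described here resembles event-based sampling, and a toy software model makes the sequence concrete. This is a sketch under assumptions: the class name `EventMonitor`, the `observe` method, and the choice to reset the counter after each capture are hypothetical and are not stated in the abstract.

```python
class EventMonitor:
    """Toy model of monitoring circuitry that samples a program counter."""

    def __init__(self, threshold):
        self.threshold = threshold      # number of events before a capture
        self.count = 0
        self.capture_register = None    # holds the captured PC value
        self.interrupt_pending = False  # signals that a value can be read

    def observe(self, pc, event_occurred):
        """Called once per instruction with its PC and an event flag."""
        if not event_occurred:
            return
        self.count += 1
        if self.count == self.threshold:
            # Capture the PC of the instruction tied to this occurrence
            # and raise an interrupt so software can retrieve it.
            self.capture_register = pc
            self.interrupt_pending = True
            self.count = 0  # assumption: counting restarts after a capture
```

A profiler could service the interrupt, read `capture_register`, and build a histogram of the captured addresses to find the code responsible for most occurrences of the event.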