Patents by Inventor Shivam Priyadarshi

Shivam Priyadarshi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11755330
    Abstract: Processors and methods related to tracking exact convergence to guide the recovery process in response to a mispredicted branch are provided. An example processor includes a pipeline having a frontend and a backend. The processor further includes a state table for maintaining information related to at least a subset of branches corresponding to instructions being processed by the processor. The processor further includes state logic configured to access the state table and track locations of any exact convergence points associated with branches corresponding to the instructions being processed by the processor. The state logic is further configured to identify a first recovery method for recovering from a misprediction associated with a branch if a location of an exact convergence point associated with the branch is determined to be in the frontend of the pipeline, else identify a second recovery method for recovering from the misprediction associated with the branch.
    Type: Grant
    Filed: September 13, 2022
    Date of Patent: September 12, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Shivam Priyadarshi
  • Patent number: 11698789
    Abstract: Restoring speculative history used for making speculative predictions for instructions processed in a processor. The processor can be configured to speculatively predict an outcome of a condition or predicate of a conditional control instruction before its condition is fully evaluated in execution. Predictions are made by the processor based on a history that is updated based on outcomes of past predictions. If a conditional control instruction is mispredicted in execution, the processor can perform a misprediction recovery by stalling the instruction pipeline, flushing younger instructions in the instruction pipeline back to the mispredicted conditional control instruction, and then re-fetching instructions in the correct instruction flow path for execution.
    Type: Grant
    Filed: October 12, 2020
    Date of Patent: July 11, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Rami Mohammad Al Sheikh, Shivam Priyadarshi, Arthur Perais
  • Patent number: 11669333
    Abstract: In certain aspects of the disclosure, an apparatus comprises a first scheduling pool associated with a first minimum scheduling latency and a second scheduling pool associated with a second minimum scheduling latency, the second minimum scheduling latency greater than the first minimum scheduling latency. A common instruction picker is coupled to both the first scheduling pool and the second scheduling pool. The common instruction picker may be configured to select a first instruction from the first scheduling pool and a second instruction from the second scheduling pool, and then choose either the first instruction or second instruction for dispatch according to a picking policy.
    Type: Grant
    Filed: April 26, 2018
    Date of Patent: June 6, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Rodney Wayne Smith, Raghavan Madhavan, Luke Yen, Shivam Priyadarshi, Yusuf Cagatay Tekmen
  • Publication number: 20230004397
    Abstract: Processors and methods related to tracking exact convergence to guide the recovery process in response to a mispredicted branch are provided. An example processor includes a pipeline having a frontend and a backend. The processor further includes a state table for maintaining information related to at least a subset of branches corresponding to instructions being processed by the processor. The processor further includes state logic configured to access the state table and track locations of any exact convergence points associated with branches corresponding to the instructions being processed by the processor. The state logic is further configured to identify a first recovery method for recovering from a misprediction associated with a branch if a location of an exact convergence point associated with the branch is determined to be in the frontend of the pipeline, else identify a second recovery method for recovering from the misprediction associated with the branch.
    Type: Application
    Filed: September 13, 2022
    Publication date: January 5, 2023
    Inventors: Vignyan Reddy KOTHINTI NARESH, Shivam PRIYADARSHI
  • Publication number: 20220398100
    Abstract: Processors employing memory bypassing in memory data dependent instructions as a store data forwarding mechanism, and related methods. To reduce stalls of memory data dependent, load-based instructions, a memory data dependency detection circuit is configured to detect a memory hazard between a store-based instruction and a load-based instruction based on their opcodes and designation/source operands. Some store-based and load-based instructions have opcodes identifying these instructions as having respective store and load address operand types that can be compared without resolution of their respective store and load addresses. For these detected types of instructions, the memory data dependency detection circuit is configured to determine if a source operand of a load-based instruction matches a target operand of a store-based instruction to detect a memory hazard earlier in the instruction pipeline.
    Type: Application
    Filed: June 9, 2021
    Publication date: December 15, 2022
    Inventors: Yusuf Cagatay Tekmen, Rodney Wayne Smith, Shivam Priyadarshi, Milind A. Choudhary, Kiran Ravi Seth
  • Publication number: 20220374241
    Abstract: Processors and methods related to tracking exact convergence to guide the recovery process in response to a mispredicted branch are provided. An example processor includes a pipeline having a frontend and a backend. The processor further includes a state table for maintaining information related to at least a subset of branches corresponding to instructions being processed by the processor. The processor further includes state logic configured to access the state table and track locations of any exact convergence points associated with branches corresponding to the instructions being processed by the processor. The state logic is further configured to identify a first recovery method for recovering from a misprediction associated with a branch if a location of an exact convergence point associated with the branch is determined to be in the frontend of the pipeline, else identify a second recovery method for recovering from the misprediction associated with the branch.
    Type: Application
    Filed: May 18, 2021
    Publication date: November 24, 2022
    Inventors: Vignyan Reddy KOTHINTI NARESH, Shivam PRIYADARSHI
  • Patent number: 11494191
    Abstract: Processors and methods related to tracking exact convergence to guide the recovery process in response to a mispredicted branch are provided. An example processor includes a pipeline having a frontend and a backend. The processor further includes a state table for maintaining information related to at least a subset of branches corresponding to instructions being processed by the processor. The processor further includes state logic configured to access the state table and track locations of any exact convergence points associated with branches corresponding to the instructions being processed by the processor. The state logic is further configured to identify a first recovery method for recovering from a misprediction associated with a branch if a location of an exact convergence point associated with the branch is determined to be in the frontend of the pipeline, else identify a second recovery method for recovering from the misprediction associated with the branch.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: November 8, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Shivam Priyadarshi
  • Patent number: 11392410
    Abstract: Operand pool instruction reservation clustering in a scheduler circuit in a processor is disclosed. The scheduler circuit includes a plurality of operand pool reservation circuits each having an assigned number of source operands for an instruction stored that must be ready before the instruction is issued. Instructions having the same number of source operands that are not yet ready for its issuance can be stored in an operand pool reservation circuit having the same assigned number of source operands. In this manner, the number of reservation entries and associated comparator circuits in the clustered scheduler circuit is distributed among the plurality of operand pool reservation circuits to avoid or reduce an increase in the number of scheduling path connections and complexity in each reservation circuit. This can avoid or reduce an increase in scheduling latency for a given number of reservation entries in the clustered scheduler circuit.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: July 19, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shivam Priyadarshi, Yusuf Cagatay Tekmen, Rodney Wayne Smith, Vignyan Reddy Kothinti Naresh
  • Patent number: 11392387
    Abstract: Predicting load-based control independent (CI), register data independent (DI) (CIRDI) instructions as CI memory data dependent (DD) (CIMDD) instructions for replay in speculative misprediction recovery in a processor. The processor predicts if a source of a load-based CIRDI instruction will be forwarded by a store-based instruction (i.e. “store-forwarded”). If a load-based CIRDI instruction is predicted as store-forwarded, the load-based CIRDI instruction is considered a CIMDD instruction and is replayed in misprediction recovery. If a load-based CIRDI instruction is not predicted as store-forwarded, the processor considers such load-based CIRDI instruction as a pending load-based CIRDI instruction. If this pending load-based CIRDI instruction is determined in execution to be store-forwarded, the instruction pipeline is flushed and the pending load-based CIRDI instruction is also replayed in misprediction recovery.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: July 19, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Arthur Perais, Rami Mohammad Al Sheikh, Shivam Priyadarshi
  • Patent number: 11327763
    Abstract: Opportunistic consumer instruction steering based on producer instruction value prediction in a multi-cluster processor is disclosed. A processor provides producer instructions and consumer instructions to a steering circuit that steers the program instructions to clusters of instruction execution circuits. An input value provided to a consumer instruction may be a produced value of a producer instruction, creating a dependency. The steering circuit steers a producer instruction to a first cluster and, in response to receiving the consumer instruction and the predicted value of the producer instruction, provides the predicted value to at least a second cluster and steers the consumer instruction to the second cluster for execution with the predicted value as the input value.
    Type: Grant
    Filed: June 11, 2020
    Date of Patent: May 10, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Arthur Perais, Shivam Priyadarshi, Yusuf Cagatay Tekmen, Rami Mohammad Al Sheikh, Vignyan Reddy Kothinti Naresh
  • Publication number: 20220137977
    Abstract: Predicting load-based control independent (CI), register data independent (DI) (CIRDI) instructions as CI memory data dependent (DD) (CIMDD) instructions for replay in speculative misprediction recovery in a processor. The processor predicts if a source of a load-based CIRDI instruction will be forwarded by a store-based instruction (i.e. “store-forwarded”). If a load-based CIRDI instruction is predicted as store-forwarded, the load-based CIRDI instruction is considered a CIMDD instruction and is replayed in misprediction recovery. If a load-based CIRDI instruction is not predicted as store-forwarded, the processor considers such load-based CIRDI instruction as a pending load-based CIRDI instruction. If this pending load-based CIRDI instruction is determined in execution to be store-forwarded, the instruction pipeline is flushed and the pending load-based CIRDI instruction is also replayed in misprediction recovery.
    Type: Application
    Filed: November 4, 2020
    Publication date: May 5, 2022
    Inventors: Vignyan Reddy KOTHINTI NARESH, Arthur PERAIS, Rami Mohammad AL SHEIKH, Shivam PRIYADARSHI
  • Publication number: 20220113976
    Abstract: Restoring speculative history used for making speculative predictions for instructions processed in a processor. The processor can be configured to speculatively predict an outcome of a condition or predicate of a conditional control instruction before its condition is fully evaluated in execution. Predictions are made by the processor based on a history that is updated based on outcomes of past predictions. If a conditional control instruction is mispredicted in execution, the processor can perform a misprediction recovery by stalling the instruction pipeline, flushing younger instructions in the instruction pipeline back to the mispredicted conditional control instruction, and then re-fetching instructions in the correct instruction flow path for execution.
    Type: Application
    Filed: October 12, 2020
    Publication date: April 14, 2022
    Inventors: Vignyan Reddy KOTHINTI NARESH, Rami Mohammad AL SHEIKH, Shivam PRIYADARSHI, Arthur PERAIS
  • Publication number: 20210389951
    Abstract: Opportunistic consumer instruction steering based on producer instruction value prediction in a multi-cluster processor is disclosed. A processor provides producer instructions and consumer instructions to a steering circuit that steers the program instructions to clusters of instruction execution circuits. An input value provided to a consumer instruction may be a produced value of a producer instruction, creating a dependency. The steering circuit steers a producer instruction to a first cluster and, in response to receiving the consumer instruction and the predicted value of the producer instruction, provides the predicted value to at least a second cluster and steers the consumer instruction to the second cluster for execution with the predicted value as the input value.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Arthur PERAIS, Shivam PRIYADARSHI, Yusuf Cagatay TEKMEN, Rami Mohammad AL SHEIKH, Vignyan Reddy KOTHINTI NARESH
  • Publication number: 20210318905
    Abstract: Operand pool instruction reservation clustering in a scheduler circuit in a processor is disclosed. The scheduler circuit includes a plurality of operand pool reservation circuits each having an assigned number of source operands for an instruction stored that must be ready before the instruction is issued. Instructions having the same number of source operands that are not yet ready for its issuance can be stored in an operand pool reservation circuit having the same assigned number of source operands. In this manner, the number of reservation entries and associated comparator circuits in the clustered scheduler circuit is distributed among the plurality of operand pool reservation circuits to avoid or reduce an increase in the number of scheduling path connections and complexity in each reservation circuit. This can avoid or reduce an increase in scheduling latency for a given number of reservation entries in the clustered scheduler circuit.
    Type: Application
    Filed: April 8, 2020
    Publication date: October 14, 2021
    Inventors: Shivam PRIYADARSHI, Yusuf Cagatay TEKMEN, Rodney Wayne SMITH, Vignyan Reddy KOTHINTI NARESH
  • Patent number: 11113068
    Abstract: Performing flush recovery using parallel walks of sliced reorder buffers (SROBs) is disclosed herein. In one exemplary embodiment, a register mapping circuit provides a rename mapping table (RMT) comprising RMT entries representing logical register number (LRN) to physical register number (PRN) mappings. The register mapping circuit also provides an SROB comprising multiple SROB slices that each corresponds to a respective LRN. Each SROB slice tracks uncommitted instructions that write to the LRN corresponding to that SROB slice, and maintains those instructions in program order with respect to each other. Upon detecting an uncommitted instruction writing to an LRN, the register mapping circuit allocates an SROB slice entry in the SROB slice corresponding to the LRN. When an pipeline flush from a target instruction occurs, the register mapping circuit restores RMT entries of the RMT to their prior mapping states based on parallel walks of the SROB slices of the SROB.
    Type: Grant
    Filed: August 6, 2020
    Date of Patent: September 7, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yusuf Cagatay Tekmen, Rodney Wayne Smith, Kiran Ravi Seth, Shivam Priyadarshi
  • Patent number: 11061683
    Abstract: Limiting replay of load-based control independent (CI) instructions in speculative misprediction recovery in a processor. In misprediction recovery, load-based CI instructions are designated as load-based CI, data dependent (CIDD) instructions if a load-based CI instruction consumed forwarded-stored data of a store-based instruction. During the misprediction recovery, replayed load-based CIDD instructions will reevaluate an accurate source of memory load the correct data instead of consuming potentially faulty data that may have been forwarded by a store-based instruction that may have only existed in a mispredicted instruction control flow path. Limiting the replay of load-based CI instructions to only determined CIDD load-based instructions can reduce execution delay and power consumption in an instruction pipeline.
    Type: Grant
    Filed: June 13, 2019
    Date of Patent: July 13, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Shivam Priyadarshi
  • Patent number: 11061677
    Abstract: A register mapping circuit for recovering a register mapping state associated with a flushed instruction by traversing ROB entries from a snapshot of another register mapping state. The register mapping circuit includes a ROB control circuit, a snapshot circuit, and a register rename recovery circuit (RRRC). The ROB control circuit allocates ROB entries to instructions entering a processor pipeline, including a target ROB entry allocated to a target instruction and other ROB entries allocated to other instructions. The snapshot circuit captures snapshots of logical register-to-physical register mapping states in the rename map table in association with a subset of instructions that could be flushed. If the target instruction is flushed, the RRRC restores the rename map table register mapping state corresponding to the target instruction based on a snapshot in a ROB entry allocated to another instruction, and traverses register mapping updates in the intervening ROB entries.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: July 13, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kiran Ravi Seth, Yusuf Cagatay Tekmen, Rodney Wayne Smith, Shivam Priyadarshi, Vignyan Reddy Kothinti Naresh
  • Patent number: 11061824
    Abstract: Deferring cache state updates in a non-speculative cache memory in a processor-based system in response to a speculative data request until the speculative data request becomes non-speculative is disclosed. The updating of at least one cache state in the cache memory resulting from a data request is deferred until the data request becomes non-speculative. Thus, a cache state in the cache memory is not updated for requests resulting from mispredictions. Deferring the updating of a cache state in the cache memory can include deferring the storing of received speculative requested data in the main data array of the cache memory as a result of a cache miss until the data request becomes non-speculative. The received speculative requested data can first be stored in a speculative buffer memory associated with a cache memory, and then stored in the main data array if the data request becomes non-speculative.
    Type: Grant
    Filed: September 3, 2019
    Date of Patent: July 13, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Arthur Perais, Rami Mohammad Al Sheikh, Shivam Priyadarshi
  • Patent number: 11036512
    Abstract: A processor element in a processor-based system is configured to fetch one or more instructions associated with a program binary, where the one or more instructions include an instruction having an immediate operand. The processor element is configured to determine if the immediate operand is a reference to a wide immediate operand. In response to determining that the immediate operand is a reference to a wide immediate operand, the processor element is configured to retrieve the wide immediate operand from a common intermediate lookup table (CILT) in the program binary, where the immediate operand indexes the wide immediate operand in the CILT. The processor element is then configured to process the instruction having the immediate operand such that the immediate operand is replaced with the wide immediate operand from the CILT.
    Type: Grant
    Filed: September 23, 2019
    Date of Patent: June 15, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Arthur Perais, Rodney Wayne Smith, Shivam Priyadarshi, Rami Mohammad Al Sheikh, Vignyan Reddy Kothinti Naresh
  • Patent number: 11023243
    Abstract: Latency-based instruction reservation clustering in a scheduler circuit in a processor is disclosed. The scheduler circuit includes a plurality of latency-based reservation circuits each having an assigned producer instruction cycle latency. Producer instructions with the same cycle latency can be clustered in the same latency-based reservation circuit. Thus, the number of reservation entries is distributed among the plurality of latency-based reservation circuits to avoid or reduce an increase in the number of scheduling path connections and complexity in each reservation circuit to avoid or reduce an increase in scheduling latency. The scheduling path connections are reduced for a given number of reservation entries over a non-clustered pick circuit, because signals (e.g., wake-up signals, pick-up signals) used for scheduling instructions in each latency-based reservation circuit do not have to have the same clock cycle latency so as to not impact performance.
    Type: Grant
    Filed: July 22, 2019
    Date of Patent: June 1, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yusuf Cagatay Tekmen, Shivam Priyadarshi, Rodney Wayne Smith