Patents by Inventor Rami Mohammad Al Sheikh

Rami Mohammad Al Sheikh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220137977
    Abstract: Predicting load-based control independent (CI), register data independent (DI) (CIRDI) instructions as CI memory data dependent (DD) (CIMDD) instructions for replay in speculative misprediction recovery in a processor. The processor predicts if a source of a load-based CIRDI instruction will be forwarded by a store-based instruction (i.e. “store-forwarded”). If a load-based CIRDI instruction is predicted as store-forwarded, the load-based CIRDI instruction is considered a CIMDD instruction and is replayed in misprediction recovery. If a load-based CIRDI instruction is not predicted as store-forwarded, the processor considers such load-based CIRDI instruction as a pending load-based CIRDI instruction. If this pending load-based CIRDI instruction is determined in execution to be store-forwarded, the instruction pipeline is flushed and the pending load-based CIRDI instruction is also replayed in misprediction recovery.
    Type: Application
    Filed: November 4, 2020
    Publication date: May 5, 2022
    Inventors: Vignyan Reddy KOTHINTI NARESH, Arthur PERAIS, Rami Mohammad AL SHEIKH, Shivam PRIYADARSHI
  • Publication number: 20220113976
    Abstract: Restoring speculative history used for making speculative predictions for instructions processed in a processor. The processor can be configured to speculatively predict an outcome of a condition or predicate of a conditional control instruction before its condition is fully evaluated in execution. Predictions are made by the processor based on a history that is updated based on outcomes of past predictions. If a conditional control instruction is mispredicted in execution, the processor can perform a misprediction recovery by stalling the instruction pipeline, flushing younger instructions in the instruction pipeline back to the mispredicted conditional control instruction, and then re-fetching instructions in the correct instruction flow path for execution.
    Type: Application
    Filed: October 12, 2020
    Publication date: April 14, 2022
    Inventors: Vignyan Reddy KOTHINTI NARESH, Rami Mohammad AL SHEIKH, Shivam PRIYADARSHI, Arthur PERAIS
  • Patent number: 11243772
    Abstract: Certain aspects of the present disclosure provide techniques for training load value predictors, comprising: determining if a prediction has been made by one or more of a plurality of load value predictors; determining a misprediction has been made by one or more load value predictors of the plurality of load value predictors; training each of the one or more load value predictors that made the misprediction; and resetting a confidence value associated with each of the one or more load value predictors that made the misprediction.
    Type: Grant
    Filed: May 16, 2019
    Date of Patent: February 8, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Rami Mohammad A. Al Sheikh, Derek Robert Hower
  • Publication number: 20210397453
    Abstract: Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-fetching is disclosed. An instruction processing circuit is configured to detect fetched performance degrading instructions (Pals) in a pre-execution stage in an instruction pipeline that may cause a precise interrupt that would cause flushing of the instruction pipeline. In response to detecting a PDI in an instruction pipeline, the instruction processing circuit is configured to capture the fetched PDI and/or its successor, younger fetched instructions that are processed in the instruction pipeline behind the PDI, in a pipeline refill circuit.
    Type: Application
    Filed: June 22, 2020
    Publication date: December 23, 2021
    Inventors: Rami Mohammad AL SHEIKH, Michael Scott MCILVAINE
  • Publication number: 20210390054
    Abstract: A cache management circuit that includes a predictive adjustment circuit configured to predictively generate cache control information based on a cache hit-miss indicator and the retention ranks of accessed cache lines to improve cache efficiency is disclosed. The predictive adjustment circuit stores the cache control information persistently, independent of whether the data remains in cache memory. The stored cache control information is indicative of prior cache access activity for data from a memory address, which is indicative of the data's “usefulness.” Based on the cache control information, the predictive adjustment circuit controls generation of retention ranks for data in the cache lines when the data is inserted, accessed, and evicted.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Rami Mohammad AL SHEIKH, Arthur PERAIS, Michael Scott MCILVAINE
  • Publication number: 20210389951
    Abstract: Opportunistic consumer instruction steering based on producer instruction value prediction in a multi-cluster processor is disclosed. A processor provides producer instructions and consumer instructions to a steering circuit that steers the program instructions to clusters of instruction execution circuits. An input value provided to a consumer instruction may be a produced value of a producer instruction, creating a dependency. The steering circuit steers a producer instruction to a first cluster and, in response to receiving the consumer instruction and the predicted value of the producer instruction, provides the predicted value to at least a second cluster and steers the consumer instruction to the second cluster for execution with the predicted value as the input value.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Arthur PERAIS, Shivam PRIYADARSHI, Yusuf Cagatay TEKMEN, Rami Mohammad AL SHEIKH, Vignyan Reddy KOTHINTI NARESH
  • Patent number: 11074077
    Abstract: Reusing executed, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-execution is disclosed. An instruction processing circuit detects fetched performance degrading instructions (PDIs) in an instruction pipeline that may cause a flushing of the instruction pipeline. In response to detecting a PDI, the instruction processing circuit is configured to store the PDI and/or its successor younger instructions in a pipeline execution refill circuit. In response to successful execution of such PDI and/or younger instructions, information about their input value(s) and produced output value(s) when executed are captured in the pipeline execution refill circuit.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: July 27, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Rami Mohammad Al Sheikh, Michael Scott McIlvaine
  • Patent number: 11068273
    Abstract: Swapping and restoring context-specific branch predictor states on context switches in a processor. A branch prediction circuit in an instruction processing circuit of a processor includes a private branch prediction memory configured to store branch prediction states for a context of a process being executed. The branch prediction states are accessed by the branch prediction circuit to predict outcomes of its branch instructions of the process. In certain aspects, when a context switch occurs in the processor, branch prediction states stored in a private branch prediction memory and associated with the current, to-be-swapped-out context, are swapped out of the private branch prediction memory to the shared branch prediction memory. Branch prediction states in the shared branch prediction memory previously stored (i.e., swapped out) and associated with to-be-swapped-in context for execution are restored in the private branch prediction memory to be used for branch prediction.
    Type: Grant
    Filed: September 3, 2019
    Date of Patent: July 20, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Rami Mohammad Al Sheikh, Michael Scott McIlvaine
  • Patent number: 11061824
    Abstract: Deferring cache state updates in a non-speculative cache memory in a processor-based system in response to a speculative data request until the speculative data request becomes non-speculative is disclosed. The updating of at least one cache state in the cache memory resulting from a data request is deferred until the data request becomes non-speculative. Thus, a cache state in the cache memory is not updated for requests resulting from mispredictions. Deferring the updating of a cache state in the cache memory can include deferring the storing of received speculative requested data in the main data array of the cache memory as a result of a cache miss until the data request becomes non-speculative. The received speculative requested data can first be stored in a speculative buffer memory associated with a cache memory, and then stored in the main data array if the data request becomes non-speculative.
    Type: Grant
    Filed: September 3, 2019
    Date of Patent: July 13, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Arthur Perais, Rami Mohammad Al Sheikh, Shivam Priyadarshi
  • Patent number: 11036512
    Abstract: A processor element in a processor-based system is configured to fetch one or more instructions associated with a program binary, where the one or more instructions include an instruction having an immediate operand. The processor element is configured to determine if the immediate operand is a reference to a wide immediate operand. In response to determining that the immediate operand is a reference to a wide immediate operand, the processor element is configured to retrieve the wide immediate operand from a common intermediate lookup table (CILT) in the program binary, where the immediate operand indexes the wide immediate operand in the CILT. The processor element is then configured to process the instruction having the immediate operand such that the immediate operand is replaced with the wide immediate operand from the CILT.
    Type: Grant
    Filed: September 23, 2019
    Date of Patent: June 15, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Arthur Perais, Rodney Wayne Smith, Shivam Priyadarshi, Rami Mohammad Al Sheikh, Vignyan Reddy Kothinti Naresh
  • Publication number: 20210089308
    Abstract: A processor element in a processor-based system is configured to fetch one or more instructions associated with a program binary, where the one or more instructions include an instruction having an immediate operand. The processor element is configured to determine if the immediate operand is a reference to a wide immediate operand. In response to determining that the immediate operand is a reference to a wide immediate operand, the processor element is configured to retrieve the wide immediate operand from a common intermediate lookup table (CILT) in the program binary, where the immediate operand indexes the wide immediate operand in the CILT. The processor element is then configured to process the instruction having the immediate operand such that the immediate operand is replaced with the wide immediate operand from the CILT.
    Type: Application
    Filed: September 23, 2019
    Publication date: March 25, 2021
    Inventors: Arthur PERAIS, Rodney Wayne SMITH, Shivam PRIYADARSHI, Rami Mohammad AL SHEIKH, Vignyan Reddy KOTHINTI NARESH
  • Publication number: 20210064541
    Abstract: Deferring cache state updates in a non-speculative cache memory in a processor-based system in response to a speculative data request until the speculative data request becomes non-speculative is disclosed. The updating of at least one cache state in the cache memory resulting from a data request is deferred until the data request becomes non-speculative. Thus, a cache state in the cache memory is not updated for requests resulting from mispredictions. Deferring the updating of a cache state in the cache memory can include deferring the storing of received speculative requested data in the main data array of the cache memory as a result of a cache miss until the data request becomes non-speculative. The received speculative requested data can first be stored in a speculative buffer memory associated with a cache memory, and then stored in the main data array if the data request becomes non-speculative.
    Type: Application
    Filed: September 3, 2019
    Publication date: March 4, 2021
    Inventors: Vignyan Reddy KOTHINTI NARESH, Arthur PERAIS, Rami Mohammad AL SHEIKH, Shivam PRIYADARSHI
  • Publication number: 20210064353
    Abstract: Instructions for execution by a processor in a processor-based system are generated such that multiple compiler generated instruction blocks are provided for a given functional block of source code. Control flow instructions are provided to control the flow of execution between the compiler generated instruction blocks based on runtime information about the processor-based system. By providing multiple compiler generated instruction blocks and the control flow instructions, the one of the compiler generated instruction blocks that will provide the best performance for the processor-based system at the time of execution can be executed. Accordingly, the performance of the processor-based system can be improved.
    Type: Application
    Filed: September 4, 2019
    Publication date: March 4, 2021
    Inventors: Rami Mohammad AL SHEIKH, Michael Scott MCILVAINE
  • Publication number: 20210064378
    Abstract: Swapping and restoring context-specific branch predictor states on context switches in a processor. A branch prediction circuit in an instruction processing circuit of a processor includes a private branch prediction memory configured to store branch prediction states for a context of a process being executed. The branch prediction states are accessed by the branch prediction circuit to predict outcomes of its branch instructions of the process. In certain aspects, when a context switch occurs in the processor, branch prediction states stored in a private branch prediction memory and associated with the current, to-be-swapped-out context, are swapped out of the private branch prediction memory to the shared branch prediction memory. Branch prediction states in the shared branch prediction memory previously stored (i.e., swapped out) and associated with to-be-swapped-in context for execution are restored in the private branch prediction memory to be used for branch prediction.
    Type: Application
    Filed: September 3, 2019
    Publication date: March 4, 2021
    Inventors: Rami Mohammad AL SHEIKH, Michael Scott MCILVAINE
  • Patent number: 10896041
    Abstract: Enabling early execution of move-immediate instructions having variable immediate value sizes in processor-based devices is disclosed. In one exemplary embodiment, a processor-based device provides a move-immediate logic circuit that detects a move-immediate instruction comprising an immediate value and a destination register. For frequently encountered immediate values, the move-immediate logic circuit allocates a physical register from an immediate physical register file (IPRF), and writes an IPRF tag corresponding to the allocated IPRF register into a most-recent mapping table (MRT) entry for the destination register. Subsequent move-immediate instructions embedding the same immediate value, as well as other dependent instructions, may then obtain the immediate value from the IPRF register by accessing the MRT entry.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: January 19, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shivam Priyadarshi, Arthur Perais, Vignyan Reddy Kothinti Naresh, Yusuf Cagatay Tekmen, Rami Mohammad Al Sheikh, Rodney Wayne Smith
  • Publication number: 20200364055
    Abstract: Certain aspects of the present disclosure provide techniques for training load value predictors, comprising: determining if a prediction has been made by one or more of a plurality of load value predictors; determining a misprediction has been made by one or more load value predictors of the plurality of load value predictors; training each of the one or more load value predictors that made the misprediction; and resetting a confidence value associated with each of the one or more load value predictors that made the misprediction.
    Type: Application
    Filed: May 16, 2019
    Publication date: November 19, 2020
    Inventors: Rami Mohammad A. AL SHEIKH, Derek Robert HOWER
  • Patent number: 10838731
    Abstract: Branch prediction methods and systems include, for a branch instruction fetched by a processor, indexing a branch identification (ID) table based on a function of a program counter (PC) value of the branch instruction, wherein each entry of the branch ID table comprises at least a tag field, and an accuracy counter. For a tag hit at an entry indexed by the PC value, if a value of the corresponding accuracy counter is greater than or equal to zero, a prediction counter from a prediction counter pool is selected based on a function of the PC value and a load-path history, wherein the prediction counters comprise respective confidence values and prediction values. A memory-dependent branch prediction of the branch instruction is assigned as the prediction value of the selected prediction counter if the associated confidence value is greater than zero, while branch prediction from a conventional branch predictor is overridden.
    Type: Grant
    Filed: September 19, 2018
    Date of Patent: November 17, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Rami Mohammad A. Al Sheikh, Michael Scott McIlvaine, Robert Douglas Clancy, Derek Hower
  • Publication number: 20200356372
    Abstract: Providing early instruction execution in a processor, and related apparatuses, methods, and computer-readable media are disclosed. In one aspect, an apparatus comprises an early execution engine communicatively coupled to a front-end instruction pipeline and a local register file. The apparatus may be configured to use value prediction wherein all input values of the instructions are actually available early in the pipeline (in the front-end), even before the value producers have executed, thus, providing an opportunity for early executing such instructions, and avoid sending these early executed instructions to the power hungry out-of-order engine that may improve performance as well as energy efficiency. In other aspects, the front-end of the pipeline is augmented with a component for early executing simple operations (e.g.
    Type: Application
    Filed: May 8, 2019
    Publication date: November 12, 2020
    Inventors: Rami Mohammad A. AL SHEIKH, Derek Robert HOWER
  • Publication number: 20200089504
    Abstract: Branch prediction methods and systems include, for a branch instruction fetched by a processor, indexing a branch identification (ID) table based on a function of a program counter (PC) value of the branch instruction, wherein each entry of the branch ID table comprises at least a tag field, and an accuracy counter. For a tag hit at an entry indexed by the PC value, if a value of the corresponding accuracy counter is greater than or equal to zero, a prediction counter from a prediction counter pool is selected based on a function of the PC value and a load-path history, wherein the prediction counters comprise respective confidence values and prediction values. A memory-dependent branch prediction of the branch instruction is assigned as the prediction value of the selected prediction counter if the associated confidence value is greater than zero, while branch prediction from a conventional branch predictor is overridden.
    Type: Application
    Filed: September 19, 2018
    Publication date: March 19, 2020
    Inventors: Rami Mohammad A. AL SHEIKH, Michael Scott MCILVAINE, Robert Douglas CLANCY, Derek HOWER
  • Patent number: 10474462
    Abstract: Systems and methods for operating a processor include determining confidence levels, such as high, low, and medium confidence levels, associated with in-flight branch instructions in an instruction pipeline of the processor, based on counters used for predicting directions of the in-flight branch instructions. Numbers of in-flight branch instructions associated with each of confidence levels are determined. A weighted sum of the numbers weighted with weights corresponding to the confidence levels is calculated and the weighted sum is compared with a threshold. A throttling signal may be asserted to indicate that instructions are to be throttled in a pipeline stage of the instruction pipeline based on the comparison.
    Type: Grant
    Filed: February 29, 2016
    Date of Patent: November 12, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Shivam Priyadarshi, Rami Mohammad Al Sheikh, Raguram Damodaran, Michael Scott McIlvaine, Jeffrey Todd Bridges