Patents by Inventor Rami Mohammad Al Sheikh

Rami Mohammad Al Sheikh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10379863
    Abstract: Systems and methods for constructing an instruction slice for prefetching data of a data-dependent load instruction include a slicer for identifying a load instruction in an instruction sequence as a first occurrence of a qualified load instruction which will miss in a last-level cache. A commit buffer stores information pertaining to the first occurrence of the qualified load instruction and shadow instructions which follow. For a second occurrence of the qualified load instruction, an instruction slice is constructed from the information in the commit buffer to form a slice payload. A pre-execution engine pre-executes the instruction slice based on the slice payload to determine an address from which data is to be fetched for execution of a third and any subsequent occurrences of the qualified load instruction. The data is prefetched from the determined address for the third and any subsequent occurrence of the qualified load instruction.
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: August 13, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Shivam Priyadarshi, Rami Mohammad A. Al Sheikh, Brandon Dwiel, Derek Hower
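The slice-construction scheme above can be sketched as a toy software model (this is an illustration of the idea, not the patented hardware; the `SlicePrefetcher` class, its method names, and the dict-based memory are all assumptions for the sketch):

```python
class SlicePrefetcher:
    """Toy model of slice-based prefetching for a data-dependent load:
    the first occurrence of the delinquent load records its
    address-generating slice (the commit-buffer analogue); from the
    second occurrence on, the slice is pre-executed on the loaded value
    to compute and prefetch the address of the next occurrence."""

    def __init__(self):
        self.slice_ops = None     # recorded slice operations
        self.occurrences = 0
        self.prefetched = set()   # addresses already prefetched

    def on_load(self, addr, slice_ops, memory):
        hit = addr in self.prefetched          # data already prefetched?
        self.occurrences += 1
        if self.occurrences == 1:
            self.slice_ops = list(slice_ops)   # first occurrence: record slice
        else:
            value = memory[addr]               # pre-execute the slice
            for op in self.slice_ops:
                value = op(value)
            self.prefetched.add(value)         # prefetch the computed address
        return hit
```

For a pointer chase where each load's value is the next address (an empty slice), the third occurrence already finds its data prefetched.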
  • Patent number: 10353819
    Abstract: Next line prefetchers employing initial high prefetch prediction confidence states for throttling next line prefetches in processor-based systems are disclosed. A next line prefetcher prefetches a next memory line into cache memory in response to a read operation. To mitigate prefetch mispredictions, the next line prefetcher is throttled to cease prefetching after the prefetch prediction confidence state becomes a no-next-line-prefetch state indicating a number of incorrect predictions. Instead of the initial prefetch prediction confidence state being set to the no-next-line-prefetch state, which would have to be built up by correct predictions before a next line prefetch is performed, the initial prefetch prediction confidence state is set to a next-line-prefetch state to allow next line prefetching. Thus, the next line prefetcher starts prefetching next lines without first requiring correct predictions to be "built up" in the prefetch prediction confidence state.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: July 16, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Brandon Dwiel, Rami Mohammad Al Sheikh
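A minimal sketch of the throttling idea above (a toy Python model; the counter width and feedback interface are illustrative assumptions, not from the patent):

```python
class NextLinePrefetcher:
    """Toy throttled next-line prefetcher. Per the abstract's key idea,
    the prefetch prediction confidence counter starts in a state that
    allows prefetching (here: its maximum), rather than starting at zero
    and having to be built up by correct predictions first."""

    MAX_CONFIDENCE = 3

    def __init__(self):
        self.confidence = self.MAX_CONFIDENCE  # initial state permits prefetch

    def on_read(self, line):
        if self.confidence > 0:
            return line + 1   # prefetch the next memory line
        return None           # throttled: no-next-line-prefetch state

    def feedback(self, prediction_correct):
        if prediction_correct:
            self.confidence = min(self.MAX_CONFIDENCE, self.confidence + 1)
        else:
            self.confidence = max(0, self.confidence - 1)
```

The prefetcher is active from the first read, throttles off after a run of mispredictions, and recovers on correct ones.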
  • Patent number: 10303608
    Abstract: A first load instruction specifying a first virtual address misses in a data cache. A delta value is received based on a program counter value of the first load instruction. A second virtual address is computed based on the delta value and the first virtual address. Data associated with the second virtual address is then prefetched from a main memory to the data cache prior to a second load instruction specifying the second virtual address missing in the data cache.
    Type: Grant
    Filed: August 22, 2017
    Date of Patent: May 28, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Rami Mohammad Al Sheikh, Shivam Priyadarshi, Brandon Dwiel, David John Palframan, Derek Hower, Muntaquim Faruk Chowdhury
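The delta-based prefetch described above can be sketched as a toy model (the training rule and table layout here are illustrative assumptions):

```python
class DeltaPrefetcher:
    """Toy PC-indexed delta prefetcher: on a data-cache miss, the delta
    learned for the missing load's program counter is added to the miss
    virtual address, and the resulting address is prefetched before the
    corresponding load misses."""

    def __init__(self):
        self.deltas = {}      # pc -> learned address delta
        self.last_miss = {}   # pc -> previous miss virtual address

    def on_miss(self, pc, vaddr):
        prefetch_addr = None
        if pc in self.deltas:
            prefetch_addr = vaddr + self.deltas[pc]        # predict next miss
        if pc in self.last_miss:
            self.deltas[pc] = vaddr - self.last_miss[pc]   # train the delta
        self.last_miss[pc] = vaddr
        return prefetch_addr
```

After two misses at a fixed stride, the third miss address is predicted.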
  • Publication number: 20190155608
    Abstract: Aspects of the present disclosure include a method, a device, and a computer-readable medium for restarting an instruction pipeline of a processor that includes a decoupled fetcher. A method comprises detecting, in a processor, a re-fetch event, wherein the processor includes an instruction unit (IU) configured to fetch instructions from a decoupled fetcher (DCF), and simultaneously flushing the IU and the DCF in response to detecting the re-fetch event.
    Type: Application
    Filed: November 16, 2018
    Publication date: May 23, 2019
    Inventors: Arthur PERAIS, Michael Scott MCILVAINE, Rami Mohammad A. AL SHEIKH, Robert Douglas CLANCY, Luke YEN, Rodney Wayne SMITH
  • Patent number: 10255074
    Abstract: Selective flushing of instructions in an instruction pipeline in a processor back to an execution-determined target address in response to a precise interrupt is disclosed. A selective instruction pipeline flush controller determines if a precise interrupt has occurred for an executed instruction in the instruction pipeline. The selective instruction pipeline flush controller determines if an instruction at the correct resolved target address of the instruction that caused the precise interrupt is contained in the instruction pipeline. If so, the selective instruction pipeline flush controller can selectively flush instructions back to the instruction in the pipeline that contains the correct resolved target address to reduce the amount of new instruction fetching.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: April 9, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Vignyan Reddy Kothinti Naresh, Rami Mohammad Al Sheikh, Harold Wade Cain, III
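The selective-flush decision above can be sketched as follows (a toy model; representing the pipeline as a list of program counters and keeping everything from the target onward are simplifying assumptions):

```python
def selective_flush(younger_insts, resolved_target):
    """Toy selective-flush decision: younger_insts lists (oldest first)
    the program counters of instructions in the pipeline after the one
    that caused the precise interrupt. If the correct resolved target is
    already in the pipeline, only the instructions ahead of it are
    flushed and the rest are kept, reducing new instruction fetching;
    otherwise everything younger is flushed."""
    for i, pc in enumerate(younger_insts):
        if pc == resolved_target:
            return younger_insts[i:]   # keep from the target onward
    return []                          # full flush; refetch from target
```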
  • Publication number: 20190095621
    Abstract: Aspects of the present disclosure are directed to detecting and responding to injected faults. In some examples, fault injections are detected in a pipeline processor using transactional memory by comparing a predicted value (e.g. from a Value Predictor) against a subsequently loaded or computed reference value, and then detecting the fault based on the result of the comparison. If the predicted value is found to differ from the subsequently loaded or calculated value, the difference is deemed to be due to a fault and actions are taken to address the fault, such as by using deception or blinding of observable values. In some examples, the Value Predictor is modified to perform the comparison to detect the fault. The Value Predictor then notifies Transactional Hardware, which responds to the fault. In other examples described herein, the Value Predictor is unchanged and the Transactional Hardware detects and corrects the fault.
    Type: Application
    Filed: September 27, 2017
    Publication date: March 28, 2019
    Inventors: Rosario CAMMAROTA, Rami Mohammad A. AL SHEIKH, Wenjia RUAN
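The comparison-based fault detection above can be sketched as a toy model (the class, the zero-value blinding, and the fault log are illustrative assumptions, not the disclosed hardware):

```python
class FaultDetectingLoader:
    """Toy model of prediction-based fault detection: a value predictor
    supplies an expected value for an address; the loaded reference
    value is compared against it, a mismatch is treated as an injected
    fault, and the observable result is blinded (returned as zero here)
    instead of exposing the faulty value."""

    def __init__(self, predictions):
        self.predictions = predictions   # addr -> high-confidence prediction
        self.faults = []

    def load(self, addr, memory):
        value = memory[addr]
        predicted = self.predictions.get(addr)
        if predicted is not None and predicted != value:
            self.faults.append(addr)     # fault detected via the comparison
            return 0                     # blind the observable value
        return value
```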
  • Publication number: 20190087192
    Abstract: Systems and methods for constructing an instruction slice for prefetching data of a data-dependent load instruction include a slicer for identifying a load instruction in an instruction sequence as a first occurrence of a qualified load instruction which will miss in a last-level cache. A commit buffer stores information pertaining to the first occurrence of the qualified load instruction and shadow instructions which follow. For a second occurrence of the qualified load instruction, an instruction slice is constructed from the information in the commit buffer to form a slice payload. A pre-execution engine pre-executes the instruction slice based on the slice payload to determine an address from which data is to be fetched for execution of a third and any subsequent occurrences of the qualified load instruction. The data is prefetched from the determined address for the third and any subsequent occurrence of the qualified load instruction.
    Type: Application
    Filed: September 21, 2017
    Publication date: March 21, 2019
    Inventors: Shivam PRIYADARSHI, Rami Mohammad A. AL SHEIKH, Brandon DWIEL, Derek HOWER
  • Patent number: 10223278
    Abstract: Systems and methods are directed to selectively bypassing allocation of cache lines in a cache. A bypass predictor table is provided with reuse counters to track reuse characteristics of cache lines, based on memory regions to which the cache lines belong in memory. A contender reuse counter provides an indication of a likelihood of reuse of a contender cache line in the cache pursuant to a miss in the cache for the contender cache line, and a victim reuse counter provides an indication of a likelihood of reuse for a victim cache line that will be evicted if the contender cache line is allocated in the cache. A decision whether to allocate the contender cache line in the cache or bypass allocation of the contender cache line in the cache is based on the contender reuse counter value and the victim reuse counter value.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: March 5, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Shivam Priyadarshi, Brandon Harley Anthony Dwiel, Rami Mohammad A. Al Sheikh, Harold Wade Cain, III
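The contender-versus-victim decision above can be sketched in a few lines (a toy model; the 4 KiB region size and unbounded counters are illustrative assumptions):

```python
class BypassPredictor:
    """Toy region-based bypass predictor: reuse counters are kept per
    memory region; on a miss, the contender line's region counter is
    compared with the would-be victim's, and allocation is bypassed
    when the victim looks more likely to be reused."""

    REGION_SHIFT = 12   # assume 4 KiB regions (illustrative)

    def __init__(self):
        self.reuse = {}   # region -> reuse counter

    def on_hit(self, addr):
        region = addr >> self.REGION_SHIFT
        self.reuse[region] = self.reuse.get(region, 0) + 1

    def should_allocate(self, contender_addr, victim_addr):
        contender = self.reuse.get(contender_addr >> self.REGION_SHIFT, 0)
        victim = self.reuse.get(victim_addr >> self.REGION_SHIFT, 0)
        return contender >= victim   # otherwise bypass the allocation
```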
  • Publication number: 20190065964
    Abstract: A method and apparatus for predicting instruction load values in a processor. While a program is executing, the processor trains predictors in order to predict load values. In particular, four different kinds of predictors are trained: the Last Value Predictor (LVP), which captures loads that encounter very few distinct values; the Stride Address Predictor (SAP), which captures loads whose addresses follow a stride (offset) pattern; the Content Address Predictor (CAP), which captures non-stride load addresses; and the Context Value Predictor (CVP), which captures non-stride load values in a particular context. Training methods and the use of such predictors are disclosed.
    Type: Application
    Filed: August 30, 2017
    Publication date: February 28, 2019
    Inventors: Rami Mohammad A. AL SHEIKH, Derek HOWER
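Two of the four predictors named above can be sketched directly (toy per-PC tables; the CAP and CVP are omitted, and these training rules are illustrative assumptions):

```python
class LastValuePredictor:
    """Toy LVP: predicts that a load returns the value it returned last
    time (captures loads that encounter very few distinct values)."""
    def __init__(self):
        self.last_value = {}
    def predict(self, pc):
        return self.last_value.get(pc)
    def train(self, pc, value):
        self.last_value[pc] = value

class StrideAddressPredictor:
    """Toy SAP: predicts the next load address as the last observed
    address plus the learned stride for that load."""
    def __init__(self):
        self.last_addr = {}
        self.stride = {}
    def predict(self, pc):
        if pc in self.stride:
            return self.last_addr[pc] + self.stride[pc]
        return None
    def train(self, pc, addr):
        if pc in self.last_addr:
            self.stride[pc] = addr - self.last_addr[pc]
        self.last_addr[pc] = addr
```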
  • Publication number: 20190065384
    Abstract: A request to access data at a first physical address misses in a private cache of a processor. A confidence value is received for the first physical address based on a hash value of the first physical address. A determination is made that the received confidence value exceeds a threshold value. In response, a speculative read request specifying the first physical address is issued to a memory controller of a main memory to expedite a miss for the data at the first physical address in a shared cache.
    Type: Application
    Filed: August 22, 2017
    Publication date: February 28, 2019
    Inventors: Rami Mohammad AL SHEIKH, Shivam PRIYADARSHI, Brandon DWIEL, David John PALFRAMAN, Derek HOWER
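The confidence-gated speculative read above can be sketched as a toy model (the table size, counter range, threshold, and use of Python's built-in `hash` are all illustrative assumptions):

```python
class SpeculativeReadIssuer:
    """Toy model: a small table indexed by a hash of the physical
    address holds a confidence that a private-cache miss will also miss
    in the shared cache; when the confidence exceeds a threshold, a
    speculative read is issued to the memory controller in parallel
    with the shared-cache lookup."""

    def __init__(self, threshold=2, table_size=64):
        self.threshold = threshold
        self.table = [0] * table_size

    def _index(self, paddr):
        return hash(paddr) % len(self.table)   # stand-in address hash

    def issue_speculative_read(self, paddr):
        return self.table[self._index(paddr)] > self.threshold

    def on_shared_cache_result(self, paddr, was_miss):
        i = self._index(paddr)
        if was_miss:
            self.table[i] = min(7, self.table[i] + 1)   # grow confidence
        else:
            self.table[i] = max(0, self.table[i] - 1)
```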
  • Publication number: 20190065375
    Abstract: A first load instruction specifying a first virtual address misses in a data cache. A delta value is received based on a program counter value of the first load instruction. A second virtual address is computed based on the delta value and the first virtual address. Data associated with the second virtual address is then prefetched from a main memory to the data cache prior to a second load instruction specifying the second virtual address missing in the data cache.
    Type: Application
    Filed: August 22, 2017
    Publication date: February 28, 2019
    Inventors: Rami Mohammad AL SHEIKH, Shivam PRIYADARSHI, Brandon DWIEL, David John PALFRAMAN, Derek HOWER, Muntaquim Faruk CHOWDHURY
  • Patent number: 10203745
    Abstract: A scheduler and method for dynamic power reduction, e.g., in a processor core, are proposed. In conventional processor cores, for example, the scheduler precharges the grant lines of many instructions only to discharge the great majority of the precharged lines in the same cycle. To reduce power consumption, selective precharge and/or selective evaluation are proposed. With selective precharge, the grant lines of instructions that will evaluate to false (e.g., invalid instructions) are not precharged in a cycle. With selective evaluation, among the precharged instructions, instructions that are not ready are not evaluated in the same cycle. In this way, power consumption is reduced by avoiding unnecessary precharge and discharge.
    Type: Grant
    Filed: March 30, 2016
    Date of Patent: February 12, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Milind Ram Kulkarni, Rami Mohammad A. Al Sheikh, Raguram Damodaran
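The two filtering stages above can be sketched as a toy model (the tuple representation of scheduler slots is an illustrative assumption):

```python
def schedule_cycle(entries):
    """Toy model of selective precharge and selective evaluation:
    entries are (name, valid, ready) scheduler slots. Only valid
    entries are 'precharged', and only the ready subset of those is
    'evaluated', modelling the power saved by skipping grant lines
    that would only be discharged again."""
    precharged = [e for e in entries if e[1]]                       # selective precharge
    evaluated = [name for name, _, ready in precharged if ready]    # selective evaluation
    return len(precharged), evaluated
```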
  • Patent number: 10185668
    Abstract: Systems and methods relate to cost-aware cache management policies. In a cost-aware least recently used (LRU) replacement policy, temporal locality as well as miss cost is taken into account in selecting a cache line for replacement, wherein the miss cost is based on an associated operation type including instruction cache read, data cache read, data cache write, prefetch, and write back. In a cost-aware dynamic re-reference interval prediction (DRRIP) based cache management policy, miss costs associated with operation types pertaining to a cache line are considered for assigning re-reference interval prediction values (RRPV) for inserting the cache line, pursuant to a cache miss and for updating the RRPV upon a hit for the cache line. The operation types comprise instruction cache access, data cache access, prefetch, and write back. These policies improve victim selection, while minimizing cache thrashing and scans.
    Type: Grant
    Filed: September 20, 2016
    Date of Patent: January 22, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Rami Mohammad A. Al Sheikh, Shivam Priyadarshi, Harold Wade Cain, III
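The cost-aware victim selection above can be sketched as follows (the abstract names the operation types but not their relative costs, so the weights and the age-minus-cost score are assumptions for illustration):

```python
# Illustrative miss-cost weights by operation type (assumed, not from
# the patent).
MISS_COST = {"icache_read": 4, "dcache_read": 3, "dcache_write": 2,
             "prefetch": 1, "writeback": 1}

def choose_victim(lines):
    """Toy cost-aware LRU victim selection over (tag, lru_age, op_type)
    tuples, where a larger lru_age means less recently used: prefer the
    line that is both stale and cheap to miss on again."""
    return max(lines, key=lambda line: line[1] - MISS_COST[line[2]])[0]
```

Between two equally old lines, the one whose re-fetch is cheapest (e.g., a prefetch) is evicted first.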
  • Publication number: 20190018798
    Abstract: Systems and methods relate to cost-aware cache management policies. In a cost-aware least recently used (LRU) replacement policy, temporal locality as well as miss cost is taken into account in selecting a cache line for replacement, wherein the miss cost is based on an associated operation type including instruction cache read, data cache read, data cache write, prefetch, and write back. In a cost-aware dynamic re-reference interval prediction (DRRIP) based cache management policy, miss costs associated with operation types pertaining to a cache line are considered for assigning re-reference interval prediction values (RRPV) for inserting the cache line, pursuant to a cache miss and for updating the RRPV upon a hit for the cache line. The operation types comprise instruction cache access, data cache access, prefetch, and write back. These policies improve victim selection, while minimizing cache thrashing and scans.
    Type: Application
    Filed: September 18, 2018
    Publication date: January 17, 2019
    Inventors: Rami Mohammad AL SHEIKH, Shivam PRIYADARSHI, Harold Wade CAIN, III
  • Publication number: 20190004806
    Abstract: Systems and methods for branch prediction of fixed direction branch instructions involve Bloom Filters. A taken Bloom Filter records instances of a branch instruction being taken or having resolved in a taken direction; while a not-taken Bloom Filter records instances of a branch instruction not being taken, or having resolved in a not-taken direction. For a branch instruction to be executed, the taken Bloom Filter and the not-taken Bloom Filter are accessed and a direction of execution for the branch instruction is predicted using at least one of the taken Bloom Filter or the not-taken Bloom Filter.
    Type: Application
    Filed: June 30, 2017
    Publication date: January 3, 2019
    Inventor: Rami Mohammad A. AL SHEIKH
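The two-filter scheme above can be sketched with a minimal Bloom filter (filter size, hash count, and the predict-taken-only-when-unambiguous policy are illustrative assumptions; the claim covers using either or both filters):

```python
class BloomFilter:
    """Minimal Bloom filter over a small bit array with two hashes."""
    def __init__(self, size=256):
        self.bits = [False] * size
    def _indices(self, key):
        return [hash((key, salt)) % len(self.bits) for salt in (0, 1)]
    def add(self, key):
        for i in self._indices(key):
            self.bits[i] = True
    def maybe_contains(self, key):
        return all(self.bits[i] for i in self._indices(key))

class BloomBranchPredictor:
    """Toy fixed-direction predictor: resolved directions are recorded
    in a taken filter and a not-taken filter; a branch present only in
    the taken filter is predicted taken, everything else not taken."""
    def __init__(self):
        self.taken = BloomFilter()
        self.not_taken = BloomFilter()
    def train(self, pc, taken):
        (self.taken if taken else self.not_taken).add(pc)
    def predict(self, pc):
        return (self.taken.maybe_contains(pc)
                and not self.not_taken.maybe_contains(pc))
```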
  • Publication number: 20190004805
    Abstract: Systems and methods pertain to a branch prediction table comprising one or more entries. Each entry comprises one or more branch prediction counters corresponding to one or more instructions in a fetch group of instructions fetched for processing in a processor. Two or more tag fields are associated with each entry, wherein the two or more tag fields correspond to two or more fetch groups. Each of the two or more fetch groups comprises at least one branch instruction for which at least one of the one or more branch prediction counters is used for making a branch prediction. In the event of a miss in the branch prediction table, the branch prediction counters and the two or more tag fields are updated in a manner which enables constructive aliasing and prevents destructive aliasing.
    Type: Application
    Filed: June 28, 2017
    Publication date: January 3, 2019
    Inventor: Rami Mohammad A. AL SHEIKH
  • Publication number: 20190004803
    Abstract: Systems and methods for branch prediction include a processor configured to execute at least one branch instruction. The processor includes a branch prediction mechanism configured to provide a branch prediction for the at least one branch instruction and a statistical correction table (SCT) configured to indicate whether the branch prediction accuracy of the branch prediction mechanism is worse than a statistical bias for the branch instruction. An execution pipeline of the processor is configured to speculatively execute the branch instruction in the direction corresponding to the statistical bias if, at least, the branch prediction accuracy is worse than the statistical bias.
    Type: Application
    Filed: June 30, 2017
    Publication date: January 3, 2019
    Inventor: Rami Mohammad A. AL SHEIKH
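The override logic above can be sketched as a toy statistical corrector (the per-branch counters and the running-majority definition of "bias" are illustrative assumptions):

```python
class StatisticalCorrector:
    """Toy SCT: per branch, count how often the baseline predictor was
    right versus how often the branch's statistically biased direction
    was right; when the bias outperforms the predictor, override the
    prediction with the biased direction."""

    def __init__(self):
        # pc -> [predictor_correct, bias_correct, taken_count, total]
        self.stats = {}

    def train(self, pc, predicted, actual):
        s = self.stats.setdefault(pc, [0, 0, 0, 0])
        bias = s[3] > 0 and s[2] * 2 >= s[3]   # majority direction so far
        s[0] += int(predicted == actual)
        s[1] += int(bias == actual)
        s[2] += int(actual)
        s[3] += 1

    def final_prediction(self, pc, predicted):
        s = self.stats.get(pc)
        if s and s[1] > s[0]:         # bias beats the predictor
            return s[2] * 2 >= s[3]   # follow the statistical bias
        return predicted
```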
  • Patent number: 10089114
    Abstract: A scheduler with a picker block capable of dispatching multiple instructions per cycle is disclosed. The picker block may comprise an inter-group picker and an intra-group picker. The inter-group picker may be configured to pick multiple ready groups when there are two or more ready groups among a plurality of groups of instructions, and pick a single ready group when the single ready group is the only ready group among the plurality of groups. The intra-group picker may be configured to pick one ready instruction from each of the multiple ready groups when the inter-group picker picks the multiple ready groups, and to pick multiple ready instructions from the single ready group when the inter-group picker picks the single ready group.
    Type: Grant
    Filed: March 30, 2016
    Date of Patent: October 2, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Milind Ram Kulkarni, Rami Mohammad A. Al Sheikh, Raguram Damodaran
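The two-level picking described above can be sketched as a toy function (the dispatch width of 2 and the take-the-oldest-ready policy within a group are illustrative assumptions):

```python
def pick(groups, width=2):
    """Toy two-level picker: groups is a list of groups, each a list of
    (instruction, ready) tuples. With two or more ready groups, the
    inter-group picker selects multiple groups and the intra-group
    picker takes one ready instruction from each; with a single ready
    group, multiple ready instructions are taken from that group."""
    ready = [[inst for inst, ok in group if ok] for group in groups]
    ready = [group for group in ready if group]
    if len(ready) >= 2:
        return [group[0] for group in ready[:width]]  # one per group
    if len(ready) == 1:
        return ready[0][:width]                       # several from one group
    return []
```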
  • Publication number: 20180081811
    Abstract: Systems and methods for dynamically partitioning a shared cache, include dynamically determining a probability to be associated with each one of two or more processors configured to access the shared cache. Based on the probability for a processor, a first cache line of the processor is inserted in a most recently used (MRU) position of a least recently used (LRU) stack associated with the shared cache, pursuant to a miss in the shared cache for the first cache line. Based on the probability for the processor, a second cache line is promoted to the MRU position of the LRU stack, pursuant to a hit in the shared cache for the second cache line. The probability for the processor is determined based on hill-climbing, wherein fluctuations in the probability are reduced, local maxima are prevented, and the probability is prevented from falling below a threshold.
    Type: Application
    Filed: September 20, 2016
    Publication date: March 22, 2018
    Inventors: Rami Mohammad A. AL SHEIKH, Harold Wade CAIN, III
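The probabilistic insertion and hill-climbing adjustment above can be sketched as a toy per-processor policy (step size, floor, and the reward signal are illustrative assumptions):

```python
import random

class ProbabilisticInsertionPolicy:
    """Toy per-processor policy for a shared cache: with probability p a
    missing line is inserted at MRU (and a hitting line promoted to
    MRU); adjust() hill-climbs p on an observed reward such as hit
    rate, reversing direction when the reward drops and clamping p to a
    floor so it cannot collapse to zero."""

    def __init__(self, p=0.5, step=0.05, floor=0.1, seed=0):
        self.p, self.step, self.floor = p, step, floor
        self.last_reward = None
        self.rng = random.Random(seed)

    def insert_at_mru(self):
        return self.rng.random() < self.p   # probabilistic MRU insertion

    def adjust(self, reward):
        if self.last_reward is not None and reward < self.last_reward:
            self.step = -self.step          # hill-climbing: turn around
        self.p = min(1.0, max(self.floor, self.p + self.step))
        self.last_reward = reward
```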
  • Publication number: 20180081691
    Abstract: Replaying speculatively dispatched load-dependent instructions in response to a cache miss for a producing load instruction in an out-of-order processor (OoP) is disclosed. To allow a scheduler circuit to restore register dependencies in a register dependency tracking circuit for a replay operation in response to a cache miss for execution of a load instruction, the scheduler circuit includes a replay circuit. The replay circuit includes a load dependency tracking circuit, in which it tracks the dependencies of dispatched load instructions. The replay circuit uses these tracked dependencies to restore register dependencies for the dispatched load instructions in the register dependency tracking circuit in response to a replay operation. Thus, a load instruction does not have to be re-allocated to restore the register dependencies used for re-dispatching load-dependent instructions.
    Type: Application
    Filed: September 21, 2016
    Publication date: March 22, 2018
    Inventors: Milind Ram Kulkarni, Rami Mohammad Al Sheikh, Raguram Damodaran