PROCESSING METHOD INCLUDING PRE-ISSUE LOAD-HIT-STORE (LHS) HAZARD PREDICTION TO REDUCE REJECTION OF LOAD INSTRUCTIONS
A processing method supporting out-of-order execution (OOE) includes load-hit-store (LHS) hazard prediction at the instruction dispatch phase, reducing load instruction rejections and queue flushes at the execution phase. The instruction dispatch unit (IDU) detects likely LHS hazards by generating entries for pending stores in an LHS detection table. The entries in the table contain an address field (generally the immediate field) of the store instruction and the register number of the store. The IDU compares the address field and register number for each load with entries in the table to determine if a likely LHS hazard exists. If an LHS hazard is detected, the load is dispatched to the issue queue of the load-store unit (LSU) with a tag corresponding to the matching store instruction, causing the LSU to issue the load only after the corresponding store has been issued for execution.
The present Application is a Continuation of U.S. patent application Ser. No. 14/522,811, filed on Oct. 24, 2014 and claims priority thereto under 35 U.S.C. §120. The disclosure of the above-referenced parent U.S. Patent Application is incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is related to processing systems and processors, and more specifically to techniques for predicting load-hit-store hazards at dispatch time to reduce rejection of dispatched load instructions.
2. Description of Related Art
In pipelined processors supporting out-of-order execution (OOE), overlaps between store and load instructions causing load-hit-store hazards represent a serious bottleneck in the data flow between the load-store unit (LSU) and the instruction dispatch unit (IDU). In particular, in a typical pipelined processor, when a load-hit-store hazard is detected by the LSU, the load instruction that is dependent on the result of the store instruction is rejected, generally several times, and is then reissued, and all newer instructions following the load instruction are flushed. The above-described reject and reissue operation not only consumes resources of the load-store data path(s) within the processor, but can also consume issue queue space in the load-store execution path(s) by filling the load-store issue queue with rejected load instructions that must be reissued. When such an LHS hazard occurs in a program loop, the reject and reissue operation can lead to a dramatic reduction in system performance.
In some systems, the reissued load instruction entries are tagged with dependency flags, so that subsequent reissues will only occur after the store operation on which the load instruction depends, preventing recurrence of the reissue operations. However, rejection of the first issue of the load instruction and the consequent flushing of newer instructions still represents a significant performance penalty in OOE processors.
It would therefore be desirable to provide a method for managing load-store operations with reduced rejection and reissue of operations, in particular load rejections due to load-hit-store hazards.
BRIEF SUMMARY OF THE INVENTION
The invention is embodied in a method that reduces rejection of load instructions by predicting likely load-hit-store hazards. The method is a method of operation of a processor core.
The method is performed in a processor core supporting out-of-order execution that detects likely load-hit-store hazards. When an instruction dispatch unit decodes a fetched instruction, if the instruction is a store instruction, address information is stored in a load-hit-store detection table. The address information is generally the base registers used to generate the effective address of the store operation in register-based addressing and/or the immediate field of the instruction for immediate addressing. When a subsequent load instruction is encountered, the instruction dispatch unit checks the load-hit-store detection table to determine whether or not an entry in the table has matching address information. If a matching entry exists in the table, the instruction dispatch unit forwards the load instruction with a tag corresponding to the entry, so that the load-store unit will execute the load instruction after the corresponding store has been executed. If no matching entry exists in the table, the load instruction is issued untagged.
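By way of illustration only, the dispatch-time check summarized above can be modeled in software as follows. This is a minimal sketch, not the disclosed hardware: the names `AddrInfo`, `LhsDetectionTable`, `record_store`, `lookup_load`, and `dispatch_load`, the table size, and the drop-oldest replacement policy are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AddrInfo:
    """Address information available at dispatch time (hypothetical layout)."""
    base_regs: Tuple[int, ...]   # base register numbers; empty for immediate addressing
    immediate: Optional[int]     # immediate field, if the instruction has one


@dataclass
class Entry:
    addr: AddrInfo
    tag: int                     # identifies the store instruction that created the entry


class LhsDetectionTable:
    """Software model of a load-hit-store detection table (illustrative only)."""

    def __init__(self, size: int = 8) -> None:
        self.size = size         # assumed capacity; the text does not fix a size
        self.entries: List[Entry] = []

    def record_store(self, addr: AddrInfo, tag: int) -> None:
        # Assumed replacement policy: drop the oldest entry when the table is full.
        if len(self.entries) == self.size:
            self.entries.pop(0)
        self.entries.append(Entry(addr, tag))

    def lookup_load(self, addr: AddrInfo) -> Optional[int]:
        # Search newest-first and return the tag of a matching store, if any.
        for entry in reversed(self.entries):
            if entry.addr == addr:
                return entry.tag
        return None


def dispatch_load(table: LhsDetectionTable, addr: AddrInfo) -> Tuple[str, Optional[int]]:
    """Return the load with a tag if a likely hazard is found, else untagged."""
    return ("load", table.lookup_load(addr))
```

In this model a load whose base registers and immediate field equal those of a recorded store is forwarded with that store's tag, and any other load is forwarded with `None` in place of a tag.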
The foregoing and other objectives, features, and advantages of the invention will be apparent from the following, more particular, description of the preferred embodiment of the invention, as illustrated in the accompanying drawings.
The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objectives, and advantages thereof, will best be understood by reference to the following detailed description of the invention when read in conjunction with the accompanying Figures, wherein like reference numerals indicate like components, and:
The present invention relates to processors and processing systems in which rejects of load instructions due to load-hit-store (LHS) hazards are reduced by predicting the occurrence of such hazards using an LHS prediction table to track dispatched stores that may or may not have been issued/executed. Load instructions are examined at dispatch time to determine whether or not a pending store exists that has neither been committed for a cache write nor otherwise been flushed from the load-store execution path. If an LHS hazard is detected, the load instruction is dispatched with an ITAG matching the ITAG of the store instruction corresponding to the entry in the LHS prediction table, so that the load-store unit will issue the load instruction dependent on the store result, i.e., will retain the load instruction in its issue queue until the store instruction is committed or flushed, preventing rejections of load instructions due to identification of LHS hazards during issue of the load instructions.
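The retention of a tagged load in the issue queue, as described in the preceding paragraph, can be sketched as follows. The model is an assumption-laden simplification: `QueuedOp`, `LoadStoreIssueQueue`, `resolve_store`, and `issue_next` are hypothetical names, operands are treated as always ready, and the oldest eligible entry is selected first.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class QueuedOp:
    kind: str                         # "load" or "store"
    itag: int                         # instruction tag of this operation
    depends_on: Optional[int] = None  # ITAG of the predicted conflicting store, if any


class LoadStoreIssueQueue:
    """Illustrative model of an issue queue that holds back tagged loads."""

    def __init__(self) -> None:
        self.queue: List[QueuedOp] = []
        self.pending_stores: Set[int] = set()  # stores neither committed nor flushed

    def dispatch(self, op: QueuedOp) -> None:
        self.queue.append(op)
        if op.kind == "store":
            self.pending_stores.add(op.itag)

    def resolve_store(self, itag: int) -> None:
        # Called when the store has been committed or flushed (assumed interface).
        self.pending_stores.discard(itag)

    def issue_next(self) -> Optional[QueuedOp]:
        # Issue the oldest entry that is not waiting on a pending store.
        for index, op in enumerate(self.queue):
            if op.kind == "load" and op.depends_on in self.pending_stores:
                continue  # retain the tagged load in the queue
            return self.queue.pop(index)
        return None
```

In this sketch an untagged load may bypass a retained tagged load, and the tagged load becomes eligible only after `resolve_store` is called for the store it names, so the load is never issued early and then rejected.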
Referring now to
Referring now to
Referring now to
It should be noted that the above-described matching does not generally detect all LHS hazards, since, for example, a store instruction using immediate addressing may hit the same address as a load instruction using register or register indirect addressing, and a matching entry in LHS detection table 41 will not be found for the load. Such an LHS hazard will instead be rejected during the issue phase after the full EA has been computed for both the load and store instructions. However, most likely LHS hazards should be detected under normal circumstances and the number of load rejects due to LHS hazards dramatically reduced. Further, an entry may be found in LHS detection table 41 that is flagged as an LHS hazard and in actuality is not, for example, when a base register value has been modified between a register-addressed load and a preceding register-addressed store using the same base register pair. Therefore, the method detects likely LHS hazards and not guaranteed address conflicts/overlaps. However, such occurrences should be rare compared to the number of actual LHS hazards detected.
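The two kinds of misprediction noted above can be shown with a small worked example. The sketch assumes a simplified model in which the dispatch-time key is the pair (base register numbers, immediate field) and the effective address is the sum of the base register values and the immediate; `dispatch_key`, `effective_address`, and the register values are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

Key = Tuple[Tuple[int, ...], Optional[int]]


def dispatch_key(base_regs: Sequence[int], immediate: Optional[int]) -> Key:
    """Address information compared at dispatch time (assumed form)."""
    return (tuple(base_regs), immediate)


def effective_address(regs: Dict[int, int], base_regs: Sequence[int],
                      immediate: Optional[int]) -> int:
    """Effective address computed later, at issue (assumed simple sum)."""
    return sum(regs[r] for r in base_regs) + (immediate or 0)


# Missed hazard: an immediate-addressed store and a register-addressed load
# reach the same address, but their dispatch-time keys differ.
regs = {1: 0x80, 2: 0x80}
store_key = dispatch_key([], 0x100)
load_key = dispatch_key([1, 2], None)
missed = (store_key != load_key and
          effective_address(regs, [], 0x100) == effective_address(regs, [1, 2], None))

# False hazard: the keys match, but a base register is modified between the
# store and the load, so the effective addresses differ.
store_ea = effective_address(regs, [1, 2], None)
regs[1] = 0x200
load_ea = effective_address(regs, [1, 2], None)
false_hit = (dispatch_key([1, 2], None) == dispatch_key([1, 2], None) and
             store_ea != load_ea)
```

In the first case the load is dispatched untagged and the conflict is left to be caught at issue. In the second case the load is tagged and waits for a store that it does not actually overlap.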
Referring now to
Referring now to
While the invention has been particularly shown and described with reference to the preferred embodiments thereof, it will be understood by those skilled in the art that the foregoing and other changes in form and details may be made therein without departing from the spirit and scope of the invention.
Claims
1. A method of operation of a processor core, the method comprising:
- fetching instructions of an instruction stream;
- dispatching instructions of the instruction stream by an instruction dispatch unit of the processor core that dispatches the instructions to issue queues, according to a type of the instructions;
- detecting likely load-hit-store hazards prior to the dispatch of load instructions to an issue queue of a load-store unit of the processor core; and
- identifying the likely load-hit-store hazards to the load-store unit, whereby rejections of the load instructions by the load-store unit due to load-hit-store hazards are reduced.
2. The method of claim 1, wherein the instruction dispatch unit detects store instructions of the instruction stream during the dispatching of the store instructions and stores store address information associated with the store instructions in corresponding entries in a load-hit-store detection table, and wherein the detecting likely load-hit-store hazards comprises detecting load instructions of the instruction stream and comparing the store address information of the entries in the table with load address information of load instructions of the instruction stream.
3. The method of claim 2, further comprising:
- responsive to the detecting of a store operation, writing the store address information associated with the store operation to the load-hit-store detection table and dispatching the store operation to the issue queue of the load-store unit of the processor core;
- responsive to the detecting of a load instruction, comparing the load address information of the load instruction to entries in the load-hit-store detection table corresponding to store operations occurring earlier in the instruction stream to determine if a likely load-hit-store hazard exists between the load instruction and a given one of the store operations;
- responsive to the comparing determining that the likely load-hit-store hazard exists between the load instruction and the given store operation, dispatching the load instruction to the issue queue of the load-store unit of the processor core along with a tag identifying the given store operation; and
- responsive to the comparing determining that the likely load-hit-store hazard does not exist between the load instruction and the given store operation, dispatching the load instruction to the issue queue of the load-store unit of the processor core without the tag.
4. The method of claim 2, wherein the store address information is one or both of an immediate field of the store instruction and one or more base register numbers of the store instruction.
5. The method of claim 3, further comprising:
- the load-store unit examining a next entry of the issue queue to determine whether or not a next operation is a load instruction with a corresponding tag;
- the load-store unit, responsive to determining that the load instruction with a corresponding tag is not present, processing the next entry for execution by the load-store unit;
- the load-store unit examining the next entry of the issue queue to determine whether or not the next operation is a store operation;
- the load-store unit, responsive to determining that the next operation is a store operation, examining the issue queue to determine whether a load instruction having a corresponding tag matching a tag of the store operation is present;
- the load-store unit, responsive to determining that the next operation is a store operation, processing the next entry for execution by the load-store unit; and
- the load-store unit, responsive to determining that the load instruction having the corresponding tag matching the tag of the store operation is present, processing the load instruction for execution by the load-store unit subsequent to processing the next entry.
6. The method of claim 3, further comprising:
- responsive to detecting a store operation in the instruction stream, comparing entries in the load-hit-store detection table with the store address information of the store operation; and
- responsive to the comparing detecting a match between the store address information of the store instruction and an entry in the load-hit-store detection table, invalidating the entry in the load-hit-store detection table prior to the instruction dispatch unit storing an entry corresponding to the store instruction in the load-hit-store detection table, whereby only a single valid entry in the load-hit-store detection table contains identical store address information at any time.
7. The method of claim 3, wherein the comparing compares a most-recently-stored matching entry in the load-hit-store detection table that has a match between the load address information of the load instruction and the most-recently-stored matching entry in the load-hit-store detection table, whereby multiple valid entries in the load-hit-store detection table may match a particular load address information, without causing a load-hit-store hazard.
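For illustration only, the two table-management alternatives recited in claims 6 and 7 can be contrasted in a short software model. The class names `SingleEntryTable` and `MostRecentTable` and their methods are hypothetical, and the model ignores table capacity and entry retirement.

```python
from typing import Hashable, List, Optional, Tuple


class SingleEntryTable:
    """Claim 6 style: invalidate a matching entry before storing the new one."""

    def __init__(self) -> None:
        self.entries: List[list] = []  # each entry is [addr_info, tag, valid]

    def record_store(self, addr_info: Hashable, tag: int) -> None:
        for entry in self.entries:
            if entry[2] and entry[0] == addr_info:
                entry[2] = False       # the older store no longer owns this key
        self.entries.append([addr_info, tag, True])

    def lookup_load(self, addr_info: Hashable) -> Optional[int]:
        for entry in self.entries:
            if entry[2] and entry[0] == addr_info:
                return entry[1]        # at most one valid entry can match
        return None

    def valid_matches(self, addr_info: Hashable) -> int:
        return sum(1 for e in self.entries if e[2] and e[0] == addr_info)


class MostRecentTable:
    """Claim 7 style: keep duplicates and match the most recently stored entry."""

    def __init__(self) -> None:
        self.entries: List[Tuple[Hashable, int]] = []  # oldest first

    def record_store(self, addr_info: Hashable, tag: int) -> None:
        self.entries.append((addr_info, tag))

    def lookup_load(self, addr_info: Hashable) -> Optional[int]:
        for stored_info, tag in reversed(self.entries):
            if stored_info == addr_info:
                return tag
        return None
```

Under either policy a load is tagged with the youngest preceding store having the same address information. The model differs only in whether older duplicates are invalidated when the new entry is written or are skipped at lookup time.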
Type: Application
Filed: May 28, 2015
Publication Date: Apr 28, 2016
Inventors: Sundeep Chadha (AUSTIN, TX), Richard James Eickemeyer (ROCHESTER, MN), John Barry Griswell, JR. (AUSTIN, TX), Dung Quoc Nguyen (AUSTIN, TX)
Application Number: 14/724,175