Branch Prediction Patents (Class 712/239)
  • Patent number: 9292292
    Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions; and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the entry of the stack accessed by the instruction stack relative to a base entry. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption at the processing system.
    Type: Grant
    Filed: June 20, 2013
    Date of Patent: March 22, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kai Troester, Luke Yen
  • Patent number: 9286081
    Abstract: A background thread can be used to process events, e.g., a touch, gesture, pinch, or swipe, that are received on a touch sensitive device, or events, e.g., mouse scroll wheel events that are received on a input device, e.g., a mouse. The background thread can be used to process events when a main thread assigned to the Graphical User Interface (GUI) is interrupted. In such situations, the background thread can continue processing events. In cases where the main thread is interrupted and the event is scroll input, the background thread can draw content on the GUI in response to the scroll, so that the response to the scroll input observed by the user is unaffected by the interrupted main thread. By processing events and drawing content using the background thread while the main thread is blocked, the GUI can be navigated without having the user experience a stall or stutter.
    Type: Grant
    Filed: October 10, 2012
    Date of Patent: March 15, 2016
    Assignee: Apple Inc.
    Inventors: Corbin R. Dunn, Ali T. Ozer, Raleigh Joseph Ledet, Kristin Forster
  • Patent number: 9239721
    Abstract: Embodiments relate to global weak pattern history table (PHT) filtering. An aspect includes receiving a search address associated with a branch prediction, and receiving a prediction strength indicator and a tag from a PHT. Based on determining that the tag matches the search address and the prediction strength indicator is weak, an accuracy counter is compared to a comparison threshold to determine whether a PHT direction prediction from the PHT is more likely accurate than a branch history table (BHT) direction prediction from a BHT. The PHT direction prediction is selected as a direction prediction based on determining that the accuracy counter indicates that the PHT direction prediction is more likely accurate than the BHT direction prediction. The BHT direction prediction is selected as the direction prediction based on determining that the accuracy counter indicates that the BHT direction prediction is more likely accurate than the PHT direction prediction.
    Type: Grant
    Filed: November 25, 2013
    Date of Patent: January 19, 2016
    Assignee: International Business Machines Corporation
    Inventors: James J. Bonanno, Brian R. Prasky
  • Patent number: 9229723
    Abstract: Embodiments relate to global weak pattern history table (PHT) filtering. An aspect includes receiving a search address associated with a branch prediction, and receiving a prediction strength indicator and a tag from a PHT. Based on determining that the tag matches the search address and the prediction strength indicator is weak, an accuracy counter is compared to a comparison threshold to determine whether a PHT direction prediction from the PHT is more likely accurate than a branch history table (BHT) direction prediction from a BHT. The PHT direction prediction is selected as a direction prediction based on determining that the accuracy counter indicates that the PHT direction prediction is more likely accurate than the BHT direction prediction. The BHT direction prediction is selected as the direction prediction based on determining that the accuracy counter indicates that the BHT direction prediction is more likely accurate than the PHT direction prediction.
    Type: Grant
    Filed: June 11, 2012
    Date of Patent: January 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: James J. Bonanno, Brian R. Prasky
  • Patent number: 9213563
    Abstract: Systems and methods for executing non-native instructions in a computing system having a processor configured to execute native instructions are provided. A dynamic translator uses instruction code translation in parallel with just-in-time (JIT) compilation to execute the non-native instructions. Non-native instructions may be interpreted to generate instruction codes, which may be stored in a shadow memory. During a subsequent scheduling of a non-native instruction for execution, the corresponding instruction code may be retrieved from the shadow memory and executed, thereby avoiding reinterpreting the non-native instruction. In addition, the JIT compiler may compile instruction codes to generate native instructions, which may be made available for execution, further speeding up the execution process.
    Type: Grant
    Filed: December 30, 2013
    Date of Patent: December 15, 2015
    Assignee: Unisys Corporation
    Inventors: Andrew T. Jennings, Charles R Caldarale, Maurice Marks, Kevin W Harris
  • Patent number: 9201654
    Abstract: Disclosed are a processor and a processing method incorporating an instruction pipeline with direction prediction (i.e., taken or not taken) for conditional branch instructions. In the embodiments, reading of a branch instruction history table (BHT) and a branch instruction target address cache (BTAC) for branch direction prediction occurs in parallel with the current instruction fetch in order to minimize delay in the next instruction fetch. Additionally, direction prediction is performed in the very next clock cycle based either on an initial direction prediction for the specific instruction, as stored in the BHT, or, if applicable, on a prior entry for the specific instruction in the BTAC. An override bit associated with each entry in the BTAC is the determining factor for whether or the BTAC or BHT is controlling. Override bits in the BTAC can be pre-established based on the branch instruction type in order to ensure prediction accuracy.
    Type: Grant
    Filed: June 28, 2011
    Date of Patent: December 1, 2015
    Assignee: International Business Machines Corporation
    Inventors: Jason F. Cantin, Jack R. Smith, Arnold S. Tran, Kenichi Tsuchiya
  • Patent number: 9182991
    Abstract: A computer system for instruction execution includes a processor having a pipeline. The system is configured to perform a method including fetching, in the pipeline, a plurality of instructions, wherein the plurality of instructions includes a plurality of branch instructions, for each of the plurality of branch instructions, assigning a branch uncertainty to each of the plurality of branch instructions, for each of the plurality of instructions, assigning an instruction uncertainty that is a summation of branch uncertainties of older unresolved branches and balancing the instructions, based on a current summation of instruction uncertainty, in the pipeline.
    Type: Grant
    Filed: February 6, 2012
    Date of Patent: November 10, 2015
    Assignee: International Business Machines Corporation
    Inventors: Alper Buyuktosunoglu, Brian R. Prasky, Vijayalakshmi Srinivasan
  • Patent number: 9176737
    Abstract: A data processing apparatus is disclosed, having: an instruction decoder configured to decode a stream of instructions, a data processor configured to process the decoded stream of instructions; wherein in response to a plurality of adjacent instructions within the stream of instructions execution of which is dependent upon a data condition being met and whose execution when said data condition is not met does not change a state of said processing apparatus, the processor is configured to: commence determining whether the data condition is met or not; and commence processing said plurality of adjacent instructions; and in response to determining that said data condition is not met; skip to a next instruction to be executed after said plurality of adjacent instructions without executing any intermediate ones of said plurality of adjacent instructions not yet executed and continue execution at the next instruction.
    Type: Grant
    Filed: February 7, 2011
    Date of Patent: November 3, 2015
    Assignee: ARM Limited
    Inventor: Alastair David Reid
  • Patent number: 9170817
    Abstract: Some microprocessors check branch prediction information in a branch history table and/or a branch target buffer. To check for branch prediction information, a microprocessor can identify which instructions are control flow instructions and which instructions are non control flow instructions. To reduce power consumption in the branch history table and/or branch target buffer, the branch history table and/or branch target buffer can check for branch prediction information corresponding to the control flow instructions and not the non control flow instructions.
    Type: Grant
    Filed: August 4, 2009
    Date of Patent: October 27, 2015
    Assignee: STMicroelectronics (Beijing) R&D Co., Ltd.
    Inventors: Kai-Feng Wang, Hong-Xia Sun, Yong-Qiang Wu
  • Patent number: 9075622
    Abstract: In a data processing system, data representing program instructions is fetched from memory, each instruction being from one of a plurality of sets of instructions including at least first and second sets of instructions and each program instruction within the fetched data comprising one or more blocks to be pre-decoded, each block representing a portion of an instruction. Pre-decoding circuitry is configured to perform pre-decoding operations on the blocks. For at least one portion of an instruction from the first set of instructions and at least one portion of an instruction from the second set of instructions the pre-decoding operation performed on a block fetched from memory is independent of whether the block is identified as representing the at least one portion of an instruction from the first set of instructions or as the at least one portion of an instruction from the second set of instructions.
    Type: Grant
    Filed: January 23, 2008
    Date of Patent: July 7, 2015
    Assignee: ARM Limited
    Inventors: Peter Richard Greenhalgh, Andrew Christopher Rose
  • Patent number: 9032191
    Abstract: A hypervisor and one or more guest operating systems resident in a data processing system and hosted by the hypervisor are configured to selectively enable or disable branch prediction logic through separate hypervisor-mode and guest-mode instructions. By doing so, different branch prediction strategies may be employed for different operating systems and user applications hosted thereby to provide finer grained optimization of the branch prediction logic for different operating scenarios.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: May 12, 2015
    Assignee: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 9021240
    Abstract: A system and method for controlling restarting of instruction fetching using speculative address computations in a processor are provided. The system includes a predicted target queue to hold branch prediction logic (BPL) generated target address values. The system also includes target selection logic including a recycle queue. The target selection logic selects a saved branch target value between a previously speculatively calculated branch target value from the recycle queue and an address value from the predicted target queue. The system further includes a compare block to identify a wrong target in response to a mismatch between the saved branch target value and a current calculated branch target, where instruction fetching is restarted in response to the wrong target.
    Type: Grant
    Filed: February 22, 2008
    Date of Patent: April 28, 2015
    Assignee: International Business Machines Corporation
    Inventors: Khary J. Alexander, Brian R. Prasky, Anthony Saporito, Robert J. Sonnelitter, III
  • Patent number: 9021241
    Abstract: Embodiments provide methods, apparatus, systems, and computer readable media associated with predicting predicates and branch targets during execution of programs using combined branch target and predicate predictions. The predictions may be made using one or more prediction control flow graphs which represent predicates in instruction blocks and branches between blocks in a program. The prediction control flow graphs may be structured as trees such that each node in the graphs is associated with a predicate instruction, and each leaf associated with a branch target which jumps to another block. During execution of a block, a prediction generator may take a control point history and generate a prediction. Following the path suggested by the prediction through the tree, both predicate values and branch targets may be predicted. Other embodiments may be described and claimed.
    Type: Grant
    Filed: June 18, 2010
    Date of Patent: April 28, 2015
    Assignee: The Board of Regents of The University of Texas System
    Inventors: Douglas C. Burger, Stephen W. Keckler
  • Publication number: 20150100762
    Abstract: A processor includes an instruction fetch unit and an execution unit. The instruction fetch unit retrieves instructions from memory to be executed by the execution unit. The instruction fetch unit includes a branch prediction unit which is configured to predict whether a branch instruction is likely to be executed. The memory includes an instruction cache comprising a portion of the fetch blocks available in the memory. The instruction fetch unit may use a combination of way prediction and serialized access to retrieve instructions from the instruction cache. The instruction fetch unit initially accesses the instruction cache to retrieve the predicted fetch block associated with a way prediction. The instruction fetch unit compares a cache tag associated with the way prediction with the address of the cache line that includes the predicted fetch block. If the tag matches, then the way prediction is correct and the retrieved fetch block is valid.
    Type: Application
    Filed: October 6, 2014
    Publication date: April 9, 2015
    Inventor: Eino Jacobs
  • Publication number: 20150100769
    Abstract: A processor uses a prediction unit to predict subsequent instructions of a program to be executed by the processor. Many implementations or combinations of implementations may be used to predict the subsequent instruction of the program. In one embodiment, a branch cache is used to store branch information. A prediction table is used to store prediction information based on the branch. A prediction logic module determines whether a branch is taken or not taken based on the branch information stored in the branch cache and the prediction information stored in the prediction table.
    Type: Application
    Filed: October 6, 2014
    Publication date: April 9, 2015
    Inventor: Eino Jacobs
  • Patent number: 8989887
    Abstract: A method and system for the use of prediction data in monitoring actual production targets is described herein. In one embodiment, a process is provided to receive data from a plurality of source systems in a manufacturing facility and generate a prediction pertaining to a future state of the manufacturing facility based on the data received from the plurality of source systems. A recent state of the manufacturing facility is determined based on the data received from the plurality of source systems and a comparison between the recent state and the prediction is facilitated.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: March 24, 2015
    Assignee: Applied Materials, Inc.
    Inventors: Richard Stafford, David E. Norman
  • Patent number: 8972706
    Abstract: A data processing system and computer program product for processing instructions. The instructions are processed by a processor unit while using a first table in a plurality of tables to predict a set of instructions needed by the processor unit after processing of a conditional instruction. An identification is formed that a rate of success in correctly predicting the set of instructions when using the first table is less than a threshold number. A sequence of the instructions being processed by the processor unit is searched for an instruction that matches a marker in a set of markers for identifying when to use the plurality of tables. An identification that the instruction that matches the marker is formed. A second table from the plurality of tables referenced by the marker is identified. The second table is used in place of the first table.
    Type: Grant
    Filed: May 26, 2011
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Wen-Tzer T. Chen, Diane G. Flemming, William A. Maron, Mysore S. Srinivas, David B. Whitworth
  • Publication number: 20150058607
    Abstract: Embodiments relate to confidence threshold-based opposing path execution for branch prediction. An aspect includes determining a branch prediction for a first branch instruction that is encountered during execution of a first thread, wherein the branch prediction indicates a primary path and an opposing path for the first branch instruction. Another aspect includes executing the primary path by the first thread. Another aspect includes determining a confidence of the branch prediction and comparing the confidence of the branch prediction to a confidence threshold. Yet another aspect includes, based on the confidence of the branch prediction being less than the confidence threshold, starting a second thread that executes the opposing path of the first branch instruction, wherein the second thread is executed in parallel with the first thread.
    Type: Application
    Filed: September 30, 2014
    Publication date: February 26, 2015
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
  • Patent number: 8959320
    Abstract: A system and method for efficient branch prediction. A processor includes two branch predictors. A first branch predictor generates branch prediction data, such as a branch direction and a branch target address. The second branch predictor generates branch prediction data at a later time and with higher prediction accuracy. Control logic may determine whether the branch prediction data from each of the first and the second branch predictors match. If a mismatch occurs, the first predictor may be trained with the branch prediction data generated by the second branch predictor. A stored indication of hysteresis may indicate a given branch instruction exhibits a frequently alternating pattern regarding its branch direction. Such behavior may lead to consistent branch mispredictions due to the training is unable to keep up with the changing branch direction. When such a condition is determined to occur, the control logic may prevent training of the first predictor.
    Type: Grant
    Filed: December 7, 2011
    Date of Patent: February 17, 2015
    Assignee: Apple Inc.
    Inventors: Andrew J. Beaumont-Smith, Ramesh B. Gunna
  • Publication number: 20150046690
    Abstract: A technique for branch target prediction includes storing, based on an instruction fetch address for a group of fetched instructions, first predicted targets for first indirect branch instructions in respective entries of a local count cache. Second predicted targets for second indirect branch instructions are stored in respective entries of a global count cache, based on the instruction fetch address and a global history vector for the instruction fetch address. One of the local count cache and the global count cache is selected to provide a selected predicted target for an indirect branch instruction in the group of fetched instructions.
    Type: Application
    Filed: August 8, 2013
    Publication date: February 12, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Richard James Eickemeyer, Tejas Karkhanis, Brian R. Konigsburg, David Stephen Levitan, Douglas Robert Gordan Logan
  • Patent number: 8943300
    Abstract: An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. In response to the predecode information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: January 27, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Patent number: 8943301
    Abstract: Methods for storing branch information in an address table of a processor are disclosed. A processor of the disclosed embodiments may generally include an instruction fetch unit connected to an instruction cache, a branch execution unit, and an address table being connected to the instruction fetch unit and the branch execution unit. The address table may generally be adapted to store a plurality of entries with each entry of the address table being adapted to store a base address and a base instruction tag. In a further embodiment, the branch execution unit may be adapted to determine the address of a branch instruction having an instruction tag based on the base address and the base instruction tag of an entry of the address table associated with the instruction tag. In some embodiments, the address table may further be adapted to store branch information.
    Type: Grant
    Filed: May 5, 2011
    Date of Patent: January 27, 2015
    Assignee: International Business Machines Corporation
    Inventors: Brian R. Konigsburg, David Stephen Levitan, Wolfram M. Sauer, Samuel Jonathan Thomas
  • Publication number: 20150006868
    Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for minimizing bandwidth to compress an output stream of an instruction tracing system. For example, the method may include identifying a current instruction in a trace of the IT module as a conditional branch (CB) instruction. The method includes executing one of generating a CB packet including a byte pattern with an indication of outcome of the CB instruction, or adding an indication of the outcome of the CB instruction to the byte pattern of an existing CB packet. The method includes generating a packet when a subsequent instruction in the trace is not the CB instruction. The packet is different from the CB packet. The method also includes adding the packet into a deferred queue when the packet is deferrable. The method further includes outputting the CB packet followed by the deferred packet into a packet log.
    Type: Application
    Filed: June 28, 2013
    Publication date: January 1, 2015
    Inventors: Ilya Wagner, Matthew C. Merten, Frank Binns, Christine E. Wang, Mayank Bomb, Tong Li, Thilo Schmitt, MD A. Rahman
  • Publication number: 20140372736
    Abstract: A data processing apparatus and method are provided for handling retrieval of instructions from an instruction cache. Fetch circuitry retrieves instructions from the instruction cache into a temporary buffer, and execution circuitry executes a sequence of instructions retrieved from the temporary buffer, that sequence including branch instructions. Branch prediction circuitry is configured to predict, for each identified branch instruction in the sequence, if that branch instruction will result in a taken branch when that branch instruction is subsequently executed by the execution circuitry. In a normal operating mode, the fetch circuitry retrieves one or more speculative instructions from the instruction cache between the time that a source branch instruction is retrieved from the instruction cache and the branch prediction circuitry predicts if that source branch instruction will result in the taken branch.
    Type: Application
    Filed: June 11, 2014
    Publication date: December 18, 2014
    Inventor: Peter Richard GREENHALGH
  • Patent number: 8914622
    Abstract: Processors may be tested according to various implementations. In one general implementation, a process for processor testing may include randomly generating a first plurality of branch instructions for a first portion of an instruction set, each branch instruction in the first portion branching to a respective instruction in a second portion of the instruction set, the branching of the branch instructions to the respective instructions being arranged in a sequential manner. The process may also include randomly generating a second plurality of branch instructions for the second portion of the instruction set, each branch instruction in the second portion branching to a respective instruction in the first portion of the instruction set, the branching of the branch instructions to the respective instructions being arranged in a sequential manner. The process may additionally include generating a plurality of instructions to increment a counter when each branch instruction is encountered during execution.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: December 16, 2014
    Assignee: International Business Machines Corporation
    Inventors: Abhishek Bansal, Nitin P. Gupta, Brad L. Herold, Jayakumar N. Sankarannair
  • Publication number: 20140365753
    Abstract: A microprocessor includes a predicting unit and a control unit. The control unit controls the predicting unit to accumulate a history of characteristics of executed instructions and makes predictions related to subsequent instructions based on the history while the microprocessor is running a first thread. The control unit also detects a transition from running the first thread to running a second thread and controls the predicting unit to selectively suspend accumulating the history and making the predictions using the history while running the second thread. The predicting unit makes static predictions while running the second thread. The selectivity may be based on the privilege level, identity or length of the second thread, static prediction effectiveness during a previous execution instance of the thread, whether the transition was made due to a system call, and whether the second thread is an interrupt handler.
    Type: Application
    Filed: January 27, 2014
    Publication date: December 11, 2014
    Inventors: Rodney E. Hooker, Terry Parks, John Michael Greer
  • Patent number: 8909907
    Abstract: Exemplary embodiments include a system and method for reducing branch prediction latency using a branch target buffer with most recently used column prediction. An exemplary embodiment includes a method for reducing branch prediction latency, the method including reading most-recently-used information from a most-recently-used table associated with the branch target buffer where each most-recently-used entry corresponds to one or more branch target buffer rows and specifies the ordering from least-recently-used to most-recently-used of the associated branch target buffer columns, selecting a row from the branch target buffer and simultaneously selecting the associated entry from the most-recently-used table and speculating that there is a prediction in the most recently used column of the plurality of columns from the selected row from the branch target buffer while determining whether there is a prediction and which column contains the prediction.
    Type: Grant
    Filed: February 12, 2008
    Date of Patent: December 9, 2014
    Assignee: International Business Machines Corporation
    Inventors: James J. Bonanno, Brian R. Prasky
  • Patent number: 8909908
    Abstract: A pipelined out-of-order execution in-order retire microprocessor includes a branch predictor that predicts a target address of a branch instruction, a fetch unit that fetches instructions at the predicted target address, and an execution unit that: resolves a target address of the branch instruction and detects that the predicted and resolved target addresses are different; determines whether there is an unretired instruction that must be corrected and that is older in program order than the branch instruction, in response to detecting that the predicted and resolved target addresses are different; execute the branch instruction by flushing instructions fetched at the predicted target address and causing the fetch unit to fetch from the resolved target address, if there is not an unretired instruction that must be corrected and that is older in program order than the branch instruction; and otherwise, refrain from executing the branch instruction.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: December 9, 2014
    Assignee: VIA Technologies, Inc.
    Inventors: Rodney E. Hooker, Gerard M. Col, Bryan Wayne Pogor
  • Patent number: 8904155
    Abstract: In response to a property of a conditional branch instruction associated with a loop, such as a property indicating that the branch is a loop-ending branch, a count of the number of iterations of the loop is maintained, and a multi-bit value indicative of the loop iteration count is stored in a Branch History Register (BHR). In one embodiment, the multi-bit value may comprise the actual loop count, in which case the number of bits is variable. In another embodiment, the number of bits is fixed (e.g., two) and loop iteration counts are mapped to one of a fixed number of multi-bit values (e.g., four) by comparison to thresholds. Separate iteration counts may be maintained for nested loops, and a multi-bit value stored in the BHR may indicate a loop iteration count of only an inner loop, only the outer loop, or both.
    Type: Grant
    Filed: March 17, 2006
    Date of Patent: December 2, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: James Norris Dieffenderfer, Bohuslav Rychlik
  • Patent number: 8904156
    Abstract: A multithreaded microprocessor includes an instruction fetch unit including a perceptron-based conditional branch prediction unit configured to provide, for each of one or more concurrently executing threads, a direction branch prediction. The conditional branch prediction unit includes a plurality of storages each including a plurality of entries. Each entry may be configured to store one or more prediction values. Each prediction value of a given storage may correspond to at least one conditional branch instruction in a cache line. The conditional branch prediction unit may generate a separate index value for accessing each storage by generating a first index value for accessing a first storage by combining one or more portions of a received instruction fetch address, and generating each other index value for accessing the other storages by combining the first index value with a different portion of direction branch history information.
    Type: Grant
    Filed: October 14, 2009
    Date of Patent: December 2, 2014
    Assignee: Oracle America, Inc.
    Inventors: Manish K. Shah, Gregory F. Grohoski, Robert T. Golla, Jama I. Barreh
  • Publication number: 20140351568
    Abstract: Disclosed are an opportunistic multi-thread method and processor, the method comprising the following steps: if a zeroth thread, a first thread, a second thread and a third thread all have instructions ready to be executed, then a zeroth clock period, a first clock period, a second clock period and a third clock period are respectively allocated to the zeroth thread, the first thread, the second thread and the third thread; if one of the threads cannot issue an instruction within a specified clock period because the instruction is not ready, and the previous thread still has an instruction ready to be executed after issuing certain instructions in the previous specified clock period, then the previous thread will take the specified clock period.
    Type: Application
    Filed: November 15, 2012
    Publication date: November 27, 2014
    Inventor: Shenghong Wang
  • Publication number: 20140344558
    Abstract: A system and method for efficient branch prediction. A processor includes a next fetch predictor to generate a fast branch prediction for branch instructions at an early pipeline stage. The processor also includes a main return address stack (RAS) at a later pipeline stage for predicting the target of return instructions. When a return instruction is encountered, the prediction from the next fetch predictor is replaced by the top of the main RAS. If there are any recent call or return instructions in flight toward the main RAS, then a separate prediction is generated by a mini-RAS.
    Type: Application
    Filed: May 14, 2013
    Publication date: November 20, 2014
    Applicant: Apple Inc.
    Inventors: Douglas C. Holman, Ramesh B. Gunna, Conrado Blasco-Allue
  • Publication number: 20140337605
    Abstract: A mechanism for reducing power consumption of a cache memory of a processor includes a processor with a cache memory that stores instruction information for one or more instruction fetch groups fetched from a system memory. The cache memory may include a number of ways that are each independently controllable. The processor also includes a way prediction unit. The way prediction unit may enable, in a next execution cycle, a given way within which instruction information corresponding to a target of a next branch instruction is stored in response to a branch taken prediction for the next branch instruction. The way prediction unit may also, in response to the branch taken prediction for the next branch instruction, enable, one at a time, each corresponding way within which instruction information corresponding to respective sequential instruction fetch groups that follow the next branch instruction are stored.
    Type: Application
    Filed: May 7, 2013
    Publication date: November 13, 2014
    Applicant: Apple Inc.
    Inventors: Ronald P. Hall, Conrado Blasco-Allue
  • Patent number: 8886920
    Abstract: A processor configured to facilitate transfer and storage of predicted targets for control transfer instructions (CTIs). In certain embodiments, the processor may be multithreaded and support storage of predicted targets for multiple threads. In some embodiments, a CTI branch target may be stored by one element of a processor and a tag may indicate the location of the stored target. The tag may be associated with the CTI rather than associating the complete target address with the CTI. When the CTI reaches an execution stage of the processor, the tag may be used to retrieve the predicted target address. In some embodiments using a tag to retrieve a predicted target, CTI instructions from different processor threads may be interleaved without affecting retrieval of predicted targets.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: November 11, 2014
    Assignee: Oracle International Corporation
    Inventors: Christopher H. Olson, Manish K. Shah
  • Patent number: 8874885
    Abstract: Embodiments relate to mitigation of lookahead branch predication latency. An aspect includes receiving an instruction address in an instruction cache for fetching instructions in a microprocessor pipeline. Another aspect includes receiving the instruction address in a branch presence predictor coupled to the microprocessor pipeline. Another aspect includes determining, by the branch presence predictor, presence of a branch instruction in the instructions being fetched, wherein the branch instruction is predictable by the branch target buffer, and any indication of the instruction address not written to the branch target buffer is also not written to the branch presence predictor. Another aspect includes, based on receipt of an indication that the branch instruction is present from the branch presence predictor, holding the branch instruction.
    Type: Grant
    Filed: February 12, 2008
    Date of Patent: October 28, 2014
    Assignee: International Business Machines Corporation
    Inventors: James J. Bonanno, David S. Hutton, Brian R. Prasky, Anthony Saporito
  • Publication number: 20140317390
    Abstract: A data processing apparatus executes call instructions, and after a sequence of instructions executed in response to a call instruction a return instruction causes the program flow to return to a point in the program sequence associated with that call instruction. The data processing apparatus is configured to speculatively execute instructions in dependence on a predicted outcome of earlier instructions and a return address prediction unit is configured to store return addresses associated with unresolved call instructions. The return address prediction unit comprises: a stack portion onto which return addresses associated with unresolved call instructions are pushed, and from which a return address is popped when a return instruction is speculatively executed; and a buffer portion which stores an entry for each unresolved call instruction executed and for each return instruction which is speculatively executed.
    Type: Application
    Filed: April 18, 2013
    Publication date: October 23, 2014
    Applicant: ARM Limited
    Inventor: ARM Limited
  • Patent number: 8862861
    Abstract: Techniques are disclosed relating to a processor that is configured to execute control transfer instructions (CTIs). In some embodiments, the processor includes a mechanism that suppresses results of mispredicted younger CTIs on a speculative execution path. This mechanism permits the branch predictor to maintain its fidelity, and eliminates spurious flushes of the pipeline. In one embodiment, a misprediction bit is be used to indicate that a misprediction has occurred, and younger CTIs than the CTI that was mispredicted are suppressed. In some embodiments, the processor may be configured to execute instruction streams from multiple threads. Each thread may include a misprediction indication. CTIs in each thread may execute in program order with respect to other CTIs of the thread, while instructions other than CTIs may execute out of program order.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: October 14, 2014
    Assignee: Oracle International Corporation
    Inventors: Christopher H. Olson, Manish K. Shah
  • Publication number: 20140297996
    Abstract: A processor includes storage elements to store a first and second value, as well as a plurality of hash units coupled to the storage elements. Each hash unit performs a hash operation using the first value and the second value to generate a corresponding hash result value. The processor further includes selection logic to select a hash result value from the hash result values generated by the plurality of hash units responsive to a selection input generated from another hash operation performed using the first value and the second value. A method includes predicting whether a branch instruction is taken based on a prediction value stored at an entry of a branch prediction table indexed by an index value selected from a plurality of values concurrently generated from an address value of the branch instruction and a branch history value representing a history of branch directions at the processor.
    Type: Application
    Filed: April 1, 2013
    Publication date: October 2, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Ramkumar Jayaseelan
  • Patent number: 8850167
    Abstract: Provided is a processor including an instruction issue unit that issues a vector load instruction read from a main memory based on branch target prediction of a branch target in a branch instruction, a data acquisition unit that starts issue of a plurality of acquisition requests for acquiring a plurality of vector data based on the issued vector load instruction from the main memory, a determination unit that determines a success or a failure of the branch target prediction after the branch target is determined, and a vector load management unit that, when the branch target prediction is determined to be a success, acquires all vector data based on the plurality of acquisition requests and then transfers all the vector data to a vector register, and, when the branch target prediction is determined to be a failure, discards the vector data acquired by the issued acquisition requests.
    Type: Grant
    Filed: September 22, 2011
    Date of Patent: September 30, 2014
    Assignee: NEC Corporation
    Inventor: Masao Fukagawa
  • Publication number: 20140281440
    Abstract: An example method of storing a partial target address in an instruction cache includes receiving a branch instruction. The method also includes predicting a direction of the branch instruction as being not taken. The method further includes calculating a destination address based on executing the branch instruction. The method also includes determining a partial target address using the destination address. The method further includes in response to the predicted direction of the branch instruction changing from not taken to taken, replacing an offset in an instruction cache with the partial target address.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Applicant: QUALCOMM INCORPORATED
    Inventors: Jiajin Tu, Suresh K. Venkumahanti, Brian R. Mestan
  • Publication number: 20140281441
    Abstract: Methods and indirect branch predictor logic units to predict the target addresses of indirect branch instructions. The method comprises storing in a table predicted target addresses for indirect branch instructions indexed by a combination of the indirect path history for previous indirect branch instructions and the taken/not-taken history for previous conditional branch instructions. When a new indirect branch instruction is received for prediction, the indirect path history and the taken/not-taken history are combined to generate an index for the indirect branch instruction. The generated index is then used to identify a predicted target address in the table. If the identified predicted target address is valid, then the target address of the indirect branch instruction is predicted to be the predicted target address.
    Type: Application
    Filed: January 31, 2014
    Publication date: September 18, 2014
    Applicant: IMAGINATION TECHNOLOGIES, LTD.
    Inventor: Manouk MANOUKIAN
  • Publication number: 20140281439
    Abstract: Methods and apparatuses for optimizing hard-to-predict short forward branches. A method detects a forward conditional branch with at least one instruction between the forward conditional branch and forward conditional branch target. The method determines whether a first of the at least one instruction includes at least one of a conditional branch or a condition-code setter. If the first instruction does not have the at least one of a conditional branch or a condition-code setter, the first instruction is dynamically assigned an inverted condition to optimize a code path. The method determines if there is a next instruction between the forward conditional branch and its target. If there is, the method analyzes the next instruction. If there is no next instruction, the method executes the optimized code path. If the instruction includes the conditional branch or condition-code setter, it discards dynamic assignments and executes the detected forward conditional branch.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Applicant: QUALCOMM INCORPORATED
    Inventors: Vimal K. Reddy, Niket K. Choudhary, Michael William Morrow
  • Publication number: 20140258696
    Abstract: Systems and methods for predicting an indirect branch target address. A strided target address predictor (STAP) system can observe a striding pattern from a previous indirect branch target. The system can predict a target address based on the observed striding pattern. The system can initialize a confidence counter. The system can determine a previous indirect branch target address. The system can determine a predicted target address.
    Type: Application
    Filed: March 5, 2013
    Publication date: September 11, 2014
    Applicant: QUALCOMM INCORPORATED
    Inventor: Shekhar S. Srikantaiah
  • Patent number: 8832418
    Abstract: A microprocessor includes a branch target address cache (BTAC), each entry thereof configured to store branch prediction information for at most N branch instructions. An execution unit executes a branch instruction previously fetched in a fetch quantum. Update logic determines whether the BTAC is already storing information for N branch instructions within the fetch quantum (N is at least two), updates the BTAC for the branch instruction if the BTAC is not already storing information for N branch instructions, determines whether a type of the branch instruction has a higher replacement priority than a type of the N branch instructions if the BTAC is already storing information for N branch instructions, and updates the BTAC for the branch instruction if the type of the branch instruction has a higher replacement priority than the type of the N branch instructions already stored in the BTAC.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: September 9, 2014
    Assignee: VIA Technologies, Inc.
    Inventor: Thomas C. McDonald
  • Patent number: 8812826
    Abstract: In one implementation, processor testing may include the ability to randomly generate a first plurality of branch instructions for a first portion of an instruction set, each branch instruction in the first portion branching to a respective instruction in a second portion of the instruction set, the branching of the branch instructions to the respective instructions being arranged in a sequential manner. Processor testing may also include the ability to randomly generate a second plurality of branch instructions for the second portion of the instruction set, each branch instruction in the second portion branching to a respective instruction in the first portion of the instruction set, the branching of the branch instructions to the respective instructions being arranged in a sequential manner. Processor testing may additionally include the ability to generate a plurality of instructions to increment a counter when each branch instruction is encountered during execution.
    Type: Grant
    Filed: October 20, 2010
    Date of Patent: August 19, 2014
    Assignee: International Business Machines Corporation
    Inventors: Abhishek Bansal, Nitin Gupta, Brad L. Herold, Jayakumar N Sankarannair
  • Publication number: 20140229719
    Abstract: A branch prediction unit BPU (500) for prediction of a next taken branch instruction in a processing unit (100). The BPU (500) comprises a pattern history memory (504) comprising branch source addresses and branch indicators; a branch target buffer (506) comprising branch targets; and branch prediction logical circuit (502). By means of a search PC, the circuit finds in the memory a branch indicator indicating a predicted taken branch instruction. The circuit selects a first found branch indicator as an indication of a first predicted taken branch instruction. Using the first found branch indicator, the circuit retrieves from the memory, a branch source address of the first predicted taken branch instruction. When the retrieved branch source address is the branch source address nearest to the search PC, the circuit outputs as next PC a branch target retrieved from the buffer. Then the prediction stops.
    Type: Application
    Filed: July 16, 2012
    Publication date: August 14, 2014
    Applicant: Ericsson Moderns SA
    Inventors: Jean-Paul Smeets, Erik Rijshouwer
  • Patent number: 8806184
    Abstract: A branch prediction circuit includes: a memory for storing information representing a branch instruction and a branch prediction; a control circuit for controlling rewriting information in the memory in accordance with a result of determining whether or not a predicted branch has been taken, and determining an attribute of the predicted branch from a branch condition set by the branch instruction and the predicted branch that has been taken, if the predicted branch has been taken; and a rewriting circuit rewriting the information in the memory under the control of the control circuit.
    Type: Grant
    Filed: March 21, 2011
    Date of Patent: August 12, 2014
    Assignee: Fujitsu Limited
    Inventor: Yoshimasa Takebe
  • Patent number: 8788797
    Abstract: A branch predictor for use in a processor includes a Level 1 branch predictor, a Level 2 branch predictor, a match determining circuit, and an override determining circuit. The Level 1 branch predictor generates a Level 1 branch prediction. The Level 2 branch predictor generates a Level 2 branch prediction. The match determining circuit determines whether the Level 1 and Level 2 branch predictions match. The override determining circuit determines whether to override the Level 1 branch prediction with the Level 2 branch prediction. The Level 1 branch prediction is used when the Level 1 and Level 2 branch predictions match or when the Level 1 and Level 2 branch predictions do not match and the Level 1 branch prediction is not overridden. The Level 2 branch prediction is used when the Level 1 and Level 2 branch predictions do not match and the Level 1 branch prediction is overridden.
    Type: Grant
    Filed: December 22, 2010
    Date of Patent: July 22, 2014
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Trivikram Krishnamurthy, Anthony Jarvis
  • Publication number: 20140201509
    Abstract: Methods and branch predictors for predicting a target location of a jump table switch statement in a program. The method includes continuously monitoring instructions at the branch predictor to determine if they write to registers used to store an input variable to a jump table switch statement. Any update to a monitored register is stored in a register table maintained by the branch predictor. Then when it comes time to make a prediction for a jump table switch statement instruction the branch predictor uses the register value stored in the table is used to predict where the jump table switch statement will branch to.
    Type: Application
    Filed: January 13, 2014
    Publication date: July 17, 2014
    Applicant: IMAGINATION TECHNOLOGIES, LTD.
    Inventor: Hugh JACKSON
  • Publication number: 20140201507
    Abstract: A processor employs one or more branch predictors to issue branch predictions for each thread executing at an instruction pipeline. Based on the branch predictions, the processor determines a branch prediction confidence for each of the executing threads, whereby a lower confidence level indicates a smaller likelihood that the corresponding thread will actually take the predicted branch. Because speculative execution of an untaken branch wastes resources of the instruction pipeline, the processor prioritizes threads associated with a higher confidence level for selection at the stages of the instruction pipeline.
    Type: Application
    Filed: January 11, 2013
    Publication date: July 17, 2014
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ramkumar Jayaseelan, Ravindra N. Bhargava