Prefetching A Branch Target (i.e., Look Ahead) Patents (Class 712/237)
  • Patent number: 11928472
    Abstract: Methods and apparatus relating to branch prefetch mechanisms for mitigating front-end branch resteers are described. In an embodiment, circuitry predecodes an entry in a cache to generate a predecoded branch operation. The entry is associated with a cold branch operation, where the cold branch operation corresponds to an operation that is detected for a first time after storage in an instruction cache, and where the cold branch operation remains undecoded because it is stored at a location in a cache line prior to a subsequent location of a branch operation in the cache line. The predecoded branch operation is stored in a Branch Prefetch Buffer (BPB) in response to a cache line fill operation of the cold branch operation in an instruction cache. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: September 26, 2020
    Date of Patent: March 12, 2024
    Assignee: Intel Corporation
    Inventors: Gilles Pokam, Jared Warner Stark, IV, Niranjan Kumar Soundararajan, Oleg Ladin
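The fill-path mechanism above can be sketched in software. This is a hypothetical illustration, not Intel's implementation: on an instruction-cache line fill, branch operations found in the line are predecoded and stashed in a small Branch Prefetch Buffer keyed by address; the capacity, encoding, and eviction policy are all assumptions.

```python
from collections import OrderedDict

class BranchPrefetchBuffer:
    """Toy BPB: a small FIFO-evicting map from branch address to its
    predecoded form (sketch only; the capacity is an assumption)."""
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = OrderedDict()

    def fill(self, addr, predecoded):
        if addr not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        self.entries[addr] = predecoded

    def lookup(self, addr):
        return self.entries.get(addr)

def on_cache_line_fill(line_addr, line_ops, bpb):
    """On an I-cache line fill, predecode any branch op in the line and
    store it in the BPB so a later front-end resteer can use it."""
    for offset, op in enumerate(line_ops):
        if op.startswith("br"):            # crude branch detector (assumed encoding)
            bpb.fill(line_addr + offset, {"kind": op, "offset": offset})
```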
  • Patent number: 11900117
    Abstract: A streaming engine in a system receives a first set of stream parameters into a queue to define a first stream along with an indication of either a queue mode of operation or a speculative mode of operation for the first stream. Acquisition of the first stream then begins. At some point, a second set of stream parameters is received into the queue to define a second stream. When the queue mode of operation was specified for the first stream, the second set of parameters is queued and acquisition of the second stream is delayed until completion of acquisition of the first stream. When the speculative mode of operation was specified for the first stream, acquisition of the first stream is canceled upon receipt of the second set of stream parameters and acquisition of the second stream begins immediately.
    Type: Grant
    Filed: March 26, 2021
    Date of Patent: February 13, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Jonathan (Son) Hung Tran, Joseph Raymond Michael Zbiciak
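The two modes in the abstract can be modeled in a few lines (assumed names and interface, not TI's actual hardware): queue mode defers the second stream until the first completes, while speculative mode cancels the in-flight stream and starts the new one immediately.

```python
class StreamingEngine:
    """Toy model of queue-mode vs. speculative-mode stream submission."""
    def __init__(self):
        self.active = None       # (params, mode) currently being acquired
        self.pending = []        # streams queued behind the active one

    def submit(self, params, mode):
        if self.active is None:
            self.active = (params, mode)
        elif self.active[1] == "queue":
            self.pending.append((params, mode))   # delay until the active stream finishes
        else:  # speculative: cancel the active stream, begin the new one now
            self.active = (params, mode)

    def complete_active(self):
        self.active = self.pending.pop(0) if self.pending else None
```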
  • Patent number: 11880231
    Abstract: Timekeeping on a computing device is deterministically performed by implementing two successive calls to a time function that returns current time based on a continuously running counter that is maintained in one or more cores of the device's CPU. The same fixed time computation parameters are used in each call, with the single variable being a value that is read from the counter. For the initial call to the time function, the processor optimizes the instruction execution by predicting the function's execution path based on observed patterns. As the instructions and data are already cached, and the processor has the results of the prior execution path prediction, the subsequent call executes quickly and predictably relative to the initial call while the processor remains in a working (i.e., non-sleep) state. The series of calls provides a deterministic time computation with improved accuracy by mitigating the unpredictability of processor sleep state delays.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: January 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sarath Madakasira, Keith Loren Mange
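The double-call idea can be mimicked at user level. This is a sketch of the principle only, using Python's monotonic counter rather than the CPU cycle counter the abstract describes: the first call with fixed parameters warms the caches and trains the predictor, and only the second, more predictable call is used.

```python
import time

def deterministic_time_ns(offset_ns=0):
    """Two back-to-back reads with identical fixed parameters; discard the
    first (it absorbs wake-up and cold-cache delays) and keep the second."""
    _ = time.monotonic_ns() + offset_ns        # warm-up call, result ignored
    return time.monotonic_ns() + offset_ns     # fast, predictable second call
```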
  • Patent number: 11836047
    Abstract: Embodiments of a small-file restore process in a deduplication file system, in which restoration requires issuing a read request within an I/O request to the file system. The process places the small files in a prefetch queue such that a combined size of the small files meets or exceeds a size of the prefetch queue as defined by a prefetch horizon. A queue processor issues a read request for the first file in the queue, scans the prefetch queue to find a read request for a file at the prefetch horizon, and prefetches the file at the prefetch horizon. The prefetch queue essentially constitutes a hint from the client that a read I/O is imminent, for purposes of filling the read-ahead cache and avoiding the need to issue a blocking I/O operation.
    Type: Grant
    Filed: October 8, 2021
    Date of Patent: December 5, 2023
    Assignee: Dell Products L.P.
    Inventors: Nitin Madan, Donna Barry Lewis, Kedar Godbole
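The queue-processor loop above might look like this in outline (names and interface are assumptions, not Dell's implementation): each blocking read is paired with a prefetch hint for whichever file sits at the configured horizon distance ahead in the queue.

```python
from collections import deque

def restore_small_files(files, read, prefetch, horizon):
    """Toy restore loop (assumed interface): `files` is a list of
    (name, size) pairs, `read` issues the blocking read for the head of
    the queue, and `prefetch` warms the read-ahead cache for the file
    sitting at the prefetch horizon."""
    queue = deque(files)
    while queue:
        name, _size = queue.popleft()
        # scan the remaining queue for the file at the prefetch horizon
        distance = 0
        for ahead_name, ahead_size in queue:
            distance += ahead_size
            if distance >= horizon:
                prefetch(ahead_name)   # hint: this read I/O is imminent
                break
        read(name)
```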
  • Patent number: 11755367
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented on a computation graph. One of the methods includes: receiving, by a computation graph system, a request to generate a schedule for processing a computation graph; obtaining data representing the computation graph; generating a separator of the computation graph; and generating the schedule to perform the operations represented in the computation graph, wherein generating the schedule comprises: initializing the schedule with zero nodes; for each node in the separator: determining whether the node has any predecessor nodes in the computation graph, when the node has any predecessor nodes, adding the predecessor nodes to the schedule, and adding the node to the schedule; and adding to the schedule each node in each subgraph that is not a predecessor to any node in the separator on the computation graph.
    Type: Grant
    Filed: March 26, 2021
    Date of Patent: September 12, 2023
    Assignee: Google LLC
    Inventors: Erik Nathan Vee, Manish Deepak Purohit, Joshua Ruizhi Wang, Shanmugasundaram Ravikumar, Zoya Svitkina
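A minimal sketch of the separator-based scheduling steps listed above, assuming the graph maps each node to its direct predecessors (the representation and names are illustrative, not Google's):

```python
def add_with_predecessors(graph, node, schedule):
    """Recursively place a node's predecessors on the schedule, then the
    node itself (graph: node -> list of direct predecessor nodes)."""
    for pred in graph.get(node, ()):
        if pred not in schedule:
            add_with_predecessors(graph, pred, schedule)
    if node not in schedule:
        schedule.append(node)

def schedule_with_separator(graph, separator):
    schedule = []                       # initialize the schedule with zero nodes
    for node in separator:
        add_with_predecessors(graph, node, schedule)
    for node in graph:                  # nodes that precede no separator node
        if node not in schedule:
            schedule.append(node)
    return schedule
```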
  • Patent number: 11726544
    Abstract: Aspects of the disclosure provide an apparatus for executing a program that involves a plurality of operators. For example, the apparatus can include an executor and an analyzer. The executor can be configured to execute the program with at least a first one of the operators loaded on a second memory from a first memory that stores the operators and to generate a signal based on a progress of the execution of the program with the first operator. The analyzer can be coupled to the executor, the first memory and the second memory, and configured to load at least a second one of the operators of the program next to the first operator stored in the first memory to the second memory before the executor finishes execution of the program with the first operator based on the signal from the executor and an executing scheme stored in the second memory.
    Type: Grant
    Filed: April 6, 2022
    Date of Patent: August 15, 2023
    Assignee: MEDIATEK INC.
    Inventors: Chih-Hsiang Hsiao, Chia-Feng Hsu
  • Patent number: 11671656
    Abstract: A method is provided. The method includes: obtaining multiple association relationships, wherein each association relationship includes a resolution switching algorithm and a historical freeze rate; determining a historical freeze rate meeting a preset condition from the multiple association relationships; and selecting, according to the historical freeze rate meeting the preset condition, a target resolution switching algorithm from the corresponding association relationship, wherein the target resolution switching algorithm is used for switching a resolution of a video. A system, a computing device, and a computer-program product are also provided.
    Type: Grant
    Filed: July 12, 2021
    Date of Patent: June 6, 2023
    Assignee: Shanghai Bilibili Technology Co., Ltd.
    Inventors: Jianqiang Ding, Zhaoxin Tan
  • Patent number: 11599361
    Abstract: A data processing apparatus is provided. It includes control flow detection prediction circuitry that performs a presence prediction of whether a block of instructions contains a control flow instruction. A fetch queue stores, in association with prediction information, a queue of indications of the instructions and the prediction information comprises the presence prediction. An instruction cache stores fetched instructions that have been fetched according to the fetch queue. Post-fetch correction circuitry receives the fetched instructions prior to the fetched instructions being received by decode circuitry, the post-fetch correction circuitry includes analysis circuitry that causes the fetch queue to be at least partly flushed in dependence on a type of a given fetched instruction and the prediction information associated with the given fetched instruction.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: March 7, 2023
    Assignee: Arm Limited
    Inventors: Jaekyu Lee, Yasuo Ishii, Krishnendra Nathella, Dam Sunwoo
  • Patent number: 11403105
    Abstract: An apparatus has processing circuitry for executing instructions and fetch circuitry for fetching the instructions for execution. When a branch instruction is encountered by the fetch circuitry, it determines subsequent instructions to be fetched in dependence on an initial branch direction prediction for the branch instruction made by branch prediction circuitry. Value prediction circuitry is used to maintain a predicted result value for one or more instructions, and dispatch circuitry maintains a record of pending instructions that have been fetched by the fetch circuitry and are awaiting execution by the processing circuitry, and selects pending instructions from the record for dispatch to the processing circuitry.
    Type: Grant
    Filed: January 26, 2021
    Date of Patent: August 2, 2022
    Assignee: Arm Limited
    Inventors: Vladimir Vasekin, David Michael Bull, Frederic Claude Marie Piry, Alexei Fedorov
  • Patent number: 11314511
    Abstract: A value to be used in register-indirect branching is predicted and concurrently stored in a selected location accessible to one or more instructions. The value may be a target address used by an indirect branch and the selected location may be a hardware register, providing concurrent prediction of branch addresses and the update of register contents.
    Type: Grant
    Filed: November 17, 2017
    Date of Patent: April 26, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 11269641
    Abstract: A data processing apparatus is provided having branch prediction circuitry, the branch prediction circuitry having a Branch Target Buffer, BTB. A fetch target queue receives entries corresponding to a sequence of instruction addresses, at least one of the sequence having been predicted using the branch prediction circuitry. A fetch engine is provided to fetch instruction addresses taken from a top of the fetch target queue whilst a prefetch engine sends a prefetch probe to an instruction cache. The BTB is to detect a BTB miss when attempting to populate a storage slot of the fetch target queue and the BTB triggers issuance of a BTB miss probe to the memory to fetch at least one instruction from the memory to resolve the BTB miss using branch-prediction based prefetching.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: March 8, 2022
    Assignee: THE UNIVERSITY COURT OF THE UNIVERSITY OF EDINBURGH
    Inventors: Rakesh Kumar, Boris Grot, Vijay Nagarajan, Cheng Chieh Huang
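The fetch-directed flow in the abstract can be outlined as follows (a model of the idea under assumed names, not the patented hardware): addresses placed on the fetch target queue trigger prefetch probes to the instruction cache, and a BTB miss while populating the queue triggers a BTB miss probe to memory.

```python
from collections import deque

class FetchDirectedPrefetcher:
    """Toy model: the BTB is a dict mapping an address to the predicted
    next fetch address."""
    def __init__(self, btb, icache_probe, btb_miss_probe):
        self.btb = btb
        self.ftq = deque()               # fetch target queue
        self.icache_probe = icache_probe
        self.btb_miss_probe = btb_miss_probe

    def enqueue(self, addr):
        self.ftq.append(addr)
        self.icache_probe(addr)          # prefetch probe to the instruction cache
        if addr in self.btb:
            return self.btb[addr]        # predicted next address to enqueue
        self.btb_miss_probe(addr)        # fetch from memory to resolve the BTB miss
        return None

    def fetch_next(self):
        return self.ftq.popleft() if self.ftq else None
```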
  • Patent number: 11184281
    Abstract: A packet processing method and apparatus relating to the field of communications technologies are provided, so as to reduce overheads and improve update efficiency. The method includes: receiving a first packet and a second packet; determining a first instruction block; obtaining a first identifier according to the first instruction block and the first packet, and obtaining a second identifier according to the first instruction block and the second packet, wherein a first entry includes a third identifier, and the third identifier is a storage address of a second instruction block; obtaining the third identifier by accessing the first entry indicated by the first identifier, and obtaining the third identifier by accessing the first entry indicated by the second identifier; obtaining the second instruction block according to the third identifier; and processing the first packet and the second packet according to the second instruction block.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: November 23, 2021
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Jingzhou Yu
  • Patent number: 11150904
    Abstract: A value to be used in register-indirect branching is predicted and concurrently stored in a selected location accessible to one or more instructions. The value may be a target address used by an indirect branch and the selected location may be a hardware register, providing concurrent prediction of branch addresses and the update of register contents.
    Type: Grant
    Filed: August 18, 2017
    Date of Patent: October 19, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 11086291
    Abstract: A numerically controlled production system is connected to a numerical controller, which includes a control program with successive program sets, and a look-ahead module which determines therefrom for successive clock-cycle points a movement profile with guidance variables for a movement axis prior to a movement. Subject to a condition, the control program includes program branching with multiple alternative control program sections, and determines which of the alternative control program sections is to be additionally executed subject to the condition. The look-ahead module calculates and stores an alternative movement profile for each alternative control program section prior to an additional movement, and holds the alternative movement profile available for the conditional program branching in order to carry out the additional movement.
    Type: Grant
    Filed: October 11, 2018
    Date of Patent: August 10, 2021
    Assignee: SIEMENS AKTIENGESELLSCHAFT
    Inventors: Thomas Pitz, Ralf Spielmann
  • Patent number: 11086629
    Abstract: An apparatus and a method of operating the same are disclosed. Instruction fetch circuitry is provided to fetch a block of instructions from memory and branch prediction circuitry to generate branch prediction indications for each branch instruction present in the block of instructions. The branch prediction circuitry is responsive to identification of a first conditional branch instruction in the block of instructions that is predicted to be taken to modify a branch prediction indication generated for the first conditional branch instruction to include a subsequent branch status indicator. When there is a subsequent branch instruction after the first conditional branch instruction in the block of instructions that is predicted to be taken the subsequent branch status indicator has a first value, and otherwise the subsequent branch status indicator has a second value. This supports improved handling of a misprediction as taken.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: August 10, 2021
    Assignee: ARM Limited
    Inventors: Yasuo Ishii, Muhammad Umar Farooq, Chris Abernathy
  • Patent number: 10929135
    Abstract: A predicted value to be used in register-indirect branching is predicted. The predicted value is stored in a first selected location and a second selected location accessible to one or more instructions of a computing environment. The storing is performed concurrently with processing a register-indirect branch. Further, the first selected location and the second selected location are in addition to another location used to store an instruction address. The predicted value is used in speculative processing that includes the register-indirect branch.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: February 23, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10908911
    Abstract: A predicted value to be used in register-indirect branching is predicted. The predicted value is stored in a first selected location and a second selected location accessible to one or more instructions of a computing environment. The storing is performed concurrently with processing a register-indirect branch. Further, the first selected location and the second selected location are in addition to another location used to store an instruction address. The predicted value is used in speculative processing that includes the register-indirect branch.
    Type: Grant
    Filed: August 18, 2017
    Date of Patent: February 2, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10884748
    Abstract: Detecting that a sequence of instructions creates an affiliated relationship. A determination is made that a sequence of instructions creates an affiliated relationship. Based on determining that the sequence of instructions creates the affiliated relationship, a sequence of operations is generated. The sequence of operations provides a predicted target address to be included in a selected register and to be used in branching.
    Type: Grant
    Filed: November 17, 2017
    Date of Patent: January 5, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10884745
    Abstract: Detecting that a sequence of instructions creates an affiliated relationship. A determination is made that a sequence of instructions creates an affiliated relationship. Based on determining that the sequence of instructions creates the affiliated relationship, a sequence of operations is generated. The sequence of operations provides a predicted target address to be included in a selected register and to be used in branching.
    Type: Grant
    Filed: August 18, 2017
    Date of Patent: January 5, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10747535
    Abstract: Systems, apparatuses, and methods for processing load instructions are disclosed. A processor includes at least a data cache and a load queue for storing load instructions. The load queue includes poison indicators for load instructions waiting to reach non-speculative status. When a non-cacheable load instruction is speculatively executed, then the poison bit is automatically set for the load instruction. If a cacheable load instruction is speculatively executed, then the processor waits until detecting a first condition before setting the poison bit for the load instruction. The first condition may be detecting a cache line with data for the load instruction being evicted from the cache. If an ordering event occurs for a load instruction with a set poison bit, then the load instruction may be flushed and replayed. An ordering event may be a data barrier or a hazard on an older load targeting the same address as the load.
    Type: Grant
    Filed: July 11, 2016
    Date of Patent: August 18, 2020
    Assignee: Apple Inc.
    Inventors: Mahesh K. Reddy, Matthew C. Stone
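A rough model of the poison-bit policy described above (assumed structure, not Apple's microarchitecture): non-cacheable speculative loads are poisoned immediately, cacheable ones only once their cache line is evicted, and an ordering event on a poisoned load forces a flush and replay.

```python
class LoadQueueEntry:
    """Speculatively executed load tracked in the load queue."""
    def __init__(self, addr, cacheable):
        self.addr = addr
        self.cacheable = cacheable
        # non-cacheable speculative loads are poisoned immediately
        self.poison = not cacheable

    def on_cache_line_evicted(self, evicted_addr):
        # cacheable loads become poisoned only once their line is evicted
        if self.cacheable and evicted_addr == self.addr:
            self.poison = True

    def on_ordering_event(self):
        # a data barrier or same-address hazard on a poisoned load
        # forces a flush and replay
        return "flush_and_replay" if self.poison else "proceed"
```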
  • Patent number: 10338923
    Abstract: A method for branch prediction, the method comprising: receiving a branch wrong guess instruction having a branch wrong guess instruction address and data including an opcode and a branch target address; determining whether the branch wrong guess instruction was predicted by a branch prediction mechanism; sending the branch wrong guess instruction to an execution unit responsive to determining that the branch wrong guess instruction was predicted by the branch prediction mechanism; and receiving and decoding instructions at the branch target address.
    Type: Grant
    Filed: May 5, 2009
    Date of Patent: July 2, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Philip G. Emma, Allan M. Hartstein, Keith N. Langston, Brian R. Prasky, Thomas R. Puzak, Charles F. Webb
  • Patent number: 10289808
    Abstract: Described herein are embodiments that relate to a method for use in data processing. An embodiment includes providing an arithmetic unit configured to perform any one in a set of operations. An embodiment includes providing a control register configured to hold control data. An embodiment includes providing in the set of operations, a control operation to provide process control, the control operation to operate on an operand that is coupled to the control data. A system for use in data processing is also disclosed having process registers and a control register. Further, a non-transitory computer-readable medium storing instruction code thereon for use in data processing is disclosed. When executed, the code causes a control operation forming part of a set of operations to operate on an operand that is coupled to control data held in a control register.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: May 14, 2019
    Assignee: Infineon Technologies AG
    Inventors: Bala Nagendra Raja Munjuluri, Prakash Nayak
  • Patent number: 10255074
    Abstract: Selective flushing of instructions in an instruction pipeline in a processor back to an execution-determined target address in response to a precise interrupt is disclosed. A selective instruction pipeline flush controller determines if a precise interrupt has occurred for an executed instruction in the instruction pipeline. The selective instruction pipeline flush controller determines if an instruction at the correct resolved target address of the instruction that caused the precise interrupt is contained in the instruction pipeline. If so, the selective instruction pipeline flush controller can selectively flush instructions back to the instruction in the pipeline that contains the correct resolved target address to reduce the amount of new instruction fetching.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: April 9, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Vignyan Reddy Kothinti Naresh, Rami Mohammad Al Sheikh, Harold Wade Cain, III
  • Patent number: 10241800
    Abstract: A split level history buffer in a central processing unit is provided. A history buffer is partitioned into a first portion and a second portion, wherein the first portion includes a first tagged instruction. A result is generated for the first tagged instruction. A determination whether a second tagged instruction is to be stored in the first portion of the history buffer is made. Responsive to the determination that the second tagged instruction is to be stored in the first portion of the history buffer, the first tagged instruction and the generated result for the first tagged instruction is written to the second portion of the history buffer.
    Type: Grant
    Filed: June 16, 2015
    Date of Patent: March 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen, David R. Terry
  • Patent number: 9665374
    Abstract: A method is described. The method includes receiving an instruction, accessing a return cache to load a predicted return target address upon determining that the instruction is a return instruction, searching a lookup table for executable binary code upon determining that the predicted return target address is incorrect, and executing the executable binary code to perform a binary translation.
    Type: Grant
    Filed: December 18, 2014
    Date of Patent: May 30, 2017
    Assignee: Intel Corporation
    Inventors: Koichi Yamada, Ashish Bijlani, Jiwei Lu, Cheng Yan Zhao
  • Patent number: 9569220
    Abstract: A processor uses a prediction unit to predict subsequent instructions of a program to be executed by the processor. Many implementations or combinations of implementations may be used to predict the subsequent instruction of the program. In one embodiment, a branch cache is used to store branch information. A prediction table is used to store prediction information based on the branch. A prediction logic module determines whether a branch is taken or not taken based on the branch information stored in the branch cache and the prediction information stored in the prediction table.
    Type: Grant
    Filed: October 6, 2014
    Date of Patent: February 14, 2017
    Assignee: Synopsys, Inc.
    Inventor: Eino Jacobs
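One common concrete instance of such prediction logic (a textbook sketch, not necessarily the patented design) is a prediction table of 2-bit saturating counters indexed by the branch address:

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters: states 0-1 predict not-taken,
    states 2-3 predict taken (the table size is an arbitrary choice)."""
    def __init__(self, size=1024):
        self.size = size
        self.table = [1] * size          # start weakly not-taken

    def predict(self, pc):
        return self.table[pc % self.size] >= 2

    def update(self, pc, taken):
        i = pc % self.size
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
```

The two-bit hysteresis means a single atypical outcome does not flip a strongly trained prediction, which is what makes this a common baseline in branch prediction.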
  • Patent number: 9477594
    Abstract: A system-in-package semiconductor device with a CPU, a first flash memory configured to store first instructions to be executed by the CPU, and a second flash memory configured to store second instructions to be executed in accordance with a predetermined control instruction included in the first instructions. The semiconductor device determines, prior to the CPU executing the instruction, whether an instruction read out from the first flash memory is a branch instruction, and if it is determined to be the branch instruction, causes the second flash memory to perform a read-out operation using a branch destination address value indicated by the branch instruction, and if a value of a program counter of the CPU matches the branch destination address value, while the second flash memory is in a state of being ready for read-out operation in accordance with the instruction, starts reading out the second instructions from the second flash memory.
    Type: Grant
    Filed: March 13, 2015
    Date of Patent: October 25, 2016
    Assignee: MegaChips Corporation
    Inventor: Takao Kusano
  • Patent number: 9442729
    Abstract: A processing device that minimizes the bandwidth needed to track return targets in an instruction tracing system is disclosed. The processing device includes an instruction fetch unit comprising a return stack buffer (RSB) to predict a target address of a return (RET) instruction corresponding to a call (CALL) instruction. The processing device further includes a retirement unit comprising an instruction tracing module to initiate instruction tracing for instructions executed by the processing device, determine whether the target address of the RET instruction was mispredicted, determine a value of a call depth counter (CDC) maintained by the instruction tracing module, and, when the target address of the RET instruction was not mispredicted and the value of the CDC is greater than zero, generate an indication that the RET instruction branches to the next linear instruction after the corresponding CALL instruction.
    Type: Grant
    Filed: May 9, 2013
    Date of Patent: September 13, 2016
    Assignee: Intel Corporation
    Inventors: Beeman C. Strong, Matthew C. Merten, Tong Li
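The RET-compression rule can be modelled as follows (a sketch under assumed names; the real CDC bound, reset behavior, and trace packet formats are not specified here): correctly predicted returns are emitted without an address payload, saving trace bandwidth.

```python
class ReturnCompressor:
    """Toy instruction-trace model: RETs whose target matches the return
    stack buffer prediction are emitted as a compressed indication."""
    def __init__(self):
        self.rsb = []    # return stack buffer of predicted return targets
        self.cdc = 0     # call depth counter

    def on_call(self, return_addr):
        self.rsb.append(return_addr)
        self.cdc += 1

    def on_ret(self, actual_target):
        predicted = self.rsb.pop() if self.rsb else None
        if predicted == actual_target and self.cdc > 0:
            self.cdc -= 1
            return ("ret_to_next_linear",)        # compressed, no address payload
        self.cdc = 0                              # reset on mispredict (assumption)
        return ("ret_target", actual_target)      # mispredicted: emit the full target
```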
  • Patent number: 9122485
    Abstract: The described embodiments include a processor that executes a vector instruction. In the described embodiments, while dispatching instructions at runtime, the processor encounters a dependency-checking instruction. Upon determining that a result of the dependency-checking instruction is predictable, the processor dispatches a prediction micro-operation associated with the dependency-checking instruction, wherein the prediction micro-operation generates a predicted result vector for the dependency-checking instruction. The processor then executes the prediction micro-operation to generate the predicted result vector. In the described embodiments, when executing the prediction micro-operation to generate the predicted result vector, if a predicate vector is received, the processor sets to zero each element of the predicted result vector for which the predicate vector is active; otherwise, the processor sets every element of the predicted result vector to zero.
    Type: Grant
    Filed: April 19, 2011
    Date of Patent: September 1, 2015
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 9098295
    Abstract: The described embodiments provide a processor that executes vector instructions. In the described embodiments, while dispatching instructions at runtime, the processor encounters an Actual instruction. Upon determining that a result of the Actual instruction is predictable, the processor dispatches a prediction micro-operation associated with the Actual instruction, wherein the prediction micro-operation generates a predicted result vector for the Actual instruction. The processor then executes the prediction micro-operation to generate the predicted result vector. In the described embodiments, when executing the prediction micro-operation to generate the predicted result vector, if a predicate vector is received, generating the predicted result vector comprises setting to true each element of the predicted result vector for which the predicate vector is active; otherwise, it comprises setting every element of the predicted result vector to true.
    Type: Grant
    Filed: April 20, 2011
    Date of Patent: August 4, 2015
    Assignee: APPLE INC.
    Inventor: Jeffry E. Gonion
  • Patent number: 9087095
    Abstract: Database processing using columns to present to a processing unit decompressed column data without changing the underlying row-based database architecture. For some embodiments, a database accelerator is used to efficiently process the columns of a database and output tuples to a processing unit's memory, such that the columns can be quickly processed (with the advantages of a column-based architecture) to create tuples of requested data, but without having to depart from a row-based architecture at the processing unit level or having decompressed data scattered throughout the processing unit's memory.
    Type: Grant
    Filed: June 21, 2012
    Date of Patent: July 21, 2015
    Assignee: International Business Machines Corporation
    Inventors: Jason A. Viehland, John S. Yates, Jr.
  • Patent number: 9043773
    Abstract: Techniques for implementing identification and management of unsafe optimizations are disclosed. A method of the disclosure includes receiving, by a managed runtime environment (MRE) executed by a processing device, a notice of misprediction of optimized code, the misprediction occurring during a runtime of the optimized code, determining, by the MRE, whether a local misprediction counter (LMC) associated with a code region of the optimized code causing the misprediction exceeds a local misprediction threshold (LMT) value, and when the LMC exceeds the LMT value, compiling, by the MRE, native code of the optimized code to generate a new version of the optimized code, wherein the code region in the new version of the optimized code is not optimized.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: May 26, 2015
    Assignee: Intel Corporation
    Inventors: Alejandro M. Vicente, Joseph M. Codina, Christos E. Kotselidis, Carlos Madriles, Raul Martinez
  • Patent number: 9037835
    Abstract: A data processing device includes processing circuitry 20 for executing a first memory access instruction to a first address of a memory device 40 and a second memory access instruction to a second address of the memory device 40, the first address being different from the second address. The data processing device also includes prefetching circuitry 30 for prefetching data from the memory device 40 based on a stride length 70 and instruction analysis circuitry 50 for determining a difference between the first address and the second address. Stride refining circuitry 60 is also provided to refine the stride length based on factors of the stride length and factors of the difference calculated by the instruction analysis circuitry 50.
    Type: Grant
    Filed: October 24, 2013
    Date of Patent: May 19, 2015
    Assignee: ARM Limited
    Inventors: Ganesh Suryanarayan Dasika, Rune Holm
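Refining a stride by common factors reduces, in the simplest reading, to taking the greatest common divisor of the current stride and the newly observed address difference (an illustrative reduction of the abstract, not necessarily the patented circuit):

```python
from math import gcd

def refine_stride(current_stride, first_addr, second_addr):
    """Shrink the prefetch stride to the largest length dividing both the
    current stride and the observed difference between two accesses."""
    difference = abs(second_addr - first_addr)
    if difference == 0:
        return current_stride            # no new information
    return gcd(current_stride, difference)
```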
  • Patent number: 9021241
    Abstract: Embodiments provide methods, apparatus, systems, and computer readable media associated with predicting predicates and branch targets during execution of programs using combined branch target and predicate predictions. The predictions may be made using one or more prediction control flow graphs which represent predicates in instruction blocks and branches between blocks in a program. The prediction control flow graphs may be structured as trees such that each node in the graphs is associated with a predicate instruction, and each leaf associated with a branch target which jumps to another block. During execution of a block, a prediction generator may take a control point history and generate a prediction. Following the path suggested by the prediction through the tree, both predicate values and branch targets may be predicted. Other embodiments may be described and claimed.
    Type: Grant
    Filed: June 18, 2010
    Date of Patent: April 28, 2015
    Assignee: The Board of Regents of The University of Texas System
    Inventors: Douglas C. Burger, Stephen W. Keckler
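The tree-structured prediction control flow graph described above can be sketched as a small data structure: interior nodes stand for predicate instructions, leaves for branch targets, and a string of predicted predicate bits selects a path from root to leaf. The class names and the bit-per-predicate encoding are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    target: int        # branch target block reached if this path is taken

@dataclass
class Node:
    predicate_id: int  # predicate instruction this node predicts
    taken: object      # subtree followed when the predicate is predicted true
    not_taken: object  # subtree followed when predicted false

def predict(root, predicted_bits):
    """Walk the prediction tree: each consumed bit predicts one predicate,
    and the leaf reached yields the predicted branch target."""
    node, path = root, []
    bits = iter(predicted_bits)
    while isinstance(node, Node):
        bit = next(bits)
        path.append((node.predicate_id, bit))
        node = node.taken if bit else node.not_taken
    return path, node.target
```

A single walk thus predicts both the predicate values along the path and the block-exit branch target, matching the abstract's combined prediction.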
  • Patent number: 9020616
    Abstract: In a microcomputer, a single input-signal judgment module in the application layer makes common the judgment processing applied to input signals (status information from controlled objects, or detection information from sensors and the like) that determines whether each of a plurality of applications should request an operation from a controlled object. Introducing this object-oriented architecture into the embedded computer program saves memory and simplifies the apparatus.
    Type: Grant
    Filed: March 8, 2010
    Date of Patent: April 28, 2015
    Assignees: Autonetworks Technologies, Ltd., Sumitomo Wiring Systems, Ltd., Sumitomo Electric Industries, Ltd.
    Inventors: Yuri Kishita, Kazuhito Fujita
  • Patent number: 9015720
    Abstract: A system and method to optimize processor performance and minimize average thread latency by selectively loading a cache when a program state, the resources required for execution of a program, or the program itself changes is described. An embodiment of the invention supports a "cache priming program" that is selectively executed for the first thread/program/sub-routine of each process. Such a program is optimized for situations when instructions and other program data are not yet resident in the cache(s), and/or whenever the resources required for program execution or the program itself change. By pre-loading the cache with the resources required for the instructions of only a first thread, average thread latency is reduced because the resources are already present in the cache.
    Type: Grant
    Filed: January 6, 2009
    Date of Patent: April 21, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Andrew Brown, Brian Emberling
  • Patent number: 9009734
    Abstract: One or more embodiments of the invention provide a computer-implemented method for speculatively executing application event responses. The method includes the steps of identifying one or more event responses that could be issued for execution by an application being executed by a master process; for each event response, generating a child process to execute the event response; determining that a first event response included in the one or more event responses has been issued for execution by the application; committing the child process associated with the first event response as a new master process; and aborting the master process and all child processes other than the child process associated with the first event response.
    Type: Grant
    Filed: March 6, 2012
    Date of Patent: April 14, 2015
    Assignee: AUTODESK, Inc.
    Inventor: Francesco Iorio
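The commit/abort scheme above can be sketched in miniature. This sketch substitutes threads for the patent's child processes and `cancel()` for process abort, and assumes the candidate responses are side-effect-free; the function and parameter names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def speculate(handlers, issued_key):
    """Speculatively run every candidate event response in parallel, then
    commit only the response the application actually issued and discard
    the rest.  Threads stand in for the patent's child processes."""
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(fn) for key, fn in handlers.items()}
        committed = futures[issued_key].result()   # commit the winner
        for key, fut in futures.items():
            if key != issued_key:
                fut.cancel()   # abort losers (best effort; may have run)
    return committed
```

In the patent's process-based formulation, aborting a child discards all of its speculative side effects, which is why separate processes rather than threads carry the speculation.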
  • Patent number: 8978022
    Abstract: Embodiments include systems and methods for reducing instruction cache miss penalties during application execution. Application code is profiled to determine “hot” code regions likely to experience instruction cache miss penalties. The application code can be linearized into a set of traces that include the hot code regions. Embodiments traverse the traces in reverse, keeping track of instruction scheduling information, to determine where an accumulated instruction latency covered by the code blocks exceeds an amount of latency that can be covered by prefetching. Each time the accumulated latency exceeds the amount of latency that can be covered by prefetching, a prefetch instruction can be scheduled in the application code. Some embodiments insert additional prefetches, merge prefetches, and/or adjust placement of prefetches to account for scenarios, such as loops, merging or forking branches, edge confidence values, etc.
    Type: Grant
    Filed: January 10, 2013
    Date of Patent: March 10, 2015
    Assignee: Oracle International Corporation
    Inventors: Spiros Kalogeropulos, Partha Tirumalai
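The reverse traversal described in the abstract can be sketched directly: walk a linearized trace backwards accumulating instruction latency, and each time the accumulator exceeds the latency one prefetch can cover, record a prefetch point and reset. The tuple representation of blocks and the reset-to-zero policy are simplifying assumptions.

```python
def schedule_prefetches(trace, coverable_latency):
    """Walk a linearized trace of (block_id, latency) pairs in reverse;
    whenever accumulated latency exceeds what one prefetch can cover,
    schedule a prefetch at the current block and reset the accumulator."""
    prefetch_points, accumulated = [], 0
    for block_id, latency in reversed(trace):
        accumulated += latency
        if accumulated >= coverable_latency:
            prefetch_points.append(block_id)
            accumulated = 0
    return list(reversed(prefetch_points))   # report in program order
```

A real compiler pass would additionally merge and reposition these prefetches around loops and merging or forking branches, as the abstract notes.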
  • Publication number: 20140372735
    Abstract: The invention relates to a method of prefetching instructions into a microprocessor buffer under software control.
    Type: Application
    Filed: June 14, 2013
    Publication date: December 18, 2014
    Inventors: Muhammad Yasir Qadri, Nadia Nawaz Qadri, Klaus Dieter McDonald-Maier
  • Publication number: 20140372734
    Abstract: A processor, a method and a computer-readable medium for recording branch addresses are provided. The processor comprises hardware registers and first and second circuitry. The first circuitry is configured to store a first address associated with a branch instruction in the hardware registers. The first circuitry is further configured to store a second address that indicates where the processor execution is redirected to as a result of the branch instruction in the hardware registers. The second circuitry is configured to, in response to a second instruction, retrieve a value of at least one of the registers. The second instruction can be a user-level instruction.
    Type: Application
    Filed: June 12, 2013
    Publication date: December 18, 2014
    Inventors: Joseph Lee Greathouse, Anton Chernoff
  • Patent number: 8886920
    Abstract: A processor configured to facilitate transfer and storage of predicted targets for control transfer instructions (CTIs). In certain embodiments, the processor may be multithreaded and support storage of predicted targets for multiple threads. In some embodiments, a CTI branch target may be stored by one element of a processor and a tag may indicate the location of the stored target. The tag may be associated with the CTI rather than associating the complete target address with the CTI. When the CTI reaches an execution stage of the processor, the tag may be used to retrieve the predicted target address. In some embodiments using a tag to retrieve a predicted target, CTI instructions from different processor threads may be interleaved without affecting retrieval of predicted targets.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: November 11, 2014
    Assignee: Oracle International Corporation
    Inventors: Christopher H. Olson, Manish K. Shah
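A minimal sketch of the tag-indirection idea above: the full predicted target address is stored once in a table, and only a small slot index (the tag) travels down the pipeline with each control transfer instruction (CTI). The class name, table size, and round-robin allocation are assumptions; the patent only requires that a tag, not the full address, accompany the CTI.

```python
class TargetStore:
    """Store full predicted target addresses; hand each CTI a small tag
    that the execute stage later uses to retrieve the predicted target."""
    def __init__(self, size=16):
        self.slots = [None] * size
        self.next = 0

    def put(self, target):
        tag = self.next                     # allocate a slot round-robin
        self.slots[tag] = target
        self.next = (self.next + 1) % len(self.slots)
        return tag                          # tag travels with the CTI

    def get(self, tag):
        return self.slots[tag]              # execute stage lookup
```

Because each tag is an independent slot, CTIs from different hardware threads can interleave freely in the pipeline without confusing whose predicted target is whose, as the abstract claims.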
  • Patent number: 8832384
    Abstract: A storage proxy receives different abstracted memory access requests that are abstracted from the original memory access requests from different sources. The storage proxy reconstructs the characteristics of the original memory access requests from the abstracted memory access requests and makes prefetch decisions based on the reconstructed characteristics. An inflight table is configured to identify contiguous address ranges formed by an accumulation of sub-address ranges used by different abstracted memory access requests. An operation table is configured to identify the number of times the contiguous address ranges are formed by the memory access operations. A processor is then configured to prefetch the contiguous address ranges for certain corresponding read requests.
    Type: Grant
    Filed: July 29, 2010
    Date of Patent: September 9, 2014
    Assignee: Violin Memory, Inc.
    Inventor: Erik de la Iglesia
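The inflight table's core operation, accumulating sub-address ranges into the contiguous ranges they form, can be sketched as an interval merge. The half-open `(start, end)` convention is an assumption; the patent's operation table would then count how often each merged range recurs before triggering a prefetch.

```python
def merge_ranges(sub_ranges):
    """Accumulate (start, end) sub-ranges into the contiguous address
    ranges they form, as an inflight table would."""
    merged = []
    for start, end in sorted(sub_ranges):
        if merged and start <= merged[-1][1]:
            # Sub-range abuts or overlaps the last range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

An operation table could then be a simple counter keyed by merged range; once a range's count crosses a threshold, the proxy prefetches that whole contiguous range for matching read requests.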
  • Publication number: 20140195788
    Abstract: Embodiments include systems and methods for reducing instruction cache miss penalties during application execution. Application code is profiled to determine “hot” code regions likely to experience instruction cache miss penalties. The application code can be linearized into a set of traces that include the hot code regions. Embodiments traverse the traces in reverse, keeping track of instruction scheduling information, to determine where an accumulated instruction latency covered by the code blocks exceeds an amount of latency that can be covered by prefetching. Each time the accumulated latency exceeds the amount of latency that can be covered by prefetching, a prefetch instruction can be scheduled in the application code. Some embodiments insert additional prefetches, merge prefetches, and/or adjust placement of prefetches to account for scenarios, such as loops, merging or forking branches, edge confidence values, etc.
    Type: Application
    Filed: January 10, 2013
    Publication date: July 10, 2014
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Spiros KALOGEROPULOS, Partha TIRUMALAI
  • Publication number: 20140164748
    Abstract: The present application describes a method and apparatus for prefetching instructions based on predicted branch target addresses. Some embodiments of the method include providing a second cache line to a second cache when a target address for a branch instruction in a first cache line of a first cache is included in the second cache line of the first cache and when the second cache line is not resident in the second cache.
    Type: Application
    Filed: December 11, 2012
    Publication date: June 12, 2014
    Inventor: James D. Dundas
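The condition in the abstract above reduces to a short predicate: when a branch in one line of the first cache targets an address that falls in another line of that same cache, and that second line is absent from the second cache, provide it. The 64-byte line size and the set-based cache model below are assumptions for illustration.

```python
LINE = 64  # assumed cache line size in bytes

def maybe_prefetch(first_cache, second_cache, target_addr):
    """Provide the branch target's cache line to the second cache when it
    is resident in the first cache but not yet in the second."""
    target_line = (target_addr // LINE) * LINE   # align to line boundary
    if target_line in first_cache and target_line not in second_cache:
        second_cache.add(target_line)            # line provided to L2
        return True
    return False
```

The effect is that a predicted-taken branch warms the second cache with its target line before any demand miss occurs there.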
  • Patent number: 8725992
    Abstract: A programming language may include hint instructions that may notify a programming idiom accelerator that a programming idiom is coming. An idiom begin hint exposes the programming idiom to the programming idiom accelerator. Thus, the programming idiom accelerator need not perform pattern matching or other forms of analysis to recognize a sequence of instructions. Rather, the programmer may insert idiom hint instructions, such as an idiom begin hint, to expose the idiom to the programming idiom accelerator. Similarly, an idiom end hint may mark the end of the programming idiom.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: May 13, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20140122846
    Abstract: An integrated circuit 2 incorporates prefetch circuitry 12 for prefetching program instructions from a memory 6. The prefetch circuitry 12 includes a branch target address cache 28. The branch target address cache 28 stores data indicative of branch target addresses of previously encountered branch instructions fetched from the memory 6. For each previously encountered branch instruction, the branch target address cache stores a tag value indicative of a fetch address of that previously encountered branch instruction. The tag values stored are generated by tag value generating circuitry 32, which performs a hashing function upon a portion of the fetch address such that the tag value has a bit length less than the bit length of the portion of the fetch address concerned.
    Type: Application
    Filed: October 31, 2012
    Publication date: May 1, 2014
    Applicant: ARM LIMITED
    Inventors: Vladimir VASEKIN, Allan John SKILLMAN, Chiloda Ashan Senerath PATHIRANE, Jean-Baptiste BRELOT
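The hashing function the abstract describes only needs to map a portion of the fetch address to a shorter tag. One common choice, assumed here rather than taken from the patent, is XOR folding: slice the address portion into tag-width chunks and XOR them together.

```python
def hash_tag(fetch_addr, tag_bits=8, addr_bits=32):
    """Fold a fetch-address portion down to a tag_bits-wide tag by
    XOR-ing successive tag_bits-wide slices (XOR folding; one common
    hashing choice, not mandated by the patent)."""
    mask = (1 << tag_bits) - 1
    tag = 0
    portion = fetch_addr & ((1 << addr_bits) - 1)
    while portion:
        tag ^= portion & mask     # fold in the next slice
        portion >>= tag_bits
    return tag
```

The shorter tag shrinks the branch target address cache at the cost of occasional aliasing, which a BTAC tolerates because a wrong prediction is merely corrected later in the pipeline.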
  • Patent number: 8694759
    Abstract: A method and apparatus to utilize a branch prediction scheme that limits the expenditure of power and the area consumed caused by branch prediction schemes is provided. The method includes accessing a first entry and a second entry of the data structure, wherein each entry stores a portion of a predicted target address, determining the predicted target address using the portion of the predicted target address stored in the first entry and a portion of a branch address of a fetched branch instruction for a fetched branch instruction of a first type, and determining the predicted target address using the portion of the predicted target address stored in the first entry and the portion of the predicted target address stored in the second entry for a fetched branch instruction of a second type.
    Type: Grant
    Filed: November 12, 2010
    Date of Patent: April 8, 2014
    Assignee: Advanced Micro Devices, Inc.
    Inventors: James D. Dundas, Marvin A. Denman
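The two-entry scheme above can be sketched as follows: for the first branch type, the stored low portion is combined with the upper bits of the branch's own address; for the second type, the upper portion also comes from a second table entry. The 16-bit split, the type names, and the parameter names are illustrative assumptions.

```python
def predict_target(entry1_low, entry2_high, branch_addr,
                   branch_type, low_bits=16):
    """Form a predicted target address either from one stored low portion
    plus the branch address's upper bits (branches staying in the same
    upper region), or from two stored portions (branches that do not)."""
    if branch_type == "near":
        high = branch_addr >> low_bits   # reuse branch address upper bits
    else:
        high = entry2_high               # second entry supplies upper bits
    return (high << low_bits) | entry1_low
```

Storing only partial addresses per entry is what yields the power and area savings the abstract claims: most branches land near themselves, so the full upper bits need not be stored for them at all.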
  • Patent number: 8667257
    Abstract: Techniques are disclosed relating to improving the performance of branch prediction in processors. In one embodiment, a processor is disclosed that includes a branch prediction unit configured to predict a sequence of instructions to be issued by the processor for execution. The processor also includes a pattern detection unit configured to detect a pattern in the predicted sequence of instructions, where the pattern includes a plurality of predicted instructions. In response to the pattern detection unit detecting the pattern, the processor is configured to switch from issuing instructions predicted by the branch prediction unit to issuing the plurality of instructions. In some embodiments, the processor includes a replay unit that is configured to replay fetch addresses to an instruction fetch unit to cause the plurality of predicted instructions to be issued.
    Type: Grant
    Filed: November 10, 2010
    Date of Patent: March 4, 2014
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ravindra N. Bhargava, David Suggs, Anthony X. Jarvis
  • Patent number: 8650364
    Abstract: A processing device includes a memory and a processor that generates a plurality of read commands for reading read data from the memory and a plurality of write commands for writing write data to the memory. A prefetch memory interface prefetches prefetch data to a prefetch buffer, retrieves the read data from the prefetch buffer when the read data is included in the prefetch buffer, and retrieves the read data from the memory when the read data is not included in the prefetch buffer, wherein the prefetch buffer is managed via a linked list.
    Type: Grant
    Filed: May 28, 2008
    Date of Patent: February 11, 2014
    Assignee: ViXS Systems, Inc.
    Inventor: Jing Zhang
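The linked-list-managed prefetch buffer above can be modeled with an `OrderedDict`, whose insertion order plays the role of the linked list. The capacity, the FIFO eviction, and the consume-on-hit policy are assumptions for this sketch.

```python
from collections import OrderedDict

class PrefetchBuffer:
    """Prefetch buffer whose entries are kept in a linked-list order
    (OrderedDict insertion order stands in for the patent's linked list)."""
    def __init__(self, memory, capacity=4):
        self.memory = memory          # backing store: addr -> data
        self.buf = OrderedDict()
        self.capacity = capacity

    def prefetch(self, addr):
        if len(self.buf) >= self.capacity:
            self.buf.popitem(last=False)      # unlink the oldest entry
        self.buf[addr] = self.memory[addr]

    def read(self, addr):
        if addr in self.buf:
            return self.buf.pop(addr)         # hit: serve from buffer
        return self.memory[addr]              # miss: fall back to memory
```

Reads that hit the buffer skip the memory access entirely, which is the latency win the abstract describes; misses simply fall through to memory.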
  • Publication number: 20140019736
    Abstract: In accordance with some embodiments of the present invention, a branch prediction unit for an embedded controller may be placed in association with the instruction fetch unit instead of the decode stage. In addition, the branch prediction unit may include no branch predictor. Also, the return address stack may be associated with the instruction decode stage and is structurally separate from the branch prediction unit. In some cases, this arrangement reduces the area of the branch prediction unit, as well as power consumption.
    Type: Application
    Filed: December 30, 2011
    Publication date: January 16, 2014
    Inventors: Xiaowei Jiang, Srihari Makineni, Zhen Fang, Dmitri Pavlov, Ravi Iyer