Reducing An Impact Of A Stall Or Pipeline Bubble Patents (Class 712/219)
  • Patent number: 11853762
    Abstract: Systems, apparatuses and methods are disclosed for efficient management of registers in a graph stream processing (GSP) system. The GSP system includes a thread scheduler module operative to initiate a Single Instruction Multiple Data (SIMD) thread, the SIMD thread including a dispatch mask with an initial value. A thread arbiter module is operative to select an instruction from the instructions and provide the instruction to each of one or more compute resources, and an instruction iterator module, associated with each of the one or more compute resources, is operative to determine a data type of the instruction. The instruction iterator module iteratively executes the instruction based on the data type and the dispatch mask.
    Type: Grant
    Filed: May 20, 2022
    Date of Patent: December 26, 2023
    Assignee: Blaize, Inc.
    Inventors: Kamaraj Thangam, Srinivasulu Nagisetty, Venkata Divya Bharathi Palaparthy, Aswathy Asok, Satyaki Koneru
  • Patent number: 11645083
    Abstract: A system and method for reducing pipeline latency. In one embodiment, a processing system includes a processing pipeline. The processing pipeline includes a plurality of processing stages. Each stage is configured to further processing provided by a previous stage. A first of the stages is configured to perform a first function in a pipeline cycle. A second of the stages is disposed downstream of the first of the stages, and is configured to perform, in a pipeline cycle, a second function that is different from the first function. The first of the stages is further configured to selectably perform the first function and the second function in a pipeline cycle, and bypass the second of the stages.
    Type: Grant
    Filed: August 23, 2013
    Date of Patent: May 9, 2023
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Christian Wiencke, Shrey Sudhir Bhatia, Jeroen Vilegen
  • Patent number: 11243775
    Abstract: In one embodiment, an apparatus includes: a plurality of registers; a first instruction queue to store first instructions; a second instruction queue to store second instructions; a program order queue having a plurality of portions each associated with one of the plurality of registers, each of the portions having entries to store a state of an instruction, the state comprising an encoding of a use of the register by the instruction and a source instruction queue for the instruction; and a dispatcher to dispatch for execution the first and second instructions from the first and second instruction queues based at least in part on information stored in the program order queue, to manage instruction dependencies between the first instructions and the second instructions. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: February 8, 2022
    Assignee: Intel Corporation
    Inventors: Andrey Ayupov, Srikanth T. Srinivasan, Jonathan D. Pearce, David B. Sheffield
  • Patent number: 11231925
    Abstract: A data processing device has an instruction decoder, a control logic unit, and an ALU. The instruction decoder decodes instruction codes of an arithmetic instruction. The control logic unit detects the effective data width of operation data to be processed according to the decode result from the instruction decoder and determines the number of cycles for the instruction execution corresponding to the effective data width. The ALU executes the instruction with the number of cycles of the instruction execution determined by the control logic unit.
    Type: Grant
    Filed: January 21, 2020
    Date of Patent: January 25, 2022
    Assignee: RENESAS ELECTRONICS CORPORATION
    Inventors: Sugako Ohtani, Hiroyuki Kondo
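    Illustration: The idea in the abstract above, choosing the cycle count from the effective data width of the operands, can be sketched as follows. This is only a minimal model under stated assumptions: the 8-bit ALU slice, the 32-bit word, the unsigned operands, and the function names `effective_width` and `execution_cycles` are all hypothetical and do not come from the patent.

```python
def effective_width(value: int, word_bits: int = 32) -> int:
    """Return how many low-order bits an unsigned operand actually uses."""
    if not 0 <= value < (1 << word_bits):
        raise ValueError("operand does not fit in the word")
    # A zero operand is treated as occupying one bit.
    return max(1, value.bit_length())


def execution_cycles(a: int, b: int, slice_bits: int = 8, word_bits: int = 32) -> int:
    """Cycles needed if the ALU processes `slice_bits` bits per cycle (assumed)."""
    width = max(effective_width(a, word_bits), effective_width(b, word_bits))
    return -(-width // slice_bits)  # ceiling division


if __name__ == "__main__":
    for a, b in [(3, 5), (0x1234, 1), (0xFFFFFFFF, 0)]:
        print(hex(a), hex(b), execution_cycles(a, b))
```

    Under these assumptions, narrow operands finish in one cycle while full-width operands take four, which is the kind of saving the abstract describes.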
  • Patent number: 11201622
    Abstract: The invention provides an apparatus comprising a programmable circuit including a plurality of 2-input 1-output ALUs, and an updating unit updating the programmable circuit according to circuit information, wherein each of the ALUs includes a calculation unit which performs a set type of calculation for two data and outputs a calculation result, a delay unit which delays the two input data in accordance with delay amounts independently set and supplies the delayed data to the calculation unit, and a controller which controls a delay amount for the delay unit and a calculation timing for the calculation unit in accordance with externally set information, wherein the updating unit sets clock gating start timings for a plurality of delay elements of the delay unit if an ALU of interest as a first processing circuit in the programmable circuit inputs final data to be processed.
    Type: Grant
    Filed: January 25, 2021
    Date of Patent: December 14, 2021
    Assignee: CANON KABUSHIKI KAISHA
    Inventors: Kazuma Sakato, Yohei Horikawa
  • Patent number: 11150874
    Abstract: Methods and systems that facilitate automatic generation of Application Programming Interface (API) specification from web traffic. Methods include obtaining a plurality of API requests and responses to the plurality of API requests. Methods include processing these API requests and responses to API requests to identify one or more attributes, such as, for example, variables, query parameters, response status codes, and response schemas. Methods include identifying variables using a tree data structure to represent resource paths. Methods include identifying query parameters based on resource items in resource paths. Methods include determining that the API call does not conform to the API specification by comparing one or more attributes of the API call with the attributes of the API specification.
    Type: Grant
    Filed: July 23, 2020
    Date of Patent: October 19, 2021
    Assignee: Google LLC
    Inventors: Alex David Lester, Sibo Liu, Che Liu, Jared Scott Borner, Andrew Marsh Gardiner, Matthew Symonds, Kenneth Chan, Michael Christopher Yara, Terrence Li, Joy Aloysius Thomas, Sri Harsha Vardhan Reddy Chevuru, Tsenguun Tsogbadrakh
  • Patent number: 11080059
    Abstract: A method for reducing firmware size and increasing firmware performance. Core timing control conditions used by a die controller are converted into production ready core timing control conditions, from which firmware instructions are then generated. The production ready core timing control conditions comprise a plurality of fixed core timing control conditions. The firmware instructions are modified to determine core timing control condition values for fixed core timing control conditions before implementing storage operations, to store the core timing control condition values in global condition registers, and to modify references to fixed core timing control conditions to access the values in those global condition registers. Finally, the modified firmware instructions are stored on the die controller, which comprises a microcontroller configured to execute them.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: August 3, 2021
    Assignee: SanDisk Technologies LLC
    Inventors: Sonam Agarwal, Vijay Sukhlal Chinchole, Pavithra Devaraj
  • Patent number: 10963249
    Abstract: A processor, system and/or techniques are disclosed for prefetching data streams in a processor. A prefetcher issues a plurality of requests to pre-fetch data from a stream in a plurality of streams; evaluates a confidence level of at least the first request based on an amount of confirmations observed in the stream; and assigns at least a first more aggressive prefetching ramping mode or a second less aggressive prefetching ramping mode based upon the confidence level of a thread associated with the prefetch request, wherein the prefetcher has one or more probationary states and is configured to transition between the first and second prefetching ramping modes by entering at least one of the probationary states, wherein the prefetcher continues to operate in the first prefetching ramping mode. In another aspect, the prefetcher may transition to the one or more probationary states after a number of cycles.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: March 30, 2021
    Assignee: International Business Machines Corporation
    Inventors: Mohit Karve, Vivek Britto, George W. Rohrbaugh, III, Brian W. Thompto
  • Patent number: 10956166
    Abstract: A data processing apparatus includes obtain circuitry that obtains a stream of instructions. The stream of instructions includes a barrier creation instruction and a barrier inhibition instruction. Track circuitry orders sending each instruction in the stream of instructions to processing circuitry based on one or more dependencies. The track circuitry is responsive to the barrier creation instruction to cause the one or more dependencies to include one or more barrier dependencies in which pre-barrier instructions, occurring before the barrier creation instruction in the stream, are sent before post-barrier instructions, occurring after the barrier creation instruction in the stream, are sent. The track circuitry is also responsive to the barrier inhibition instruction to relax the barrier dependencies to permit post-inhibition instructions, occurring after the barrier inhibition instruction in the stream, to be sent before the pre-barrier instructions.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: March 23, 2021
    Assignees: Arm Limited, The Regents of The University of Michigan
    Inventors: Vaibhav Gogte, Wei Wang, Stephan Diestelhorst, Peter M Chen, Satish Narayanasamy, Thomas Friedrich Wenisch
  • Patent number: 10838731
    Abstract: Branch prediction methods and systems include, for a branch instruction fetched by a processor, indexing a branch identification (ID) table based on a function of a program counter (PC) value of the branch instruction, wherein each entry of the branch ID table comprises at least a tag field, and an accuracy counter. For a tag hit at an entry indexed by the PC value, if a value of the corresponding accuracy counter is greater than or equal to zero, a prediction counter from a prediction counter pool is selected based on a function of the PC value and a load-path history, wherein the prediction counters comprise respective confidence values and prediction values. A memory-dependent branch prediction of the branch instruction is assigned as the prediction value of the selected prediction counter if the associated confidence value is greater than zero, while branch prediction from a conventional branch predictor is overridden.
    Type: Grant
    Filed: September 19, 2018
    Date of Patent: November 17, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Rami Mohammad A. Al Sheikh, Michael Scott McIlvaine, Robert Douglas Clancy, Derek Hower
  • Patent number: 10795695
    Abstract: A method and system that promptly process a webpage window after the window enters an unresponsive state, so that a parent window of the webpage window is not affected or closed, thereby reducing operating resources. The method includes creating a child window corresponding to a parent window and a proxy window corresponding to the child window, setting a parent of the child window to be the proxy window, and setting a parent of the proxy window to be the parent window, where a thread to which the proxy window belongs communicates with a thread to which the child window belongs by using an asynchronous message; determining, based on the proxy thread, a state of the child window; and, in response to determining that the child window is unresponsive, setting the proxy window to have no parent and removing the child window from a current display by removing the proxy window.
    Type: Grant
    Filed: March 28, 2017
    Date of Patent: October 6, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Zi Feng Shang
  • Patent number: 10776127
    Abstract: A pipelined computer processor is presented that reduces data hazards such that high processor utilization is attained. The processor restructures a set of instructions to operate concurrently on multiple pieces of data in multiple passes. One subset of instructions operates on one piece of data while different subsets of instructions operate concurrently on different pieces of data. A validity pipeline tracks the priming and draining of the pipeline processor to ensure that only valid data is written to registers or memory. Pass-dependent addressing is provided to correctly address registers and memory for different pieces of data.
    Type: Grant
    Filed: October 3, 2018
    Date of Patent: September 15, 2020
    Assignee: Micron Technology, Inc.
    Inventors: Neal Andrew Crook, Alan T. Wootton, James Peterson
  • Patent number: 10740102
    Abstract: An apparatus includes an execution unit, an instruction queue, and a control circuit. The control circuit may be configured to activate a plurality of processor threads. Each of the plurality of processor threads may include a respective plurality of instructions. The instruction queue may be configured to issue at least one instruction included in the plurality of processor threads to the execution unit at a first rate. The control circuit may also be configured to track, for a particular processor thread, a period of time from activating the particular processor thread. The instruction queue may be further configured to limit issue of a next instruction for at least one other processor thread to a second rate, based on a comparison of the period of time to a threshold amount of time. The second rate may be lower than the first rate.
    Type: Grant
    Filed: February 24, 2017
    Date of Patent: August 11, 2020
    Assignee: Oracle International Corporation
    Inventors: Munsefar Khaleque, Nathan Sheeley, Mark Greenberg, Matthew Smittle, Paul Jordan
  • Patent number: 10691527
    Abstract: A system on chip (SoC) includes a bus matrix configured to connect a plurality of functional blocks. A monitoring unit is configured to monitor whether a transaction between the functional blocks has a hang or stall and distinguish a functional block that caused a hang or stall from among the functional blocks. A recovery signal generation unit is configured to provide a recovery signal for releasing the hang or stall to at least one of the functional blocks based on the distinguishing by the monitoring unit.
    Type: Grant
    Filed: November 22, 2017
    Date of Patent: June 23, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sueng-Chul Ryu, Bo-Eok Seo
  • Patent number: 10620952
    Abstract: A Set Boolean machine instruction is provided that has associated therewith a result location to be used for a set Boolean operation and a mask. The mask is configured to test a plurality of types of conditions, including simple conditions and composite conditions. The machine instruction is executed, and the executing includes performing a first logical operation between the mask and contents of a selected field to obtain an output. The mask indicates a condition to be tested, and the condition is one type of condition of the plurality of types of conditions. The executing further includes performing a second logical operation on the output to obtain a first value represented as one data type, and placing a result in the result location based on the first value. The result includes a second value of another data type, the other data type being different from the one data type.
    Type: Grant
    Filed: June 24, 2015
    Date of Patent: April 14, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Brett Olsson
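    Illustration: The two-step operation in the abstract above (mask against a condition field, then reduce to a Boolean and store it as another data type) can be sketched roughly as below. The 4-bit field, the bit assignments `LT`, `GT`, `EQ`, the choice of AND followed by an OR-reduction, and the integer 0/1 result are all assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical condition-field encoding (assumed, not from the patent).
LT = 0b1000  # "less than" bit
GT = 0b0100  # "greater than" bit
EQ = 0b0010  # "equal" bit


def set_boolean(mask: int, field: int, field_bits: int = 4) -> int:
    """Test `field` against `mask` and return the outcome as an integer 0 or 1."""
    limit = 1 << field_bits
    if not (0 <= mask < limit and 0 <= field < limit):
        raise ValueError("mask and field must fit in the condition field")
    output = mask & field           # first logical operation: AND with the mask
    first_value = output != 0       # second logical operation: OR-reduce to a bool
    return 1 if first_value else 0  # result stored as a different data type (int)


if __name__ == "__main__":
    print(set_boolean(EQ, EQ))       # simple condition: equal
    print(set_boolean(LT | EQ, GT))  # composite condition: less-than-or-equal
```

    A single-bit mask models a simple condition, and a multi-bit mask such as `LT | EQ` models a composite one.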
  • Patent number: 10622997
    Abstract: An asynchronous circuit which includes a first circuit suitable for receiving, from a first other circuit, a first data input signal, and for generating a first acknowledgement of receipt signal and a first data output signal; a second circuit suitable for receiving, from a second other circuit, a second data input signal, and for generating a second acknowledgement of receipt signal and a second data output signal, the second circuit being functionally equivalent to the first circuit; a comparator suitable for detecting an inconsistency between the first and second data input or output signals; and at least one circuit for pausing an acknowledgement of receipt suitable for preventing the propagation of the first and second acknowledgement of receipt signals towards the first and second other circuits if an inconsistency is detected by the comparator.
    Type: Grant
    Filed: March 10, 2017
    Date of Patent: April 14, 2020
    Assignees: Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Centre National de la Recherche Scientifique
    Inventors: Jeremy Lopes, Grégory Di Pendina
  • Patent number: 10579776
    Abstract: Various aspects of the present disclosed technology relate to techniques for selective conditional stall for speeding up hardware-based circuit verification. A path-breaking circuit device is inserted into a location of a design path configured to generate a stall signal indicating whether a change of signal between a pair of neighboring clock cycles of a clock signal is detected at the location. The stall signal is used to directly or indirectly suppress, when the change of signal between the pair of neighboring clock cycles is detected, the next state updating for state element models in the hardware model of circuit design. The design path is usually the critical design path. The insertion location is usually selected to be a location where the signal does not change frequently.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: March 3, 2020
    Assignee: Mentor Graphics Corporation
    Inventors: Charles W. Selvidge, Ansuman Prusty, Vipul Kulshrestha, Kenneth W. Crouch, Matthew L. Dahl, Laurent Vuillemin
  • Patent number: 10430912
    Abstract: A GPU may be configured to detect and nullify unnecessary instructions. Nullifying unnecessary instructions include overwriting a detected unnecessary instruction with a no operation (NOP) instruction. In another example, nullifying unnecessary instructions may include writing a value to a 1-bit instruction memory. Each bit of the 1-bit instruction memory may be associated with a particular instruction of the draw call. If the 1-bit instruction memory has a true value (e.g., 1), the GPU is configured to not execute the particular instruction.
    Type: Grant
    Filed: February 14, 2017
    Date of Patent: October 1, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Andrew Evan Gruber, Lin Chen
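    Illustration: Both nullification options named in the abstract above can be modeled in a few lines. This is a software sketch only; the instruction strings, the list-based "1-bit instruction memory", and the function names are hypothetical stand-ins for hardware structures the abstract does not detail.

```python
def nullify_with_nops(instructions, unnecessary):
    """Option 1: overwrite each unnecessary instruction with a NOP."""
    return ["NOP" if i in unnecessary else instr
            for i, instr in enumerate(instructions)]


def run_with_skip_bits(instructions, skip_bits):
    """Option 2: keep the program intact and consult one bit per instruction.

    A true bit means the instruction is not executed.
    """
    if len(instructions) != len(skip_bits):
        raise ValueError("need exactly one bit per instruction")
    return [instr for instr, skip in zip(instructions, skip_bits) if not skip]


if __name__ == "__main__":
    program = ["LOAD r0", "MUL r1, r0, r0", "STORE r1", "ADD r2, r0, r0"]
    print(nullify_with_nops(program, {3}))
    print(run_with_skip_bits(program, [0, 0, 0, 1]))
```

    The second variant leaves the instruction stream untouched, so the same draw call could be replayed with a different bit pattern.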
  • Patent number: 10379858
    Abstract: A device for executing conditional instructions is provided. The device includes one or more processors and a memory unit including a plurality of registers storing at least a predicate instruction and a conditional instruction, executable by the one or more processors. Execution of the conditional instructions is predicated on execution results of the predicate instruction. The one or more processors are configured to extract predicate-determining information of the predicate instruction and conditional instruction information of the conditional instruction; predict execution results for the predicate instruction and the conditional instruction based on the predicate-determining information and the conditional instruction information; and execute the predicate instruction and the conditional instruction in parallel, based on the predicted execution results for the predicate instruction and the conditional instruction.
    Type: Grant
    Filed: September 14, 2015
    Date of Patent: August 13, 2019
    Assignee: Spreadtrum Hong Kong Limited
    Inventor: Jeremy Branscome
  • Patent number: 10318245
    Abstract: A device for determining an inverse of an initial value related to a modulus, comprising a unit configured to process an iterative algorithm in a plurality of iterations, wherein an iteration includes two modular reductions and has, as an iteration loop result, values obtained by an iteration loop of an extended Euclidean algorithm.
    Type: Grant
    Filed: May 30, 2012
    Date of Patent: June 11, 2019
    Assignee: Infineon Technologies AG
    Inventor: Wieland Fischer
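    Illustration: For orientation, the textbook extended Euclidean algorithm that the abstract above refers to can be written as below. This is the standard algorithm, not the patented variant: it does not reproduce the iteration with two modular reductions, and the function name `mod_inverse` is chosen here for illustration.

```python
def mod_inverse(a: int, n: int) -> int:
    """Return x with (a * x) % n == 1, using the extended Euclidean algorithm."""
    if n <= 1:
        raise ValueError("modulus must be greater than 1")
    # Invariant for each pair: t_i * a is congruent to r_i modulo n.
    r0, r1 = n, a % n
    t0, t1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if r0 != 1:
        raise ValueError("value is not invertible modulo n")
    return t0 % n


if __name__ == "__main__":
    x = mod_inverse(17, 3120)
    print(x, (17 * x) % 3120)
```

    The function raises `ValueError` when the greatest common divisor of the value and the modulus is not 1, since no inverse exists in that case.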
  • Patent number: 10310896
    Abstract: Various embodiments are generally directed to techniques for job flow processing, such as by ordering the performance of parallel tasks in a job flow to minimize a makespan for the job flow, for instance. Some embodiments are particularly directed to ordering the performance of tasks in a job flow based on computation of one or more independent and dependent metrics for tasks in a job flow. In many embodiments, tasks along a critical path of a job flow may be identified and prioritized using the one or more metrics computed for tasks in the job flow. For example, computing a time remaining until end and/or a longest path to end for each task in a job flow may enable a listing of tasks in the job flow to be ordered in a manner that prioritizes tasks to optimize the makespan for the job flow to be executed.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: June 4, 2019
    Assignee: SAS INSTITUTE INC.
    Inventors: John Michael Kichak, Edward L. Rowe, James Edward Georges, Daniel Thomas Kelly, Glenn Daniel Sidle, Charles Michael Cavalier
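    Illustration: The "longest path to end" metric mentioned in the abstract above can be sketched for a small job flow as follows. The task names, durations, dictionary layout, and the tie-breaking by task name are assumptions for illustration; the abstract does not say how the metrics are stored or combined.

```python
def longest_path_to_end(durations, successors):
    """For each task, return its duration plus the longest chain that follows it.

    Assumes the job flow is acyclic.
    """
    memo = {}

    def visit(task):
        if task not in memo:
            tail = max((visit(s) for s in successors.get(task, ())), default=0)
            memo[task] = durations[task] + tail
        return memo[task]

    for task in durations:
        visit(task)
    return memo


def prioritize(durations, successors):
    """Order tasks so that those with the longest remaining path come first."""
    metric = longest_path_to_end(durations, successors)
    return sorted(durations, key=lambda t: (-metric[t], t))


if __name__ == "__main__":
    durations = {"A": 3, "B": 2, "C": 4, "D": 1}
    successors = {"A": ["C"], "B": ["D"], "C": ["D"]}
    print(longest_path_to_end(durations, successors))
    print(prioritize(durations, successors))
```

    Tasks on the critical path (A, then C in this example) sort to the front, which matches the prioritization the abstract describes.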
  • Patent number: 10169038
    Abstract: A delay facility is provided in which program execution may be delayed until a predefined event occurs, such as a comparison of memory locations results in a true condition, a timeout is reached, an interruption is made pending or another condition exists. The delay facility includes one or more compare and delay machine instructions used to delay execution. The one or more compare and delay instructions may include a 32-bit compare and delay (CAD) instruction and a 64-bit compare and delay (CADG) instruction.
    Type: Grant
    Filed: November 26, 2014
    Date of Patent: January 1, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles W. Gainey, Jr., Dan F. Greiner, Christian Jacobi, Marcel Mitran, Donald W. Schmidt, Timothy J. Slegel
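    Illustration: The delay conditions listed in the abstract above (comparison becomes true, timeout, pending interruption) can be modeled in software as a polling loop. This is only an analogy for the machine instruction: the equality comparison, the callables `read_a`, `read_b` and `interrupt_pending`, and the polling interval are assumptions, not details from the patent.

```python
import time


def compare_and_delay(read_a, read_b, timeout_s,
                      interrupt_pending=lambda: False, poll_s=0.001):
    """Wait until one of the ending conditions holds and report which one."""
    deadline = time.monotonic() + timeout_s
    while True:
        if read_a() == read_b():         # comparison results in a true condition
            return "compare"
        if interrupt_pending():          # an interruption is made pending
            return "interrupt"
        if time.monotonic() >= deadline: # timeout is reached
            return "timeout"
        time.sleep(poll_s)


if __name__ == "__main__":
    print(compare_and_delay(lambda: 1, lambda: 1, timeout_s=0.01))
    print(compare_and_delay(lambda: 1, lambda: 2, timeout_s=0.01))
```

    Returning the reason for waking lets the caller distinguish a satisfied comparison from a timeout.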
  • Patent number: 10120681
    Abstract: A delay facility is provided in which program execution may be delayed until a predefined event occurs, such as a comparison of memory locations results in a true condition, a timeout is reached, an interruption is made pending or another condition exists. The delay facility includes one or more compare and delay machine instructions used to delay execution. The one or more compare and delay instructions may include a 32-bit compare and delay (CAD) instruction and a 64-bit compare and delay (CADG) instruction.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: November 6, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles W. Gainey, Jr., Dan F. Greiner, Christian Jacobi, Marcel Mitran, Donald W. Schmidt, Timothy J. Slegel
  • Patent number: 10114673
    Abstract: A method for scheduling the execution of a computer instruction includes receiving an entitlement processor resource percentage for a logical partition on a computer system. The logical partition is associated with a hardware thread of a processor of the computer system. The entitlement processor resource percentage for the logical partition is stored in a register of the hardware thread associated with the logical partition. An instruction is received from the logical partition of the computer system and the processor dispatches the instruction based on the entitlement processor resource percentage stored in the register of the hardware thread associated with the logical partition.
    Type: Grant
    Filed: September 29, 2016
    Date of Patent: October 30, 2018
    Assignee: International Business Machines Corporation
    Inventors: Nitin Gupta, Mehulkumar J. Patel, Deepak C. Shetty
  • Patent number: 9921850
    Abstract: A method for outputting alternative instruction sequences. The method includes tracking repetitive hits to determine a set of frequently hit instruction sequences for a microprocessor. A frequently miss-predicted branch instruction is identified, wherein the predicted outcome of the branch instruction is frequently wrong. An alternative instruction sequence for the branch instruction target is stored into a buffer. On a subsequent hit to the branch instruction where the predicted outcome of the branch instruction was wrong, the alternative instruction sequence is output from the buffer.
    Type: Grant
    Filed: November 16, 2016
    Date of Patent: March 20, 2018
    Assignee: INTEL CORPORATION
    Inventor: Mohammad Abdallah
  • Patent number: 9824413
    Abstract: Methods and apparatus relating to sort-free threading model for a multi-threaded graphics pipeline are described. In an embodiment, draw requests, corresponding to one or more primitives in an image, are stored in entries of a queue (e.g., in the order received). Each entry remains locked until both a front-end and a back-end of a graphics pipeline have completed one or more operations associated with the draw request. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: November 15, 2014
    Date of Patent: November 21, 2017
    Assignee: Intel Corporation
    Inventors: Jason M. Surprise, Zack S. Waters
  • Patent number: 9817669
    Abstract: A computer processor includes execution logic (having a number of functional units) configured to perform operations that access operand data values stored in a plurality of operand storage elements. Such operand data values include a predefined None operand data value indicative of a missing operand value. The operations include a RETIRE operation specifying a number of operand data values that is intended to be retired in a predefined machine cycle. During execution of the RETIRE operation, zero or more None operand data values are selectively retired in the predefined machine cycle based on the number of operand data values specified by the RETIRE operation and the number of operand data values to be retired as a result of execution of other operations by the execution logic in the predefined machine cycle. Other aspects and software tools are also described and claimed.
    Type: Grant
    Filed: July 13, 2015
    Date of Patent: November 14, 2017
    Assignee: Mill Computing, Inc.
    Inventors: Roger Rawson Godard, Arthur David Kahlich, David Arthur Yost
  • Patent number: 9747165
    Abstract: Systems and methods for recovering a process in an application are disclosed. According to some aspects, a guest process is run within an application executing at a computing device. The guest process stores and processes untrusted content. An embedder process is run within the application and in parallel with the guest process. The embedder process stores and processes trusted content and a guest process state. The guest process state is periodically updated based on asynchronous communication between the guest process and the embedder process. The embedder process receives an indication of an execution failure of the guest process. The guest process is recovered after the execution failure based on the guest process state stored by the embedder process.
    Type: Grant
    Filed: April 23, 2014
    Date of Patent: August 29, 2017
    Assignee: GOOGLE INC.
    Inventor: Fady Samuel
  • Patent number: 9733945
    Abstract: Systems, methods and computer program product provide for pipelining out-of-order instructions. Embodiments comprise an instruction reservation station for short instructions of a short latency type and long instructions of a long latency type, an issue queue containing at least two short instructions of a short latency type, which are to be chained to match a latency of a long instruction of a long latency type, a register file, at least one execution pipeline for instructions of a short latency type and at least one execution pipeline for instructions of a long latency type; wherein results of the at least one execution pipeline for instructions of the short latency type are written to the register file, preserved in an auxiliary buffer, or forwarded to inputs of said execution pipelines. Data of the auxiliary buffer are written to the register file.
    Type: Grant
    Filed: March 2, 2016
    Date of Patent: August 15, 2017
    Assignee: International Business Machines Corporation
    Inventors: Harry Barowski, Tim Niggemeier
  • Patent number: 9733944
    Abstract: A method for outputting reliably predictable instruction sequences. The method includes tracking repetitive hits to determine a set of frequently hit instruction sequences for a microprocessor, and out of that set, identifying a branch instruction having a series of subsequent frequently executed branch instructions that form a reliably predictable instruction sequence. The reliably predictable instruction sequence is stored into a buffer. On a subsequent hit to the branch instruction, the reliably predictable instruction sequence is output from the buffer.
    Type: Grant
    Filed: October 12, 2011
    Date of Patent: August 15, 2017
    Assignee: INTEL CORPORATION
    Inventor: Mohammad Abdallah
  • Patent number: 9678755
    Abstract: A method for outputting alternative instruction sequences. The method includes tracking repetitive hits to determine a set of frequently hit instruction sequences for a microprocessor. A frequently miss-predicted branch instruction is identified, wherein the predicted outcome of the branch instruction is frequently wrong. An alternative instruction sequence for the branch instruction target is stored into a buffer. On a subsequent hit to the branch instruction where the predicted outcome of the branch instruction was wrong, the alternative instruction sequence is output from the buffer.
    Type: Grant
    Filed: October 12, 2011
    Date of Patent: June 13, 2017
    Assignee: Intel Corporation
    Inventor: Mohammad Abdallah
  • Patent number: 9639368
    Abstract: Branch prediction using a correlating event, such as an unconditional branch that calls a routine including the branch, instead of the branch itself, to predict the behavior of the branch. The circumstances in which the branch is employed, and not the actual branch itself, are used to predict how strongly taken or not taken the branch is to behave. An anchor point associated with the branch (e.g., an address of the instruction calling a routine that includes the branch), an address of the branch, and a value that represents the number of selected branch instructions between the anchor point and the branch are used to select information to be used to predict the direction of the branch.
    Type: Grant
    Filed: June 13, 2014
    Date of Patent: May 2, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James J. Bonanno, Richard J. Moore, Brian R. Prasky
  • Patent number: 9632789
    Abstract: Branch prediction using a correlating event, such as an unconditional branch that calls a routine including the branch, instead of the branch itself, to predict the behavior of the branch. The circumstances in which the branch is employed, and not the actual branch itself, are used to predict how strongly taken or not taken the branch is to behave. An anchor point associated with the branch (e.g., an address of the instruction calling a routine that includes the branch), an address of the branch, and a value that represents the number of selected branch instructions between the anchor point and the branch are used to select information to be used to predict the direction of the branch.
    Type: Grant
    Filed: November 25, 2014
    Date of Patent: April 25, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James J. Bonanno, Richard J. Moore, Brian R. Prasky
  • Patent number: 9594684
    Abstract: A method for temporarily storing data and a storage device are provided. The method for temporarily storing data is applied to the storage device, and the storage device includes a source agent and a target agent. The method includes: sending, by the source agent, a data obtaining request to the target agent; receiving, by the source agent, target data that corresponds to the data obtaining request and is returned by the target agent; determining, by the source agent, whether a snooping request that is for the target data and sent by the target agent is received after the data obtaining request is sent and before the target data is received, where the snooping request indicates that the target agent is simultaneously processing an obtaining request from another source agent for the target data; and if the snooping request is received, discarding, by the source agent, the target data.
    Type: Grant
    Filed: June 5, 2015
    Date of Patent: March 14, 2017
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Kejia Lan, Yongbo Cheng, Chenghong He
  • Patent number: 9594861
    Abstract: An improved approach is provided to implement equivalency checking. A check is performed as to whether two designs are equivalent without needing to analyze their outputs on a cycle-by-cycle basis. Instead, the two designs are checked to see if they are equivalent on the transaction-level. This approach abstracts the timing delays between the two designs, which allows verification of data transportation and transformation between the designs.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: March 14, 2017
    Assignee: Cadence Design Systems, Inc.
    Inventors: Antonio Celso Caldeira, Jr., Lawrence Chunkhang Loh, Marcus Vinicius da Mata Gomes
  • Patent number: 9529594
    Abstract: A multi-threaded processor configured to allocate entries in a buffer for instruction cache misses is disclosed. Entries in the buffer may store thread state information for a corresponding instruction cache miss for one of a plurality of threads executable by the processor. The buffer may include dedicated entries and dynamically allocable entries, where the dedicated entries are reserved for a subset of the plurality of threads and the dynamically allocable entries are allocable to a group of two or more of the plurality of threads. In one embodiment, the dedicated entries are dedicated for use by a single thread and the dynamically allocable entries are allocable to any of the plurality of threads. The buffer may store two or more entries for a given thread at a given time. In some embodiments, the buffer may help ensure none of the plurality of threads experiences starvation with respect to instruction fetches.
    Type: Grant
    Filed: November 30, 2010
    Date of Patent: December 27, 2016
    Assignee: Oracle International Corporation
    Inventors: Manish K. Shah, Jama I. Barreh
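    Illustrative sketch of the allocation policy described in the abstract above: each thread owns a few dedicated entries, and a shared pool of dynamically allocable entries can serve any thread. The class name `MissBuffer`, its methods, and the entry counts are assumptions made for illustration; none of them come from the patent.

```python
from typing import Optional


class MissBuffer:
    """Toy model of an instruction-cache miss buffer (hypothetical names)."""

    def __init__(self, num_threads: int, dedicated_per_thread: int, shared_entries: int):
        # Free dedicated entries, tracked per thread.
        self.dedicated_free = [dedicated_per_thread] * num_threads
        # Free dynamically allocable entries, usable by any thread.
        self.shared_free = shared_entries

    def allocate(self, thread: int) -> Optional[str]:
        """Reserve an entry for a miss; return its kind, or None if the thread must wait."""
        if self.dedicated_free[thread] > 0:
            self.dedicated_free[thread] -= 1
            return "dedicated"
        if self.shared_free > 0:
            self.shared_free -= 1
            return "shared"
        return None

    def release(self, thread: int, kind: str) -> None:
        """Return an entry once the miss has been serviced."""
        if kind == "dedicated":
            self.dedicated_free[thread] += 1
        else:
            self.shared_free += 1


buf = MissBuffer(num_threads=2, dedicated_per_thread=1, shared_entries=1)
first = buf.allocate(0)   # thread 0 uses its dedicated entry
second = buf.allocate(0)  # thread 0 takes the only shared entry
third = buf.allocate(0)   # nothing left for thread 0
other = buf.allocate(1)   # thread 1 still has its dedicated entry
```

    In this sketch, thread 0 can hold more than one entry at a time, yet it cannot exhaust the buffer for thread 1, which is the starvation-avoidance property the abstract mentions.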
  • Patent number: 9489206
    Abstract: A method includes suppressing execution of at least one dependent instruction of a first instruction by a processor responsive to an invalid status of an ancestor load instruction associated with the first instruction. A processor includes an instruction pipeline having an execution unit to execute instructions, a load store unit for retrieving data from a memory hierarchy, and a scheduler unit. The scheduler unit selects for execution in the execution unit a first load instruction having at least one dependent instruction linked to the first load instruction for data forwarding from the load store unit and suppresses execution of a second dependent instruction of the first dependent instruction responsive to an invalid status of the first load instruction.
    Type: Grant
    Filed: July 16, 2013
    Date of Patent: November 8, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Francesco Spadini, Michael Achenbach, Emil Talpes, Ganesh Venkataramanan
  • Patent number: 9448879
    Abstract: An apparatus and method are described for detecting and correcting instruction fetch errors within a processor core. For example, in one embodiment, an instruction processing apparatus for detecting and recovering from instruction fetch errors performs the operations of: detecting an error associated with an instruction in response to an instruction fetch operation; and determining if the instruction is from a speculative access, wherein if the instruction is not from a speculative access, then responsively performing one or more operations to ensure that the error does not corrupt an architectural state of the processor core.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: September 20, 2016
    Assignee: INTEL CORPORATION
    Inventors: Theodros Yigzaw, Oded Lempel, Hisham Shafi, Geeyarpuram N. Santhanakrishnan, Jose A. Vargas, Ganapati N Srinivasa, Mohan J Kumar, Larisa Novakovsky, Lihu Rappoport, Chen Koren, Julius Mandelblat, Michael Mishaeli
  • Patent number: 9430280
    Abstract: Methods and systems for task timeouts as a function of input data size are disclosed. A definition of a task is received. The definition of the task indicates a set of input data for the task. A timeout duration for the task is determined based on the set of input data. The timeout duration varies with one or more characteristics of the set of input data. The execution of the task is initiated. The execution of the task is stopped if the execution of the task exceeds the timeout duration.
    Type: Grant
    Filed: February 11, 2013
    Date of Patent: August 30, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Kathryn Marie Shih, Carl Louis Christofferson, Richard Jeffrey Cole, Peter Sirota, Vaibhav Aggarwal
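    Illustrative sketch of a timeout that scales with input size, as the abstract above describes. The linear formula, the constants, and the names `timeout_for` and `run_with_timeout` are assumptions for illustration; the abstract only says the duration varies with characteristics of the input data. The task is modeled as a sequence of steps so it can be stopped cooperatively.

```python
import time
from typing import Callable, Iterable


def timeout_for(input_sizes_bytes: Iterable[int],
                base_s: float = 1.0,
                per_mib_s: float = 0.5) -> float:
    """Hypothetical rule: a fixed base plus an allowance per MiB of input."""
    total_bytes = sum(input_sizes_bytes)
    return base_s + per_mib_s * total_bytes / 2**20


def run_with_timeout(steps: Iterable[Callable[[], None]],
                     timeout_s: float,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Run steps in order; stop and return False once the elapsed time exceeds timeout_s."""
    start = clock()
    for step in steps:
        if clock() - start > timeout_s:
            return False  # execution stopped: the task exceeded its timeout
        step()
    return True


# Two 1 MiB inputs give a 2.0 second budget under the assumed constants.
budget = timeout_for([2**20, 2**20])
```

    The `clock` parameter is injectable so the stopping behavior can be exercised without real delays.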
  • Patent number: 9424441
    Abstract: Disabling communication in a multiprocessor fabric. The multiprocessor fabric may include a plurality of processors and a plurality of communication elements and each of the plurality of communication elements may include a memory. A configuration may be received for the multiprocessor fabric, which specifies disabling of communication paths between one or more of: one or more processors and one or more communication elements; one or more processors and one or more other processors; or one or more communication elements and one or more other communication elements. Accordingly, the multiprocessor fabric may be automatically configured in hardware to disable the communication paths specified by the configuration. The multiprocessor fabric may be operated to execute a software application according to the configuration.
    Type: Grant
    Filed: October 2, 2014
    Date of Patent: August 23, 2016
    Assignee: Coherent Logix, Incorporated
    Inventors: Michael B. Doerr, Carl S. Dobbs, Michael B. Solka, Michael R. Trocino, David A. Gibson
  • Patent number: 9383999
    Abstract: An instruction decoder (14) is responsive to a conditional compare instruction to generate control signals for controlling processing circuitry (4) to perform a conditional compare operation. The conditional compare operation comprises: (i) if a current condition state of the processing circuitry (4) passes a test condition, then performing a compare operation on a first operand and a second operand and setting the current condition state to a result condition state generated during the compare operation; and (ii) if the current condition state fails the test condition, then setting the current condition state to a fail condition state specified by the conditional compare instruction. The conditional compare instruction can be used to represent chained sequences of comparison operations where each individual comparison operation may test a different kind of relation between a pair of operands.
    Type: Grant
    Filed: April 12, 2011
    Date of Patent: July 5, 2016
    Assignee: ARM Limited
    Inventors: David James Seal, Simon John Craske
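    Illustrative software model of the conditional compare operation described in the abstract above. The `Flags` tuple, the 32-bit width, and the function names are assumptions for illustration; this is a sketch of the semantics, not the patented hardware.

```python
from typing import Callable, NamedTuple


class Flags(NamedTuple):
    n: bool  # negative
    z: bool  # zero
    c: bool  # carry (no borrow on subtraction)
    v: bool  # signed overflow


def compare(a: int, b: int, bits: int = 32) -> Flags:
    """Set flags from a - b, as an ordinary compare would."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask
    sign_a, sign_b, sign_r = a >> (bits - 1), b >> (bits - 1), result >> (bits - 1)
    return Flags(n=bool(sign_r), z=result == 0, c=a >= b,
                 v=(sign_a != sign_b) and (sign_r != sign_a))


def conditional_compare(current: Flags, test: Callable[[Flags], bool],
                        a: int, b: int, fail_flags: Flags) -> Flags:
    """If the current flags pass the test, compare a and b; otherwise use fail_flags."""
    if test(current):
        return compare(a, b)
    return fail_flags


def both_equal(x: int, y: int) -> bool:
    """Chained check (x == 1 and y == 2) evaluated without branching on x."""
    flags = compare(x, 1)
    # The fail state has z=False, so a failed first comparison forces "not equal".
    flags = conditional_compare(flags, lambda f: f.z, y, 2,
                                Flags(False, False, False, False))
    return flags.z
```

    The chained example shows why the fail condition state matters: it lets the second comparison carry the result of the first forward, so one final flag test covers the whole sequence.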
  • Patent number: 9207943
    Abstract: In a particular embodiment, a method is disclosed that includes receiving an interrupt at a first thread, the first thread including a lowest priority thread of a plurality of executing threads at a processor at a first time. The method also includes identifying a second thread, the second thread including a lowest priority thread of a plurality of executing threads at a processor at a second time. The method further includes directing a subsequent interrupt to the second thread.
    Type: Grant
    Filed: March 17, 2009
    Date of Patent: December 8, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Erich James Plondke, Lucian Codrescu
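    Illustrative sketch of steering each interrupt to whichever executing thread has the lowest priority at that moment, as in the abstract above. The priority encoding (a smaller number means lower priority), the thread ids, and the idea that the interrupted thread's priority rises while it runs the handler are all assumptions for illustration.

```python
from typing import Dict


def lowest_priority_thread(priorities: Dict[int, int]) -> int:
    """Return the id of the executing thread with the lowest priority value."""
    return min(priorities, key=lambda tid: priorities[tid])


# First time: thread 1 has the lowest priority, so it receives the interrupt.
priorities = {0: 5, 1: 2, 2: 7}
first_target = lowest_priority_thread(priorities)

# Assumed effect of servicing the interrupt: thread 1 now runs at a raised priority.
priorities[first_target] = 9

# Second time: a different thread is now the lowest-priority one,
# so the subsequent interrupt is directed there instead.
second_target = lowest_priority_thread(priorities)
```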
  • Patent number: 9182986
    Abstract: An apparatus is described having an out-of-order instruction execution pipeline. The out-of-order execution pipeline has a first circuit and a second circuit. The first circuit is to hold a pointer to physical storage space where information is kept that cannot yet be confirmed as being free of potential dependencies on the information. The second circuit is to hold the pointer if the pointer existed in the first circuit when a non-speculative region of program code ended and upon retirement of a following speculative overwriter instruction originally coded to overwrite the information.
    Type: Grant
    Filed: December 29, 2012
    Date of Patent: November 10, 2015
    Assignee: Intel Corporation
    Inventors: Ravi Rajwar, David Lim, James Hadley, Matthew Merten, Joseph McMahon, Yury Ilin, Justin Deinlein
  • Patent number: 9170785
    Abstract: Generating a parameter value for an executable statement includes, in a plurality of statements, identifying an input statement that provides input information and an output statement associated with the input statement; wherein the output statement comprises a reference to a temporary data set and another of the plurality of statements also includes a parameter reference to the temporary data set. The method also includes modifying the input information to produce modified input information; and outputting the modified input information to the temporary data set.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: October 27, 2015
    Assignee: CA, Inc.
    Inventor: Lukas Masek
  • Patent number: 9158579
    Abstract: A system and method for prioritized queues are provided. A plurality of queues are organized to enable long-running operations to be directed to a long-running operation queue, while faster operations are directed to a non-long-running operation queue. When an operation request is received, a determination is made whether it is a long-running operation, and, if so, the operation is placed in the long-running operation queue. When the processor core that is executing long-running operations is ready for the next operation, it removes an operation from the long-running operation queue and processes the operation.
    Type: Grant
    Filed: November 10, 2008
    Date of Patent: October 13, 2015
    Assignee: NetApp, Inc.
    Inventor: David Morgan Robles
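    Illustrative sketch of the two-queue arrangement in the abstract above: each incoming operation is classified and placed on either a long-running queue or a queue for faster operations, and a worker dedicated to long-running work pulls only from the former. The class name `OperationQueues`, the classifier, and the operation names are assumptions for illustration.

```python
from collections import deque
from typing import Callable, Deque, Optional


class OperationQueues:
    """Routes operations to a long-running or a fast queue (hypothetical names)."""

    def __init__(self, is_long_running: Callable[[str], bool]):
        self._is_long_running = is_long_running
        self.long_running: Deque[str] = deque()
        self.fast: Deque[str] = deque()

    def submit(self, op: str) -> None:
        """Classify the operation and append it to the matching queue."""
        if self._is_long_running(op):
            self.long_running.append(op)
        else:
            self.fast.append(op)

    def next_long_running(self) -> Optional[str]:
        """Called by the core that executes long-running operations when it is ready."""
        return self.long_running.popleft() if self.long_running else None

    def next_fast(self) -> Optional[str]:
        """Called by the cores that handle faster operations."""
        return self.fast.popleft() if self.fast else None


# Assumed classifier: a fixed set of operation names is treated as long-running.
queues = OperationQueues(lambda op: op in {"scan", "rebuild"})
for op in ["read", "scan", "write", "rebuild"]:
    queues.submit(op)
```

    Keeping the queues separate means a slow operation at the head of one queue never delays the fast operations waiting in the other.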
  • Patent number: 9141547
    Abstract: An atomic transaction includes one or more memory access operations that are completed atomically. A Best-Effort Transaction (BET) system makes its best effort to complete each atomic transaction without guaranteeing completion of all atomic transactions. When an atomic transaction is aborted, BET may provide software with appropriate runtime information such as the cause of the abort. With proper coherence layer enhancements, BET can be implemented efficiently for multiprocessor systems, using caches as buffers for data accessed by atomic transactions. Furthermore, with appropriate fairness support, forward progress can be guaranteed for atomic transactions that incur no buffer overflow.
    Type: Grant
    Filed: January 3, 2008
    Date of Patent: September 22, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Xiaowei Shen
  • Patent number: 9026769
    Abstract: A processor for processing loop instructions can include an instruction reorder structure and a loop processing controller. The instruction reorder structure is configured to store decoded instructions according to program order and issue the decoded instructions for execution out of program order. The loop processing controller is configured to detect a loop in the decoded instructions stored in the instruction reorder structure and cause the instruction reorder structure to reissue the decoded instructions that form the loop for re-execution.
    Type: Grant
    Filed: January 24, 2012
    Date of Patent: May 5, 2015
    Assignee: Marvell International Ltd.
    Inventors: Sujat Jamil, R. Frank O'Bleness, Joseph Delgross, Tom Hameenanttila
  • Publication number: 20150012731
    Abstract: Techniques to control power and processing among a plurality of asymmetric cores. In one embodiment, one or more asymmetric cores are power managed to migrate processes or threads among a plurality of cores according to the performance and power needs of the system.
    Type: Application
    Filed: September 26, 2014
    Publication date: January 8, 2015
    Inventors: Herbert HUM, Eric SPRANGLE, Douglas CARMEAN, Rajesh KUMAR
  • Patent number: 8918625
    Abstract: A processor that executes instructions out of program order is described. In some implementations, a processor detects whether a second memory operation is dependent on a first memory operation prior to memory address calculation. If the processor detects that the second memory operation is not dependent on the first memory operation, the processor is configured to allow the second memory operation to be scheduled. If the processor detects that the second memory operation is dependent on the first memory operation, the processor is configured to prevent the second memory operation from being scheduled until the first memory operation has been scheduled to reduce the likelihood of having to reexecute the second memory operation.
    Type: Grant
    Filed: November 15, 2011
    Date of Patent: December 23, 2014
    Assignee: Marvell International Ltd.
    Inventors: R. Frank O'Bleness, Sujat Jamil, Tom Hameenanttila
  • Patent number: 8893134
    Abstract: A method for identifying a consumer-producer pattern in a multi-threaded application includes obtaining synchronization event data of the multi-threaded application, and identifying the consumer-producer communication pattern from the synchronization event data.
    Type: Grant
    Filed: April 13, 2011
    Date of Patent: November 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Peter F. Sweeney, Qiming Teng, Haichuan Wang, Xiao Zhong