Reducing An Impact Of A Stall Or Pipeline Bubble Patents (Class 712/219)
  • Publication number: 20120054472
    Abstract: Execution states of tasks are inferred from collection of information associated with runtime execution of a computer system. Collection of information may include infrequent samples of executing tasks, the samples which may provide inaccurate executing states. One or more tasks may be aggregated by one or more execution states for determining execution time, idle time, or system policy violations, or combinations thereof.
    Type: Application
    Filed: September 1, 2010
    Publication date: March 1, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Erik R. Altman, Matthew R. Arnold, Nicholas M. Mitchell
  • Patent number: 8117425
    Abstract: The Thread Data Base 1 holds a thread identifier to uniquely identify a thread in the system. The Check means 3 lets, when no thread being a target exist in the same processor, a trap (TRAP) 10 occur. The Issue means 2, when a thread being a target exists in the same processor, at a time of issuing a subsequent instruction, successively inputs a thread 9 to be executed next, as a thread serving as a target, into a pipeline. The Gate (G) means 11 uses data on the execution of a thread as an input for computation of a thread serving as a succeeding target. The Switch means 13 transfers data in a context of a thread to a context of a target thread without inputting the target thread as a non-executable thread into a pipeline while the thread is being executed.
    Type: Grant
    Filed: April 14, 2008
    Date of Patent: February 14, 2012
    Assignee: NEC Corporation
    Inventor: Hitoshi Takagi
  • Publication number: 20120036338
    Abstract: An extended DRAIN instruction is used to stall processing within a computing environment. The instruction includes an indication of the one or more processing stages at which processing is to be stalled. It also includes a control that allows processing to be stalled for additional cycles, as desired.
    Type: Application
    Filed: October 14, 2011
    Publication date: February 9, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Khary J. Alexander, Fadi Y. Busaba, Mark S. Farrell, Bruce C. Giamei, Timothy J. Slegel, Charles F. Webb
  • Patent number: 8099583
    Abstract: A new signal processor technique and apparatus combining microprocessor technology with switch fabric telecommunication technology to achieve a programmable processor architecture wherein the processor and the connections among its functional blocks are configured by software for each specific application by communication through a switch fabric in a dynamic, parallel and flexible fashion to achieve a reconfigurable pipeline, wherein the length of the pipeline stages and the order of the stages varies from time to time and from application to application, admirably handling the explosion of varieties of diverse signal processing needs in single devices such as handsets, set-top boxes and the like with unprecedented performance, cost and power savings, and with full application flexibility.
    Type: Grant
    Filed: October 6, 2007
    Date of Patent: January 17, 2012
    Assignee: Axis Semiconductor, Inc.
    Inventor: Xiaolin Wang
  • Patent number: 8099582
    Abstract: A mechanism is provided for tracking deallocated load instructions. A processor detects whether a load instruction in a set of instructions in an issue queue has missed. Responsive to a miss of the load instruction, an instruction scheduler allocates the load instruction to a load miss queue and deallocates the load instruction from the issue queue. The instruction scheduler determines whether there is a dependence entry for the load instruction in an issue queue portion of a dependence matrix. Responsive to the existence of the dependence entry for the load instruction in the issue queue portion of the dependence matrix, the instruction scheduler reads data from the dependence entry of the issue queue portion of the dependence matrix that specifies a set of dependent instructions that are dependent on the load instruction and writes the data into a new entry in a load miss queue portion of the dependence matrix.
    Type: Grant
    Filed: March 24, 2009
    Date of Patent: January 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, William E. Burky, Todd A. Venton
  • Publication number: 20110320777
    Abstract: The architecture and techniques described herein can improve system performance with respect to the following. Communication between two interdependent hardware engines, that are part of pipeline, such that the engines are synchronized to consume resources when the engines are done with the work. Reduction of the role of software/firmware from feeding each stage of the hardware pipeline when the previous stage of the pipeline has completed. Reduction in the memory allocation for software-initialized hardware descriptors to improve performance by reducing pipeline stalls due to software interaction.
    Type: Application
    Filed: June 28, 2010
    Publication date: December 29, 2011
    Inventors: DANIEL NEMIROFF, Balaji Vembu, Raul Gutierrez, Suryaprasad Kareenahalli
  • Patent number: 8086826
    Abstract: An information handling system includes a processor with an issue unit (IU) that may perform instruction dependency tracking for successive instruction issue operations. The IU maintains non-shifting issue queue (NSIQ) and shifting issue queue (SIQ) instructions along with relative instruction to instruction dependency information. A mapper maps queue position data for instructions that dispatch to issue queue locations within the IU. The IU may test an issuing producer instruction against consumer instructions in the IU for queue position (QPOS) and register tag (RTAG) matches. A matching consumer instruction may issue in a successive manner in the case of a queue position match or in a next processor cycle in the case of a register tag match.
    Type: Grant
    Filed: March 24, 2009
    Date of Patent: December 27, 2011
    Assignee: International Business Machines Corporation
    Inventors: Mary Douglass Brown, William Elton Burky, Dung Quoc Nguyen, Balaram Sinharoy
  • Patent number: 8082421
    Abstract: A program instruction rearrangement method calculates the dependency depth of each instruction of a program based on dependency between instructions, based on register access order, and rearranging instructions based on the dependency depth. Additionally, the dependency between instructions can be utilized to locate and remove redundant instructions.
    Type: Grant
    Filed: September 4, 2007
    Date of Patent: December 20, 2011
    Assignee: Via Technologies, Inc.
    Inventor: Yi-Peng Chen
  • Publication number: 20110307688
    Abstract: Computer-implemented methods and systems for synthesizing a hardware description for a pipelined datapath for a digital circuit. A transactional datapath specification framework and a transactional design automation system automatically synthesize pipeline implementations. The transactional datapath specification framework captures an abstract datapath, whose execution semantics is interpreted as a sequence of “transactions” where each transaction reads the state values left by the preceding transaction and computes a new set of state values to be seen by the next transaction. The transactional datapath specification framework exposes sufficient information about state accesses that can occur in a datapath, which is necessary for performing precise data hazards analysis, and eventually pipeline synthesis.
    Type: Application
    Filed: June 10, 2011
    Publication date: December 15, 2011
    Applicant: Carnegie Mellon University
    Inventors: Eriko Nurvitadhi, James C. Hoe
  • Patent number: 8078846
    Abstract: A conditional move instruction implemented in a processor by forming and processing two decoded instructions, and applications thereof. In an embodiment, the conditional move instruction specifies a first source operand, a second source operand, and a third operand that is both a source and a destination. If the value of the second operand is not equal to a specified value, the first decoded instruction moves the third operand to a completion buffer register. If the value of the second operand is equal to the specified value, the second decoded instruction moves the value of the first operand to the completion buffer. When the decoded instruction that performed the move graduates, the contents of the completion buffer register is transferred to a register file register specified by the third operand.
    Type: Grant
    Filed: December 18, 2006
    Date of Patent: December 13, 2011
    Assignee: MIPS Technologies, Inc.
    Inventors: Karagada Ramarao Kishore, Xing Yu Jiang, Vidya Rajagopalan, Maria Ukanwa
  • Publication number: 20110296145
    Abstract: Reducing pipeline stall between a compute unit and address unit in a processor can be accomplished by computing results in a compute unit in response to instructions of an algorithm; storing in a local random access memory array in a compute unit predetermined sets of functions, related to the computed results for predetermined sets of instructions of the algorithm; and providing within the compute unit direct mapping of computed results to related function.
    Type: Application
    Filed: August 10, 2011
    Publication date: December 1, 2011
    Inventors: James Wilson, Joshua A. Kablotsky, Yosef Stein, Colm J. Prendergast, Gregory M. Yukna, Christopher M. Mayer
  • Patent number: 8065505
    Abstract: This invention provides flexible load latency to pipeline cache misses. A memory controller selects the output of one of a set of cascades inserted execute stages. This selection may be controlled by a latency field in a load instruction or by a latency specification of a prior instruction. This invention is useful in the great majority of cases where the code can tolerate incremental increases in load latency for a reduction in cache miss penalty.
    Type: Grant
    Filed: August 16, 2007
    Date of Patent: November 22, 2011
    Assignee: Texas Instruments Incorporated
    Inventor: Chris Yoochang Chung
  • Patent number: 8055884
    Abstract: One embodiment of the present invention provides a system for augmenting a pipeline with a bubble-removal circuit. During operation, the system generates a bubble-removal circuit which determines a clock-enable signal based at least on whether an upstream register has valid data and whether the pipeline is stalled. Next, the system gates the clock signal using the clock-enable signal. The augmented pipeline can determine whether a first register contains invalid data, which is associated with a bubble. Next, the augmented pipeline determines whether a second register contains valid data, wherein the second register is adjacent to and upstream from the first register. If the first register contains invalid data and the second register contains valid data, the augmented pipeline replaces the invalid data of the first register with valid data based on the valid data in the second register without propagating the invalid data to a downstream register.
    Type: Grant
    Filed: December 2, 2009
    Date of Patent: November 8, 2011
    Assignee: Synopsys, Inc.
    Inventors: John D. Lofgren, Brett Kobernat
  • Patent number: 8037366
    Abstract: A mechanism is provided for issuing instructions. An instruction dispatch unit receives an instruction for dispatch to one of a plurality of execution units. The instruction dispatch unit analyzes a tag register to determine whether a previous tag associated with a previous instruction has been stored in the tag register. Responsive to the previous tag associated with the previous instruction failing to be stored in the tag register, the instruction dispatch unit storing a tag corresponding to the instruction in the tag register. The instruction dispatch unit dispatches the instruction to an issue queue for issue to the one of the plurality of execution units.
    Type: Grant
    Filed: March 24, 2009
    Date of Patent: October 11, 2011
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Dung Q. Nguyen, Todd A. Venton
  • Patent number: 8037287
    Abstract: An instruction processing pipeline 6 is provided. This has error detection and error recovery circuitry 20 associated with one or more of the pipeline stages. If an error is detected within a signal value within that pipeline stage, then it can be repaired. Part of the error recovery may be to flush upstream program instructions from the instruction pipeline 6. When multi-threading, only those instructions from a thread including an instruction which has been lost as a consequence of the error recovery need to be flushed from the instruction pipeline 6. Instruction can also be selected for flushing in dependence upon characteristics such as privileged level, number of dependent instructions etc.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: October 11, 2011
    Assignee: ARM Limited
    Inventors: Emre Özer, Shidhartha Das, David Michael Bull
  • Patent number: 8028151
    Abstract: A method, system and processor for improving the performance of an in-order processor. A processor may include an execution unit with an execution pipeline that includes a backup pipeline and a regular pipeline. The backup pipeline may store a copy of the instructions issued to the regular pipeline. The execution pipeline may include logic for allowing instructions to flow from the backup pipeline to the regular pipeline following the flushing of the instructions younger than the exception detected in the regular pipeline. By maintaining a backup copy of the instructions issued to the regular pipeline, instructions may not need to be flushed from separate execution pipelines and re-fetched. As a result, one may complete the results of the execution units to the architected state out of order thereby allowing the completion point to vary among the different execution pipelines.
    Type: Grant
    Filed: November 25, 2008
    Date of Patent: September 27, 2011
    Assignee: International Business Machines Corporation
    Inventors: Christopher Michael Abernathy, Jonathan James Dement, Ronald Hall, Albert James Van Norstrand
  • Patent number: 8019974
    Abstract: A bypass circuit is provided in a pipeline processor. A pipeline register is provided between an instruction execution stage and a write-back stage. The pipeline register stores a data validity flag and a WRITE control flag to control writing data into a general purpose register unit. The data retained in the pipeline register is allowed to be written back into the general purpose register unit when the WRITE control flag indicates “valid”. The pipeline register continues to retain the retained data even after the writing of the retained data into the general purpose register unit. The first pipeline register supplies the retained data to the second stage through the bypass circuit at the time of executing a subsequent instruction having data dependency on a preceding instruction.
    Type: Grant
    Filed: January 12, 2009
    Date of Patent: September 13, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Jun Tanabe
  • Patent number: 8006072
    Abstract: A pipelined computer processor is presented that reduces data hazards such that high processor utilization is attained. The processor restructures a set of instructions to operate concurrently on multiple pieces of data in multiple passes. One subset of instructions operates on one piece of data while different subsets of instructions operate concurrently on different pieces of data. A validity pipeline tracks the priming and draining of the pipeline processor to ensure that only valid data is written to registers or memory. Pass-dependent addressing is provided to correctly address registers and memory for different pieces of data.
    Type: Grant
    Filed: May 18, 2010
    Date of Patent: August 23, 2011
    Assignee: Micron Technology, Inc.
    Inventors: Neal Andrew Cook, Alan T. Wootton, James Peterson
  • Patent number: 7984268
    Abstract: An advanced processor comprises a plurality of multithreaded processor cores each having a data cache and instruction cache. A data switch interconnect is coupled to each of the processor cores and configured to pass information among the processor cores. A messaging network is coupled to each of the processor cores and a plurality of communication ports. In one aspect of an embodiment of the invention, the data switch interconnect is coupled to each of the processor cores by its respective data cache, and the messaging network is coupled to each of the processor cores by its respective message station. Advantages of the invention include the ability to provide high bandwidth communications between computer systems and memory in an efficient and cost-effective manner.
    Type: Grant
    Filed: July 23, 2004
    Date of Patent: July 19, 2011
    Assignee: NetLogic Microsystems, Inc.
    Inventors: David T. Hass, Abbas Rashid
  • Patent number: 7975130
    Abstract: A method and system for early instruction text based operand store compare avoidance in a processor are provided. The system includes a processor pipeline for processing instruction text in an instruction stream, where the instruction text includes operand address information. The system also includes delay logic to monitor the instruction stream. The delay logic performs a method that includes detecting a load instruction following a store instruction in the instruction stream, comparing the operand address information of the store instruction with the load instruction. The method also includes delaying the load instruction in the processor pipeline in response to detecting a common field value between the operand address information of the store instruction and the load instruction.
    Type: Grant
    Filed: February 20, 2008
    Date of Patent: July 5, 2011
    Assignee: International Business Machines Corporation
    Inventors: Khary J. Alexander, Fadi Y. Busada, Bruce C. Giamei, David S. Hutton, Chung-Lung Kevin Shum
  • Patent number: 7962730
    Abstract: In one embodiment, a processor comprises a retire unit and a load/store unit coupled thereto. The retire unit is configured to retire a first store memory operation responsive to the first store memory operation having been processed at least to a pipeline stage at which exceptions are reported for the first store memory operation. The load/store unit comprises a queue having a first entry assigned to the first store memory operation. The load/store unit is configured to retain the first store memory operation in the first entry subsequent to retirement of the first store memory operation if the first store memory operation is not complete. The queue may have multiple entries, and more than one store may be retained in the queue after being retired by the retire unit.
    Type: Grant
    Filed: November 25, 2008
    Date of Patent: June 14, 2011
    Assignee: Apple Inc.
    Inventors: Wei-Han Lien, Po-Yung Chang
  • Patent number: 7962726
    Abstract: A pipelined microprocessor configured for long operand instructions is disclosed. The microprocessor includes a memory unit and a load-store unit. The load store unit is coupled to the memory unit and includes a data formatter receiving information from the memory unit and including an operand selector and a shift register portion. The microprocessor also includes an execution unit coupled to the load-store unit and receiving operand information there from. The execution unit includes output latches coupled to a storage location within the execution unit for storing output information from the execution unit.
    Type: Grant
    Filed: March 19, 2008
    Date of Patent: June 14, 2011
    Assignee: International Business Machines Corporation
    Inventors: Edward T. Malley, Khary J. Alexander, Fadi Y. Busaba, Vimal M. Kapadia, Jeffrey S. Plate, John G. Rell, Jr., Chung-Lung Kevin Shum
  • Patent number: 7958340
    Abstract: Software pipelining on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, each IP block adapted to a router through a memory communications controller and a network interface controller, each memory communications controller controlling communication between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers.
    Type: Grant
    Filed: May 9, 2008
    Date of Patent: June 7, 2011
    Assignee: International Business Machines Corporation
    Inventors: Russell D. Hoover, Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
  • Patent number: 7949855
    Abstract: A processor buffers asynchronous threads. Instructions requiring operations provided by a plurality of execution units are divided into phases, each phase having at least one computation operation and at least one memory access operation. Instructions within each phase are qualified and prioritized. The instructions may be qualified based on the status of the execution unit needed to execute one or more of the current instructions. The instructions may also be qualified based on an age of each instruction, status of the execution units, a divergence potential, locality, thread diversity, and resource requirements. Qualified instructions may be prioritized based on execution units needed to execute instructions and the execution units in use. One or more of the prioritized instructions is issued per cycle to the plurality of execution units.
    Type: Grant
    Filed: April 28, 2008
    Date of Patent: May 24, 2011
    Assignee: NVIDIA Corporation
    Inventors: Peter C. Mills, John Erik Lindholm, Brett W. Coon, Gary M. Tarolli, John Matthew Burgess
  • Patent number: 7945765
    Abstract: An electronic apparatus includes a plurality of stages serially interconnected as a pipeline to perform sequential processings on input operands. A shortening circuit associated with at least one stage of the pipeline recognizes when one or more of input operands for the stage has been predetermined as appropriate for shortening and execute the shortening when appropriate.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: May 17, 2011
    Assignee: International Business Machines Corporation
    Inventors: Philip George Emma, Allan Mark Hartstein, Hans Jacobson, William Robert Reohr
  • Patent number: 7934031
    Abstract: An asynchronous logic family of circuits which communicate on delay-insensitive flow-controlled channels with 4-phase handshakes and 1 of N encoding, compute output data directly from input data using domino logic, and use the state-holding ability of the domino logic to implement pipelining without additional latches.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: April 26, 2011
    Assignee: California Institute of Technology
    Inventors: Andrew M. Lines, Alain J. Martin, Uri Cummings
  • Patent number: 7917793
    Abstract: The present invention uses a swing structure to avoid using a clock period at a non-efficient execution time. The execution time is precisely controlled to enhance a performance of a processor using a low voltage. Thus, synchronization problems in a chip under different environments are solved for high reliability.
    Type: Grant
    Filed: February 11, 2008
    Date of Patent: March 29, 2011
    Assignee: National Chung Cheng University
    Inventors: Shu-Hsuan Chou, Yi-Chao Chan, Ming-Ku Chang, Tien-Fu Chen
  • Publication number: 20110055525
    Abstract: A method and apparatus for providing fairness in a multi-processing element environment is herein described. Mask elements are utilized to associated portions of a reservation station with each processing element, while still allowing common access to another portion of reservation station entries. Additionally, bias logic biases selection of processing elements in a pipeline away from a processing element associated with a blocking stall to provide fair utilization of the pipeline.
    Type: Application
    Filed: November 8, 2010
    Publication date: March 3, 2011
    Inventors: Morris Marden, Matthew Merten, Alexandre Farcy, Avinash Sodani, James Hadley, Ilhyun Kim
  • Patent number: 7900023
    Abstract: A technique to allow independent loads to be satisfied during high-latency instruction processing. Embodiments of the invention relate to a technique in which a storage structure is used to hold store operations in program order while independent load instructions are satisfied during a time in which a high-latency instruction is being processed. After the high-latency instruction is processed, the store operations can be restored in program order without searching the storage structure.
    Type: Grant
    Filed: December 16, 2004
    Date of Patent: March 1, 2011
    Assignee: Intel Corporation
    Inventors: Ravi Rajwar, Srikanth T. Srinivasan, Haitham Akkary, Amit Gandhi
  • Patent number: 7890733
    Abstract: A data processor comprises a plurality of processing elements (PEs), with memory local to at least one of the processing elements, and a data packet-switched network interconnecting the processing elements and the memory to enable any of the PEs to access the memory. The network consists of nodes arranged linearly or in a grid, e.g., in a SIMD array, so as to connect the PEs and their local memories to a common controller. Transaction-enabled PEs and nodes set flags, which are maintained until the transaction is completed and signal status to the controller e.g., over a series of OR-gates. The processor performs memory accesses on data stored in the memory in response to control signals sent by the controller to the memory. The local memories share the same memory map or space. External memory may also be connected to the “end” nodes interfacing with the network, eg to provide cache.
    Type: Grant
    Filed: August 11, 2005
    Date of Patent: February 15, 2011
    Assignee: Rambus Inc.
    Inventor: Ray McConnell
  • Patent number: 7882337
    Abstract: A method of tentative tracing execution events in a multiprocessor system. Each processor stores tentative events in a corresponding buffer. The processor sets pointers in an array to a head and tail of a thread. When a condition triggers a tentative thread to be committed, the processor marks the first event as committed and sets the pointers to a null value. When a condition triggers the thread to be discarded, the processor marks the first event as discarded and sets the pointers to a null value. The processor makes the buffer available to a consumer process, which extracts the first event. If the first event is marked as committed, the consumer process follows a link to a second event of the thread and marks the second event as committed. If the first event is marked as discarded, the second event is marked as discarded and the first event is skipped.
    Type: Grant
    Filed: May 19, 2007
    Date of Patent: February 1, 2011
    Assignee: International Business Machines Corporation
    Inventor: Jose G. Rivera
  • Patent number: 7877580
    Abstract: A method of handling program instructions in a microprocessor which reduces delays associated with mispredicted branch instructions, by detecting the occurrence of a stall condition during execution of the program instructions, speculatively executing one or more pending instructions which include at least one branch instruction during the stall condition, and determining the validity of data utilized by the speculative execution. Dispatch logic determines the validity of the data by marking one or more registers of an instruction dispatch unit to indicate which results of the pending instructions are invalid. The speculative execution of instructions can occur across multiple pipeline stages of the microprocessor, and the validity of the data is tracked during their execution in the multiple pipeline stages while monitoring a dependency of the speculatively executed instructions relative to one another during their execution in the multiple pipeline stages.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: January 25, 2011
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Patent number: 7865702
    Abstract: Thread switching prevents pipeline stalls when executing multiple threads. An analysis of a first thread identifies instructions capable of causing pipeline stalls. If pipeline stalls from the identified instructions are likely, thread switching instructions are added to the first thread in place of the identified instructions. Thread switching instructions direct a microprocessor to suspend executing the thread and begin executing a second thread. Thread switching instructions can be added to the second thread to enable the resumption of the first thread at the location specified by the identified instruction. The thread switching instructions are configured to avoid pipeline stalls when switching threads. Thread switching instructions can store and retrieve thread-specific information upon the suspension and resumption of threads. Thread switching instructions can schedule the execution of two or more threads in accordance with load balancing schemes.
    Type: Grant
    Filed: August 17, 2009
    Date of Patent: January 4, 2011
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Victor Suba
  • Patent number: 7861066
    Abstract: A mechanism for suppressing instruction replay includes a processor having one or more execution units and a scheduler that issue instruction operations for execution by the one or more execution units. The scheduler may also cause instruction operations that are determined to be incorrectly executed to be replayed, or reissued. In addition, a prediction unit within the processor may predict whether a given instruction operation will replay and to provide an indication that the given instruction operation will replay. The processor also includes a decode unit that may decode instructions and in response to detecting the indication, may flag the given instruction operation. The scheduler may further inhibit issue of the flagged instruction operation until a status associated with the flagged instruction is good.
    Type: Grant
    Filed: July 20, 2007
    Date of Patent: December 28, 2010
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashutosh S. Dhodapkar, Michael G. Butler, Gene W. Shen
  • Patent number: 7861064
    Abstract: A method for selectively accelerating early instruction processing including receiving an instruction data that is normally processed in an execution stage of a processor pipeline, wherein a configuration of the instruction data allows a processing of the instruction data to be accelerated from the execution stage to an address generation stage that occurs earlier in the processor pipeline than the execution stage, determining whether the instruction data can be dispatched to the address generation stage to be processed without being delayed due to an unavailability of a processing resource needed for the processing of the instruction data in the address generation stage, dispatching the instruction data to be processed in the address generation stage if it can be dispatched without being delayed due to the unavailability of the processing resource, and dispatching the instruction data to be processed in the execution stage if it can not be dispatched without being delayed due to the unavailability of the pro
    Type: Grant
    Filed: February 26, 2008
    Date of Patent: December 28, 2010
    Assignee: International Business Machines Corporation
    Inventors: Khary J. Alexander, Fadi Y. Busaba, Bruce C. Giamei, David S. Hutton, Chung-Lung Kevin Shum
  • Publication number: 20100325469
    Abstract: An instruction detecting section (235) detects whether or not there is any succeeding instruction executable regardless of an order based on a data dependency relationship between a presently executed instruction and a succeeding instruction following the presently executed instruction. A clock switch judging section (236) receives notification of the start and end of a memory stall, determines whether or not a memory stall is occurring, and judges whether to switch a clock signal to be supplied to a CPU (200) to a low clock signal (239) or to stop the clock signal based on a detection result of the instruction detecting section (235) if it is judged that the memory stall is occurring. A clock switching section (237) switches the clock signal based on judgment by the clock switch judging section (236). By this construction, power consumption can be reduced without reducing performance.
    Type: Application
    Filed: December 10, 2008
    Publication date: December 23, 2010
    Inventors: Ryo Yokoyama, Tadao Tanikawa
  • Patent number: 7849290
    Abstract: Embodiments of the present invention provide a system that buffers stores on a processor that supports speculative execution. The system starts by buffering a store into an entry in the store queue during a speculative execution mode. If an entry for the store does not already exist in the store queue, the system writes the store into an available entry in the store queue and updates a byte mask for the entry. Otherwise, if an entry for the store already exists in the store queue, the system merges the store into the existing entry in the store queue and updates the byte mask for the entry to include information about the newly merged store. The system then forwards the data from the store queue to subsequent dependent loads.
    Type: Grant
    Filed: July 9, 2007
    Date of Patent: December 7, 2010
    Assignee: Oracle America, Inc.
    Inventors: Robert E. Cypher, Shailender Chaudhry
  • Patent number: 7844799
    Abstract: A method and system for operating a high frequency out-of-order processor with increased pipeline length. A new scheme is disclosed to reduce the pipeline by the detection and exploitation of so called “no dependency” for an instruction. A “no dependency” signal tells that all required source data is available for the instruction at least one cycle before the source data valid bit(s) are inserted into the issue queue. Therefore, one or more stages of the pipeline are bypassed.
    Type: Grant
    Filed: December 20, 2001
    Date of Patent: November 30, 2010
    Assignee: International Business Machines Corporation
    Inventors: Jens Leenstra, Antje Mueller, Juergen Pille, Dieter Wendel
  • Patent number: 7836279
    Abstract: A system for supporting software pipelining using a shifting register queue is provided. The system includes a register file that comprises a plurality of registers. The register file is operable to receive a shift mask signal and a shift signal and to identify a shifting register queue within the register file based on the shift mask signal. The shifting register queue comprises a plurality of queue registers. The register file is further operable to shift the contents of the queue registers based on the shift signal.
    Type: Grant
    Filed: December 31, 2003
    Date of Patent: November 16, 2010
    Assignee: STMicroelectronics, Inc.
    Inventors: Osvaldo Colavin, Vineet Soni, Davide Rizzo
  • Patent number: 7831810
    Abstract: Embodiments of the present invention provide a system for transferring data between a receiver chip and a transmitter chip. The system includes a set of data path circuits in the transmitter chip and a set of data path circuits in the receiver chip coupled to a shared data channel. In addition, the system includes a set of asynchronous control circuits for controlling corresponding data path circuits in the transmitter chip and receiver chip. Upon detecting the transition of a control signal for an asynchronous control circuit in the transmitter chip, the asynchronous control circuit is configured to enable a transfer of data from the corresponding data path circuit in the transmitter chip across the data channel to a corresponding data path circuit in the receiver chip, and generate a control signal to cause a next asynchronous control circuit to commence the transfer of a data signal.
    Type: Grant
    Filed: October 2, 2007
    Date of Patent: November 9, 2010
    Assignee: Oracle America, Inc.
    Inventor: Scott M. Fairbanks
  • Patent number: 7818544
    Abstract: Mechanisms for placing a processor into a gradual slow down mode of operation are provided. The gradual slow down mode of operation comprises a plurality of stages of slow down operation of an issue unit in a processor in which the issuance of instructions is slowed in accordance with a staging scheme. The gradual slow down of the processor allows the processor to break out of livelock conditions. Moreover, since the slow down is gradual, the processor may flexibly avoid various degrees of livelock conditions. The mechanisms of the illustrative embodiments impact the overall processor performance based on the severity of the livelock condition by taking a small performance impact on less severe livelock conditions and only increasing the processor performance impact when the livelock condition is more severe.
    Type: Grant
    Filed: September 5, 2008
    Date of Patent: October 19, 2010
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Kurt A. Feiste, Ronald P. Hall, Albert J. Van Norstrand, Jr.
  • Patent number: 7814300
    Abstract: A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a data processor instruction that causes the data processor to perform a first set of computational operations during execution of the data processor instruction, performing the first set of computational operations in the first instruction pipeline stage if the data processor instruction is being executed and a first mode has been selected, and performing the first set of computational operations in the second instruction pipeline stage if the data processor instruction is being executed and a second mode has been selected.
    Type: Grant
    Filed: April 30, 2008
    Date of Patent: October 12, 2010
    Assignee: Freescale Semiconductor, Inc.
    Inventors: William C. Moyer, Jeffrey W. Scott
  • Patent number: 7802077
    Abstract: A new class traces for a processing engine, called “extended blocks,” possess an architecture that permits possible many entry points but only a single exit point. These extended blocks may be indexed based upon the address of the last instruction therein. Use of the new trace architecture provides several advantages, including reduction of instruction redundancies, dynamic block extension and a sharing of instructions among various extended blocks.
    Type: Grant
    Filed: June 30, 2000
    Date of Patent: September 21, 2010
    Assignee: Intel Corporation
    Inventors: Stephen J. Jourdan, Lihu Rappoport, Ronny Ronen, Adi Yoaz
  • Patent number: 7788473
    Abstract: Prediction of data values to be read from memory by a microprocessor for load operations. In one aspect, a method for predicting a data value that will result from a load operation to be executed by the microprocessor includes accessing an entry in a load value prediction table that stores a predicted data value corresponding to the load operation. The predicted data value is stored in a physical storage destination of the microprocessor to be available as a result of the load operation without waiting for execution of the load operation to complete. The storage destination is the destination for a loaded data value resulting from executing the load operation.
    Type: Grant
    Filed: December 26, 2006
    Date of Patent: August 31, 2010
    Assignee: Oracle America, Inc.
    Inventors: Chris Nelson, Matthew Ashcraft, John Gregory Favor, Seungyoon Peter Song
  • Patent number: 7779236
    Abstract: The invention provides a method and system for operating a pipelined microprocessor more quickly, by detecting instructions that load from identical memory locations as were recently stored to, without having to actually compute the referenced external memory addresses. The microprocessor examines the symbolic structure of instructions as they are encountered, so as to be able to detect identical memory locations by examination of their symbolic structure. For example, in a preferred embodiment, instructions that store to and load from an identical offset from an identical register are determined to be referencing the identical memory location, without having to actually compute the complete physical target address.
    Type: Grant
    Filed: November 19, 1999
    Date of Patent: August 17, 2010
    Assignee: STMicroelectronics, Inc.
    Inventor: David L. Isaman
  • Patent number: 7774582
    Abstract: A data processing system including multiple execution pipelines each having multiple execution stages E1, E2, E3 may have instructions issued together in parallel despite a data dependency therebetween if it is detected that the result operand value for the older instruction will be generated in an execution stage prior to an execution stage which requires that result operand value as an input operand value to the younger instruction and accordingly cross-forwarding of the operand value is possible between the execution pipelines to resolve the data dependency.
    Type: Grant
    Filed: May 26, 2005
    Date of Patent: August 10, 2010
    Assignee: ARM Limited
    Inventors: David James Williamson, Glen Andrew Harris, Stephen John Hill
  • Patent number: 7761691
    Abstract: A method for scheduling instructions for clustered digital signal processors comprising a plurality of clusters, each cluster including at least two functional units and a first register file having a first unit, a second unit and a single set of access ports shared by the functional units comprises steps of checking whether executing one instruction needs data to be read from the first unit and the second unit of the first register file, generating a copying instruction to transfer data from the first unit to the second unit of the first register file, checking whether there is a prior operation cycle available to perform the copying instruction, scheduling the copying instruction in the prior operation cycle, and scheduling the instruction after the copying instruction.
    Type: Grant
    Filed: October 27, 2005
    Date of Patent: July 20, 2010
    Assignee: National Tsing Hua University
    Inventors: Chung-Lin Tang, Yung-Chia Lin, Jenq-Kuen Lee
  • Publication number: 20100180103
    Abstract: A computer processor pipeline has both an architectural register file and a working register file. The lifetime of an entry in the working register file is determined by a predetermined number of instructions passing through a specified stage in the pipeline after the location in the working register file is allocated for an instruction. The size of the working register file is selected based upon performance characteristics. A working register file creditor indicator is coupled to the front end pipeline portion and to the back end pipeline portion. The working register file credit indicator is monitored to prevent a working register file overflow. When the a location in the architectural register file is read early, the location is monitored to determine whether the location is written to prior to issuance of the instruction associated with the early read.
    Type: Application
    Filed: January 15, 2009
    Publication date: July 15, 2010
    Inventors: Shailender Chaudhry, Paul Caprioli, Marc Tremblay
  • Patent number: 7748001
    Abstract: Method, apparatus and system embodiments to assign priority to a thread when the thread is otherwise unable to proceed with instruction retirement. For at least one embodiment, the thread is one of a plurality of active threads in a multiprocessor system that includes memory livelock breaker logic and/or starvation avoidance logic. Other embodiments are also described and claimed.
    Type: Grant
    Filed: September 23, 2004
    Date of Patent: June 29, 2010
    Assignee: Intel Corporation
    Inventors: David W. Burns, K. S. Venkatraman
  • Publication number: 20100146314
    Abstract: Forming a plurality of pipeline orderings, each pipeline ordering comprising one of a sequential, a parallel, or a sequential and parallel combination of a plurality of stages of a pipeline, analyzing the plurality of pipeline orderings to determine a total power of each of the orderings, and selecting one of the plurality of pipeline orderings based on the determined total power of each of the plurality of pipeline orderings.
    Type: Application
    Filed: February 11, 2010
    Publication date: June 10, 2010
    Inventors: Ron Gabor, Hong Jiang, Alon Naveh, Doron Rajwan, James Varga, Gady Yearim, Yuval Yosef