Reducing An Impact Of A Stall Or Pipeline Bubble Patents (Class 712/219)

AUTOMATIC IDENTIFICATION OF BOTTLENECKS USING RULE-BASED EXPERT KNOWLEDGE

Publication number: 20120054472

Abstract: Execution states of tasks are inferred from a collection of information associated with the runtime execution of a computer system. The collected information may include infrequent samples of executing tasks, and those samples may report inaccurate execution states. One or more tasks may be aggregated by one or more execution states for determining execution time, idle time, or system policy violations, or combinations thereof.

Type: Application

Filed: September 1, 2010

Publication date: March 1, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Erik R. Altman, Matthew R. Arnold, Nicholas M. Mitchell
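
The inference-and-aggregation idea in the abstract above (publication 20120054472) can be illustrated with a minimal sketch. The rule table, the state names, and the `(task, top_frame)` sample format are all hypothetical choices made for this example; the abstract does not specify them.

```python
from collections import Counter

# Hypothetical rules: (substring of the sampled top stack frame, inferred state).
# Real expert rules would be far richer; these are illustration only.
RULES = [
    ("wait", "idle"),
    ("read", "blocked on I/O"),
]


def infer_state(top_frame):
    """Return the first matching rule's state, or 'running' if none match."""
    for needle, state in RULES:
        if needle in top_frame:
            return state
    return "running"


def aggregate(samples):
    """Count samples per inferred state; samples are (task, top_frame) pairs."""
    return Counter(infer_state(frame) for _task, frame in samples)
```

With infrequent samples, the per-state counts only approximate where time goes, which matches the abstract's caveat that sampled states may be inaccurate.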
Multithread processor and method of synchronization operations among threads to be used in same

Patent number: 8117425

Abstract: The Thread Data Base 1 holds a thread identifier that uniquely identifies a thread in the system. When no target thread exists in the same processor, the Check means 3 causes a trap (TRAP) 10 to occur. When a target thread exists in the same processor, the Issue means 2, at the time of issuing a subsequent instruction, successively inputs a thread 9 to be executed next into a pipeline as the target thread. The Gate (G) means 11 uses data on the execution of a thread as an input for the computation of a succeeding target thread. The Switch means 13 transfers data in the context of a thread to the context of a target thread without inputting the target thread as a non-executable thread into a pipeline while the thread is being executed.

Type: Grant

Filed: April 14, 2008

Date of Patent: February 14, 2012

Assignee: NEC Corporation

Inventor: Hitoshi Takagi
FACILITATING PROCESSING IN A COMPUTING ENVIRONMENT USING AN EXTENDED DRAIN INSTRUCTION

Publication number: 20120036338

Abstract: An extended DRAIN instruction is used to stall processing within a computing environment. The instruction includes an indication of the one or more processing stages at which processing is to be stalled. It also includes a control that allows processing to be stalled for additional cycles, as desired.

Type: Application

Filed: October 14, 2011

Publication date: February 9, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Khary J. Alexander, Fadi Y. Busaba, Mark S. Farrell, Bruce C. Giamei, Timothy J. Slegel, Charles F. Webb
Method of and apparatus and architecture for real time signal processing by switch-controlled programmable processor configuring and flexible pipeline and parallel processing

Patent number: 8099583

Abstract: A new signal processor technique and apparatus combining microprocessor technology with switch fabric telecommunication technology to achieve a programmable processor architecture wherein the processor and the connections among its functional blocks are configured by software for each specific application by communication through a switch fabric in a dynamic, parallel and flexible fashion to achieve a reconfigurable pipeline, wherein the length of the pipeline stages and the order of the stages varies from time to time and from application to application, admirably handling the explosion of varieties of diverse signal processing needs in single devices such as handsets, set-top boxes and the like with unprecedented performance, cost and power savings, and with full application flexibility.

Type: Grant

Filed: October 6, 2007

Date of Patent: January 17, 2012

Assignee: Axis Semiconductor, Inc.

Inventor: Xiaolin Wang
Tracking deallocated load instructions using a dependence matrix

Patent number: 8099582

Abstract: A mechanism is provided for tracking deallocated load instructions. A processor detects whether a load instruction in a set of instructions in an issue queue has missed. Responsive to a miss of the load instruction, an instruction scheduler allocates the load instruction to a load miss queue and deallocates the load instruction from the issue queue. The instruction scheduler determines whether there is a dependence entry for the load instruction in an issue queue portion of a dependence matrix. Responsive to the existence of the dependence entry for the load instruction in the issue queue portion of the dependence matrix, the instruction scheduler reads data from the dependence entry of the issue queue portion of the dependence matrix that specifies a set of dependent instructions that are dependent on the load instruction and writes the data into a new entry in a load miss queue portion of the dependence matrix.

Type: Grant

Filed: March 24, 2009

Date of Patent: January 17, 2012

Assignee: International Business Machines Corporation

Inventors: Christopher M. Abernathy, Mary D. Brown, William E. Burky, Todd A. Venton
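
As a rough software analogy for the abstract above (patent 8099582), the two portions of the dependence matrix can be modeled as dictionaries that map a producer instruction to its set of dependents. The dictionary representation and the function name `move_on_miss` are assumptions for this sketch; the patent describes hardware structures, and whether the issue-queue entry is cleared at the same moment is also assumed here.

```python
def move_on_miss(iq_deps, lmq_deps, load_id):
    """On a load miss, move the load's dependence entry (if any) from the
    issue-queue portion to the load-miss-queue portion of the matrix.

    iq_deps / lmq_deps: dict mapping instruction id -> set of dependent ids.
    Returns the moved set of dependents, or None if no entry existed.
    """
    # Assumption: deallocating the load from the issue queue also frees
    # its row in the issue-queue portion of the matrix.
    dependents = iq_deps.pop(load_id, None)
    if dependents is not None:
        # Write the same dependence data into a new load-miss-queue entry.
        lmq_deps[load_id] = dependents
    return dependents
```

The point of the copy is that the dependents stay tracked against the load even though the load no longer occupies an issue-queue slot.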
DIRECT MEMORY ACCESS ENGINE PHYSICAL MEMORY DESCRIPTORS FOR MULTI-MEDIA DEMULTIPLEXING OPERATIONS

Publication number: 20110320777

Abstract: The architecture and techniques described herein can improve system performance with respect to the following. Communication between two interdependent hardware engines that are part of a pipeline, such that the engines are synchronized to consume resources when the engines are done with the work. Reduction of the role of software/firmware in feeding each stage of the hardware pipeline when the previous stage of the pipeline has completed. Reduction in the memory allocation for software-initialized hardware descriptors to improve performance by reducing pipeline stalls due to software interaction.

Type: Application

Filed: June 28, 2010

Publication date: December 29, 2011

Inventors: DANIEL NEMIROFF, Balaji Vembu, Raul Gutierrez, Suryaprasad Kareenahalli
Dependency tracking for enabling successive processor instructions to issue

Patent number: 8086826

Abstract: An information handling system includes a processor with an issue unit (IU) that may perform instruction dependency tracking for successive instruction issue operations. The IU maintains non-shifting issue queue (NSIQ) and shifting issue queue (SIQ) instructions along with relative instruction to instruction dependency information. A mapper maps queue position data for instructions that dispatch to issue queue locations within the IU. The IU may test an issuing producer instruction against consumer instructions in the IU for queue position (QPOS) and register tag (RTAG) matches. A matching consumer instruction may issue in a successive manner in the case of a queue position match or in a next processor cycle in the case of a register tag match.

Type: Grant

Filed: March 24, 2009

Date of Patent: December 27, 2011

Assignee: International Business Machines Corporation

Inventors: Mary Douglass Brown, William Elton Burky, Dung Quoc Nguyen, Balaram Sinharoy
Program instruction rearrangement methods in computer

Patent number: 8082421

Abstract: A program instruction rearrangement method calculates the dependency depth of each instruction of a program based on the dependency between instructions and on register access order, then rearranges the instructions based on the dependency depth. Additionally, the dependency between instructions can be utilized to locate and remove redundant instructions.

Type: Grant

Filed: September 4, 2007

Date of Patent: December 20, 2011

Assignee: Via Technologies, Inc.

Inventor: Yi-Peng Chen
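
The dependency-depth idea in the abstract above (patent 8082421) can be sketched as follows. This is a simplified model, not the patented method: the `Instr` type is hypothetical, only read-after-write dependencies are tracked, and each register is assumed to be written at most once so that reordering by depth cannot break write-after-read or write-after-write ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dst: str            # register written by this instruction
    srcs: tuple = ()    # registers read by this instruction


def dependency_depths(program):
    """Depth is 0 with no in-program producer, else 1 + the deepest producer."""
    last_writer = {}    # register -> index of the instruction that wrote it
    depths = []
    for i, ins in enumerate(program):
        producers = [last_writer[r] for r in ins.srcs if r in last_writer]
        depths.append(1 + max(depths[p] for p in producers) if producers else 0)
        last_writer[ins.dst] = i
    return depths


def rearrange(program):
    """Stable sort by depth: producers always precede their consumers.

    Assumption: every register is written at most once in `program`.
    """
    depths = dependency_depths(program)
    order = sorted(range(len(program)), key=lambda i: (depths[i], i))
    return [program[i] for i in order]
```

Grouping independent (equal-depth) instructions together gives a pipeline more work to issue between a producer and its consumer, which is the usual motivation for this kind of reordering.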
SYNTHESIS SYSTEM FOR PIPELINED DIGITAL CIRCUITS

Publication number: 20110307688

Abstract: Computer-implemented methods and systems for synthesizing a hardware description for a pipelined datapath for a digital circuit. A transactional datapath specification framework and a transactional design automation system automatically synthesize pipeline implementations. The transactional datapath specification framework captures an abstract datapath, whose execution semantics is interpreted as a sequence of “transactions” where each transaction reads the state values left by the preceding transaction and computes a new set of state values to be seen by the next transaction. The transactional datapath specification framework exposes sufficient information about state accesses that can occur in a datapath, which is necessary for performing precise data hazards analysis, and eventually pipeline synthesis.

Type: Application

Filed: June 10, 2011

Publication date: December 15, 2011

Applicant: Carnegie Mellon University

Inventors: Eriko Nurvitadhi, James C. Hoe
Conditional move instruction formed into one decoded instruction to be graduated and another decoded instruction to be invalidated

Patent number: 8078846

Abstract: A conditional move instruction implemented in a processor by forming and processing two decoded instructions, and applications thereof. In an embodiment, the conditional move instruction specifies a first source operand, a second source operand, and a third operand that is both a source and a destination. If the value of the second operand is not equal to a specified value, the first decoded instruction moves the third operand to a completion buffer register. If the value of the second operand is equal to the specified value, the second decoded instruction moves the value of the first operand to the completion buffer. When the decoded instruction that performed the move graduates, the contents of the completion buffer register are transferred to a register file register specified by the third operand.

Type: Grant

Filed: December 18, 2006

Date of Patent: December 13, 2011

Assignee: MIPS Technologies, Inc.

Inventors: Karagada Ramarao Kishore, Xing Yu Jiang, Vidya Rajagopalan, Maria Ukanwa
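
A behavioral sketch of the two-decoded-instruction scheme in the abstract above (patent 8078846) follows. The register file is modeled as a plain dictionary, and the function name, the default `specified` value of 0, and the return convention are assumptions for illustration.

```python
def conditional_move(regs, first, second, third, specified=0):
    """Model a conditional move as two decoded instructions.

    regs: dict of register name -> value (stands in for the register file).
    Returns 1 or 2: which decoded instruction performed the move and
    graduates; the other one is treated as invalidated.
    """
    completion_buffer = None
    graduated = None

    # Decoded instruction 1: condition fails, so the old destination value
    # is carried through the completion buffer.
    if regs[second] != specified:
        completion_buffer = regs[third]
        graduated = 1

    # Decoded instruction 2: condition holds, so the first operand moves.
    if regs[second] == specified:
        completion_buffer = regs[first]
        graduated = 2

    # At graduation the completion buffer is written to the destination.
    regs[third] = completion_buffer
    return graduated
```

Because exactly one of the two decoded instructions writes the completion buffer, the destination is always written at graduation, with either its old value or the moved value.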
PIPELINED DIGITAL SIGNAL PROCESSOR

Publication number: 20110296145

Abstract: Reducing pipeline stall between a compute unit and address unit in a processor can be accomplished by computing results in a compute unit in response to instructions of an algorithm; storing in a local random access memory array in a compute unit predetermined sets of functions, related to the computed results for predetermined sets of instructions of the algorithm; and providing within the compute unit direct mapping of computed results to related function.

Type: Application

Filed: August 10, 2011

Publication date: December 1, 2011

Inventors: James Wilson, Joshua A. Kablotsky, Yosef Stein, Colm J. Prendergast, Gregory M. Yukna, Christopher M. Mayer
Stall-free pipelined cache for statically scheduled and dispatched execution

Patent number: 8065505

Abstract: This invention provides flexible load latency to pipeline cache misses. A memory controller selects the output of one of a set of cascaded inserted execute stages. This selection may be controlled by a latency field in a load instruction or by a latency specification of a prior instruction. This invention is useful in the great majority of cases where the code can tolerate incremental increases in load latency for a reduction in cache miss penalty.

Type: Grant

Filed: August 16, 2007

Date of Patent: November 22, 2011

Assignee: Texas Instruments Incorporated

Inventor: Chris Yoochang Chung
Method and apparatus for augmenting a pipeline with a bubble-removal circuit

Patent number: 8055884

Abstract: One embodiment of the present invention provides a system for augmenting a pipeline with a bubble-removal circuit. During operation, the system generates a bubble-removal circuit which determines a clock-enable signal based at least on whether an upstream register has valid data and whether the pipeline is stalled. Next, the system gates the clock signal using the clock-enable signal. The augmented pipeline can determine whether a first register contains invalid data, which is associated with a bubble. Next, the augmented pipeline determines whether a second register contains valid data, wherein the second register is adjacent to and upstream from the first register. If the first register contains invalid data and the second register contains valid data, the augmented pipeline replaces the invalid data of the first register with valid data based on the valid data in the second register without propagating the invalid data to a downstream register.

Type: Grant

Filed: December 2, 2009

Date of Patent: November 8, 2011

Assignee: Synopsys, Inc.

Inventors: John D. Lofgren, Brett Kobernat
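
The bubble-removal behavior in the abstract above (patent 8055884) can be approximated with a small cycle-level model. This is a sketch under assumptions: the pipeline is a list with index 0 as the most upstream register, `None` stands for invalid data (a bubble), the output is stalled for the whole cycle, and a register is clock-enabled only when it holds a bubble and its upstream neighbor holds valid data.

```python
def step_stalled(regs):
    """Advance one clock cycle while the pipeline output is stalled.

    regs: list of register contents, index 0 = most upstream; None = bubble.
    Returns the new register contents; the input list is not modified.
    """
    old = list(regs)
    new = list(regs)
    for i in range(1, len(old)):
        # Clock-enable for register i: it holds a bubble and the register
        # just upstream holds valid data.
        if old[i] is None and old[i - 1] is not None:
            new[i] = old[i - 1]   # valid data replaces the bubble
            new[i - 1] = None     # the bubble moves upstream, never downstream
    return new
```

Starting from `["A", None, "B", None]`, one stalled cycle gives `[None, "A", None, "B"]` and a second gives `[None, None, "A", "B"]`, so the stall time is used to pack valid data toward the output.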
Issuing instructions in-order in an out-of-order processor using false dependencies

Patent number: 8037366

Abstract: A mechanism is provided for issuing instructions. An instruction dispatch unit receives an instruction for dispatch to one of a plurality of execution units. The instruction dispatch unit analyzes a tag register to determine whether a previous tag associated with a previous instruction has been stored in the tag register. Responsive to the previous tag associated with the previous instruction failing to be stored in the tag register, the instruction dispatch unit stores a tag corresponding to the instruction in the tag register. The instruction dispatch unit dispatches the instruction to an issue queue for issue to the one of the plurality of execution units.

Type: Grant

Filed: March 24, 2009

Date of Patent: October 11, 2011

Assignee: International Business Machines Corporation

Inventors: Christopher M. Abernathy, Mary D. Brown, Dung Q. Nguyen, Todd A. Venton
Error recovery following speculative execution with an instruction processing pipeline

Patent number: 8037287

Abstract: An instruction processing pipeline 6 is provided. This has error detection and error recovery circuitry 20 associated with one or more of the pipeline stages. If an error is detected within a signal value within that pipeline stage, then it can be repaired. Part of the error recovery may be to flush upstream program instructions from the instruction pipeline 6. When multi-threading, only those instructions from a thread including an instruction which has been lost as a consequence of the error recovery need to be flushed from the instruction pipeline 6. Instructions can also be selected for flushing in dependence upon characteristics such as privilege level, number of dependent instructions, etc.

Type: Grant

Filed: March 14, 2008

Date of Patent: October 11, 2011

Assignee: ARM Limited

Inventors: Emre Özer, Shidhartha Das, David Michael Bull
Performance of an in-order processor by no longer requiring a uniform completion point across different execution pipelines

Patent number: 8028151

Abstract: A method, system and processor for improving the performance of an in-order processor. A processor may include an execution unit with an execution pipeline that includes a backup pipeline and a regular pipeline. The backup pipeline may store a copy of the instructions issued to the regular pipeline. The execution pipeline may include logic for allowing instructions to flow from the backup pipeline to the regular pipeline following the flushing of the instructions younger than the exception detected in the regular pipeline. By maintaining a backup copy of the instructions issued to the regular pipeline, instructions may not need to be flushed from separate execution pipelines and re-fetched. As a result, one may complete the results of the execution units to the architected state out of order thereby allowing the completion point to vary among the different execution pipelines.

Type: Grant

Filed: November 25, 2008

Date of Patent: September 27, 2011

Assignee: International Business Machines Corporation

Inventors: Christopher Michael Abernathy, Jonathan James Dement, Ronald Hall, Albert James Van Norstrand
Pipeline processor with write control and validity flags for controlling write-back of execution result data stored in pipeline buffer register

Patent number: 8019974

Abstract: A bypass circuit is provided in a pipeline processor. A pipeline register is provided between an instruction execution stage and a write-back stage. The pipeline register stores a data validity flag and a WRITE control flag to control writing data into a general purpose register unit. The data retained in the pipeline register is allowed to be written back into the general purpose register unit when the WRITE control flag indicates “valid”. The pipeline register continues to retain the retained data even after the writing of the retained data into the general purpose register unit. The first pipeline register supplies the retained data to the second stage through the bypass circuit at the time of executing a subsequent instruction having data dependency on a preceding instruction.

Type: Grant

Filed: January 12, 2009

Date of Patent: September 13, 2011

Assignee: Kabushiki Kaisha Toshiba

Inventor: Jun Tanabe
Reducing data hazards in pipelined processors to provide high processor utilization

Patent number: 8006072

Abstract: A pipelined computer processor is presented that reduces data hazards such that high processor utilization is attained. The processor restructures a set of instructions to operate concurrently on multiple pieces of data in multiple passes. One subset of instructions operates on one piece of data while different subsets of instructions operate concurrently on different pieces of data. A validity pipeline tracks the priming and draining of the pipeline processor to ensure that only valid data is written to registers or memory. Pass-dependent addressing is provided to correctly address registers and memory for different pieces of data.

Type: Grant

Filed: May 18, 2010

Date of Patent: August 23, 2011

Assignee: Micron Technology, Inc.

Inventors: Neal Andrew Cook, Alan T. Wootton, James Peterson
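
The multi-pass scheme in the abstract above (patent 8006072) resembles a software pipeline with a validity check during priming and draining. The sketch below is an interpretation, not the patented design: `stages` is a hypothetical list of functions standing in for the instruction subsets, and the validity pipeline is reduced to a bounds test on the pass-dependent index.

```python
def run_passes(items, stages):
    """Apply every stage to every item, overlapping stages across passes.

    In pass p, stage s works on item p - s (pass-dependent addressing).
    A stage's result is written only when that index is valid, which
    suppresses writes while the pipeline primes and drains.
    Returns (results, log) where log lists (pass, stage, item_index).
    """
    n, depth = len(items), len(stages)
    work = list(items)
    log = []
    for p in range(n + depth - 1):
        for s, fn in enumerate(stages):
            idx = p - s                # pass-dependent addressing
            valid = 0 <= idx < n       # validity bit for stage s in pass p
            if valid:
                work[idx] = fn(work[idx])
                log.append((p, s, idx))
    return work, log
```

In the first pass only stage 0 is valid (priming) and in the last pass only the final stage is valid (draining); in between, each stage touches a different item, so consecutive operations do not depend on each other.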
Advanced processor scheduling in a multithreaded system

Patent number: 7984268

Abstract: An advanced processor comprises a plurality of multithreaded processor cores each having a data cache and instruction cache. A data switch interconnect is coupled to each of the processor cores and configured to pass information among the processor cores. A messaging network is coupled to each of the processor cores and a plurality of communication ports. In one aspect of an embodiment of the invention, the data switch interconnect is coupled to each of the processor cores by its respective data cache, and the messaging network is coupled to each of the processor cores by its respective message station. Advantages of the invention include the ability to provide high bandwidth communications between computer systems and memory in an efficient and cost-effective manner.

Type: Grant

Filed: July 23, 2004

Date of Patent: July 19, 2011

Assignee: NetLogic Microsystems, Inc.

Inventors: David T. Hass, Abbas Rashid
Method and system for early instruction text based operand store compare reject avoidance

Patent number: 7975130

Abstract: A method and system for early instruction text based operand store compare avoidance in a processor are provided. The system includes a processor pipeline for processing instruction text in an instruction stream, where the instruction text includes operand address information. The system also includes delay logic to monitor the instruction stream. The delay logic performs a method that includes detecting a load instruction following a store instruction in the instruction stream and comparing the operand address information of the store instruction with that of the load instruction. The method also includes delaying the load instruction in the processor pipeline in response to detecting a common field value between the operand address information of the store instruction and the load instruction.

Type: Grant

Filed: February 20, 2008

Date of Patent: July 5, 2011

Assignee: International Business Machines Corporation

Inventors: Khary J. Alexander, Fadi Y. Busada, Bruce C. Giamei, David S. Hutton, Chung-Lung Kevin Shum
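
The delay decision in the abstract above (patent 7975130) can be sketched in software. Several details are assumptions: the operand address information is reduced to hypothetical `base` and `disp` fields, a "common field value" is taken to mean both fields match, and the `window` of preceding instructions that are checked is an invented parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str     # "store", "load", or any other mnemonic
    base: int   # base register number field from the instruction text
    disp: int   # displacement field from the instruction text


def loads_to_delay(stream, window=2):
    """Return indices of loads that follow a store with matching fields.

    Only instruction-text fields are compared, so the check can run before
    any address is actually generated. `window` bounds how many preceding
    instructions are examined (hypothetical parameter).
    """
    delayed = []
    for i, ins in enumerate(stream):
        if ins.op != "load":
            continue
        for prev in stream[max(0, i - window):i]:
            if prev.op == "store" and (prev.base, prev.disp) == (ins.base, ins.disp):
                delayed.append(i)
                break
    return delayed
```

Matching on text fields is conservative: two instructions with equal fields may still resolve to different addresses if the base register changes in between, in which case the load is delayed unnecessarily but never incorrectly.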
Replaying memory operation assigned a load/store buffer entry occupied by store operation processed beyond exception reporting stage and retired from scheduler

Patent number: 7962730

Abstract: In one embodiment, a processor comprises a retire unit and a load/store unit coupled thereto. The retire unit is configured to retire a first store memory operation responsive to the first store memory operation having been processed at least to a pipeline stage at which exceptions are reported for the first store memory operation. The load/store unit comprises a queue having a first entry assigned to the first store memory operation. The load/store unit is configured to retain the first store memory operation in the first entry subsequent to retirement of the first store memory operation if the first store memory operation is not complete. The queue may have multiple entries, and more than one store may be retained in the queue after being retired by the retire unit.

Type: Grant

Filed: November 25, 2008

Date of Patent: June 14, 2011

Assignee: Apple Inc.

Inventors: Wei-Han Lien, Po-Yung Chang
Recycling long multi-operand instructions

Patent number: 7962726

Abstract: A pipelined microprocessor configured for long operand instructions is disclosed. The microprocessor includes a memory unit and a load-store unit. The load-store unit is coupled to the memory unit and includes a data formatter receiving information from the memory unit and including an operand selector and a shift register portion. The microprocessor also includes an execution unit coupled to the load-store unit and receiving operand information therefrom. The execution unit includes output latches coupled to a storage location within the execution unit for storing output information from the execution unit.

Type: Grant

Filed: March 19, 2008

Date of Patent: June 14, 2011

Assignee: International Business Machines Corporation

Inventors: Edward T. Malley, Khary J. Alexander, Fadi Y. Busaba, Vimal M. Kapadia, Jeffrey S. Plate, John G. Rell, Jr., Chung-Lung Kevin Shum
Monitoring software pipeline performance on a network on chip

Patent number: 7958340

Abstract: Software pipelining on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, each IP block adapted to a router through a memory communications controller and a network interface controller, each memory communications controller controlling communication between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers.

Type: Grant

Filed: May 9, 2008

Date of Patent: June 7, 2011

Assignee: International Business Machines Corporation

Inventors: Russell D. Hoover, Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
Scheduler in multi-threaded processor prioritizing instructions passing qualification rule

Patent number: 7949855

Abstract: A processor buffers asynchronous threads. Instructions requiring operations provided by a plurality of execution units are divided into phases, each phase having at least one computation operation and at least one memory access operation. Instructions within each phase are qualified and prioritized. The instructions may be qualified based on the status of the execution unit needed to execute one or more of the current instructions. The instructions may also be qualified based on an age of each instruction, status of the execution units, a divergence potential, locality, thread diversity, and resource requirements. Qualified instructions may be prioritized based on execution units needed to execute instructions and the execution units in use. One or more of the prioritized instructions is issued per cycle to the plurality of execution units.

Type: Grant

Filed: April 28, 2008

Date of Patent: May 24, 2011

Assignee: NVIDIA Corporation

Inventors: Peter C. Mills, John Erik Lindholm, Brett W. Coon, Gary M. Tarolli, John Matthew Burgess
Method and structure for asynchronous skip-ahead in synchronous pipelines

Patent number: 7945765

Abstract: An electronic apparatus includes a plurality of stages serially interconnected as a pipeline to perform sequential processing on input operands. A shortening circuit associated with at least one stage of the pipeline recognizes when one or more of the input operands for the stage have been predetermined as appropriate for shortening and executes the shortening when appropriate.

Type: Grant

Filed: January 31, 2008

Date of Patent: May 17, 2011

Assignee: International Business Machines Corporation

Inventors: Philip George Emma, Allan Mark Hartstein, Hans Jacobson, William Robert Reohr
Reshuffled communications processes in pipelined asynchronous circuits

Patent number: 7934031

Abstract: An asynchronous logic family of circuits which communicate on delay-insensitive flow-controlled channels with 4-phase handshakes and 1 of N encoding, compute output data directly from input data using domino logic, and use the state-holding ability of the domino logic to implement pipelining without additional latches.

Type: Grant

Filed: May 11, 2006

Date of Patent: April 26, 2011

Assignee: California Institute of Technology

Inventors: Andrew M. Lines, Alain J. Martin, Uri Cummings
Apparatus providing locally adaptive retiming pipeline with swing structure

Patent number: 7917793

Abstract: The present invention uses a swing structure to avoid using a clock period at a non-efficient execution time. The execution time is precisely controlled to enhance a performance of a processor using a low voltage. Thus, synchronization problems in a chip under different environments are solved for high reliability.

Type: Grant

Filed: February 11, 2008

Date of Patent: March 29, 2011

Assignee: National Chung Cheng University

Inventors: Shu-Hsuan Chou, Yi-Chao Chan, Ming-Ku Chang, Tien-Fu Chen
PROVIDING THREAD FAIRNESS IN A HYPER-THREADED MICROPROCESSOR

Publication number: 20110055525

Abstract: A method and apparatus for providing fairness in a multi-processing element environment is herein described. Mask elements are utilized to associate portions of a reservation station with each processing element, while still allowing common access to another portion of reservation station entries. Additionally, bias logic biases selection of processing elements in a pipeline away from a processing element associated with a blocking stall to provide fair utilization of the pipeline.

Type: Application

Filed: November 8, 2010

Publication date: March 3, 2011

Inventors: Morris Marden, Matthew Merten, Alexandre Farcy, Avinash Sodani, James Hadley, Ilhyun Kim
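
The mask-based partitioning in the abstract above (publication 20110055525) can be illustrated with bitmasks. The eight-entry size, the split into two private entries per thread plus four shared entries, and the lowest-free-entry policy are all assumptions for this sketch; the bias logic mentioned in the abstract is not modeled.

```python
# Hypothetical 8-entry reservation station: bit i set means entry i.
SHARED = 0b11110000                       # entries 4-7, usable by any thread
PRIVATE = {0: 0b00000011, 1: 0b00001100}  # entries 0-1 and 2-3


def allocate(busy, thread):
    """Try to allocate one entry for `thread`.

    busy: bitmask of entries currently in use.
    Returns (entry_index, new_busy), or (None, busy) if nothing is free.
    """
    allowed = (PRIVATE[thread] | SHARED) & ~busy
    if allowed == 0:
        return None, busy
    entry = (allowed & -allowed).bit_length() - 1   # lowest free allowed entry
    return entry, busy | (1 << entry)
```

Even if thread 0 fills its private entries and every shared entry, thread 1's private entries remain available, so one stalled thread cannot starve the other of reservation station space.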
Technique to enable store forwarding during long latency instruction execution

Patent number: 7900023

Abstract: A technique to allow independent loads to be satisfied during high-latency instruction processing. Embodiments of the invention relate to a technique in which a storage structure is used to hold store operations in program order while independent load instructions are satisfied during a time in which a high-latency instruction is being processed. After the high-latency instruction is processed, the store operations can be restored in program order without searching the storage structure.

Type: Grant

Filed: December 16, 2004

Date of Patent: March 1, 2011

Assignee: Intel Corporation

Inventors: Ravi Rajwar, Srikanth T. Srinivasan, Haitham Akkary, Amit Gandhi
Processor memory system

Patent number: 7890733

Abstract: A data processor comprises a plurality of processing elements (PEs), with memory local to at least one of the processing elements, and a data packet-switched network interconnecting the processing elements and the memory to enable any of the PEs to access the memory. The network consists of nodes arranged linearly or in a grid, e.g., in a SIMD array, so as to connect the PEs and their local memories to a common controller. Transaction-enabled PEs and nodes set flags, which are maintained until the transaction is completed and signal status to the controller, e.g., over a series of OR-gates. The processor performs memory accesses on data stored in the memory in response to control signals sent by the controller to the memory. The local memories share the same memory map or space. External memory may also be connected to the “end” nodes interfacing with the network, e.g., to provide cache.

Type: Grant

Filed: August 11, 2005

Date of Patent: February 15, 2011

Assignee: Rambus Inc.

Inventor: Ray McConnell
Method and system for efficient tentative tracing of software in multiprocessors

Patent number: 7882337

Abstract: A method of tentatively tracing execution events in a multiprocessor system. Each processor stores tentative events in a corresponding buffer. The processor sets pointers in an array to a head and tail of a thread. When a condition triggers a tentative thread to be committed, the processor marks the first event as committed and sets the pointers to a null value. When a condition triggers the thread to be discarded, the processor marks the first event as discarded and sets the pointers to a null value. The processor makes the buffer available to a consumer process, which extracts the first event. If the first event is marked as committed, the consumer process follows a link to a second event of the thread and marks the second event as committed. If the first event is marked as discarded, the second event is marked as discarded and the first event is skipped.

Type: Grant

Filed: May 19, 2007

Date of Patent: February 1, 2011

Assignee: International Business Machines Corporation

Inventor: Jose G. Rivera
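
The consumer side of the abstract above (patent 7882337) can be sketched as a single pass over the buffer that propagates each event's status along its thread link. The dictionary layout with `name`, `status`, and `next` keys is hypothetical, as is the choice to stop at the first event that is still tentative.

```python
def consume(buffer):
    """Emit committed events in buffer order, lazily propagating status.

    buffer: list of dicts with keys
      'name'   - event label,
      'status' - 'committed', 'discarded', or 'tentative',
      'next'   - index of the next event in the same thread, or None.
    The producer marks only the first event of a thread; the consumer
    copies that mark to the linked event as it walks the buffer.
    """
    emitted = []
    for ev in buffer:
        status = ev["status"]
        if status == "tentative":
            break  # assumption: wait until the thread is resolved
        if ev["next"] is not None:
            buffer[ev["next"]]["status"] = status   # follow the link
        if status == "committed":
            emitted.append(ev["name"])              # discarded events are skipped
    return emitted
```

The producer therefore pays a constant cost to commit or discard a thread of any length, and the per-event marking is deferred to the consumer.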
Branch lookahead prefetch for microprocessors

Patent number: 7877580

Abstract: A method of handling program instructions in a microprocessor which reduces delays associated with mispredicted branch instructions, by detecting the occurrence of a stall condition during execution of the program instructions, speculatively executing one or more pending instructions which include at least one branch instruction during the stall condition, and determining the validity of data utilized by the speculative execution. Dispatch logic determines the validity of the data by marking one or more registers of an instruction dispatch unit to indicate which results of the pending instructions are invalid. The speculative execution of instructions can occur across multiple pipeline stages of the microprocessor, and the validity of the data is tracked during their execution in the multiple pipeline stages while monitoring a dependency of the speculatively executed instructions relative to one another during their execution in the multiple pipeline stages.

Type: Grant

Filed: December 10, 2007

Date of Patent: January 25, 2011

Assignee: International Business Machines Corporation

Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
Stall prediction thread management

Patent number: 7865702

Abstract: Thread switching prevents pipeline stalls when executing multiple threads. An analysis of a first thread identifies instructions capable of causing pipeline stalls. If pipeline stalls from the identified instructions are likely, thread switching instructions are added to the first thread in place of the identified instructions. Thread switching instructions direct a microprocessor to suspend executing the thread and begin executing a second thread. Thread switching instructions can be added to the second thread to enable the resumption of the first thread at the location specified by the identified instruction. The thread switching instructions are configured to avoid pipeline stalls when switching threads. Thread switching instructions can store and retrieve thread-specific information upon the suspension and resumption of threads. Thread switching instructions can schedule the execution of two or more threads in accordance with load balancing schemes.

Type: Grant

Filed: August 17, 2009

Date of Patent: January 4, 2011

Assignee: Sony Computer Entertainment Inc.

Inventor: Victor Suba
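
The rewrite-and-switch idea in the abstract above (patent 7865702) can be shown with a toy model. Everything here is a simplification: instructions are strings, `stall_prone` is a hypothetical set standing in for the stall analysis, a switch simply defers the marked instruction until its thread resumes, and threads are scheduled round-robin.

```python
def insert_switches(thread, stall_prone):
    """Replace stall-prone instructions with a SWITCH that remembers them."""
    return [("SWITCH", ins) if ins in stall_prone else ("RUN", ins)
            for ins in thread]


def run(threads):
    """Interleave rewritten threads and return the (thread, instr) trace.

    On a SWITCH the current thread is suspended and the next thread runs;
    when the suspended thread resumes, it executes the deferred instruction.
    """
    pcs = [0] * len(threads)
    resuming = [False] * len(threads)
    trace, cur = [], 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        if pcs[cur] >= len(threads[cur]):
            cur = (cur + 1) % len(threads)      # finished; try the next thread
            continue
        kind, ins = threads[cur][pcs[cur]]
        if kind == "SWITCH" and not resuming[cur]:
            resuming[cur] = True                # resume here later
            cur = (cur + 1) % len(threads)
            continue
        trace.append((cur, ins))
        pcs[cur] += 1
        resuming[cur] = False
    return trace
```

With two threads that each contain one stall-prone load, the trace shows the second thread starting as soon as the first reaches its load, which is the latency-hiding effect the abstract describes.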
Mechanism for predicting and suppressing instruction replay in a processor

Patent number: 7861066

Abstract: A mechanism for suppressing instruction replay includes a processor having one or more execution units and a scheduler that issues instruction operations for execution by the one or more execution units. The scheduler may also cause instruction operations that are determined to be incorrectly executed to be replayed, or reissued. In addition, a prediction unit within the processor may predict whether a given instruction operation will replay and provide an indication that the given instruction operation will replay. The processor also includes a decode unit that may decode instructions and, in response to detecting the indication, may flag the given instruction operation. The scheduler may further inhibit issue of the flagged instruction operation until a status associated with the flagged instruction is good.

Type: Grant

Filed: July 20, 2007

Date of Patent: December 28, 2010

Assignee: Advanced Micro Devices, Inc.

Inventors: Ashutosh S. Dhodapkar, Michael G. Butler, Gene W. Shen
Method, system, and computer program product for selectively accelerating early instruction processing

Patent number: 7861064

Abstract: A method for selectively accelerating early instruction processing including receiving an instruction data that is normally processed in an execution stage of a processor pipeline, wherein a configuration of the instruction data allows a processing of the instruction data to be accelerated from the execution stage to an address generation stage that occurs earlier in the processor pipeline than the execution stage, determining whether the instruction data can be dispatched to the address generation stage to be processed without being delayed due to an unavailability of a processing resource needed for the processing of the instruction data in the address generation stage, dispatching the instruction data to be processed in the address generation stage if it can be dispatched without being delayed due to the unavailability of the processing resource, and dispatching the instruction data to be processed in the execution stage if it can not be dispatched without being delayed due to the unavailability of the pro

Type: Grant

Filed: February 26, 2008

Date of Patent: December 28, 2010

Assignee: International Business Machines Corporation

Inventors: Khary J. Alexander, Fadi Y. Busaba, Bruce C. Giamei, David S. Hutton, Chung-Lung Kevin Shum
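The dispatch decision in the selective-acceleration abstract above reduces to a two-condition check. A minimal sketch follows; the function name, the boolean inputs, and the stage labels are assumptions made for illustration.

```python
def choose_stage(accelerable, agen_resource_free):
    """Return the pipeline stage that should process the instruction.

    accelerable: the instruction's configuration allows early processing.
    agen_resource_free: the needed resource is available in the address
    generation stage, so early dispatch would not be delayed.
    """
    if accelerable and agen_resource_free:
        return "address_generation"   # processed earlier in the pipeline
    return "execution"                # normal path, no added delay
```

The point of the second condition is that acceleration is only taken when it is free: an instruction that would have to wait for the early-stage resource is simply sent down its normal path.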
CLOCK CONTROL DEVICE, CLOCK CONTROL METHOD, CLOCK CONTROL PROGRAM AND INTEGRATED CIRCUIT

Publication number: 20100325469

Abstract: An instruction detecting section (235) detects whether or not there is any succeeding instruction executable regardless of an order based on a data dependency relationship between a presently executed instruction and a succeeding instruction following the presently executed instruction. A clock switch judging section (236) receives notification of the start and end of a memory stall, determines whether or not a memory stall is occurring, and judges whether to switch a clock signal to be supplied to a CPU (200) to a low clock signal (239) or to stop the clock signal based on a detection result of the instruction detecting section (235) if it is judged that the memory stall is occurring. A clock switching section (237) switches the clock signal based on judgment by the clock switch judging section (236). By this construction, power consumption can be reduced without reducing performance.

Type: Application

Filed: December 10, 2008

Publication date: December 23, 2010

Inventors: Ryo Yokoyama, Tadao Tanikawa
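The clock switch judgment in the clock control abstract above can be sketched as a small decision function. The mapping used here (low clock when an independent instruction can still run, stopped clock otherwise) is an assumed reading of the abstract, and the function and label names are hypothetical.

```python
def clock_decision(memory_stall, independent_instruction_ready):
    """Pick a clock mode for the CPU.

    memory_stall: a memory stall has started and not yet ended.
    independent_instruction_ready: a succeeding instruction exists that does
    not depend on the presently executed (stalled) instruction.
    """
    if not memory_stall:
        return "normal"
    # Assumption: keep a slow clock if useful work remains, else stop it.
    return "low" if independent_instruction_ready else "stop"
```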
Store queue architecture for a processor that supports speculative execution

Patent number: 7849290

Abstract: Embodiments of the present invention provide a system that buffers stores on a processor that supports speculative execution. The system starts by buffering a store into an entry in the store queue during a speculative execution mode. If an entry for the store does not already exist in the store queue, the system writes the store into an available entry in the store queue and updates a byte mask for the entry. Otherwise, if an entry for the store already exists in the store queue, the system merges the store into the existing entry in the store queue and updates the byte mask for the entry to include information about the newly merged store. The system then forwards the data from the store queue to subsequent dependent loads.

Type: Grant

Filed: July 9, 2007

Date of Patent: December 7, 2010

Assignee: Oracle America, Inc.

Inventors: Robert E. Cypher, Shailender Chaudhry
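The allocate-or-merge behavior in the store queue abstract above can be sketched with a byte mask per entry. This is a toy model, not the patented design: the 8-byte entry width, the dictionary keyed by aligned address, and the `forward` rule (forward only when every requested byte is buffered) are assumptions.

```python
class StoreQueue:
    """Toy store queue with one entry per aligned 8-byte word (assumed width)."""
    WORD = 8

    def __init__(self):
        # aligned base address -> (data bytes, byte mask as an int bitfield)
        self.entries = {}

    def store(self, addr, data):
        """Buffer a store of `data` (bytes) at `addr`; it must fit in one word."""
        base = addr - addr % self.WORD
        off = addr - base
        if off + len(data) > self.WORD:
            raise ValueError("store crosses a word boundary")
        # Use the existing entry if there is one, otherwise allocate a new one.
        buf, mask = self.entries.get(base, (bytearray(self.WORD), 0))
        buf[off:off + len(data)] = data
        for i in range(off, off + len(data)):
            mask |= 1 << i            # mark each written byte as valid
        self.entries[base] = (buf, mask)

    def forward(self, addr, size):
        """Return bytes for a dependent load, or None if any byte is missing."""
        base = addr - addr % self.WORD
        off = addr - base
        if base not in self.entries or off + size > self.WORD:
            return None
        buf, mask = self.entries[base]
        need = ((1 << size) - 1) << off
        if mask & need != need:
            return None
        return bytes(buf[off:off + size])
```

Two adjacent stores to the same word end up in a single entry whose mask covers both, so a later load spanning both can be served from the queue.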
Method and system for pipeline reduction

Patent number: 7844799

Abstract: A method and system for operating a high frequency out-of-order processor with increased pipeline length. A new scheme is disclosed to reduce the pipeline by the detection and exploitation of so called “no dependency” for an instruction. A “no dependency” signal tells that all required source data is available for the instruction at least one cycle before the source data valid bit(s) are inserted into the issue queue. Therefore, one or more stages of the pipeline are bypassed.

Type: Grant

Filed: December 20, 2001

Date of Patent: November 30, 2010

Assignee: International Business Machines Corporation

Inventors: Jens Leenstra, Antje Mueller, Juergen Pille, Dieter Wendel
Method and system for supporting software pipelining using a shifting register queue

Patent number: 7836279

Abstract: A system for supporting software pipelining using a shifting register queue is provided. The system includes a register file that comprises a plurality of registers. The register file is operable to receive a shift mask signal and a shift signal and to identify a shifting register queue within the register file based on the shift mask signal. The shifting register queue comprises a plurality of queue registers. The register file is further operable to shift the contents of the queue registers based on the shift signal.

Type: Grant

Filed: December 31, 2003

Date of Patent: November 16, 2010

Assignee: STMicroelectronics, Inc.

Inventors: Osvaldo Colavin, Vineet Soni, Davide Rizzo
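The shift mask and shift signal in the shifting register queue abstract above can be modeled in a few lines. In this sketch the register file is a Python list, bit i of `shift_mask` selects register i, and the shift direction and the `fill` value written to the head are assumptions rather than details from the patent.

```python
def shift_queue(regs, shift_mask, shift, fill=0):
    """Return the register file after an optional shift of the masked queue.

    regs: list of register values.
    shift_mask: int bitfield; bit i set means register i is a queue register.
    shift: the shift signal; nothing moves when it is False.
    """
    if not shift:
        return list(regs)
    queue = [i for i in range(len(regs)) if shift_mask >> i & 1]
    out = list(regs)
    # Each queue register takes the value of the previous queue register.
    for prev, cur in zip(queue, queue[1:]):
        out[cur] = regs[prev]
    if queue:
        out[queue[0]] = fill          # head is freed for the next value
    return out
```

Registers outside the mask keep their contents, which is what lets a loop keep ordinary registers alongside the rotating ones.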
Communicating signals between semiconductor chips using round-robin-coupled micropipelines

Patent number: 7831810

Abstract: Embodiments of the present invention provide a system for transferring data between a receiver chip and a transmitter chip. The system includes a set of data path circuits in the transmitter chip and a set of data path circuits in the receiver chip coupled to a shared data channel. In addition, the system includes a set of asynchronous control circuits for controlling corresponding data path circuits in the transmitter chip and receiver chip. Upon detecting the transition of a control signal for an asynchronous control circuit in the transmitter chip, the asynchronous control circuit is configured to enable a transfer of data from the corresponding data path circuit in the transmitter chip across the data channel to a corresponding data path circuit in the receiver chip, and generate a control signal to cause a next asynchronous control circuit to commence the transfer of a data signal.

Type: Grant

Filed: October 2, 2007

Date of Patent: November 9, 2010

Assignee: Oracle America, Inc.

Inventor: Scott M. Fairbanks
Processor livelock recovery by gradual stalling of instruction processing rate during detection of livelock condition

Patent number: 7818544

Abstract: Mechanisms for placing a processor into a gradual slow down mode of operation are provided. The gradual slow down mode of operation comprises a plurality of stages of slow down operation of an issue unit in a processor in which the issuance of instructions is slowed in accordance with a staging scheme. The gradual slow down of the processor allows the processor to break out of livelock conditions. Moreover, since the slow down is gradual, the processor may flexibly avoid various degrees of livelock conditions. The mechanisms of the illustrative embodiments impact the overall processor performance based on the severity of the livelock condition by taking a small performance impact on less severe livelock conditions and only increasing the processor performance impact when the livelock condition is more severe.

Type: Grant

Filed: September 5, 2008

Date of Patent: October 19, 2010

Assignee: International Business Machines Corporation

Inventors: Christopher M. Abernathy, Kurt A. Feiste, Ronald P. Hall, Albert J. Van Norstrand, Jr.
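The staging scheme in the livelock recovery abstract above can be sketched as a mapping from livelock severity to an issue interval. The stage table, the use of cycles without progress as the severity measure, and the `threshold` value are hypothetical choices for illustration.

```python
# Hypothetical staging scheme: issue one instruction every N cycles.
STAGES = [1, 2, 4, 8, 16]

def issue_interval(cycles_without_progress, threshold=100):
    """Pick the issue interval from how long no instruction has completed.

    Each additional `threshold` cycles without progress moves the issue
    unit one stage further into slow-down, up to the last stage.
    """
    stage = min(cycles_without_progress // threshold, len(STAGES) - 1)
    return STAGES[stage]
```

A mild livelock therefore costs only a small slowdown, and the larger intervals are reached only if the condition persists.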
Configurable pipeline to process an operation at alternate pipeline stages depending on ECC/parity protection mode of memory access

Patent number: 7814300

Abstract: A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a data processor instruction that causes the data processor to perform a first set of computational operations during execution of the data processor instruction, performing the first set of computational operations in the first instruction pipeline stage if the data processor instruction is being executed and a first mode has been selected, and performing the first set of computational operations in the second instruction pipeline stage if the data processor instruction is being executed and a second mode has been selected.

Type: Grant

Filed: April 30, 2008

Date of Patent: October 12, 2010

Assignee: Freescale Semiconductor, Inc.

Inventors: William C. Moyer, Jeffrey W. Scott
Trace indexing via trace end addresses

Patent number: 7802077

Abstract: A new class of traces for a processing engine, called “extended blocks,” possesses an architecture that permits possibly many entry points but only a single exit point. These extended blocks may be indexed based upon the address of the last instruction therein. Use of the new trace architecture provides several advantages, including reduction of instruction redundancies, dynamic block extension and a sharing of instructions among various extended blocks.

Type: Grant

Filed: June 30, 2000

Date of Patent: September 21, 2010

Assignee: Intel Corporation

Inventors: Stephen J. Jourdan, Lihu Rappoport, Ronny Ronen, Adi Yoaz
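Indexing extended blocks by their end address, as the trace indexing abstract above describes, can be sketched with a dictionary keyed on the last instruction's address. The sketch assumes that two recorded runs ending at the same exit share a common suffix, and that a lookup already knows the exit address; both are simplifications, and the class and method names are hypothetical.

```python
class TraceCache:
    """Toy cache of extended blocks keyed by the address of the last instruction."""

    def __init__(self):
        self.blocks = {}   # end address -> list of instruction addresses

    def record(self, addrs):
        """Record an executed run of instruction addresses ending at an exit."""
        end = addrs[-1]
        old = self.blocks.get(end, [])
        # A longer run ending at the same exit extends the block backwards;
        # the shorter run remains reachable as a later entry point.
        if len(addrs) > len(old):
            self.blocks[end] = list(addrs)

    def lookup(self, entry, end):
        """Return the instructions from `entry` to the exit, or None."""
        block = self.blocks.get(end)
        if block and entry in block:
            return block[block.index(entry):]
        return None
```

Because both runs are stored once under the same key, the shared instructions are not duplicated, which is the redundancy reduction the abstract mentions.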
Prediction of data values read from memory by a microprocessor using the storage destination of a load operation

Patent number: 7788473

Abstract: Prediction of data values to be read from memory by a microprocessor for load operations. In one aspect, a method for predicting a data value that will result from a load operation to be executed by the microprocessor includes accessing an entry in a load value prediction table that stores a predicted data value corresponding to the load operation. The predicted data value is stored in a physical storage destination of the microprocessor to be available as a result of the load operation without waiting for execution of the load operation to complete. The storage destination is the destination for a loaded data value resulting from executing the load operation.

Type: Grant

Filed: December 26, 2006

Date of Patent: August 31, 2010

Assignee: Oracle America, Inc.

Inventors: Chris Nelson, Matthew Ashcraft, John Gregory Favor, Seungyoon Peter Song
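The load value prediction table in the abstract above can be illustrated with a last-value predictor. Indexing the table by the load's program counter, training on the last observed value, and the method names are assumptions; the abstract only states that a predicted value is placed in the load's storage destination before the load completes.

```python
class LoadValuePredictor:
    """Toy last-value predictor indexed by the load's PC (an assumption)."""

    def __init__(self):
        self.table = {}   # load PC -> predicted data value
        self.regs = {}    # storage destination -> current value

    def issue_load(self, pc, dest):
        """Write the predicted value into the destination early.

        Returns True if a prediction was available.
        """
        if pc in self.table:
            self.regs[dest] = self.table[pc]
            return True
        return False

    def complete_load(self, pc, dest, actual):
        """Load finished: correct the destination and train the table.

        Returns True if an earlier prediction turned out to be wrong.
        """
        mispredicted = pc in self.table and self.table[pc] != actual
        self.regs[dest] = actual
        self.table[pc] = actual
        return mispredicted
```

Dependent instructions can read `regs[dest]` as soon as `issue_load` returns True; a True result from `complete_load` signals that they would need to be re-executed.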
Symbolic store-load bypass

Patent number: 7779236

Abstract: The invention provides a method and system for operating a pipelined microprocessor more quickly, by detecting instructions that load from identical memory locations as were recently stored to, without having to actually compute the referenced external memory addresses. The microprocessor examines the symbolic structure of instructions as they are encountered, so as to be able to detect identical memory locations by examination of their symbolic structure. For example, in a preferred embodiment, instructions that store to and load from an identical offset from an identical register are determined to be referencing the identical memory location, without having to actually compute the complete physical target address.

Type: Grant

Filed: November 19, 1999

Date of Patent: August 17, 2010

Assignee: STMicroelectronics, Inc.

Inventor: David L. Isaman
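The symbolic comparison in the store-load bypass abstract above can be sketched by matching loads to earlier stores on the (base register, offset) pair alone, without computing any address. The `Instr` tuple and its fields are hypothetical, and the sketch ignores aliasing through different base registers and differing access sizes, which a real design would have to handle.

```python
from collections import namedtuple

# Hypothetical instruction form: op is "store" or "load"; the memory operand
# is base register + offset; reg is the register stored from or loaded into.
Instr = namedtuple("Instr", "op base offset reg")

def find_bypasses(instrs):
    """Return {load index: store index} for symbolically matching pairs."""
    recent = {}   # (base, offset) -> index of the latest store
    bypass = {}
    for i, ins in enumerate(instrs):
        key = (ins.base, ins.offset)
        if ins.op == "store":
            recent[key] = i
        elif ins.op == "load":
            if key in recent:
                bypass[i] = recent[key]
            # The load overwrites ins.reg, so matches that use it as a base
            # register no longer name the same location.
            recent = {k: v for k, v in recent.items() if k[0] != ins.reg}
    return bypass
```

For a store to `sp+8` followed by a load from `sp+8`, the load is paired with the store as long as `sp` has not been overwritten in between.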
Result bypassing to override a data hazard within a superscalar processor

Patent number: 7774582

Abstract: A data processing system including multiple execution pipelines each having multiple execution stages E1, E2, E3 may have instructions issued together in parallel despite a data dependency therebetween if it is detected that the result operand value for the older instruction will be generated in an execution stage prior to an execution stage which requires that result operand value as an input operand value to the younger instruction and accordingly cross-forwarding of the operand value is possible between the execution pipelines to resolve the data dependency.

Type: Grant

Filed: May 26, 2005

Date of Patent: August 10, 2010

Assignee: ARM Limited

Inventors: David James Williamson, Glen Andrew Harris, Stephen John Hill
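The issue condition in the result bypassing abstract above is a comparison between two stage numbers. In this sketch the stage tables and opcode names are hypothetical; only the rule (the older instruction's result must be generated in a stage before the stage where the younger instruction requires it) comes from the abstract.

```python
# Hypothetical stage numbers: E1 = 1, E2 = 2, E3 = 3.
RESULT_READY = {"add": 1, "mul": 3}            # stage that generates the result
OPERAND_NEEDED = {"add": 1, "store_data": 3}   # stage that requires the operand

def can_issue_together(older, younger):
    """True if cross-forwarding can resolve the dependency between the pair."""
    return RESULT_READY[older] < OPERAND_NEEDED[younger]
```

With these tables an `add` feeding the data operand of a store can issue in the same cycle as the store, while an `add` feeding another `add` cannot.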
Method for allocating registers using simulated annealing controlled instruction scheduling

Patent number: 7761691

Abstract: A method for scheduling instructions for clustered digital signal processors comprising a plurality of clusters, each cluster including at least two functional units and a first register file having a first unit, a second unit and a single set of access ports shared by the functional units comprises steps of checking whether executing one instruction needs data to be read from the first unit and the second unit of the first register file, generating a copying instruction to transfer data from the first unit to the second unit of the first register file, checking whether there is a prior operation cycle available to perform the copying instruction, scheduling the copying instruction in the prior operation cycle, and scheduling the instruction after the copying instruction.

Type: Grant

Filed: October 27, 2005

Date of Patent: July 20, 2010

Assignee: National Tsing Hua University

Inventors: Chung-Lin Tang, Yung-Chia Lin, Jenq-Kuen Lee
MECHANISM FOR INCREASING THE EFFECTIVE CAPACITY OF THE WORKING REGISTER FILE

Publication number: 20100180103

Abstract: A computer processor pipeline has both an architectural register file and a working register file. The lifetime of an entry in the working register file is determined by a predetermined number of instructions passing through a specified stage in the pipeline after the location in the working register file is allocated for an instruction. The size of the working register file is selected based upon performance characteristics. A working register file credit indicator is coupled to the front end pipeline portion and to the back end pipeline portion. The working register file credit indicator is monitored to prevent a working register file overflow. When a location in the architectural register file is read early, the location is monitored to determine whether the location is written to prior to issuance of the instruction associated with the early read.

Type: Application

Filed: January 15, 2009

Publication date: July 15, 2010

Inventors: Shailender Chaudhry, Paul Caprioli, Marc Tremblay
Multi-thread processing system for detecting and handling live-lock conditions by arbitrating livelock priority of logical processors based on a predetermined amount of time

Patent number: 7748001

Abstract: Method, apparatus and system embodiments to assign priority to a thread when the thread is otherwise unable to proceed with instruction retirement. For at least one embodiment, the thread is one of a plurality of active threads in a multiprocessor system that includes memory livelock breaker logic and/or starvation avoidance logic. Other embodiments are also described and claimed.

Type: Grant

Filed: September 23, 2004

Date of Patent: June 29, 2010

Assignee: Intel Corporation

Inventors: David W. Burns, K. S. Venkatraman
Power aware software pipelining for hardware accelerators

Publication number: 20100146314

Abstract: Forming a plurality of pipeline orderings, each pipeline ordering comprising one of a sequential, a parallel, or a sequential and parallel combination of a plurality of stages of a pipeline, analyzing the plurality of pipeline orderings to determine a total power of each of the orderings, and selecting one of the plurality of pipeline orderings based on the determined total power of each of the plurality of pipeline orderings.

Type: Application

Filed: February 11, 2010

Publication date: June 10, 2010

Inventors: Ron Gabor, Hong Jiang, Alon Naveh, Doron Rajwan, James Varga, Gady Yearim, Yuval Yosef
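The form, analyze, and select steps in the power aware pipelining abstract above can be sketched as a cost function over candidate orderings. Everything numeric here is invented for illustration: the stage names, durations, power figures, the static power term, and the peak-power cap are assumptions, and the cost model (energy in mW·ms) is one possible stand-in for the "total power" the abstract refers to.

```python
# Hypothetical stages: name -> (duration in ms, active power in mW).
STAGES = {"decode": (4, 300), "scale": (2, 200), "encode": (6, 500)}
STATIC_MW = 100   # assumed static power while the pipeline is busy
CAP_MW = 900      # assumed peak-power budget

def cost(ordering):
    """Return the energy (mW*ms) of an ordering, or None if over the cap.

    ordering: list of groups; groups run one after another and the stages
    inside a group run in parallel.
    """
    energy = 0
    for group in ordering:
        if sum(STAGES[s][1] for s in group) + STATIC_MW > CAP_MW:
            return None
        energy += sum(STAGES[s][0] * STAGES[s][1] for s in group)
        energy += STATIC_MW * max(STAGES[s][0] for s in group)
    return energy

def select(orderings):
    """Return the feasible ordering with the lowest cost."""
    feasible = [(cost(o), o) for o in orderings if cost(o) is not None]
    return min(feasible, key=lambda pair: pair[0])[1]
```

Under these made-up numbers a fully parallel ordering exceeds the cap, a fully sequential one pays static power for the longest time, and a mixed ordering is selected.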