Instruction Issuing Patents (Class 712/214)
  • Publication number: 20080282067
    Abstract: A multi-threaded in-order superscalar processor 2 includes an issue stage 12 including issue circuitry 22, 24 for selecting instructions to be issued to execution units 14, 16 in dependence upon a currently selected issue policy. A plurality of different issue policies are provided by associated different policy circuitry 28, 30, 32 and a selection between which of these instances of the policy circuitry 28, 30, 32 is active is made by policy selecting circuitry 34 in dependence upon detected dynamic behaviour of the processor 2.
    Type: Application
    Filed: March 27, 2008
    Publication date: November 13, 2008
    Applicant: ARM Limited
    Inventors: Emre Ozer, Stuart David Biles
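    A toy software model can make the policy-switching idea concrete. The sketch below is not ARM's policy circuitry: it assumes a hypothetical stall monitor that swaps between a round-robin and an oldest-waiting-thread issue policy once a stall-cycle count crosses a threshold (the two policies and the threshold are illustrative).
      # Illustrative model of dynamic issue-policy selection (cf. 20080282067).

      def round_robin_policy(ready, last_issued, wait_cycles):
          """Pick the next ready thread after the one issued last cycle."""
          n = len(ready)
          for offset in range(1, n + 1):
              tid = (last_issued + offset) % n
              if ready[tid]:
                  return tid
          return None  # nothing ready this cycle

      def oldest_first_policy(ready, last_issued, wait_cycles):
          """Pick the ready thread that has been waiting the longest."""
          candidates = [t for t, r in enumerate(ready) if r]
          return max(candidates, key=lambda t: wait_cycles[t]) if candidates else None

      class PolicySelector:
          """Switch the active issue policy based on observed dynamic behaviour."""
          def __init__(self, stall_threshold=8):
              self.stall_threshold = stall_threshold
              self.recent_stalls = 0
              self.policy = round_robin_policy

          def observe(self, stalled_this_cycle):
              # Sustained stalling suggests the current policy is starving a thread.
              self.recent_stalls = self.recent_stalls + 1 if stalled_this_cycle else 0
              self.policy = (oldest_first_policy
                             if self.recent_stalls >= self.stall_threshold
                             else round_robin_policy)

          def select(self, ready, last_issued, wait_cycles):
              return self.policy(ready, last_issued, wait_cycles)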
  • Patent number: 7444498
    Abstract: The present invention allows a microprocessor to identify and speculatively execute future load instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition, which would otherwise cause the microprocessor or thread of execution to be idle. The data for such future load instructions can be prefetched from a distant cache or main memory such that when the load instruction is re-executed (non-speculatively executed) after the stall condition expires, its data will either reside in the L1 cache or be en route to the processor, resulting in a reduced execution latency. When an extended stall condition is detected, load lookahead prefetch is started, allowing speculative execution of instructions that would normally have been stalled.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: October 28, 2008
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
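    A rough software analogue of the load-lookahead idea: during an extended stall, scan ahead for loads whose addresses do not depend on the stalled result and prefetch them, so the later non-speculative re-execution hits in the cache. The instruction representation and the prefetch() stand-in below are assumptions for illustration, not IBM's hardware mechanism.
      # Sketch of load-lookahead prefetch during an extended stall (cf. 7444498).

      prefetch_queue = []

      def prefetch(address):
          """Stand-in for issuing a cache prefetch request."""
          prefetch_queue.append(address)

      def lookahead_prefetch(window, stalled_dest_regs):
          """Walk the instructions behind the stall and prefetch independent loads.

          window            -- list of dicts: {'op', 'srcs', 'dest', 'addr'}
          stalled_dest_regs -- registers produced by the stalled instruction(s)
          """
          poisoned = set(stalled_dest_regs)        # values not yet available
          for insn in window:
              if poisoned & set(insn['srcs']):
                  poisoned.add(insn['dest'])       # depends on the stall, result unreliable
                  continue
              if insn['op'] == 'load':
                  prefetch(insn['addr'])           # warm the cache for the re-execution

      if __name__ == '__main__':
          window = [
              {'op': 'add',  'srcs': ['r1'], 'dest': 'r2', 'addr': None},
              {'op': 'load', 'srcs': ['r3'], 'dest': 'r4', 'addr': 0x1000},
              {'op': 'load', 'srcs': ['r2'], 'dest': 'r5', 'addr': 0x2000},
          ]
          lookahead_prefetch(window, stalled_dest_regs=['r1'])
          print(prefetch_queue)  # [4096]: only the load independent of r1 is prefetched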
  • Publication number: 20080256337
    Abstract: In the field of data communications, it is desirable to track bits of a bit sequence remaining to be decoded by a decoder. A method of decoding the bit sequence that corresponds to a PDU comprises reading in a bit sequence and processing the bit sequence. In order to maintain a record of the bits remaining to be processed, a data stack is used during decoding of the bit sequence.
    Type: Application
    Filed: April 14, 2008
    Publication date: October 16, 2008
    Applicant: AGILENT TECHNOLOGIES, INC.
    Inventors: Kevin Mitchell, Tony Kirkham
  • Patent number: 7437538
    Abstract: An apparatus and method for floating-point special case handling. In one embodiment, a processor may include a first execution unit configured to execute a longer-latency floating-point instruction, and a second execution unit configured to execute a shorter-latency floating-point instruction. In response to the longer-latency floating-point instruction being issued to the first execution unit, the second execution unit may be further configured to detect whether a result of the longer-latency floating-point instruction is determinable from one or more operands of the longer-latency floating-point instruction independently of the first execution unit executing the longer-latency floating-point instruction. In response to detecting that the result is determinable, the second execution unit may be further configured to flush the longer-latency floating-point instruction from the first execution unit and to determine the result.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: October 14, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Jeffrey S. Brooks, Christopher H. Olson
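    The early-out test amounts to asking whether the result of a long-latency operation (divide, say) already follows from the operands. Below is a minimal numeric sketch for a handful of IEEE-754 special cases; the case list is an illustrative subset, not Sun's detection logic, and a None return means the full divider must run.
      import math

      def div_special_case(a, b):
          """Return a/b if it is determinable from the operands alone, else None."""
          sign = math.copysign(1.0, a) * math.copysign(1.0, b)
          if math.isnan(a) or math.isnan(b):
              return math.nan                  # NaN operands propagate
          if math.isinf(a) and math.isinf(b):
              return math.nan                  # inf / inf is invalid
          if b == 0.0:
              return math.nan if a == 0.0 else sign * math.inf
          if a == 0.0 or math.isinf(b):
              return sign * 0.0                # zero result, correctly signed
          if math.isinf(a):
              return sign * math.inf
          return None                          # ordinary operands: use the divider

      if __name__ == '__main__':
          print(div_special_case(3.0, 0.0))    # inf  -> flush the divide, forward this
          print(div_special_case(3.0, 4.0))    # None -> the long-latency unit must run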
  • Publication number: 20080229076
    Abstract: A macroscalar processor architecture is described herein. In one embodiment, an exemplary processor includes one or more execution units to execute instructions and one or more iteration units coupled to the execution units. The one or more iteration units receive one or more primary instructions of a program loop that comprise a machine executable program. For each of the primary instructions received, at least one of the iteration units generates multiple secondary instructions that correspond to multiple loop iterations of the task of the respective primary instruction when executed by the one or more execution units. Other methods and apparatuses are also described.
    Type: Application
    Filed: May 23, 2008
    Publication date: September 18, 2008
    Inventor: Jeffry E. Gonion
  • Publication number: 20080222394
    Abstract: Systems and methods for distributing thread instructions in the pipeline of a multi-threading digital processor are disclosed. More particularly, hardware and software are disclosed for successively selecting threads in an ordered sequence for execution in the processor pipeline. If a thread to be selected cannot execute, then a complementary thread is selected for execution.
    Type: Application
    Filed: May 21, 2008
    Publication date: September 11, 2008
    Inventors: Claude Basso, Jean Louis Calvignac, Chih-jen Chang, Gordon Taylor David, Harm Peter Hofstee, Fabrice Jean Verplanken, Colin Beaton Verrilli
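    A compact reading of the selection rule: walk the threads in a fixed order each cycle, and when the scheduled thread cannot execute, substitute its complementary partner. The pairing used below (offset by half the thread count) is an assumption made for illustration.
      # Sketch of ordered thread selection with a complementary fallback (cf. 20080222394).

      def select_thread(cycle, can_execute, num_threads):
          """Pick a thread for this cycle, or None if neither candidate can run."""
          primary = cycle % num_threads            # fixed ordered sequence
          if can_execute[primary]:
              return primary
          complementary = (primary + num_threads // 2) % num_threads
          if can_execute[complementary]:
              return complementary                 # partner takes the slot
          return None                              # slot goes unused this cycle

      if __name__ == '__main__':
          # Thread 1 is blocked, so its partner (thread 3 of 4) runs in its slot.
          print([select_thread(c, [True, False, True, True], 4) for c in range(4)])
          # -> [0, 3, 2, 3]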
  • Publication number: 20080215854
    Abstract: A method for adaptive runtime reconfiguration of a co-processor instruction set, in a computer system with at least a main processor communicatively connected to at least one reconfigurable co-processor, includes the steps of configuring the co-processor to implement an instruction set comprising one or more co-processor instructions, issuing a co-processor instruction to the co-processor, and determining whether the instruction is implemented in the co-processor. For an instruction not implemented in the co-processor instruction set, a stall signal is raised to delay the main processor, it is determined whether there is enough space in the co-processor for the non-implemented instruction, and, if there is enough space, the instruction set of the co-processor is reconfigured by adding the non-implemented instruction. The stall signal is then cleared and the instruction is executed.
    Type: Application
    Filed: May 15, 2008
    Publication date: September 4, 2008
    Inventors: Sameh W. Asaad, Richard Gerard Hofmann
  • Publication number: 20080209176
    Abstract: A multi-core microprocessor has a plurality of processor cores which are coupled to a bridge element. The bridge element sends transactions to and/or receives transactions from the processor cores, where each transaction has one or more packets. The transactions include atomic transactions. The bridge element comprises a buffer unit storing a time stamp for each packet sent or received. Furthermore, a multi-core multi-node processor system is provided that has debug hardware to capture and time stamp intra-node and/or inter-node transaction packets. Atomic operations are, for example, atomic read-modify-write instructions.
    Type: Application
    Filed: August 9, 2007
    Publication date: August 28, 2008
    Inventors: Padmaraj Singh, Todd Foster, Dennis Lastor
  • Publication number: 20080209178
    Abstract: A method is provided for evaluating two or more instructions in an out of order issue queue during a particular cycle of the queue, to select an instruction for issue during the next following cycle. If an instruction was previously designated to issue during the particular cycle, one or more instructions in the queue are evaluated to determine if any of them are dependent on the designated instruction. For the evaluation, each instruction placed into the queue is accompanied by corresponding logic elements that provide destination to source compares for the instruction. In one embodiment of the method, the oldest ready instruction in the queue during a particular cycle is identified.
    Type: Application
    Filed: May 2, 2008
    Publication date: August 28, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: William Elton Burky, Raymond Cheung Yeung
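    In software terms, the per-entry compare is a match between the designated instruction's destination and each queued entry's sources, and alongside it the oldest ready entry is identified. The sketch below just surfaces those two pieces of information (field names are assumptions); how the issue logic combines them is left open, as in the abstract.
      def evaluate_queue(queue, designated):
          """Mimic the per-entry compares of 20080209178.

          queue      -- list of dicts ordered oldest-first: {'dest', 'srcs', 'ready'}
          designated -- the entry issuing this cycle, or None

          Returns (dependents, oldest_ready): entries whose sources match the
          designated instruction's destination, and the oldest entry marked ready.
          """
          dependents = []
          if designated is not None:
              dependents = [e for e in queue if designated['dest'] in e['srcs']]
          oldest_ready = next((e for e in queue if e['ready']), None)
          return dependents, oldest_ready

      if __name__ == '__main__':
          q = [{'dest': 'r2', 'srcs': ['r1'], 'ready': False},
               {'dest': 'r4', 'srcs': ['r3'], 'ready': True}]
          print(evaluate_queue(q, designated={'dest': 'r1', 'srcs': [], 'ready': True}))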
  • Publication number: 20080209177
    Abstract: A method and system for maintaining a best-case demand redispatch of an instruction to allow for maximizing the time a rejected thread may execute in lookahead execution mode, while maintaining the smallest L1 cache miss penalty supported by the memory subsystem. In response to a demand miss, a load/store unit sends a fetch request to the next level cache. The cache line of the demand miss is examined to identify the critical sector. Once the critical sector is identified, a best-case data return time is determined based on the fastest time the next level cache is able to return the critical sector of the cache line. The load/store unit then sends a speculative warning to the dispatch unit to coincide with the best-case data return, wherein the speculative warning prepares the dispatch unit to resend the instruction for execution as soon as data is available to the processor core.
    Type: Application
    Filed: May 1, 2008
    Publication date: August 28, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Scott Bruce Frommer, Sheldon B. Levenstein, Bruce Joseph Ronchetti, Anthony Saporito
  • Patent number: 7418576
    Abstract: A graphics processor buffers vertex threads and pixel threads. The different types of threads issue instructions corresponding to different sets of operations. A plurality of different types of execution units are provided, each type of execution unit servicing a different class of operations, such as an execution unit supporting texture operations, an execution unit supporting blending operations, and an execution unit supporting mathematical operations. Current instructions of the threads are buffered and prioritized in a common instruction buffer. A set of high priority instructions is issued per cycle to the plurality of different types of execution units.
    Type: Grant
    Filed: November 17, 2004
    Date of Patent: August 26, 2008
    Assignee: Nvidia Corporation
    Inventors: John E. Lindholm, Brett W. Coon
  • Publication number: 20080189521
    Abstract: A method for optimizing throughput in a microprocessor that is capable of processing multiple threads of instructions simultaneously. Instruction issue logic is provided between the input buffers and the pipeline of the microprocessor. The instruction issue logic speculatively issues instructions from a given thread based on the probability that the required operands will be available when the instruction reaches the stage in the pipeline where they are required. Issue of an instruction is blocked if the current pipeline conditions indicate that there is a significant probability that the instruction will need to stall in a shared resource to wait for operands. Once the probability that the instruction will stall is below a certain threshold, based on current pipeline conditions, the instruction is allowed to issue.
    Type: Application
    Filed: April 17, 2008
    Publication date: August 7, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Victor Roberts Augsburg, Jeffrey Todd Bridges, Michael Scott Mcilvaine, Thomas Andrew Sartorius, Rodney Wayne Smith
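    The gating rule is a threshold test on an estimated stall probability derived from current pipeline conditions. The estimator below is a made-up stand-in (a simple weighted sum over assumed signals); only the block-until-below-threshold structure mirrors the abstract.
      # Illustrative threshold model of speculative-issue gating (cf. 20080189521).

      def estimate_stall_probability(operand_gap_cycles, shared_resource_busy):
          """Crude stand-in for the patent's pipeline-condition estimate.

          operand_gap_cycles   -- cycles by which the needed operand is expected
                                  to arrive after the instruction would need it
          shared_resource_busy -- whether the shared pipeline resource is occupied
          """
          p = min(1.0, 0.25 * max(0, operand_gap_cycles))  # later operand, higher risk
          if shared_resource_busy:
              p = min(1.0, p + 0.3)                        # contention adds risk
          return p

      def may_issue(operand_gap_cycles, shared_resource_busy, threshold=0.5):
          """Block speculative issue while the estimated stall probability is high."""
          return estimate_stall_probability(operand_gap_cycles,
                                            shared_resource_busy) < threshold

      if __name__ == '__main__':
          print(may_issue(0, False))   # True: operands on time, issue speculatively
          print(may_issue(4, True))    # False: hold the instruction in its buffer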
  • Publication number: 20080189520
    Abstract: A method for dispatching instructions in a data processing system having a memory for storing instructions and a plurality of central processing units, where each central processing unit includes a circuit that provides data indicating internal performance. The method has the steps of receiving internal performance data signals from a pool of central processing units, selecting a central processing unit according to the received internal performance data, and dispatching instructions from the memory to the selected central processing unit.
    Type: Application
    Filed: February 6, 2007
    Publication date: August 7, 2008
    Inventors: Deepak K. Singh, Francois Ibrahim Atallah
  • Patent number: 7409530
    Abstract: A VLIW instruction format is introduced having a set of control bits which identify subinstruction sharing conditions. At compilation, the VLIW instruction is analyzed to identify subinstruction sharing opportunities. Such opportunities are encoded in the control bits of the instruction. Before the instruction is moved into the instruction cache, the instruction is compressed into the new format to delete select redundant occurrences of a subinstruction. Specifically, where a subinstruction is to be shared by corresponding functional processing units of respective clusters, the subinstruction need only appear in the instruction once. The redundant appearance is deleted. The control bits are decoded at instruction parsing time to route a shared subinstruction to the associated functional processing units.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: August 5, 2008
    Assignee: University of Washington
    Inventors: Donglok Kim, Stefan G. Berg, Weiyun Sun, Yongmin Kim
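    The compression step deletes a subinstruction that two clusters would both execute and records the sharing in control bits; the parser re-fans it out. A list-based sketch with a simplified one-bit-per-slot sharing encoding (an assumption, not the patent's control-bit format):
      def compress_vliw(subinstructions):
          """Drop duplicate subinstructions and record sharing in control bits.

          Returns (share_bits, packed): share_bits[i] is True when slot i reuses
          the previous slot's subinstruction and was therefore not stored again.
          """
          share_bits, packed = [], []
          for i, sub in enumerate(subinstructions):
              if i > 0 and sub == subinstructions[i - 1]:
                  share_bits.append(True)          # shared: do not store again
              else:
                  share_bits.append(False)
                  packed.append(sub)
          return share_bits, packed

      def expand_vliw(share_bits, packed):
          """Inverse routing step performed at instruction-parse time."""
          out, it = [], iter(packed)
          for shared in share_bits:
              out.append(out[-1] if shared else next(it))
          return out

      if __name__ == '__main__':
          word = ['mul r1,r2', 'mul r1,r2', 'ld r3', 'ld r3']
          bits, packed = compress_vliw(word)
          assert expand_vliw(bits, packed) == word
          print(bits, packed)   # [False, True, False, True] ['mul r1,r2', 'ld r3']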
  • Publication number: 20080177981
    Abstract: A method and apparatus for executing instructions in a pipeline processor. The method decreases the latency between an instruction cache and a pipeline processor when bubbles occur in the processing stream due to an execution of a branch correction, or when an interrupt changes the sequence of an instruction stream. The latency is reduced when a decode stage for detecting branch prediction and a related instruction queue location have invalid data representing a bubble in the processing stream. Instructions for execution are inserted in parallel into the decode stage and instruction queue, thereby reducing the effective length of the pipeline by one cycle.
    Type: Application
    Filed: October 8, 2007
    Publication date: July 24, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James N. Dieffenderfer, Richard W. Doing, Brian M. Stempel, Steven R. Testa, Kenichi Tsuchiya
  • Patent number: 7401206
    Abstract: An apparatus and method for fine-grained multithreading in a multipipelined processor core. According to one embodiment, a processor may include instruction fetch logic configured to assign a given one of a plurality of threads to a corresponding one of a plurality of thread groups, where each of the plurality of thread groups may comprise a subset of the plurality of threads, to issue a first instruction from one of the plurality of threads during one execution cycle, and to issue a second instruction from another one of the plurality of threads during a successive execution cycle. The processor may further include a plurality of execution units, each configured to execute instructions issued from a respective thread group.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: July 15, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Ricky C. Hetherington, Gregory F. Grohoski, Robert T. Golla
  • Publication number: 20080168260
    Abstract: A method is provided for processing instructions by a processor, in which instructions are queued in an instruction pipeline in a queued order. A first instruction is identified from the queued instructions in the instruction pipeline, the first instruction being identified as having a dependency which is satisfiable within a number of instruction cycles after a current instruction in the instruction pipeline is issued. The first instruction is placed in a side buffer and at least one second instruction is issued from the remaining queued instructions while the first instruction remains in the side buffer. Then, the first instruction is issued from the side buffer after issuing the at least one second instruction in the queued order when the dependency of the first instruction has cleared and after the number of instruction cycles have passed.
    Type: Application
    Filed: January 8, 2007
    Publication date: July 10, 2008
    Inventors: Victor Zyuban, Michael K. Gschwind, John-David Wellman
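    One way to picture the side buffer: an instruction whose dependency clears within a known number of cycles is parked, younger queued instructions issue in order meanwhile, and the parked instruction issues once its countdown expires. The cycle bookkeeping below is an illustrative assumption, not the publication's pipeline.
      class SideBufferIssuer:
          """Tiny in-order model of the side-buffer idea (cf. 20080168260).

          Each instruction is a dict {'name': str, 'dep_cycles': int}; dep_cycles
          is how many more cycles until its dependency clears (0 = ready now).
          """
          def __init__(self, program):
              self.queue = list(program)
              self.side = None                 # parked instruction, if any
              self.side_wait = 0               # cycles until its dependency clears

          def tick(self):
              """Advance one cycle; return the name of the issued instruction, or None."""
              if self.side is not None:
                  if self.side_wait == 0:      # dependency cleared: issue the parked insn
                      issued, self.side = self.side, None
                      return issued['name']
                  self.side_wait -= 1
              if self.queue and self.queue[0]['dep_cycles'] > 0 and self.side is None:
                  parked = self.queue.pop(0)   # move the waiting instruction aside
                  self.side, self.side_wait = parked, parked['dep_cycles']
              if self.queue:
                  return self.queue.pop(0)['name']   # younger instructions issue meanwhile
              return None

      if __name__ == '__main__':
          prog = [{'name': 'A', 'dep_cycles': 1},
                  {'name': 'B', 'dep_cycles': 0},
                  {'name': 'C', 'dep_cycles': 0}]
          issuer = SideBufferIssuer(prog)
          print([issuer.tick() for _ in range(4)])   # ['B', 'C', 'A', None]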
  • Publication number: 20080162884
    Abstract: A processor core and method of executing instructions, both of which utilize schedules, are presented. Each of the schedules includes a sequence of instructions, an address of a first of the instructions in the schedule, an order vector of an original order of the instructions in the schedule, a rename map of registers for each register in the schedule, and a list of register names used in the schedule. The schedule exploits instruction-level parallelism in executing out-of-order instructions. The processor core includes a schedule cache that is configured to store schedules, a shared cache configured to store both I-side and D-side cache data, and an execution resource for requesting a schedule to be executed from the schedule cache. The processor core further includes a scheduler disposed between the schedule cache and the shared cache.
    Type: Application
    Filed: January 2, 2007
    Publication date: July 3, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Krishnan K. Kailas, Ravi Nair, Sumedh W. Sathaye, Wolfram Sauer, John-David Wellman
  • Publication number: 20080162887
    Abstract: Method, system and computer program product for generating effective addresses in a data processing system. A method, in a data processing system, for generating an effective address includes generating a first portion of the effective address by calculating a first plurality of effective address bits of the effective address, and generating a second portion of the effective address by guessing a second plurality of effective address bits of the effective address. By intelligently guessing a plurality of the effective address bits that form the effective address, the effective address can be generated and sent to a translation unit more quickly than in a system in which all the effective address bits of the effective address are calculated. The method and system are particularly suitable for generating effective addresses in a CAM-based effective address translation design in a multi-threaded environment.
    Type: Application
    Filed: March 14, 2008
    Publication date: July 3, 2008
    Inventors: Rachel Marie Flood, Scott Bruce Frommer, David Allen Hrusecky, Sheldon B. Levenstein, Michael Thomas Vaden
  • Publication number: 20080162886
    Abstract: A method and apparatus for enabling a Software Transactional Memory (STM) with precompiled binaries is herein described. Upon encountering an access operation in a transaction, an annotation field associated with a memory location referenced by the access is checked. In response to the memory location representing a previous similar access within the transaction, the access is performed without access barriers. However, if the annotation field is in a default state representing no previous access during the pendency of the transaction, then a mode of the processor is determined. If the processor mode is in implicit mode, an access handler/barrier is asynchronously executed. Conversely, in an explicit mode, a flag is set instead of asynchronously executing the handler. In addition, during compilation, convert-explicit and convert-implicit instructions are inserted to intelligently convert modes for precompiled and newly compiled binaries.
    Type: Application
    Filed: December 28, 2006
    Publication date: July 3, 2008
    Inventors: Bratin Saha, Ali-Reza Adl-Tabatabai, Quinn Jacobsen
  • Publication number: 20080162885
    Abstract: A method and apparatus for ensuring integrity of transaction exit functions is herein described. Dead local data in a transaction is prevented from overwriting local variables associated with a transaction exit function. In a write-buffering Software Transactional Memory (STM) system, a commit function is associated with a private stack to store local variables to ensure write-back of local dead data in a write-buffer does not corrupt the commit function. Similarly, in a roll-back STM, an abort function is associated with a private stack to store local variables to ensure the roll-back of a program stack with local dead data from a write log does not corrupt the abort function. Alternatively, one stack may be used for the transaction including a first function and an exit function. Here, local dead variables are detected and prevented from overwriting local variables of the exit function.
    Type: Application
    Filed: December 28, 2006
    Publication date: July 3, 2008
    Inventors: Cheng Wang, Youfeng Wu, Bratin Saha, Ali-Reza Adl-Tabatabai
  • Patent number: 7395082
    Abstract: Methods and systems for application framework development for wireless devices are provided herein. Aspects of the method may include acquiring an MMI event from an MMI event queue within the MMI wireless framework. An identity of the acquired MMI event may be determined and the acquired MMI event may be dispatched to an event handler based on the determined identity of the acquired event. If the acquired MMI event comprises a timing event, the acquired MMI event may be dispatched to an MMI event owner within the MMI wireless framework. If the acquired MMI event comprises a keypad event, the acquired MMI event may be dispatched to a currently active MMI view within the MMI wireless framework. If the acquired MMI event comprises an addressed event, the acquired MMI event may be dispatched to a destination handler within the MMI wireless framework.
    Type: Grant
    Filed: August 25, 2004
    Date of Patent: July 1, 2008
    Assignee: Broadcom Corporation
    Inventors: Derek John Foster, Lori Yoshida, Richard Zhang
  • Patent number: 7392367
    Abstract: A method, apparatus, system, and signal-bearing medium that in various embodiments determine whether to execute a command in a queue or whether to wait until another command or commands have completed. The determination is based on a combination of an in-use vector and a scorecard vector. The in-use vector indicates which slots in various queues contain commands. The scorecard vector indicates the dependencies between various queues. In this way, the scorecard vector, and thus the queue dependencies, can be set and modified after the logic that processes the commands has been designed.
    Type: Grant
    Filed: June 19, 2006
    Date of Patent: June 24, 2008
    Assignee: International Business Machines Corporation
    Inventors: Scott D. Clark, Scott M. Willenborg
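    The execute/wait decision reduces to a bitwise test: the head command of a queue may run only if none of the slots its scorecard names are marked in the in-use vector. A minimal bitmask sketch (widths and the example encoding are assumptions):
      def can_execute(in_use_vector, scorecard_for_queue):
          """Return True when the command at the head of a queue may execute.

          in_use_vector       -- bitmask: bit i set means queue slot i holds a
                                 command that has not completed yet.
          scorecard_for_queue -- bitmask: bit i set means this queue must wait
                                 for slot i.  Because the scorecard is data, the
                                 dependency rules can be changed after the
                                 command-processing logic has been designed.
          """
          return (in_use_vector & scorecard_for_queue) == 0

      if __name__ == '__main__':
          in_use    = 0b0110      # slots 1 and 2 are occupied
          scorecard = 0b0100      # this queue depends on slot 2
          print(can_execute(in_use, scorecard))            # False: must wait
          print(can_execute(in_use & ~0b0100, scorecard))  # True once slot 2 drains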
  • Patent number: 7392366
    Abstract: A multithreaded processor, fetch control for a multithreaded processor and a method of fetching in the multithreaded processor. Processor event and use (EU) signals are monitored for downstream pipeline conditions indicating pipeline execution thread states. Instruction cache fetches are skipped for any thread that is incapable of receiving fetched cache contents, e.g., because the thread is full or stalled. Also, consecutive fetches may be selected for the same thread, e.g., on a branch mis-predict. Thus, the processor avoids wasting power on unnecessary or place keeper fetches.
    Type: Grant
    Filed: September 16, 2005
    Date of Patent: June 24, 2008
    Assignee: International Business Machines Corp.
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Richard J. Eickemeyer, Lee E. Eisen, Philip G. Emma, John B. Griswell, Zhigang Hu, Hung Q. Le, Douglas R. Logan, Balaram Sinharoy
  • Publication number: 20080148021
    Abstract: An issue unit includes a first instruction stage, a second instruction stage, and issue control logic. During a first instruction cycle, the issue unit performs two tasks, which are 1) the instructions located in the first instruction stage are moved to a second instruction stage, and 2) the issue control logic determines whether to issue or stall the instructions that are moved to the second instruction stage based upon their particular instruction attributes and the issue control unit's previous state. During a second instruction cycle that immediately follows the first instruction cycle, the second instruction stage's instructions are either issued or stalled based upon the issue control logic's decision from the first instruction cycle.
    Type: Application
    Filed: February 25, 2008
    Publication date: June 19, 2008
    Inventors: Jonathan James DeMent, Kurt Alan Feiste, Robert Alan Philhower, David Shippy
  • Publication number: 20080140999
    Abstract: A data processing method with multiple issue multiple datapath architecture in a video signal processor (VSP) is provided. In the method, commands are received from an external signal processor. The received commands are routed to a plurality of separate command sequencers, an input/output (IO) processor or a plurality of configuration registers according to different command types. Each of the separate command sequencers packs the received commands into a plurality of instruction packets and sends the instruction packets to a plurality of instruction dispatch units, in which each of the instruction packets includes one or more instructions. The instruction packets are dispatched to respective function units for performing operations in response to the received instruction packets.
    Type: Application
    Filed: December 10, 2007
    Publication date: June 12, 2008
    Applicant: NEMOCHIPS, INC.
    Inventor: Danian Gong
  • Publication number: 20080140998
    Abstract: A microprocessor core includes a plurality of inputs that indicate whether a corresponding plurality of independently occurring events has occurred. The inputs are non-memory address inputs. The core also includes a yield instruction in its instruction set architecture, comprising a user-visible output operand and an explicit input operand. The input operand specifies one or more of the independently occurring events. The yield instruction instructs the microprocessor core to suspend issuing for execution instructions of a program thread until at least one of the independently occurring events specified by the input operand has occurred. The program thread contains the yield instruction. The yield instruction further instructs the microprocessor core to return a value in the output operand indicating which of the independently occurring events occurred to cause the microprocessor core to resume issuing the instructions of the program thread.
    Type: Application
    Filed: December 3, 2007
    Publication date: June 12, 2008
    Applicant: MIPS TECHNOLOGIES, INC.
    Inventor: Kevin D. Kissell
  • Publication number: 20080133885
    Abstract: A hierarchical microprocessor. An embodiment of a hierarchical microprocessor includes a plurality of first-level instruction pipeline elements; a plurality of execution clusters, where each execution cluster is operatively coupled with each of the first-level instruction pipeline elements. Each execution cluster includes a plurality of second-level instruction pipeline elements, where each of the second-level instruction pipeline elements corresponds with a respective first-level instruction pipeline element, and one or more instruction execution units operatively coupled with each of the second-level instruction pipeline elements, where the microprocessor is configured to execute multiple execution threads using the plurality of first-level instruction pipeline elements and the plurality of execution clusters.
    Type: Application
    Filed: October 31, 2007
    Publication date: June 5, 2008
    Applicant: CENTAURUS DATA LLC
    Inventor: Andrew Forsyth Glew
  • Publication number: 20080133889
    Abstract: A hierarchical instruction scheduler included in a hierarchical microprocessor comprising a plurality of execution clusters. In one embodiment, a hierarchical instruction scheduler comprises a first-level instruction scheduler configured to receive instructions for execution; store first operand status information for respective operands of the instructions; and dispatch the instructions to respective execution clusters based on the instructions' respective first operand status information.
    Type: Application
    Filed: October 31, 2007
    Publication date: June 5, 2008
    Applicant: CENTAURUS DATA LLC
    Inventor: Andrew Forsyth Glew
  • Publication number: 20080133888
    Abstract: The data processor executes an instruction that directs a write to a reference register of another instruction flow and an instruction that directs reference register invalidation. The data processor is arranged so that processors (CPU1 and CPU2) that each execute a simple instruction flow together provide the functions of a typical data processor. When executing the instruction directing a write to a reference register of another instruction flow, the processor confirms whether the write register is invalid; if it is not, the processor waits for the register to be made invalid, and performs the write once it is invalid. After executing the instruction directing reference register invalidation, the processor invalidates the register to which a reference has been made. While the reference register is invalid, execution of the referring instruction is suspended until it is made valid.
    Type: Application
    Filed: February 16, 2007
    Publication date: June 5, 2008
    Inventor: Fumio Arakawa
  • Patent number: 7383425
    Abstract: This invention is directed to a method and apparatus for providing low, predictable latencies in processing IP packets. The apparatus provides a specialized microprocessor or hardwired circuitry to process IP packets for video communications and control of the video source without an operating system. The method relates to operation of a microprocessor which is suitably arranged to carry out the steps of the method. The method includes details of operation of the specialized microprocessor.
    Type: Grant
    Filed: February 27, 2004
    Date of Patent: June 3, 2008
    Assignee: Pleora Technologies Inc.
    Inventors: Eric Boisvert, Alain Rivard, George Chamberlain
  • Publication number: 20080122843
    Abstract: A logic unit is provided for performing operations in multiple threads on vertex data. The logic unit comprises a macro instruction register file, a flow control instruction register file, and a flow controller. The macro instruction register file stores macro blocks with each macro block including at least one instruction. The flow control instruction register file stores flow control instructions with each flow control instruction including at least one called macro block and dependency information of the called macro block.
    Type: Application
    Filed: July 20, 2006
    Publication date: May 29, 2008
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Hsine-Chu Chung, Ko-Fang Wang, Chit-Keng Huang
  • Patent number: 7380105
    Abstract: A method and apparatus for improving the operation of a computer processor by utilizing an asymmetric clustered processor architecture are disclosed. The asymmetric clustered processor apparatus includes a narrow cluster, a wide cluster, a steering logic utilizing a cluster predictor for providing a decoded instruction to either the narrow cluster or the wide cluster; address registers which are not part of the ISA, and a translation look-aside buffer for translating the virtual address of a load/store instruction in parallel with an execute stage. The method includes the steps of: predictably steering the instruction to either a W-bit Wide integer cluster or an N-bit Narrow integer cluster, managing the Address register file, and processing any instruction in the Wide integer cluster but processing only N-bit instructions in the Narrow integer cluster.
    Type: Grant
    Filed: June 16, 2006
    Date of Patent: May 27, 2008
    Assignee: The Regents of the University of California
    Inventors: Alexander V. Veidenbaum, Adrian Cristal Kestelman, Mateo Valero Cortes, Ruben Gonzalez Garcia
  • Patent number: 7380104
    Abstract: A method is provided for evaluating two or more instructions in an out of order issue queue during a particular cycle of the queue, to select an instruction for issue during the next following cycle. If an instruction was previously designated to issue during the particular cycle, one or more instructions in the queue are evaluated to determine if any of them are dependent on the designated instruction. For the evaluation, each instruction placed into the queue is accompanied by corresponding logic elements that provide destination to source compares for the instruction. In one embodiment of the method, the oldest ready instruction in the queue during a particular cycle is identified.
    Type: Grant
    Filed: April 25, 2006
    Date of Patent: May 27, 2008
    Assignee: International Business Machines Corporation
    Inventors: William Elton Burky, Raymond Cheung Yeung
  • Patent number: 7376954
    Abstract: A mechanism for assuring quality of service for a context in a digital processor has a first scheduling register dedicated to the context, the register having N out of M bits set, and a first scheduler that consults the register to assign issue slots to the context. The first scheduler grants issue slots for the context by referencing the N bits in the first register, and repeats a pattern of assignments of issue slots after referencing the M bits of the first register.
    Type: Grant
    Filed: October 10, 2003
    Date of Patent: May 20, 2008
    Assignee: MIPS Technologies, Inc.
    Inventor: Kevin D. Kissell
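    The guarantee follows from walking an M-bit register in which N bits are set: each set bit grants the context an issue slot, and after M references the pattern repeats. A direct sketch (the 3-of-8 pattern is only an example):
      class QoSScheduler:
          """Grant issue slots to a context by cycling through an M-bit register.

          Setting N of the M bits gives the context roughly N/M of the issue
          slots over every window of M cycles; the remaining slots can go to
          other work.
          """
          def __init__(self, schedule_bits):
              self.bits = schedule_bits       # list of 0/1 of length M
              self.pos = 0

          def grant_slot_to_context(self):
              """Consult the next bit; True means this cycle's slot goes to the context."""
              grant = bool(self.bits[self.pos])
              self.pos = (self.pos + 1) % len(self.bits)   # pattern repeats after M bits
              return grant

      if __name__ == '__main__':
          # N = 3 of M = 8 bits set: the context gets 3 slots in every 8 cycles.
          sched = QoSScheduler([1, 0, 0, 1, 0, 0, 1, 0])
          grants = [sched.grant_slot_to_context() for _ in range(16)]
          print(sum(grants), "of", len(grants), "slots granted")   # 6 of 16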
  • Patent number: 7370176
    Abstract: A system and method for a high frequency stall design is presented. An issue unit includes a first instruction stage, a second instruction stage, and issue control logic. During a first instruction cycle, the issue unit performs two tasks, which are 1) the instructions located in the first instruction stage are moved to a second instruction stage, and 2) the issue control logic determines whether to issue or stall the instructions that are moved to the second instruction stage based upon their particular instruction attributes and the issue control unit's previous state. During a second instruction cycle that immediately follows the first instruction cycle, the second instruction stage's instructions are either issued or stalled based upon the issue control logic's decision from the first instruction cycle.
    Type: Grant
    Filed: August 16, 2005
    Date of Patent: May 6, 2008
    Assignee: International Business Machines Corporation
    Inventors: Jonathan James DeMent, Kurt Alan Feiste, Robert Alan Philhower, David Shippy
  • Patent number: 7366878
    Abstract: A processor buffers asynchronous threads. Current instructions requiring operations provided by a plurality of execution units are divided into phases, each phase having at least one math operation and at least one texture cache access operation. Instructions within each phase are qualified and prioritized, with texture cache access operations in a subsequent phase not qualified until all of the texture cache access operations in a current phase have completed. The instructions may be qualified based on the status of the execution unit needed to execute one or more of the instructions. The instructions may also be qualified based on an age of each instruction, a divergence potential, locality, thread diversity, and resource requirements. Qualified instructions may be prioritized based on execution units needed to execute current instructions and the execution units in use. One or more of the prioritized instructions is issued per cycle to the plurality of execution units.
    Type: Grant
    Filed: April 13, 2006
    Date of Patent: April 29, 2008
    Assignee: NVIDIA Corporation
    Inventors: Peter C. Mills, John Erik Lindholm, Brett W. Coon, Gary M. Tarolli, John Matthew Burgess
  • Patent number: 7366877
    Abstract: A method for optimizing throughput in a microprocessor that is capable of processing multiple threads of instructions simultaneously. Instruction issue logic is provided between the input buffers and the pipeline of the microprocessor. The instruction issue logic speculatively issues instructions from a given thread based on the probability that the required operands will be available when the instruction reaches the stage in the pipeline where they are required. Issue of an instruction is blocked if the current pipeline conditions indicate that there is a significant probability that the instruction will need to stall in a shared resource to wait for operands. Once the probability that the instruction will stall is below a certain threshold, based on current pipeline conditions, the instruction is allowed to issue.
    Type: Grant
    Filed: September 17, 2003
    Date of Patent: April 29, 2008
    Assignee: International Business Machines Corporation
    Inventors: Victor Roberts Augsburg, Jeffrey Todd Bridges, Michael Scott McIlvaine, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Patent number: 7363467
    Abstract: An apparatus and method for a processor microarchitecture that quickly and efficiently takes large steps through program segments without fetching all intervening instructions. The microarchitecture processes descriptors of trace sequences in program order so as to locate and dispatch descriptors of dependence chains that are used to fetch and execute the instructions of the dependence chain in data flow order.
    Type: Grant
    Filed: January 3, 2002
    Date of Patent: April 22, 2008
    Assignee: Intel Corporation
    Inventors: Sriram Vajapeyam, Bohuslav Rychlik, John P. Shen
  • Patent number: 7363625
    Abstract: An SMT system is designed to allow software alteration of thread priority. In one case, the system signals a change in a thread priority based on the state of instruction execution and in particular when the instruction has completed execution. To alter the priority of a thread, the software uses a special form of a “no operation” (NOP) instruction (hereafter termed thread priority NOP). When the thread priority NOP is dispatched, its special NOP is decoded in the decode unit of the IDU into an operation that writes a special code into the completion table for the thread priority NOP. A “trouble” bit is also set in the completion table that indicates which instruction group contains the thread priority NOP. The trouble bit indicates that special processing is required after instruction completion. The thread priority instruction is processed after completion using the special code to change a thread's priority.
    Type: Grant
    Filed: April 24, 2003
    Date of Patent: April 22, 2008
    Assignee: International Business Machines Corporation
    Inventors: William E. Burky, Ronald N. Kalla, David A. Schroter, Balaram Sinharoy
  • Patent number: 7360064
    Abstract: The present invention provides a network multithreaded processor, such as a network processor, including a thread interleaver that implements fine-grained thread decisions to avoid underutilization of instruction execution resources in spite of large communication latencies. In an upper pipeline, an instruction unit determines an instruction fetch sequence responsive to an instruction queue depth on a per thread basis. In a lower pipeline, a thread interleaver determines a thread interleave sequence responsive to thread conditions including thread latency conditions. The thread interleaver selects threads using a two-level round robin arbitration. Thread latency signals are active responsive to thread latencies such as thread stalls, cache misses, and interlocks. During the subsequent one or more clock cycles, the thread is ineligible for arbitration. In one embodiment, other thread conditions affect selection decisions such as local priority, global stalls, and late stalls.
    Type: Grant
    Filed: December 10, 2003
    Date of Patent: April 15, 2008
    Assignee: Cisco Technology, Inc.
    Inventors: Donald Steiss, Earl T Cohen, John J Williams, Jr.
  • Publication number: 20080086622
    Abstract: In one embodiment, a processor comprises a scheduler configured to issue a first instruction operation to be executed and an execution core coupled to the scheduler. The execution core, which is configured to execute the first instruction operation, comprises a plurality of replay sources configured to cause a replay of the first instruction operation responsive to detecting at least one of a plurality of replay cases. The scheduler is configured to inhibit issuance of the first instruction operation subsequent to the replay for a subset of the plurality of replay cases. The scheduler is coupled to receive an acknowledgement indication corresponding to each of the plurality of replay cases in the subset, and is configured to inhibit issuance of the first instruction operation until the acknowledgement indication that corresponds to an identified replay case of the subset is asserted.
    Type: Application
    Filed: October 10, 2006
    Publication date: April 10, 2008
    Applicant: P.A. Semi, Inc.
    Inventors: Po-Yung Chang, Wei-Han Lien, Jesse Pan, Ramesh Gunna, Tse-Yu Yeh, James B. Keller
  • Patent number: 7350056
    Abstract: An information handling system includes a processor that issues instructions out of program order. The processor includes an issue queue that may advance instructions toward issue even though some instructions in the queue are not ready to issue. The issue queue includes a matrix of storage cells configured in rows and columns, including a first row that couples to execution units. Instructions advance toward issuance from row to row as unoccupied storage cells appear. Unoccupied cells appear when instructions advance toward the first row and upon issuance. When a particular row includes an instruction that is not ready to issue, a stall condition occurs for that instruction. However, to prevent the entire issue queue and processor from stalling, a ready-to-issue instruction in another row may bypass the row including the stalled or not-ready-to-issue instruction. Out-of-order issuance of instructions to the execution units thus continues.
    Type: Grant
    Filed: September 27, 2005
    Date of Patent: March 25, 2008
    Assignee: International Business Machines Corporation
    Inventors: Christopher Michael Abernathy, Jonathan James DeMent, Kurt Alan Feiste, David Shippy
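    The bypass behaviour can be modelled with an ordered list in which a ready entry is allowed to issue past stalled entries, while the stalled entries keep their places. The ready-flag interface below is an assumption for illustration:
      def issue_one(queue):
          """Issue the first ready instruction, letting it bypass stalled entries.

          queue -- list of dicts {'name', 'ready'}, ordered as the rows of the
                   issue matrix would be (index 0 is closest to the execution
                   units).  Entries that are not ready stay in place, so the
                   queue as a whole never stalls.
          """
          for i, entry in enumerate(queue):
              if entry['ready']:
                  queue.pop(i)          # the bypassing instruction leaves the queue
                  return entry['name']
          return None                   # everything is waiting on operands

      if __name__ == '__main__':
          q = [{'name': 'stalled-load', 'ready': False},
               {'name': 'add',          'ready': True}]
          print(issue_one(q))           # 'add' bypasses the stalled load
          print([e['name'] for e in q]) # ['stalled-load'] still waiting in place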
  • Publication number: 20080072017
    Abstract: A method for processing predetermined instructions in a processing system having a plurality of processing units includes providing a global program counter and setting a counter value of the global program counter as an instruction of the predetermined instructions is executed; assigning each processing unit a local program counter and setting a counter value of the local program counter according to a current instruction being executed by the processing unit; and enabling at least one of the processing units to execute a specific instruction of the predetermined instructions according to counter values stored in local program counters of the processing units and the global program counter.
    Type: Application
    Filed: September 19, 2006
    Publication date: March 20, 2008
    Inventor: Hsueh-Bing Yen
  • Patent number: 7343475
    Abstract: A processor including an integer processing unit and a data processing unit. The processor can be operated by a first instruction format or a second instruction format. The first instruction format includes only an instruction for the integer processing unit, and is executed in the integer processing unit alone. The second instruction format includes instructions for the integer processing unit and the data processing unit, the second instructions being executed in both the integer processing unit and the data processing unit in parallel. When an instruction in the first instruction format is to be executed, only in the integer processing unit, a control signal is generated in the integer processing unit and is supplied to the data processing unit to halt an operation of the data processing unit.
    Type: Grant
    Filed: July 12, 2006
    Date of Patent: March 11, 2008
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Takashi Miyamori
  • Patent number: 7343474
    Abstract: In one embodiment, a processor comprises a plurality of pipeline stages and a first circuit operable at a first pipeline stage of the plurality of pipeline stages. The first circuit is configured to maintain a plurality of program counters (PCs), each of which corresponds to one of a plurality of threads that the processor is configured to have concurrently in process with respect to the plurality of pipeline stages. The first circuit is configured to provide a first PC to a second pipeline stage of the plurality of pipeline stages. The first PC is derived from one of the plurality of PCs that corresponds to a first thread of the plurality of threads, and a first instruction entering the second pipeline stage is from the first thread.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: March 11, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Paul J. Jordan, Robert T. Golla, Jama I. Barreh
  • Patent number: 7337303
    Abstract: A method and apparatus for controlling the issue rate of instructions for an instruction thread to be executed by a processor is provided. The rate at which instructions are to be executed for an instruction thread is stored, and requests are issued to cause instructions to execute in response to the stored rate. The rate at which instruction requests are issued is reduced in response to instruction executions and is increased in the absence of instruction executions. In a multi-threaded processor, instruction rate is controlled by storing the average rate at which each thread should execute instructions. A value representative of the number of instructions available and not yet issued is monitored and is decreased in response to instruction executions. Execution of instructions is prevented on a thread if the number of instructions available but not yet issued falls below a defined value. A ranking order is assigned to a plurality of instruction threads for execution on a multi-threaded processor.
    Type: Grant
    Filed: September 21, 2006
    Date of Patent: February 26, 2008
    Assignee: Imagination Technologies Limited
    Inventors: Adrian John Anderson, Martin John Woodhead
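    The rate control behaves much like a per-thread credit counter: credits accumulate at the thread's stored average rate, executed instructions consume them, and issue requests are withheld while the counter sits below a defined floor. The arithmetic below is an illustrative reading of the abstract, not the patented scheme itself.
      class ThreadRateController:
          """Token-bucket-like model of per-thread issue-rate control (cf. 7337303)."""
          def __init__(self, rate_per_cycle, floor=1.0, cap=8.0):
              self.rate = rate_per_cycle   # stored average execution rate for the thread
              self.floor = floor           # below this, the thread may not issue requests
              self.cap = cap
              self.credits = cap

          def cycle(self, executed):
              """Advance one cycle; 'executed' is how many of this thread's
              instructions completed execution this cycle.  Returns True when
              the thread may issue instruction requests."""
              self.credits = min(self.cap, self.credits + self.rate)  # grows when idle
              self.credits -= executed                                # shrinks on execution
              return self.credits >= self.floor

      if __name__ == '__main__':
          ctrl = ThreadRateController(rate_per_cycle=0.5, floor=1.0, cap=2.0)
          # A burst of executions drains the credits and throttles the thread.
          print([ctrl.cycle(executed=1) for _ in range(5)])  # [True, True, False, False, False]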
  • Publication number: 20080046694
    Abstract: A multiprocessor system includes a judging unit that judges whether a read command inputted to a global address crossbar is a read command to a memory on its own system board; an executing unit that, when the judging unit judges that the read command targets the memory on its own system board, speculatively executes the read command before global access, based on an address notified from the global address crossbar; a setting unit that queues data read from the memory in a data queue provided on a CPU rather than in a data queue provided on the memory; and an instructing unit that, based on notification from the global address crossbar, instructs the data queue provided on the CPU to discard the data or to transmit it to the CPU.
    Type: Application
    Filed: April 20, 2007
    Publication date: February 21, 2008
    Applicant: Fujitsu Limited
    Inventors: Toshikazu Ueki, Takaharu Ishizuka, Makoto Hataida, Takashi Yamamoto, Yuka Hosokawa, Takeshi Owaki, Daisuke Itou
  • Publication number: 20080046695
    Abstract: In a system controller that includes a CPU-issued request queue with a circuit preventing plural requests having identical addresses from being inputted to the queue, the latest request other than a cache replace request is retained by an input-request retaining section. Consequently, even if the address of an issued cache replace request matches an address of a request retained by the CPU-issued request queue, the cache replace request is not retried but is queued in the CPU-issued request queue, provided its address does not match the entire address retained by the input-request retaining section.
    Type: Application
    Filed: April 24, 2007
    Publication date: February 21, 2008
    Applicant: FUJITSU LIMITED
    Inventors: Takaharu Ishizuka, Toshikazu Ueki, Makoto Hataida, Takashi Yamamoto, Yuka Hosokawa, Takeshi Owaki, Daisuke Itou
  • Publication number: 20080046692
    Abstract: Instruction execution delay is alterable after the system design has been finalized, thus enabling the system to dynamically account for various conditions that impact instruction execution. In some embodiments, the dynamic delay is determined by an application to be executed by the processing system. In other embodiments, the dynamic delay is determined by analyzing the history of previously executed instructions. In yet other embodiments, the dynamic delay is determined by assessing the processing resources available to a given application. Regardless, the delay may be dynamically altered on a per-instruction, multiple instruction, or application basis. Processor instruction execution may be controlled by determining a first delay value for a first set of one or more instructions and a second delay value for a second set of one or more instructions. Execution of the sets of instructions is delayed based on the corresponding delay value.
    Type: Application
    Filed: August 16, 2006
    Publication date: February 21, 2008
    Inventors: Gerald Paul Michalak, Kenneth Alan Dockser