Commitment Control Or Register Bypass Patents (Class 712/218)

Register file circuitry

Patent number: 6732251

Abstract: A processor or processor core has register file circuitry having a plurality of physical registers and a plurality of tag storing portions corresponding respectively to the physical registers. Each tag storing portion stores a tag representing a logical register ID allocated to the corresponding physical register. A register selection unit receives a logical register ID and selects the one of the physical registers whose tag matches the received logical register ID. A tag changing unit changes the stored tags so as to change a mapping between at least one logical register ID and one of the physical registers. Such register circuitry permits a mapping between logical register IDs and physical registers to be changed quickly and efficiently and can permit a desired physical register to be selected quickly.

Type: Grant

Filed: November 1, 2001

Date of Patent: May 4, 2004

Assignee: PTS Corporation

Inventors: Jonathan Michael Harris, Adrian Philip Wise, Nigel Peter Topham
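
Illustrative note on the entry above (not part of the patent record): the tag-matching selection and tag-changing remap that the abstract describes can be sketched in software roughly as follows. The class name, the initial identity mapping, and the rotate-style remap are assumptions made for the example; the abstract does not say how the tags are changed.

```python
class TaggedRegisterFile:
    """Physical registers, each paired with a tag holding a logical register ID."""

    def __init__(self, num_physical):
        self.values = [0] * num_physical
        # Assumption: logical ID i initially maps to physical register i.
        self.tags = list(range(num_physical))

    def select(self, logical_id):
        """Return the index of the physical register whose tag matches."""
        for phys, tag in enumerate(self.tags):
            if tag == logical_id:
                return phys
        raise KeyError(f"no physical register tagged {logical_id}")

    def read(self, logical_id):
        return self.values[self.select(logical_id)]

    def write(self, logical_id, value):
        self.values[self.select(logical_id)] = value

    def rotate_tags(self, amount=1):
        """Change the mapping by rewriting tags only; stored values do not move."""
        n = len(self.tags)
        self.tags = [(tag + amount) % n for tag in self.tags]


rf = TaggedRegisterFile(4)
rf.write(0, 111)      # logical 0 is held in physical 0
rf.rotate_tags(1)     # physical 0 is now tagged as logical 1
```

After the rotation, the value written through logical ID 0 is read back through logical ID 1, without any data having been copied between registers.
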
Interface to a memory system for a processor having a replay system

Publication number: 20040083351

Abstract: A processor includes a memory execution unit for executing load and store instructions and a replay system for replaying instructions which have not executed properly. The memory execution unit includes an invalid store flag that is set for a store instruction if the replay system detects that the store instruction has not executed properly and is cleared if the store instruction has executed properly. If an invalid store flag is set for a store instruction, the replay system replays load instructions which are programmatically younger than the invalid store instruction until the store instruction executes properly.

Type: Application

Filed: October 23, 2003

Publication date: April 29, 2004

Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
Method and apparatus for memory latency avoidance in a processing system

Patent number: 6728869

Abstract: A method and apparatus for avoiding latency in a processing system that includes a memory for storing intermediate results is presented. The processing system stores results produced by an operation unit in memory, where the results may be used by subsequent dependent operations. In order to avoid the latency of the memory, the output for the operation unit may be routed directly back into the operation unit as a subsequent operand. Furthermore, one or more memory bypass registers are included such that the results produced by the operation unit during recent operations that have not yet satisfied the latency requirements of the memory are also available. A first memory bypass register may thus provide the result of an operation that completed one cycle earlier, a second memory bypass register may provide the result of an operation that completed two cycles earlier, etc.

Type: Grant

Filed: April 21, 2000

Date of Patent: April 27, 2004

Assignee: ATI International Srl

Inventors: Michael Andrew Mang, Michael Mantor, Robert Scott Hartog
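
Illustrative note on the entry above (not part of the patent record): a minimal model of memory bypass registers, assuming one result per cycle and a fixed memory latency measured in cycles. The names `BypassedResultMemory`, `complete`, and `read` are hypothetical.

```python
from collections import deque


class BypassedResultMemory:
    """Result memory with `latency` bypass registers in front of it."""

    def __init__(self, latency):
        self.latency = latency
        self.memory = {}
        # bypass[0] holds the result completed one cycle earlier,
        # bypass[1] the result completed two cycles earlier, and so on.
        self.bypass = deque(maxlen=latency)

    def complete(self, addr, value):
        """Record the operation unit's result for this cycle."""
        if len(self.bypass) == self.latency:
            # The oldest bypassed result has now satisfied the memory latency.
            old_addr, old_value = self.bypass[-1]
            self.memory[old_addr] = old_value
        self.bypass.appendleft((addr, value))

    def read(self, addr):
        """Prefer the newest matching bypass register, else fall back to memory."""
        for a, v in self.bypass:
            if a == addr:
                return v, "bypass"
        return self.memory[addr], "memory"
```

A dependent operation that reads an address written within the last `latency` cycles is served from a bypass register; older results come from the memory itself.
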
Pipeline replay support for unaligned memory operations

Patent number: 6728865

Abstract: Instructions asserted in a microprocessor's instruction pipeline (3) are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (5) that is synchronized to the instruction pipeline. At the execution stage, the control information is interpreted and appropriate action taken. The control information may indicate that the instruction has been reasserted (asserted again following an initial assertion) and may also indicate the number of times that the instruction has been consecutively asserted in the instruction pipeline. Applied to unaligned memory operations, in which a memory atom is asserted twice, the control information indicates which part of the unaligned data is to be fetched each time the atom is executed.

Type: Grant

Filed: October 20, 1999

Date of Patent: April 27, 2004

Assignee: Transmeta Corporation

Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
System and method for providing multiprocessor speculation within a speculative branch path

Patent number: 6728873

Abstract: Disclosed is a method of operation within a processor, that enhances speculative branch processing. A speculative execution path contains an instruction sequence that includes a barrier instruction followed by a load instruction. While a barrier operation associated with the barrier instruction is pending, a load request associated with the load instruction is speculatively issued to memory. A flag is set for the load request when it is speculatively issued and reset when an acknowledgment is received for the barrier operation. Data which is returned by the speculatively issued load request is temporarily held and forwarded to a register or execution unit of the data processing system after the acknowledgment is received. All process results, including data returned by the speculatively issued load instructions are discarded when the speculative execution path is determined to be incorrect.

Type: Grant

Filed: June 6, 2000

Date of Patent: April 27, 2004

Assignee: International Business Machines Corporation

Inventors: Guy Lynn Guthrie, Ravi Kumar Arimilli, John Steven Dodson, Derek Edward Williams
System and method for coalescing data utilized to detect data hazards

Patent number: 6728868

Abstract: The present invention generally relates to a processing system and method for coalescing instruction data to efficiently detect data hazards between instructions of a computer program. In architecture, the system of the present invention utilizes a plurality of pipelines, coalescing circuitry, and hazard detection circuitry. The plurality of pipelines is configured to process instructions of a computer program, and the coalescing circuitry is configured to receive, from the pipelines, a plurality of register identifiers identifying a plurality of registers. The coalescing circuitry is configured to coalesce said register identifiers thereby generating a coalesced register identifier identifying each of said plurality of registers. The hazard detection circuitry is configured to receive the coalesced register identifier and to perform a comparison of the coalesced register identifier with other information received from the pipelines.

Type: Grant

Filed: October 28, 2002

Date of Patent: April 27, 2004

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Ronny Lee Arnold, Donald Charles Soltis, Jr.
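
Illustrative note on the entry above (not part of the patent record): one plausible way to coalesce several register identifiers into a single value is a bit vector with one bit per register. The abstract does not state the encoding, so the bit-vector form and the function names below are assumptions.

```python
def coalesce(register_ids):
    """Merge register identifiers into one bit vector (bit r set for register r)."""
    mask = 0
    for r in register_ids:
        mask |= 1 << r
    return mask


def has_hazard(coalesced_producers, consumer_sources):
    """A hazard exists if any consumer source register is in the coalesced set."""
    return (coalesced_producers & coalesce(consumer_sources)) != 0
```

With this encoding a single AND replaces one comparison per pair of register identifiers, which is the kind of saving the abstract attributes to coalescing.
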
Method and apparatus for providing fault-tolerance for temporary results within a CPU

Publication number: 20040078728

Abstract: One embodiment of the present invention provides a system that corrects bit errors in temporary results within a central processing unit (CPU). During operation, the system receives a temporary result during execution of an in-flight instruction. Next, the system generates a parity bit for the temporary result, and stores the temporary result and the parity bit in a temporary register within the CPU. Before the temporary result is committed to the architectural state of the CPU, the system checks the temporary result and the parity bit to detect a bit error. If a bit error is detected, the system performs a micro-trap operation to re-execute the instruction that generated the temporary result, thereby regenerating the temporary result. Otherwise, if a bit error is not detected, the system commits the temporary result to the architectural state of the CPU.

Type: Application

Filed: May 14, 2002

Publication date: April 22, 2004

Inventors: Marc Tremblay, Shailender Chaudhry, Quinn A. Jacobson
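
Illustrative note on the entry above (not part of the patent record): the check-before-commit flow can be sketched as below. The function names, the retry limit, and the `inject_error_mask` parameter (used only to simulate a bit error on the first attempt) are hypothetical. A single parity bit detects only an odd number of flipped bits.

```python
def parity_bit(value):
    """Even parity: 1 when the number of set bits in value is odd."""
    return bin(value).count("1") & 1


def run_instruction(execute, inject_error_mask=0, max_attempts=4):
    """Execute, store result plus parity, check before commit, re-execute on error.

    Returns (committed_value, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        result = execute()
        value, parity = result, parity_bit(result)   # temporary register contents
        if attempt == 1:
            value ^= inject_error_mask               # simulated bit error in storage
        if parity_bit(value) == parity:
            return value, attempt                    # commit to architectural state
        # Parity mismatch: loop again, standing in for the micro-trap re-execution.
    raise RuntimeError("result could not be regenerated without error")
```

For example, `run_instruction(lambda: 42, inject_error_mask=0b100)` detects the flipped bit on the first attempt and commits the regenerated value on the second.
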
Method and system for dynamically shared completion table supporting multiple threads in a processing system

Patent number: 6721874

Abstract: A method and system for utilizing a completion table in a superscalar processor is disclosed. The method and system comprises providing a plurality of threads to the processor and associating a link list with each of the threads, wherein each entry associated with a thread is linked to a next entry. A method and system in accordance with the present invention implements the completion table as link lists. Each entry in the completion table in a thread is linked to the next entry via a pointer that is stored in a link list. In a second aspect a method of determining the relative order between instructions is provided. A method and system in accordance with the present invention implements a flush mask array which is accessed to determine the relative order of entries in the said completion table. A method and system in accordance with the present invention implements a restore head pointer table to save and restore the state of the pointer of said completion table.

Type: Grant

Filed: October 12, 2000

Date of Patent: April 13, 2004

Assignee: International Business Machines Corporation

Inventors: Hung Qui Le, Peichun Liu, Balaram Sinharoy
Data processor with individually writable register subword locations

Publication number: 20040064677

Abstract: A data processor includes program registers with individual byte-location write enables. Bypass networks allow a precision pipeline to respond to read requests by accessing a program register or pipeline stage on a byte-by-byte basis. The data processor can thus write to individual byte locations without overwriting other byte locations within the same register. The data processor has an instruction set with instructions that combine two operands and yield a one-byte result that is stored in a specified byte location of a specified result register. Eight instances of this instruction can pack eight results into a single 64-bit result register without additional packing instructions and without using a read port to read the result register before writing to it. As plural functional units can write concurrently to different subwords of the same result register, a system with four functional units can pack eight results into a result register in two instruction cycles.

Type: Application

Filed: September 30, 2002

Publication date: April 1, 2004

Inventor: Dale Morris
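
Illustrative note on the entry above (not part of the patent record): a byte-location write enable means a one-byte result replaces only its own byte of the 64-bit register. The helper below models that effect in software; `write_byte` is a hypothetical name, and the mask arithmetic merely stands in for hardware that never needs to read the old register value.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_byte(register, byte_index, value):
    """Return the 64-bit register with only the selected byte location replaced."""
    shift = 8 * byte_index
    cleared = register & ~(0xFF << shift)
    return (cleared | ((value & 0xFF) << shift)) & MASK64


# Eight one-byte results packed into one result register.
packed = 0
for index, result in enumerate([1, 2, 3, 4, 5, 6, 7, 8]):
    packed = write_byte(packed, index, result)
```
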
Method and apparatus for reducing register file access times in pipelined processors

Publication number: 20040064680

Abstract: One embodiment of the present invention provides a system that reduces the time required to access registers from a register file within a processor. During operation, the system receives an instruction to be executed, wherein the instruction identifies at least one operand to be accessed from the register file. Next, the system looks up the operands in a register pane, wherein the register pane is smaller and faster than the register file and contains copies of a subset of registers from the register file. If the lookup is successful, the system retrieves the operands from the register pane to execute the instruction. Otherwise, if the lookup is not successful, the system retrieves the operands from the register file, and stores the operands into the register pane. This triggers the system to reissue the instruction to be executed again, so that the re-issued instruction retrieves the operands from the register pane.

Type: Application

Filed: September 26, 2002

Publication date: April 1, 2004

Inventors: Sudarshan Kadambi, Adam R. Talcott, Wayne I. Yamamoto
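
Illustrative note on the entry above (not part of the patent record): the register pane behaves like a small cache of register copies, with a miss causing a fill and a reissue. The sketch assumes an oldest-first eviction policy and a pane at least as large as the operand count; neither is stated in the abstract, and all names are hypothetical.

```python
from collections import OrderedDict


class RegisterPane:
    """Small, fast store holding copies of a subset of the register file."""

    def __init__(self, register_file, capacity):
        self.register_file = register_file
        self.capacity = capacity
        self.pane = OrderedDict()

    def lookup(self, regs):
        """Return operand values if every register is in the pane, else None."""
        if all(r in self.pane for r in regs):
            return [self.pane[r] for r in regs]
        return None

    def fill(self, regs):
        """Copy registers from the register file into the pane."""
        for r in regs:
            self.pane[r] = self.register_file[r]
            self.pane.move_to_end(r)
            while len(self.pane) > self.capacity:
                self.pane.popitem(last=False)   # assumed policy: evict oldest


def issue(pane, op, regs):
    """Issue an instruction; on a pane miss, fill the pane and reissue."""
    issues = 1
    operands = pane.lookup(regs)
    if operands is None:
        pane.fill(regs)
        issues += 1                              # the reissued instruction
        operands = pane.lookup(regs)
    return op(*operands), issues
```

The first issue of an instruction whose operands are not in the pane costs two passes; a later instruction using the same registers completes in one.
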
Method and apparatus to execute an instruction with a semi-fast operation in a staggered ALU

Publication number: 20040054875

Abstract: A method for executing an instruction with a semi-fast operation in a staggered ALU. The method of one embodiment comprises generating a first operation and a second operation from a micro-instruction. The first and second operations are scheduled for execution in a staggered arithmetic logic unit (ALU). The first and second operations are separated by N clock cycles. Data from the first operation is communicated to the second operation for use with execution of the second operation.

Type: Application

Filed: September 13, 2002

Publication date: March 18, 2004

Inventor: Ross A. Segelken
Synchronising pipelines in a data processing apparatus

Publication number: 20040054876

Abstract: The present invention provides an apparatus and method for synchronizing a first pipeline and a second pipeline of a processor arranged to execute a sequence of instructions. The processor is arranged to route an instruction in the sequence through either the first or the second pipeline dependent on predetermined criteria, each pipeline having a plurality of pipeline stages including a retirement stage. Counter logic is provided for maintaining a first counter relating to the first pipeline and a second counter relating to the second pipeline. For each instruction in the first pipeline a determination is made as to when that instruction reaches a point within the first pipeline where an exception status of that instruction is resolved, and the counter logic is arranged to increment the first counter responsive to such determination.

Type: Application

Filed: September 13, 2002

Publication date: March 18, 2004

Inventors: Richard Roy Grisenthwaite, Ian Victor Devereux
Method and apparatus for multi-mode fencing in a microprocessor system

Patent number: 6708269

Abstract: In a multi-threaded system, such as in a multi-processor system, different types of fences are provided to force completion of programmatically earlier instructions in a program. The types of fences can be thread-specific, and different types of fences are used based on different kinds of conditions, instructions, operations, or memory types. When a fence is executed, senior stores, request buffers, bus queues, or any combination of these stages in an execution pipeline can be drained. Fetches at a front end of the pipeline can also be killed to ensure that the bus queue can be drained.

Type: Grant

Filed: December 30, 1999

Date of Patent: March 16, 2004

Assignee: Intel Corporation

Inventors: Keshavan K. Tiruvallur, Douglas M. Carmean, Robert J. Greiner, Muntaquim Chowdhury, Madhavan Parthasarathy
Method for compacting an instruction queue

Patent number: 6704856

Abstract: A method of compacting an instruction queue in an out of order processor includes determining the number of invalid instructions below and including each row in the queue, by counting invalid bits or validity indicators associated with rows below and up to the current row. For each row, multiplexor select signals are generated from the flat vector counts for the N rows above and including the present row, and from the validity indicators associated with the N rows, where N is a predetermined value. A multiplexor associated with a particular row selects one of the N rows according to the select value, and moves or passes the instruction held in the selected row to the present row. A row's select value is determined by forming a diagonal from the N count vectors corresponding to the N rows above and including the present row, and logically ANDing each diagonal bit with the valid bit associated with the same row. Each row's count vector is determined in two stages.

Type: Grant

Filed: December 17, 1999

Date of Patent: March 9, 2004

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: James A. Farrell, Timothy C. Fischer, Daniel L. Leibholz, Bruce A. Gieseke
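
Illustrative note on the entry above (not part of the patent record): a functional model of one compaction step, in which each valid instruction moves down by the number of invalid rows beneath it, capped at a maximum shift. It does not model the count vectors, diagonals, or multiplexor selects of the abstract; `compact` and `max_shift` are hypothetical names, and the cap is an assumed stand-in for the limited reach of each row's multiplexor.

```python
def compact(queue, max_shift):
    """Return the queue after one compaction step.

    queue[0] is the bottom row; None marks an invalid (empty) row.
    """
    result = [None] * len(queue)
    invalid_below = 0
    for row, entry in enumerate(queue):
        if entry is None:
            invalid_below += 1
            continue
        # Move down past the holes below, but no further than max_shift rows.
        shift = min(invalid_below, max_shift)
        result[row - shift] = entry
    return result
```

Because the invalid count never decreases from one row to the next row above it, two valid entries can never be sent to the same destination row, and program order is preserved.
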
Method and system for early speculative store-load bypass

Publication number: 20040044881

Abstract: In an embodiment, the present invention describes a method and apparatus for detecting a read-after-write (RAW) condition earlier in an instruction pipeline. The store instructions are stored in a special store bypass buffer (SBB) within an instruction decode unit (IDU). The IDU compares the instruction fields that are used for address generation of all ‘load’ instructions against ‘store’ instructions within a group of fetched instructions and ‘store’ instructions previously stored in the SBB. If a match of instruction fields is found, the IDU ‘speculates’ that the load instruction has dependency on the ‘store’ instruction. A data cache unit (DCU) validates the dependency of the load instruction ‘speculated’ by the IDU. If a false dependency is ‘speculated’ by the IDU, the DCU forces a re-fetch of the load instruction.

Type: Application

Filed: August 28, 2002

Publication date: March 4, 2004

Applicant: Sun Microsystems, Inc.

Inventors: Robert M. Maier, Sorin Iacobovici, Rabin Sugumar, Robert Nuckolls, Ali Vahidsafa, Chandra M. R. Thimmannagari
Selective bypassing of a multi-port register file

Publication number: 20040044882

Abstract: A multi-port register file may be selectively bypassed such that any element in a result vector is bypassed to the same index of an input vector of a succeeding operation when the element is requested in the succeeding operation in the same index as it was generated. Alternatively, the results to be placed in a register file may be bypassed to a succeeding operation when the N elements that dynamically compose a vector are requested as inputs to the next operation exactly in the same order as they were generated. That is, for the purposes of bypassing, the N vector elements are treated as a single entity. Similar rules apply for the write-through path.

Type: Application

Filed: August 29, 2002

Publication date: March 4, 2004

Applicant: International Business Machines Corporation

Inventors: Sameh Asaad, Jaime H. Moreno, Victor Zyuban
Memory access address comparison of load and store queues

Patent number: 6701425

Abstract: A computer system with parallel execution pipelines and a memory access controller has store address queues holding addresses for store operations, store data queues holding a plurality of data for storing in the memory and load address storage holding addresses for load operations, said access controller including comparator circuitry to compare load addresses received by the controller with addresses in the store address queue and locate any addresses which are the same, each of said addresses including a first set of bits representing a word address together with a second set of byte enable bits and said comparator having circuitry to compare the byte enable bits of two addresses as well as said first set of bits.

Type: Grant

Filed: May 2, 2000

Date of Patent: March 2, 2004

Assignee: STMicroelectronics S.A.

Inventors: Ahmed Dabbagh, Nicolas Grossier, Bruno Bernard, Pierre-Yves Taloud
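
Illustrative note on the entry above (not part of the patent record): comparing both the word address and the byte enable bits lets a load be matched only against stores that touch the same bytes. The sketch treats "same" as "same word and overlapping byte enables", which is an assumption; the abstract says only that both fields are compared. Function names are hypothetical.

```python
def addresses_conflict(load, store):
    """Each address is a (word_address, byte_enable_mask) pair."""
    load_word, load_bytes = load
    store_word, store_bytes = store
    # Same word, and at least one byte lane enabled in both accesses.
    return load_word == store_word and (load_bytes & store_bytes) != 0


def find_conflicts(load, store_queue):
    """Return the store address queue positions that conflict with the load."""
    return [i for i, store in enumerate(store_queue)
            if addresses_conflict(load, store)]
```
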
Data processing apparatus and method for processing floating point instructions

Patent number: 6701427

Abstract: A data processing apparatus for processing floating point instructions is responsive to a floating point instruction to apply a floating point operation to a number of operands to produce a final result, result data being generated during a predetermined pipelined stage with further processing then being performed on the result data in one or more subsequent pipelined stages to generate the final result. Exception determination logic determines whether an exception may occur during application of the floating point operation to the operands, and prevents the execution unit applying the floating point operation to those operands if it is determined that an exception may occur. The exception determination logic is arranged to use at least some of the predetermined control data to compensate for differences between the forwarded result data and the final result relevant when determining whether an exception may occur when processing the second floating point instruction.

Type: Grant

Filed: December 22, 1999

Date of Patent: March 2, 2004

Assignee: ARM Limited

Inventors: Christopher Neal Hinds, Arun Kumar Varadarajan Rajagopal
Processor system and method providing data to selected sub-units in a processor functional unit

Publication number: 20040039898

Abstract: A processor (50) operable in response to an instruction set comprising a plurality of instructions. The processor comprises a functional unit (52) comprising an integer number S of sub-units (541, 542, 543), wherein S is greater than one. Each of the sub-units is operable to execute, during an execution cycle, at least one of the instructions in the instruction set in response to at least two data arguments (A, B). The processor further comprises circuitry (58A1, 58A2, 58A3, 58B1, 58B2) for providing an updated value of the at least two data arguments to less than all S of the sub-units for a single execution cycle.

Type: Application

Filed: August 20, 2002

Publication date: February 26, 2004

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventor: Patrick W. Bosshart
Branch misprediction recovery using a side memory

Publication number: 20040034762

Abstract: A mispredicted path side memory is configured to be coupled to a stage in an instruction pipeline. As instructions advance through the pipeline, a result from the stage is stored into the mispredicted path side memory. The result is restored from the mispredicted path side memory into a pipeline stage when a branch is mispredicted.

Type: Application

Filed: August 19, 2003

Publication date: February 19, 2004

Inventor: Nicolas I. Kacevas
Non-stalling circular counterflow pipeline processor with reorder buffer

Patent number: 6691222

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing, it wraps around and is sent down the instruction pipeline again.

Type: Grant

Filed: March 18, 2003

Date of Patent: February 10, 2004

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
Apparatus and method for maintaining a floating point data segment selector

Publication number: 20040024993

Abstract: An apparatus and method for maintaining a floating point data segment selector are described. In one embodiment, the method includes the detection of a micro-operation of a memory referencing macro-instruction from one or more micro-operations to be retired during a system clock cycle. When the detected micro-operation triggers an event, a micro-code event handler is triggered to initiate an update of floating point data segment selector information associated with the detected micro-operation. Otherwise, an FDS update device is triggered to update the floating point data segment selector information associated with the detected micro-operation.

Type: Application

Filed: August 5, 2002

Publication date: February 5, 2004

Inventor: Rajesh S. Parthasarathy
Maintaining processor ordering by checking load addresses of unretired load instructions against snooping store addresses

Patent number: 6687809

Abstract: An apparatus in a first processor includes a first data structure to store addresses of store instructions dispatched during a last predetermined number of cycles. The apparatus further includes logic to determine whether a load address of a load instruction being executed matches one of the store addresses in the first data structure. The apparatus still further includes logic to replay the respective load instruction if the load address of the respective load instruction matches one of the store addresses in the first data structure.

Type: Grant

Filed: October 24, 2002

Date of Patent: February 3, 2004

Assignee: Intel Corporation

Inventors: Muntaquim F. Chowdhury, Douglas M. Carmean
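
Illustrative note on the entry above (not part of the patent record): the "first data structure" can be modelled as a sliding window of the store addresses dispatched in the last few cycles. `RecentStoreFilter`, `tick`, and `must_replay` are hypothetical names, and exact address equality is assumed as the match condition.

```python
from collections import deque


class RecentStoreFilter:
    """Store addresses dispatched during the last `window_cycles` cycles."""

    def __init__(self, window_cycles):
        # One set of store addresses per cycle; the oldest cycle falls off the end.
        self.window = deque(maxlen=window_cycles)

    def tick(self, store_addresses=()):
        """Advance one cycle, recording the stores dispatched in it."""
        self.window.append(set(store_addresses))

    def must_replay(self, load_address):
        """True if the load address matches a store address still in the window."""
        return any(load_address in cycle for cycle in self.window)
```
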
Mechanism for enabling efficient execution of an instruction

Publication number: 20040015679

Abstract: A mechanism is provided for execution of an instruction having one or more parameters that need to be resolved at runtime. Instructions being executed may be stored in non-rewritable storage. The present invention allows costly parameter resolution to be circumvented during subsequent executions of the same instruction. An interpreter invokes an optimization module when it encounters an instruction with one or more associated parameters that need to be resolved at runtime. If the optimization module determines that resolved values associated with the instruction are available in a cache, then the optimization module obtains resolved values associated with the instruction from the cache. Resolving parameters into their corresponding object references is time-consuming and utilizes valuable computer resources. By obtaining resolved values stored during a previous execution of an instruction, the optimization module avoids repeatedly resolving parameters associated with an instruction.

Type: Application

Filed: July 17, 2002

Publication date: January 22, 2004

Inventor: Ioi K. Lam
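
Illustrative note on the entry above (not part of the patent record): because the instruction itself may sit in non-rewritable storage and cannot be patched in place, resolved values are kept in a separate cache. The sketch keys that cache by instruction address, which is an assumption; the class and attribute names are hypothetical.

```python
class OptimizationModule:
    """Caches resolved parameter values per instruction."""

    def __init__(self, resolver):
        self.resolver = resolver      # the costly runtime resolution routine
        self.cache = {}               # instruction address -> resolved values
        self.resolutions = 0          # how many times full resolution ran

    def resolve(self, instruction_address, params):
        """Return resolved values, resolving only on first execution."""
        if instruction_address not in self.cache:
            self.resolutions += 1
            self.cache[instruction_address] = tuple(
                self.resolver(p) for p in params
            )
        return self.cache[instruction_address]
```

An interpreter would call `resolve` each time it reaches such an instruction; only the first call per instruction pays for resolution.
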
Causality-based memory ordering in a multiprocessing environment

Patent number: 6681320

Abstract: Causality-based memory ordering in a multiprocessing environment. A disclosed embodiment includes a plurality of processors and arbitration logic coupled to the plurality of processors. The processors and arbitration logic maintain processor consistency yet allow stores generated in a first order by any two or more of the processors to be observed consistent with a different order of stores by at least one of the other processors. Causality monitoring logic coupled to the arbitration logic monitors any causal relationships with respect to observed stores.

Type: Grant

Filed: December 29, 1999

Date of Patent: January 20, 2004

Assignee: Intel Corporation

Inventor: Deborah T. Marr
Method and apparatus for emulating an instruction set extension in a digital computer system

Patent number: 6681322

Abstract: Methods for emulating an instruction set extension, comprising providing data to be operated upon, executing a first instruction with respect to a first portion of the data without committing the results of the first executed instruction, if no unmasked exceptions occur with respect to the first portion of the data, executing a second instruction with respect to a second portion of the data, and if no unmasked exceptions occur with respect to the second portion of the data, committing the results of the second executed instruction and again executing the first instruction with respect to the first portion of the data. If the first instruction is executed again, its results are committed. A handler is invoked if an unmasked exception occurs.

Type: Grant

Filed: November 26, 1999

Date of Patent: January 20, 2004

Assignee: Hewlett-Packard Development Company L.P.

Inventors: Kevin David Safford, Patrick Knebel
Instruction execution apparatus

Publication number: 20040006684

Abstract: An instruction execution apparatus comprising a register 43 for storing a copy of contents of the maximum number of entries that are executable simultaneously in one cycle with the entry storing the oldest unreleased instruction at the head among all entries in an instruction storage device 42 after execution of the instructions, a completion condition determination section 44 for determining whether the instructions stored in the entries of the register are completed in the cycle for determining completion conditions of the entries in the instruction storage device, and an entry release section 45 for releasing only the entries that are determined to be completed by the completion condition determination section among all entries in the instruction storage device, which allows the entries in the CSE to be released smoothly even though the number of entries in the CSE, or clock frequency, is increased.

Type: Application

Filed: December 31, 2002

Publication date: January 8, 2004

Applicant: FUJITSU LIMITED

Inventors: Yasunobu Akizuki, Aiichiro Inoue
Processor and instruction control method

Publication number: 20040006686

Abstract: A latest register update buffer which stores latest register update data is allocated and prepared for every general register for storing source data. A latest register update processing unit stores a value in the general register as latest register update data into the latest register update buffer when a register update instruction is not speculatively executed, and overwrites a result of the speculative execution when the instruction is speculatively executed. Upon instruction decoding, a matching processing unit reads out the latest register update data from the latest register update allocation buffer and stores it into a data area in a reservation station.

Type: Application

Filed: January 21, 2003

Publication date: January 8, 2004

Applicant: Fujitsu Limited

Inventor: Toshio Yoshida
Processor and instruction control method

Publication number: 20040006685

Abstract: When a predetermined instruction is fetched and decoded, an instruction issuing unit develops the instruction operation into a multiflow of a previous flow and a following flow and issues the instruction in order. The instruction is held in a reservation station. An instruction executing unit executes the instruction held in the reservation station out of order. Further, an execution result of the instruction is committed in order. A multiflow guarantee processing unit guarantees an execution result of the previous flow stored in an allocation register on a register update buffer until the following flow is committed. Even if the previous flow is committed and the allocation register is released, the guaranteeing process is realized by stalling another instruction serving as a next register allocation destination in a decoding cycle until the following flow is committed.

Type: Application

Filed: January 21, 2003

Publication date: January 8, 2004

Applicant: Fujitsu Limited

Inventor: Toshio Yoshida
Apparatus for mapping instructions using a set of valid and invalid logical to physical register assignments indicated by bits of a valid vector together with a logical register list

Patent number: 6675288

Abstract: A technique for managing register assignments. The technique involves maintaining, in a register list memory circuit having entries that respectively correspond to physical registers, a list of register assignments that assign logical registers to the physical registers. The technique further involves maintaining, in a vector memory circuit having bits that respectively correspond to the physical registers, a valid vector that forms, in combination with the list of register assignments, a list of valid register assignments. Furthermore, the technique involves storing, for an instruction that is mapped by the data processor, a copy of the valid vector from the vector memory circuit to a silo memory circuit. Preferably, the processor using the technique has the ability to execute branches of instructions speculatively, and to recover if it is determined that the processor executed down an incorrect instruction branch.

Type: Grant

Filed: May 9, 2002

Date of Patent: January 6, 2004

Assignee: Hewlett-Packard Development Company L.P.

Inventors: James Arthur Farrell, Sharon Marie Britton, Harry Ray Fair, III, Bruce Gieseke, Daniel Lawrence Leibholz, Derrick R. Meyer
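
Illustrative note on the entry above (not part of the patent record): the sketch below pairs a register list with a valid vector and saves a copy of the vector for each mapped instruction, so that recovery from a wrong speculative path only restores the vector. Retirement is not modelled, the free-register rule (not valid now and not valid in any saved copy) is an assumption, and all names are hypothetical.

```python
class RegisterMapper:
    def __init__(self, num_physical, num_logical):
        # Register list: the logical register assigned to each physical register.
        self.logical_of = list(range(num_logical)) + [None] * (num_physical - num_logical)
        # Valid vector: which list entries are current assignments.
        self.valid = [p < num_logical for p in range(num_physical)]
        self.silo = []   # (instruction id, saved valid vector), oldest first

    def lookup(self, logical):
        """Physical register currently assigned to a logical register."""
        for p, assigned in enumerate(self.logical_of):
            if self.valid[p] and assigned == logical:
                return p
        raise KeyError(logical)

    def _is_free(self, p):
        return not self.valid[p] and not any(saved[p] for _, saved in self.silo)

    def map_dest(self, inst_id, logical):
        """Assign a new physical register to an instruction's destination."""
        self.silo.append((inst_id, list(self.valid)))   # copy of the vector
        new = next(p for p in range(len(self.valid)) if self._is_free(p))
        self.valid[self.lookup(logical)] = False         # old assignment superseded
        self.logical_of[new] = logical
        self.valid[new] = True
        return new

    def recover(self, inst_id):
        """Undo the mapping made for inst_id and for everything younger."""
        idx = next(i for i, (iid, _) in enumerate(self.silo) if iid == inst_id)
        self.valid = self.silo[idx][1]
        del self.silo[idx:]
```

Superseded list entries are never overwritten while a saved vector still marks them valid, so restoring the vector alone brings back the earlier logical-to-physical assignments.
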
Program counter control method and processor

Publication number: 20040003207

Abstract: A program counter control method controls instructions by an out-of-order method using a branch prediction mechanism and controls an architecture having delay instructions for branching. The method includes the steps of simultaneously committing a plurality of instructions including a branch instruction, when a branch prediction is successful and the branch instruction branches, and simultaneously updating a program counter and a next program counter depending on a number of committed instructions.

Type: Application

Filed: January 28, 2003

Publication date: January 1, 2004

Applicant: FUJITSU LIMITED

Inventors: Ryuichi Sunayama, Kuniki Morita, Aiichiro Inoue
STREAMING VECTOR PROCESSOR WITH RECONFIGURABLE INTERCONNECTION SWITCH

Publication number: 20040003206

Abstract: A re-configurable, streaming vector processor (100) is provided which includes a number of function units (102), each having one or more inputs for receiving data values and an output for providing a data value, a re-configurable interconnection switch (104) and a micro-sequencer (118). The re-configurable interconnection switch (104) includes one or more links, each link operable to couple an output of a function unit (102) to an input of a function unit (102) as directed by the micro-sequencer (118). The vector processor may also include one or more input-stream units (122) for retrieving data from memory. Each input-stream unit is directed by a host processor and has a defined interface (116) to the host processor. The vector processor also includes one or more output-stream units (124) for writing data to memory or to the host processor. The defined interface of the input-stream and output-stream units forms a first part of the programming model.

Type: Application

Filed: June 28, 2002

Publication date: January 1, 2004

Inventors: Philip E. May, Kent Donald Moat, Raymond B. Essick, Silviu Chiricescu, Brian Geoffrey Lucas, James M. Norris, Michael Allen Schuette, Ali Saidi
Register window fill technique for retirement window having entry size less than amount of fill instructions

Publication number: 20030229772

Abstract: A register window fill technique for a retirement window having an entry size less than a number of fill instructions used in a fill condition is provided. The technique uses modified fill instructions that allow the retirement window to retire a portion of the fill instructions without having to determine whether a remaining portion of the fill instructions will execute without exceptions.

Type: Application

Filed: June 7, 2002

Publication date: December 11, 2003

Inventors: Chandra Thimmanagari, Sorin Iacobovici, Rabin Sugumar, Robert Nuckolls
Register window spill technique for retirement window having entry size less than amount of spill instructions

Publication number: 20030229771

Abstract: A register window spill technique for a retirement window having an entry size less than a number of spill instructions used in a spill condition is provided. The technique uses modified spill instructions that allow the retirement window to retire a portion of the spill instructions without having to determine whether a remaining portion of the spill instructions will execute without exceptions.

Type: Application

Filed: June 7, 2002

Publication date: December 11, 2003

Inventors: Chandra Thimmanagari, Sorin Iacobovici, Rabin Sugumar, Robert Nuckolls
Collapsible pipeline structure and method used in a microprocessor

Publication number: 20030226000

Abstract: A collapsible pipeline structure suitable for use in a microprocessor. The structure contains a first pipeline stage, under control of a clock, that exports a sequence of instruction stage results with respect to a clock cycle of the clock. A bypassing storage unit receives the sequence of instruction stage results and, when operating in collapsed mode, forwards that sequence on to the subsequent pipeline stage, bypassing the storage unit through a multiplexer. A second pipeline stage receives the output from the bypassing storage unit and exports its instruction stage results under control of the clock. If the collapsing function of the bypassing storage unit is disabled, the instruction pipeline functions in the conventional manner.

Type: Application

Filed: May 30, 2002

Publication date: December 4, 2003

Inventor: Mike Rhoades
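
The bypassing storage unit in publication 20030226000 behaves, at a very coarse level, like a latch with a multiplexer around it. The sketch below is a hypothetical cycle-level model (the name `BypassingStorage` and the `clock` method are assumptions, not taken from the application) showing how collapsed mode removes one cycle of latency.

```python
class BypassingStorage:
    """Latch between two pipeline stages with an optional bypass path."""

    def __init__(self, collapsed=False):
        self.collapsed = collapsed
        self.latch = None  # value captured on the previous clock edge

    def clock(self, stage1_result):
        """Advance one clock cycle; return the value seen by the second stage."""
        if self.collapsed:
            # The multiplexer selects the bypass path: no added cycle.
            return stage1_result
        # Conventional mode: the second stage sees last cycle's result.
        out = self.latch
        self.latch = stage1_result
        return out


normal = BypassingStorage(collapsed=False)
collapsed = BypassingStorage(collapsed=True)
normal_out = [normal.clock(v) for v in (1, 2, 3)]
collapsed_out = [collapsed.clock(v) for v in (1, 2, 3)]
```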
System for rejecting and reissuing instructions after a variable delay time period

Patent number: 6654876

Abstract: A method, processor, and data processing system implementing a delayed reject mechanism are disclosed. The processor includes an issue unit suitable for issuing an instruction in a first cycle and a load store unit (LSU). The LSU includes an extend reject calculator circuit configured to receive a set of completion information signals and generate a delay value based thereon. The LSU is adapted to determine whether to reject the instruction in a determination cycle. The number of cycles between the first cycle and the determination cycle is a function of the delay value such that reject timing is variable with respect to the first cycle. In one embodiment, the processor is further configured to reissue the instruction after the determination cycle if the instruction was rejected in the determination cycle. The delay value is conveyed via a 2-bit bus in one embodiment. The 2 bit bus permits delaying the determination cycle from 0 to 3 cycles after a finish cycle.

Type: Grant

Filed: November 4, 1999

Date of Patent: November 25, 2003

Assignee: International Business Machines Corporation

Inventors: Hung Qui Le, David James Shippy
Assigning a group tag to an instruction group wherein the group tag is recorded in the completion table along with a single instruction address for the group to facilitate in exception handling

Patent number: 6654869

Abstract: A microprocessor includes a fetch unit, an instruction cracking unit, and dispatch and completion control logic. The fetch unit retrieves a set of instructions from an instruction cache. The instruction cracking unit receives the set of fetched instructions and organizes the set of instructions into an instruction group. The dispatch and completion logic assigns a group tag to the instruction group and records the group tag in an entry of the completion table for tracking the completion status of the instructions comprising the instruction group. The dispatch and completion logic may record a single instruction address in the completion table entry corresponding to each instruction group. Preferably, the single instruction address is the instruction address of the first instruction in the instruction group. The processor may flush the instruction group in response to detecting an exception generated by an instruction in the instruction group.

Type: Grant

Filed: October 28, 1999

Date of Patent: November 25, 2003

Assignee: International Business Machines Corporation

Inventors: James Allan Kahle, Hung Qui Le, Charles Roberts Moore
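
To make the group-tag bookkeeping of patent 6654869 concrete, here is a minimal sketch of a completion table that keeps one entry, and one instruction address, per group. The class and method names are hypothetical, and flushing younger groups together with the excepting group is an assumption of this model rather than something the abstract states.

```python
class CompletionTable:
    """One entry per instruction group, keyed by group tag."""

    def __init__(self):
        self.entries = {}   # group tag -> {"first_addr": int, "pending": int}
        self.next_tag = 0

    def dispatch_group(self, addresses):
        gtag = self.next_tag
        self.next_tag += 1
        # Only the address of the first instruction in the group is recorded.
        self.entries[gtag] = {"first_addr": addresses[0], "pending": len(addresses)}
        return gtag

    def finish_instruction(self, gtag):
        self.entries[gtag]["pending"] -= 1

    def is_complete(self, gtag):
        return self.entries[gtag]["pending"] == 0

    def flush_from(self, gtag):
        """Flush the excepting group (and, by assumption, younger groups).

        Returns the address from which fetching could restart.
        """
        restart = self.entries[gtag]["first_addr"]
        for tag in [t for t in self.entries if t >= gtag]:
            del self.entries[tag]
        return restart


table = CompletionTable()
g0 = table.dispatch_group([0x100, 0x104, 0x108])
g1 = table.dispatch_group([0x10C, 0x110])
for _ in range(3):
    table.finish_instruction(g0)
restart_addr = table.flush_from(g1)   # an instruction in group g1 raised an exception
```

Recording a single address per group keeps each entry small, at the cost of restarting from the beginning of the group when an exception occurs.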
Instruction scheduling system of a processor

Patent number: 6643767

Abstract: The present invention is related to a processor capable of speculatively executing an instruction having a data dependence upon a preceding instruction in order to improve the efficiency of dynamically scheduling instructions. The reissue of instructions is possible without lowering the efficiency of the instruction scheduling process by dividing the function of scheduling instructions and the function of the reissue of instructions.

Type: Grant

Filed: January 27, 2000

Date of Patent: November 4, 2003

Assignee: Kabushiki Kaisha Toshiba

Inventor: Toshinori Sato
Parallel processor

Publication number: 20030200422

Abstract: A parallel processor has a plurality of operation units that execute operation instructions, and a multi-bank register file in which a plurality of banks each having a plurality of registers are formed. Each of simultaneously input machine instructions is split into a plurality of nano-instructions each of which includes at least one of an access instruction and operation instruction. The output clock cycles of operation instructions with respect to the operation units are arbitrated. Furthermore, the output clock cycles of access instructions to the multi-bank register file are arbitrated so as to prevent access instructions from contending in an identical bank in the multi-bank register file.

Type: Application

Filed: February 18, 2003

Publication date: October 23, 2003

Applicant: Semiconductor Technology Academic Research Center

Inventors: Tetsuo Hironaka, Mattausch Hans Juergen, Takeshi Hiramatsu
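
The bank arbitration in publication 20030200422 can be illustrated by a scheduler that delays any access whose bank is already busy in a given clock cycle. This sketch assumes one port per bank and that a register's bank is its number modulo the bank count; both are assumptions for illustration, and `arbitrate` is a hypothetical name.

```python
def arbitrate(register_numbers, num_banks):
    """Assign an output clock cycle to each access, in order.

    No two accesses to the same bank are given the same cycle.
    """
    next_free_cycle = {}  # bank -> first cycle in which the bank is free
    cycles = []
    for reg in register_numbers:
        bank = reg % num_banks          # assumed bank mapping
        cycle = next_free_cycle.get(bank, 0)
        cycles.append(cycle)
        next_free_cycle[bank] = cycle + 1
    return cycles


# Registers 0, 4 and 8 share bank 0 when there are four banks; register 1 is in bank 1.
schedule = arbitrate([0, 4, 1, 8], num_banks=4)
```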
Memory disambiguation for large instruction windows

Publication number: 20030196075

Abstract: A memory disambiguation apparatus includes a store queue, a store forwarding buffer, and a version count buffer. The store queue includes an entry for each store instruction in the instruction window of a processor. Some store queue entries include resolved store addresses, and some do not. The store forwarding buffer is a set-associative buffer that has entries allocated for store instructions as store addresses are resolved. Each entry in the store forwarding buffer is allocated into a set determined in part by a subset of the store address. When the set in the store forwarding buffer is full, an older entry in the set is discarded in favor of the newly allocated entry. A version count buffer including an array of overflow indicators is maintained to track overflow occurrences. As load addresses are resolved for load instructions in the instruction window, the set-associative store forwarding buffer can be searched to provide memory disambiguation.

Type: Application

Filed: May 15, 2003

Publication date: October 16, 2003

Applicant: Intel Corporation

Inventors: Haitham Akkary, Sebastien Hily
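
As a rough illustration of the set-associative store forwarding buffer in publication 20030196075, the sketch below indexes sets by the low bits of the store address, discards the oldest entry when a set is full, and counts overflows per set. The names, the per-set counter, and the use of plain integers as addresses are assumptions; the application's actual version count buffer may be organized differently.

```python
class StoreForwardingBuffer:
    """Set-associative buffer of resolved stores with overflow tracking."""

    def __init__(self, num_sets=4, ways=2):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [[] for _ in range(num_sets)]  # each set: (addr, data), oldest first
        self.overflow = [0] * num_sets             # assumed per-set overflow indicator

    def insert_store(self, addr, data):
        index = addr % self.num_sets   # set chosen by a subset of the address bits
        entries = self.sets[index]
        if len(entries) == self.ways:
            entries.pop(0)             # discard an older entry in favor of the new one
            self.overflow[index] += 1  # remember that forwarding data was lost
        entries.append((addr, data))

    def lookup_load(self, addr):
        """Return data from the youngest matching store, or None."""
        index = addr % self.num_sets
        for a, d in reversed(self.sets[index]):
            if a == addr:
                return d
        return None


sfb = StoreForwardingBuffer(num_sets=2, ways=2)
sfb.insert_store(0, "a")
sfb.insert_store(2, "b")
sfb.insert_store(4, "c")   # set 0 is full, so the store to address 0 is discarded
```

A load that misses in a set whose overflow count has changed cannot be sure that no older store matched, which is why the overflow occurrences need to be tracked at all.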
Completion monitoring in a processor having multiple execution units with various latencies

Publication number: 20030196074

Abstract: A method, processor architecture, computer program product, and data processing system for determining when an instruction in a pipelined processor should be completed is provided. As each instruction is issued to an execution unit, an entry for that instruction is placed within a “finish pipe,” which consists of a series of consecutively numbered stages. Each clock cycle, the entries in the finish pipe advance one stage. When an entry has reached the stage corresponding to the latency of its associated execution unit, it becomes mature.

Type: Application

Filed: April 11, 2002

Publication date: October 16, 2003

Applicant: International Business Machines Corporation

Inventors: Hung Qui Le, Dung Quoc Nguyen
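
The "finish pipe" of publication 20030196074 lends itself to a short simulation. In this sketch, which is a simplified model with hypothetical names, each entry carries the latency of its execution unit, advances one stage per clock tick, and is reported as mature (and removed, which is a simplification) when its stage number equals that latency.

```python
class FinishPipe:
    """Series of consecutively numbered stages; entries advance each cycle."""

    def __init__(self, depth):
        self.depth = depth
        # Stage 0 holds entries issued this cycle; stages 1..depth follow.
        self.stages = [[] for _ in range(depth + 1)]

    def issue(self, tag, latency):
        if not 1 <= latency <= self.depth:
            raise ValueError("latency must be between 1 and the pipe depth")
        self.stages[0].append((tag, latency))

    def tick(self):
        """Advance every entry one stage and return the tags that matured."""
        self.stages = [[]] + self.stages[:-1]
        mature = []
        for number, stage in enumerate(self.stages):
            remaining = []
            for tag, latency in stage:
                if number == latency:
                    mature.append(tag)
                else:
                    remaining.append((tag, latency))
            self.stages[number] = remaining
        return mature


pipe = FinishPipe(depth=3)
pipe.issue("add", latency=1)
pipe.issue("mul", latency=3)
matured_per_cycle = [pipe.tick() for _ in range(3)]
```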
Arithmetic computation of potential addresses

Publication number: 20030196073

Abstract: An apparatus is presented for expediting the execution of address-dependent micro instructions in a pipeline microprocessor. The apparatus computes a speculative result associated with an arithmetic operation, where the arithmetic operation is prescribed by a preceding micro instruction that is yet to generate a result. The apparatus utilizes the speculative result to configure a speculative address operand that is provided to an address-dependent micro instruction. The apparatus includes speculative operand calculation logic and an update forwarding cache. The speculative operand calculation logic performs the arithmetic operation to generate the speculative result prior to when execute logic executes the preceding micro instruction to generate the result.

Type: Application

Filed: May 5, 2003

Publication date: October 16, 2003

Applicant: IP-First LLC

Inventor: Gerard M. Col
Mechanism for forward data in a processor pipeline using a single pipefile connected to the pipeline

Patent number: 6633971

Abstract: A method for forwarding data within a pipeline of a pipelined data processor having a plurality of execution pipeline stages where each stage accepts a plurality of operand inputs and generates a result. The result generated by each execution pipeline stage is selectively coupled to an operand input of one of the execution pipeline stages.

Type: Grant

Filed: October 1, 1999

Date of Patent: October 14, 2003

Assignee: Hitachi, Ltd.

Inventors: Chih-Jui Peng, Lew Chua-Eoan
Processor with registers storing committed/speculative data and a RAT state history recovery mechanism with retire pointer

Patent number: 6633970

Abstract: A mechanism is provided for allowing a processor to recover from a failure of a predicted path of instructions (e.g., from a mispredicted branch or other event). The mechanism includes a plurality of physical registers, each physical register can store either architectural data or speculative data. The apparatus also includes a primary array to store a mapping from logical registers to physical registers, the primary array storing a speculative state of the processor. The apparatus also includes a buffer coupled to the primary array to store information identifying which physical registers store architectural data and which physical registers store speculative data. According to another embodiment, a history buffer is coupled to the secondary array and stores historical physical register to logical register mappings performed for each of a plurality of instructions part of a predicted path.

Type: Grant

Filed: December 28, 1999

Date of Patent: October 14, 2003

Assignee: Intel Corporation

Inventors: David W. Clift, Darrell D. Boggs, David J. Sager
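
The history-based recovery mentioned in the abstract of patent 6633970 can be sketched as a rename table plus a log of overwritten mappings. This is a loose software analogy with hypothetical names, not the patented mechanism: the list-based log, the sequential allocation of physical registers, and the `keep` argument are assumptions made for illustration.

```python
class RenameTable:
    """Speculative logical-to-physical mapping with a history log for undo."""

    def __init__(self, num_logical):
        self.rat = list(range(num_logical))  # primary array (speculative state)
        self.next_phys = num_logical         # assumption: allocate sequentially
        self.history = []                    # (logical, previous physical), oldest first

    def rename(self, logical):
        previous = self.rat[logical]
        new = self.next_phys
        self.next_phys += 1
        self.history.append((logical, previous))
        self.rat[logical] = new
        return new

    def retire_oldest(self):
        # The oldest mapping becomes architectural; its undo record is dropped.
        self.history.pop(0)

    def recover(self, keep):
        """Undo all but the oldest `keep` unretired renames, youngest first."""
        while len(self.history) > keep:
            logical, previous = self.history.pop()
            self.rat[logical] = previous


table = RenameTable(num_logical=4)
first = table.rename(1)    # on the correct path
second = table.rename(1)   # on a mispredicted path
third = table.rename(2)    # on a mispredicted path
table.recover(keep=1)
```

Undoing the log youngest-first matters when the same logical register was renamed more than once on the wrong path, as register 1 is here.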
Apparatus and method for selective control of condition code write back

Publication number: 20030188133

Abstract: A microprocessor apparatus and method are provided, for selectively controlling write back of condition codes. The microprocessor apparatus has translation logic and extended execution logic. The translation logic translates an extended instruction into corresponding micro instructions. The extended instruction includes an extended prefix and an extended prefix tag. The extended prefix disables write back of the condition codes, where the condition codes correspond to a result of a prescribed operation. The extended prefix tag indicates the extended prefix, where the extended prefix tag is an otherwise architecturally specified opcode within an instruction set for a microprocessor. The extended execution logic is coupled to the translation logic. The extended execution logic receives the corresponding micro instructions, and generates the result, and disables write back of the condition codes.

Type: Application

Filed: May 9, 2002

Publication date: October 2, 2003

Applicant: IP-First LLC

Inventors: G. Glenn Henry, Rodney E. Hooker, Terry Parks
Method and apparatus for a byte lane selectable performance monitor bus

Patent number: 6629170

Abstract: A multi-stage byte lane selectable bus. In a preferred embodiment, the bus in performance monitor mode includes a plurality of byte lanes and a selection mechanism. The selection mechanism acquires, from a plurality of signals, a subset of those signals, which are desired to be monitored, and places this subset of signals on the byte lanes that are input to the PMU. The number of the plurality of signals that potentially may be monitored is greater than the number of byte lanes and is also greater than the number of PMU counters.

Type: Grant

Filed: November 8, 1999

Date of Patent: September 30, 2003

Assignee: International Business Machines Corporation

Inventors: Joel Roger Davidson, Michael Stephen Floyd, Paul Joseph Jordan, Judith E. K. Laurens, Alexander Erik Mericas, Kevin F. Reick
Pipeline decoupling buffer for handling early data and late data

Patent number: 6629167

Abstract: An apparatus for and a method of decoupling at least two multi-stage pipelines are described. At least two paths of data through which data from the first pipeline is sent to the second pipeline are provided. During a pipelined execution of a task in the at least two pipelines, the second pipeline may not require all of the data produced in the first pipeline to process at least some subset of the task. The first pipeline may not be able to produce all data required by each of the stages of the second pipeline. One of the two data paths provides an early data path for a type of data that becomes available in a stage of the first pipeline and that may be processed in a stage of the second pipeline early in time. The other of the two data paths provides a late data path for a type of data that becomes available in a stage of the first pipeline and that may be processed in a stage of the second pipeline later in time. Each data path may comprise a buffer, e.g., a FIFO.

Type: Grant

Filed: February 18, 2000

Date of Patent: September 30, 2003

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Stephen Undy, James E. McCormick, Jr.
Secondary reorder buffer microprocessor

Patent number: 6629233

Abstract: A method, processor, and data processing system for enabling maximum instruction issue despite the presence of complex instructions that require multiple rename registers is disclosed. The method includes allocating a first rename register from a first reorder buffer for storing the contents of a first register affected by the complex instruction. A second rename register from a second reorder buffer is then allocated for storing the contents of a second register affected by the complex instruction. In an embodiment in which the first reorder buffer supports a maximum number of allocations per cycle, the allocation of the second register using the second reorder buffer prevents the complex instruction from requiring multiple allocation slots in the first reorder buffer. The method may further include issuing a second instruction that contains a dependency on a register that is allocated in the secondary reorder buffer.

Type: Grant

Filed: February 17, 2000

Date of Patent: September 30, 2003

Assignee: International Business Machines Corporation

Inventor: James Allan Kahle
Method for limiting physical resource usage in a virtual tag allocation environment of a microprocessor

Publication number: 20030182540

Abstract: A method of handling instructions in a load/store unit of a processor by dispatching instructions to the load/store unit, filling a portion of physical entries of a reorder queue with tags corresponding to the instructions while limiting usage of the physical entries of the reorder queue to less than a total number of physical entries, and further dispatching one or more additional instructions to the load/store unit while the filled physical entries in the reorder queue are still full, i.e., still contain tags for uncompleted instructions. The limiting of usage of the physical entries may be selectively applied. Multiple logical instruction tags are assigned in a count greater than the number of physical entries in the reorder queue. Of the multiple logical instruction tags assigned to a single one of the physical entries in the reorder queue, only the tag for the oldest instruction is allowed to execute.

Type: Application

Filed: January 30, 2003

Publication date: September 25, 2003

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: William Elton Burky, Dung Quoc Nguyen, Balaram Sinharoy, Albert Thomas Williams
Computer system

Publication number: 20030177337

Abstract: A computer system comprising a data file having entries each of which is designed to hold data, an advanced and a completed mapping file each having entries each of which is designed to hold a data-file-entry address, an operation window that is a buffer to hold substances of operations waiting execution, and a state-modification queue that is designed to be able to hold a substance of a modification on the advanced mapping file for each clock cycle; wherein making a modification on the advanced mapping file, entering the substance of this modification into the state-modification queue, and entering substances of operations into the operation window are each to be done in one clock cycle, and operations held in the operation window are to be executed out of order. The system can attain high performance easily and utilize programs described in any machine language for traditional register-based/stack-based processors.

Type: Application

Filed: February 25, 2003

Publication date: September 18, 2003

Inventor: Hajime Seki
