Commitment Control Or Register Bypass Patents (Class 712/218)
  • Publication number: 20030177339
    Abstract: A problem in a message-based pipelined processor system is that the pipelining features of the execution pipeline of the system cannot be fully utilized when the first stages of the pipeline are awaiting the determination of a memory address by the last stage of the pipeline. The invention therefore proposes that the message-based memory addresses are determined before the messages are buffered, or even earlier, already at message sending, so that the memory addresses are ready for use as soon as message processing by the pipeline is initiated. This typically means that the address determination routine of the operating system is executed, and that the corresponding memory address is included in the relevant message before the message is buffered in the message buffers. In this way, the memory address can be loaded into the program counter and the instructions fetched right away as soon as message processing is initiated.
    Type: Application
    Filed: May 15, 2003
    Publication date: September 18, 2003
    Inventor: Nils Ola Linnermark
  • Patent number: 6615340
    Abstract: Extended operand management indicators stored during initial program execution enable management and regulation of operand values and streamline their handling. Operand values are stored in new types of stores. Operand location management indicators indicate current operand value locations among various store types for selected operands. Indicated operand-forwarding policies for selected operands streamline forwarding of operand values from source instructions to value receiving target instructions. Indicated loop iterations of operand source instructions enable forwarding of operands over more than one loop iteration. Stride indicators indicate strides of program loop accesses to matrix operands. Inter-loop indicators enable forwarding of operand values from source loop instructions directly to target loop instructions. Constant or nearly constant operands are indicated to enable their storage in special caches.
    Type: Grant
    Filed: March 22, 2000
    Date of Patent: September 2, 2003
    Inventor: Richard Byron Wilmot, II
  • Publication number: 20030163672
    Abstract: The invention provides a processor architecture that bypasses data hazards. The architecture has an array of pipelines and a register file. Each of the pipelines includes an array of execution units. The register file has a first section of n registers (e.g., 128 registers) and a second section of m registers (e.g., 16 registers). A write mux couples speculative data from the execution units to the second set of m registers and non-speculative data from a write-back stage of the execution units to the first section of n registers. A read mux couples the speculative data from the second set of m registers to the execution units to bypass data hazards within the execution units. The register file preferably includes column decode logic for each of the registers in the second section of m registers to architect speculative data without moving data. The decode logic first decodes, and then selects, an age of the producer of the speculative state; the newest producer enables the decode.
    Type: Application
    Filed: February 11, 2002
    Publication date: August 28, 2003
    Inventors: Eric S. Fetzer, Donald C. Soltis, Stephen R. Undy
  • Publication number: 20030159021
    Abstract: An instruction decode mechanism enables an instruction to control data flow bypassing hardware within a pipelined processor of a programmable processing engine. The control mechanism is defined by an instruction set of the processor as a unique register decode value that specifies either source operand bypassing (via a source bypass operand) or result bypassing (via a result bypass operand) from a previous instruction executing in pipeline stages of the processor. The source bypass operand allows source operand data to be shared among the parallel execution units of the pipelined processor, whereas the result bypass operand explicitly controls data flow within a pipeline of the processor through the use of result bypassing hardware of the processor. The instruction decode control mechanism essentially allows an instruction to directly identify a pipeline stage register for use as its source operand.
    Type: Application
    Filed: September 3, 1999
    Publication date: August 21, 2003
    Inventors: Darren Kerr, John William Marshall
  • Publication number: 20030154365
    Abstract: In the Retirement Payload Array (RPA) of a microprocessor, the pointer advance signal “ADVANCE POINTER” from the Instruction Retirement Logic (IRL) of the Instruction Scheduling Unit (ISU) is utilized to provide conditional read RPA signals. Consequently, according to the invention, a read of the RPA is completed only if it is determined that the read word line being read in the current cycle is not the same read word line that was read in the previous cycle.
    Type: Application
    Filed: February 8, 2002
    Publication date: August 14, 2003
    Applicant: Sun Microsystems, Inc.
    Inventors: Arjun P. Chandran, Gregg K. Tsujimoto, Anup S. Mehta
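The conditional read described in publication 20030154365 can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name `RetirementPayloadArray`, the `cycle` method, and the explicit word-line comparison are hypothetical stand-ins for the hardware, which the abstract says derives the condition from the ADVANCE POINTER signal.

```python
class RetirementPayloadArray:
    """Toy model (hypothetical): the array is read only when the word line changes."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.last_word_line = None  # word line read in the previous cycle
        self.latched = None         # output held from the last real read
        self.reads_performed = 0    # counts actual array accesses

    def cycle(self, word_line):
        # Skip the array access when the same word line was read last cycle.
        if word_line != self.last_word_line:
            self.latched = self.rows[word_line]
            self.reads_performed += 1
            self.last_word_line = word_line
        return self.latched


rpa = RetirementPayloadArray(["payload0", "payload1", "payload2"])
outputs = [rpa.cycle(word_line) for word_line in (0, 0, 1, 1, 1, 2)]
```

In this model six cycles produce six outputs but only three array reads, which is the saving the abstract attributes to the conditional read signal.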
  • Publication number: 20030154364
    Abstract: A method for forwarding data within a pipeline of a pipelined data processor having a plurality of execution pipeline stages where each stage accepts a plurality of operand inputs and generates a result. The result generated by each execution pipeline stage is selectively coupled to an operand input of one of the execution pipeline stages.
    Type: Application
    Filed: October 1, 1999
    Publication date: August 14, 2003
    Inventors: Chih-Jui Peng, Lew Chua-Eoan
  • Patent number: 6606702
    Abstract: Disclosed is a method of operating a processor, by which a speculatively issued load request, which fetches incorrect data, is recycled. An instruction sequence, which includes a barrier instruction and a load instruction that follows the barrier instruction in program order, is received for execution. In response to the barrier instruction, a barrier operation is issued on an interconnect. Following, in response to the load instruction and while the barrier operation is pending, a load request is issued to memory. When a pre-determined type of invalidate, which is affiliated with the load request, is received before the receipt of an acknowledgment for the barrier operation, data that is returned by memory in response to the load request is discarded and the load request is re-issued. The pre-determined type of invalidate includes, for example, a snoop invalidate.
    Type: Grant
    Filed: June 6, 2000
    Date of Patent: August 12, 2003
    Assignee: International Business Machines Corporation
    Inventors: Guy Lynn Guthrie, Ravi Kumar Arimilli, John Steven Dodson, Derek Edward Williams
  • Patent number: 6604193
    Abstract: A processor has an instruction decoder including a register number translation unit for translating a register number specified by an instruction into the number of a physical register to be actually used in execution of the instruction. In an operation to decode an instruction, after a register number specified by the instruction is translated into the number of a physical register to be actually used in execution of the instruction, a register rename unit replaces the number of the physical register with the number of a rename register. As a result, the translation of a register number specified by the instruction into the number of a physical register to be actually used in execution of the instruction can be changed dynamically at run time even for a superscalar processor carrying out register renaming operations.
    Type: Grant
    Filed: December 10, 1999
    Date of Patent: August 5, 2003
    Assignee: Hitachi, Ltd.
    Inventors: Kentaro Shimada, Isao Kimura, Kazunari Tanaka
  • Patent number: 6601162
    Abstract: A bypass logic circuit (30) generates select signals (SelRs0, SelRt0, SelRs1 and SelRt1) by using prediction result flags (PrdNTkn1A and PrdNTkn1D) which are results of prediction about branch, instead of a branch condition not-taken signal (NTknA) actually output from a branch unit (52). Bypass multiplexers (44, 46, 54, 56) select operands to be output to ALU (42) or the branch unit (52) on the basis of these select signals (SelRs0, SelRt0, SelRs1 and SelRt1). Therefore, ample time is given for generating these select signals (SelRs0, SelRt0, SelRs1 and SelRt1).
    Type: Grant
    Filed: January 19, 2000
    Date of Patent: July 29, 2003
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Tatsuo Teruyama
  • Publication number: 20030140217
    Abstract: Embodiments are provided in which result forwarding for each execution unit in a processor is implemented for only one operand input of the execution unit. If another non-implemented operand input of the execution unit needs forwarded results, the forwarded results are passed through the implemented operand input. Non-forwarded operands are passed through the non-implemented operand input.
    Type: Application
    Filed: January 22, 2002
    Publication date: July 24, 2003
    Applicant: International Business Machines Corporation
    Inventor: David Arnold Luick
  • Patent number: 6598152
    Abstract: Enables a processor to quickly recover reliable use of a multi-cycle index used in a branch prediction mechanism for certain types of flush events occurring in the processor pipeline, whether the flush event occurs for a non-branch instruction or for a branch instruction contained in the same dispatch group. A GHV (global history vector) value is used in the generation of a multi-cycle index required for locating a prediction in a GBHT (global branch history table) for the instruction associated with the GHV value. The GHV value is captured in a BIQ (branch information queue) element representing each branch instruction selected for execution of a program. The BIQ element also captures an associated GHV count when the GHV value is captured.
    Type: Grant
    Filed: November 8, 1999
    Date of Patent: July 22, 2003
    Assignee: International Business Machines Corporation
    Inventor: Balaram Sinharoy
  • Publication number: 20030135715
    Abstract: In an enhanced virtual renaming scheme within a processor, multiple logical registers may be mapped to a single physical register. A value cache determines whether a new value generated pursuant to program instructions matches values associated with previously executed instructions. If so, the logical register associated with the newly executed instruction shares the physical register. Also, deadlock preventative measures may be integrated into a register allocation unit in a manner that “steals” a physical register from a younger executed instruction when a value from an older instruction is generated in a processor core.
    Type: Application
    Filed: January 27, 2003
    Publication date: July 17, 2003
    Inventors: Stephan J. Jourdan, Ronny Ronen, Michael Bekerman
  • Publication number: 20030135722
    Abstract: A system, method and apparatus is provided that splits a microprocessor load instruction into two (2) parts, a speculative load instruction and a check speculative load instruction. The speculative load instruction can be moved ahead in the instruction stream by the compiler as soon as the address and result registers are available. This is true even when the data to be loaded is not actually required. This speculative load instruction will not cause a fault in the memory if the access is invalid, i.e. the load misses and a token bit is set. The check speculative load instruction will cause the speculative load instruction to be retried in the event the token bit was set equal to one. In this manner, the latency associated with branching to an interrupt routine will be eliminated a significant amount of the time. It is very possible that the reasons for invalidating the speculative load operation are no longer present (e.g. page in memory is not present) and the load will be allowed to complete.
    Type: Application
    Filed: January 10, 2002
    Publication date: July 17, 2003
    Applicant: International Business Machines Corporation
    Inventor: Andrew Johnson
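The split into a speculative load and a check instruction in publication 20030135722 can be sketched in software. The function names, the dictionary standing in for memory, and the use of `KeyError` as the "fault" are all assumptions made for illustration; the abstract describes hardware instructions, not this API.

```python
def speculative_load(pages, address):
    """Hypothetical model: never faults; returns (value, token_bit)."""
    if address in pages:
        return pages[address], False
    # Invalid access: no fault is raised, the token bit is set instead.
    return None, True


def check_speculative_load(pages, address, value, token_bit):
    """Hypothetical model: retries the load only when the token bit is set."""
    if token_bit:
        # The retried load may now succeed; a KeyError here models a real fault.
        return pages[address]
    return value


pages = {}                                     # page not yet present
value, token = speculative_load(pages, 0x40)   # hoisted early by the compiler
pages[0x40] = 7                                # page becomes present in the meantime
result = check_speculative_load(pages, 0x40, value, token)
```

The example follows the case the abstract highlights: the reason for the invalid access has gone away by the time the check executes, so the retried load completes without branching to an interrupt routine.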
  • Publication number: 20030126411
    Abstract: A processor system and method that reduces the number of register value copying made from alias registers to corresponding real (architectural) registers. One method entails not performing an alias register to real register copying if the incoming instruction does not designate a real register. Another method entails delaying alias register to real register copying until the corresponding reorder buffer (ROB) entry is actually written to. Yet another method entails not performing an alias register to real register copying if the ROB entry is the same as the existing ROB entry. And, still another method entails further delaying or stalling the allocation of an ROB entry.
    Type: Application
    Filed: June 26, 2002
    Publication date: July 3, 2003
    Inventors: Guillermo Savransky, Ronny Ronen, Antonio Gonzalez
  • Publication number: 20030126410
    Abstract: A processor system and method that reduces the number of register value copying made from alias registers to corresponding real (architectural) registers. The method entails determining whether to copy the register value generated by executing an instruction from the alias register to the real register at the time the reorder buffer entry associated with the alias register is needed for a new instruction. If before the reorder buffer is needed for a new instruction, an interim instruction resulted in a new register value for the real register, then the original register value would be invalid at the time the reorder buffer entry is needed for the new instruction. Thus, there would not be a need to copy the original register value to the real register. The reduction in copying can make the processor system consume less power.
    Type: Application
    Filed: January 2, 2002
    Publication date: July 3, 2003
    Inventors: Guillermo Savransky, Ronny Ronen
  • Publication number: 20030126417
    Abstract: A method and apparatus to execute data speculative instructions in a processor comprising at least one source register, each source register comprising a bit to indicate validity of data in the at least one source register. A data validity circuit coupled to the one or more source registers to determine the validity of the data in the source registers, and to indicate the validity of the data in a destination register based upon the validity bit in the at least one source register. The processor optionally comprising a checker unit to retire those instructions from the execution unit which write valid data to the destination register, and to reschedule those instructions for execution which write invalid data to the destination register.
    Type: Application
    Filed: January 2, 2002
    Publication date: July 3, 2003
    Inventors: Eric Sprangle, Michael J. Haertel, David J. Sager
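The validity-bit propagation in publication 20030126417 can be modeled in a few lines. The `Reg` dataclass, the `execute_add` operation, and the string results of `checker` are hypothetical names chosen for this sketch, which assumes the destination is valid only when every source is valid.

```python
from dataclasses import dataclass


@dataclass
class Reg:
    value: int = 0
    valid: bool = True  # validity bit carried alongside the data


def execute_add(dst, srcs):
    """Compute a result and derive the destination's validity from its sources."""
    dst.value = sum(s.value for s in srcs)
    dst.valid = all(s.valid for s in srcs)


def checker(dst):
    """Retire instructions that wrote valid data; reschedule the rest."""
    return "retire" if dst.valid else "reschedule"


a, b = Reg(2, True), Reg(3, False)  # b holds data-speculative (invalid) data
d = Reg()
execute_add(d, [a, b])
first_attempt = checker(d)

b.valid = True                      # the speculative data is later confirmed
execute_add(d, [a, b])
second_attempt = checker(d)
```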
  • Patent number: 6587941
    Abstract: A pipelined processor and method are disclosed including an improved history file unit. The pipelined processor processes a plurality of instructions in order. A register file is included which includes a different read port coupled to each register field in an instruction buffer for reading data from the register file. A history file unit is included and is coupled to each of the read ports of the register file for receiving a copy of all data read from the register file.
    Type: Grant
    Filed: February 4, 2000
    Date of Patent: July 1, 2003
    Assignee: International Business Machines Corporation
    Inventors: Brian King Flacks, Harm Peter Hofstee, Osamu Takahashi
  • Publication number: 20030088759
    Abstract: System and method to reduce execution of instructions involving unreliable data in a speculative processor. A method comprises identifying scratch values generated during speculative execution of a processor, and setting at least one tag associated with at least one data area of the processor to indicate that the data area holds a scratch value. Such data areas include registers, predicates, flags, and the like. Instructions may also be similarly tagged. The method may be executed by an execution engine in a computer processor.
    Type: Application
    Filed: November 5, 2001
    Publication date: May 8, 2003
    Inventor: Christopher B. Wilkerson
  • Publication number: 20030088760
    Abstract: According to one aspect of the invention, a method is provided in which store addresses of store instructions dispatched during a last predetermined number of cycles are maintained in a first data structure of a first processor. It is determined whether a load address of a first load instruction matches one of the store addresses in the first data structure. The first load instruction is replayed if the load address of the first load instruction matches one of the store addresses in the first data structure.
    Type: Application
    Filed: October 24, 2002
    Publication date: May 8, 2003
    Inventors: Muntaquim F. Chowdhury, Douglas M. Carmean
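The load-replay check in publication 20030088760 can be sketched as a sliding window of recent store addresses. The class `RecentStoreTracker` and its methods are hypothetical; the abstract only says that a "first data structure" holds the store addresses from the last predetermined number of cycles.

```python
from collections import deque


class RecentStoreTracker:
    """Hypothetical model: remembers store addresses from the last N cycles."""

    def __init__(self, window_cycles):
        # One set of addresses per cycle; the oldest cycle drops out automatically.
        self.window = deque(maxlen=window_cycles)

    def tick(self, store_addresses=()):
        """Advance one cycle, recording the stores dispatched in it."""
        self.window.append(frozenset(store_addresses))

    def must_replay(self, load_address):
        """A load is replayed if its address matches any tracked store address."""
        return any(load_address in cycle for cycle in self.window)


tracker = RecentStoreTracker(window_cycles=2)
tracker.tick([0x100])
tracker.tick([])
replay_recent = tracker.must_replay(0x100)   # store is still inside the window
tracker.tick([0x200])
replay_expired = tracker.must_replay(0x100)  # store has aged out of the window
```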
  • Publication number: 20030079116
    Abstract: One embodiment of the present invention provides a system that predicts a result produced by a section of code in order to support speculative program execution. The system begins by executing the section of code using a head thread in order to produce a result. Before the head thread produces the result, the system generates a predicted result to be used in place of the result. Next, the system allows a speculative thread to use the predicted result in speculatively executing subsequent code that follows the section of code. After the head thread finishes executing the section of code, the system determines if a difference between the predicted result and the result generated by the head thread has affected execution of the speculative thread. If so, the system executes the subsequent code again using the result generated by the head thread. If not, the system performs a join operation to merge state associated with the speculative thread with state associated with the head thread.
    Type: Application
    Filed: January 16, 2001
    Publication date: April 24, 2003
    Inventors: Shailender Chaudhry, Marc Tremblay
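The predict, speculate, and verify flow in publication 20030079116 can be approximated sequentially. This sketch makes a simplifying assumption: any mismatch between the predicted and actual result is treated as having affected the speculative thread, whereas the abstract checks whether the difference actually affected execution. The function and parameter names are hypothetical.

```python
def run_speculatively(section, subsequent, predicted):
    """Return (final_result, was_reexecuted) for a hypothetical two-thread run."""
    speculative = subsequent(predicted)  # speculative thread runs ahead
    actual = section()                   # head thread finishes the section
    if actual != predicted:
        # Misprediction: run the subsequent code again with the real result.
        return subsequent(actual), True
    # Correct prediction: join, keeping the speculative thread's state.
    return speculative, False


hit = run_speculatively(lambda: 4, lambda x: x * 10, predicted=4)
miss = run_speculatively(lambda: 4, lambda x: x * 10, predicted=5)
```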
  • Patent number: 6553485
    Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing it wraps around and is sent down the instruction pipeline again.
    Type: Grant
    Filed: January 22, 2002
    Date of Patent: April 22, 2003
    Assignee: Intel Corporation
    Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
  • Patent number: 6550003
    Abstract: The present invention achieves a fast handling of the predicted jumps by introducing an additional buffer (20) for not reported predicted jump instructions in a reorder buffer. This additional buffer (20) is separate from the main buffer (10) and is supplied only with information associated with instructions related to predicted jumps. A predicted jump instruction is preferably stored both in the main buffer (10) and the additional buffer (20). The additional buffer operates in parallel with the main buffer (10) and is designed as a linear first-in-first-out queue. The first not reported jump is then always easily available at the top of the queue for evaluating the jump conditions. If a mispredicted jump is determined, the reorder buffer is flushed by a flush generator unit (42).
    Type: Grant
    Filed: January 10, 2000
    Date of Patent: April 15, 2003
    Assignee: Telefonaktiebolaget LM Ericsson
    Inventors: Hans Christian Pettersson, Pär David Berglin
  • Publication number: 20030070060
    Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
    Type: Application
    Filed: October 30, 2002
    Publication date: April 10, 2003
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 6542988
    Abstract: A processor performs precise trap handling for out-of-order and speculative load instructions. It keeps track of the age of load instructions in a shared scheme that includes a load buffer and a load annex. All precise exceptions are detected in a T phase of a load pipeline. Data and control information concerning load operations that hit in the data cache are staged in a load annex during the A1, A2, A3, and T pipeline stages until all exceptions in the same or earlier instruction packet are detected. Data and control information from all other load instructions is staged in the load annex after the load data is retrieved. Before the load data is retrieved, the load instruction is kept in a load buffer. If an exception occurs, any load in the same instruction packet as the instruction causing the exception is canceled. Any load instructions that are “younger” than the instruction that caused the exception are also canceled.
    Type: Grant
    Filed: October 1, 1999
    Date of Patent: April 1, 2003
    Assignee: Sun Microsystems, Inc.
    Inventors: Marc Tremblay, Jeffrey Meng Wah Chan, Subramania Sudharsanan, Sharada Yeluri, Biyu Pan
  • Publication number: 20030061468
    Abstract: Multiple register input multiplexors select a respective one of the results generated by operation units, and store the selected results in respective architecture registers as specified by the corresponding instructions (from which the results are generated). A forwarding multiplexor receives the results before the results are provided to the register input multiplexors, and selects one of the results for use as an operand for execution of a dependent instruction. As the forwarding multiplexor receives the results at a point before the inputs of the register input multiplexors, the time duration required to forward the results may be minimized, and a greater instruction throughput performance may be attained in a processor.
    Type: Application
    Filed: July 17, 2002
    Publication date: March 27, 2003
    Applicant: Texas Instruments Incorporated
    Inventors: Ajit D. Gupte, Amitabh Menon
  • Publication number: 20030056085
    Abstract: An expanded arithmetic and logic unit (EALU) with special extra functions is integrated into a configurable unit for performing data processing operations. The EALU is configured by a function register, which greatly reduces the volume of data required for configuration. The cell can be cascaded freely over a bus system, the EALU being decoupled from the bus system over input and output registers. The output registers are connected to the input of the EALU to permit serial operations. A bus control unit is responsible for the connection to the bus, which it connects according to the bus register. The unit is designed so that distribution of data to multiple receivers (broadcasting) is possible. A synchronization circuit controls the data exchange between multiple cells over the bus system. The EALU, the synchronization circuit, the bus control unit, and registers are designed so that a cell can be reconfigured on site independently of the cells surrounding it.
    Type: Application
    Filed: May 28, 2002
    Publication date: March 20, 2003
    Applicant: Entire Interest
    Inventors: Martin Vorbach, Robert Munch
  • Patent number: 6535973
    Abstract: A method and system for speculatively issuing instructions which are dependent upon results from execution of other instructions. Instructions are speculatively issued, dependent upon a result from execution of a primary instruction, wherein the speculatively issued instructions are issued after execution of the primary instruction. N clock cycles are tracked after execution of the primary instruction, wherein the result from execution of said primary instruction is expected within n clock cycles. Execution of any speculatively issued instructions which are dependent upon the primary instruction is cancelled if the result is not returned from execution of the primary instruction within n clock cycles, such that for primary instructions for which the result is returned within the expected n clock cycles any speculatively issued instructions dependent upon said result are executed with increased efficiency.
    Type: Grant
    Filed: August 26, 1999
    Date of Patent: March 18, 2003
    Assignee: International Business Machines Corporation
    Inventors: Hoichi Cheong, Maureen A. Delaney, Hung Qui Le, Robert McDonald, Dung Quoc Nguyen, David Wayne Victor
  • Patent number: 6522934
    Abstract: A process control system includes a controller that executes a control routine which performs a series of unit procedures within a process. The control routine is written or created to specify the class of unit to be used for each unit procedure, but not the actual unit itself. At the start of each unit procedure of the control routine, a dynamic unit selection routine selects a particular unit as the unit to be used during operation of that unit procedure. When called, the dynamic unit selection routine determines a set of possible units to be used, determines if each of the set of possible units is suitable for use during that unit procedure of the control routine based on a suitability criterion, prioritizes the units that meet the suitability criterion based on a priority criterion and selects the particular unit from the prioritized list of suitable units in order of priority.
    Type: Grant
    Filed: July 2, 1999
    Date of Patent: February 18, 2003
    Assignee: Fisher-Rosemount Systems, Inc.
    Inventors: William G. Irwin, David L. Deitz
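The dynamic unit selection routine in patent 6522934 follows a filter, prioritize, select pattern that can be sketched as below. The function `select_unit`, the dictionary fields, and the example criteria (cleanliness and queue length) are hypothetical; the abstract does not name specific suitability or priority criteria.

```python
def select_unit(candidates, is_suitable, priority):
    """Pick the highest-priority suitable unit, or None if none qualifies."""
    # Step 1: keep only the units that meet the suitability criterion.
    suitable = [unit for unit in candidates if is_suitable(unit)]
    # Step 2: order them by the priority criterion (lower key = higher priority).
    ranked = sorted(suitable, key=priority)
    # Step 3: select the first unit in priority order.
    return ranked[0] if ranked else None


units = [
    {"name": "Reactor_1", "clean": True, "queue_length": 3},
    {"name": "Reactor_2", "clean": False, "queue_length": 0},
    {"name": "Reactor_3", "clean": True, "queue_length": 1},
]
chosen = select_unit(
    units,
    is_suitable=lambda unit: unit["clean"],
    priority=lambda unit: unit["queue_length"],
)
```

Here `Reactor_2` has the shortest queue but fails the suitability check, so the routine falls through to `Reactor_3`.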
  • Patent number: 6510510
    Abstract: A computation block for use in a digital signal processor includes a register file for storage of operands and results and one or more computation units for executing digital signal computations. A first digital signal computation is performed with one of the computation units, and an intermediate result is produced. The intermediate result is transferred from a result output of the computation unit to an intermediate result input of one or more of the computation units without first transferring the intermediate result to the register file. A second digital signal computation is performed using the intermediate result to produce a final result or a second intermediate result.
    Type: Grant
    Filed: December 22, 1998
    Date of Patent: January 21, 2003
    Assignee: Analog Devices, Inc.
    Inventor: Douglas Garde
  • Publication number: 20030014614
    Abstract: There is disclosed a data processor that uses bypass circuitry to transfer result data from late pipeline stages to earlier pipeline stages in an efficient manner and with a minimum amount of wiring. The data processor comprises: 1) an instruction execution pipeline comprising a) a read stage; b) a write stage; and c) a first execution stage comprising E execution units that produce data results from data operands.
    Type: Application
    Filed: December 29, 2000
    Publication date: January 16, 2003
    Inventor: Anthony X. Jarvis
  • Patent number: 6505293
    Abstract: A processor architecture for providing many-to-one mappings between logical registers and physical registers, so that more than one logical register may map to the same physical register. Each physical register has an associated counter to indicate whether the physical register is free. A counter is incremented each time a mapping is made to its associated physical register, and is decremented when that mapping is no longer needed. If a logical register named in a decoded instruction is predicted to have the same value as a value stored in a physical register, then the logical register is mapped to the physical register.
    Type: Grant
    Filed: July 7, 1999
    Date of Patent: January 7, 2003
    Assignee: Intel Corporation
    Inventors: Stephan J. Jourdan, Ronny Ronen, Adi Yoaz
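The counter-based many-to-one mapping in patent 6505293 can be sketched as a reference-counted rename map. The class `RenameMap` and its method names are hypothetical, and the value prediction that decides when two logical registers share a physical register is assumed to happen outside this model.

```python
class RenameMap:
    """Hypothetical model: per-physical-register counters track live mappings."""

    def __init__(self, num_physical):
        self.refcount = [0] * num_physical  # one counter per physical register
        self.mapping = {}                   # logical name -> physical index

    def free_registers(self):
        """A physical register is free when its counter is zero."""
        return [p for p, count in enumerate(self.refcount) if count == 0]

    def map(self, logical, physical):
        """Map a logical register, releasing any mapping it held before."""
        self.unmap(logical)
        self.mapping[logical] = physical
        self.refcount[physical] += 1        # a new mapping was made

    def unmap(self, logical):
        physical = self.mapping.pop(logical, None)
        if physical is not None:
            self.refcount[physical] -= 1    # that mapping is no longer needed


rm = RenameMap(num_physical=3)
rm.map("r1", 0)
rm.map("r2", 0)              # r2 predicted to hold the same value as r1
shared_count = rm.refcount[0]
rm.unmap("r1")
rm.map("r2", 1)              # r2 gets a new value in a different register
free_after = rm.free_registers()
```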
  • Publication number: 20030005264
    Abstract: The invention relates to an apparatus to control data flow for a processing unit having a plurality of data paths and a plurality of parallel processing units. Each computer unit of a data path is connected to an evaluating unit, which controls the acceptance of the results into the results register by setting a flag. The output of the evaluating unit is connected to one input of a logic gate, and the other input of the logic gate to the control output of the central program control unit. The output of the logic gate is connected to the control input of the output register. In this way, each evaluating unit can check the calculation by comparing the result of computation by the parallel processing unit with a preassigned value. Upon identification of nonsense values, or upon coincidence with a preassigned value, the results register may be cleared or blocked to prevent wrong or nonsense results.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 2, 2003
    Inventors: Wolfram Drescher, Matthias Weiss
  • Publication number: 20030005263
    Abstract: A queue, such as a first-in first-out queue, is incorporated into a processing device, such as a multithreaded pipeline processor. The queue may store the resources of more than one thread in the processing device such that the entries of one thread may be interspersed among the entries of another thread. The entries of each thread may be identified by a thread identification, a valid marker to indicate if the resources within the entry are valid, and a bank number. For a particular thread, the bank number tracks the number of times a head pointer pertaining to the first entry has passed a tail pointer. In this fashion, empty entries may be used and the resources may be efficiently allocated. In a preferred embodiment, the shared resource queue may be implemented into an in-order multithreaded pipelined processor as a queue storing resources to be dispatched for execution of instructions.
    Type: Application
    Filed: June 28, 2001
    Publication date: January 2, 2003
    Applicant: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Steven R. Kunkel, Hung Q. Le
  • Publication number: 20030005265
    Abstract: The present invention relates to data processing systems with built-in error recovery from a given checkpoint. In order to checkpoint more than one instruction per cycle it is proposed to collect updates of a predetermined maximum number of register contents performed by a respective plurality of CISC/RISC instructions in a buffer (CSB)(60) for checkpoint states, whereby a checkpoint state comprises as many buffer slots as registers can be updated by said plurality of CISC instructions and an entry for a Program Counter value associated with the youngest external instruction of said plurality, and to update an Architected Register Array (ARA)(64) with freshly collected register data after determining that no error was detected in the register data after completion of said youngest external instruction of said plurality of external instructions.
    Type: Application
    Filed: June 26, 2002
    Publication date: January 2, 2003
    Applicant: International Business Machines Corporation
    Inventors: Harry Stefan Barowski, Hartmut Schwermer, Hans-Werner Tast
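    The checkpointing idea in this abstract can be illustrated with a minimal Python sketch. The names CheckpointState and commit_checkpoint, the dictionary model of the register array, and the "PC" key are hypothetical choices made for illustration and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CheckpointState:
    """One buffered checkpoint: register updates collected from a group of
    instructions, plus the program counter of the youngest one (assumed model)."""
    register_updates: Dict[str, int] = field(default_factory=dict)
    program_counter: int = 0


def commit_checkpoint(ara: Dict[str, int], state: CheckpointState,
                      error_detected: bool) -> bool:
    """Copy the buffered updates into the architected register array (ara)
    only if no error was detected; otherwise leave ara untouched."""
    if error_detected:
        return False
    ara.update(state.register_updates)
    ara["PC"] = state.program_counter  # storing the PC in ara is an assumption
    return True
```

    Because the array is written only after the error check, a failed group leaves the last good checkpoint intact for recovery.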
  • Patent number: 6502186
    Abstract: An apparatus stores data corresponding to each type of instruction for each instruction, and includes: an instruction reservation station unit for performing integral control containing a resource updating process performed when the instruction is completely executed; and one or more function reservation station units for storing data corresponding to the function relating to the execution of the instruction, and controlling the execution of the function under the integral control of the instruction reservation station unit.
    Type: Grant
    Filed: December 20, 2000
    Date of Patent: December 31, 2002
    Assignee: Fujitsu Limited
    Inventor: Aiichiro Inoue
  • Publication number: 20020194457
    Abstract: In one embodiment of the invention, a processor includes a memory order buffer (MOB) including load buffers and store buffers, wherein the MOB orders load and store instructions so as to maintain data coherency between load and store instructions in different threads, wherein at least one of the threads is dependent on at least another one of the threads. In another embodiment of the invention, a processor includes an execution pipeline to concurrently execute at least portions of threads, wherein at least one of the threads is dependent on at least another one of the threads, the execution pipeline including a memory order buffer that orders load and store instructions. The processor also includes detection circuitry to detect speculation errors associated with load instructions in a load buffer.
    Type: Application
    Filed: August 1, 2002
    Publication date: December 19, 2002
    Inventor: Haitham Akkary
  • Patent number: 6496925
    Abstract: A method includes detecting a first event occurrence for a first thread being processed within a multithreaded processor. Responsive to the detection of this first event occurrence, a second thread being processed within the multithreaded processor is monitored to detect a clearing point for this second thread. Responsive to the detection of a clearing point for the second thread, a functional unit within the multithreaded processor is cleared of data for both the first and the second threads.
    Type: Grant
    Filed: December 9, 1999
    Date of Patent: December 17, 2002
    Assignee: Intel Corporation
    Inventors: Dion Rodgers, Darrell Boggs, Amit Merchant, Rajesh Kota, Rachel Hsu, Keshavan Tiruvallur
  • Publication number: 20020188829
    Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.
    Type: Application
    Filed: July 9, 2002
    Publication date: December 12, 2002
    Inventors: Cheryl D. Senter, Johannes Wang
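    The out-of-order load rule stated in this abstract (no address collision and no write pending) can be sketched in Python as follows. PendingStore and can_issue_load_early are hypothetical names, and comparing whole addresses for equality is a simplifying assumption; the abstract does not say at what granularity addresses are compared.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PendingStore:
    """An older, not yet completed store (hypothetical model).
    address is None while the store address has not been calculated."""
    address: Optional[int]


def can_issue_load_early(load_address: int,
                         older_stores: Sequence[PendingStore]) -> bool:
    """Return True if the load may be issued ahead of the older stores."""
    for store in older_stores:
        if store.address is None:
            return False  # write pending: store address still unknown
        if store.address == load_address:
            return False  # address collision with an older store
    return True
```

    A load is held back as soon as any older store is either unresolved or targets the same location.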
  • Patent number: 6490674
    Abstract: The present invention generally relates to a processing system and method for coalescing instruction data to efficiently detect data hazards between instructions of a computer program. In architecture, the system of the present invention utilizes a plurality of pipelines, coalescing circuitry, and hazard detection circuitry. Each of the pipelines receives and processes instructions of a computer program, and the coalescing circuitry receives a plurality of register identifiers from the pipelines. Each of the register identifiers identifies one of a plurality of registers, and the coalescing circuitry combines the plurality of register identifiers into a single register identifier such that the single register identifier identifies each of the registers identified by the register identifiers received by the coalescing circuitry.
    Type: Grant
    Filed: January 28, 2000
    Date of Patent: December 3, 2002
    Assignee: Hewlett-Packard Company
    Inventors: Ronny Lee Arnold, Donald Charles Soltis, Jr.
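    One way to picture the coalescing step is with one-hot bit vectors, where combining identifiers is a bitwise OR and a hazard check is a bitwise AND. This encoding is an assumption made for illustration; the abstract does not state how the single identifier is formed, and the function names below are hypothetical.

```python
from functools import reduce
from typing import Iterable


def encode_register(reg_id: int) -> int:
    """One-hot encode a register number (assumed encoding)."""
    return 1 << reg_id


def coalesce(reg_ids: Iterable[int]) -> int:
    """Combine several register identifiers into one bit vector that
    identifies every register in the input."""
    return reduce(lambda acc, r: acc | encode_register(r), reg_ids, 0)


def has_hazard(coalesced: int, source_reg: int) -> bool:
    """Check one source register against all coalesced destinations at once."""
    return bool(coalesced & encode_register(source_reg))
```

    With this representation a single comparison per source register replaces one comparison per pipeline.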
  • Publication number: 20020178347
    Abstract: A system and method for retiring instructions in a superscalar microprocessor which executes a program comprising a set of instructions having a predetermined program order, the retirement system for simultaneously retiring groups of instructions executed in or out of order by the microprocessor. The retirement system comprises a done block for monitoring the status of the instructions to determine which instruction or group of instructions have been executed, a retirement control block for determining whether each executed instruction is retirable, a temporary buffer for storing results of instructions executed out of program order, and a register array for storing retirable-instruction results.
    Type: Application
    Filed: May 22, 2002
    Publication date: November 28, 2002
    Inventors: Johannes Wang, Sanjiv Garg, Trevor Deosaran
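    A minimal Python sketch of group retirement, assuming an instruction is retirable only when it and every older instruction have finished executing. That criterion, the max_group limit, and the name retire_group are assumptions for illustration; the abstract does not spell out the retirability test.

```python
from typing import List


def retire_group(done: List[bool], head: int, max_group: int) -> int:
    """Return how many instructions, starting at the oldest unretired
    position head, can retire together in program order.

    done[i] is True once instruction i has executed (the "done block" state).
    """
    count = 0
    while (count < max_group
           and head + count < len(done)
           and done[head + count]):
        count += 1
    return count
```

    Instructions that finished out of order behind an unfinished older instruction stay in the temporary buffer until that older instruction completes.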
  • Publication number: 20020174322
    Abstract: A first tag is assigned to a branch instruction. Dependent on the type of branch instruction, a second tag is assigned to an instruction in the branch delay slot of the branch instruction. If the branch is mispredicted, the first tag is broadcast to pipeline stages that may have speculative instructions, and the first tag is compared to tags in the pipeline stages to determine which instructions to cancel. The assignment of tags for a fetch group of concurrently fetched instructions may be performed in parallel. A plurality of branch sequence numbers may be generated, and one of the plurality may be selected for each instruction responsive to the cumulative number of branch instructions preceding that instruction within the fetch group. The selection may be further responsive to whether or not the instruction is in a conditional delay slot.
    Type: Application
    Filed: September 24, 2001
    Publication date: November 21, 2002
    Inventor: David A. Kruckemyer
  • Publication number: 20020174321
    Abstract: The present invention provides a system, method and apparatus for allocating resources by assigning resource identifiers to processor resources using at least a portion of a pseudorandom sequence. One or more resource identifiers are generated using at least a portion of a pseudorandom sequence. Each resource identifier corresponds to one of the resources. One or more of the resource identifiers are then selected for allocation to the instruction.
    Type: Application
    Filed: December 19, 2000
    Publication date: November 21, 2002
    Inventors: Lizy Kurian John, Srivatsan Srinivasan
  • Patent number: 6484254
    Abstract: According to one aspect of the invention, a method is provided in which store addresses of store instructions dispatched during a last predetermined number of cycles are maintained in a first data structure of a first processor. It is determined whether a load address of a first load instruction matches one of the store addresses in the first data structure. The first load instruction is replayed if the load address of the first load instruction matches one of the store addresses in the first data structure.
    Type: Grant
    Filed: December 30, 1999
    Date of Patent: November 19, 2002
    Assignee: Intel Corporation
    Inventors: Muntaquim F. Chowdhury, Douglas M. Carmean
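    The replay check in this abstract can be sketched as a sliding window of recent store addresses. The class and method names are hypothetical, and modelling the data structure as one address set per cycle is an assumption made for illustration.

```python
from collections import deque
from typing import Iterable


class RecentStoreAddresses:
    """Tracks store addresses dispatched during the last window_cycles cycles."""

    def __init__(self, window_cycles: int) -> None:
        # deque with maxlen drops the oldest cycle automatically
        self._window = deque(maxlen=window_cycles)

    def advance_cycle(self, store_addresses: Iterable[int]) -> None:
        """Record the store addresses dispatched in the cycle just finished."""
        self._window.append(frozenset(store_addresses))

    def must_replay(self, load_address: int) -> bool:
        """True if the load address matches any store still in the window."""
        return any(load_address in cycle for cycle in self._window)
```

    Once a store ages out of the window it no longer forces a replay of matching loads.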
  • Patent number: 6477637
    Abstract: A method and apparatus for transporting store requests between functional units within a processor is disclosed. A data processing system includes a data dispatching unit, a data receiving unit, a segmented data pipeline coupled between the data dispatching unit and the data receiving unit, and a segmented feedback line coupled between the data dispatching unit and the data receiving unit. Having multiple latches interconnected between segments, the segmented data pipeline systolically transfers data from the data dispatching unit to the data receiving unit. The segmented feedback line has multiple control latches interconnected between segments. Each of the control latches sends a control signal to a respective one of the latches in the segmented data pipeline to forward data to a next segment within the segmented data pipeline.
    Type: Grant
    Filed: September 30, 1999
    Date of Patent: November 5, 2002
    Assignee: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Robert Alan Cargnoni, Guy Lynn Guthrie
  • Patent number: 6477640
    Abstract: A branch prediction unit apparatus and method uses an instruction buffer (20), a completion unit (24), and a branch prediction unit (BPU) (28). The instruction buffer (20) and/or the completion unit (24) contain a plurality of instruction entries that contain valid bits and stream identifier (SID) bits. The branch prediction unit contains a plurality of branch prediction buffers (28a-28c). The SID bits are used to associate the pending and executing instructions in the units (20 and 24) into instruction streams related to predicted branches located in the buffers (28a-28c). The SID bits as well as age bits associated with the buffers (28a-28c) are used to perform efficient branch prediction, branch resolution/retirement, and branch misprediction recovery.
    Type: Grant
    Filed: September 11, 2000
    Date of Patent: November 5, 2002
    Assignee: Motorola, Inc.
    Inventors: Jeffrey Pidge Rupley, II, Marvin A. Denman, Bradley G. Burgess, David C. Holloway
  • Publication number: 20020156997
    Abstract: A technique for managing register assignments. The technique involves maintaining, in a register list memory circuit having entries that respectively correspond to physical registers, a list of register assignments that assign logical registers to the physical registers. The technique further involves maintaining, in a vector memory circuit having bits that respectively correspond to the physical registers, a valid vector that forms, in combination with the list of register assignments, a list of valid register assignments. Furthermore, the technique involves storing, for an instruction that is mapped by the data processor, a copy of the valid vector from the vector memory circuit to a silo memory circuit. Preferably, the processor using the technique has the ability to execute branches of instructions speculatively, and to recover if it is determined that the processor executed down an incorrect instruction branch.
    Type: Application
    Filed: May 9, 2002
    Publication date: October 24, 2002
    Inventors: James Arthur Farrell, Sharon Marie Britton, Harry Ray Fair, Bruce Gieseke, Daniel Lawrence Leibholz, Derrick R. Meyer
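    The register list, valid vector, and silo described in this abstract can be modelled in a few lines of Python. RegisterMap and its methods are hypothetical names, and the allocator that never reuses a physical register is a deliberate simplification that is not part of the described technique.

```python
from typing import List, Optional


class RegisterMap:
    def __init__(self, num_physical: int) -> None:
        # One entry per physical register: which logical register it holds.
        self.assignments: List[Optional[int]] = [None] * num_physical
        # One bit per physical register: is that assignment currently valid?
        self.valid: List[bool] = [False] * num_physical
        # Saved copies of the valid vector, one per mapped instruction.
        self.silo: List[List[bool]] = []
        self._next_free = 0  # simplistic allocator (assumption): no reuse

    def map_instruction(self, logical: int) -> int:
        """Snapshot the valid vector, then map logical to a new physical
        register. Returns the silo index usable for recovery."""
        self.silo.append(list(self.valid))
        for p, is_valid in enumerate(self.valid):
            if is_valid and self.assignments[p] == logical:
                self.valid[p] = False  # older mapping is superseded
        phys = self._next_free
        self._next_free += 1
        self.assignments[phys] = logical
        self.valid[phys] = True
        return len(self.silo) - 1

    def recover(self, checkpoint: int) -> None:
        """Restore the valid vector saved before the given instruction."""
        self.valid = self.silo[checkpoint]
        del self.silo[checkpoint:]

    def lookup(self, logical: int) -> Optional[int]:
        """Return the physical register currently valid for logical."""
        for p, is_valid in enumerate(self.valid):
            if is_valid and self.assignments[p] == logical:
                return p
        return None
```

    Recovery touches only the valid vector: the assignment list still holds the speculative entries, but they are ignored once their valid bits are cleared.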
  • Patent number: 6470445
    Abstract: A processing system for processing instructions of computer programs utilizes a plurality of pipelines and a control mechanism in order to detect and prevent write-after-write data hazards. The plurality of pipelines receives and processes instructions of a computer program that includes a first instruction and a second instruction. The control mechanism is designed to detect a write-after-write data hazard associated with the first instruction and the second instruction, when the first and second instructions are configured to cause data to be written to the same location. After detecting the write-after-write data hazard, the control mechanism determines whether there is another instruction in the instructions being processed by the pipelines that is dependent on the data produced or retrieved by execution of the first instruction. If there is no such instruction, the control mechanism cancels the first instruction by transmitting a cancellation request.
    Type: Grant
    Filed: September 7, 1999
    Date of Patent: October 22, 2002
    Assignee: Hewlett-Packard Company
    Inventors: Ronny Lee Arnold, Donald Charles Soltis, Jr.
  • Publication number: 20020144094
    Abstract: The present invention, in various embodiments, provides techniques for retiring instructions that typically complete early as compared to most instructions. In a first embodiment, at each stage of the various processing stages, each instruction capable of early retirement is processed in accordance with that stage. At a particular stage, if the instruction meets the criteria for early retirement, then the instruction is terminated, e.g., “retired,” and the system is updated to reflect that the instruction has been terminated. However, if, at that particular stage, the instruction does not meet the criteria for early retirement, then the instruction is processed to the next stage, and it is determined again whether the instruction meets the criteria for early retirement. If the instruction meets the criteria, then the instruction is terminated, or if the instruction does not meet the criteria, then the instruction is processed to the next stage, and so on, until the instruction is retired.
    Type: Application
    Filed: March 30, 2001
    Publication date: October 3, 2002
    Inventor: Carl D. Burch
  • Publication number: 20020144096
    Abstract: The present invention, in various embodiments, provides techniques for retiring instructions that typically complete early as compared to most instructions. In a first embodiment, at each stage of the various processing stages, each instruction capable of early retirement is processed in accordance with that stage. At a particular stage, if the instruction meets the criteria for early retirement, then the instruction is terminated, e.g., “retired,” and the system is updated to reflect that the instruction has been terminated. However, if, at that particular stage, the instruction does not meet the criteria for early retirement, then the instruction is processed to the next stage, and it is determined again whether the instruction meets the criteria for early retirement. If the instruction meets the criteria, then the instruction is terminated, or if the instruction does not meet the criteria, then the instruction is processed to the next stage, and so on, until the instruction is retired.
    Type: Application
    Filed: March 30, 2001
    Publication date: October 3, 2002
    Inventor: Carl D. Burch
  • Publication number: 20020144093
    Abstract: In an embodiment, a pipelined processor may be adapted to process multi-cycle instructions (MCIs). Results generated in response to non-terminal sub-instructions may be written to a speculative commit register. When the MCI commits, i.e., a terminal sub-instruction reaches the WB stage, the value in the speculative commit register may be written to the architectural register.
    Type: Application
    Filed: March 28, 2001
    Publication date: October 3, 2002
    Inventors: Ryo Inoue, Gregory A. Overkamp
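    The speculative commit register in this abstract can be illustrated with a short Python sketch. The function run_mci, the cancel_after parameter, and the dictionary model of the architectural registers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Sequence


def run_mci(non_terminal_results: Sequence[int],
            arch_regs: Dict[str, int],
            dest: str,
            cancel_after: Optional[int] = None) -> bool:
    """Simulate one multi-cycle instruction (MCI).

    Each non-terminal sub-instruction writes its result to a speculative
    commit register. Only when the terminal sub-instruction reaches the WB
    stage (modelled here as reaching the end without cancellation) is the
    speculative value copied to the architectural register dest.
    cancel_after is the number of sub-instructions that complete before the
    MCI is cancelled; None means it is never cancelled.
    """
    speculative: Optional[int] = None
    for index, result in enumerate(non_terminal_results):
        if cancel_after is not None and index >= cancel_after:
            return False  # cancelled: architectural state untouched
        speculative = result
    if cancel_after is not None:
        return False  # cancelled before the terminal sub-instruction
    if speculative is not None:
        arch_regs[dest] = speculative  # commit at WB
    return True
```

    Keeping intermediate results out of the architectural register means a cancelled MCI needs no undo step.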