Commitment Control Or Register Bypass Patents (Class 712/218)

Retiring early-completion instruction to improve computer operation throughput

Publication number: 20020144095

Abstract: The present invention, in various embodiments, provides techniques for retiring instructions that typically complete early as compared to most instructions. In a first embodiment, at each stage of the various processing stages, each instruction capable of early retirement is processed in accordance with that stage. At a particular stage, if the instruction meets the criteria for early retirement, then the instruction is terminated, e.g., “retired,” and the system is updated to reflect that the instruction has been terminated. However, if, at that particular stage, the instruction does not meet the criteria for early retirement, then the instruction is processed to the next stage, and it is determined again whether the instruction meets the criteria for early retirement. If the instruction meets the criteria, then the instruction is terminated, or if the instruction does not meet the criteria, then the instruction is processed to the next stage, and so on, until the instruction is retired.
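
The stage-by-stage check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the stage names, the `can_retire` predicate, and the rule that a `nop` may retire after decode are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List


def run_pipeline(instr: Dict, stages: List[Callable[[Dict], None]],
                 can_retire: Callable[[Dict, int], bool]) -> int:
    """Process instr stage by stage; return the index of the stage where it retired."""
    for idx, stage in enumerate(stages):
        stage(instr)                    # process the instruction in this stage
        if can_retire(instr, idx):      # hypothetical early-retirement criterion
            instr["retired_at"] = idx   # record that the instruction has terminated
            return idx
    # Criterion never met: the instruction retires after the final stage.
    instr["retired_at"] = len(stages) - 1
    return len(stages) - 1


def mark(name: str) -> Callable[[Dict], None]:
    """Build a stage that only records that it ran (stand-in for real stage work)."""
    def stage(instr: Dict) -> None:
        instr.setdefault("trace", []).append(name)
    return stage


STAGES = [mark("fetch"), mark("decode"), mark("execute"), mark("writeback")]


def nop_retires_after_decode(instr: Dict, idx: int) -> bool:
    # Assumed criterion: a nop has nothing left to do once it is decoded.
    return instr["op"] == "nop" and idx >= 1


nop = {"op": "nop"}
add = {"op": "add"}
nop_stage = run_pipeline(nop, STAGES, nop_retires_after_decode)
add_stage = run_pipeline(add, STAGES, nop_retires_after_decode)
```

In this sketch the `nop` leaves the pipeline after two stages, while the `add` passes through all four.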

Type: Application

Filed: March 30, 2001

Publication date: October 3, 2002

Inventor: Carl D. Burch
Stitching parcels

Patent number: 6449710

Abstract: The invention provides a method and system for performing instructions in a microprocessor having a set of registers, in which instructions which operate on portions of a register are recognized, and “stitching” instructions are inserted into the instruction stream to couple the instructions operating on the portions of the register. The “stitching” parcels are serialized along with other instruction parcels, so that instructions which read from or write to portions of a register can proceed independently and out of their original order, while maintaining the results of that out-of-order operation to be the same as if all instructions were performed in the original order. In a preferred embodiment, the choice of stitching parcels is optimized to the Intel x86 architecture and instruction set.

Type: Grant

Filed: October 29, 1999

Date of Patent: September 10, 2002

Assignee: STMicroelectronics, Inc.

Inventor: David L. Isaman
Processor architecture

Publication number: 20020124155

Abstract: An architecture for a pipeline processor circuit, preferably of the VLIW type, comprises a plurality of stages and a network of forwarding paths which connect pairs of said stages, as well as a register file for operand write-back. An optimization-of-power-consumption function is provided via inhibition of writing and subsequent readings in said register file of operands retrievable from said forwarding network on account of their reduced liveness length.

Type: Application

Filed: October 11, 2001

Publication date: September 5, 2002

Applicant: STMicroelectronics S.r.l.

Inventors: Mariagiovanna Sami, Donatella Sciuto, Cristina Silvano, Vittorio Zaccaria, Danilo Pau, Roberto Zafalon
Method and apparatus for providing data to a processor pipeline

Patent number: 6442678

Abstract: In one method, a processor comprises both a speculative register file to store speculative register values and an architectural register file to store architectural register values. An output of the architectural register file is coupled to an input of the speculative register file to update the speculative register file when a misspeculation is detected.
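
A minimal sketch of the two register files follows. It assumes a simple model in which recovery copies every architectural value into the speculative file. The class and method names are hypothetical, and the `commit` step is an assumption added so the example has committed state to recover to.

```python
from typing import List


class RegisterFiles:
    """Toy model of a speculative register file backed by an architectural one."""

    def __init__(self, num_regs: int) -> None:
        self.architectural: List[int] = [0] * num_regs
        self.speculative: List[int] = [0] * num_regs

    def speculative_write(self, reg: int, value: int) -> None:
        # Results land in the speculative file before they are known to be correct.
        self.speculative[reg] = value

    def commit(self, reg: int) -> None:
        # Assumed step: a committed value becomes architectural state.
        self.architectural[reg] = self.speculative[reg]

    def recover(self) -> None:
        # On misspeculation, the architectural output updates the speculative file.
        self.speculative = list(self.architectural)


rf = RegisterFiles(4)
rf.speculative_write(1, 10)
rf.commit(1)                   # r1 = 10 is now architectural
rf.speculative_write(2, 99)    # wrong-path write, never committed
rf.recover()                   # misspeculation detected
```

After `recover()`, the wrong-path value in register 2 is gone and the committed value in register 1 survives.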

Type: Grant

Filed: December 31, 1998

Date of Patent: August 27, 2002

Assignee: Intel Corporation

Inventor: Judge K. Arora
Apparatus and method for guard outcome prediction

Patent number: 6442679

Abstract: Guard prediction apparatus for predicting guard outcomes for predicated instructions, each of which specifies a guard operator to be applied to a guard source to generate the guard outcome. The guard prediction apparatus includes a cache, availability logic, a selection circuit, a deduction circuit and write back circuitry. The cache stores previous predictions of guard outcomes for a set of guard sources and guard operators. The availability logic determines whether the cache includes a previous prediction that is relevant to a first guard source and a first guard operator and, if so, couples that previous prediction to the selection circuit. The selection circuit generates the final guard outcome prediction by selecting between the previous prediction, if available, and an initial prediction, if a previous prediction is not available. The deduction circuit deduces from the initial prediction of the guard outcome other consistent guard outcomes for a set of guard operators when applied to the guard source.

Type: Grant

Filed: August 17, 1999

Date of Patent: August 27, 2002

Assignee: Compaq Computer Technologies Group, L.P.

Inventors: Arthur Klauser, Keith Istvan Farkas
Data processing apparatus

Publication number: 20020116599

Abstract: To eliminate pipeline stall due to data hazard in a superscalar system and to increase the processing speed. An instruction decoder is provided with a circuit which detects two neighboring 2-operand instructions which are equivalent to one 3-operand instruction, and a circuit which, if it is equivalent, integrates the two instructions into the 3-operand instruction and sends it to a succeeding execution stage. Or, provision is made of a circuit which sends the source data of a preceding instruction to an arithmetic unit for a succeeding instruction when the two neighboring instructions have a relationship of data flow but cannot be integrated into one 3-operand instruction. It is allowed to execute the processing of two instructions in one clock, which so far required two clocks due to data flow between the neighboring instructions. Therefore, the number of execution clocks as a whole can be decreased.
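
The instruction-integration idea can be sketched in software, assuming a hypothetical tuple encoding in which `("mov", rd, ra)` copies `ra` into `rd` and `("add", rd, rb)` computes `rd = rd + rb`. The opcode set and the encoding are illustrative assumptions, not taken from the application.

```python
from typing import List, Tuple

Instr = Tuple[str, ...]

# Assumed set of 2-operand ALU ops that have a 3-operand equivalent.
FUSIBLE_OPS = {"add", "sub", "and", "or"}


def fuse_pairs(program: List[Instr]) -> List[Instr]:
    """Merge `mov rd, ra` followed by `op rd, rb` into `op rd, ra, rb`."""
    out: List[Instr] = []
    i = 0
    while i < len(program):
        cur = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        if (nxt is not None
                and len(cur) == 3 and cur[0] == "mov"
                and len(nxt) == 3 and nxt[0] in FUSIBLE_OPS
                and nxt[1] == cur[1]):
            dst, src_a = cur[1], cur[2]
            # If the second source is the freshly moved register, it holds src_a.
            src_b = src_a if nxt[2] == dst else nxt[2]
            out.append((nxt[0], dst, src_a, src_b))
            i += 2              # both neighbors consumed by the fused instruction
        else:
            out.append(cur)
            i += 1
    return out


program = [("mov", "r1", "r2"), ("add", "r1", "r3"), ("sub", "r4", "r5")]
fused = fuse_pairs(program)
```

Here the first two instructions collapse into `("add", "r1", "r2", "r3")`, so the dependent pair no longer needs two clocks.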

Type: Application

Filed: March 13, 1997

Publication date: August 22, 2002

Inventors: Masahiro Kainaga, Yasuhiko Saitoo
Method and apparatus for processing events in a multithreaded processor

Publication number: 20020116600

Abstract: In a multithreaded processor, events are categorized according to which of a “soft” state clearing (“nuke”) process and a “hard” nuke process should be performed in response to each event. When an event is detected for a thread, either the soft nuke or hard nuke process is executed, according to the type of event, prior to invoking an event handler. The soft nuke process performs less than all of the actions performed by the hard nuke process and requires much less time to execute. If multiple threads are being processed, the hard nuke process requires synchronization between the threads and clears state for each thread, whereas the soft nuke process does not require cross-thread synchronization and clears state only for the thread in which the event was detected. In one embodiment, the soft nuke process is implemented in microcode, while the hard nuke process is hardware-implemented.

Type: Application

Filed: November 30, 2001

Publication date: August 22, 2002

Inventors: Lawrence O. Smith, S. Dion Rodgers
Detection of data hazards between instructions by decoding register identifiers in each stage of processing system pipeline and comparing asserted bits in the decoded register identifiers

Patent number: 6438681

Abstract: A computer system utilizing a processing system capable of efficiently comparing register identifiers to detect data hazards between instructions of a computer program is used to execute the computer program. The processing system utilizes at least one pipeline, a first decoder, a second decoder, and comparison logic. The pipeline receives and simultaneously processes instructions of a computer program. The first and second decoders are coupled to the pipeline and decode register identifiers associated with instructions being processed by the pipeline. The comparison logic is interfaced with the first and second decoders and respectively compares the decoded register identifiers produced by the first and second decoders to other decoded register identifiers.
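
One way to picture comparing decoded register identifiers is a one-hot encoding, where each identifier becomes a word with a single asserted bit and a hazard shows up as a nonzero bitwise AND. The sketch below assumes that encoding and a 32-entry register file. The function names are hypothetical.

```python
from typing import Iterable


def decode(reg_id: int, num_regs: int = 32) -> int:
    """Decode a register identifier into a one-hot word (one asserted bit)."""
    if not 0 <= reg_id < num_regs:
        raise ValueError(f"register id {reg_id} out of range")
    return 1 << reg_id


def has_hazard(consumer_sources: Iterable[int], producer_dest: int,
               num_regs: int = 32) -> bool:
    """Report a read-after-write hazard if any source bit matches the dest bit."""
    src_mask = 0
    for reg in consumer_sources:
        src_mask |= decode(reg, num_regs)   # OR together the decoded sources
    # Comparing asserted bits: a shared bit means the same register is named.
    return (src_mask & decode(producer_dest, num_regs)) != 0


hazard = has_hazard([3, 7], producer_dest=7)
no_hazard = has_hazard([3, 7], producer_dest=5)
```

With decoded identifiers, several sources can be checked against a destination in a single AND rather than one equality comparison per source.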

Type: Grant

Filed: January 24, 2000

Date of Patent: August 20, 2002

Assignee: Hewlett-Packard Company

Inventors: Ronny Lee Arnold, Donald Charles Soltis Jr.
Data processing apparatus with register file bypass

Publication number: 20020108026

Abstract: A data processing apparatus for increasing the speed of data transfer from one processor instruction to another processor instruction. First (78) and second (80) functional unit groups, each including a plurality of functional units, are connected to a register file (76) comprising a plurality of registers having corresponding register numbers. A comparator (181) receives an indication of the operand register number of a current instruction for a functional unit in the first functional unit group, and an indication of the destination register number of an immediately preceding instruction for the second functional unit group, and indicates whether the register numbers match.

Type: Application

Filed: December 8, 2000

Publication date: August 8, 2002

Inventors: Keith Balmer, Richard D. Simpson, Iain Robertson, John Keay
Pre-arbitrated bypassing in a speculative execution microprocessor

Patent number: 6430679

Abstract: A pre-arbitrated bypassing system in a speculative execution microprocessor is provided. The bypassing system provides execution units enhanced to include a comparator and an enabled driver. The comparator compares a bypass address that is broadcast upon instruction decode with the destination address within each execution unit. If there is a match, then the result data is driven onto the bypass bus. Additionally, a suppress signal and validation scheme/apparatus are included to ensure that valid data is being driven onto the bypass bus. A bypass bus and associated apparatus may be included for every potential source operand.

Type: Grant

Filed: October 2, 1998

Date of Patent: August 6, 2002

Assignee: Intel Corporation

Inventor: Jay S. Heeb
Result forwarding cache

Patent number: 6427207

Abstract: An apparatus is presented for expediting the execution of dependent micro instructions in a pipeline microprocessor having design characteristics (complexity, power, and timing) that are not significantly impacted by the number of stages in the microprocessor's pipeline. In contrast to conventional result distribution schemes where an intermediate result is distributed to multiple pipeline stages, the present invention provides a cache for storage of multiple intermediate results. The cache is accessed by a dependent micro instruction to retrieve required operands. The apparatus includes a result forwarding cache, result update logic, and operand configuration logic. The result forwarding cache stores the intermediate results. The result update logic receives the intermediate results as they are generated and enters the intermediate results into the result forwarding cache.

Type: Grant

Filed: July 20, 2001

Date of Patent: July 30, 2002

Assignee: I.P. First L.L.C.

Inventors: Gerard M. Col, G. Glenn Henry
Optimization of instruction stream execution that includes a VLIW dispatch group

Patent number: 6425069

Abstract: A method and system for optimizing execution of an instruction stream which includes a very long instruction word (VLIW) dispatch group in which ordering is not maintained is disclosed. The method and system comprises examining an access which initiated a flush operation; capturing an indice related to the flush operation; and causing all storage access instructions related to this indice to be dispatched as single-IOP groups until the indice is updated. Storage access to address space which is safe such as Guarded (G=1) or Direct Store (E=DS) must be handled in a non-speculative manner such that operations which could potentially go to volatile I/O devices or control locations that do not get processed out of order. Since the address is not known in the front end of the processor, this can only be determined by the load store unit or functional block which performs translation.

Type: Grant

Filed: March 5, 1999

Date of Patent: July 23, 2002

Assignee: International Business Machines Corporation

Inventors: Larry Edward Thatcher, John Edward Derrick
System for implementing a register free-list by using swap bit to select first or second register tag in retire queue

Patent number: 6425072

Abstract: An apparatus and method for implementing a register free list scheme is provided. An instruction received in an execution unit can be assigned an absolute register number as its destination register. A new physical register tag from a free list can be assigned to the absolute register number and a tag future file can be updated with the new physical register tag. The old physical register tag can be read from the tag future file and stored in a retire queue entry corresponding to the instruction along with the new physical register tag and an architectural register identifier corresponding to the absolute register number. A valid bit corresponding to the entry can be set in response to the entry being written. In response to an abort signal, a swap bit corresponding to the entry can be set, the valid bit can be reset, and the new physical register tag can be conveyed to a rename unit in response to receiving a free register request.

Type: Grant

Filed: August 31, 1999

Date of Patent: July 23, 2002

Assignee: Advanced Micro Devices, Inc.

Inventors: Stephan Meier, Chetana N. Keltcher
Re-order buffer managing method and processor

Publication number: 20020091913

Abstract: By using an entry number (WRB number) of a re-order buffer 6, each of function units such as an operation unit 3, a store unit 4, a load unit 5, etc. notifies to the re-order buffer 6 the processing end for an instruction stored in the entry concerned in the unit thereof. The load unit 5 manages the latest speculation state of a load instruction issued on the basis of a branch prediction success/failure signal output from the branch unit 2, and makes no notification to the re-order buffer 6 on the basis of WRB number for subsequent load instructions of a branch-prediction failed branch instruction even when the processing of the instruction is finished. Accordingly, the re-order buffer 6 can re-use entries in which the subsequent instructions of the branch prediction failed branch instruction are stored.

Type: Application

Filed: January 9, 2002

Publication date: July 11, 2002

Applicant: NEC CORPORATION

Inventor: Masao Fukagawa
Multi-threading techniques for a processor utilizing a replay queue

Publication number: 20020091914

Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.

Type: Application

Filed: February 1, 2002

Publication date: July 11, 2002

Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
System and method for executing variable latency load operations in a data processor

Publication number: 20020087839

Abstract: There is disclosed a data processor that executes variable latency load operations using bypass circuitry that allows load word operations to avoid stalls caused by shifting circuitry.

Type: Application

Filed: December 29, 2000

Publication date: July 4, 2002

Inventors: Anthony X. Jarvis, Paolo Faraboschi
Method and processor for recovering registers for register renaming structure

Publication number: 20020087836

Abstract: A processor having a register renaming structure and method is disclosed to recover a free list. The processor includes a physical register file including physical registers. The processor also includes a decoder to decode an instruction to indicate a destination logical register. The processor also includes a register allocation table to map the destination logical register to an allocated physical register. The processor also includes an active list that includes an old field and a new field. The old field includes at least one evicted physical register from the register alias table. The new field includes the allocated physical register. The processor also includes the free list of unallocated physical registers reclaimed from the active list.
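
The interplay of the mapping table, active list, and free list can be sketched as below. The abstract does not spell out when each register is reclaimed, so this sketch assumes the common policy of freeing the old (evicted) register at retirement and the new (allocated) register when an instruction is squashed. All names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class Renamer:
    """Toy rename structure: mapping table, active list, and free list."""

    def __init__(self, num_logical: int, num_physical: int) -> None:
        # Initially logical register i maps to physical register i.
        self.rat: List[int] = list(range(num_logical))
        self.free_list: Deque[int] = deque(range(num_logical, num_physical))
        # Active list entries: (logical, old physical, new physical).
        self.active_list: Deque[Tuple[int, int, int]] = deque()

    def rename_dest(self, logical: int) -> int:
        new = self.free_list.popleft()     # IndexError if no register is free
        old = self.rat[logical]            # mapping evicted from the table
        self.rat[logical] = new
        self.active_list.append((logical, old, new))
        return new

    def retire_oldest(self) -> None:
        # Assumed policy: once retired, the evicted register is no longer needed.
        _, old, _ = self.active_list.popleft()
        self.free_list.append(old)

    def squash_youngest(self) -> None:
        # Assumed policy: undo the mapping and reclaim the allocated register.
        logical, old, new = self.active_list.pop()
        self.rat[logical] = old
        self.free_list.appendleft(new)


r = Renamer(num_logical=2, num_physical=4)
first = r.rename_dest(0)     # logical 0 gets physical 2, evicting physical 0
r.retire_oldest()            # physical 0 returns to the free list
second = r.rename_dest(1)    # logical 1 gets physical 3, evicting physical 1
r.squash_youngest()          # mapping restored, physical 3 reclaimed
```

Keeping both fields in each active-list entry is what lets the same structure serve both the retire path and the recovery path.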

Type: Application

Filed: December 29, 2000

Publication date: July 4, 2002

Inventors: Stephan J. Jourdan, Michael Bekerman, Ronny Ronen
Processor pipeline stall apparatus and method of operation

Publication number: 20020087838

Abstract: There is disclosed a data processor for stalling the instruction execution pipeline after a cache miss and re-loading the correct cache data into any bypass devices before restarting the pipeline.

Type: Application

Filed: December 29, 2000

Publication date: July 4, 2002

Inventor: Anthony X. Jarvis
Rename finish conflict detection and recovery

Publication number: 20020083304

Abstract: An improved method and system for operating an out of order processor at a high frequency enabled by an increased pipeline length. It is proposed to shorten the pipeline by a considerable number of stages by accepting that a write after read conflict may occur, when directly after renaming, during the “read ROB” pipeline stage, all the information (tag, validity and data) is read from a Reorder Buffer (ROB) entry, and is next written, in a following pipeline stage “write RS”, into a reservation station (RS) entry. In order to assure the correctness of processing, in particular in cases of dependencies, e.g., write after read conflicts, a separate inventional add-in logic covers these cases. The logic detects the write after read conflict case of an Instructional Execution Unit (IEU) writing into the particular entry that is selected by the renaming logic during “read ROB”.

Type: Application

Filed: December 20, 2001

Publication date: June 27, 2002

Applicant: IBM

Inventors: Jens Leenstra, Dieter Wendel
Dynamic pipelines with reusable logic elements controlled by a set of multiplexers for pipeline stage selection

Patent number: 6412061

Abstract: A method of dynamically adjusting a multiple stage pipeline to execute one of a set of instructions, wherein each stage has a latency and performs a selected data operation. An instruction to be executed is received and a number of stages of the pipeline is selected to execute the instruction as needed to perform a corresponding data operation. Unnecessary stages are bypassed to reduce latency and the instruction is executed with the selected stages.

Type: Grant

Filed: January 14, 1998

Date of Patent: June 25, 2002

Assignee: Cirrus Logic, Inc.

Inventor: Thomas Anthony Dye
System and method for retiring approximately simultaneously a group of instructions in a superscalar microprocessor

Patent number: 6412064

Abstract: A system and method for retiring instructions in a superscalar microprocessor which executes a program comprising a set of instructions having a predetermined program order, the retirement system for simultaneously retiring groups of instructions executed in or out of order by the microprocessor. The retirement system comprises a done block for monitoring the status of the instructions to determine which instruction or group of instructions have been executed, a retirement control block for determining whether each executed instruction is retirable, a temporary buffer for storing results of instructions executed out of program order, and a register array for storing retirable-instruction results.

Type: Grant

Filed: August 2, 2000

Date of Patent: June 25, 2002

Assignee: Seiko Epson Corporation

Inventors: Johannes Wang, Sanjiv Garg, Trevor Deosaran
Speculative register adjustment

Publication number: 20020078326

Abstract: In one embodiment, a programmable processor is adapted to include a speculative count register. The speculative count register may be loaded with data associated with an instruction before the instruction commits. However, if the instruction is terminated before it commits, the speculative count register may be adjusted. A set of counters may monitor the difference between the speculative count register and its architectural counterpart.

Type: Application

Filed: December 20, 2000

Publication date: June 20, 2002

Applicant: Intel Corporation and Analog Devices, Inc.

Inventors: Charles P. Roth, Ravi P. Singh, Gregory A. Overkamp
Apparatus and method for executing floating-point store instructions in a microprocessor

Patent number: 6408379

Abstract: An apparatus and method for executing floating-point store instructions in a microprocessor is provided. If store data of a floating-point store instruction corresponds to a tiny number and an underflow exception is masked, then a trap routine can be executed to generate corrected store data and complete the store operation. In response to detecting that store data corresponds to a tiny number and the underflow exception is masked, the store data, store address information, and opcode information can be stored prior to initiating the trap routine. The trap routine can be configured to access the store data, store address information, and opcode information. The trap routine can be configured to generate corrected store data and complete the store operation using the store data, store address information, and opcode information.

Type: Grant

Filed: June 10, 1999

Date of Patent: June 18, 2002

Assignee: Advanced Micro Devices, Inc.

Inventors: Norbert Juffa, Stephan Meier, Stuart Oberman, Scott White
Data processing control device

Patent number: 6408372

Abstract: A RAM (12) used by the CPU comprises a work buffer (14) and a work register (151) for pipelined processing. The work buffer (14) consists of the first to fourth work buffers (141 to 144) each of which stores information on predetermined data, e.g., a current processing on the data. When the CPU accesses the first to fourth work buffers (141 to 144), an address decoder performs an address conversion on the basis of a value (R151) of the work register (151). For example, when the value (R151) of the work register (151) is “1”, addresses (P1, P2, P3 and P4) in an address space are converted (mapped) to addresses (AD141, AD142, AD143 and AD144) of work buffers (141, 142, 143 and 144). With this constitution, in performing a plurality of data processings in parallel, the CPU can improve its operation efficiency while controlling a currently performed processing on each data.

Type: Grant

Filed: March 2, 2000

Date of Patent: June 18, 2002

Assignee: Mitsubishi Denki Kabushiki Kaisha

Inventor: Shigenori Miyauchi
Multi-bit scoreboarding to handle write-after-write hazards and eliminate bypass comparators

Patent number: 6408378

Abstract: An apparatus and method of multi-bit scoreboarding to handle write-after-write hazards and eliminate bypass comparators. The method comprises providing a set of data storage units that include a set of scoreboard return path bits associated with the data storage units. The scoreboard return path bits include a set of return path indicators. A first execution unit is provided to provide an output based on an input. A first switching unit is provided to select the input to the first execution unit and to receive the output from the first execution unit. A first bypass control unit is provided to cause the first switching unit to couple the output from the first execution unit to the first execution unit based on the scoreboard return path bits, such that the input to the first execution unit is selected from among the data storage units and the output from the first execution unit.

Type: Grant

Filed: May 31, 2001

Date of Patent: June 18, 2002

Assignee: Intel Corporation

Inventor: Dennis M. O'Connor
Method for mapping instructions using a set of valid and invalid logical to physical register assignments indicated by bits of a valid vector together with a logical register list

Patent number: 6405304

Abstract: A technique for managing register assignments. The technique involves maintaining, in a register list memory circuit having entries that respectively correspond to physical registers, a list of register assignments that assign logical registers to the physical registers. The technique further involves maintaining, in a vector memory circuit having bits that respectively correspond to the physical registers, a valid vector that forms, in combination with the list of register assignments, a list of valid register assignments. Furthermore, the technique involves storing, for an instruction that is mapped by the data processor, a copy of the valid vector from the vector memory circuit to a silo memory circuit. Preferably, the processor using the technique has the ability to execute branches of instructions speculatively, and to recover if it is determined that the processor executed down an incorrect instruction branch.
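
The combination of a register list and a valid vector, with per-instruction snapshots for recovery, can be sketched as follows. This is a simplified model under stated assumptions: the snapshot is taken just before each mapping, and a physical register is not reassigned while a snapshot that refers to it is still held. The class and method names are hypothetical.

```python
from typing import List, Optional


class MapTable:
    """Toy register list plus valid vector, with a silo of saved valid vectors."""

    def __init__(self, num_physical: int) -> None:
        # Register list: one entry per physical register, naming a logical register.
        self.logical_of: List[Optional[str]] = [None] * num_physical
        # Valid vector: bit p set means the assignment for physical p is valid.
        self.valid: int = 0
        # Silo: one saved valid vector per mapped instruction (assumed: pre-mapping).
        self.silo: List[int] = []

    def map_dest(self, logical: str, physical: int) -> None:
        self.silo.append(self.valid)
        # Invalidate any earlier valid assignment of the same logical register.
        for p, name in enumerate(self.logical_of):
            if name == logical and (self.valid >> p) & 1:
                self.valid &= ~(1 << p)
        self.logical_of[physical] = logical
        self.valid |= 1 << physical

    def lookup(self, logical: str) -> Optional[int]:
        for p, name in enumerate(self.logical_of):
            if name == logical and (self.valid >> p) & 1:
                return p
        return None

    def recover(self, instr_index: int) -> None:
        # Restore the valid vector saved for the first wrong-path instruction.
        self.valid = self.silo[instr_index]
        del self.silo[instr_index:]


m = MapTable(num_physical=4)
m.map_dest("r1", 0)             # instruction 0: r1 -> p0
m.map_dest("r1", 1)             # instruction 1 (speculative): r1 -> p1
before = m.lookup("r1")
m.recover(1)                    # instruction 1 was on a wrong path
after = m.lookup("r1")
```

Because stale entries remain in the register list, restoring a single bit vector is enough to bring back the earlier mapping.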

Type: Grant

Filed: August 24, 1998

Date of Patent: June 11, 2002

Assignee: Compaq Information Technologies Group, L.P.

Inventors: James Arthur Farrell, Sharon Marie Britton, Harry Ray Fair, III, Bruce Gieseke, Daniel Lawrence Leibholz, Derrick R. Meyer
Method and apparatus for replacing data in an operand latch of a pipeline stage in a processor during a stall

Patent number: 6401195

Abstract: In one method, a hazard on a register is detected based on the register ID from a latch of a first stage of a processor pipeline. The pipeline is stalled after a stale value of the register is stored in a latch of a later stage of the pipeline. The stale value in the latch is then replaced with a fresh value while the pipeline is stalled.

Type: Grant

Filed: December 30, 1998

Date of Patent: June 4, 2002

Assignee: Intel Corporation

Inventors: Judge K. Arora, Harshvardhan P. Sharangpani, Ghassan W. Khadder
Data processor with an improved data dependence detector

Publication number: 20020066005

Abstract: The present invention provides a detector for detecting at least one kind of dependence in address between instructions executed by at least a processor, the detector being adopted to detect a possibility of presence of the at least one kind of dependence, wherein if the at least one kind of dependence is present in fact, then the detector detects a possibility of presence of the at least one kind of dependence, and if the at least one kind of dependence is not present in fact, then the detector is allowed to detect the at least one kind of dependence.

Type: Application

Filed: November 28, 2001

Publication date: May 30, 2002

Applicant: NEC CORPORATION

Inventors: Atsufumi Shibayama, Satoshi Matsushita, Sunao Torii, Naoki Nishi
Multi-threading for a processor utilizing a replay queue

Patent number: 6385715

Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.

Type: Grant

Filed: May 4, 2001

Date of Patent: May 7, 2002

Assignee: Intel Corporation

Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
Line-oriented reorder buffer configured to selectively store a memory operation result in one of the plurality of reorder buffer storage locations corresponding to the executed instruction

Patent number: 6381689

Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor.

Type: Grant

Filed: March 13, 2001

Date of Patent: April 30, 2002

Assignee: Advanced Micro Devices, Inc.

Inventors: David B. Witt, Thang M. Tran
Method and apparatus for performing frame processing for a network

Patent number: 6377998

Abstract: An improved frame processing apparatus for a network that supports high speed frame processing is disclosed. The frame processing apparatus uses a combination of fixed hardware and programmable hardware to implement network processing, including frame processing and media access control (MAC) processing. Although generally applicable to frame processing for networks, the improved frame processing apparatus is particularly suited for token-ring networks and ethernet networks. The invention can be implemented in numerous ways, including as an apparatus, an integrated circuit and network equipment.

Type: Grant

Filed: August 22, 1997

Date of Patent: April 23, 2002

Assignee: Nortel Networks Limited

Inventors: Michael Noll, Michael Clarke, Mark Smallwood
Pipelined microprocessor and a method relating thereto

Publication number: 20020038417

Abstract: A pipelined microprocessor for processing instructions includes at least one pipeline. The pipeline includes an instruction fetching functional stage, an instruction decoding functional stage, an execution functional stage comprising a number of execution units and a commit functional stage. The commit functional stage includes or is associated with a reorder buffer. Detecting means are provided for detecting instruction irregularities. When an instruction irregularity is detected, an irregularity indication and a flush instruction are generated. The irregularity indication is used to initiate a flush mode whereas the flush instruction, when received in a stage or unit set in flush mode, resets the flush mode in said stage/unit.

Type: Application

Filed: September 27, 2001

Publication date: March 28, 2002

Inventors: Joachim Strombergsson, Magnus Carlesson, Jonas Vasell
Data cache having store queue bypass for out-of-order instruction execution and method for same

Patent number: 6360314

Abstract: A bypass mechanism is disclosed for a computer system that executes load and store instructions out of order. The bypass mechanism compares the address of each issuing load instruction with a set of recent store instructions that have not yet updated memory. A match of the recent stores provides the load data instead of having to retrieve the data from memory. A store queue holds the recently issued stores. Each store queue entry and the issuing load includes a data size indicator. Subsequent to a data bypass, the data size indicator of the issuing load is compared against the data size indicator of the matching store queue entry. A trap is signaled when the data size indicator of the issuing load differs from the data size indicator of the matching store queue entry. The trap signal indicates that the data provided by the bypass mechanism was insufficient to satisfy the requirements of the load instruction.
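
The bypass-then-check behavior can be illustrated with a short sketch. It assumes the store queue is ordered oldest first and that the youngest matching store wins. The tuple layout, the dictionary used as memory, and the exception name are hypothetical.

```python
from typing import Dict, List, Tuple


class SizeMismatchTrap(Exception):
    """Signals that bypassed store data cannot satisfy the load's size."""


# Store queue entry: (address, data size indicator, data), oldest first.
StoreEntry = Tuple[int, int, int]


def load_with_bypass(addr: int, size: int, store_queue: List[StoreEntry],
                     memory: Dict[int, int]) -> int:
    """Return load data from the youngest matching store, else from memory."""
    for st_addr, st_size, st_data in reversed(store_queue):
        if st_addr == addr:
            # Size check after the address match: a mismatch raises a trap.
            if st_size != size:
                raise SizeMismatchTrap(
                    f"load size {size} != store size {st_size} at {addr:#x}")
            return st_data
    return memory[addr]     # no pending store to this address


memory = {0x10: 1, 0x20: 2, 0x30: 3}
store_queue = [(0x10, 4, 111), (0x10, 4, 222), (0x30, 8, 333)]

bypassed = load_with_bypass(0x10, 4, store_queue, memory)
from_memory = load_with_bypass(0x20, 4, store_queue, memory)
try:
    load_with_bypass(0x30, 4, store_queue, memory)
    trapped = False
except SizeMismatchTrap:
    trapped = True
```

The load at `0x10` takes its data from the newer of the two pending stores, the load at `0x20` falls through to memory, and the 4-byte load against an 8-byte store traps.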

Type: Grant

Filed: July 14, 1998

Date of Patent: March 19, 2002

Assignee: Compaq Information Technologies Group, L.P.

Inventors: David Arthur James Webb, Jr., James B. Keller, Derrick R. Meyer
Method and system for managing registers in a data processing system supports out-of-order and speculative instruction execution

Patent number: 6356918

Abstract: A method and a system in a data processing system for managing registers in a register array wherein the data processing system has M architected registers and the register array has greater than M registers. A first physical register address is selected from a group of available physical register addresses in a rename table in response to dispatching a register-modifying instruction that specifies an architected target register address. The architected target register address is then associated with the first physical register address, and a result of executing the register-modifying instruction is stored in a physical register pointed to by the first physical register address. In response to completing the register-modifying instruction, the first physical address in the rename table is exchanged with a second physical address in a completion rename table, wherein the second physical address is located in the completion rename table at a location pointed to by the architected target register address.
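
The exchange between the two tables might be modeled as below. This is only a sketch under stated assumptions: the `Renamer` class, its list-based tables, the `pending` map, and the first-free allocation policy are all hypothetical and are not taken from the patent.

```python
class Renamer:
    """Toy model of a rename table plus a completion rename table."""

    def __init__(self, num_arch, num_phys):
        if num_phys <= num_arch:
            raise ValueError("register array must be larger than the architected set")
        # Completion rename table: architected register -> committed physical register.
        self.completion = list(range(num_arch))
        # Rename table slots initially hold the spare physical register addresses.
        self.rename = list(range(num_arch, num_phys))
        self.available = list(range(len(self.rename)))  # indices of free slots
        self.pending = {}                               # slot -> architected target

    def dispatch(self, arch_target):
        # Select an available physical register and associate it with the target.
        slot = self.available.pop(0)
        self.pending[slot] = arch_target
        return slot, self.rename[slot]

    def complete(self, slot):
        # Exchange the rename-table address with the completion-table address
        # located at the position given by the architected target.
        arch_target = self.pending.pop(slot)
        self.rename[slot], self.completion[arch_target] = (
            self.completion[arch_target],
            self.rename[slot],
        )
        self.available.append(slot)  # the old committed register is free again


renamer = Renamer(num_arch=4, num_phys=6)
slot, phys = renamer.dispatch(arch_target=2)
renamer.complete(slot)
```

After completion, architected register 2 points at the physical register that received the result, and the previously committed physical register returns to the available pool through the same swap.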

Type: Grant

Filed: July 26, 1995

Date of Patent: March 12, 2002

Assignee: International Business Machines Corporation

Inventors: Chiao-Mei Chuang, Hung Qui Le
Non-stalling circular counterflow pipeline processor with reorder buffer

Patent number: 6351805

Abstract: A system and method of executing instructions within a counterflow pipeline processor. The counterflow pipeline processor includes an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units. An instruction and one or more operands issue into the instruction pipeline and a determination is made at one of the execution units whether the instruction is ready for execution. If so, the operands are loaded into the execution unit and the instruction executes. The execution unit is monitored for a result and, when the result arrives, it is stored into the result pipeline. If the instruction reaches the end of the pipeline without executing, it wraps around and is sent down the instruction pipeline again.

Type: Grant

Filed: February 23, 2001

Date of Patent: February 26, 2002

Assignee: Intel Corporation

Inventors: Kenneth J. Janik, Shih-Lien L. Lu, Michael F. Miller
High-performance, superscalar-based computer system with out-of-order instruction execution and concurrent results distribution

Publication number: 20020016903

Abstract: The high-performance, RISC core based microprocessor architecture includes an instruction fetch unit for fetching instruction sets from an instruction store and an execution unit that implements the concurrent execution of a plurality of instructions through a parallel array of functional units. The fetch unit generally maintains a predetermined number of instructions in an instruction buffer. The execution unit includes an instruction selection unit, coupled to the instruction buffer, for selecting instructions for execution, and a plurality of functional units for performing instruction specified functional operations. A unified instruction scheduler, within the instruction selection unit, initiates the processing of instructions through the functional units when instructions are determined to be available for execution and for which at least one of the functional units implementing a necessary computational function is available.

Type: Application

Filed: May 8, 2001

Publication date: February 7, 2002

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
Line-oriented reorder buffer

Publication number: 20020007450

Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor.

Type: Application

Filed: March 13, 2001

Publication date: January 17, 2002

Inventors: David B. Witt, Thang M. Tran
Instruction look-ahead system and hardware

Patent number: 6311266

Abstract: A method and system for executing instructions in a computer. Each instruction has a look-ahead code indicating the number of subsequent instructions that may be executed before its own execution is completed. The look-ahead code increments a counter associated with the instruction one past the look-ahead location. The instruction then begins execution. The next instructions will also be cleared to begin execution if they are less than the look-ahead code away from the current instruction. A large number of instructions can thus begin execution and be executing at the same time, thus increasing the speed of the computer operation.

Type: Grant

Filed: December 23, 1998

Date of Patent: October 30, 2001

Assignee: Cray Inc.

Inventors: Burton J. Smith, Robert L. Alverson
Multi-bit scoreboarding to handle write-after-write hazards and eliminate bypass comparators

Publication number: 20010034825

Abstract: What is disclosed is an apparatus including a set of data storage units having a set of scoreboard bits associated with the set of data storage units; a first execution unit having an output coupled to the data storage unit and a first input; a first switching unit having an output coupled to the first input of the first execution unit and a first input coupled to the output of the first execution unit; and, a first bypass control unit coupled to the first switching unit; wherein the first bypass control unit is configured to cause the first switching unit to couple the output of the first switching unit to the first input of the first switching unit based upon the set of scoreboard bits. What is also disclosed is a method including the steps of receiving a first instruction; and, storing a first address location and a first access path specifier for a first operand associated with the first instruction; wherein the first access path specifier indicates a source of the first operand.

Type: Application

Filed: May 31, 2001

Publication date: October 25, 2001

Inventor: Dennis M. O'Connor
Simultaneous and redundantly threaded processor store instruction comparator

Publication number: 20010034824

Abstract: A simultaneous and redundantly threaded, pipelined processor executes the same set of instructions simultaneously as two separate threads to provide fault tolerance. One thread is processed ahead of the other thread so that the instructions in one thread are processed through the processor's pipeline ahead of the corresponding instructions from the other thread. The thread whose instructions are processed earlier places its committed stores in a store queue. Subsequently, the second thread places its committed stores in the store queue. A compare circuit periodically scans the store queue for matching store instructions. If otherwise matching store instructions differ in any way (address or data), then a fault has occurred in the processing and the compare circuit initiates fault recovery. If comparison of the two instructions reveals they are identical, the compare circuit allows only a single store instruction to pass to the data cache or the system main memory.
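
The comparison step can be illustrated with a small sketch. It assumes each committed store is an `(address, data)` pair and that stores from the two threads are paired in commit order; `FaultDetected` and `compare_stores` are hypothetical names, and raising an exception stands in for whatever fault recovery the hardware initiates.

```python
class FaultDetected(Exception):
    """Stands in for the start of fault recovery."""


def compare_stores(leading, trailing):
    """Pair up committed stores from both threads and release one copy of each."""
    released = []
    for lead, trail in zip(leading, trailing):
        if lead != trail:
            # Address or data differs between the redundant copies.
            raise FaultDetected(f"store mismatch: {lead} vs {trail}")
        released.append(lead)  # only a single store passes on to the cache/memory
    return released


released = compare_stores([(0x10, 1), (0x14, 2)], [(0x10, 1), (0x14, 2)])
```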

Type: Application

Filed: April 19, 2001

Publication date: October 25, 2001

Inventors: Shubhendu S. Mukherjee, Steven K. Reinhardt
System and method for permitting out-of-order execution of load and store instructions

Patent number: 6301654

Abstract: In a load/store unit within a microprocessor, load and store instructions are executed out of order. The load and store instructions are assigned tags in a predetermined manner, and then assigned to load and store reorder queues for keeping track of the program order of the load and store instructions. Then when new load or store instructions are issued, the new load or store instructions are compared to entries within the load and store reorder queues to detect out of order problems.

Type: Grant

Filed: December 16, 1998

Date of Patent: October 9, 2001

Assignee: International Business Machines Corporation

Inventors: Bruce Joseph Ronchetti, Dave Shippy, Larry Edward Thatcher
Information processing unit, and exception processing method for specific application-purpose operation instruction

Publication number: 20010023479

Abstract: In the control section, an operation instruction not prescribing a functional specification, and a unit for processing the specific application-purpose operation instruction is provided within the processor core. The structure of this unit can be changed based on a flexible pipeline structure, and is separately designed for each application field. A register that prescribes a latency from when an instruction of the above unit is issued till when a result can be utilized is also provided in the processor core so as to prevent contention of an output port. Another register that prescribes a latency relating to a constraint of an interval of issuing an instruction of the above unit is also provided in the processor core so as to prevent contention of a resource with the preceding instructions.

Type: Application

Filed: December 22, 2000

Publication date: September 20, 2001

Inventors: Michihide Kimura, Atsuhiro Suga, Hideo Miyake, Satoshi Imai, Yasuki Nakamura
Reorder buffer employing last in line indication

Patent number: 6292884

Abstract: A reorder buffer is provided which stores a last in buffer (LIB) indication corresponding to each instruction. The last in buffer indication indicates whether or not the corresponding instruction is last, in program order, of the instructions within the buffer to update the storage location defined as the destination of that instruction. The LIB indication is included in the dependency checking comparisons. A dependency is indicated for a given source operand and a destination operand within the reorder buffer if the operand specifiers match and the corresponding LIB indication indicates that the instruction corresponding to the destination operand is last to update the corresponding storage location. At most one of the dependency comparisons for a given source operand can indicate dependency. According to one embodiment, the reorder buffer employs a line-oriented configuration.
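
A minimal sketch of how an LIB bit keeps dependency checking down to at most one hit per source operand follows. The `ReorderBuffer` class and its dictionary entries are assumptions for illustration; the abstract does not specify how the indication is stored or updated.

```python
class ReorderBuffer:
    """Toy reorder buffer that tracks a last-in-buffer (LIB) bit per entry."""

    def __init__(self):
        self.entries = []  # each entry: {"dest": register name, "lib": bool}

    def dispatch(self, dest):
        # A newer writer of the same destination takes over the LIB indication.
        for entry in self.entries:
            if entry["dest"] == dest:
                entry["lib"] = False
        self.entries.append({"dest": dest, "lib": True})
        return len(self.entries) - 1

    def dependency(self, source):
        """Return the index of the entry a source operand depends on, or None."""
        hits = [
            i for i, entry in enumerate(self.entries)
            if entry["dest"] == source and entry["lib"]
        ]
        assert len(hits) <= 1  # the LIB bit guarantees at most one match
        return hits[0] if hits else None


rob = ReorderBuffer()
first_r1 = rob.dispatch("r1")
only_r2 = rob.dispatch("r2")
second_r1 = rob.dispatch("r1")
```

Because only one entry per destination carries the LIB bit, no prioritization among multiple matching entries is needed in this model.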

Type: Grant

Filed: December 30, 1999

Date of Patent: September 18, 2001

Assignee: Advanced Micro Devices, Inc.

Inventors: Thang M. Tran, David B. Witt
Microprocessor cache redundancy scheme using store buffer

Patent number: 6289438

Abstract: A method and apparatus for bypassing defective cache memory locations on-board a microprocessor integrated circuit chip, which includes a processor, a cache memory, a store buffer, a tag RAM, and a comparator. The cache memory has a plurality of valid cache memory locations and at least one defective cache memory location. The store buffer has buffer entries and redundancy entries for storing data sent by the processor for storage in the cache memory. The tag RAM has buffer tag entries for storing addresses of data stored in the buffer entries and redundancy tag entries that store addresses of defective cache memory locations in the cache memory. The comparator compares addresses of data sent by the processor for cache memory storage with addresses stored in the redundancy tag entries.
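
The redirection of stores away from defective cache locations could look roughly like this. The `RedundantCache` class is hypothetical, dictionaries stand in for the cache, the redundancy entries, and their tags, and the load path is an assumption since the abstract only describes stores.

```python
class RedundantCache:
    """Toy cache whose defective locations are covered by redundancy entries."""

    def __init__(self, defective_addresses):
        self.cache = {}
        # Redundancy tag entries: addresses of defective cache locations.
        self.redundancy = {addr: None for addr in defective_addresses}

    def store(self, addr, data):
        if addr in self.redundancy:       # comparator hit on a redundancy tag
            self.redundancy[addr] = data  # keep the data in the redundancy entry
        else:
            self.cache[addr] = data       # normal path into the cache

    def load(self, addr):
        # Assumption: loads consult the redundancy entries first.
        if addr in self.redundancy:
            return self.redundancy[addr]
        return self.cache.get(addr)


rc = RedundantCache({0x40})
rc.store(0x40, 11)
rc.store(0x80, 22)
```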

Type: Grant

Filed: July 29, 1998

Date of Patent: September 11, 2001

Assignee: Kabushiki Kaisha Toshiba

Inventor: Toshinari Takayanagi
Superscalar RISC instruction scheduling

Patent number: 6289433

Abstract: A register renaming system for out-of-order execution of a set of reduced instruction set computer instructions having addressable source and destination register fields, adapted for use in a computer having an instruction execution unit with a register file accessed by read address ports and for storing instruction operands. A data dependence check circuit is included for determining data dependencies between the instructions. A tag assignment circuit generates one or more tags to specify the location of operands, based on the data dependencies determined by the data dependence check circuit. A set of register file port multiplexers select the tags generated by the tag assignment circuit and pass the tags onto the read address ports of the register file for storing execution results.

Type: Grant

Filed: June 10, 1999

Date of Patent: September 11, 2001

Assignee: Transmeta Corporation

Inventors: Sanjiv Garg, Kevin Ray Iadonato, Le Trong Nguyen, Johannes Wang
Decentralized exception processing system

Patent number: 6282636

Abstract: A decentralized exception processing system includes a plurality of local exception units. Each local exception unit is coupled to process local exception signals from one or more processing resources that are proximate to it. Each local exception unit generates local commit signals, using order information for the instruction in an issue group and any local exception signals it receives. The local commit signals are combined to generate a global commit signal for each instruction in the issue group. Local exception signals are collected at a selected one of the local exception units and processed to generate a global exception unit. The selected local exception unit resteers control of the processing resources to an exception handler associated with the global exception unit.
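
One way to picture the combination of local commit signals into a global commit signal is sketched below. The functions are hypothetical, and the commit rule (an instruction commits locally only if no locally observed exception occurs at or before it in program order) is an assumption, as is the use of a logical AND to combine the local signals.

```python
def local_commit(group_size, local_exceptions):
    """Per-instruction commit signals from one local exception unit.

    local_exceptions holds the issue-group positions (0 = oldest) at which
    this unit saw an exception.
    """
    first = min(local_exceptions, default=group_size)
    return [position < first for position in range(group_size)]


def global_commit(group_size, exceptions_per_unit):
    """Combine every unit's local commit signals, one result per instruction."""
    signals = [local_commit(group_size, ex) for ex in exceptions_per_unit]
    return [all(column) for column in zip(*signals)]


# Three local units watching a four-instruction issue group.
commits = global_commit(4, [set(), {2}, {3}])
```

Here the oldest exception, at position 2, blocks that instruction and everything younger, even though one unit saw no exception at all.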

Type: Grant

Filed: December 23, 1998

Date of Patent: August 28, 2001

Assignee: Intel Corporation

Inventors: Tse-Yu Yeh, Gregory Mathews, Steven Tu
High-performance, superscalar-based computer system with out-of-order instruction execution and concurrent results distribution

Patent number: 6282630

Abstract: The high-performance, RISC core based microprocessor architecture includes an instruction fetch unit for fetching instruction sets from an instruction store and an execution unit that implements the concurrent execution of a plurality of instructions through a parallel array of functional units. The fetch unit generally maintains a predetermined number of instructions in an instruction buffer. The execution unit includes an instruction selection unit, coupled to the instruction buffer, for selecting instructions for execution, and a plurality of functional units for performing instruction specified functional operations. A unified instruction scheduler, within the instruction selection unit, initiates the processing of instructions through the functional units when instructions are determined to be available for execution and for which at least one of the functional units implementing a necessary computational function is available.

Type: Grant

Filed: September 10, 1999

Date of Patent: August 28, 2001

Assignee: Seiko Epson Corporation

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
Dynamic allocation of resources in multiple microprocessor pipelines

Publication number: 20010014940

Abstract: Three parallel instruction processing pipelines of a microprocessor share two data memory ports for obtaining operands and writing back results. Since a significant proportion of the instructions of a typical computer program do not require reading operands from the memory, the probability is high that at least one of any three program instructions to be executed at the same time need not fetch an operand from memory. The two memory ports are thus connected at any given time with the two of the three pipelines which are processing instructions that require memory access, the pipeline without access to the memory processing an instruction that does not need it. To do so, the added third pipeline need not have all the same resources as the other two pipelines, so its stages are made to have a reduced capability in order to save space and reduce power consumption.
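
The port-sharing idea can be sketched as a simple arbiter. The `assign_ports` function is hypothetical, and stalling a pipeline when all three need memory at once is an assumption; the abstract only argues that this case is improbable.

```python
def assign_ports(needs_memory, num_ports=2):
    """Connect memory ports to the pipelines that need them in this cycle.

    needs_memory has one boolean per pipeline. Returns a mapping from
    pipeline index to port index, plus the pipelines left without a port.
    """
    assignment, stalled = {}, []
    free_ports = list(range(num_ports))
    for pipe, needs in enumerate(needs_memory):
        if not needs:
            continue  # this pipeline runs an instruction with no memory access
        if free_ports:
            assignment[pipe] = free_ports.pop(0)
        else:
            stalled.append(pipe)  # assumption: an unserved pipeline waits
    return assignment, stalled


typical = assign_ports([True, False, True])
worst_case = assign_ports([True, True, True])
```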

Type: Application

Filed: April 26, 2001

Publication date: August 16, 2001

Applicant: RISE TECHNOLOGY COMPANY

Inventor: Kenneth K. Munson
Dynamic allocation of resources in multiple microprocessor pipelines

Publication number: 20010014939

Abstract: Three parallel instruction processing pipelines of a microprocessor share two data memory ports for obtaining operands and writing back results. Since a significant proportion of the instructions of a typical computer program do not require reading operands from the memory, the probability is high that at least one of any three program instructions to be executed at the same time need not fetch an operand from memory. The two memory ports are thus connected at any given time with the two of the three pipelines which are processing instructions that require memory access, the pipeline without access to the memory processing an instruction that does not need it. To do so, the added third pipeline need not have all the same resources as the other two pipelines, so its stages are made to have a reduced capability in order to save space and reduce power consumption.

Type: Application

Filed: April 26, 2001

Publication date: August 16, 2001

Applicant: RISE TECHNOLOGY COMPANY

Inventor: Kenneth K. Munson
System and method for register renaming

Publication number: 20010011343

Abstract: A system and method for performing register renaming of source registers in a processor having a variable advance instruction window for storing a group of instructions to be executed by the processor, wherein a new instruction is added to the variable advance instruction window when a location becomes available. A tag is assigned to each instruction in the variable advance instruction window. The tag of each instruction to leave the window is assigned to the next new instruction to be added to it. The results of instructions executed by the processor are stored in a temp buffer according to their corresponding tags to avoid output and anti-dependencies. The temp buffer therefore permits the processor to execute instructions out of order and in parallel. Data dependency checks for input dependencies are performed only for each new instruction added to the variable advance instruction window and register renaming is performed to avoid input dependencies.

Type: Application

Filed: April 5, 2001

Publication date: August 2, 2001

Inventors: Trevor A. Deosaran, Sanjiv Garg, Kevin R. Iadonato
