Superscalar Patents (Class 712/23)
  • Publication number: 20010034824
    Abstract: A simultaneous and redundantly threaded, pipelined processor executes the same set of instructions simultaneously as two separate threads to provide fault tolerance. One thread is processed ahead of the other thread so that the instructions in one thread are processed through the processor's pipeline ahead of the corresponding instructions from the other thread. The thread, whose instructions are processed earlier, places its committed stores in a store queue. Subsequently, the second thread places its committed stores in the store queue. A compare circuit periodically scans the store queue for matching store instructions. If otherwise matching store instructions differ in any way (address or data), then a fault has occurred in the processing and the compare circuit initiates fault recovery. If comparison of the two instructions reveals they are identical, the compare circuit allows only a single store instruction to pass to the data cache or the system main memory.
    Type: Application
    Filed: April 19, 2001
    Publication date: October 25, 2001
    Inventors: Shubhendu S. Mukherjee, Steven K. Reinhardt
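The store comparison described in the abstract above can be illustrated with a minimal Python sketch. The names `Store`, `FaultDetected`, and `compare_and_release`, and the modelling of the store queue as two ordered lists, are assumptions made for illustration only; the application describes a hardware compare circuit, not software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    """A committed store (hypothetical model): target address and data value."""
    address: int
    data: int


class FaultDetected(Exception):
    """Raised when the redundant threads disagree on a store."""


def compare_and_release(leading, trailing):
    """Compare committed stores from the two threads in commit order.

    Returns the stores allowed to pass to the cache, one per matched pair.
    Stores of the leading thread that have no trailing counterpart yet
    are simply left waiting (zip stops at the shorter list).
    """
    released = []
    for lead, trail in zip(leading, trailing):
        if lead != trail:  # address or data differs
            raise FaultDetected(f"mismatch: {lead} vs {trail}")
        released.append(lead)  # only a single copy is forwarded
    return released
```

In this sketch a mismatch raises an exception where the hardware would initiate fault recovery.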
  • Patent number: 6308259
    Abstract: An instruction queue is physically divided into two (or more) instruction queues. Each instruction queue is configured to store a dependency vector for each instruction operation stored in that instruction queue. The dependency vector is evaluated to determine if the corresponding instruction operation may be scheduled for execution. Instruction scheduling logic in each physical queue may schedule instruction operations based on the instruction operations stored in that physical queue independent of the scheduling logic in other queues. The instruction queues evaluate the dependency vector in portions, during different phases of the clock. During a first phase, a first instruction queue evaluates a first portion of the dependency vectors and generates a set of intermediate scheduling request signals. During a second phase, the first instruction queue evaluates a second portion of the dependency vector and the intermediate scheduling request signal to generate a scheduling request signal.
    Type: Grant
    Filed: July 25, 2000
    Date of Patent: October 23, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David B. Witt
  • Patent number: 6304962
    Abstract: A method and apparatus for prefetching superblocks in a computer processing system having a fetch mechanism for fetching instructions for execution includes the step of controlling the fetch mechanism to begin fetching at a starting address of a current superblock. A superblock includes a set of instructions in consecutive address locations terminated by a branch instruction known to have been taken. A Superblock Target Buffer (STB) is supplied with the starting address of the current superblock. The STB has a plurality of entries each indexed by a starting address of a superblock and including a run length of the superblock and a target address of the terminating branch of the superblock. The run length corresponds to the sum of a length of the terminating branch and the difference between a starting address of the terminating branch of the superblock and the starting address of the superblock.
    Type: Grant
    Filed: June 2, 1999
    Date of Patent: October 16, 2001
    Assignee: International Business Machines Corporation
    Inventor: Ravindra K. Nair
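The run-length definition in the abstract above is simple arithmetic and can be written out as a short Python sketch. The function name, parameter names, and the example addresses are hypothetical; only the formula itself comes from the abstract.

```python
def superblock_run_length(superblock_start, branch_start, branch_length):
    """Run length = length of the terminating branch
    + (start address of the terminating branch - start address of the superblock)."""
    return branch_length + (branch_start - superblock_start)


# Example with assumed values: a superblock starting at 0x1000 whose
# terminating branch is a 4-byte instruction located at 0x1018.
run_length = superblock_run_length(0x1000, 0x1018, 4)  # 4 + 0x18 = 28

# A Superblock Target Buffer entry, modelled here as a dict indexed by the
# superblock starting address, holding (run length, branch target address).
# The target address 0x2000 is made up for the example.
stb = {0x1000: (run_length, 0x2000)}
```

With such an entry, a fetch unit would know both how many bytes to fetch for the current superblock and where the next one starts.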
  • Patent number: 6304953
    Abstract: One embodiment of the present invention is a computer processor that includes a first scheduler adapted to dispatch a first type of computer instructions, and a second scheduler coupled to the first scheduler and adapted to dispatch a second type of computer instructions. The first type of instructions all have a first latency and the second type of instructions all have a second latency. The first scheduler is skewed relative to the second scheduler so that when the first scheduler dispatches one of the first type of computer instructions having a first latency, the second scheduler will dispatch one of the second type of computer instructions that is dependent on the first type of computer instruction at a time equal to the first latency.
    Type: Grant
    Filed: July 31, 1998
    Date of Patent: October 16, 2001
    Assignee: Intel Corporation
    Inventors: Alexander Paul Henstrom, David J. Sager
  • Patent number: 6304954
    Abstract: Three parallel instruction processing pipelines of a microprocessor share two data memory ports for obtaining operands and writing back results. Since a significant proportion of the instructions of a typical computer program do not require reading operands from the memory, the probability is high that at least one of any three program instructions to be executed at the same time need not fetch an operand from memory. The two memory ports are thus connected at any given time with the two of the three pipelines which are processing instructions that require memory access, the pipeline without access to the memory processing an instruction that does not need it. To do so, the added third pipeline need not have all the same resources as the other two pipelines, so its stages are made to have a reduced capability in order to save space and reduce power consumption.
    Type: Grant
    Filed: September 11, 1998
    Date of Patent: October 16, 2001
    Assignee: Rise Technology Company
    Inventor: Kenneth K. Munson
  • Patent number: 6298436
    Abstract: A method and system for atomic memory accesses in a processor system, wherein the processor system is able to issue and execute multiple instructions out of order with respect to a particular program order. A first reservation instruction is speculatively issued to an execution unit of the processor system. Upon issuance, instructions queued for the execution unit which occur after the first reservation instruction in the program order are flushed from the execution unit, in response to detecting any previously executed reservation instructions in the execution unit which occur after the first reservation instruction in the program order.
    Type: Grant
    Filed: June 8, 1999
    Date of Patent: October 2, 2001
    Assignee: International Business Machines Corporation
    Inventors: James Allan Kahle, Hung Qui Le, Larry Edward Thatcher, David James Shippy
  • Patent number: 6295598
    Abstract: A split directory-based cache coherency technique utilizes a secondary directory in memory to implement a bit mask used to indicate when more than one processor cache in a multi-processor computer system contains the same line of memory which thereby reduces the searches required to perform the coherency operations and the overall size of the memory needed to support the coherency system. The technique includes the attachment of a “coherency tag” to a line of memory so that its status can be tracked without having to read each processor's cache to see if the line of memory is contained within that cache. In this manner, only relatively short cache coherency commands need be transmitted across the communication network (which may comprise a Sebring ring) instead of across the main data path bus thus freeing the main bus from being slowed down by cache coherency data transmissions while removing the bandwidth limitations inherent in other cache coherency techniques.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: September 25, 2001
    Assignee: SRC Computers, Inc.
    Inventors: Jonathan L. Bertoni, Lee A. Burton
  • Patent number: 6292884
    Abstract: A reorder buffer is provided which stores a last in buffer (LIB) indication corresponding to each instruction. The last in buffer indication indicates whether or not the corresponding instruction is last, in program order, of the instructions within the buffer to update the storage location defined as the destination of that instruction. The LIB indication is included in the dependency checking comparisons. A dependency is indicated for a given source operand and a destination operand within the reorder buffer if the operand specifiers match and the corresponding LIB indication indicates that the instruction corresponding to the destination operand is last to update the corresponding storage location. At most one of the dependency comparisons for a given source operand can indicate dependency. According to one embodiment, the reorder buffer employs a line-oriented configuration.
    Type: Grant
    Filed: December 30, 1999
    Date of Patent: September 18, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Thang M. Tran, David B. Witt
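The last in buffer (LIB) check in the abstract above can be sketched in Python as follows. The class and method names are hypothetical, and clearing the LIB indication of older entries when a new writer of the same destination is allocated is an assumption about how the indication is kept up to date; the abstract only states what the indication means.

```python
class ReorderBuffer:
    """Toy reorder buffer holding a destination and a LIB bit per entry."""

    def __init__(self):
        self.entries = []  # each entry: {"dest": str, "lib": bool}

    def allocate(self, dest):
        """Add an instruction writing `dest`; it becomes the last writer."""
        for entry in self.entries:
            if entry["dest"] == dest:
                entry["lib"] = False  # assumption: older writers lose LIB
        self.entries.append({"dest": dest, "lib": True})
        return len(self.entries) - 1

    def find_dependency(self, src):
        """Return the index of the entry `src` depends on, or None.

        A dependency needs a matching operand specifier AND a set LIB bit,
        so at most one entry can match.
        """
        matches = [i for i, entry in enumerate(self.entries)
                   if entry["dest"] == src and entry["lib"]]
        assert len(matches) <= 1
        return matches[0] if matches else None
```

Because the LIB bit takes part in the comparison, no priority logic is needed to pick the youngest of several matching entries.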
  • Patent number: 6292882
    Abstract: In one aspect, the invention includes an apparatus for filtering instructions within a digital system that eliminates the need to physically switch the valid instructions onto consecutive data lines of a buffer. The apparatus includes a filter for filtering instructions within a digital system. The filter includes an address generator capable of generating at least two addresses in response to receiving at least two micro-operations. The filter also includes a logic circuit coupled to the address generator. The logic circuit filters addresses corresponding to valid micro-operations in response to assessing the state of a portion of each of the micro-operations. In a second aspect, the invention includes a method for filtering instructions within a digital system that eliminates the need to physically switch the valid instructions onto consecutive data lines of a buffer. The method includes, generating at least two addresses in response to receiving at least two micro-operations.
    Type: Grant
    Filed: December 10, 1998
    Date of Patent: September 18, 2001
    Assignee: Intel Corporation
    Inventors: Nazar A. Zaidi, Umair A. Khan
  • Patent number: 6289433
    Abstract: A register renaming system for out-of-order execution of a set of reduced instruction set computer instructions having addressable source and destination register fields, adapted for use in a computer having an instruction execution unit with a register file accessed by read address ports and for storing instruction operands. A data dependence check circuit is included for determining data dependencies between the instructions. A tag assignment circuit generates one or more tags to specify the location of operands, based on the data dependencies determined by the data dependence check circuit. A set of register file port multiplexers select the tags generated by the tag assignment circuit and pass the tags onto the read address ports of the register file for storing execution results.
    Type: Grant
    Filed: June 10, 1999
    Date of Patent: September 11, 2001
    Assignee: Transmeta Corporation
    Inventors: Sanjiv Garg, Kevin Ray Iadonato, Le Trong Nguyen, Johannes Wang
  • Patent number: 6286095
    Abstract: A computer apparatus incorporating special instructions to force load and store operations to execute in program order. The present invention provides a new and novel store instruction that is suspended until all prior store instructions have been completed by an associated CPU. Also, a new load instruction is provided which blocks any subsequent load instructions from executing until this load instruction has been completed by an associated CPU. These instructions allow for high efficiency computer systems to be implemented which optimize instruction throughput by executing subsequent instructions while waiting for a prior instruction to complete.
    Type: Grant
    Filed: September 26, 1995
    Date of Patent: September 4, 2001
    Assignee: Hewlett-Packard Company
    Inventors: Dale C. Morris, Barry J. Flahive, Michael L. Ziegler, Jerome C. Huck, Stephen G. Burger, Ruby B. L. Lee, Bernard L. Stumpf, Jeff Kurtze
  • Patent number: 6282629
    Abstract: A pipelined processor includes an instruction box including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by said set of instructions, to reorder the issuance of said set of instructions from said instruction processor. The mapped register operand fields are associated with the corresponding instructions of said reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
    Type: Grant
    Filed: March 30, 1999
    Date of Patent: August 28, 2001
    Assignee: Compaq Computer Corporation
    Inventor: David J. Sager
  • Patent number: 6282630
    Abstract: The high-performance, RISC core based microprocessor architecture includes an instruction fetch unit for fetching instruction sets from an instruction store and an execution unit that implements the concurrent execution of a plurality of instructions through a parallel array of functional units. The fetch unit generally maintains a predetermined number of instructions in an instruction buffer. The execution unit includes an instruction selection unit, coupled to the instruction buffer, for selecting instructions for execution, and a plurality of functional units for performing instruction specified functional operations. A unified instruction scheduler, within the instruction selection unit, initiates the processing of instructions through the functional units when instructions are determined to be available for execution and for which at least one of the functional units implementing a necessary computational function is available.
    Type: Grant
    Filed: September 10, 1999
    Date of Patent: August 28, 2001
    Assignee: Seiko Epson Corporation
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 6279099
    Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations is provided. Independent execution paths are provided for different graphics instructions to allow parallel execution of instructions which commonly occur together. The invention also optimizes the use of register file accesses to avoid, as much as possible, interference between graphics instructions needing to access a register file and other instruction accesses which would occur in combination with graphics instructions, thereby avoiding pipeline stalls and allowing parallel execution.
    Type: Grant
    Filed: August 25, 2000
    Date of Patent: August 21, 2001
    Assignee: Sun Microsystems, Inc.
    Inventors: Timothy J. Van Hook, Leslie D. Kohn, Robert Yung
  • Patent number: 6272617
    Abstract: A system and method for performing register renaming of source registers in a processor having a variable advance instruction window for storing a group of instructions to be executed by the processor, wherein a new instruction is added to the variable advance instruction window when a location becomes available. A tag is assigned to each instruction in the variable advance instruction window. The tag of each instruction to leave the window is assigned to the next new instruction to be added to it. The results of instructions executed by the processor are stored in a temp buffer according to their corresponding tags to avoid output and anti-dependencies. The temp buffer therefore permits the processor to execute instructions out of order and in parallel. Data dependency checks for input dependencies are performed only for each new instruction added to the variable advance instruction window and register renaming is performed to avoid input dependencies.
    Type: Grant
    Filed: September 17, 1999
    Date of Patent: August 7, 2001
    Assignee: Seiko Epson Corporation
    Inventors: Trevor A. Deosaran, Sanjiv Garg, Kevin R. Iadonato
  • Patent number: 6272520
    Abstract: A method for detecting thread switch conditions provides first and second scoreboard bits for each register in a register file. The first scoreboard bit associated with a register is set when a load is generated to return data to the register. The second scoreboard bit is set if the load misses in a selected processor cache. Register read instructions are monitored, and a thread switch condition is indicated when a register read instruction to the register is detected while its first and second scoreboard bits are set.
    Type: Grant
    Filed: December 31, 1997
    Date of Patent: August 7, 2001
    Assignees: Intel Corporation, Hewlett-Packard
    Inventors: Harshvardhan Sharangpani, Rajiv Gupta, Judge K. Arora
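The two-bit scoreboard check in the abstract above can be sketched in Python. The class name, method names, and the `load_returned` step that clears both bits are assumptions added for illustration; the abstract only describes when the bits are set and when a thread switch condition is indicated.

```python
class Scoreboard:
    """Two scoreboard bits per register: load pending, and load missed cache."""

    def __init__(self, num_registers):
        self.load_pending = [False] * num_registers  # first scoreboard bit
        self.cache_miss = [False] * num_registers    # second scoreboard bit

    def issue_load(self, reg, missed_cache):
        """A load that will return data to `reg` has been generated."""
        self.load_pending[reg] = True
        self.cache_miss[reg] = missed_cache

    def load_returned(self, reg):
        """Assumption: both bits are cleared once the data arrives."""
        self.load_pending[reg] = False
        self.cache_miss[reg] = False

    def switch_on_read(self, reg):
        """A read of `reg` indicates a thread switch only if both bits are set."""
        return self.load_pending[reg] and self.cache_miss[reg]
```

A read of a register whose load hit in the cache sets only the first bit, so it does not indicate a thread switch in this model.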
  • Patent number: 6272619
    Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instructions in-order.
    Type: Grant
    Filed: November 10, 1999
    Date of Patent: August 7, 2001
    Assignee: Seiko Epson Corporation
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 6269436
    Abstract: A microprocessor is provided which is configured to predict return addresses for return instructions according to a return stack storage included therein. The return stack storage is a stack structure configured to store return addresses associated with previously detected call instructions. Return addresses may be predicted for return instructions early in the instruction processing pipeline of the microprocessor. In one embodiment, the return stack storage additionally stores a call tag and a return tag with each return address. The call tag and return tag respectively identify call and return instructions associated with the return address. These tags may be compared to a branch tag conveyed to the return prediction unit upon detection of a branch misprediction. The results of the comparisons may be used to adjust the contents of the return stack storage with respect to the misprediction. The microprocessor may continue to predict return addresses correctly following a mispredicted branch instruction.
    Type: Grant
    Filed: September 8, 1999
    Date of Patent: July 31, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Thang M. Tran, Rupaka Mahalingaiah
  • Patent number: 6266761
    Abstract: A method and system in an information processing system are disclosed for efficiently maintaining copies of values stored within a plurality of registers. The information processing system includes first circuitry, second circuitry, and a plurality of buffers. The first circuitry processes an execution stage of a first type of instruction which always specifies a destination of at least one of a first type of register or a second type of register, and which outputs first information in response thereto. The first circuitry also processes an execution stage of a second type of instruction which always specifies a destination of only a third type of register, and outputs second information in response thereto.
    Type: Grant
    Filed: June 12, 1998
    Date of Patent: July 24, 2001
    Assignees: International Business Machines Corporation, Motorola, Inc.
    Inventors: Michael David Carlson, Thomas Alan Hoy, Terence Matthew Potter, David Domenic Putti
  • Patent number: 6266744
    Abstract: A processor employing a dependency link file. Upon detection of a load which hits a store for which store data is not available, the processor allocates an entry within the dependency link file for the load. The entry stores a load identifier identifying the load and a store data identifier identifying a source of the store data. The dependency link file monitors results generated by execution units within the processor to detect the store data being provided. The dependency link file then causes the store data to be forwarded as the load data in response to detecting that the store data is provided. The latency from store data being provided to the load data being forwarded may thereby be minimized. Particularly, the load data may be forwarded without requiring that the load memory operation be scheduled.
    Type: Grant
    Filed: May 18, 1999
    Date of Patent: July 24, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventors: William Alexander Hughes, Derrick R. Meyer
  • Patent number: 6266765
    Abstract: A system for issuing a family of instructions during a single clock includes a decoder for decoding the family of instructions and logic, responsive to the decode result, for determining whether resource conflicts would occur if the family were issued during one clock. If no resource conflicts occur, an execution unit executes the family regardless of whether dependencies among the instructions in the family exist.
    Type: Grant
    Filed: July 7, 2000
    Date of Patent: July 24, 2001
    Inventor: Robert W. Horst
  • Patent number: 6263416
    Abstract: In a superscalar processor, multiple instructions are executed in parallel to obtain multiple execution results, and the multiple execution results are stored in a working register file. Each execution result in the working register file has at least one status bit associated therewith which identifies the execution result as valid data. The multiple execution results contained in the working register file are then retired by changing the status bits associated with each execution result to identify the execution result as final data. In this manner, the speculative data is retired as the final data without data movement of the speculative data, thus reducing a number of ports needed in the superscalar processor.
    Type: Grant
    Filed: June 27, 1997
    Date of Patent: July 17, 2001
    Assignee: Sun Microsystems, Inc.
    Inventor: Rajasekhar Cherabuddi
  • Patent number: 6256722
    Abstract: A data processing system comprises a plurality of nodes and a serial data bus interconnecting the nodes in series in a closed loop, for passing address and data information. At least one processing node includes a processor, a printed circuit board and a memory which is partitioned into a plurality of sections, including a first section for directly sharable memory located on the printed circuit board, and a second section for block sharable memory. A local bus connects the processor, block sharable memory and printed circuit board, for transferring data in parallel from the processor to the directly sharable memory on the printed circuit board, and for transferring data from the block sharable memory to the printed circuit board.
    Type: Grant
    Filed: December 13, 1999
    Date of Patent: July 3, 2001
    Assignee: Sun Microsystems, Inc.
    Inventors: John D. Acton, Michael D. Derbish, Gavin G. Gibson, Jack M. Hardy, Jr., Hugh M. Humphreys, Steven P. Kent, Steven E. Schelong, Ricardo Yong, William B. DeRolf
  • Patent number: 6256720
    Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instructions in-order.
    Type: Grant
    Filed: November 9, 1999
    Date of Patent: July 3, 2001
    Assignee: Seiko Epson Corporation
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 6256721
    Abstract: An apparatus for accelerating move operations includes a lookahead unit which detects move instructions prior to the execution of the move instructions (e.g. upon selection of the move operations for dispatch within a processor). Upon detecting a move instruction, the lookahead unit signals a register rename unit, which reassigns the rename register associated with the source register to the destination register. In one particular embodiment, the lookahead unit attempts to accelerate moves from a base pointer register to a stack pointer register (and vice versa). An embodiment of the lookahead unit generates lookahead values for the stack pointer register by maintaining cumulative effects of the increments and decrements of previously dispatched instructions. The cumulative effects of the increments and decrements prior to a particular instruction may be added to a previously generated value of the stack pointer register to generate a lookahead value for that particular instruction.
    Type: Grant
    Filed: June 16, 2000
    Date of Patent: July 3, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David B. Witt
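The stack pointer lookahead in the abstract above amounts to a running sum, which a short Python sketch can make concrete. The function name, the parameter names, and the example values are hypothetical; the sketch only models the arithmetic, not the lookahead unit itself.

```python
def lookahead_stack_pointers(base_sp, deltas):
    """Return the stack pointer value each instruction sees before it executes.

    `base_sp` is a previously generated stack pointer value and `deltas`
    lists the increment or decrement each dispatched instruction applies
    (for example -4 for a 32-bit push, +4 for a pop).
    """
    values = []
    cumulative = 0  # cumulative effect of the earlier instructions
    for delta in deltas:
        values.append(base_sp + cumulative)
        cumulative += delta
    return values


# Two pushes followed by an 8-byte stack adjustment (assumed values).
print([hex(v) for v in lookahead_stack_pointers(0x1000, [-4, -4, +8])])
# prints ['0x1000', '0xffc', '0xff8']
```

Each lookahead value depends only on the base value and the accumulated offsets, so it can be formed before the earlier instructions have executed.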
  • Patent number: 6253312
    Abstract: An apparatus and method are provided for concurrently loading single-precision operands into registers in a microprocessor floating point register file. The apparatus includes translation logic, data logic, and write back logic. The translation logic receives a load macro instruction prescribing an address, and decodes the load macro instruction into a double load micro instruction. The double load micro instruction directs the microprocessor to retrieve the two single-precision operands from the address and to load the two single-precision operands into the two floating point registers. The data logic, coupled to the translation logic, executes the double load micro instruction and retrieves the two single-precision operands from the address. The write back logic, coupled to the data logic, loads the two single-precision operands into the two floating point registers during a single write cycle.
    Type: Grant
    Filed: August 7, 1998
    Date of Patent: June 26, 2001
    Assignee: IP First, L.L.C.
    Inventors: Timothy A. Elliott, G. Glenn Henry, Terry Parks
  • Patent number: 6253309
    Abstract: A microprocessor configured to rapidly decode variable-length instructions is disclosed. The microprocessor is configured with a predecoder and an instruction cache. The predecoder is configured to expand variable-length instructions to create fixed-length instructions by padding instruction fields within each variable-length instruction with constants until each field reaches a predetermined maximum width. The fixed-width instructions are then stored within the instruction cache and output for execution when a corresponding requested address is received. The instruction cache may store both variable- and fixed-width instructions, or just fixed-width instructions. An array of pointers may be used to access particular fixed-length instructions. The fixed-length instructions may be configured to all have the same fields and the same lengths, or they may be divided into groups, wherein instructions within each group have the same fields and the same lengths.
    Type: Grant
    Filed: September 21, 1998
    Date of Patent: June 26, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Rupaka Mahalingaiah
  • Patent number: 6253306
    Abstract: Accordingly, a prefetch instruction mechanism is desired for implementing a prefetch instruction which is non-faulting, non-blocking, and non-modifying of architectural register state. Advantageously, a prefetch mechanism described herein is provided largely without the addition of substantial complexity to a load execution unit. In one embodiment, the non-faulting attribute of the prefetch mechanism is provided through use of the vector decode supplied Op sequence that activates an alternate exception handler. The non-modifying of architectural register state attribute is provided (in an exemplary embodiment) by first decoding a PREFETCH instruction to an Op sequence targeting a scratch register wherein the scratch register has scope limited to the Op sequence corresponding to the PREFETCH instruction.
    Type: Grant
    Filed: July 29, 1998
    Date of Patent: June 26, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amos Ben-Meir, John G. Favor
  • Patent number: 6253310
    Abstract: A microprocessor capable of delaying the deallocation of an arithmetic flags register is described. A system processes instructions of a first instruction set architecture which has an arithmetic flags register. The system also processes instructions of a second instruction set architecture which is not compatible with the first instruction set architecture. In order to process a first instruction of the first instruction set architecture that implicitly updates the arithmetic flags register, the arithmetic flags register shares a physical destination register with a general register containing a result for the first instruction. An instruction that does not update the arithmetic flags but would deallocate the register containing the arithmetic flags triggers the delayed deallocation mechanism of the present invention.
    Type: Grant
    Filed: December 31, 1998
    Date of Patent: June 26, 2001
    Assignee: Intel Corporation
    Inventors: Ricardo Ramirez, Mike Morrison
  • Patent number: 6249857
    Abstract: In accordance with a first embodiment, a processing apparatus is provided. The processing apparatus (10) includes a register (12) including a first and a second programming instruction, a first processing unit (16) responsive to the first programming instruction, and a second processing unit (22) responsive to the second programming instruction. The second processing unit (22) includes a logarithm based processor having at least one digital logarithm converter (80), a digital logic device (82), and a digital inverse logarithm converter (84). In other embodiments, the processing apparatus (10) is incorporated into a communication device (100) and a video system (300).
    Type: Grant
    Filed: October 20, 1997
    Date of Patent: June 19, 2001
    Assignee: Motorola, Inc.
    Inventors: Matthew H. Klapman, Jeffrey G. Toler
  • Patent number: 6249855
    Abstract: An arbiter system for the instruction issue logic of a CPU has at least two encoder circuits that select instructions in an instruction queue for issue to first and second execution units, respectively, based upon the positions of the instructions within the queue and requests by the instructions for the first and/or second execution units. As a result, since the instruction can request different execution units, this system is compatible with architectures where the execution units may have different capabilities to execute different instructions, i.e., each integer execution unit may not be able to execute all of the instructions in the CPU's integer instruction set. According to the present invention, one of the encoder circuits is subordinate to the other circuit. The subordinate encoder circuit selects instructions from the instruction queue based not only on the positions of the instructions and their requests, but also on the instruction selection of the dominant encoder circuit.
    Type: Grant
    Filed: June 2, 1998
    Date of Patent: June 19, 2001
    Assignee: Compaq Computer Corporation
    Inventors: James A. Farrell, Bruce A. Gieseke
  • Patent number: 6249862
    Abstract: A dependency table stores a reorder buffer tag for each register. The stored reorder buffer tag corresponds to the last of the instructions within the reorder buffer (in program order) to update the register. Otherwise, the dependency table indicates that the value stored in the register is valid. When operand fetch is performed for a set of concurrently decoded instructions, dependency checking is performed including checking for dependencies between the set of concurrently decoded instructions as well as accessing the dependency table to select the reorder buffer tag stored therein. Either the reorder buffer tag of one of the concurrently decoded instructions, the reorder buffer tag stored in the dependency table, the instruction result corresponding to the stored reorder buffer tag, or the value from the register itself is forwarded as the source operand for the instruction. Information from the comparators and the information stored in the dependency table is sufficient to select which value is forwarded.
    Type: Grant
    Filed: November 15, 2000
    Date of Patent: June 19, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Muralidharan S. Chinnakonda, Thang M. Tran, Wade A. Walker
  • Patent number: 6249856
    Abstract: A register system for a data processor which operates in a plurality of modes. The register system provides multiple, identical banks of register sets, the data processor controlling access such that instructions and processes need not specify any given bank. An integer register set includes first (RA[23:0]) and second (RA[31:24]) subsets, and a shadow subset (RT[31:24]). While the data processor is in a first mode, instructions access the first and second subsets. While the data processor is in a second mode, instructions may access the first subset, but any attempts to access the second subset are re-routed to the shadow subset instead, transparently to the instructions, allowing system routines to seemingly use the second subset without having to save and restore data which user routines have written to the second subset.
    Type: Grant
    Filed: January 10, 2000
    Date of Patent: June 19, 2001
    Assignee: Seiko Epson Corporation
    Inventors: Sanjiv Garg, Derek J. Lentz, Le Trong Nguyen, Sho Long Chen
  • Patent number: 6247114
    Abstract: A microprocessor having an instruction queue capable of out-of-order instruction dispatch and rapidly selecting one or more oldest eligible entries is disclosed. The microprocessor may comprise a plurality of instruction execution pipelines, an instruction cache, and an instruction queue coupled to the instruction cache and execution pipelines. The instruction queue may comprise a plurality of instruction storage locations and may be configured to output up to a predetermined number of non-sequential out of order instructions per clock cycle. The microprocessor may be further configured with high speed control logic coupled to the instruction queue. The control logic may comprise a number of pluralities of multiplexers, wherein the first plurality of multiplexers are configured to select a first subset of the instructions stored in the queue. The second plurality of multiplexers then select a second subset of instructions from the first subset.
    Type: Grant
    Filed: February 19, 1999
    Date of Patent: June 12, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Jeffrey E. Trull
  • Patent number: 6247124
    Abstract: A computing system contains an apparatus having an instruction memory to store a plurality of lines of a plurality of instructions, and a branch memory to store a plurality of branch prediction entries, each branch prediction entry containing information for predicting whether a branch designated by a branch instruction stored in the instruction memory will be taken when the branch instruction is executed. Each branch prediction entry includes a branch target field for indicating a target address of a line containing a target instruction to be executed if the branch is taken, a destination field indicating where the target instruction is located within the line indicated by the branch target address, and a source field indicating where the branch instruction is located within the line corresponding to the target address.
    Type: Grant
    Filed: July 30, 1999
    Date of Patent: June 12, 2001
    Assignee: MIPS Technologies, Inc.
    Inventors: Chandra Joshi, Paul Rodman, Peter Hsu, Monica R. Nofal
  • Patent number: 6247122
    Abstract: An apparatus and method for improving microprocessor performance by improving the prediction accuracy of conditional branch instructions is provided. A static branch predictor makes a prediction of the outcome of a conditional branch instruction based on the branch test type and the branch target address displacement sign. A branch history table stores a bit indicating whether the prediction of the static predictor agreed with the outcome of the last execution of the branch instruction. If the history table bit agrees, then the static prediction is used. Otherwise, the opposite of the static prediction is used.
    Type: Grant
    Filed: December 2, 1998
    Date of Patent: June 12, 2001
    Assignee: IP-First, L.L.C.
    Inventors: G. Glenn Henry, Terry Parks
  • Patent number: 6247106
    Abstract: A processor employing a map unit including register renaming hardware is shown. The map unit may assign virtual register numbers to source registers by scanning instruction operations to detect intraline dependencies. Subsequently, physical register numbers are mapped to the source register numbers responsive to the virtual register numbers. The map unit may store (e.g. in a map silo) a current lookahead state corresponding to each line of instruction operations which are processed by the map unit. Additionally, the map unit stores an indication of which instruction operations within the line update logical registers, which logical registers are updated, and the physical register numbers assigned to the instruction operations. Upon detection of an exception condition for an instruction operation within a line, the current lookahead state corresponding to the line is restored from the map silo.
    Type: Grant
    Filed: July 27, 2000
    Date of Patent: June 12, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David B. Witt
  • Patent number: 6240503
    Abstract: A processor is configured to generate lookahead values using a cumulative constant. The processor classifies operations to a particular register (e.g. the stack pointer register, or ESP in an embodiment employing the x86 instruction set architecture) as either accelerated or non-accelerated. For example, instructions which are defined to increment/decrement the particular register by an explicit or implicit constant value may be accelerated operations. Upon the occurrence of a non-accelerated operation, the processor may begin accumulating the cumulative effect of accelerated operations to the result of the non-accelerated operation as a cumulative offset. The result of the non-accelerated operation (upon execution thereof) may then be added to the cumulative offset values corresponding to each accelerated operation to generate the particular register value corresponding to that accelerated operation. Accordingly, dependencies upon the register due to the accelerated operations may be alleviated.
    Type: Grant
    Filed: November 12, 1998
    Date of Patent: May 29, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David B. Witt
  • Patent number: 6237082
    Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor.
    Type: Grant
    Filed: August 22, 2000
    Date of Patent: May 22, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventors: David B. Witt, Thang M. Tran
  • Patent number: 6237076
    Abstract: A method and system for renaming registers is proposed in which mixed instruction sets, e.g. 32 bit and 64 bit instructions, are carried out concurrently in one program. In the case of an instruction sequence of a preceding 64 bit instruction and one or more 32 bit instructions to be executed in-order after the 64 bit instruction, where the 32 bit instructions have a data dependence on the preceding 64 bit instruction, the rest of the register range changed by the preceding 64 bit instruction is copied to the corresponding location in a target register of the succeeding 32 bit instruction, at least if the same logical register is specified by the 32 bit instruction as was specified by the preceding 64 bit instruction. The copy source is addressed by the register number and held in a list (28).
    Type: Grant
    Filed: August 28, 1998
    Date of Patent: May 22, 2001
    Assignee: International Business Machines Corporation
    Inventors: Ute Gaertner, Klaus Jörg Getzlaff, Oliver Laub, Erwin Pfeffer
  • Patent number: 6230254
    Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load/store unit is provided whose main purpose is to make load requests out-of-order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out-of-order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.
    Type: Grant
    Filed: November 12, 1999
    Date of Patent: May 8, 2001
    Assignee: Seiko Epson Corporation
    Inventors: Cheryl D. Senter, Johannes Wang
  • Patent number: 6223277
    Abstract: A packed data structure processor (25) is disclosed. The packed data structure processor (25) includes a register file (24) of multiple registers (REG0 through REG31), each of which is connected to an input of each of a plurality of operand multiplexers (26). Each operand multiplexer (26) is associated with a shift/mask circuit (28), which permits the selection of a particular portion (e.g., BYTE, WORD, DWORD) of the contents of a selected register file, for use as an operand. An arithmetic logic unit (ALU) (30) performs data processing operations upon the operands, and presents results on writeback bus (WBBUS), to external memory (18) over a memory interface (37), or to a register file (42) associated with other circuitry (44) over a coprocessor interface (41). A destination selector (40) is capable of writing to only a selected portion of a selected register, thus permitting a packed data structure to be present within the register file (24).
    Type: Grant
    Filed: December 22, 1997
    Date of Patent: April 24, 2001
    Assignee: Texas Instruments Incorporated
    Inventor: Brian J. Karguth
  • Patent number: 6223278
    Abstract: A method for performing floating point (FP) instruction handling is provided. A floating point store status word (FSTSW) instruction is inserted within a plurality of micro-ops corresponding to a plurality of FP instructions and the plurality of micro-ops are ordered for execution. In another aspect, a processor is provided for executing a plurality of floating point (FP) instructions. The processor includes a fetcher/decoder unit to retrieve a plurality of FP instructions from a memory structure and generate a plurality of micro-ops from the FP instructions. The processor further generates a floating point store status word (FSTSW) instruction and includes a scheduler unit to re-order the micro-ops for execution.
    Type: Grant
    Filed: November 5, 1998
    Date of Patent: April 24, 2001
    Assignee: Intel Corporation
    Inventor: Michael J. Morrison
  • Patent number: 6219778
    Abstract: A processor including at least one execution unit generating out-of-order results and out-of-order condition codes. Precise architectural state of the processor is maintained by providing a results buffer having a number of slots and providing a condition code buffer having the same number of slots as the results buffer, each slot in the condition code buffer in one-to-one correspondence with a slot in the results buffer. Each live instruction in the processor is assigned a slot in the results buffer and the condition code buffer. Each speculative result produced by the execution units is stored in the assigned slot in the results buffer. When an instruction is retired, the results for that instruction are transferred to an architectural result register and any condition codes generated by that instruction are transferred to an architectural condition code register.
    Type: Grant
    Filed: December 20, 1999
    Date of Patent: April 17, 2001
    Assignee: Sun Microsystems, Inc.
    Inventors: Ramesh Panwar, Arjun Prabhu
  • Patent number: 6219775
    Abstract: A massively-parallel computer includes a plurality of processing nodes and at least one control node interconnected by a network. The network facilitates the transfer of data among the processing nodes and of commands from the control node to the processing nodes. Each processing node includes an interface for transmitting data over, and receiving data and commands from, the network, at least one memory module for storing data, a node processor and an auxiliary processor. The node processor receives commands received by the interface and processes data in response thereto, in the process generating memory access requests for facilitating the retrieval of data from or storage of data in the memory module. The node processor further controls the transfer of data over the network by the interface. The auxiliary processor is connected to the memory module and the node processor.
    Type: Grant
    Filed: March 18, 1998
    Date of Patent: April 17, 2001
    Assignee: Thinking Machines Corporation
    Inventors: Jon P. Wade, Daniel R. Cassiday, Robert D. Lordi, Guy Lewis Steele, Jr., Margaret A. St. Pierre, Monica C. Wong-Chan, Zahi S. Abuhamdeh, David C. Douglas, Mahesh N. Ganmukhi, Jeffrey V. Hill, W. Daniel Hillis, Scott J. Smith, Shaw-Wen Yang, Robert C. Zak, Jr.
  • Patent number: 6219777
    Abstract: Disclosed is a register file used in a multiprocessor composition composed of a plurality of processor elements, the register file having a plurality of words and being provided for each of the plurality of processor elements, wherein: the plurality of words are divided into a word part that can be simultaneously accessed by some of the plurality of processor elements to use in common with other processor elements, and a word part that can be accessed only by its own processor element.
    Type: Grant
    Filed: July 10, 1998
    Date of Patent: April 17, 2001
    Assignee: NEC Corporation
    Inventor: Toshiaki Inoue
  • Patent number: 6219723
    Abstract: A system and method for thermal overload detection and protection for a processor which allows the processor to run at near maximum potential for the vast majority of its execution life. This is effectuated by the provision of circuitry to detect when the processor has exceeded its thermal thresholds and which then causes the processor to automatically reduce the clock rate to a fraction of the nominal clock while execution continues. When the thermal condition has stabilized, the clock may be raised in a stepwise fashion back to the nominal clock rate. Throughout the period of cycling the clock frequency from nominal to minimum and back, the program continues to be executed. Also provided is a queue activity rise time detector and method to control the rate of acceleration of a functional unit from idle to full throttle by a localized stall mechanism at the boundary of each stage in the pipe.
    Type: Grant
    Filed: March 23, 1999
    Date of Patent: April 17, 2001
    Assignee: Sun Microsystems, Inc.
    Inventors: Ricky C. Hetherington, Ramesh Panwar
  • Patent number: 6219833
    Abstract: The compilation of source code to a primary and a secondary processor. The method relates to reconfigurable secondary processors, and is especially relevant to secondary processors which can be reconfigured to some degree during execution of code. Selective extraction of dataflows from the source code is followed by transformation of the extracted dataflows into trees. The trees are then matched against each other to determine minimum edit cost relationships for transformation of one tree into another, where these minimum edit cost relationships are determined by the architecture of the secondary processor. A group or a plurality of groups of dataflows is determined on the basis of said minimum edit cost relationships and for each group a generic dataflow capable of supporting each dataflow in that group is created.
    Type: Grant
    Filed: December 11, 1998
    Date of Patent: April 17, 2001
    Assignee: Hewlett-Packard Company
    Inventors: Charles Reed Solomon, Andrea Olgiati
  • Patent number: 6216220
    Abstract: The data processing system, a combination of a multithreaded architecture and a VLIW (Very Long Instruction Word) processor, is adapted to process plural threads. The system uses multiple program counters for context-switching only a subinstruction which causes a long latency. A method is provided for processing instructions in a data processing system having an active thread block, a ready thread block and a waiting thread block, and an instruction execution block, for processing a plurality of threads. The method includes combining instructions issued from the respective active threads into one new instruction, each active thread having a plurality of instructions, and the issued instructions being used as subinstructions in the combined one instruction. The combined instruction is processed by the instruction execution block, while tracing contexts relating to the threads which provide the respective subinstructions by using multiple program counters.
    Type: Grant
    Filed: October 8, 1998
    Date of Patent: April 10, 2001
    Assignee: Hyundai Electronics Industries Co., Ltd.
    Inventor: Myeong Eun Hwang
  • Patent number: 6216215
    Abstract: The present invention discloses a method and apparatus for implementing a senior load instruction type. An instruction requesting a memory reference is decoded. The decoded instruction is then dispatched to a memory ordering unit. The instruction is retired from a load buffer and is executed after retiring.
    Type: Grant
    Filed: April 2, 1998
    Date of Patent: April 10, 2001
    Assignee: Intel Corporation
    Inventors: Salvador Palanca, Shekoufeh Qawami, Niranjan L. Cooray, Angad Narang, Subramaniam Maiyuran
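
    Illustrative note on Patent 6247122 (listed above): the abstract describes a static predictor whose output is either used or inverted according to a one-bit history entry recording whether the static prediction agreed with the branch's last outcome. The Python sketch below is one possible reading of that scheme, not the patented implementation. The class name, the table size, indexing by program counter modulo the table size, the initial "agree" state, and the static rule (backward displacement predicted taken, forward predicted not taken, with the branch test type ignored) are all assumptions made for illustration.

```python
class AgreePredictor:
    """Toy model: a static prediction filtered by a per-entry 'agree' bit."""

    def __init__(self, entries=1024):
        # Hypothetical table size; every bit starts as "static prediction agreed".
        self.entries = entries
        self.agree = [True] * entries

    @staticmethod
    def static_prediction(displacement):
        # Assumed static rule: backward branches (negative displacement)
        # are predicted taken, forward branches not taken.
        return displacement < 0

    def _index(self, pc):
        # Assumed indexing: low bits of the branch address.
        return pc % self.entries

    def predict(self, pc, displacement):
        static = self.static_prediction(displacement)
        # Use the static prediction if it agreed last time, else its opposite.
        return static if self.agree[self._index(pc)] else not static

    def update(self, pc, displacement, taken):
        # Record whether the static prediction matched the actual outcome.
        self.agree[self._index(pc)] = (
            self.static_prediction(displacement) == taken
        )
```

    A caller would invoke predict() when the branch is fetched and update() once the branch resolves. Storing agreement rather than direction means two branches that share a table entry interfere only when their agreement with the static rule differs.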