Superscalar Patents (Class 712/23)
-
Patent number: 6006317
Abstract: An apparatus for performing speculative stores is provided. The apparatus reads the original data from a cache line being updated by a speculative store, storing the original data in a restore buffer. The speculative store data is then stored into the affected cache line. Should the speculative store later be canceled, the original data may be read from the restore buffer and stored into the affected cache line. The cache line is thereby returned to a pre-store state. In one embodiment, the cache is configured into banks. The data read and restored comprises the data from one of the banks which comprise the affected cache line. Instead of forwarding store data to subsequent load memory accesses, the store is speculatively performed to the data cache and the loads may subsequently access the data cache. Dependency checking between loads and stores prior to the speculative performance of the store may stall the load memory access until the corresponding store memory access has been performed.
Type: Grant
Filed: October 28, 1998
Date of Patent: December 21, 1999
Assignee: Advanced Micro Devices, Inc.
Inventors: H. S. Ramagopal, Rajiv M. Hattangadi
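The restore-buffer idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `SpeculativeCache`, the `store_id` keys, and the dictionary-based cache are hypothetical and are not taken from the patent, which describes hardware.

```python
class SpeculativeCache:
    """Toy model: stores are written immediately and can be undone later."""

    def __init__(self):
        self.lines = {}           # address -> current data (hypothetical layout)
        self.restore_buffer = {}  # store_id -> (address, original data)

    def speculative_store(self, store_id, address, data):
        # Read and save the original data before overwriting the cache line.
        self.restore_buffer[store_id] = (address, self.lines.get(address))
        self.lines[address] = data

    def commit(self, store_id):
        # The store is no longer speculative, so the saved copy is discarded.
        self.restore_buffer.pop(store_id)

    def cancel(self, store_id):
        # Write the saved data back, returning the line to its pre-store state.
        address, original = self.restore_buffer.pop(store_id)
        if original is None:
            self.lines.pop(address, None)
        else:
            self.lines[address] = original


cache = SpeculativeCache()
cache.lines[0x40] = "old"
cache.speculative_store(store_id=1, address=0x40, data="new")
cache.cancel(store_id=1)
```

In this model, several outstanding stores to the same address would have to be canceled in reverse order; the abstract does not say how the hardware handles that case.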
-
Patent number: 6006326
Abstract: A system for restraining over-eager boosting of load instructions past store instructions in an out-of-order processor. The system comprises a memory disambiguation buffer for storing load and store instruction addresses and associated data and an instruction scheduling window in operative association with the memory disambiguation buffer. The instruction scheduling window and the memory disambiguation buffer determine load/store dependencies and effectuate replay of the store and load instructions wherein a dependent load instruction has been executed prior to a store instruction. An instruction cache is provided in operative association with the memory disambiguation buffer, together to associate the dependent load instructions with a store instruction such that the store instruction is subsequently executed prior to the dependent load instructions.
Type: Grant
Filed: June 25, 1997
Date of Patent: December 21, 1999
Assignee: Sun Microsystems, Inc.
Inventors: Ramesh Panwar, Ricky C. Hetherington
-
Patent number: 6003126
Abstract: A method and system in a superscalar data processing system are disclosed for the temporary designation of a physical register as a particular general register. The data processing system is capable of processing multiple instructions during a single clock cycle. Physical registers are established. None of the physical registers are initially designated as a particular general register. No general registers exist which are initially designated as particular general registers. For each of the multiple instructions, a determination is made as to whether the instruction is a load register instruction. If the instruction is a load register instruction, a determination is made as to whether the instruction is associated with a logical register name. Each one of the logical register names identifies a different general register.
Type: Grant
Filed: July 1, 1997
Date of Patent: December 14, 1999
Assignee: International Business Machines
Inventors: Dieu Huynh, Wan L. Leung
-
Patent number: 6003128
Abstract: An apparatus for prediction of loop instructions. Loop instructions decrement the value in a counter register and branch to a target address (specified by an instruction operand) if the decremented value of the counter register is greater than zero. The apparatus comprises a loop detection unit that detects the presence of a loop instruction in the instruction stream. An indication of the loop instruction is conveyed to a reorder buffer which stores speculative register values. If the apparatus is not currently processing the loop instruction, a compare value corresponding to the counter register prior to execution of the loop instruction is conveyed to a loop prediction unit. The loop prediction unit also increments a counter value upon receiving each indication of the loop instruction. This counter value is then compared to the compare value conveyed from the reorder buffer.
Type: Grant
Filed: May 1, 1997
Date of Patent: December 14, 1999
Assignee: Advanced Micro Devices, Inc.
Inventor: Thang M. Tran
-
Patent number: 5996056
Abstract: An intermediate result signal arising from a manipulation of data signals is checked and reduced without using conditional branches, thereby improving instruction processing. Data signals are represented as signed 8-bit binary values in a two's complement format. This requires that the intermediate result signal be stored in a register that is more than 8 bits wide to allow for the proper checking of an overflow condition. A processor operating under program control performs the following operations. The program determines whether the intermediate result signal is in a positive overflow state or a negative overflow state. A first mask signal is set to have 8 lower bits in an OFF position when the intermediate result signal is inside the range of a signed 8 bit integer. Otherwise, the first mask signal is set to have 8 lower bits in an ON position.
Type: Grant
Filed: June 24, 1997
Date of Patent: November 30, 1999
Assignee: Sun Microsystems, Inc.
Inventor: Vladimir Y. Volkonsky
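A mask-based clamp of this kind can be sketched in a few lines. The abstract only describes the first mask, so everything else here is an assumption for illustration: the function name `saturate_int8`, the way the mask is derived, and the way the saturated value is chosen are not taken from the patent. The sketch also assumes the intermediate result fits in a signed 32-bit value.

```python
def saturate_int8(x):
    """Clamp x to the signed 8-bit range using only arithmetic and bit operations."""
    # t is 0 when x is in [-128, 127], positive on positive overflow,
    # negative on negative overflow.
    t = (x + 128) >> 8
    # (t | -t) is negative for any nonzero t, so the arithmetic shift yields
    # all ones; mask has its 8 lower bits ON only when x is out of range.
    mask = ((t | -t) >> 31) & 0xFF
    # sign has its 8 lower bits ON when x is negative.
    sign = (x >> 31) & 0xFF
    # 0x7F (+127) for positive overflow, 0x80 (-128) for negative overflow.
    saturated = 0x7F ^ sign
    low8 = ((x & 0xFF) & ~mask) | (saturated & mask)
    # Reinterpret the 8-bit pattern as a signed two's complement value.
    return (low8 ^ 0x80) - 0x80


clamped = [saturate_int8(v) for v in (100, 200, -200, -1)]
```

No `if` statement or comparison appears in the function, which is the property the abstract emphasizes; a real implementation would use fixed-width registers instead of Python's unbounded integers.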
-
Patent number: 5987596
Abstract: A register rename unit employs a rename map stack upon which a register rename map corresponding to each dispatched instruction is pushed. Upon occurrence of an exception, the register rename maps corresponding to instructions subsequent to the instruction experiencing the exception are popped from the stack. In this manner, the architected register to implemented register mapping consistent with the instruction experiencing the exception is restored. According to one embodiment, the rename map stack can be recovered from an exception in one clock cycle. In one particular implementation, the rename map stack comprises multiple independent stacks. Each independent stack corresponds to one of the architected registers, and stores implemented register specifiers corresponding to that architected register.
Type: Grant
Filed: May 12, 1999
Date of Patent: November 16, 1999
Assignee: Advanced Micro Devices, Inc.
Inventor: Wade A. Walker
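The push-on-dispatch, pop-on-exception behavior described here can be modeled with an ordinary list. This is a minimal sketch, assuming a single stack of complete maps; the class name `RenameMapStack`, the tag returned by `dispatch`, and the register names are hypothetical, and the per-register independent stacks mentioned in the abstract are not modeled.

```python
class RenameMapStack:
    """Toy model: one full rename map is pushed per dispatched instruction."""

    def __init__(self, initial_map):
        self.stack = [dict(initial_map)]

    def dispatch(self, arch_reg, phys_reg):
        # Copy the newest map, apply this instruction's rename, and push it.
        new_map = dict(self.stack[-1])
        new_map[arch_reg] = phys_reg
        self.stack.append(new_map)
        return len(self.stack) - 1  # tag identifying this instruction's map

    def recover(self, tag):
        # Pop every map pushed by instructions after the excepting one.
        del self.stack[tag + 1:]

    def current(self):
        return self.stack[-1]


maps = RenameMapStack({"r1": "p1"})
first = maps.dispatch("r1", "p5")
second = maps.dispatch("r2", "p6")
maps.recover(first)  # the instruction tagged `first` took an exception
```

After `recover(first)`, the top of the stack is the mapping that was in effect for the excepting instruction, so `r2` is no longer renamed.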
-
Patent number: 5987588
Abstract: A processor architecture is described which operates with improved computational efficiency using instruction fetching functions that are decoupled from instruction execution functions by a dynamic register file. The instruction fetching function operates in free-running mode which does not stop if a fetched instruction cannot be executed due to data being unavailable or due to other instruction dependencies. Branch instructions are taken in a predicted direction and the results of execution of all instructions are provisionally stored pending validation or invalidation on the basis of the dependencies becoming available later.
Type: Grant
Filed: August 28, 1998
Date of Patent: November 16, 1999
Assignee: Hyundai Electronics America, Inc.
Inventors: Valeri Popescu, Merle A. Schultz, Gary A. Gibson, John E. Spracklen, Bruce D. Lightner
-
Patent number: 5987587
Abstract: The present invention relates to a multiprocessor which has several microprocessors on a single chip. Efficiency is improved by stripping certain functions that are used less frequently from the microprocessor and sharing these functions between several symmetric microprocessors. This method allows each CPU to occupy a smaller area while preserving complete symmetry of capability for software simplification. For example, the shared execution units can include the floating point unit and multimedia execution units.
Type: Grant
Filed: June 6, 1997
Date of Patent: November 16, 1999
Assignee: International Business Machines Corporation
Inventor: David Meltzer
-
Patent number: 5987561
Abstract: A superscalar microprocessor employing a data cache configured to perform store accesses in a single clock cycle is provided. The superscalar microprocessor speculatively stores data within a predicted way of the data cache after capturing the data currently being stored in that predicted way. During a subsequent clock cycle, the cache hit information for the store access validates the way prediction. If the way prediction is correct, then the store is complete, utilizing a single clock cycle of data cache bandwidth. Additionally, the way prediction structure implemented within the data cache bypasses the tag comparisons of the data cache to select data bytes for the output. Therefore, the access time of the associative data cache may be substantially similar to a direct-mapped cache access time. The superscalar microprocessor may therefore be capable of high frequency operation.
Type: Grant
Filed: June 3, 1997
Date of Patent: November 16, 1999
Assignee: Advanced Micro Devices, Inc.
Inventors: David B. Witt, Rajiv M. Hattangadi
-
Patent number: 5987594
Abstract: A processor that executes coded instructions using an instruction scheduling unit receiving the coded instructions and issuing an instruction for execution. A replay signaling device generates a signal indicating when the instruction failed to execute properly within a predetermined time. A replay device within the instruction scheduling unit responsive to the signaling device then reissues the instruction for execution.
Type: Grant
Filed: June 25, 1997
Date of Patent: November 16, 1999
Assignee: Sun Microsystems, Inc.
Inventors: Ramesh Panwar, Ricky C. Hetherington
-
Patent number: 5983336
Abstract: An unpacking circuit and operating method in a very long instruction word (VLIW) processor provides for parallel handling of a packed wide instruction in which a packed wide instruction is divided into groups of syllables. An unpacked instruction representation includes a plurality of syllables, which generally correspond to operations for execution by an execution unit. The syllables in the unpacked instruction representation are assigned to groups. The packed instruction word includes a sequence of syllables and a header. The header includes a descriptor for each group. The descriptor includes a mask and may include a displacement designator. The multiple groups are handled in parallel as the displacement designator identifies a starting syllable. The mask designates the syllables which are transferred from the packed instruction to the unpacked representation and identifies the position of NOPs in the unpacked representation.
Type: Grant
Filed: October 18, 1996
Date of Patent: November 9, 1999
Assignee: Elbrush International Limited
Inventors: Yuli Kh. Sakhin, Alexander M. Artyomov, Alexey P. Lizorkin, Vladimir V. Rudometov, Leonid N. Nazarov
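The role of the mask and displacement designator can be shown with a short software sketch for one group. The details are assumptions made for illustration: the function name `unpack_group`, the use of strings for syllables, and the convention that bit `i` of the mask (least significant bit first) controls slot `i` are not specified in the abstract.

```python
NOP = "nop"


def unpack_group(packed, start, mask, width):
    """Expand one group of a packed instruction into `width` slots.

    packed: sequence of syllables in the packed instruction word
    start:  index of the group's first syllable (displacement designator)
    mask:   bit i set means slot i takes the next packed syllable,
            bit i clear means slot i holds a NOP (assumed bit order)
    """
    unpacked = []
    next_syllable = start
    for slot in range(width):
        if (mask >> slot) & 1:
            unpacked.append(packed[next_syllable])
            next_syllable += 1
        else:
            unpacked.append(NOP)
    return unpacked


packed_word = ["add", "mul", "ld"]
group = unpack_group(packed_word, start=1, mask=0b0101, width=4)
```

Because each group carries its own starting index, the groups can be expanded independently of one another, which is what allows the parallel handling the abstract describes.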
-
Patent number: 5983334
Abstract: A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instruction bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes are then passed to an align latch where they are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions.
Type: Grant
Filed: January 16, 1997
Date of Patent: November 9, 1999
Assignee: Seiko Epson Corporation
Inventors: Brett Coon, Yoshiyuki Miyayama, Le Trong Nguyen, Johannes Wang
-
Patent number: 5983335
Abstract: Computer system with multiple, out-of-order, instruction issuing system suitable for superscalar processors with a RISC organization, also has a Fast Dispatch Stack (FDS), a dynamic instruction scheduling system that may issue multiple, out-of-order, instructions each cycle to functional units as dependencies allow. The basic issuing mechanism supports a short cycle time and its capabilities are augmented. Condition code dependent instructions issue in multiples and out-of-order. A fast register renaming scheme is presented. An instruction squashing technique enables fast precise interrupts and branch prediction. Instructions preceding and following one or more predicted conditional branch instructions may issue out-of-order and concurrently. The effects of executed instructions following an incorrectly predicted branch instruction or an instruction that causes a precise interrupt are undone in one machine cycle.
Type: Grant
Filed: April 30, 1997
Date of Patent: November 9, 1999
Assignee: International Business Machines Corporation
Inventor: Harry Dwyer, III
-
Patent number: 5978900
Abstract: A microprocessor capable of renaming a numeric register and a segment register includes a plurality of general registers and a data dependency unit. The data dependency unit is configured to receive instructions to be executed, wherein the instructions include accessing the numeric register and accessing the segment register. The data dependency unit renames the numeric register as one of the plurality of general registers for each of the instructions accessing said numeric register, renames the segment register as one of the plurality of general registers for each of the instructions accessing the segment register, and generates a dependency vector for each of the instructions. The microprocessor may include a scheduler configured to receive the instructions and dependency vector and schedule the instructions for execution based on the dependency vector, and an execution engine adapted to receive the instructions from the scheduler and execute the instructions.
Type: Grant
Filed: December 30, 1996
Date of Patent: November 2, 1999
Assignee: Intel Corporation
Inventors: Kin-Yip Liu, Gary Hammond, Kenneth Shoemaker, Anand Pai
-
Patent number: 5978896
Abstract: A method and system for increased instruction dispatch efficiency in a superscalar processor system having an instruction queue for receiving a group of instructions in an application specified sequential order and an instruction dispatch unit for dispatching instructions from an associated instruction buffer to multiple execution units on an opportunistic basis. The dispatch status of instructions within the associated instruction buffer is periodically determined and, in response to a dispatch of the instructions at the beginning of the instruction buffer, the remaining instructions are shifted within the instruction buffer in the application specified sequential order and a partial group of instructions are loaded into the instruction buffer from the instruction queue utilizing a selectively controlled multiplex circuit. In this manner additional instructions may be dispatched to available execution units without requiring a previous group of instructions to be dispatched completely.
Type: Grant
Filed: August 12, 1994
Date of Patent: November 2, 1999
Assignee: International Business Machines Corporation
Inventors: James Allan Kahle, Chin-Cheng Kau, David Steven Levitan, Aubrey Deene Ogden
-
Patent number: 5978875
Abstract: A continuous data server includes a storage unit connected to a buffer memory which is in turn connected to a plurality of communication control units which transfer data of the buffer memory to a network. The right to use a bus interconnecting the buffer memory and communication control units is deterministically assigned by a micro-scheduler in accordance with a program stored in a micro-schedule table. The micro-scheduler allocates the right to use the bus in accordance with a predetermined schedule, rather than by arbitration.
Type: Grant
Filed: March 17, 1997
Date of Patent: November 2, 1999
Assignee: Kabushiki Kaisha Toshiba
Inventors: Shigehiro Asano, Masaki Suzuki
-
Patent number: 5978901
Abstract: A superscalar microprocessor includes a combination floating point and multimedia unit. The floating point and multimedia unit includes one set of registers. The multimedia core and floating point core share the one set of registers. Each register has a type field associated with the register. The type field identifies whether the associated register contains valid data and whether the data is of multimedia type or floating point type. If the register stores floating point type data, the type field further indicates which of a plurality of floating point types the register stores such as: zero, infinity and normal. The floating point core relies on the type field to identify special floating point numbers such as zero and infinity. To ensure predictable results when a floating point instruction is executed subsequent to a multimedia instruction, a retyping algorithm retypes registers typed as multimedia type when the first floating point instruction subsequent to a multimedia instruction is executed.
Type: Grant
Filed: August 21, 1997
Date of Patent: November 2, 1999
Assignee: Advanced Micro Devices, Inc.
Inventors: Mark R. Luedtke, Paul K. Miller, Chris N. Hinds, Ashraf Ahmed
-
Patent number: 5974526
Abstract: A register renaming system for out-of-order execution of a set of reduced instruction set computer instructions having addressable source and destination register fields, adapted for use in a computer having an instruction execution unit with a register file accessed by read address ports and for storing instruction operands. A data dependence check circuit is included for determining data dependencies between the instructions. A tag assignment circuit generates one or more tags to specify the location of operands, based on the data dependencies determined by the data dependence check circuit. A set of register file port multiplexers select the tags generated by the tag assignment circuit and pass the tags onto the read address ports of the register file for storing execution results.
Type: Grant
Filed: December 15, 1997
Date of Patent: October 26, 1999
Assignee: Seiko Corporation
Inventors: Sanjiv Garg, Kevin Ray Iadonato, Le Trong Nguyen, Johannes Wang
-
Patent number: 5974524
Abstract: According to one aspect of the invention, a method is provided for maintaining the state of a processor having a plurality of physical registers and a rename register map which stores rename pairs that associate architected and physical registers, the rename register map having a plurality of entries which are associated with the physical registers, individual entries having an architected register field, an architected status bit and a history status bit.
Type: Grant
Filed: October 28, 1997
Date of Patent: October 26, 1999
Assignee: International Business Machines Corporation
Inventors: Hoichi Cheong, Paul Joseph Jordan, Quan Nguyen, Hung Qui Le
-
Patent number: 5974525
Abstract: A technique for increasing the number of physical segment registers by renaming logical segment registers into a larger register space. The remapping of the segment registers allows for instructions accessing the segment registers to be executed non-serially. The renaming of segment registers is achieved by assigning a shadow register to a segment register name. Thus, a pair of registers are physically available for a specified logical register in an instruction set to be renamed. Two bits, designated as the PSEG and SPEC bits, are used to control the remapping.
Type: Grant
Filed: December 5, 1997
Date of Patent: October 26, 1999
Assignee: Intel Corporation
Inventors: Derrick Chu Lin, Ramamohan Rao Vakkalagadda, Satchitanand Jain, Varsha P. Tagare, Nimish H. Modi
-
Patent number: 5974523
Abstract: A mechanism for efficiently overlapping multiple operand types is used in a microprocessor which includes a plurality of execution units and a mechanism to provide operations, which include one or more operands, to the plurality of execution units. Each of the plurality of execution units interprets the one or more operands as different types of operands, and the mechanism to provide operations overlaps the different types of operands.
Type: Grant
Filed: September 6, 1996
Date of Patent: October 26, 1999
Assignee: Intel Corporation
Inventors: Andrew F. Glew, Darrell D. Boggs, Michael A. Fetterman, Glenn J. Hinton, Robert P. Colwell, David B. Papworth
-
Patent number: 5974522
Abstract: A processor having multiple functional units. The processor is capable of executing multiple instructions concurrently. An instruction issuing unit is connected to a mechanism for handling an interrupt of the processor. The interrupt handler has an instruction window (IW), which includes a vector element number (VEN) field that indicates the uncompleted elements to be executed. Upon termination of the interrupt, normal processing of the instruction issuing unit continues.
Type: Grant
Filed: March 9, 1993
Date of Patent: October 26, 1999
Assignee: Cornell Research Foundation, Inc.
Inventors: Hwa C. Torng, Martin Day
-
Patent number: 5968160
Abstract: A data processing system having flexibility coping with parallelism of a program comprises a plurality of processor elements for executing instructions, a main memory shared by the plurality of processor elements, and a plurality of parallel operation control facilities for enabling the plurality of processor elements to operate in synchronism. The plurality of parallel operation control facilities are provided in correspondence to the plurality of processor elements, respectively. The data processing system further comprises a multiprocessor operation control facility for enabling the plurality of processor elements to operate independently, and a flag for holding a value indicating which of the parallel operation mode or the multiprocessor mode is to be activated. The shared cache memory is implemented in a blank instruction and controlled by a cache controller so that inconsistency of the data stored in the cache memory is eliminated.
Type: Grant
Filed: September 4, 1997
Date of Patent: October 19, 1999
Assignee: Hitachi, Ltd.
Inventors: Masahiko Saito, Kenichi Kurosawa, Yoshiki Kobayashi, Tadaaki Bandoh, Masahiro Iwamura, Takashi Hotta, Yasuhiro Nakatsuka, Shigeya Tanaka, Takeshi Takemoto
-
Patent number: 5964866
Abstract: The invention relates to a processor having a data flow unit for processing data in a plurality of steps. In one version, the data flow unit includes a plurality of consecutive stages which include logic for performing steps of the data processing, the stages being coupled together by a data path, at least one stage being coupled to a transceiver which causes data to be provided to the stage for processing or to bypass the stage unprocessed in response to a stage enable signal; a synchronizer which receives processed data from the stages and causes the processed data to be provided to external logic in synchronization with a clock signal.
Type: Grant
Filed: October 24, 1996
Date of Patent: October 12, 1999
Assignee: International Business Machines Corporation
Inventors: Christopher McCall Durham, Peter Juergen Klim
-
Patent number: 5964861
Abstract: A method for designing a processor. The method utilises the full flexibility of an original instruction set in writing programs for operation of the processor; the subset of instruction words used in writing the program is then used in defining the instruction decoder of the processor.
Type: Grant
Filed: December 17, 1996
Date of Patent: October 12, 1999
Assignee: Nokia Mobile Phones Limited
Inventors: Rebecca Gabzdyl, Brian McGovern
-
Patent number: 5964862
Abstract: A CPU (central processing unit) of a computer. The CPU comprises a dispatch controller, a pipeline, a working register file, and an architectural register file. The dispatch controller dispatches instructions for execution and determines whether the dispatched instructions are valid or invalid. The pipeline executes the dispatched instructions using selected operands in the pipeline and generates operands in response. The working register file stores the generated operands before the executed instructions are determined to be valid or invalid by the dispatch controller such that the stored operands may be subsequently selected for use in executing an instruction in the pipeline. The architectural register file stores the generated operands for those of the executed instructions that are determined to be valid by the dispatch controller and transfers operands currently stored therein when one of the executed instructions is determined to be invalid by the dispatch logic.
Type: Grant
Filed: June 30, 1997
Date of Patent: October 12, 1999
Assignee: Sun Microsystems, Inc.
Inventors: Arthur T. Leung, Gary R. Lauterbach
-
Patent number: 5961630
Abstract: A method for handling dynamic structural hazards and exceptions by using post-ready latency, including: receiving a plurality of instructions; selecting a first instruction whose execution can cause an exception; assigning a post-ready latency to a second instruction that follows the first instruction; and scheduling for execution the first instruction and the second instruction separated from the first instruction by an amount of time at least equal to the post-ready latency of the second instruction.
Type: Grant
Filed: December 30, 1997
Date of Patent: October 5, 1999
Assignee: Intel Corporation
Inventors: Nazar A. Zaidi, Michael J. Morrison, Elango Ganesan
-
Patent number: 5961629
Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instructions in-order.
Type: Grant
Filed: September 10, 1998
Date of Patent: October 5, 1999
Assignee: Seiko Epson Corporation
Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
-
Patent number: 5958043
Abstract: A multiple instruction parallel issue/execution management system including a forward map buffer for storing forward map information indicating whether or not the result value generated by execution of a given instruction is used as an input operand in other instructions. The forward map buffer previously stores the forward map information for the result value, before the result value corresponding to the given instruction is actually generated, and when the result value corresponding to the given instruction is actually generated, the operands using the result value are specified by using the previously stored forward map information corresponding to the result value, and supplied to an instruction using the result value as an input operand.
Type: Grant
Filed: August 21, 1997
Date of Patent: September 28, 1999
Assignee: NEC Corporation
Inventor: Masato Motomura
-
Patent number: 5954814
Abstract: A microprocessor includes an instruction fetch unit, a branch prediction unit, and a decode unit. The instruction fetch unit is adapted to retrieve a plurality of program instructions. The program instructions include serialization initiating instructions and branch instructions. The branch prediction unit is adapted to generate branch predictions for the branch instructions, direct the instruction fetch unit to retrieve the program instructions in an order corresponding to the branch predictions, and redirect the instruction fetch unit based on a branch misprediction. The branch prediction unit is further adapted to store a redirect address corresponding to the branch misprediction. The decode unit is adapted to decode the program instructions into microcode.
Type: Grant
Filed: December 19, 1997
Date of Patent: September 21, 1999
Assignee: Intel Corporation
Inventors: Nazar A. Zaidi, Deepak J. Aatresh, Michael J. Morrison
-
Patent number: 5951689
Abstract: A power control system for a microprocessor, having multiple parallel operated execution units, functions to disable some of the execution units to conserve power and/or reduce heat. The execution units are disabled by preventing the application of clock pulses to these execution units. This operation is effected by a power control unit which enables and disables gates coupled between a source of clock signals and the execution units.
Type: Grant
Filed: December 31, 1996
Date of Patent: September 14, 1999
Assignee: VLSI Technology, Inc.
Inventors: David R. Evoy, Desi Rhoden
-
Patent number: 5951670
Abstract: A processor for executing a plurality of instructions. The processor comprises a plurality of logical segment registers, wherein the logical segment registers define an architectural state for memory segmentation of the processor. A plurality of physical segment registers are coupled to the logical segment registers. The processor further comprises an issue cluster that issues the instructions and that maps the logical segment registers, specified by the operations, to the physical segment registers to provide segment register renaming in the processor.
Type: Grant
Filed: September 4, 1997
Date of Patent: September 14, 1999
Assignee: Intel Corporation
Inventors: Andrew F. Glew, Michael A. Fetterman
-
Patent number: 5951671
Abstract: A multiprocessor system capable of sharing instruction predecode information is disclosed. By storing predecode information as it is calculated, and then allowing other processors in the system to access the information, subsequent prefetches of instructions are made without repeating predecode calculations. The multiprocessor system may comprise a bus connecting at least two microprocessors together. The microprocessors may be configured to generate predecode information for a plurality of instructions and then share the predecode information with other microprocessors coupled to the bus. The predecode information may be stored in a single storage location or in multiple locations, and the information may be stored internally within the microprocessors or externally. The microprocessors in the system may be configured to search for predecode information corresponding to instructions being accessed.
Type: Grant
Filed: December 18, 1997
Date of Patent: September 14, 1999
Assignee: Advanced Micro Devices, Inc.
Inventor: Thomas S. Green
-
Instruction alignment unit employing dual instruction queues for high frequency instruction dispatch
Patent number: 5951675
Abstract: A microprocessor includes an instruction alignment unit for locating instructions and conveying the located instructions to a set of decode units. The instruction alignment unit includes dual instruction queues. The first instruction queue receives instruction blocks fetched from the instruction cache. The instruction alignment unit uses instruction identification information provided by the instruction cache to select instructions from the first instruction queue for conveyance to the second instruction queue. Additionally, the instruction alignment unit applies predetermined selection criteria to the instructions within the second instruction queue in order to select instructions for dispatch to the decode units. Selection logic for the first instruction queue need not consider the type of instruction, etc., in selecting instructions for conveyance to the second instruction queue. Selection logic for the second instruction queue considers instruction type, etc.
Type: Grant
Filed: October 27, 1998
Date of Patent: September 14, 1999
Assignee: Advanced Micro Devices, Inc.
Inventors: Rammohan Narayan, Venkateswara Rao Madduri
-
Patent number: 5948097Abstract: A method and apparatus for performing a system call in a system having a user privilege level and a kernel privilege level, wherein the kernel privilege level is higher than the user privilege level is disclosed. A sequence of instructions is executed at the user privilege level including a first instruction that requires a resource provided at the kernel privilege level. Control is transferred to a first procedure executing at the user privilege level by performing a near call and saving only a pointer to the first instruction. The first procedure includes a calling instruction that does not save an architectural state prior to transferring control. Control is transferred from the first procedure to a second procedure executing at the kernel privilege level. The second procedure determines the resource required by the first instruction. Control is transferred from the second procedure to a third procedure that is determined by the second procedure.Type: GrantFiled: August 29, 1996Date of Patent: September 7, 1999Assignee: Intel CorporationInventors: Andrew Glew, Scott Dion Rodgers
-
Patent number: 5948106Abstract: A system and method for thermal overload detection and protection for a processor which allows the processor to run at near maximum potential for the vast majority of its execution life. This is effectuated by the provision of circuitry to detect when the processor has exceeded its thermal thresholds and which then causes the processor to automatically reduce the clock rate to a fraction of the nominal clock while execution continues. When the thermal condition has stabilized, the clock may be raised in a stepwise fashion back to the nominal clock rate. Throughout the period of cycling the clock frequency from nominal to minimum and back, the program continues to be executed. Also provided is a queue activity rise time detector and method to control the rate of acceleration of a functional unit from idle to full throttle by a localized stall mechanism at the boundary of each stage in the pipe.Type: GrantFiled: June 25, 1997Date of Patent: September 7, 1999Assignee: Sun Microsystems, Inc.Inventors: Ricky C. Hetherington, Ramesh Panwar
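The throttling behavior described in this abstract can be illustrated with a minimal Python sketch. The clock fractions, the threshold value, and the names `CLOCK_STEPS`, `next_step`, and `simulate` are assumptions made for illustration; the abstract gives no concrete values and does not describe the detection circuitry.

```python
# Hypothetical clock levels, as fractions of the nominal rate.
# The abstract does not state how many steps exist or their values.
CLOCK_STEPS = [0.25, 0.5, 0.75, 1.0]


def next_step(step: int, over_threshold: bool) -> int:
    """Return the next clock level index.

    On a thermal overload the clock drops straight to the minimum
    fraction; otherwise it climbs one step at a time toward nominal.
    """
    if over_threshold:
        return 0
    return min(step + 1, len(CLOCK_STEPS) - 1)


def simulate(readings, threshold):
    """Map a series of temperature readings to the clock fraction in use."""
    step = len(CLOCK_STEPS) - 1  # start at the nominal clock rate
    trace = []
    for temperature in readings:
        step = next_step(step, temperature > threshold)
        trace.append(CLOCK_STEPS[step])
    return trace


# One overload (95 > 90) followed by a stepwise recovery.
trace = simulate([70, 95, 80, 78, 76, 75], threshold=90)
```

Execution is never halted in this model: every reading yields a nonzero clock fraction, which mirrors the abstract's statement that the program continues to run throughout the cycle.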
-
Patent number: 5944811Abstract: In a superscalar processor for fetching a prescribed peak number of instructions in parallel in each period until such instructions are fetched to a predetermined peak number, such as ten, an instruction parallel issue and execution administrating device comprises a forward map buffer for a forward map indicative of a result of each instruction for use as an operand by which one of other instructions of the predetermined peak number. The forward map is developed before the result is actually produced and is used, after the actual production, to indicate which one of such results should be used as the operand by the above-mentioned one of the other instructions.Type: GrantFiled: August 29, 1997Date of Patent: August 31, 1999Assignee: NEC CorporationInventor: Masato Motomura
-
Patent number: 5944810Abstract: In a superscalar processor, multiple instructions are executed in parallel to obtain multiple execution results, and the multiple execution results are stored in a working register file. Each execution result in the working register file has at least one status bit associated therewith which identifies the execution result as valid data. The multiple execution results contained in the working register file are then retired by changing the status bits associated with each execution result to identify the execution result as an architectural copy of the data. In this manner, the speculative data is retired without data movement of the speculative data, thus reducing a number of ports needed in the superscalar processor.Type: GrantFiled: June 27, 1997Date of Patent: August 31, 1999Assignee: Sun Microsystems, Inc.Inventor: Rajasekhar Cherabuddi
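A small Python sketch may help show why retirement by status change needs no data movement. The class and state names below are hypothetical, and the `FREE` state is an assumption added so that unused entries have a defined status; the abstract only mentions marking results as valid and then as architectural.

```python
from enum import Enum


class Status(Enum):
    FREE = 0           # assumed state for entries that hold no result
    VALID = 1          # speculative execution result
    ARCHITECTURAL = 2  # retired, architecturally visible copy


class WorkingRegisterFile:
    """Toy model: retirement flips status bits and never copies data."""

    def __init__(self, size: int):
        self.data = [0] * size
        self.status = [Status.FREE] * size

    def write_result(self, index: int, value: int) -> None:
        """Store a speculative execution result and mark it valid."""
        self.data[index] = value
        self.status[index] = Status.VALID

    def retire(self, indices) -> None:
        """Retire several results at once by changing only their status."""
        for index in indices:
            if self.status[index] is not Status.VALID:
                raise ValueError(f"entry {index} holds no valid result")
            self.status[index] = Status.ARCHITECTURAL


wrf = WorkingRegisterFile(4)
wrf.write_result(0, 11)
wrf.write_result(2, 22)
wrf.retire([0, 2])  # data stays where it was written
```

Because `retire` touches only `status`, the model needs no read or write of the data array at retirement, which is the port saving the abstract points to.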
-
Patent number: 5944812Abstract: A register rename unit employs a rename map stack upon which a register rename map corresponding to each dispatched instruction is pushed. Upon occurrence of an exception, the register rename maps corresponding to instructions subsequent to the instruction experiencing the exception are popped from the stack. In this manner, the architected register to implemented register mapping consistent with the instruction experiencing the exception is restored. According to one embodiment, the rename map stack can be recovered from an exception in one clock cycle. In one particular implementation, the rename map stack comprises multiple independent stacks. Each independent stack corresponds to one of the architected registers, and stores implemented register specifiers corresponding to that architected register.Type: GrantFiled: December 10, 1998Date of Patent: August 31, 1999Assignee: Advanced Micro Devices, Inc.Inventor: Wade A. Walker
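The push-and-pop recovery scheme in this abstract can be sketched in Python as follows. This is a minimal model under stated assumptions: the class name `RenameMapStack`, its methods, and the use of a full map copy per instruction are illustrative choices, not the patented hardware organization (which, in one implementation, uses an independent stack per architected register).

```python
class RenameMapStack:
    """Toy rename map stack: one map snapshot per dispatched instruction."""

    def __init__(self, initial_map):
        # Entry 0 holds the mapping that existed before any dispatch.
        self._stack = [dict(initial_map)]

    def dispatch(self, arch_reg: str, phys_reg: str) -> int:
        """Push the map that results from renaming arch_reg to phys_reg.

        Returns a tag (the stack position) identifying the instruction.
        """
        new_map = dict(self._stack[-1])
        new_map[arch_reg] = phys_reg
        self._stack.append(new_map)
        return len(self._stack) - 1

    def recover(self, tag: int):
        """Pop the maps of every instruction after the excepting one."""
        del self._stack[tag + 1:]
        return self._stack[-1]


stack = RenameMapStack({"r1": "p1", "r2": "p2"})
first = stack.dispatch("r1", "p3")
stack.dispatch("r2", "p4")
stack.dispatch("r1", "p5")
restored = stack.recover(first)  # exception on the first instruction
```

After `recover(first)`, the top of the stack is the map pushed for the excepting instruction, so `restored` maps `r1` to `p3` and `r2` to `p2`.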
-
Patent number: 5944816Abstract: A microprocessor including a context file configured to store multiple contexts is provided. The microprocessor may execute multiple threads, each thread having its own context within the microprocessor. In one embodiment, the present microprocessor is capable of executing at least two threads concurrently: a task and an interrupt service routine. Interrupt service routines may be executed without disturbing a task's context and without performing a context save operation. Instead, the interrupt service routine accesses a context which is independent of the context of the task. In another embodiment, the context file includes multiple interrupt service routine contexts. Multiple ISR context storages allow for nested interrupts to be performed concurrently. In yet another embodiment, the microprocessor is configured to execute multiple tasks and multiple interrupt service routines concurrently.Type: GrantFiled: May 17, 1996Date of Patent: August 31, 1999Assignee: Advanced Micro Devices, Inc.Inventors: Drew J. Dutton, David S. Christie, Brian C. Barnes
-
Patent number: 5941983Abstract: A method for executing instructions out-of-order to improve performance of a processor includes compiling the instructions of a program into separate queues along with encoded dependencies between instructions in the different queues. The processor then issues instructions from each of these queues independently, except that it enforces the encoded dependencies among instructions from different queues. If an instruction is dependent on instructions in other queues, the processor waits to issue it until the instructions on which it depends are issued. The processor includes a stall unit, comprised of a number of instruction counters for each queue, that enforces the dependencies between instructions in different queues.Type: GrantFiled: June 24, 1997Date of Patent: August 24, 1999Assignee: Hewlett-Packard CompanyInventors: Rajiv Gupta, William S. Worley, Jr.
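As a rough illustration of issuing from several queues while enforcing encoded cross-queue dependencies, consider the Python sketch below. The dependency encoding used here (for each other queue, the number of its instructions that must already have issued) is an assumption chosen to match the abstract's mention of per-queue instruction counters; the function name `issue_order` and the round-robin sweep are also illustrative and not cycle-accurate.

```python
def issue_order(queues):
    """Issue instructions from independent in-order queues.

    Each queue is a list of (name, deps) pairs, where deps maps another
    queue's index to how many instructions that queue must have issued
    before this instruction may issue (a hypothetical encoding).
    """
    issued = [0] * len(queues)  # one issue counter per queue
    order = []
    progress = True
    while progress:
        progress = False
        for q, queue in enumerate(queues):
            if issued[q] == len(queue):
                continue  # this queue is drained
            name, deps = queue[issued[q]]
            # Stall the head of this queue until its dependencies are met.
            if all(issued[other] >= count for other, count in deps.items()):
                order.append(name)
                issued[q] += 1
                progress = True
    if any(issued[q] < len(queue) for q, queue in enumerate(queues)):
        raise RuntimeError("encoded dependencies can never be satisfied")
    return order


queue0 = [("a", {}), ("b", {1: 1})]      # b waits for one issue from queue 1
queue1 = [("c", {0: 1}), ("d", {0: 2})]  # d waits for two issues from queue 0
order = issue_order([queue0, queue1])
```

Within each queue the order stays fixed; only the head of a queue can stall, which is what lets the counters stand in for a full dependency check.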
-
Patent number: 5941977Abstract: In a processor speculatively executing instructions which specify logical addresses, a method and apparatus for speculatively converting logical addresses to physical addresses. The processor has a register window movable within a register file, a window pointer register maintaining a value corresponding to the location of the window in the register file, a speculative window pointer register maintaining a speculative value of the window pointer register. A controller identifies an instruction expected to modify the value in the window pointer register, and in response to identifying the instruction the controller modifies the speculative value. A mapper, coupled to the speculative window pointer register, converts the instruction specified logical addresses to physical addresses based on the speculative value contained in the speculative window pointer register.Type: GrantFiled: June 25, 1997Date of Patent: August 24, 1999Assignee: Sun Microsystems, Inc.Inventors: Ramesh Panwar, Dani Y. Dakhil
-
Patent number: 5941980Abstract: A process is provided for determining the beginning and ending of each instruction of a variable length instruction. Data lines are stored in a first memory area which illustratively is an instruction cache. Each data line comprises a sequence of data words that are stored at sequential addresses in a main memory. The data lines contain multiple encoded variable length instructions that are contiguously stored in the main memory. Multiple indicators are stored in a second memory area, including one indicator associated with each data word of the data lines stored in the first memory area. Each indicator indicates whether or not its associated data word is the initial data word of a variable length instruction. A sequence of data words may be fetched from the cache. The fetched sequence of data words includes a starting data word and at least the number of data words in the longest permissible instruction. Plural indicators (i.e.Type: GrantFiled: February 27, 1997Date of Patent: August 24, 1999Assignee: Industrial Technology Research InstituteInventors: Shi-Sheng Shang, Dze-Chaung Wang
-
Patent number: 5941984Abstract: A VLIW microprocessor in which bypaths for transferring data among pipelines are incorporated between a plurality of execution units such as a memory access unit and an integer operation unit. The data on the bypaths is directly transferred to target units according to a control signal generated by a bypath processing control circuit.Type: GrantFiled: May 16, 1997Date of Patent: August 24, 1999Assignee: Mitsubishi Denki Kabushiki KaishaInventors: Atsushi Mohri, Akira Yamada, Toyohiko Yoshida
-
Patent number: 5938760Abstract: A performance monitor implementing a plurality of counters counts several events to provide an instruction fetch bandwidth analysis, a cycles per instruction (CPI) infinite and finite analysis, an operand fetch bandwidth analysis, an instruction parallelism analysis, and a trailing edge analysis. Such analyses are performed on the performance of a data processing system in order that the designer may develop an improved processor architecture.Type: GrantFiled: December 17, 1996Date of Patent: August 17, 1999Assignee: International Business Machines CorporationInventors: Frank Eliot Levine, Roy Stuart Moore, Charles Philip Roth, Edward Hugh Welbon
-
Patent number: 5938756Abstract: The integer execution unit (IEU) of a central processing unit (CPU) is provided with a graphics status register (GSR) for storing a graphics data scaling factor and a graphics data alignment address offset. Additionally, the CPU is provided with a graphics execution unit (GRU) for executing a number of graphics operations in accordance with the graphics data scaling factor and alignment address offset, the graphics data having a number of graphics data formats. In one embodiment, the GRU is also used to execute a number of graphics data addition, subtraction, rounding, expansion, merge, alignment, multiplication, logical, compare, and pixel distance operations. The graphics data operations are categorized into a first and a second category, and the GRU concurrently executes one graphics operation from each category.Type: GrantFiled: April 19, 1996Date of Patent: August 17, 1999Assignee: Sun Microsystems, Inc.Inventors: Timothy J. Van Hook, Leslie Dean Kohn, Robert Yung
-
Patent number: 5935239Abstract: A mask decoder circuit is provided. The mask decoder circuit receives an input value indicative of one of a plurality of masks. The mask decoder circuit independently and in parallel processes portions of the input value to produce a submask (containing the portion of the output mask in which a transition from binary zeros to binary ones occurs) and to select either the submask, binary zeros, or binary ones for each of a plurality of regions within an output mask. The region receiving the submask is identified by the portion of the input value not processed to produce the submask. Other regions are filled with either binary zeros or binary ones according to the desired output mask.Type: GrantFiled: July 17, 1998Date of Patent: August 10, 1999Assignee: Advanced Micro Devices, Inc.Inventor: Rammohan Narayan
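The region-based decoding idea can be sketched in Python under some loudly stated assumptions: a 16-bit output mask split into four 4-bit regions, an input value that names the bit position where the ones begin, and bit 0 listed first. None of these widths or orientations come from the abstract, and `decode_mask` is a hypothetical name.

```python
REGION_BITS = 4  # assumed width of each region
NUM_REGIONS = 4  # assumed number of regions (16-bit output mask)


def decode_mask(value: int):
    """Build a mask of zeros followed by ones, starting the ones at `value`.

    The low part of the input produces the submask containing the
    zero-to-one transition; the high part, handled independently,
    selects which region receives that submask.
    """
    low = value % REGION_BITS    # portion that shapes the submask
    high = value // REGION_BITS  # portion that picks the region
    submask = [1 if bit >= low else 0 for bit in range(REGION_BITS)]

    mask = []
    for region in range(NUM_REGIONS):
        if region < high:
            mask += [0] * REGION_BITS  # before the transition: all zeros
        elif region == high:
            mask += submask            # region holding the transition
        else:
            mask += [1] * REGION_BITS  # after the transition: all ones
    return mask


mask = decode_mask(5)  # zeros in bits 0..4, ones in bits 5..15
```

The two portions of the input never interact until the final selection, which is the property that allows them to be processed in parallel in hardware.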
-
Patent number: 5931938Abstract: Global address and data routers interconnect individual system units each having its own processors, memory, and I/O. A domain filter coupled to the routers dynamically defines groups of system units as domains and clusters of domains which have both software and hardware isolation from each other. Clusters can share dynamically definable ranges of memory with each other. The domain filter has software-loadable registers on the system units and in the global routers to set the parameters of the domains and clusters. The registers label individual inter-system transactions on the routers as invalid for system units not in the same domain or cluster as the originating unit.Type: GrantFiled: December 12, 1996Date of Patent: August 3, 1999Assignee: Sun Microsystems, Inc.Inventors: Daniel P. Drogichen, Andrew J. McCrocklin, Nicholas E. Aneshansley
-
Patent number: 5925123Abstract: A dual instruction set processor decodes and executes code received from a network and code supplied from a local memory. Thus, the dual instruction set processor is capable of executing instructions in two different instruction sets from two different sources. The dual instruction set processor includes a computer platform independent instruction decoder, another decoder, and an execution unit that executes decoded instructions from both of the decoders. A computer system with the foregoing described dual instruction set processor, a local memory, and a communication interface device, such as a modem, for connection to a network, such as the Internet or an Intranet, can be optimized to execute, for example, JAVA code, an example of one set of computer platform independent instructions, from the network, and to execute non-JAVA code stored locally, or on the network but in a trusted environment or an authorized environment.Type: GrantFiled: January 23, 1997Date of Patent: July 20, 1999Assignee: Sun Microsystems, Inc.Inventors: Marc Tremblay, James Michael O'Connor
-
Patent number: 5904732Abstract: A method and apparatus for dynamically switching the relative priorities of the load buffer and store buffer with respect to external memory resources in a superscalar processor. According to a first embodiment, a protocol dictates that the load buffer always prevails until the store buffer reaches a certain "high water mark," (an upper threshold) at which time the store buffer gains priority. After the store buffer has gained priority, it continues to access the memory until it is depleted to a "low water mark," (a lower threshold) at which time the load buffer regains priority. Whenever the store buffer reaches the high water mark, it gains priority until it drains down to the low water mark. This reduces the tendency for the store buffer to become full and block the processor. According to a second embodiment, the load buffer prevails if it is above its high water mark.Type: GrantFiled: April 30, 1996Date of Patent: May 18, 1999Assignee: Sun Microsystems, Inc.Inventors: Dale Greenley, Leslie Kohn
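The first embodiment's high and low water mark protocol is a form of hysteresis, which the following Python sketch models. The class name `BufferArbiter`, the threshold values, and the use of entry counts as the occupancy measure are assumptions for illustration; the second embodiment is not modeled.

```python
class BufferArbiter:
    """Toy arbiter: loads win until the store buffer fills past a mark."""

    def __init__(self, high_water: int, low_water: int):
        if low_water >= high_water:
            raise ValueError("low water mark must be below high water mark")
        self.high_water = high_water
        self.low_water = low_water
        self.store_has_priority = False

    def update(self, store_occupancy: int) -> str:
        """Return which buffer has priority for the external memory."""
        if self.store_has_priority:
            # Stores keep priority until the buffer drains to the low mark.
            if store_occupancy <= self.low_water:
                self.store_has_priority = False
        elif store_occupancy >= self.high_water:
            # Reaching the high mark hands priority to the store buffer.
            self.store_has_priority = True
        return "store" if self.store_has_priority else "load"


arbiter = BufferArbiter(high_water=6, low_water=2)
winners = [arbiter.update(n) for n in [1, 5, 6, 4, 3, 2, 5]]
```

Note that an occupancy of 5 yields "load" both before and after the drain, while 4 and 3 yield "store" during it: the decision depends on history as well as the current level, which keeps priority from flapping around a single threshold.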