Superscalar Patents (Class 712/23)

Central processing unit method and apparatus for extending general instructions with extension data of an extension register

Patent number: 6499099

Abstract: A central processing unit having an extension instruction comprises a memory address, an offset and a fixed length instruction of varying immediate data. The central processing unit comprises a general register, a special register, a register file constituted as an inner register, a function block for executing the calculation function; an instruction register for memorizing the instruction, a control block for generating/outputting a control signal to the instruction register and a plurality of status flags, in which the special register enables access by a programmer and includes an extension data field for memorizing extension data or an extension register having the extension data field as one element and an extension flag for changing its status when the instruction memorizing the extension data in the extension register is executed and having one or a plurality of bits that is accessible to a programmer.

Type: Grant

Filed: January 27, 1999

Date of Patent: December 24, 2002

Assignee: Asia Design Co., Ltd.

Inventor: Kyung Youn Cho
Method and apparatus for speculatively locking objects in an object-based system

Patent number: 6487652

Abstract: Methods and apparatus for speculatively locking an object are disclosed. According to one aspect of the present invention, a method for acquiring use of an object using a current thread includes a determination of whether a first bit included in the object is set to indicate that the object is speculatively owned by a speculative owner thread. When the object is speculatively owned, the speculative owner thread is allowed to use the object without locking the object. The method also includes checking a stored identifier that is associated with the object and identifies the speculative owner thread, as well as determining whether the stored identifier identifies the current thread. When the stored identifier identifies the current thread, the current thread already has use of the object; i.e., the current thread is the speculative owner thread.
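
The ownership check described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the SpecObject fields, the acquire function, its return strings, and the behaviour when the object is not yet speculatively owned or is owned by another thread are all assumptions, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpecObject:
    """Hypothetical object header: one flag bit plus a stored owner identifier."""
    speculative: bool = False        # the "first bit" mentioned in the abstract
    owner_id: Optional[int] = None   # identifies the speculative owner thread


def acquire(obj: SpecObject, current_thread: int) -> str:
    """Report how the current thread obtains use of the object."""
    if obj.speculative:
        if obj.owner_id == current_thread:
            # The current thread is the speculative owner: no locking needed.
            return "already-owned"
        # Owned by another thread. The abstract does not say what follows,
        # so this sketch only reports the conflict (assumption).
        return "conflict"
    # Not speculatively owned. Claiming it here is an assumption.
    obj.speculative = True
    obj.owner_id = current_thread
    return "claimed"
```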

Type: Grant

Filed: September 30, 1999

Date of Patent: November 26, 2002

Assignee: Sun Microsystems, Inc.

Inventors: Benedict A. Gomes, Lars Bak, David P. Stoutamire
Updating condition status register based on instruction specific modification information in set/clear pair upon instruction commit in out-of-order processor

Patent number: 6484251

Abstract: A processor including a register, an execution unit, a temporary result buffer, and a commit function circuit. The register includes at least one register bit and may include one or more sticky bits. The execution unit is suitable for executing a set of computer instructions. The temporary result buffer is configured to receive, from the execution unit, register bit modification information provided by the instructions. The temporary result buffer is suitable for storing the modification information in set/clear pairs of bits corresponding to respective register bits of the register. The commit function circuit is configured to receive the set/clear pairs of bits from the temporary result buffer when the instruction is committed. The commit function circuit is suitable for generating an updated bit in response to receiving the set/clear pairs of bits. The updated bit is then committed to the corresponding register bit of the register.
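
One way to picture the set/clear pairs is as two bit masks applied to the register at commit time. The sketch below is an assumption-laden illustration: the function name, the rule that "set" wins when both bits of a pair are 1, and the reading of a sticky bit as one that instructions cannot clear are not taken from the abstract.

```python
def commit_bits(register: int, set_mask: int, clear_mask: int,
                sticky_mask: int = 0) -> int:
    """Apply buffered set/clear pairs to a status register at commit.

    Assumptions: a set bit overrides a clear bit in the same pair, and
    bits named in sticky_mask ignore clear requests.
    """
    effective_clear = clear_mask & ~sticky_mask
    return (register & ~effective_clear) | set_mask
```

For example, commit_bits(0b1010, set_mask=0b0001, clear_mask=0b1000) yields 0b0011, while marking bit 3 as sticky leaves it set and yields 0b1011.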

Type: Grant

Filed: October 14, 1999

Date of Patent: November 19, 2002

Assignee: International Business Machines Corporation

Inventors: Robert Greg McDonald, Peichun Peter Liu, Christopher Hans Olson
Method, apparatus, and system for maintaining processor ordering by checking load addresses of unretired load instructions against snooping store addresses

Patent number: 6484254

Abstract: According to one aspect of the invention, a method is provided in which store addresses of store instructions dispatched during a last predetermined number of cycles are maintained in a first data structure of a first processor. It is determined whether a load address of a first load instruction matches one of the store addresses in the first data structure. The first load instruction is replayed if the load address of the first load instruction matches one of the store addresses in the first data structure.
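
A rough software model of the check: keep the store addresses seen over the last few cycles and replay any load whose address appears among them. The class and method names are hypothetical, and the one-set-per-cycle layout is an assumption; the abstract only says that addresses from a predetermined number of cycles are kept in a data structure.

```python
from collections import deque


class StoreAddressWindow:
    """Store addresses from the last `cycles` cycles (hypothetical model)."""

    def __init__(self, cycles: int):
        # deque(maxlen=...) silently drops the oldest cycle when full.
        self._window = deque(maxlen=cycles)

    def end_cycle(self, store_addresses) -> None:
        """Record the store addresses dispatched in the cycle just ended."""
        self._window.append(frozenset(store_addresses))

    def must_replay(self, load_address: int) -> bool:
        """True if the load address matches any store address still in the window."""
        return any(load_address in cycle for cycle in self._window)
```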

Type: Grant

Filed: December 30, 1999

Date of Patent: November 19, 2002

Assignee: Intel Corporation

Inventors: Muntaquim F. Chowdhury, Douglas M. Carmean
Prioritized instruction scheduling for multi-streaming processors

Patent number: 6477562

Abstract: A multi-streaming processor has multiple streams for processing multiple threads, and an instruction scheduler including a priority record of priority codes for one or more of the streams. The priority codes determine in some embodiments relative access to resources as well as which stream has access at any point in time. In other embodiments priorities are determined dynamically and altered on-the-fly, which may be done by various criteria, such as on-chip processing statistics, by executing one or more priority algorithms, by input from off-chip, according to stream loading, or by combinations of these and other methods. In one embodiment a special code is used for disabling a stream, and streams may be enabled and disabled dynamically by various methods, such as by on-chip events, processing statistics, input from off-chip, and by processor interrupts. Some specific applications are taught, including for IP-routers and digital signal processors.
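
As a simplified illustration of a priority record with a disabling code, the sketch below picks the enabled stream with the highest priority code. The value 0 as the "disabled" code, the larger-is-higher ordering, and the function name are assumptions for illustration only.

```python
DISABLED = 0  # assumed special code that disables a stream


def pick_stream(priority_record: dict):
    """Return the enabled stream with the highest priority code, or None.

    priority_record maps stream id -> priority code. Ties go to the
    stream listed first in the record (an arbitrary choice here).
    """
    enabled = {s: p for s, p in priority_record.items() if p != DISABLED}
    if not enabled:
        return None
    return max(enabled, key=enabled.get)
```

Changing an entry in the record between calls models the dynamic re-prioritizing and enabling/disabling that the abstract mentions.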

Type: Grant

Filed: December 16, 1998

Date of Patent: November 5, 2002

Assignee: Clearwater Networks, Inc.

Inventors: Mario D. Nemirovsky, Adolfo M. Nemirovsky, Narendra Sankar
Merging single precision floating point operands

Patent number: 6463525

Abstract: Where it is desired to perform a double precision operation using single precision operands, first and second single precision operands are loaded into first and second respective rows of a re-order buffer, and third and fourth single precision operands are loaded into third and fourth respective rows of the re-order buffer. A first merge instruction copies the first and second single precision operands from respective first and second rows of the re-order buffer into first and second portions of a fifth row of the re-order buffer, thereby concatenating the first and second single precision operands to represent a first double precision operand. A second merge instruction copies the third and fourth single precision operands from respective third and fourth rows of the re-order buffer into first and second portions of a sixth row of the re-order buffer, thereby concatenating the third and fourth single precision operands to represent a second double precision operand.
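
The merge step amounts to concatenating two 32-bit values into one 64-bit pattern. The sketch below shows only that bit-level idea; the function names are hypothetical, and treating the first operand as the upper half is an assumption, since the abstract speaks only of "first and second portions" of the row.

```python
import struct


def merge(first_word: int, second_word: int) -> int:
    """Concatenate two 32-bit words into one 64-bit pattern.

    Assumption: the first word occupies the upper 32 bits.
    """
    return ((first_word & 0xFFFFFFFF) << 32) | (second_word & 0xFFFFFFFF)


def as_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

For instance, as_double(merge(0x3FF00000, 0x00000000)) is 1.0, because 0x3FF0000000000000 is the double-precision encoding of 1.0.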

Type: Grant

Filed: August 16, 1999

Date of Patent: October 8, 2002

Assignee: Sun Microsystems, Inc.

Inventor: J. Arjun Prabhu
Software-based speculative pre-computation and multithreading

Publication number: 20020144083

Abstract: Speculative pre-computation and multithreading (SP) allows a processor to use spare hardware contexts to spawn speculative threads to very effectively pre-fetch data well in advance of the main thread. The burden of spawning threads may fall on the main thread via basic triggers. The speculative threads may also spawn other speculative threads via chaining triggers.

Type: Application

Filed: March 30, 2001

Publication date: October 3, 2002

Inventors: Hong Wang, Jamison Collins, John P. Shen, Bryan Black, Perry H. Wang, Edward T. Grochowski, Ralph M. Kling
Multiple-thread processor with single-thread interface shared among threads

Publication number: 20020138717

Abstract: A processor includes logic for tagging a thread identifier (TID) for usage with processor blocks that are not stalled. Pertinent non-stalling blocks include caches, translation look-aside buffers (TLB), a load buffer asynchronous interface, an external memory management unit (MMU) interface, and others. A processor includes a cache that is segregated into a plurality of N cache parts. Cache segregation avoids interference, “pollution”, or “cross-talk” between threads. One technique for cache segregation utilizes logic for storing and communicating thread identification (TID) bits. The cache utilizes cache indexing logic. For example, the TID bits can be inserted at the most significant bits of the cache index.
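
Placing the TID bits at the top of the cache index gives each thread its own slice of the cache. A minimal sketch, with all bit widths and the function name chosen purely for illustration:

```python
def cache_index(address: int, tid: int, line_bits: int = 5,
                index_bits: int = 8, tid_bits: int = 1) -> int:
    """Form a cache index whose most significant bits are the thread id.

    Assumed geometry: 32-byte lines (line_bits=5), 256 sets (index_bits=8),
    two threads (tid_bits=1). None of these come from the abstract.
    """
    addr_index_bits = index_bits - tid_bits
    addr_part = (address >> line_bits) & ((1 << addr_index_bits) - 1)
    tid_part = tid & ((1 << tid_bits) - 1)
    return (tid_part << addr_index_bits) | addr_part
```

With these defaults, thread 0 only ever indexes sets 0 to 127 and thread 1 sets 128 to 255, so the same address maps to different sets for different threads.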

Type: Application

Filed: May 23, 2002

Publication date: September 26, 2002

Inventors: William N. Joy, Marc Tremblay, Gary Lauterbach, Joseph I. Chamdani
Multiprocessor servers with controlled numbered of CPUs

Patent number: 6453344

Abstract: A multiprocessor system having a total number of available CPUs partitioned into one or more smaller pools of CPUs called servers where the number of CPUs available to a server is reduced below the total number of available CPUs. Software licensing costs are thereby reduced because the number of CPUs available to run the operating system or ISV software has been reduced to the number of CPUs in the pool of the server rather than the total number of available CPUs in the multiprocessor system. In order to enforce the isolation of CPUs required by software licensing, separate identification codes, CPUIDs, that contain unique system serial numbers are assigned to each server in the multiprocessing system. The multiprocessor system has multiple CPUIDs, one for each server (each pool of CPUs that can execute operating systems and ISV software).

Type: Grant

Filed: March 31, 1999

Date of Patent: September 17, 2002

Assignee: Amdahl Corporation

Inventors: Robert Scott Ellsworth, Jonathan Russell Nolting, Keith Joseph Philipp
Data processing system including a shared memory resource circuit

Patent number: 6442670

Abstract: A data processing system comprises a plurality of nodes and a serial data bus interconnecting the nodes in series in a closed loop, for passing address and data information. At least one processing node includes a processor, a printed circuit board and a memory which is partitioned into a plurality of sections, including a first section for directly sharable memory located on the printed circuit board, and a second section for block sharable memory. A local bus connects the processor, block sharable memory and printed circuit board, for transferring data in parallel from the processor to the directly sharable memory on the printed circuit board, and for transferring data from the block sharable memory to the printed circuit board.

Type: Grant

Filed: July 2, 2001

Date of Patent: August 27, 2002

Assignee: Sun Microsystems, Inc.

Inventors: John D. Acton, Michael D. Derbish, Gavin G. Gibson, Jack M. Hardy, Jr., Hugh M. Humphreys, Steven P. Kent, Steven E. Schelong, Ricardo Yong, William B. DeRolf
Dynamic handling of object versions to support space and time dimensional program execution

Patent number: 6438677

Abstract: One embodiment of the present invention provides a system that supports space and time dimensional program execution by facilitating accesses to different versions of a memory element. The system supports a head thread that executes program instructions and a speculative thread that executes program instructions in advance of the head thread. The head thread accesses a primary version of the memory element, and the speculative thread accesses a space-time dimensioned version of the memory element. During a reference to the memory element by the head thread, the system accesses the primary version of the memory element. During a reference to the memory element by the speculative thread, the speculative thread accesses a pointer associated with the primary version of the memory element, and accesses a version of the memory element through the pointer. Note that the pointer points to the space-time dimensioned version of the memory element if the space-time dimensioned version of the memory element exists.
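
The two access paths can be modelled loosely as below. The class name, the fall-back to the primary version when no space-time dimensioned version exists, and the rule that a speculative write creates that version are assumptions; the abstract describes only how references are resolved.

```python
class MemoryElement:
    """Primary version plus an optional space-time dimensioned version."""

    def __init__(self, value):
        self.primary = value
        self.pointer = None  # refers to the space-time version when one exists

    def read(self, speculative: bool):
        """Head thread reads the primary; speculative thread follows the pointer."""
        if speculative and self.pointer is not None:
            return self.pointer["value"]
        return self.primary

    def write(self, value, speculative: bool) -> None:
        """Assumption: the first speculative write creates the extra version."""
        if speculative:
            self.pointer = {"value": value}
        else:
            self.primary = value
```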

Type: Grant

Filed: October 20, 1999

Date of Patent: August 20, 2002

Assignee: Sun Microsystems, Inc.

Inventors: Shailender Chaudhry, Marc Tremblay
Microprocessor

Patent number: 6438680

Abstract: When a decision circuit (217) incorporated in a control circuit (21) in an instruction decode unit (2) in a microprocessor (1) decides that an integer operation unit (4) can not execute a following sub instruction, the decision circuit (217) controls each of selectors (211, 214, and 215) and an exchange circuit (216) so that a memory access unit (3) that has already executed a preceding sub instruction can execute the following sub instruction.

Type: Grant

Filed: June 3, 1999

Date of Patent: August 20, 2002

Assignee: Mitsubishi Denki Kabushiki Kaisha

Inventors: Akira Yamada, Isao Minematsu
System and method for handling load and/or store operations in a superscalar microprocessor

Patent number: 6434693

Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address-collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.
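
The two conditions for issuing a load out of order can be written as a small predicate. This is a sketch under stated simplifications: the names are hypothetical, and addresses are compared for exact equality, ignoring access size and partial overlap.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class OlderStore:
    """A store that is older than the load in program order."""
    address: Optional[int]  # None: address not yet calculated (write pending)


def can_issue_load_early(load_address: int,
                         older_stores: Sequence[OlderStore]) -> bool:
    """True only if there is no write pending and no address collision."""
    for store in older_stores:
        if store.address is None:
            return False  # write pending
        if store.address == load_address:
            return False  # address collision
    return True
```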

Type: Grant

Filed: November 12, 1999

Date of Patent: August 13, 2002

Assignee: Seiko Epson Corporation

Inventors: Cheryl D. Senter, Johannes Wang
Data processing unit with interface for sharing registers by a processor and a coprocessor

Patent number: 6434689

Abstract: An apparatus is described that comprises a data processing unit and at least one coprocessor. The data processing unit comprises a register file having registers, a memory, a plurality of execution units, a coprocessor interface for coupling the at least one coprocessor with the data processing unit, and a pipeline configuration for processing instructions having a fetch stage for fetching an instruction from the memory, a decode stage for decoding an operational code from the instruction, an execution stage for activating one of the execution units, and a write-back stage for writing back from the execution unit. The data processing unit comprises read- and write-lines coupling the register file with the coprocessor for exchanging operands, at least one control line indicating that the coprocessor is busy, and a plurality of control lines from the decode stage for controlling the coprocessor which are operated upon detection of a coprocessor instruction.

Type: Grant

Filed: November 9, 1998

Date of Patent: August 13, 2002

Assignee: Infineon Technologies North America Corp.

Inventors: Rod G. Fleck, Roger D. Arnold, Bruce Holmer, Danielle G. Lemay
Programmed load precession machine

Publication number: 20020103990

Abstract: An architecture and method are presented for a computer processor supporting interleaved execution of multiple concurrently-active threads, and capable of independently allocating a portion of the total processor execution time to each of the threads. Compared to existing architectures, in which the portion of processor time allocated to each thread is fixed, the processor architecture described herein is believed to offer higher performance for applications such as communications protocol processing, in which the workload of individual threads may vary, and in which the workload requires real time facilities.

Type: Application

Filed: February 1, 2001

Publication date: August 1, 2002

Inventor: Hanan Potash
Information processing apparatus for entertainment system utilizing DMA-controlled high-speed transfer and processing of routine data

Patent number: 6427201

Abstract: Routine processing for routine data, non-routine processing for routine data and general non-routine processing are to be processed efficiently. To this end, a main CPU 20 has a CPU core 21, having a parallel computational mechanism, a command cache 22 and a data cache 23, as ordinary cache units, and a scratch-pad memory SPR 24 which is an internal high-speed memory capable of performing direct memory accessing (DMA) suited for routine processing. A floating decimal point vector processor (VPE) 30 has an internal high-speed memory (VU-MEM) capable of DMA processing and is tightly connected to the main CPU to form a co-processor. The VPE 40 has a high-speed internal memory 40 (VU-MEM) capable of DMA processing. The DMA controller (DMAC) 14 controls DMA transfer between the main memory 50 and the SPR 24, between the main memory 50 and the (VU-MEM) 34 and between the (VU-MEM) 44 and the SPR 24.

Type: Grant

Filed: August 18, 1998

Date of Patent: July 30, 2002

Assignee: Sony Computer Entertainment Inc.

Inventor: Akio Ohba
Pipelined methods and apparatus for weight selection and content addressable memory searches

Patent number: 6415354

Abstract: When a search key is supplied to a content addressable memory (CAM), the CAM signals indicate which CAM entries have matched the key. These signals are provided to a weight array to select the entry of the highest priority. Each entry's priority is indicated by a weight in the weight array. The weight array processing is pipelined. In pipeline stage 0, the most significant bits (bits 0) of the weights are examined, and the highest priorities are selected based on the most significant bits. At pipeline stage 1, the next most significant bits (bits 1) are examined, and so on.
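
The stage-by-stage narrowing can be imitated in software by filtering the matching entries one weight bit at a time, most significant bit first. The function name and the assumption that a larger weight means higher priority are illustrative; the abstract does not state which direction is "highest".

```python
def select_highest(weights, matches, width):
    """Return indices of matching entries with the largest weight.

    weights: one integer weight per CAM entry, each `width` bits wide.
    matches: one boolean per CAM entry (did it match the key?).
    Stage k examines "bit k" in the abstract's numbering, where bit 0 is
    the most significant. Assumption: larger weight = higher priority.
    """
    candidates = [i for i, hit in enumerate(matches) if hit]
    for stage in range(width):
        shift = width - 1 - stage
        with_one = [i for i in candidates if (weights[i] >> shift) & 1]
        if with_one:
            # Some candidate has a 1 here, so those with a 0 are eliminated.
            candidates = with_one
    return candidates
```

In hardware each loop iteration would be a separate pipeline stage, so a new search could enter stage 0 while an earlier one is still in stage 1.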

Type: Grant

Filed: July 15, 1999

Date of Patent: July 2, 2002

Assignee: Applied Micro Circuits Corporation

Inventors: Alexander Joffe, Oran Uzrad-Nali, Simon H. Milner
Injection control mechanism for external events

Patent number: 6412062

Abstract: The present invention is a method and apparatus to inject an external event to a first pipeline stage in a pipeline chain. A target instruction address corresponding to an instruction is specified. The external event is asserted when there is a match between the target instruction address and a pipeline instruction pointer corresponding to a second pipeline stage. The second pipeline stage is earlier than the first pipeline stage in the pipeline chain. The external event is unmasked via a delivery path between a signal representing the asserted external event and the first pipeline stage.

Type: Grant

Filed: June 30, 1999

Date of Patent: June 25, 2002

Assignee: Intel Corporation

Inventors: Yan Xu, Steven J. Tu
Dynamic pipelines with reusable logic elements controlled by a set of multiplexers for pipeline stage selection

Patent number: 6412061

Abstract: A method of dynamically adjusting a multiple stage pipeline to execute one of a set of instructions, wherein each stage has a latency and performs a selected data operation. An instruction to be executed is received and a number of stages of the pipeline is selected to execute the instruction as needed to perform a corresponding data operation. Unnecessary stages are bypassed to reduce latency, and the instruction is executed with the selected stages.

Type: Grant

Filed: January 14, 1998

Date of Patent: June 25, 2002

Assignee: Cirrus Logic, Inc.

Inventor: Thomas Anthony Dye
Dynamic allocation of resources in multiple microprocessor pipelines

Patent number: 6408377

Abstract: A microprocessor having M parallel pipelines and N arithmetic logic units, where N is less than M. A single instruction fetch stage fetches multi-stage instructions, and a single instruction decoder provides a parallel set of three instructions to the three pipelines. The two ALUs are dynamically connected to two of the pipelines having instructions requiring an ALU, while the third pipeline executes an instruction in parallel that does not require an ALU. The third pipeline may have a move unit connected to it.

Type: Grant

Filed: April 26, 2001

Date of Patent: June 18, 2002

Assignee: Rise Technology Company

Inventor: Kenneth K. Munson
System and method for register renaming

Patent number: 6408375

Abstract: A system and method for performing register renaming of source registers in a processor having a variable advance instruction window for storing a group of instructions to be executed by the processor, wherein a new instruction is added to the variable advance instruction window when a location becomes available. A tag is assigned to each instruction in the variable advance instruction window. The tag of each instruction to leave the window is assigned to the next new instruction to be added to it. The results of instructions executed by the processor are stored in a temp buffer according to their corresponding tags to avoid output and anti-dependencies. The temp buffer therefore permits the processor to execute instructions out of order and in parallel. Data dependency checks for input dependencies are performed only for each new instruction added to the variable advance instruction window and register renaming is performed to avoid input dependencies.

Type: Grant

Filed: April 5, 2001

Date of Patent: June 18, 2002

Assignee: Seiko Epson Corporation

Inventors: Trevor A. Deosaran, Sanjiv Garg, Kevin R. Iadonato
Method for mapping instructions using a set of valid and invalid logical to physical register assignments indicated by bits of a valid vector together with a logical register list

Patent number: 6405304

Abstract: A technique for managing register assignments. The technique involves maintaining, in a register list memory circuit having entries that respectively correspond to physical registers, a list of register assignments that assign logical registers to the physical registers. The technique further involves maintaining, in a vector memory circuit having bits that respectively correspond to the physical registers, a valid vector that forms, in combination with the list of register assignments, a list of valid register assignments. Furthermore, the technique involves storing, for an instruction that is mapped by the data processor, a copy of the valid vector from the vector memory circuit to a silo memory circuit. Preferably, the processor using the technique has the ability to execute branches of instructions speculatively, and to recover if it is determined that the processor executed down an incorrect instruction branch.
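
A toy model of the three structures (register list, valid vector, silo) is sketched below. It is not the patented circuit: the class and method names are hypothetical, physical registers are handed out once and never reclaimed, and a snapshot is taken before every mapping. These simplifications keep the recovery step easy to follow.

```python
class RegisterMap:
    """Register list plus valid vector, with a silo of saved valid vectors."""

    def __init__(self, num_physical: int):
        self.logical_of = [None] * num_physical  # register list entry per physical register
        self.valid = [False] * num_physical      # valid vector, one bit per physical register
        self.silo = []                           # saved copies of the valid vector
        self._next_unused = 0                    # simplification: no register reclamation

    def map_instruction(self, logical: str) -> int:
        """Save the valid vector, then give `logical` a fresh physical register."""
        if self._next_unused >= len(self.valid):
            raise RuntimeError("out of physical registers in this sketch")
        self.silo.append(list(self.valid))
        phys = self._next_unused
        self._next_unused += 1
        # Invalidate any earlier assignment of the same logical register.
        for p, name in enumerate(self.logical_of):
            if name == logical:
                self.valid[p] = False
        self.logical_of[phys] = logical
        self.valid[phys] = True
        return phys

    def lookup(self, logical: str):
        """Physical register currently holding `logical`, or None."""
        for p, name in enumerate(self.logical_of):
            if name == logical and self.valid[p]:
                return p
        return None

    def recover(self, snapshot: int) -> None:
        """Restore the valid vector saved before mapping number `snapshot` (0-based)."""
        self.valid = list(self.silo[snapshot])
        del self.silo[snapshot:]
```

Because the register list itself is never rewritten for older entries, restoring an earlier valid vector is enough to bring back the earlier set of valid assignments after a mispredicted branch.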

Type: Grant

Filed: August 24, 1998

Date of Patent: June 11, 2002

Assignee: Compaq Information Technologies Group, L.P.

Inventors: James Arthur Farrell, Sharon Marie Britton, Harry Ray Fair, III, Bruce Gieseke, Daniel Lawrence Leibholz, Derrick R. Meyer
Process for executing highly efficient VLIW

Patent number: 6397319

Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.
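
The 4 + 4 + 12 + 12 bit layout can be unpacked with shifts and masks. The field order used here (format field in the top bits, then the 4-bit field, then the two 12-bit fields) and the dictionary keys are assumptions; the abstract gives only the field widths.

```python
def decode(word: int) -> dict:
    """Split a 32-bit instruction word into its four fields.

    Assumed layout, most significant bits first:
    [format:4][op4:4][op12_a:12][op12_b:12]
    """
    word &= 0xFFFFFFFF
    return {
        "format": (word >> 28) & 0xF,
        "op4": (word >> 24) & 0xF,      # branch code "cc" or a 4-bit constant
        "op12_a": (word >> 12) & 0xFFF,
        "op12_b": word & 0xFFF,
    }
```

A decoder would then consult the format field to decide whether op4 holds an operation code or a constant.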

Type: Grant

Filed: June 20, 2000

Date of Patent: May 28, 2002

Assignee: Matsushita Electric Ind. Co., Ltd.

Inventors: Shuichi Takayama, Nobuo Higaki
MECHANISM AND METHOD FOR PIPELINE CONTROL IN A PROCESSOR

Publication number: 20020056034

Abstract: A data processing system including a memory system and a plurality of peripheral components. A processor is coupled to the memory and peripheral components. A plurality of pipeline stages are implemented within the processor where each stage is configured to perform specific operations according to instructions then associated with that stage. A snapshot register is associated with at least some of the pipeline stages where the snapshot register is configured to store data describing the state of execution of the instruction then associated with that stage.

Type: Application

Filed: October 1, 1999

Publication date: May 9, 2002

Inventors: MARGARET GEARTY, CHIH-JUI PENG
Multi-threading for a processor utilizing a replay queue

Patent number: 6385715

Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.

Type: Grant

Filed: May 4, 2001

Date of Patent: May 7, 2002

Assignee: Intel Corporation

Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
Method and apparatus for synchronizing parallel pipelines in a superscalar microprocessor

Patent number: 6385719

Abstract: A transfer tag is generated by the Instruction Fetch Unit and passed to the decode unit in the instruction pipeline with each group of instructions fetched during a branch prediction by a fetcher. Individual instructions within the fetched group for the branch pipeline are assigned a concatenated version (group tag concatenated with instruction lane) of the transfer tag which is used to match on requests to flush any newer instructions. All potential instruction or Internal Operation latches in the decode pipeline must perform a match and if a match is encountered, all valid bits associated with newer instructions or internal operations upstream from the match are cleared. The transfer tag representing the next instruction to be processed in the branch pipeline is passed to the Instruction Dispatch Unit. The Instruction Dispatch Unit queries the branch pipeline to compare its transfer tag with transfer tags of instructions in the branch pipeline.

Type: Grant

Filed: June 30, 1999

Date of Patent: May 7, 2002

Assignee: International Business Machines Corporation

Inventors: John Edward Derrick, Brian R. Konigsburg, Lee Evan Eisen, David Stephen Levitan
System and method for assigning tags to control instruction processing in a superscalar processor

Publication number: 20020053014

Abstract: A tag monitoring system for assigning tags to instructions. A source supplies instructions to be executed by a functional unit. A register file stores information required for the execution of each instruction. A queue having a plurality of slots containing tags which are used for tagging the instructions. The tags are arranged in the queue in an order specified by the program order of their corresponding instructions. A control unit monitors the completion of executed instructions and advances the tags in the queue upon completion of an executed instruction. The register file stores an instruction's information at a location in the register file defined by the tag assigned to that instruction. The register file also contains a plurality of read address enable ports and corresponding read output ports. Each of the slots from the queue is coupled to a corresponding one of the read address enable ports. Thus, the information for each instruction can be read out of the register file in program order.
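
The relationship between the tag queue and the register file can be sketched as a queue of in-flight tags that also serve as register file addresses. The class and method names are hypothetical, and recycling a retired tag to the back of a free list is an assumption made to keep the model short.

```python
from collections import deque


class TagQueue:
    """Tags kept in program order; each tag indexes a register file slot."""

    def __init__(self, num_tags: int):
        self.free = deque(range(num_tags))   # tags not currently assigned
        self.in_flight = deque()             # assigned tags, oldest instruction first
        self.register_file = [None] * num_tags

    def issue(self, info):
        """Assign the next free tag and store the instruction's information there."""
        tag = self.free.popleft()
        self.in_flight.append(tag)
        self.register_file[tag] = info
        return tag

    def retire_oldest(self):
        """Advance the queue when the oldest instruction completes."""
        tag = self.in_flight.popleft()
        self.free.append(tag)
        return tag

    def read_in_program_order(self):
        """Read the register file through the queue slots, oldest first."""
        return [self.register_file[tag] for tag in self.in_flight]
```

Even after tags have been recycled out of numerical order, reading through the queue returns instruction information in program order.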

Type: Application

Filed: January 3, 2002

Publication date: May 2, 2002

Inventors: Kevin R. Iadonato, Trevor A. Deosaran, Sanjiv Garg
Line-oriented reorder buffer configured to selectively store a memory operation result in one of the plurality of reorder buffer storage locations corresponding to the executed instruction

Patent number: 6381689

Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor.

Type: Grant

Filed: March 13, 2001

Date of Patent: April 30, 2002

Assignee: Advanced Micro Devices, Inc.

Inventors: David B. Witt, Thang M. Tran
System to implement a cross-bar switch of a broadband processor

Patent number: 6378060

Abstract: The present invention provides a cross-bar circuit that implements a switch of a broadband processor. In an exemplary embodiment, the present invention provides a cross-bar circuit that, in response to partially-decoded instruction information and in response to datapath information, (1) allows any bit from a 2n-bit (e.g. 256-bit) input source word to be switched into any bit position of a 2m-bit (e.g. 128-bit) output destination word and (2) provides the ability to set-to-zero any bit in said 2m-bit output destination word. The cross-bar circuit includes: (1) a switch circuit which includes 2m 2n:1 multiplexor circuits, where each of the 2n:1 multiplexor circuits (a) has a unique n-bit (e.g.

Type: Grant

Filed: February 11, 2000

Date of Patent: April 23, 2002

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, Bruce Bateman, John Moussouris
System and method for assigning tags to control instruction processing in a superscalar processor

Patent number: 6360309

Abstract: A tag monitoring system for assigning tags to instructions. A source supplies instructions to be executed by a functional unit. A register file stores information required for the execution of each instruction. A queue having a plurality of slots containing tags which are used for tagging the instructions. The tags are arranged in the queue in an order specified by the program order of their corresponding instructions. A control unit monitors the completion of executed instructions and advances the tags in the queue upon completion of an executed instruction. The register file stores an instruction's information at a location in the register file defined by the tag assigned to that instruction. The register file also contains a plurality of read address enable ports and corresponding read output ports. Each of the slots from the queue is coupled to a corresponding one of the read address enable ports. Thus, the information for each instruction can be read out of the register file in program order.

Type: Grant

Filed: May 19, 2000

Date of Patent: March 19, 2002

Assignee: Seiko Epson Corporation

Inventors: Kevin R. Iadonato, Trevor A. Deosaran, Sanjiv Garg
High-performance, superscalar-based computer system with out-of-order instruction execution

Publication number: 20020029328

Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.

Type: Application

Filed: May 10, 2001

Publication date: March 7, 2002

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
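The in-order retirement described in the abstract above (results wait in temporary registers until all prior instructions have executed) can be illustrated with a small model. The class name `RetireBuffer` and its fields are hypothetical; this is a sketch of the general idea, not the patented design.

```python
class RetireBuffer:
    def __init__(self):
        self.order = []      # instruction ids in program order
        self.temp = {}       # id -> result held in a temporary register
        self.committed = []  # (id, result) pairs in retirement order

    def dispatch(self, instr_id):
        self.order.append(instr_id)

    def complete(self, instr_id, result):
        """Record an out-of-order result, then retire whatever is ready."""
        self.temp[instr_id] = result
        # Retire from the head only while the oldest instruction has a result.
        while self.order and self.order[0] in self.temp:
            head = self.order.pop(0)
            self.committed.append((head, self.temp.pop(head)))
```

An instruction that finishes early simply waits in `temp` until every older instruction has completed.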
Supporting space-time dimensional program execution by selectively versioning memory updates

Patent number: 6353881

Abstract: A system is provided that facilitates space and time dimensional execution of computer programs through selective versioning of memory elements located in a system heap. The system includes a head thread that executes program instructions and a speculative thread that simultaneously executes program instructions in advance of the head thread with respect to the time dimension of sequential execution of the program. The collapsing of the time dimensions is facilitated by expanding the heap into two space-time dimensions, a primary dimension (dimension zero), in which the head thread operates, and a space-time dimension (dimension one), in which the speculative thread operates. In general, each dimension contains its own version of an object and objects created by the thread operating in the dimension. The head thread generally accesses a primary version of a memory element and the speculative thread generally accesses a corresponding space-time dimensioned version of the memory element.

Type: Grant

Filed: May 17, 1999

Date of Patent: March 5, 2002

Assignee: Sun Microsystems, Inc.

Inventors: Shailender Chaudhry, Marc Tremblay
Control bit vector storage for a microprocessor

Patent number: 6351804

Abstract: A control bit vector storage is provided. The present control bit vector storage (preferably included within a functional unit) stores control bits indicative of a particular instruction. The control bits are divided into multiple control vectors, each vector indicative of one instruction operation. The control bits control dataflow elements within the functional unit to cause the instruction operation to be performed. Additionally, the present control bit vector storage allows complex instructions (or instructions which produce multiple results) to be divided into simpler operations. The hardware included within the functional unit may be reduced to that employed to perform the simpler operations. In one embodiment, the control bit vector storage comprises a plurality of vector storages. Each vector storage comprises a pair of individual vector storages and a shared vector storage. The shared vector storage stores control bits common to both control vectors.

Type: Grant

Filed: October 10, 2000

Date of Patent: February 26, 2002

Assignee: Advanced Micro Devices, Inc.

Inventor: Marty L. Pflum
High-performance, superscalar-based computer system with out-of-order instruction execution and concurrent results distribution

Publication number: 20020016903

Abstract: The high-performance, RISC core based microprocessor architecture includes an instruction fetch unit for fetching instruction sets from an instruction store and an execution unit that implements the concurrent execution of a plurality of instructions through a parallel array of functional units. The fetch unit generally maintains a predetermined number of instructions in an instruction buffer. The execution unit includes an instruction selection unit, coupled to the instruction buffer, for selecting instructions for execution, and a plurality of functional units for performing instruction specified functional operations. A unified instruction scheduler, within the instruction selection unit, initiates the processing of instructions through the functional units when instructions are determined to be available for execution and for which at least one of the functional units implementing a necessary computational function is available.

Type: Application

Filed: May 8, 2001

Publication date: February 7, 2002

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
Result forwarding cache

Patent number: 6343359

Abstract: An apparatus is presented for expediting the execution of dependent micro instructions in a pipeline microprocessor having design characteristics—complexity, power, and timing—that are not significantly impacted by the number of stages in the microprocessor's pipeline. In contrast to conventional result distribution schemes where an intermediate result is distributed to multiple pipeline stages, the present invention provides a cache for storage of multiple intermediate results. The cache is accessed by a dependent micro instruction to retrieve required operands. The apparatus includes a result forwarding cache, result update logic, and operand configuration logic. The result forwarding cache stores the intermediate results. The result update logic receives the intermediate results as they are generated and enters the intermediate results into the result forwarding cache.

Type: Grant

Filed: May 18, 1999

Date of Patent: January 29, 2002

Assignee: IP-First, L.L.C.

Inventors: Gerard M. Col, G. Glenn Henry
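As an illustration of the result forwarding cache described above, the sketch below keeps the most recent intermediate results keyed by destination register and lets a dependent micro instruction read an operand from the cache before falling back to the register file. The capacity, the oldest-first eviction policy, and all names are assumptions made for this example.

```python
from collections import OrderedDict


class ResultForwardingCache:
    def __init__(self, capacity=4):
        self.capacity = capacity      # assumed size, not from the patent
        self.entries = OrderedDict()  # destination register -> newest result

    def update(self, dest_reg, value):
        """Result update logic: enter a result as soon as it is generated."""
        self.entries.pop(dest_reg, None)
        self.entries[dest_reg] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry

    def read_operand(self, reg, register_file):
        """Operand configuration: prefer a cached result over the register file."""
        if reg in self.entries:
            return self.entries[reg]
        return register_file[reg]
```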
Thread switch logic in a multiple-thread processor

Patent number: 6341347

Abstract: A processor includes a thread switching control logic that performs a fast thread-switching operation in response to an L1 cache miss stall. The fast thread-switching operation implements one or more of several thread-switching methods. A first thread-switching operation is “oblivious” thread-switching every N cycles, in which the individual flip-flops locally determine a thread-switch without notification of stalling. The oblivious technique avoids usage of an extra global interconnection between threads for thread selection. A second thread-switching operation is “semi-oblivious” thread-switching for use with an existing “pipeline stall” signal (if any). The pipeline stall signal operates in two capacities, first as a notification of a pipeline stall, and second as a thread select signal between threads so that, again, usage of an extra global interconnection between threads for thread selection is avoided.

Type: Grant

Filed: May 11, 1999

Date of Patent: January 22, 2002

Assignee: Sun Microsystems, Inc.

Inventors: William N. Joy, Marc Tremblay, Gary Lauterbach, Joseph I. Chamdani
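The "oblivious" scheme in the abstract above switches threads on a fixed cycle count rather than on a global stall signal. A minimal sketch, assuming a free-running cycle counter and round-robin selection (both assumptions of this example, as is the function name):

```python
def active_thread(cycle, n, num_threads=2):
    """Return the thread selected on a given cycle when switching every n cycles."""
    # Every element can compute this locally from the cycle count,
    # so no global thread-select wire is needed.
    return (cycle // n) % num_threads
```

For example, with `n = 2` and two threads, cycles 0 through 5 select threads 0, 0, 1, 1, 0, 0.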
Parallel processing instructions routed through plural differing capacity units of operand address generators coupled to multi-ported memory and ALUs

Patent number: 6341343

Abstract: Three parallel instruction processing pipelines of a microprocessor share two data memory ports for obtaining operands and writing back results. Since a significant proportion of the instructions of a typical computer program do not require reading operands from the memory, the probability is high that at least one of any three program instructions to be executed at the same time need not fetch an operand from memory. The two memory ports are thus connected at any given time with the two of the three pipelines which are processing instructions that require memory access, the pipeline without access to the memory processing an instruction that does not need it. To do so, the added third pipeline need not have all the same resources as the other two pipelines, so its stages are made to have a reduced capability in order to save space and reduce power consumption.

Type: Grant

Filed: April 26, 2001

Date of Patent: January 22, 2002

Assignee: Rise Technology Company

Inventor: Kenneth K. Munson
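The port sharing described in the abstract above can be sketched as a small arbiter that connects the two memory ports to whichever pipelines need memory access. The abstract does not say what happens when all three instructions need memory, so returning `None` in that case is an assumption, as is the function name.

```python
def assign_ports(needs_memory, num_ports=2):
    """Map pipelines that need memory onto the available memory ports.

    needs_memory: one boolean per pipeline.
    Returns {pipeline_index: port_index}, or None if the ports are oversubscribed.
    """
    requesters = [i for i, need in enumerate(needs_memory) if need]
    if len(requesters) > num_ports:
        return None  # assumed: the group cannot execute together as-is
    return {pipe: port for port, pipe in enumerate(requesters)}
```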
Line-oriented reorder buffer

Publication number: 20020007450

Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor.

Type: Application

Filed: March 13, 2001

Publication date: January 17, 2002

Inventors: David B. Witt, Thang M. Tran
Using padded instructions in a block-oriented cache

Patent number: 6339822

Abstract: A microprocessor configured to cache basic blocks of instructions is disclosed. The microprocessor may comprise decoding logic, a basic block cache, and a branch prediction unit. The decoding logic is coupled to receive and decode variable-length instructions into padded instructions that have one of a predetermined number of predetermined lengths. The decoding logic is further configured to form basic blocks of instructions from the padded and decoded instructions. Basic blocks are natural divisions in instruction streams resulting from branch instructions. The start of a basic block is a target of a branch, and the end is another branch instruction. The basic block cache is configured to store the basic blocks in a plurality of storage locations, wherein each storage location is configured to store an address tag, a link bit, and at least a portion of one basic block. The link bit indicates whether the basic block stored in said storage location extends into another storage location.

Type: Grant

Filed: October 2, 1998

Date of Patent: January 15, 2002

Assignee: Advanced Micro Devices, Inc.

Inventor: Paul K. Miller
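The definition of a basic block given in the abstract above (it starts at a branch target and ends at a branch instruction) can be illustrated with a short routine. The input encoding, a list of `(address, is_branch)` pairs plus a set of known branch targets, is hypothetical and chosen only for this sketch.

```python
def split_basic_blocks(instrs, branch_targets):
    """Group instruction addresses into basic blocks.

    instrs:         list of (address, is_branch) pairs in program order.
    branch_targets: set of addresses that some branch jumps to.
    """
    blocks, current = [], []
    for addr, is_branch in instrs:
        # A branch target starts a new block.
        if addr in branch_targets and current:
            blocks.append(current)
            current = []
        current.append(addr)
        # A branch instruction ends the current block.
        if is_branch:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks
```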
Method and system for dividing a computer processor register into sectors and storing frequently used values therein

Patent number: 6336160

Abstract: A method and system for dividing computer processor registers into sectors and storing frequently used data in the most significant unused sectors. The method includes sector renaming that is performed on each individual sector (i.e., on a sector-by-sector basis) rather than renaming an entire processor register. A register is divided into sectors such that the smallest accessible unit for an instruction in each register can be uniquely addressed and renamed. A register file is divided into sectors so that each processor register can be uniquely addressed and renamed. The most significant sectors of the processor registers are used to hold pre-assigned values therein. Data previously loaded into processor register sectors is stored in the most significant sectors of the processor registers for possible future referencing and use. The method also includes establishing a sign-extend memory that includes at least one sign-extend bit in a sector status table.

Type: Grant

Filed: June 19, 1998

Date of Patent: January 1, 2002

Assignee: International Business Machines Corporation

Inventors: Richard James Eickemeyer, Nadeem Malik, Alan Vicha Pita, Avijit Saha
System and method for utilizing a conditional split for aligning internal operation (IOPs) for dispatch

Patent number: 6336182

Abstract: A method and system for aligning internal operations (IOPs) for dispatch are disclosed. The method and system comprise conditionally asserting a predecode based on a particular dispatch slot in which an instruction is going to be placed. The method and system further include using the information related to the predecode to expand an instruction into at least one dummy operation and an IOP operation whenever the instruction would not be supported in the particular dispatch slot.

Type: Grant

Filed: March 5, 1999

Date of Patent: January 1, 2002

Assignee: International Business Machines Corporation

Inventors: John Edward Derrick, Lee Evan Eisen, Paul Joseph Jordan, Robert William Hay
RISC86 instruction set

Patent number: 6336178

Abstract: An internal RISC-type instruction structure furnishes a fixed bit-length template including a plurality of defined bit fields for a plurality of operation (Op) formats. One format includes an instruction-type bit field, two source-operand bit fields and one destination-operand bit field for designating a register-to-register operation. Another format is a load-store format that includes an instruction-type bit field, an identifier of a source or destination register for the respective load or store operation, and bit fields for specifying the segment, base and index parameters of an address.

Type: Grant

Filed: September 11, 1998

Date of Patent: January 1, 2002

Assignee: Advanced Micro Devices, Inc.

Inventor: John G. Favor
Cumulative lookahead to eliminate chained dependencies

Patent number: 6332187

Abstract: A processor is configured to generate lookahead values using a cumulative constant. The processor classifies operations to a particular register (e.g. the stack pointer register, or ESP in an embodiment employing the x86 instruction set architecture) as either accelerated or non-accelerated. For example, instructions which are defined to increment/decrement the particular register by an explicit or implicit constant value may be accelerated operations. Upon the occurrence of a non-accelerated operation, the processor may begin accumulating the cumulative effect of accelerated operations to the result of the non-accelerated operation as a cumulative offset. The result of the non-accelerated operation (upon execution thereof) may then be added to the cumulative offset values corresponding to each accelerated operation to generate the particular register value corresponding to that accelerated operation. Accordingly, dependencies upon the register due to the accelerated operations may be alleviated.

Type: Grant

Filed: March 8, 2001

Date of Patent: December 18, 2001

Assignee: Advanced Micro Devices, Inc.

Inventor: David B. Witt
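The arithmetic behind the cumulative lookahead in the abstract above is simple to show: once the result of the non-accelerated operation is known, the register value for each following accelerated operation is that result plus the running sum of constants up to that operation. The function and parameter names below are hypothetical.

```python
def lookahead_values(base_result, deltas):
    """Compute the register value after each accelerated operation.

    base_result: result of the most recent non-accelerated operation.
    deltas:      constant increments/decrements of the accelerated operations.
    """
    values = []
    cumulative = 0
    for delta in deltas:
        cumulative += delta  # cumulative offset up to this operation
        values.append(base_result + cumulative)
    return values
```

Each value depends only on `base_result` and a precomputed offset, not on the previous accelerated operation, which is how the chained dependency is avoided. For a stack pointer starting at `0x1000`, two 4-byte pushes followed by an 8-byte release give `0xFFC`, `0xFF8`, `0x1000`.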
Reducing inherited logical to physical register mapping information between tasks in multithread system using register group identifier

Patent number: 6330661

Abstract: A register content inheriting system realizes register content inheritance with simple hardware in a multithread multiprocessor. Respective thread execution units and a physical common register are provided. Using a register mapping table, a register number referenced by each program is mapped to the physical common register. The relationship in the register mapping table is updated only as required when register content is inherited. Upon inheriting the content of the register, the content of the register mapping table is copied.

Type: Grant

Filed: April 26, 1999

Date of Patent: December 11, 2001

Assignee: NEC Corporation

Inventor: Sunao Torii
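The idea in the abstract above, inheriting register contents by copying a mapping table rather than the registers themselves, might be modelled as follows. All names are hypothetical, and allocating a fresh physical register on every write is an assumption of this sketch rather than something the abstract states.

```python
class SharedRegisters:
    """Physical register storage common to all threads."""

    def __init__(self):
        self.physical = []

    def alloc(self, value):
        self.physical.append(value)
        return len(self.physical) - 1


def inherit(parent_map):
    # Copy only the logical-to-physical mapping, not the register contents.
    return dict(parent_map)


def write(regs, mapping, logical, value):
    # Assumed policy: a write remaps the logical register to a new physical one.
    mapping[logical] = regs.alloc(value)


def read(regs, mapping, logical):
    return regs.physical[mapping[logical]]
```

After `inherit`, parent and child see the same values until one of them writes, at which point only the writer's mapping changes.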
Method and apparatus for saturated multiplication and accumulation in an application specific signal processor

Patent number: 6330660

Abstract: An application specific signal processor (ASSP) performs vectorized and nonvectorized operations. Nonvectorized operations may be performed using a saturated multiplication and accumulation operation. The ASSP includes a serial interface, a buffer memory, a core processor for performing digital signal processing which includes a reduced instruction set computer (RISC) processor and four signal processing units. The four signal processing units execute the digital signal processing algorithms in parallel including the execution of the saturated multiplication and accumulation operation. The ASSP is utilized in telecommunication interface devices such as a gateway. The ASSP is well suited to handling voice and data compression/decompression in telecommunication systems where a packetized network is used to transceive packetized data and voice.

Type: Grant

Filed: October 25, 1999

Date of Patent: December 11, 2001

Assignee: VxTel, Inc.

Inventors: Kumar Ganapathy, Ruban Kanapathipillai
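For readers unfamiliar with the term, a saturated multiply-accumulate clamps the result to the representable range instead of letting it wrap around. The sketch below assumes a signed 16-bit accumulator purely for illustration; the abstract does not state the word width, and the function name is hypothetical.

```python
def saturating_mac(acc, a, b, bits=16):
    """Return acc + a * b, clamped to a signed `bits`-wide range."""
    hi = (1 << (bits - 1)) - 1  # largest representable value
    lo = -(1 << (bits - 1))     # smallest representable value
    return max(lo, min(hi, acc + a * b))
```

With the assumed 16-bit width, `saturating_mac(30000, 100, 100)` clamps to 32767 rather than wrapping to a negative number.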
Pairing of micro instructions in the instruction queue

Patent number: 6330657

Abstract: An apparatus and method are presented for increasing the throughput within a single-channel of a pipeline microprocessor. Back-to-back pairs of micro instructions are evaluated to determine if they can be combined for execution in parallel. If so, then they are combined and issued for concurrent execution. The apparatus includes a micro instruction queue that buffers and orders micro instructions for sequential execution by the pipeline microprocessor. Within the micro instruction queue, a second micro instruction is ordered to execute immediately following execution of a first micro instruction. Pairing logic is coupled to the micro instruction queue. The pairing logic combines the first and second micro instructions so that the first and second micro instructions are executed in parallel by the pipeline microprocessor.

Type: Grant

Filed: May 18, 1999

Date of Patent: December 11, 2001

Assignee: IP-First, L.L.C.

Inventors: Gerard M. Col, G. Glenn Henry
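The pairing step in the abstract above examines back-to-back micro instructions and combines them when they can run in parallel. The abstract does not give the pairing criteria, so the check below (the second operation must not read or overwrite the first one's destination) is an assumption, as are the dictionary encoding and the function names.

```python
def can_pair(first, second):
    # Assumed rule: no data dependency and no shared destination register.
    return (first["dest"] not in second["srcs"]
            and first["dest"] != second["dest"])


def pair_micro_ops(queue):
    """Walk the queue in order, issuing pairs where allowed, singles otherwise."""
    issued = []
    i = 0
    while i < len(queue):
        if i + 1 < len(queue) and can_pair(queue[i], queue[i + 1]):
            issued.append((queue[i], queue[i + 1]))
            i += 2
        else:
            issued.append((queue[i],))
            i += 1
    return issued
```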
Instruction converting apparatus using parallel execution code

Patent number: 6324639

Abstract: A processor can decode short instructions with a word length equal to one unit field and long instructions with a word length equal to two unit fields. An opcode of each kind of instruction is arranged into the first unit field assigned to the instruction. The number of instructions to be executed by the processor in parallel is s. When the ratio of short to long instructions is s-1:1, the s-1 short instructions are assigned to the first unit field to the s-1th unit field in the parallel execution code, and the long instruction is assigned to the sth unit field to the (s+k−1)th unit field in the same parallel execution code.

Type: Grant

Filed: March 29, 1999

Date of Patent: November 27, 2001

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Taketo Heishi, Tetsuya Tanaka, Nobuo Higaki, Shuishi Takayama, Kensuke Odani
Single-chip multiprocessor with cycle-precise program scheduling of parallel execution

Publication number: 20010042189

Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.

Type: Application

Filed: February 20, 2001

Publication date: November 15, 2001

Inventors: Boris A. Babaian, Yuli Kh Sakhin, Vladimir Yu Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
Method and apparatus for accelerating software decode of variable length encoded information

Patent number: 6313766

Abstract: A method and apparatus to accelerate variable length decode is disclosed. The system includes a logic device to receive a bit stream of variable length encoded information. The logic device outputs a fixed length value corresponding to a variable length code received as part of the bit stream of the variable length encoded information. The system also includes a processor to receive the fixed length value. The processor performs a write of a coefficient to a system memory device, the coefficient corresponding to the fixed length value received from the logic device.

Type: Grant

Filed: July 1, 1998

Date of Patent: November 6, 2001

Assignee: Intel Corporation

Inventors: Brian K. Langendorf, Brian Tucker
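The role of the logic device in the abstract above, turning variable length codes from a bit stream into fixed length values, can be sketched in software with a prefix-code table. The table contents and the function name are hypothetical; a real decoder would use the code tables of the relevant compression standard.

```python
def decode_vlc(bits, table):
    """Decode a string of '0'/'1' characters using a prefix-code table.

    table: maps each variable length code (as a bit string) to a fixed value.
    """
    values = []
    code = ""
    for bit in bits:
        code += bit
        if code in table:  # a complete code has been recognised
            values.append(table[code])
            code = ""
    if code:
        raise ValueError("trailing bits do not form a complete code")
    return values
```

With the illustrative table `{"0": 0, "10": 1, "110": 2, "111": 3}`, the stream `"0101100111"` splits into the codes `0`, `10`, `110`, `0`, `111`.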
Apparatus and method for improving superscalar processors

Patent number: 6311261

Abstract: The invention involves new microarchitecture apparatus and methods for superscalar microprocessors that support multi-instruction issue, decoupled dataflow scheduling, out-of-order execution, register renaming, multi-level speculative execution, and precise interrupts. These are the Distributed Instruction Queue (DIQ) and the Modified Reorder Buffer (MRB). The DIQ is a new distributed instruction shelving technique that is an alternative to the reservation station (RS) technique and offers a more efficient (improved performance/cost) implementation. The Modified Reorder Buffer (MRB) is an improved reorder buffer (RB) result shelving technique that eliminates the slow and expensive prioritized associative lookup, shared global buses, and dummy branch entries (to reduce entry usage). The MRB has an associative key unit which uses a unique associative key.

Type: Grant

Filed: September 15, 1997

Date of Patent: October 30, 2001

Assignee: Georgia Tech Research Corporation

Inventors: Joseph I. Chamdani, Cecil O. Alford
