Superscalar Patents (Class 712/23)
  • Patent number: 6499099
    Abstract: A central processing unit having an extension instruction comprises a memory address, an offset and a fixed length instruction of varying immediate data. The central processing unit comprises a general register, a special register, a register file constituted as an inner register, a function block for executing the calculation function; an instruction register for memorizing the instruction, a control block for generating/outputting a control signal to the instruction register and a plurality of status flags, in which the special register enables access by a programmer and includes an extension data field for memorizing extension data or an extension register having the extension data field as one element and an extension flag for changing its status when the instruction memorizing the extension data in the extension register is executed and having one or a plurality of bits that is accessible to a programmer.
    Type: Grant
    Filed: January 27, 1999
    Date of Patent: December 24, 2002
    Assignee: Asia Design Co., Ltd.
    Inventor: Kyung Youn Cho
  • Patent number: 6487652
    Abstract: Methods and apparatus for speculatively locking an object are disclosed. According to one aspect of the present invention, a method for acquiring use of an object using a current thread includes a determination of whether a first bit included in the object is set to indicate that the object is speculatively owned by a speculative owner thread. When the object is speculatively owned, the speculative owner thread is allowed to use the object without locking the object. The method also includes checking a stored identifier that is associated with the object and identifies the speculative owner thread, as well as determining whether the stored identifier identifies the current thread. When the stored identifier identifies the current thread, the current thread already has use of the object; i.e., the current thread is the speculative owner thread.
    Type: Grant
    Filed: September 30, 1999
    Date of Patent: November 26, 2002
    Assignee: Sun Microsystems, Inc.
    Inventors: Benedict A. Gomes, Lars Bak, David P. Stoutamire
  • Patent number: 6484251
    Abstract: A processor including a register, an execution unit, a temporary result buffer, and a commit function circuit. The register includes at least one register bit and may include one or more sticky bits. The execution unit is suitable for executing a set of computer instructions. The temporary result buffer is configured to receive, from the execution unit, register bit modification information provided by the instructions. The temporary result buffer is suitable for storing the modification information in set/clear pairs of bits corresponding to respective register bits of the register. The commit function circuit is configured to receive the set/clear pairs of bits from the temporary result buffer when the instruction is committed. The commit function circuit is suitable for generating an updated bit in response to receiving the set/clear pairs of bits. The updated bit is then committed to the corresponding register bit of the register.
    Type: Grant
    Filed: October 14, 1999
    Date of Patent: November 19, 2002
    Assignee: International Business Machines Corporation
    Inventors: Robert Greg McDonald, Peichun Peter Liu, Christopher Hans Olson
  • Patent number: 6484254
    Abstract: According to one aspect of the invention, a method is provided in which store addresses of store instructions dispatched during a last predetermined number of cycles are maintained in a first data structure of a first processor. It is determined whether a load address of a first load instruction matches one of the store addresses in the first data structure. The first load instruction is replayed if the load address of the first load instruction matches one of the store addresses in the first data structure.
    Type: Grant
    Filed: December 30, 1999
    Date of Patent: November 19, 2002
    Assignee: Intel Corporation
    Inventors: Muntaquim F. Chowdhury, Douglas M. Carmean
  • Patent number: 6477562
    Abstract: A multi-streaming processor has multiple streams for processing multiple threads, and an instruction scheduler including a priority record of priority codes for one or more of the streams. The priority codes determine in some embodiments relative access to resources as well as which stream has access at any point in time. In other embodiments priorities are determined dynamically and altered on-the-fly, which may be done by various criteria, such as on-chip processing statistics, by executing one or more priority algorithms, by input from off-chip, according to stream loading, or by combinations of these and other methods. In one embodiment a special code is used for disabling a stream, and streams may be enabled and disabled dynamically by various methods, such as by on-chip events, processing statistics, inpu from off-chip, and by processor interrupts. Some specific applications are taught, including for IP-routers and digital signal processors.
    Type: Grant
    Filed: December 16, 1998
    Date of Patent: November 5, 2002
    Assignee: Clearwater Networks, Inc.
    Inventors: Mario D. Nemirovsky, Adolfo M. Nemirovsky, Narendra Sankar
  • Patent number: 6463525
    Abstract: Where it is desired to perform a double precision operation using single precision operands, first and second single precision operands are loaded into first and second respective rows of a re-order buffer, and third and fourth single precision operands are loaded into third and fourth respective rows of the re-order buffer. A first merge instruction copies the first and second single precision operands from respective first and second rows of the re-order buffer into first and second portions of a fifth row of the re-order buffer, thereby concatenating the first and second single precision operands to represent a first double precision operand. A second merge instruction copies the third and fourth single precision operands from respective third and fourth rows of the re-order buffer into first and second portions of a sixth row of the re-order buffer, thereby concatenating the third and fourth single precision operands to represent a second double precision operand.
    Type: Grant
    Filed: August 16, 1999
    Date of Patent: October 8, 2002
    Assignee: Sun Microsystems, Inc.
    Inventor: J. Arjun Prabhu
  • Publication number: 20020144083
    Abstract: Speculative pre-computation and multithreading (SP), allows a processor to use spare hardware contexts to spawn speculative threads to very effectively pre-fetch data well in advance of the main thread. The burden of spawning threads may fall on the main thread via basic triggers. The speculative threads may also spawn other speculative threads via chaining triggers.
    Type: Application
    Filed: March 30, 2001
    Publication date: October 3, 2002
    Inventors: Hong Wang, Jamison Collins, John P. Shen, Bryan Black, Perry H. Wang, Edward T. Grochowski, Ralph M. Kling
  • Publication number: 20020138717
    Abstract: A processor includes logic for tagging a thread identifier (TID) for usage with processor blocks that are not stalled. Pertinent non-stalling blocks include caches, translation look-aside buffers (TLB), a load buffer asynchronous interface, an external memory management unit (MMU) interface, and others. A processor includes a cache that is segregated into a plurality of N cache parts. Cache segregation avoids interference, “pollution”, or “cross-talk” between threads. One technique for cache segregation utilizes logic for storing and communicating thread identification (TID) bits. The cache utilizes cache indexing logic. For example, the TID bits can be inserted at the most significant bits of the cache index.
    Type: Application
    Filed: May 23, 2002
    Publication date: September 26, 2002
    Inventors: William N. Joy, Marc Tremblay, Gary Lauterbach, Joseph I. Chamdani
  • Patent number: 6453344
    Abstract: A multiprocessor system having a total number of available CPUs partitioned into one or more smaller pools of CPUs called servers where the number of CPUs available to a server is reduced below the total number of available CPUs. Software licensing costs are thereby reduced because the number of CPUs available to run the operating system or ISV software has been reduced to the number of CPUs in the pool of the server rather than the total number of available CPUs in the multiprocessor system. In order to enforce the isolation of CPUs required by software licensing, separate identification codes, CPUIDs, that contain unique system serial numbers are assigned to each server in the multiprocessing system. The multiprocessor system has multiple CPUIDs, one for each server (each pool of CPUs that can execute operating systems and ISV software).
    Type: Grant
    Filed: March 31, 1999
    Date of Patent: September 17, 2002
    Assignee: Amdahl Corporation
    Inventors: Robert Scott Ellsworth, Jonathan Russell Nolting, Keith Joseph Philipp
  • Patent number: 6442670
    Abstract: A data processing system comprises a plurality of nodes and a serial data bus interconnecting the nodes in series in a closed loop, for passing address and data information. At least one processing node includes a processor, a printed circuit board and a memory which is partitioned into a plurality of sections, including a first section for directly sharable memory located on the printed circuit board, and a second section for block sharable memory. A local bus connects the processor, block sharable memory and printed circuit board, for transferring data in parallel from the processor to the directly sharable memory on the printed circuit board, and for transferring data from the block sharable memory to the printed circuit board.
    Type: Grant
    Filed: July 2, 2001
    Date of Patent: August 27, 2002
    Assignee: Sun Microsystems, Inc.
    Inventors: John D. Acton, Michael D. Derbish, Gavin G. Gibson, Jack M. Hardy, Jr., Hugh M. Humphreys, Steven P. Kent, Steven E. Schelong, Ricardo Yong, William B. DeRolf
  • Patent number: 6438677
    Abstract: One embodiment of the present invention provides a system that supports space and time dimensional program execution by facilitating accesses to different versions of a memory element. The system supports a head thread that executes program instructions and a speculative thread that executes program instructions in advance of the head thread. The head thread accesses a primary version of the memory element, and the speculative thread accesses a space-time dimensioned version of the memory element. During a reference to the memory element by the head thread, the system accesses the primary version of the memory element. During a reference to the memory element by the speculative thread, the speculative thread accesses a pointer associated with the primary version of the memory element, and accesses a version of the memory element through the pointer. Note that the pointer points to the space-time dimensioned version of the memory element if the space-time dimensioned version of the memory element exists.
    Type: Grant
    Filed: October 20, 1999
    Date of Patent: August 20, 2002
    Assignee: Sun Microsystems, Inc.
    Inventors: Shailender Chaudhry, Marc Tremblay
  • Patent number: 6438680
    Abstract: When a decision circuit (217) incorporated in a control circuit (21) in an instruction decode unit (2) in a microprocessor (1) decides that an integer operation unit (4) can not execute a following sub instruction, the decision circuit (217) controls each of selectors (211, 214, and 215) and an exchange circuit (216) so that a memory access unit (3) that has already executed a preceding sub instruction can execute the following sub instruction.
    Type: Grant
    Filed: June 3, 1999
    Date of Patent: August 20, 2002
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Akira Yamada, Isao Minematsu
  • Patent number: 6434693
    Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address-collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.
    Type: Grant
    Filed: November 12, 1999
    Date of Patent: August 13, 2002
    Assignee: Seiko Epson Corporation
    Inventors: Cheryl D. Senter, Johannes Wang
  • Patent number: 6434689
    Abstract: An apparatus is described that comprises a data processing unit and at least one coprocessor. The data processing unit comprises a register file having registers, a memory, a plurality of execution units, a coprocessor interface for coupling the at least one coprocessor with the data processing unit, and a pipeline configuration for processing instructions having a fetch stage for fetching an instruction from the memory, a decode stage for decoding an operational code from the instruction, an execution stage for activating one of the execution units, and a write-back stage for God writing back from the execution unit. The data processing unit comprises read-and write-lines coupling the register file with the coprocessor for exchanging operands, at least one control line indicating that the coprocessor is busy, and a plurality of control lines from the decode stage for controlling the coprocessor which are operated upon detection of a coprocessor instruction.
    Type: Grant
    Filed: November 9, 1998
    Date of Patent: August 13, 2002
    Assignee: Infineon Technologies North America Corp.
    Inventors: Rod G. Fleck, Roger D. Arnold, Bruce Holmer, Danielle G. Lemay
  • Publication number: 20020103990
    Abstract: An architecture and method are presented for a computer processor supporting interleaved execution of multiple concurrently-active threads, and capable of independently allocating a portion of the total processor execution time to each of the threads. Compared to existing architectures, in which the portion of processor time allocated to each thread is fixed, the processor architecture described herein is believed to offer higher performance for applications such as communications protocol processing, in which the workload of individual threads may vary, and in which the workload requires real time facilities.
    Type: Application
    Filed: February 1, 2001
    Publication date: August 1, 2002
    Inventor: Hanan Potash
  • Patent number: 6427201
    Abstract: Routine processing for routine data, non-routine processing for routine data and general non-routine processing are to be processed efficiently. To this end, a main CPU 20 has a CPU core 21, having a parallel computational mechanism, a command cache 22 and a data cache 23, as ordinary cache units, and a scratch-pad memory SPR 24 which is an internal high-speed memory capable of performing direct memory accessing (DMA) suited for routine processing. A floating decimal point vector processor (VPE) 30 has an internal high-speed memory (VU-MEM) capable of DMA processing and is tightly connected to the main CPU to form a co-processor. The VPE 40 has a high-speed internal memory 40 (VU-MEM) capable of DMA processing. The DMA controller (DMAC) 14 controls DMA transfer between the main memory 50 and the SPR 24, between the main memory 50 and the (VU-MEM) 34 and between the (VU-MEM) 44 and the SPR 24.
    Type: Grant
    Filed: August 18, 1998
    Date of Patent: July 30, 2002
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Akio Ohba
  • Patent number: 6415354
    Abstract: When a search key is supplied to a content addressable memory (CAM), the CAM signals indicate which CAM entries have matched the key. These signals are provided to a weight array to select the entry of the highest priority. Each entry's priority is indicated by a weight in the weight array. The weight array processing is pipelined. In pipeline stage 0, the most significant bits (bits 0) of the weights are examined, and the highest priorities are selected based on the most significant bits. At pipeline stage 1, the next most significant bits (bits 1) are examined, and so on.
    Type: Grant
    Filed: July 15, 1999
    Date of Patent: July 2, 2002
    Assignee: Applied Micro Circuits Corporation
    Inventors: Alexander Joffe, Oran Uzrad-Nali, Simon H. Milner
  • Patent number: 6412062
    Abstract: The present invention is a method and apparatus to inject an external event to a first pipeline stage in a pipeline chain. A target instruction address corresponding to an instruction is specified. The external event is asserted when there is a match between the target instruction address and a pipeline instruction pointer corresponding to a second pipeline stage. The second pipeline stage is earlier than the first pipeline stage in the pipeline chain. The external event is unmasked via a delivery path between a signal representing the asserted external event and the first pipeline stage.
    Type: Grant
    Filed: June 30, 1999
    Date of Patent: June 25, 2002
    Assignee: Intel Corporation
    Inventors: Yan Xu, Steven J. Tu
  • Patent number: 6412061
    Abstract: A method of dynamically adjusting a multiple stage pipeline to execute one of a set of instructions, wherein each stage has a latency and performs a selected data operation. An instruction to be executed is received and a number of stages of the pipeline is selected to execute the instruction as needed to perform a corresponding data operation. Unnecessary stages are bypassed to a reduced latency and the instruction is executed with the selected stages.
    Type: Grant
    Filed: January 14, 1998
    Date of Patent: June 25, 2002
    Assignee: Cirrus Logic, Inc.
    Inventor: Thomas Anthony Dye
  • Patent number: 6408377
    Abstract: A microprocessor having M parallel pipelines and N arithmetic logic units, where N is less than M. A single instruction fetch stage fetches multi-stage instructions, and a single instruction decoder provides a parallel set of three instructions to the three pipelines. The two ALUs are dynamically connected to two of the pipelines having instructions requiring an ALU, while the third pipeline executes an instruction in parallel that does not require an ALU. The third pipeline may have a move unit connected to it.
    Type: Grant
    Filed: April 26, 2001
    Date of Patent: June 18, 2002
    Assignee: Rise Technology Company
    Inventor: Kenneth K. Munson
  • Patent number: 6408375
    Abstract: A system and method for performing register renaming of source registers in a processor having a variable advance instruction window for storing a group of instructions to be executed by the processor, wherein a new instruction is added to the variable advance instruction window when a location becomes available. A tag is assigned to each instruction in the variable advance instruction window. The tag of each instruction to leave the window is assigned to the next new instruction to be added to it. The results of instructions executed by the processor are stored in a temp buffer according to their corresponding tags to avoid output and anti-dependencies. The temp buffer therefore permits the processor to execute instructions out of order and in parallel. Data dependency checks for input dependencies are performed only for each new instruction added to the variable advance instruction window and register renaming is performed to avoid input dependencies.
    Type: Grant
    Filed: April 5, 2001
    Date of Patent: June 18, 2002
    Assignee: Seiko Epson Corporation
    Inventors: Trevor A. Deosaran, Sanjiv Garg, Kevin R. Iadonato
  • Patent number: 6405304
    Abstract: A technique for managing register assignments. The technique involves maintaining, in a register list memory circuit having entries that respectively correspond to physical registers, a list of register assignments that assign logical registers to the physical registers. The technique further involves maintaining, in a vector memory circuit having bits that respectively correspond to the physical registers, a valid vector that forms, in combination with the list of register assignments, a list of valid register assignments. Furthermore, the technique involves storing, for an instruction that is mapped by the data processor, a copy of the valid vector from the vector memory circuit to a silo memory circuit. Preferably, the processor using the technique has the ability to execute branches of instructions speculatively, and to recover if it is determined that the processor executed down an incorrect instruction branch.
    Type: Grant
    Filed: August 24, 1998
    Date of Patent: June 11, 2002
    Assignee: Compaq Information Technologies Group, L.P.
    Inventors: James Arthur Farrell, Sharon Marie Britton, Harry Ray Fair, III, Bruce Gieseke, Daniel Lawrence Leibholz, Derrick R. Meyer
  • Patent number: 6397319
    Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.
    Type: Grant
    Filed: June 20, 2000
    Date of Patent: May 28, 2002
    Assignee: Matsushita Electric Ind. Co., Ltd.
    Inventors: Shuichi Takayama, Nobuo Higaki
  • Publication number: 20020056034
    Abstract: A data processing system including a memory system and a plurality of peripheral components. A processor is coupled to the memory and peripheral components. A plurality of pipeline stages are implemented within the processor where each stage is configured to perform specific operations according to instructions then associated with that stage. A snapshot register is associated with at least some of the pipeline stages where the snapshot register configured to store data describing the state of execution of the instruction then associated with that stage.
    Type: Application
    Filed: October 1, 1999
    Publication date: May 9, 2002
    Inventors: MARGARET GEARTY, CHIH-JUI PENG
  • Patent number: 6385715
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
    Type: Grant
    Filed: May 4, 2001
    Date of Patent: May 7, 2002
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6385719
    Abstract: A transfer tag is generated by the Instruction Fetch Unit and passed to the decode unit in the instruction pipeline with each group of instructions fetched during a branch prediction by a fetcher. Individual instructions within the fetched group for the branch pipeline are assigned a concatenated version (group tag concatenated with instruction lane) of the transfer tag which is used to match on requests to flush any newer instructions. All potential instruction or Internal Operation latches in the decode pipeline must perform a match and if a match is encountered, all valid bits associated with newer instructions or internal operations upstream from the match are cleared. The transfer tag representing the next instruction to be processed in the branch pipeline is passed to the Instruction Dispatch Unit. The Instruction Dispatch Unit queries the branch pipeline to compare its transfer tag with transfer tags of instructions in the branch pipeline.
    Type: Grant
    Filed: June 30, 1999
    Date of Patent: May 7, 2002
    Assignee: International Business Machines Corporation
    Inventors: John Edward Derrick, Brian R. Konigsburg, Lee Evan Eisen, David Stephen Levitan
  • Publication number: 20020053014
    Abstract: A tag monitoring system for assigning tags to instructions. A source supplies instructions to be executed by a functional unit. A register file stores information required for the execution of each instruction. A queue having a plurality of slots containing tags which are used for tagging the instructions. The tags are arranged in the queue in an order specified by the program order of their corresponding instructions. A control unit monitors the completion of executed instructions and advances the tags in the queue upon completion of an executed instruction. The register file stores an instruction's information at a location in the register file defined by the tag assigned to that instruction. The register file also contains a plurality of read address enable ports and corresponding read output ports. Each of the slots from the queue is coupled to a corresponding one of the read address enable ports. Thus, the information for each instruction can be read out of the register file in program order.
    Type: Application
    Filed: January 3, 2002
    Publication date: May 2, 2002
    Inventors: Kevin R. Iadonato, Trevor A. Deosaran, Sanjiv Garg
  • Patent number: 6381689
    Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor.
    Type: Grant
    Filed: March 13, 2001
    Date of Patent: April 30, 2002
    Assignee: Advanced Micro Devices, Inc.
    Inventors: David B. Witt, Thang M. Tran
  • Patent number: 6378060
    Abstract: The present invention provides a cross-bar circuit that implements a switch of a broadband processor. In an exemplary embodiment, the present invention provides a cross-bar circuit that, in response to partially-decoded instruction information and in response to datapath information, (1) allows any bit from a 2n-bit (e.g. 256-bit) input source word to be switched into any bit position of a 2m-bit (e.g. 128-bit) output destination word and (2) provides the ability to set-to-zero any bit in said 2m-bit output destination word. The cross-bar circuit includes: (1) a switch circuit which includes 2m 2n:1 multiplexor circuits, where each of the 2n:1 multiplexor circuits (a) has a unique n-bit (e.g.
    Type: Grant
    Filed: February 11, 2000
    Date of Patent: April 23, 2002
    Assignee: Microunity Systems Engineering, Inc.
    Inventors: Craig Hansen, Bruce Bateman, John Moussouris
  • Patent number: 6360309
    Abstract: A tag monitoring system for assigning tags to instructions. A source supplies instructions to be executed by a functional unit. A register file stores information required for the execution of each instruction. A queue having a plurality of slots containing tags which are used for tagging the instructions. The tags are arranged in the queue in an order specified by the program order of their corresponding instructions. A control unit monitors the completion of executed instructions and advances the tags in the queue upon completion of an executed instruction. The register file stores an instruction's information at a location in the register file defined by the tag assigned to that instruction. The register file also contains a plurality of read address enable ports and corresponding read output ports. Each of the slots from the queue is coupled to a corresponding one of the read address enable ports. Thus, the information for each instruction can be read out of the register file in program order.
    Type: Grant
    Filed: May 19, 2000
    Date of Patent: March 19, 2002
    Assignee: Seiko Epson Corporation
    Inventors: Kevin R. Iadonato, Trevor A. Deosaran, Sanjiv Garg
  • Publication number: 20020029328
    Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
    Type: Application
    Filed: May 10, 2001
    Publication date: March 7, 2002
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 6353881
    Abstract: A system is provided that facilitates space and time dimensional execution of computer programs through selective versioning of memory elements located in a system heap. The system includes a head thread that executes program instructions and a speculative thread that simultaneously executes program instructions in advance of the head thread with respect to the time dimension of sequential execution of the program. The collapsing of the time dimensions is facilitated by expanding the heap into two space-time dimensions, a primary dimension (dimension zero), in which the head thread operates, and a space-time dimension (dimension one), in which the speculative thread operates. In general, each dimension contains its own version of an object and objects created by the thread operating in the dimension. The head thread generally accesses a primary version of a memory element and the speculative thread generally accesses a corresponding space-time dimensioned version of the memory element.
    Type: Grant
    Filed: May 17, 1999
    Date of Patent: March 5, 2002
    Assignee: Sun Microsystems, Inc.
    Inventors: Shailender Chaudhry, Marc Tremblay
  • Patent number: 6351804
    Abstract: A control bit vector storage is provided. The present control bit vector storage (preferably included within a functional unit) stores control bits indicative of a particular instruction. The control bits are divided into multiple control vectors, each vector indicative of one instruction operation. The control bits control dataflow elements within the functional unit to cause the instruction operation to be performed. Additionally, the present control bit vector storage allows complex instructions (or instructions which produce multiple results) to be divided into simpler operations. The hardware included within the functional unit may be reduced to that employed to perform the simpler operations. In one embodiment, the control bit vector storage comprises a plurality of vector storages. Each vector storage comprises a pair of individual vector storages and a shared vector storage. The shared vector storage stores control bits common to both control vectors.
    Type: Grant
    Filed: October 10, 2000
    Date of Patent: February 26, 2002
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Marty L. Pflum
  • Publication number: 20020016903
    Abstract: The high-performance, RISC core based microprocessor architecture includes an instruction fetch unit for fetching instruction sets from an instruction store and an execution unit that implements the concurrent execution of a plurality of instructions through a parallel array of functional units. The fetch unit generally maintains a predetermined number of instructions in an instruction buffer. The execution unit includes an instruction selection unit, coupled to the instruction buffer, for selecting instructions for execution, and a plurality of functional units for performing instruction specified functional operations. A unified instruction scheduler, within the instruction selection unit, initiates the processing of instructions through the functional units when instructions are determined to be available for execution and for which at least one of the functional units implementing a necessary computational function is available.
    Type: Application
    Filed: May 8, 2001
    Publication date: February 7, 2002
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 6343359
    Abstract: An apparatus is presented for expediting the execution of dependent micro instructions in a pipeline microprocessor having design characteristics—complexity, power, and timing—that are not significantly impacted by the number of stages in the microprocessor's pipeline. In contrast to conventional result distribution schemes where an intermediate result is distributed to multiple pipeline stages, the present invention provides a cache for storage of multiple intermediate results. The cache is accessed by a dependent micro instruction to retrieve required operands. The apparatus includes a result forwarding cache, result update logic, and operand configuration logic. The result forwarding cache stores the intermediate results. The result update logic receives the intermediate results as they are generated and enters the intermediate results into the result forwarding cache.
    Type: Grant
    Filed: May 18, 1999
    Date of Patent: January 29, 2002
    Assignee: IP-First, L.L.C.
    Inventors: Gerard M. Col, G. Glenn Henry
  • Patent number: 6341347
    Abstract: A processor includes a thread switching control logic that performs a fast thread-switching operation in response to an L1 cache miss stall. The fast thread-switching operation implements one or more of several thread-switching methods. A first thread-switching operation is “oblivious” thread-switching for every N cycle in which the individual flip-flops locally determine a thread-switch without notification of stalling. The oblivious technique avoids usage of an extra global interconnection between threads for thread selection. A second thread-switching operation is “semi-oblivious” thread-switching for use with an existing “pipeline stall” signal (if any). The pipeline stall signal operates in two capacities, first as a notification of a pipeline stall, and second as a thread select signal between threads so that, again, usage of an extra global interconnection between threads for thread selection is avoided.
    Type: Grant
    Filed: May 11, 1999
    Date of Patent: January 22, 2002
    Assignee: Sun Microsystems, Inc.
    Inventors: William N. Joy, Marc Tremblay, Gary Lauterbach, Joseph I. Chamdani
  • Patent number: 6341343
    Abstract: Three parallel instruction processing pipelines of a microprocessor share two data memory ports for obtaining operands and writing back results. Since a significant proportion of the instructions of a typical computer program do not require reading operands from the memory, the probability is high that at least one of any three program instructions to be executed at the same time need not fetch an operand from memory. The two memory ports are thus connected at any given time with the two of the three pipelines which are processing instructions that require memory access, the pipeline without access to the memory processing an instruction that does not need it. To do so, the added third pipeline need not have all the same resources as the other two pipelines, so its stages are made to have a reduced capability in order to save space and reduce power consumption.
    Type: Grant
    Filed: April 26, 2001
    Date of Patent: January 22, 2002
    Assignee: Rise Technology Company
    Inventor: Kenneth K. Munson
  • Publication number: 20020007450
    Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor.
    Type: Application
    Filed: March 13, 2001
    Publication date: January 17, 2002
    Inventors: David B. Witt, Thang M. Tran
  • Patent number: 6339822
    Abstract: A microprocessor configured to cache basic blocks of instructions is disclosed. The microprocessor may comprise decoding logic, a basic block cache, and a branch prediction unit. The decoding logic is coupled to receive and decode variable-length instructions into padded instructions that have one of a predetermined number of predetermined lengths. The decoding logic is further configured to form basic blocks of instructions from the padded and decoded instructions. Basic blocks are natural divisions in instruction streams resulting from branch instructions. The start of a basic block is a target of a branch, and the end is another branch instruction. The basic block cache is configured to store the basic blocks in a plurality of storage locations, wherein each storage location is configured to store an address tag, a link bit, and at least a portion of one basic block. The link bit indicates whether the basic block stored in said storage location extends into another storage location.
    Type: Grant
    Filed: October 2, 1998
    Date of Patent: January 15, 2002
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Paul K. Miller
  • Patent number: 6336160
    Abstract: A method and system for dividing computer processor registers into sectors and storing frequently used data in the most significant unused sectors. The method includes sector renaming that is performed on each individual sector (i.e., on a sector-by-sector basis) rather than renaming an entire processor register. A register is divided into sectors such that the smallest accessible unit for an instruction in each register can be uniquely addressed and renamed. A register file is divided into sectors so that each process register can be uniquely addressed and renamed. The most significant sectors of the processor registers are used to hold pre-assigned values therein. Data previously loaded into processor register sectors is stored in the most significant sectors of the processor registers for possible future referencing and use. The method also includes establishing a sign-extend memory that includes at least one sign-extend bit in a sector status table.
    Type: Grant
    Filed: June 19, 1998
    Date of Patent: January 1, 2002
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Nadeem Malik, Alan Vicha Pita, Avijit Saha
  • Patent number: 6336182
    Abstract: A method and system for aligning internal operations (IOPs) for dispatch are disclosed. The method and system comprise conditionally asserting a predecode based on a particular dispatch slot that an instruction is going to be placed. The method and system further include using the information related to the predecode to expand an instruction into at least one dummy operation and an IOP operation whenever the instruction would not be supported in the particular dispatch slot.
    Type: Grant
    Filed: March 5, 1999
    Date of Patent: January 1, 2002
    Assignee: International Business Machines Corporation
    Inventors: John Edward Derrick, Lee Evan Eisen, Paul Joseph Jordan, Robert William Hay
  • Patent number: 6336178
    Abstract: An internal RISC-type instruction structure furnishes a fixed bit-length template including a plurality of defined bit fields for a plurality of operation (Op) formats. One format includes an instruction-type bit field, two source-operand bit fields and one destination-operand bit field for designating a register-to-register operation. Another format is a load-store format that includes an instruction-type bit field, an identifier of a source or destination register for the respective load or store operation, and bit fields for specifying the segment, base and index parameters of an address.
    Type: Grant
    Filed: September 11, 1998
    Date of Patent: January 1, 2002
    Assignee: Advanced Micro Devices, Inc.
    Inventor: John G. Favor
  • Patent number: 6332187
    Abstract: A processor is configured to generate lookahead values using a cumulative constant. The processor classifies operations to a particular register (e.g. the stack pointer register, or ESP in an embodiment employing the x86 instruction set architecture) as either accelerated or non-accelerated. For example, instructions which are defined to increment/decrement the particular register by an explicit or implicit constant value may be accelerated operations. Upon the occurrence of a non-accelerated operation, the processor may begin accumulating the cumulative effect of accelerated operations to the result of the non-accelerated operation as a cumulative offset. The result of the non-accelerated operation (upon execution thereof) may then be added to the cumulative offset values corresponding to each accelerated operation to generate the particular register value corresponding to that accelerated operation. Accordingly, dependencies upon the register due to the accelerated operations may be alleviated.
    Type: Grant
    Filed: March 8, 2001
    Date of Patent: December 18, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David B. Witt
  • Patent number: 6330661
    Abstract: A register content inheriting system contributes for realization of register content inheriting with a hardware of simple construction in a multithread multi-processor. Respective thread execution units and physical common register are provided. Using a register mapping table, a register number to be made reference to from each program is placed in the physical common register. Only as required in inheriting of register content, a relationship of the register mapping table is updated. Upon inheriting the content of the register, the content of the register mapping table is copied.
    Type: Grant
    Filed: April 26, 1999
    Date of Patent: December 11, 2001
    Assignee: NEC Corporation
    Inventor: Sunao Torii
  • Patent number: 6330660
    Abstract: An application specific signal processor (ASSP) performs vectorized and nonvectorized operations. Nonvectorized operations may be performed using a saturated multiplication and accumulation operation. The ASSP includes a serial interface, a buffer memory, a core processor for performing digital signal processing which includes a reduced instruction set computer (RISC) processor and four signal processing units. The four signal processing units execute the digital signal processing algorithms in parallel including the execution of the saturated multiplication and accumulation operation. The ASSP is utilized in telecommunication interface devices such as a gateway. The ASSP is well suited to handling voice and data compression/decompression in telecommunication systems where a packetized network is used to transceive packetized data and voice.
    Type: Grant
    Filed: October 25, 1999
    Date of Patent: December 11, 2001
    Assignee: VxTel, Inc.
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Patent number: 6330657
    Abstract: An apparatus and method are presented for increasing the throughput within a single-channel of a pipeline microprocessor. Back-to-back pairs of micro instructions are evaluated to determine if they can be combined for execution in parallel. If so, then they are combined and issued for concurrent execution. The apparatus includes a micro instruction queue that buffers and orders micro instructions for sequential execution by the pipeline microprocessor. Within the micro instruction queue, a second micro instruction is ordered to execute immediately following execution of a first micro instruction. Pairing logic is coupled to the micro instruction queue. The pairing logic combines the first and second micro instructions so that the first and second micro instructions are executed in parallel by the pipeline microprocessor.
    Type: Grant
    Filed: May 18, 1999
    Date of Patent: December 11, 2001
    Assignee: IP-First, L.L.C.
    Inventors: Gerard M. Col, G. Glenn Henry
  • Patent number: 6324639
    Abstract: A processor can decode short instructions with a word length equal to one unit field and long instructions with a word length equal to two unit fields. An opcode of each kind of instruction is arranged into the first unit field assigned to the instruction. The number of instructions to be executed by the processor in parallel is s. When the ratio of short to long instructions is s-1:1, the s-1 short instructions are assigned to the first unit field to the s-1th unit field in the parallel execution code, and the long instruction is assigned to the sth unit field to the (s+k−1)th unit field in the same parallel execution code.
    Type: Grant
    Filed: March 29, 1999
    Date of Patent: November 27, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Taketo Heishi, Tetsuya Tanaka, Nobuo Higaki, Shuishi Takayama, Kensuke Odani
  • Publication number: 20010042189
    Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.
    Type: Application
    Filed: February 20, 2001
    Publication date: November 15, 2001
    Inventors: Boris A. Babaian, Yuli Kh Sakhin, Vladimir Yu Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
  • Patent number: 6313766
    Abstract: A method and apparatus to accelerate variable length decode is disclosed. The system includes a logic device to receive a bit stream of variable length encoded information. The logic device outputs a fixed length value corresponding to a variable length code received as part of the bit stream of the variable length encoded information. The system also includes a processor to receive the fixed length value. The processor to performs a write of a coefficient to a system memory device, the coefficient corresponding to the fixed length value received from the logic device.
    Type: Grant
    Filed: July 1, 1998
    Date of Patent: November 6, 2001
    Assignee: Intel Corporation
    Inventors: Brian K. Langendorf, Brian Tucker
  • Patent number: 6311261
    Abstract: The invention involves new microarchitecture apparatus and methods for superscalar microprocessors that support multi-instruction issue, decoupled dataflow scheduling, out-of-order execution, register renaming, multi-level speculative execution, and precise interrupts. These are the Distributed Instruction Queue (DIQ) and the Modified Reorder Buffer (MRB). The DIQ is a new distributed instruction shelving technique that is an alternative to the reservation station (RS) technique and offers a more efficient (improved performance/cost) implementation. The Modified Reorder Buffer (MRB) is an improved reorder buffer (RB) result shelving technique eliminates the slow and expensive prioritized associative lookup, shared global buses, and dummy branch entries (to reduce entry usage). The MRB has an associateive key unit which uses a unique associative key.
    Type: Grant
    Filed: September 15, 1997
    Date of Patent: October 30, 2001
    Assignee: Georgia Tech Research Corporation
    Inventors: Joseph I. Chamdani, Cecil O. Alford