Abstract: A processing system provides predicate data that indicates whether instructions processed by a processor pipeline should be executed by the pipeline. In architecture, the system of the present invention utilizes a register, a pipeline, and predicate circuitry. The pipeline includes a first stage and a second stage for processing instructions of a computer program. The predicate circuitry is configured to read a first predicate value from the register and to receive a second predicate value. The predicate circuitry may transmit the first predicate value read from the register to the first stage and then select between the first predicate value and the second predicate value. The predicate value selected by the predicate circuitry is transmitted to the second stage.
Type:
Grant
Filed:
January 24, 2000
Date of Patent:
September 16, 2003
Assignee:
Hewlett-Packard Development Company, L.P.
Inventors:
Gary J Benjamin, Donald Charles Soltis, Jr., Ronny Lee Arnold
Abstract: A processing system receives instructions from a computer program. Each instruction is included within an issue group such that each issue group only includes instructions that may be simultaneously processed. The issue groups are then sequentially transmitted to a plurality of pipelines that simultaneously processes and executes the instructions within the issue groups in program order. During execution, the instructions within an issue group are analyzed to determine whether any of the instructions in the issue group is dependent on unavailable data. Any of the instructions in the issue group determined to be dependent on unavailable data are independently stalled, while execution of other instructions in the issue group is allowed to continue.
Type:
Grant
Filed:
September 7, 1999
Date of Patent:
September 9, 2003
Assignee:
Hewlett-Packard Company, L.P.
Inventors:
Ronny Lee Arnold, Donald Charles Soltis, Jr.
Abstract: In a computer system the instruction decoding unit for translating program instructions to microcode instructions operates dynamically. Thus the unit receives state signals indicating the state of the computer, such as a trace enabling signal, influencing the translation process in the instruction decoding unit. These state signals are added to the operation code of the program instruction to be decoded, the operation code of the program instruction thus being extended and used as input to a translating table, the extended operation code of the program instruction being taken as an address of a field in the table. The addresses and thus the contents of the fields addressed for the same operation code of a program instruction can then be different for different values of the state signals. Thus generally, the state signals cause the instruction decoder to change its translating algorithm so that the decoder can decode an operation code differently depending on the state which the signals adopt.
Type:
Grant
Filed:
December 1, 1998
Date of Patent:
August 26, 2003
Assignee:
Telefonaktiebolaget LM Ericsson (publ)
Inventors:
Tobias Roos, Dan Halvarsson, Tomas Jonsson
Abstract: A parallel hardware-based multithreaded processor is described. The processor includes a general purpose processor that coordinates system functions and a plurality of microengines that support multiple hardware threads. The processor also includes a memory control system that has a first memory controller that sorts memory references based on whether the memory references are directed to an even bank or an odd bank of memory and a second memory controller that optimizes memory references based upon whether the memory references are read references or write references.
Type:
Grant
Filed:
August 31, 1999
Date of Patent:
August 12, 2003
Assignee:
Intel Corporation
Inventors:
Matthew J. Adiletta, Gilbert Wolrich, William Wheeler
Abstract: Instructions asserted in the instruction pipeline (3) of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (15) of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units (7), to interpret that control information and take appropriate action.
Type:
Grant
Filed:
October 20, 1999
Date of Patent:
August 5, 2003
Assignee:
Transmeta Corporation
Inventors:
Brett Coon, Godfrey D'Souza, Paul Serris
Abstract: A bypass logic circuit (30) generates select signals (SelRs0, SelRt0, SelRs1 and SelRt1) by using prediction result flags (PrdNTkn1A and PrdNTkn1D) which are results of prediction about branch, instead of a branch condition not-taken signal (NTknA) actually output from a branch unit (52). Bypass multiplexers (44, 46, 54, 56) select operands to be output to ALU (42) or the branch unit (52) on the basis of these select signals (SelRs0, SelRt0, SelRs1 and SelRt1). Therefore, ample time is given for generating these select signals (SelRs0, SelRt0, SelRs1 and SelRt1).
Abstract: A method of reducing the branch penalty in a microprocessor includes predecoding the instruction to determine whether an instruction is a branch, the length of the instruction, and prediction marker information for the instruction should it be a branch. The target of the branch is relayed to the align stage of the microprocessor to readjust the read pointer to point to the target of the branch if the instruction is a branch. An apparatus for reducing the branch penalty in a microprocessor includes a branch predecode and taken resolution unit which determines whether an instruction is a predicted taken branch, and relays that information to the align stage of the microprocessor to deliver the target of the branch to the align stage as early as possible.
Abstract: Enables a processor to quickly recover reliable use of a multi-cycle index used in a branch prediction mechanism for certain types of flush events occurring in the processor pipeline, whether the flush event occurs for a non-branch instruction or for a branch instruction contained in the same dispatch group. A GHV (global history vector) value is used in the generation of a multi-cycle index required for locating a prediction in a GBHT (global branch history table) for the instruction associated with the GHV value. The GHV value is captured in a BIQ (branch information queue) element representing each branch instruction selected for execution of a program. The BIQ element also captures an associated GHV count when the GHV value is captured.
Type:
Grant
Filed:
November 8, 1999
Date of Patent:
July 22, 2003
Assignee:
International Business Machines Corporation
Abstract: A processor (100) is provided that is a programmable digital signal processor (DSP) with variable instruction length. A user stack region (910) is used to pass variables to a subroutine and to hold values representative of a first portion of a program counter (1000). A system stack region (911) is used to hold values representative of a remaining portion of the program counter (1001) and to hold additional context information. The user stack region and the system stack region are managed independently so that software from a prior generation processor can be translated to run on processor (100).
Type:
Grant
Filed:
October 1, 1999
Date of Patent:
July 22, 2003
Assignee:
Texas Instruments Incorporated
Inventors:
Gilbert Laurenti, Walter A. Jackson, Jack Rosenzweig
Abstract: A pipelined data processor has instructions at different stages of execution. Some of the instructions specify virtual addresses into a file of registers having physical addresses. A speculative translator maps the virtual registers of an instruction at one pipeline stage into physical registers for speculative use by the instruction at a later pipeline stage. The registers have multiple differently translated regions. Failure of speculative renaming reverts to an archive copy of renaming data.
Type:
Grant
Filed:
December 31, 1998
Date of Patent:
July 8, 2003
Assignee:
Intel Corporation
Inventors:
David Hass, Michael P. Corwin, Luke E. Girard, Ken Arora, Harshvardhan Sharangpani, Syed Reza
Abstract: A system for designing and implementing digital integrated circuits utilizing a set of synchronized sequencers that permit quick and efficient parallel processing of system level designs. The system and method converts digital schematics and hardware description language (HDL) based designs into a set of logic equations and single bit arithmetic-logic operations executed by a set of parallel operating sequencers. The system includes software for converting netlists and HDL designs into Boolean logic equations, and a compiler for distributing these logic equations between multiple sequencers. Each sequencer is comprised of a logic processor and the associated program memory for storing the executable code of the assigned Boolean logic equations and data memory for storing the results of processing of logic equations. To synchronize execution of logic equations by multiple sequencers, all program memories are addressed by one common address register.
Abstract: A floating-point unit of a computer includes a floating-point computation unit, floating-point registers and a floating-point status register. The floating-point status register may include a main status field and one or more alternate status fields. Each of the status fields contains flag and control information. Different floating-point operations may be associated with different status fields. Subfields of the floating-point status register may be updated dynamically during operation. The control bits of the alternate status fields may include a trap disable bit for deferring interruptions during speculative execution. A widest range exponent control bit in the status fields may be used to prevent interruptions when the exponent of an intermediate result is within the range of the register format but exceeds the range of the memory format. The floating-point data may be stored in big endian or little endian format.
Type:
Grant
Filed:
October 10, 1998
Date of Patent:
June 10, 2003
Assignee:
Institute for the Development of Emerging Architectures,
L.L.C.
Inventors:
Jerome C. Huck, Peter Markstein, Glenn T. Colon-Bonet, Alan H. Karp, Roger Golliver, Michael Morrison, Gautam B. Doshi
Abstract: A communication control system and method of implementing a high-speed data transmitting and receiving process uses a communication protocol program, a protocol buffer which is managed by the communication protocol program, a communication controller for controlling a transmission line and a network buffer managed by the communication controller.
Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.
Type:
Grant
Filed:
August 1, 2001
Date of Patent:
May 27, 2003
Assignee:
Nintendo Co., Ltd.
Inventors:
Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng, Timothy J. Van Hook
Abstract: A data transfer circuit or a recording apparatus includes an address setting unit for setting a start address for a buffer memory, an offset setting unit for setting an offset for the buffer memory, and an address creating unit for creating a predetermined number of consecutive transfer addresses to be supplied for the buffer memory using a reference address. The circuit also includes an arithmetic logic unit that, after the address creating unit has created transfer addresses using the start address as a reference address, calculates a new reference address in accordance with the offset relative to the start address so as to provide the new reference address to the address creating unit.
Abstract: An apparatus is provided for detecting instruction ordering dependencies. The apparatus includes a plurality of address comparators. Each comparator including a first input adapted to receive a first operand address from one of a plurality of instructions; a second input adapted to receive a second operand address from a second one of a plurality of instructions; and an output to transmit a logic signal responsive to a match between the first and second operand addresses. The address comparators receive the first operand address from a respective, different ones of the plurality of instructions; and a hardware structure to receive the match indications from the address comparators and to indicate a dependency responsive to the match indications from a first one and a second one of the address comparators. A method is provided for detecting instruction dependencies.
Abstract: A processor performs precise trap handling for out-of-order and speculative load instructions. It keeps track of the age of load instructions in a shared scheme that includes a load buffer and a load annex. All precise exceptions are detected in a T phase of a load pipeline. Data and control information concerning load operations that hit in the data cache are staged in a load annex during the A1, A2, A3, and T pipeline stages until all exceptions in the same or earlier instruction packet are detected. Data and control information from all other load instructions is staged in the load annex after the load data is retrieved. Before the load data is retrieved, the load instruction is kept in a load buffer. If an exception occurs, any load in the same instruction packet as the instruction causing the exception is canceled. Any load instructions that are “younger” than the instruction that caused the exception are also canceled.
Type:
Grant
Filed:
October 1, 1999
Date of Patent:
April 1, 2003
Assignee:
Sun Microsystems, Inc.
Inventors:
Marc Tremblay, Jeffrey Meng Wah Chan, Subramania Sudharsanan, Sharada Yeluri, Biyu Pan
Abstract: Method and apparatus for reducing or eliminating retirement logic in an out-of-order processor are disclosed. Instructions are processed using a processing unit capable of out-of-order processing and having architectural registers having an architectural state. Groups of instructions are prepared for processing by processing unit, wherein within each group to be processed the instructions producing the final state of an architectural register are changed so that they write to an output copy of the architectural state, the instructions reading architectural registers are changed to read from an input copy of the architectural state, and the instructions within each group producing results to architectural registers that would be overwritten by another instruction in the group are changed to write their results to temporary registers.
Abstract: A data processing apparatus has a first processing unit for processing an input data, a second processing unit responsive to the data processed by the first processing unit for executing a processing dependent on the data and producing a display data, and a display unit having a display drive unit and a display device for displaying the display data. The second processing unit is selectively inactivated and activated under control of the first processing unit to reduce power consumption in the second processing unit. The display drive unit is also selectively inactivated and activated under control of the first processing unit to reduce power consumption in the display unit.
Type:
Grant
Filed:
May 30, 2000
Date of Patent:
March 18, 2003
Assignee:
Matsushita Electric Industrial Co., Ltd.
Abstract: A processing device (10) provides general-purpose input/output pins (52) for use by software routines as needed. A data input register (54) has bits corresponding to each pin (52) for storing the value of the signal on the pin. A data output register (56) has bits corresponding to each pin for driving the signal on the pin (52) to a desired value. An output enable register (58) controls output buffers (62) coupled between the output register (56) and the pins (52). A plurality of mask registers (60) may be individually set to define a set a pins associated with the mask. Each of the data registers, the data input register (56), the data output register (58) and the output enable register (60) are accessed through a plurality of addresses, where the address specifies both the data register being accessed and an associated mask register (60). Logic (50) accesses the data registers in view of the state of the associated mask register (60).
Type:
Grant
Filed:
November 29, 1999
Date of Patent:
March 11, 2003
Assignee:
Texas Instruments Incorporated
Inventors:
Amarjit S. Bhandal, Graham Short, Richard Simpson