Long Instruction Word Patents (Class 712/24)
  • Patent number: 6738893
    Abstract: A process for scheduling computer processor execution of operations in a plurality of instruction word formats including the steps of arranging commands into properly formatted instruction words beginning at one end into a sequence selected to provide the most rapid execution of the operations, and then rearranging the operations within the plurality of instruction words from the other end of the sequence into instruction words selected to occupy the least space in memory.
    Type: Grant
    Filed: April 25, 2000
    Date of Patent: May 18, 2004
    Assignee: Transmeta Corporation
    Inventor: Guillermo J. Rozas
  • Patent number: 6728846
    Abstract: Atomic multiple word writes are provided when emulating a target system that supports atomic multiple word writes on a host system that does not. For each except the last word to be written, a gate flag is read, tested, and locked when it is found unlocked. The words are then written to memory in reverse order, unlocking the gate flags as they are written. In a host system with a longer word size than the target system, the gate flags can be stored in otherwise unused bits in the host system words containing the target system words to be written.
    Type: Grant
    Filed: December 22, 2000
    Date of Patent: April 27, 2004
    Assignee: Bull HN Information Systems Inc.
    Inventor: Bruce A. Noyes
  • Patent number: 6728865
    Abstract: Instructions asserted in a microprocessors instruction pipeline (3) are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (5) that is synchronized to the instruction pipeline. At the execution stage, the control information is interpreted and appropriate action taken. The control information may indicate that the instruction has been reasserted (asserted again following an initial assertion) and may also indicate the number of times that the instruction has been consecutively asserted in the instruction pipeline. Applied to unaligned memory operations, in which a memory atom is asserted twice, the control information indicates which part of the unaligned data is to be fetched each time the atom is executed.
    Type: Grant
    Filed: October 20, 1999
    Date of Patent: April 27, 2004
    Assignee: Transmeta Corporation
    Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
  • Patent number: 6725357
    Abstract: A system comprises: a first execution unit, a second execution unit and a third execution unit; a first-in-first-out memory arranged to receive a plurality of instructions for the first to third execution units and to output the instructions to the execution units; a memory store for storing at least one instruction for one of the execution units, the at least one instruction being received from the first-in-first-out memory, the first and second execution units being arranged to receive their instructions from the first-in-first-out memory and the third execution unit being arranged to receive the instructions from the memory store, wherein a given instruction for the third execution unit is available to the third execution unit at substantially the same time that the instruction would be available to the first or second execution unit if that instruction was for the first or second execution unit.
    Type: Grant
    Filed: May 2, 2000
    Date of Patent: April 20, 2004
    Assignee: STMicroelectronics S.A.
    Inventor: Jean-Philippe Cousin
  • Patent number: 6725356
    Abstract: The present invention provides a system and method for improving the performance of general purpose processors by expanding at least one source operand to a width greater than the width of either the general purpose register or the data path width. In addition, the present invention provides several classes of instructions which cannot be performed efficiently if the operands are limited to the width and accessible number of general purpose registers. The present invention provides operands which are substantially larger than the data path width of the processor by using a general purpose register to specify a memory address from which at least more than one, but typically several data path widths of data can be read. The present invention also provides for the efficient usage of a multiplier array that is fully used for high precision arithmetic, but is only partly used for other, lower precision operations.
    Type: Grant
    Filed: August 2, 2001
    Date of Patent: April 20, 2004
    Assignee: MicroUnity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris
  • Patent number: 6721875
    Abstract: Disclosed is a computer architecture with single-syllable IP-relative branch instructions and long IP-relative branch instructions (IP=instruction pointer). The architecture fetches instructions in multi-syllable, bundle form. Single-syllable IP-relative branch instructions occupy a single syllable in an instruction bundle, and long IP-relative branch instructions occupy two syllables in an instruction bundle. The additional syllable of the long branch carries with it additional IP-relative offset bits, which when merged with offset bits carried in a core branch syllable provide a much greater offset than is carried by a single-syllable branch alone. Thus, the long branch provides for greater reach within an address space. Use of the long branch to patch IA-64 architecture instruction bundles is also disclosed. Such a patch provides the reach of an indirect branch with the overhead of a single-syllable IP-relative branch.
    Type: Grant
    Filed: February 22, 2000
    Date of Patent: April 13, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: James E McCormick, Jr., Stephen R. Undy, Donald Charles Soltis, Jr.
  • Patent number: 6721873
    Abstract: A method and apparatus for improving dispersal performance of instruction threads is described. In one embodiment, the dispersal logic determines whether the instructions supplied to it include any NOP instructions. When a NOP instruction is detected, the dispersal logic places the NOP into a no-op port for execution. All other instructions are distributed to the proper execution pipes in a normal manner. Because the NOP instructions do not use the execution resources of other instructions, all instruction threads can be executed in one cycle.
    Type: Grant
    Filed: December 29, 2000
    Date of Patent: April 13, 2004
    Assignee: Intel Corporation
    Inventors: Sailesh Kottapalli, Udo Walterscheidt, Andrew Sun, Thomas Yeh, Kinkee Sit
  • Publication number: 20040059894
    Abstract: The program to be executed is compiled by translating it into native instructions of the instruction-set architecture of the processor system, organizing the instructions deriving from the translation of the program into respective bundles in an order of successive bundles, each bundle grouping together instructions adapted to be executed in parallel by the processor system. The bundles of instructions are ordered into respective sub-bundles, said sub-bundles identifying a first set of instructions, which must be executed before the instructions belonging to the next bundle of said order, and a second set of instructions, which can be executed both before and in parallel with respect to the instructions belonging to said subsequent bundle of said order.
    Type: Application
    Filed: July 1, 2003
    Publication date: March 25, 2004
    Applicant: STMicroelectronics S.r.I.
    Inventors: Fabrizio Simone Rovati, Antonio Maria Borneo, Danilo Pietro Pau
  • Patent number: 6711670
    Abstract: A superscalar processing system that detects data hazards within instruction groups utilizes a memory, a plurality of pipelines, an instruction dispersal unit (IDU), and a control mechanism. The memory includes a plurality of entries that respectively correspond with a plurality of registers. The IDU receives an instruction group that includes a plurality of instructions and transmits the instructions of the instruction group to the plurality of pipelines. The control mechanism analyzes one of the instructions and identifies an entry in the memory that corresponds with a register associated with the one instruction. The control mechanism then analyzes the entry and transmits a warning signal in response to a determination that the entry indicates that another instruction within the instruction group is associated with the register.
    Type: Grant
    Filed: October 14, 1999
    Date of Patent: March 23, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Donald Charles Soltis, Jr., Ronny Lee Arnold
  • Patent number: 6704859
    Abstract: A compressed instruction format for a VLIW processor allows greater efficiency in use of cache and memory. Instructions are byte aligned and variable length. Branch targets are uncompressed. Format bits specify how many issue slots are used in a following instruction. NOPS are not stored in memory. Individual operations are compressed according to features such as whether they are resultless, guarded, short, zeroary, unary, or binary. Instructions are stored in compressed form in memory and in cache. Instructions are decompressed on the fly after being read out from cache.
    Type: Grant
    Filed: August 4, 1998
    Date of Patent: March 9, 2004
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Eino Jacobs, Michael Ang
  • Patent number: 6704857
    Abstract: The ManArray processor is a scalable indirect VLIW array processor that defines two preferred architectures for indirect VLIW memories. One approach treats the VIM as one composite block of memory using one common address interface to access any VLIW stored in the VIM. The second approach treats the VIM as made up of multiple smaller VIMs each individually associated with the functional units and each individually addressable for loading and reading during XV execution. The VIM memories, contained in each processing element (PE), are accessible by the same type of LV and XV Short Instruction Words (SIWs) as in a single processor instantiation of the indirect VLIW architecture. In the ManArray architecture, the control processor, also called a sequence processor (SP), fetches the instructions from the SIW memory and dispatches them to itself and the PEs. By using the LV instruction, VLIWs can be loaded into VIMs in the SP and the PEs.
    Type: Grant
    Filed: December 22, 2000
    Date of Patent: March 9, 2004
    Assignee: PTS Corporation
    Inventors: Edwin Frank Barry, Gerald G. Pechanek
  • Patent number: 6704862
    Abstract: One embodiment of the present invention provides a system that supports exception handling through use of a conditional trap instruction. The system supports a head thread that executes program instructions and a speculative thread that speculatively executes program instructions in advance of the head thread. During operation, the system uses the speculative thread to execute code, which includes an instruction that can cause an exception condition. After the instruction is executed, the system determines if the instruction caused the exception condition. If so, the system writes an exception condition indicator to a register. At some time in the future, the system executes a conditional trap instruction which examines a value in the register. If the value in the register is an exception condition indicator, the system executes a trap handling routine to handle the exception condition. Otherwise, the system proceeds with execution of the code.
    Type: Grant
    Filed: June 9, 2000
    Date of Patent: March 9, 2004
    Assignee: Sun Microsystems, Inc.
    Inventors: Shailender Chaudhry, Marc Tremblay
  • Publication number: 20040030860
    Abstract: A VLIW processor for executing a sequence of very long instruction words having a plurality of operations to be executed in parallel. The VLIW processor has a plurality of functional units for parallel execution of the operations specified by the VLIW, an instruction register for holding the VLIW, and a condition flag for indicating the results of a comparison operation. The VLIW includes a conditional head and a plurality of slots, each slot including an operational code and any related operands. The conditional head has a plurality of conditional indicators, each conditional indicator uniquely corresponding to one operation and specifying a condition in which the operation is to be executed if the indicated condition exists. A control circuit is connected to the instruction register and the functional units to deliver the operation from the instruction register to the corresponding functional unit for execution when the condition exists.
    Type: Application
    Filed: August 8, 2002
    Publication date: February 12, 2004
    Inventor: Yu-Min Wang
  • Publication number: 20040019766
    Abstract: Multiple instructions, specifying equivalent operations but designating different execution units, are stored beforehand on an instruction exchange table. First, a primary compiler compiles a source program into a set of machine-readable instructions. From the set of instructions, an instruction parallelizer generates a set of long instruction words. Specifically, an instruction identifier identifies one of the instructions in the set with one of the instructions stored on the instruction exchange table. Then, an instruction replacer replaces the instruction in question with another one of the instructions that is also stored on the instruction exchange table, specifies an equivalent operation but designates a different execution unit as a target. In this manner, the number of parallelly executable instructions can be increased, while the number of no-operation instructions can be reduced, thus generating a parallelized instruction set at a higher level of parallelism.
    Type: Application
    Filed: July 18, 2003
    Publication date: January 29, 2004
    Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
    Inventor: Kenichi Kawaguchi
  • Patent number: 6684319
    Abstract: The present invention minimizes power consumption and processing time in a very long instruction word digital signal processor by identifying certain blocks of instructions and placing them in a small, fast buffer for subsequent retrieval and execution. A decoder unit decodes a prefetch instruction flag bit that indicates when instructions are to be prefetched and placed into the buffer. The decoder unit signals a control unit, which sends the instruction code from a memory unit to the buffer and maintains an address mapping table and a program counter. The control unit also sets a select input on a multiplexer to indicate that the multiplexer is to output the prefetch instructions it receives from the buffer. The multiplexer outputs the prefetch instructions to an instruction register that sends the prefetch instructions to appropriate functional units for execution.
    Type: Grant
    Filed: June 30, 2000
    Date of Patent: January 27, 2004
    Assignee: Conexant Systems, Inc.
    Inventors: Moataz A. Mohamed, Keith M. Bindloss
  • Patent number: 6684320
    Abstract: An apparatus and method for issue grouping of instructions in a VLIW processor is disclosed. There can be one, two, or three issue groups (but no greater than three issue groups) in each VLIW packet. In one embodiment, a template in the VLIW packet comprises two issue group end markers where each issue group end marker comprises three bits. The three bits in the first issue group end marker identifies the instruction which is the last instruction in the first issue group. Likewise, the three bits in the second issue group end marker identifies the instruction which is the last instruction in the second issue group. Any instructions in the VLIW packet falling outside the two expressly defined first and second issue groups are placed in a third issue group. As such, three issue groups can be identified by use of the two issue group end markers. In one embodiment, the template of the VLIW packet includes a chaining bit.
    Type: Grant
    Filed: February 28, 2002
    Date of Patent: January 27, 2004
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Moataz A Mohamed, Chien-Wei Li, John R. Spence
  • Patent number: 6665791
    Abstract: A method and apparatus are disclosed for releasing functional units in a multithreaded very large instruction word (VLIW) processor. The functional unit release mechanism can retrieve the capacity lost due to multiple cycle instructions. The functional unit release mechanism of the present invention permits idle functional units to be reallocated to other threads, thereby improving workload efficiency. Instruction packets are assigned to functional units, which can maintain their state, independent of the issue logic. Each functional unit has an associated state machine (SM) that keeps track of the number of cycles that the functional unit will be occupied by a multiple-cycle instruction. Functional units do not reassign themselves as long as the functional unit is busy. When the instruction is complete, the functional unit can participate in functional unit allocation, even if other functional units assigned to the same thread are still busy.
    Type: Grant
    Filed: March 30, 2000
    Date of Patent: December 16, 2003
    Assignee: Agere Systems Inc.
    Inventors: Alan David Berenbaum, Nevin Heintze, Tor E. Jeremiassen, Stefanos Kaxiras
  • Patent number: 6658655
    Abstract: A threaded interpreter (916) is suitable for executing a program comprising a series of program instructions stored in a memory (904). For the execution of a program instruction the threaded interpreter includes a preparatory unit (918) for executing a plurality of preparatory steps making the program instruction available in the threaded interpreter, and an execution unit (920) with one or more machine instructions emulating the program instruction. According to the invention, the threaded interpreter is designed such that during the execution on an instruction-level parallel processor of the series of program instructions machine instructions implementing a first one of the preparatory steps are executed in parallel with machine instructions implementing a second one of the preparatory steps for respective ones of the series of program instructions.
    Type: Grant
    Filed: December 6, 1999
    Date of Patent: December 2, 2003
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Jan Hoogerbrugge, Alexander Augusteijn
  • Patent number: 6658551
    Abstract: A method and apparatus are disclosed for allocating functional units in a multithreaded very large instruction word (VLIW) processor. The present invention combines the techniques of conventional very long instruction word (VLIW) architectures and conventional multithreaded architectures to reduce execution time within an individual program, as well as across a workload. The present invention utilizes instruction packet splitting to recover some efficiency lost with conventional multithreaded architectures. Instruction packet splitting allows an instruction bundle to be partially issued in one cycle, with the remainder of the bundle issued during a subsequent cycle. There are times, however, when instruction packets cannot be split without violating the semantics of the instruction packet assembled by the compiler. A packet split identification bit is disclosed that allows hardware to efficiently determine when it is permissible to split an instruction packet.
    Type: Grant
    Filed: March 30, 2000
    Date of Patent: December 2, 2003
    Assignee: Agere Systems Inc.
    Inventors: Alan David Berenbaum, Nevin Heintze, Tor E. Jeremiassen, Stefanos Kaxiras
  • Patent number: 6654869
    Abstract: A microprocessor includes a fetch unit, an instruction cracking unit, and dispatch and completion control logic. The fetch unit retrieves a set of instructions from an instruction cache. The instruction cracking unit receives the set of fetched instructions and organizes the set of instructions into an instruction group. The dispatch and completion logic assigns a group tag to the instruction group and records the group tag in an entry of the completion table for tracking the completion status of the instructions comprising the instruction group. The dispatch and control logic may record a single instruction address in the completion table entry corresponding to the each instruction group. Preferably, the single instruction address is the instruction address of the first instruction in the instruction group. The processor may flush the instruction group in response to detecting an exception generated by an instruction in the instruction group.
    Type: Grant
    Filed: October 28, 1999
    Date of Patent: November 25, 2003
    Assignee: International Business Machines Corporation
    Inventors: James Allan Kahle, Hung Qui Le, Charles Roberts Moore
  • Patent number: 6654870
    Abstract: Port priorities are defined on a 32-bit word, 16-bit half-word, and 8-bit byte basis to control the write enable signals to a compute register file (CRF). With a manifold array (ManArray) reconfigurable register file, it is possible to have double-word 64-bit and single word 32-bit data-type instructions mixed with other double-word, single-word, half-word, or byte data-type instructions within the same very long instruction word (VLIW). By resolving a write priority conflict on the byte, half-word, or word that is in conflict during the VLIW execution, it is possible to have partial operations complete that provide a useful function. For. example, a load half-word to the half-word H0 portion of a 32-bit register R0 can have priority to complete its operation while a 64-bit shift of the register pair R0 and R1 will complete its operation on the non-conflicting half-word portions of the 64-bit register R0 and R1.
    Type: Grant
    Filed: June 21, 2000
    Date of Patent: November 25, 2003
    Assignee: PTS Corporation
    Inventors: Edwin Frank Barry, Edward A. Wolff, Patrick Rene Marchand, David Carl Strube
  • Patent number: 6643768
    Abstract: A dyadic digital signal processing (DSP) instruction processor including a first DSP functional block to execute a main operation of a dyadic DSP instruction and a second DSP functional block to execute a sub operation of the dyadic DSP instruction with data paths of each selectively configured to execute the main operation and the sub operation of the dyadic DSP instruction. A voice and data communication system has a first gateway and a second gateway coupled to a packetized network, each gateway having a network interface including the dyadic DSP instruction processor. An application specific signal processor with a signal processor having a first DSP functional block to execute a main operation of a dyadic DSP instruction and a second DSP functional block to execute a sub operation with multiplexers coupled to the first DSP functional block and the second DSP functional block to selectively configure data paths thereto.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: November 4, 2003
    Assignee: Intel Corporation
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Publication number: 20030200420
    Abstract: A hierarchical instruction set architecture (ISA) provides pluggable instruction set capability and support of array processors. The term pluggable is from the programmer's viewpoint and relates to groups of instructions that can easily be added to a processor architecture for code density and performance enhancements. One specific aspect addressed herein is the unique compacted instruction set which allows the programmer the ability to dynamically create a set of compacted instructions on a task by task basis for the primary purpose of improving control and parallel code density. These compacted instructions are parallelizable in that they are not specifically restricted to control code application but can be executed in the processing elements (PEs) in an array processor. The ManArray family of processors is designed for this dynamic compacted instruction set capability and also supports a scalable array of from one to N PEs.
    Type: Application
    Filed: April 28, 2003
    Publication date: October 23, 2003
    Applicant: PTS Corporation
    Inventors: Gerald G. Pechanek, Edwin F. Barry, Juan Guillermo Revilla, Larry D. Larsen
  • Patent number: 6631461
    Abstract: An instruction set architecture (ISA) for application specific signal processor (ASSP) is tailored to digital signal processing applications. The instruction set architecture implemented with the ASSP, is adapted to DSP algorithmic structures. The instruction word of the ISA is typically 20 bits but can be expanded to 40-bits to control two instructions to be executed in series or parallel. All DSP instructions of the ISA are dyadic DSP instructions performing two operations with one instruction in one cycle. The DSP instructions or operations in the preferred embodiment include a multiply instruction (MULT), an addition instruction (ADD), a minimize/maximize instruction (MIN/MAX) also referred to as an extrema instruction, and a no operation instruction (NOP) each having an associated operation code (“opcode”). The present invention efficiently executes DSP instructions by means of the instruction set architecture and the hardware architecture of the application specific signal processor.
    Type: Grant
    Filed: August 8, 2002
    Date of Patent: October 7, 2003
    Assignee: Intel Corporation
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Patent number: 6622234
    Abstract: Techniques for adding more complex instructions and their attendant multi-cycle execution units with a single instruction multiple data, stream (SIMD) very long instruction word (VLIW) processing framework are described. In one aspect, an initiation mechanism also acts as a resynchronization mechanism to read the results of multi-cycle execution. This multi-purpose mechanism operates with a short instruction word (SIW) issue of the multi-cycle instruction, in a sequence processor (SP) alone, with a VLIW, and across all processing elements (PEs) individually or as an array of PEs. A number of advantageous floating point instructions are also described.
    Type: Grant
    Filed: June 21, 2000
    Date of Patent: September 16, 2003
    Assignee: PTS Corporation
    Inventors: Gerald G. Pechanek, David Carl Strube, Edward A. Wolff, Edwin Frank Barry, Grayson Morris, Carl Donald Busboom, Dale Edward Schneider
  • Patent number: 6615338
    Abstract: A Very Long Instruction Word (VLIW) processor has a clustered architecture including a plurality of independent functional units and a multi-ported register file that is divided into a plurality of separate register file segments, the register file segments being individually associated with the plurality of independent functional units. The functional units access the respective associated register file segments using read operations that are local to the functional unit/ register file segment pairs. In contrast, the functional units access the register file segments using write operations that are broadcast to a plurality of register file segments. Independence between clusters is attained since the separate clustered functional unit/ register file segment pairs have local (internal) bypassing that allows internal computations to proceed, but have only limited bypassing between different functional unit/ register file segment pair clusters.
    Type: Grant
    Filed: December 3, 1998
    Date of Patent: September 2, 2003
    Assignee: Sun Microsystems, Inc.
    Inventors: Marc Tremblay, William Joy
  • Patent number: 6615339
    Abstract: A VLIW processor includes an instruction decode unit selecting one of parallel execution and consecutive execution and decoding a plurality of operation instructions included in an instruction word, and a program counter control unit controlling a value of a program counter for providing an indication for the instruction decode unit to provide as no-operation an operation instruction provided in a consecutive execution and executed prior to an operation instruction executed during a consecutive execution when branching to the operation instruction executed during the consecutive execution is introduced. This renders it possible to branch to an operation instruction executed during a consecutive execution and thus provide an enhanced efficiency of instruction-code compression.
    Type: Grant
    Filed: January 18, 2000
    Date of Patent: September 2, 2003
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Hironobu Ito, Hisakazu Sato
  • Publication number: 20030163669
    Abstract: The invention provides a processor that processes bundles of instructions preferentially through clusters or execution units according to thread characteristics. The cluster architectures of the invention preferably include capability to process “multi-threaded” instructions. Selectively, the architecture either (a) processes singly-threaded instructions through a single cluster to avoid bypassing and to increase throughput, or (b) processes singly-threaded instructions through multiple processes to increase “per thread” performance. The architecture may be “configurable” to operate in one of two modes: in a “wide” mode of operation, the processor's internal clusters collectively process bundled instructions of one thread of a program at the same time; in a “throughput” mode of operation, those clusters independently process instruction bundles of separate program threads.
    Type: Application
    Filed: February 27, 2002
    Publication date: August 28, 2003
    Inventor: Eric DeLano
  • Publication number: 20030154358
    Abstract: Apparatus and method for dispatching a very long instruction word (VLIW) instruction having a variable length are provided. The apparatus for dispatching a VLIW instruction includes a packet buffer for storing at least one or more VLIW instructions, and a decoding unit configured to constitute a VLIW instruction to be currently executed among the VLIW instructions stored in the packet buffer and decode predetermined bits of each sub-instruction contained in the VLIW instruction. The apparatus dispatches a corresponding sub-instruction to an FU which corresponds to each sub-instruction, based on the results of decoding performed in the decoding unit, position information on the sub-instructions that are placed on the packet buffer, and position information on the sub-instructions that are placed in the current VLIW instruction. Sub-instructions can be effectively dispatched to corresponding FUs using simple decoding logic even in a case where the length of the VLIW instruction is not fixed.
    Type: Application
    Filed: December 3, 2002
    Publication date: August 14, 2003
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Nak-Hee Seong, Kyoung-Mook Lim, Seh-Woong Jeong, Jae-Hong Park, Hyung-Jun Im, Gun-Young Bae, Young-Duck Kim
  • Patent number: 6606699
    Abstract: An apparatus for concurrently executing controller single instruction single data (SISD) instructions and single instruction multiple data (SIMD) processing element instructions comprising a combined controller and processing element. At least first and second simplex instructions each comprise a mode of operation bit, said mode of operation bit in the first simplex instruction specifying a controller SISD operation for execution by the controller, and the mode of operation bit in the second simplex instruction specifying a procesing element SIMD operation for execution by the processsing element. A very long instruction word (VLIW) contains said at least first and second simplex instructions.
    Type: Grant
    Filed: February 14, 2001
    Date of Patent: August 12, 2003
    Assignee: Bops, Inc.
    Inventors: Gerald G. Pechanek, Juan G. Revilla
  • Patent number: 6604188
    Abstract: Instructions asserted in the instruction pipeline (3) of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (15) of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units (7), to interpret that control information and take appropriate action.
    Type: Grant
    Filed: October 20, 1999
    Date of Patent: August 5, 2003
    Assignee: Transmeta Corporation
    Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
  • Patent number: 6581131
    Abstract: A method and apparatus for efficient cache mapping of compressed Very Long Instruction Word (VLIW) instructions. In the present invention, efficient cache mapping of compressed variable length cache lines is performed by decompressing a sequence of compressed instructions to obtain decompressed cache lines and storing the decompressed cache lines in the same sequence in the cache memory. The present invention decouples the program counter based cache mapping from the memory address. In this way, a fixed increment cache pointer and variable size compressed cache line can be achieved, and, in doing so, decompressed cache lines fit nicely within the cache, in sequential order, while variable length compressed cache lines can be directly accessed without the use of a translation table.
    Type: Grant
    Filed: January 9, 2001
    Date of Patent: June 17, 2003
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Gary L Vondran, Jr.
  • Patent number: 6581152
    Abstract: An indirect VLIW (iVLIW) architecture is described which contains a minimum of two instruction memories. The first instruction memory (SIM) contains short-instruction-words (SIWs) of a fixed length. The second instruction memory (VIM), contains very-long-instruction-words (VLIWs) which allow execution of multiple instructions in parallel. Each SIW may be fetched and executed as an independent instruction by one of the available execution units. A special class of SIW is used to reference the VIM indirectly to either execute or load a specified VLIW instruction (called an “XV” instruction for “eXecute VLIW”, or LV for “Load VLIW”). In these cases, the SIW instruction specifies how the location of the VLIW is to be accessed. Other aspects of this approach relate to the application of data memory addressing techniques for execution or loading of VLIWs that parallel the addressing modes used for data memory accesses.
    Type: Grant
    Filed: February 11, 2002
    Date of Patent: June 17, 2003
    Assignee: BOPS, Inc.
    Inventors: Edwin F. Barry, Gerald G. Pechanek
  • Publication number: 20030088758
    Abstract: Methods and systems are disclosed for calculating the number of valid instructions in a microprocessor instruction bundle. One method advances the instructions along the pipeline and edge detects the number of valid instructions within the pipeline. Another method fetches a bundle of instructions, shifts instructions within the bundle, and edge detects the valid instructions. Still another method fetches the bundle of instructions and detects a complex instruction within the bundle. Instructions occurring after the complex instruction are shifted, and the number of valid instructions occurring after the complex instruction are edge detected.
    Type: Application
    Filed: November 8, 2001
    Publication date: May 8, 2003
    Inventors: Matthew Becker, Masooma Bhaiwala
  • Patent number: 6557094
    Abstract: A hierarchical instruction set architecture (ISA) provides pluggable instruction set capability and support of array processors. The term pluggable is from the programmer's viewpoint and relates to groups of instructions that can easily be added to a processor architecture for code density and performance enhancements. One specific aspect addressed herein is the unique compacted instruction set which allows the programmer the ability to dynamically create a set of compacted instructions on a task by task basis for the primary purpose of improving control and parallel code density. These compacted instructions are parallelizable in that they are not specifically restricted to control code application but can be executed in the processing elements (PEs) in an array processor. The ManArray family of processors is designed for this dynamic compacted instruction set capability and also supports a scalable array of from one to N PEs.
    Type: Grant
    Filed: September 28, 2001
    Date of Patent: April 29, 2003
    Assignee: Bops, Inc.
    Inventors: Gerald G. Pechanek, Edwin F. Barry, Juan Guillermo Revilla, Larry D. Larsen
  • Publication number: 20030079109
    Abstract: A pipelined data processing unit includes an instruction sequencer and n functional units capable of executing n operations in parallel. The instruction sequencer includes a random access memory for storing very-long-instruction-words (VLIWs) used in operations involving the execution of two or more functional units in parallel. Each VLIW comprises a plurality of short-instruction-words (SIWs) where each SIW corresponds to a unique type of instruction associated with a unique functional unit. VLIWs are composed in the VLIW memory by loading and concatenating SIWs in each address, or entry. VLIWs are executed via the execute-VLIW (XV) instruction. The iVLIWs can be compressed at a VLIW memory address by use of a mask field contained within the XV1 instruction which specifies which functional units are enabled, or disabled, during the execution of the VLIW. The mask can be changed each time the XV1 instruction is executed, effectively modifying the VLIW every time it is executed.
    Type: Application
    Filed: September 24, 2002
    Publication date: April 24, 2003
    Applicant: BOPS, Inc.
    Inventors: Gerald G. Pechanek, Juan Guillermo Revilla, Edwin F. Barry
  • Publication number: 20030074654
    Abstract: A digital computer system automatically creates an Instruction Set Architecture (ISA) that potentially exploits VLIW instructions, vector operations, fused operations, and specialized operations with the goal of increasing the performance of a set of applications while keeping hardware cost below a designer specified limit, or with the goal of minimizing hardware cost given a required level of performance.
    Type: Application
    Filed: October 16, 2001
    Publication date: April 17, 2003
    Inventors: David William Goodwin, Dror Maydan, Ding-Kai Chen, Darin Stamenov Petkov, Steven Weng-Kiang Tjiang, Peng Tu, Christopher Rowen
  • Patent number: 6542862
    Abstract: An apparatus and method for determining register dependency in multiple architecture system. The system includes a microprocessor emulating an emulated instruction set using a native instruction set where the microprocessor contains at least one register. An execution engine provides the native instructions where each native instruction contains at least one register identifier. Flags are provided to each native instruction where each flag indicates whether a register identifier is valid. A bundler checks for dependency among the valid register identifiers in the native instructions.
    Type: Grant
    Filed: February 18, 2000
    Date of Patent: April 1, 2003
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Kevin David Safford, Patrick Knebel, Joel D Lamb
  • Patent number: 6542989
    Abstract: A processor comprises an arithmetic logic unit (ALU) that co-operates with a stack arrangement (STCK). The processor is arranged to execute instructions (INSTR) which include a stack control field (SCF) and an opcode field (OPF) for controlling the stack arrangement (STCK) and the arithmetic logic unit (ALU), respectively.
    Type: Grant
    Filed: January 28, 2000
    Date of Patent: April 1, 2003
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Marc Duranton
  • Patent number: 6539471
    Abstract: Method and apparatus for reducing or eliminating retirement logic in an out-of-order processor are disclosed. Instructions are processed using a processing unit capable of out-of-order processing and having architectural registers having an architectural state. Groups of instructions are prepared for processing by processing unit, wherein within each group to be processed the instructions producing the final state of an architectural register are changed so that they write to an output copy of the architectural state, the instructions reading architectural registers are changed to read from an input copy of the architectural state, and the instructions within each group producing results to architectural registers that would be overwritten by another instruction in the group are changed to write their results to temporary registers.
    Type: Grant
    Filed: December 23, 1998
    Date of Patent: March 25, 2003
    Assignee: Intel Corporation
    Inventor: Gad S. Sheaffer
  • Patent number: 6516407
    Abstract: To an existing instruction set, newly added are a condition code conversion instruction for converting a first condition code (N, Z, OV, C) to a second condition code (V, S) based on a reference condition code COND, a second conditional instruction having a reference flag SF, and an instruction of operation between two selected second condition codes. A VLIW processor comprises a second condition code register file 163, a condition code conversion circuit 12A, and a logic operation circuit 12E for performing a non-Boolean logic operation between two selected second condition codes.
    Type: Grant
    Filed: December 28, 1999
    Date of Patent: February 4, 2003
    Assignee: Fujitsu Limited
    Inventors: Atsuhiro Suga, Toshihiro Ozawa
  • Publication number: 20030023830
    Abstract: Aspects of a method and system for encoding instructions as a very long instruction word for processing in a plurality of computation units that reduces instruction memory requirements in a processing system are described. The aspects include determining at which stages of instruction processing that an instruction code needs to be executed. Further, an enable signal of the instruction code is utilized to direct execution during the determined stages by controlling storage operations for the instruction code.
    Type: Application
    Filed: July 25, 2001
    Publication date: January 30, 2003
    Inventor: Eugene B. Hogenauer
  • Patent number: 6499100
    Abstract: When decoding instructions of a program to be executed in a central processing unit comprising pipelining facilities for fast instruction decoding, part of the decoding is executed or the decoding in pipelining units is prepared in a remapping unit during loading a program into a program or primary memory used by the central processor, the remapping or predecoding operation resulting in operation codes which can be very rapidly interpreted by the pipelining units of the central processor. Thus, the operation code field of an instruction is changed to include information on e.g., instruction length, jumps, parameters, etc., this information indicating the instruction length, whether it is a jump instruction or has a parameter etc. respectively, in a direct way that allows the use of simple combinatorial circuits in the pipelining units.
    Type: Grant
    Filed: May 30, 2000
    Date of Patent: December 24, 2002
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Dan Halvarsson, Tomas Jonsson, Per Holmberg
  • Patent number: 6499096
    Abstract: A VLIW processor includes a plurality of containers holding a plurality of sub-instructions in a VLIW instruction, an exchanging portion exchanging the plurality of sub-instructions held in the plurality of containers and inputting the instructions to the plurality of containers, a plurality of decoders decoding the sub-instructions held in the plurality of containers, and a plurality of processing units executing the sub-instructions decoded by the plurality of decoders. Since the exchanging portion exchanges a plurality of sub-instructions held in the plurality of containers and inputs the instructions to the plurality of containers, a compressed code can be executed in such an execution sequence that is taken prior to compression.
    Type: Grant
    Filed: September 29, 1999
    Date of Patent: December 24, 2002
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventor: Hiroaki Suzuki
  • Patent number: 6499097
    Abstract: The present invention provides an instruction fetch unit aligner. In one embodiment, an apparatus for an instruction fetch unit aligner includes selection logic for selecting a non-power of two size instruction from power of two size instruction data, and control logic for controlling the selection logic.
    Type: Grant
    Filed: May 31, 2001
    Date of Patent: December 24, 2002
    Assignee: Sun Microsystems, Inc.
    Inventors: Marc Tremblay, Graham R. Murphy, Frank C. Chiu
  • Publication number: 20020194454
    Abstract: The invention relates to a method and an arrangement for instruction word generation in the driving of functional units in a processor, the instruction words comprising a plurality of instruction word parts. In this case, in a program sequence, under the control of a program word, an instruction word is taken from a row—determined by a reading row number—of an instruction word memory that can be written to row by row, the said instruction word is altered by means of substitution of an instruction word part by the information part of the respective program word and is written back to a row of the instruction word memory, the said row being determined by a writing row number. Afterwards, an instruction word—which is generated in this way and is to be executed in accordance with the program—for driving the functional units is output to the processor.
    Type: Application
    Filed: February 14, 2002
    Publication date: December 19, 2002
    Inventors: Matthias Weiss, Gerhard Fettweis
  • Patent number: 6496919
    Abstract: A data processor which includes a first processor for executing a first instruction set and a second processor for executing a second instruction set different from the first instruction set. When the first processor executes a predetermined instruction of the first instruction set the second processor executes an instruction of the second instructions set. The first processor may be a reduced instruction set computer (RISC) type processor, the second processor may be a very long instruction word (VLIW) type processor, the first instruction set may be a RISC instruction set and the second instruction set may be a VLIW instruction set. The predetermined instruction of the RISC instruction set executed by the first processor may be a branch instruction causing a branch to a specific address space at which VLIW instructions are stored. Thereafter, the VLIW instructions at the specific address space are executed by the VLIW type processor.
    Type: Grant
    Filed: August 25, 1999
    Date of Patent: December 17, 2002
    Assignee: Hitachi, Ltd.
    Inventors: Junichi Nishimoto, Hideo Maejima
  • Publication number: 20020178345
    Abstract: General purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PE)s. Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions.
    Type: Application
    Filed: April 1, 2002
    Publication date: November 28, 2002
    Applicant: BOPS, Inc.
    Inventors: Thomas L. Drabenstott, Gerald George Pechanek, Edwin Franklin Barry, Charles W. Kurak,
  • Publication number: 20020169942
    Abstract: The VLIW processor according to the present invention, which executes in parallel a plurality of processings described in parallel in a VLIW instruction using a plurality of execution pipelines, performs pipeline execution of processings selected and designated from among the plurality of processings based on the VLIW instruction in respective steps on a diagonal formed by shifting one step at a time starting with an initial step in the order of parallel arrangement of the plurality of execution pipelines, one by one in the direction of the diagonal.
    Type: Application
    Filed: May 3, 2002
    Publication date: November 14, 2002
    Inventor: Hideki Sugimoto
  • Publication number: 20020161986
    Abstract: An instruction processing method for checking an arrangement of basic instructions in a very long instruction word (VLIW) instruction, suitable for language processing systems, an assembler and a compiler, used for processors which execute variable length VLIW instructions designed based on variable length VLIW architecture.
    Type: Application
    Filed: January 24, 2002
    Publication date: October 31, 2002
    Applicant: FUJITSU LIMITED
    Inventors: Teruhiko Kamigata, Hideo Miyake