Long Instruction Word Patents (Class 712/24)
-
Patent number: 6738893Abstract: A process for scheduling computer processor execution of operations in a plurality of instruction word formats including the steps of arranging commands into properly formatted instruction words beginning at one end into a sequence selected to provide the most rapid execution of the operations, and then rearranging the operations within the plurality of instruction words from the other end of the sequence into instruction words selected to occupy the least space in memory.Type: GrantFiled: April 25, 2000Date of Patent: May 18, 2004Assignee: Transmeta CorporationInventor: Guillermo J. Rozas
-
Patent number: 6728846Abstract: Atomic multiple word writes are provided when emulating a target system that supports atomic multiple word writes on a host system that does not. For each except the last word to be written, a gate flag is read, tested, and locked when it is found unlocked. The words are then written to memory in reverse order, unlocking the gate flags as they are written. In a host system with a longer word size than the target system, the gate flags can be stored in otherwise unused bits in the host system words containing the target system words to be written.Type: GrantFiled: December 22, 2000Date of Patent: April 27, 2004Assignee: Bull HN Information Systems Inc.Inventor: Bruce A. Noyes
-
Patent number: 6728865Abstract: Instructions asserted in a microprocessors instruction pipeline (3) are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (5) that is synchronized to the instruction pipeline. At the execution stage, the control information is interpreted and appropriate action taken. The control information may indicate that the instruction has been reasserted (asserted again following an initial assertion) and may also indicate the number of times that the instruction has been consecutively asserted in the instruction pipeline. Applied to unaligned memory operations, in which a memory atom is asserted twice, the control information indicates which part of the unaligned data is to be fetched each time the atom is executed.Type: GrantFiled: October 20, 1999Date of Patent: April 27, 2004Assignee: Transmeta CorporationInventors: Brett Coon, Godfrey D'Souza, Paul Serris
-
Patent number: 6725357Abstract: A system comprises: a first execution unit, a second execution unit and a third execution unit; a first-in-first-out memory arranged to receive a plurality of instructions for the first to third execution units and to output the instructions to the execution units; a memory store for storing at least one instruction for one of the execution units, the at least one instruction being received from the first-in-first-out memory, the first and second execution units being arranged to receive their instructions from the first-in-first-out memory and the third execution unit being arranged to receive the instructions from the memory store, wherein a given instruction for the third execution unit is available to the third execution unit at substantially the same time that the instruction would be available to the first or second execution unit if that instruction was for the first or second execution unit.Type: GrantFiled: May 2, 2000Date of Patent: April 20, 2004Assignee: STMicroelectronics S.A.Inventor: Jean-Philippe Cousin
-
Patent number: 6725356Abstract: The present invention provides a system and method for improving the performance of general purpose processors by expanding at least one source operand to a width greater than the width of either the general purpose register or the data path width. In addition, the present invention provides several classes of instructions which cannot be performed efficiently if the operands are limited to the width and accessible number of general purpose registers. The present invention provides operands which are substantially larger than the data path width of the processor by using a general purpose register to specify a memory address from which at least more than one, but typically several data path widths of data can be read. The present invention also provides for the efficient usage of a multiplier array that is fully used for high precision arithmetic, but is only partly used for other, lower precision operations.Type: GrantFiled: August 2, 2001Date of Patent: April 20, 2004Assignee: MicroUnity Systems Engineering, Inc.Inventors: Craig Hansen, John Moussouris
-
Patent number: 6721875Abstract: Disclosed is a computer architecture with single-syllable IP-relative branch instructions and long IP-relative branch instructions (IP=instruction pointer). The architecture fetches instructions in multi-syllable, bundle form. Single-syllable IP-relative branch instructions occupy a single syllable in an instruction bundle, and long IP-relative branch instructions occupy two syllables in an instruction bundle. The additional syllable of the long branch carries with it additional IP-relative offset bits, which when merged with offset bits carried in a core branch syllable provide a much greater offset than is carried by a single-syllable branch alone. Thus, the long branch provides for greater reach within an address space. Use of the long branch to patch IA-64 architecture instruction bundles is also disclosed. Such a patch provides the reach of an indirect branch with the overhead of a single-syllable IP-relative branch.Type: GrantFiled: February 22, 2000Date of Patent: April 13, 2004Assignee: Hewlett-Packard Development Company, L.P.Inventors: James E McCormick, Jr., Stephen R. Undy, Donald Charles Soltis, Jr.
-
Patent number: 6721873Abstract: A method and apparatus for improving dispersal performance of instruction threads is described. In one embodiment, the dispersal logic determines whether the instructions supplied to it include any NOP instructions. When a NOP instruction is detected, the dispersal logic places the NOP into a no-op port for execution. All other instructions are distributed to the proper execution pipes in a normal manner. Because the NOP instructions do not use the execution resources of other instructions, all instruction threads can be executed in one cycle.Type: GrantFiled: December 29, 2000Date of Patent: April 13, 2004Assignee: Intel CorporationInventors: Sailesh Kottapalli, Udo Walterscheidt, Andrew Sun, Thomas Yeh, Kinkee Sit
-
Publication number: 20040059894Abstract: The program to be executed is compiled by translating it into native instructions of the instruction-set architecture of the processor system, organizing the instructions deriving from the translation of the program into respective bundles in an order of successive bundles, each bundle grouping together instructions adapted to be executed in parallel by the processor system. The bundles of instructions are ordered into respective sub-bundles, said sub-bundles identifying a first set of instructions, which must be executed before the instructions belonging to the next bundle of said order, and a second set of instructions, which can be executed both before and in parallel with respect to the instructions belonging to said subsequent bundle of said order.Type: ApplicationFiled: July 1, 2003Publication date: March 25, 2004Applicant: STMicroelectronics S.r.I.Inventors: Fabrizio Simone Rovati, Antonio Maria Borneo, Danilo Pietro Pau
-
Patent number: 6711670Abstract: A superscalar processing system that detects data hazards within instruction groups utilizes a memory, a plurality of pipelines, an instruction dispersal unit (IDU), and a control mechanism. The memory includes a plurality of entries that respectively correspond with a plurality of registers. The IDU receives an instruction group that includes a plurality of instructions and transmits the instructions of the instruction group to the plurality of pipelines. The control mechanism analyzes one of the instructions and identifies an entry in the memory that corresponds with a register associated with the one instruction. The control mechanism then analyzes the entry and transmits a warning signal in response to a determination that the entry indicates that another instruction within the instruction group is associated with the register.Type: GrantFiled: October 14, 1999Date of Patent: March 23, 2004Assignee: Hewlett-Packard Development Company, L.P.Inventors: Donald Charles Soltis, Jr., Ronny Lee Arnold
-
Patent number: 6704859Abstract: A compressed instruction format for a VLIW processor allows greater efficiency in use of cache and memory. Instructions are byte aligned and variable length. Branch targets are uncompressed. Format bits specify how many issue slots are used in a following instruction. NOPS are not stored in memory. Individual operations are compressed according to features such as whether they are resultless, guarded, short, zeroary, unary, or binary. Instructions are stored in compressed form in memory and in cache. Instructions are decompressed on the fly after being read out from cache.Type: GrantFiled: August 4, 1998Date of Patent: March 9, 2004Assignee: Koninklijke Philips Electronics N.V.Inventors: Eino Jacobs, Michael Ang
-
Patent number: 6704857Abstract: The ManArray processor is a scalable indirect VLIW array processor that defines two preferred architectures for indirect VLIW memories. One approach treats the VIM as one composite block of memory using one common address interface to access any VLIW stored in the VIM. The second approach treats the VIM as made up of multiple smaller VIMs each individually associated with the functional units and each individually addressable for loading and reading during XV execution. The VIM memories, contained in each processing element (PE), are accessible by the same type of LV and XV Short Instruction Words (SIWs) as in a single processor instantiation of the indirect VLIW architecture. In the ManArray architecture, the control processor, also called a sequence processor (SP), fetches the instructions from the SIW memory and dispatches them to itself and the PEs. By using the LV instruction, VLIWs can be loaded into VIMs in the SP and the PEs.Type: GrantFiled: December 22, 2000Date of Patent: March 9, 2004Assignee: PTS CorporationInventors: Edwin Frank Barry, Gerald G. Pechanek
-
Patent number: 6704862Abstract: One embodiment of the present invention provides a system that supports exception handling through use of a conditional trap instruction. The system supports a head thread that executes program instructions and a speculative thread that speculatively executes program instructions in advance of the head thread. During operation, the system uses the speculative thread to execute code, which includes an instruction that can cause an exception condition. After the instruction is executed, the system determines if the instruction caused the exception condition. If so, the system writes an exception condition indicator to a register. At some time in the future, the system executes a conditional trap instruction which examines a value in the register. If the value in the register is an exception condition indicator, the system executes a trap handling routine to handle the exception condition. Otherwise, the system proceeds with execution of the code.Type: GrantFiled: June 9, 2000Date of Patent: March 9, 2004Assignee: Sun Microsystems, Inc.Inventors: Shailender Chaudhry, Marc Tremblay
-
Publication number: 20040030860Abstract: A VLIW processor for executing a sequence of very long instruction words having a plurality of operations to be executed in parallel. The VLIW processor has a plurality of functional units for parallel execution of the operations specified by the VLIW, an instruction register for holding the VLIW, and a condition flag for indicating the results of a comparison operation. The VLIW includes a conditional head and a plurality of slots, each slot including an operational code and any related operands. The conditional head has a plurality of conditional indicators, each conditional indicator uniquely corresponding to one operation and specifying a condition in which the operation is to be executed if the indicated condition exists. A control circuit is connected to the instruction register and the functional units to deliver the operation from the instruction register to the corresponding functional unit for execution when the condition exists.Type: ApplicationFiled: August 8, 2002Publication date: February 12, 2004Inventor: Yu-Min Wang
-
Publication number: 20040019766Abstract: Multiple instructions, specifying equivalent operations but designating different execution units, are stored beforehand on an instruction exchange table. First, a primary compiler compiles a source program into a set of machine-readable instructions. From the set of instructions, an instruction parallelizer generates a set of long instruction words. Specifically, an instruction identifier identifies one of the instructions in the set with one of the instructions stored on the instruction exchange table. Then, an instruction replacer replaces the instruction in question with another one of the instructions that is also stored on the instruction exchange table, specifies an equivalent operation but designates a different execution unit as a target. In this manner, the number of parallelly executable instructions can be increased, while the number of no-operation instructions can be reduced, thus generating a parallelized instruction set at a higher level of parallelism.Type: ApplicationFiled: July 18, 2003Publication date: January 29, 2004Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.Inventor: Kenichi Kawaguchi
-
Patent number: 6684319Abstract: The present invention minimizes power consumption and processing time in a very long instruction word digital signal processor by identifying certain blocks of instructions and placing them in a small, fast buffer for subsequent retrieval and execution. A decoder unit decodes a prefetch instruction flag bit that indicates when instructions are to be prefetched and placed into the buffer. The decoder unit signals a control unit, which sends the instruction code from a memory unit to the buffer and maintains an address mapping table and a program counter. The control unit also sets a select input on a multiplexer to indicate that the multiplexer is to output the prefetch instructions it receives from the buffer. The multiplexer outputs the prefetch instructions to an instruction register that sends the prefetch instructions to appropriate functional units for execution.Type: GrantFiled: June 30, 2000Date of Patent: January 27, 2004Assignee: Conexant Systems, Inc.Inventors: Moataz A. Mohamed, Keith M. Bindloss
-
Patent number: 6684320Abstract: An apparatus and method for issue grouping of instructions in a VLIW processor is disclosed. There can be one, two, or three issue groups (but no greater than three issue groups) in each VLIW packet. In one embodiment, a template in the VLIW packet comprises two issue group end markers where each issue group end marker comprises three bits. The three bits in the first issue group end marker identifies the instruction which is the last instruction in the first issue group. Likewise, the three bits in the second issue group end marker identifies the instruction which is the last instruction in the second issue group. Any instructions in the VLIW packet falling outside the two expressly defined first and second issue groups are placed in a third issue group. As such, three issue groups can be identified by use of the two issue group end markers. In one embodiment, the template of the VLIW packet includes a chaining bit.Type: GrantFiled: February 28, 2002Date of Patent: January 27, 2004Assignee: Mindspeed Technologies, Inc.Inventors: Moataz A Mohamed, Chien-Wei Li, John R. Spence
-
Patent number: 6665791Abstract: A method and apparatus are disclosed for releasing functional units in a multithreaded very large instruction word (VLIW) processor. The functional unit release mechanism can retrieve the capacity lost due to multiple cycle instructions. The functional unit release mechanism of the present invention permits idle functional units to be reallocated to other threads, thereby improving workload efficiency. Instruction packets are assigned to functional units, which can maintain their state, independent of the issue logic. Each functional unit has an associated state machine (SM) that keeps track of the number of cycles that the functional unit will be occupied by a multiple-cycle instruction. Functional units do not reassign themselves as long as the functional unit is busy. When the instruction is complete, the functional unit can participate in functional unit allocation, even if other functional units assigned to the same thread are still busy.Type: GrantFiled: March 30, 2000Date of Patent: December 16, 2003Assignee: Agere Systems Inc.Inventors: Alan David Berenbaum, Nevin Heintze, Tor E. Jeremiassen, Stefanos Kaxiras
-
Patent number: 6658655Abstract: A threaded interpreter (916) is suitable for executing a program comprising a series of program instructions stored in a memory (904). For the execution of a program instruction the threaded interpreter includes a preparatory unit (918) for executing a plurality of preparatory steps making the program instruction available in the threaded interpreter, and an execution unit (920) with one or more machine instructions emulating the program instruction. According to the invention, the threaded interpreter is designed such that during the execution on an instruction-level parallel processor of the series of program instructions machine instructions implementing a first one of the preparatory steps are executed in parallel with machine instructions implementing a second one of the preparatory steps for respective ones of the series of program instructions.Type: GrantFiled: December 6, 1999Date of Patent: December 2, 2003Assignee: Koninklijke Philips Electronics N.V.Inventors: Jan Hoogerbrugge, Alexander Augusteijn
-
Patent number: 6658551Abstract: A method and apparatus are disclosed for allocating functional units in a multithreaded very large instruction word (VLIW) processor. The present invention combines the techniques of conventional very long instruction word (VLIW) architectures and conventional multithreaded architectures to reduce execution time within an individual program, as well as across a workload. The present invention utilizes instruction packet splitting to recover some efficiency lost with conventional multithreaded architectures. Instruction packet splitting allows an instruction bundle to be partially issued in one cycle, with the remainder of the bundle issued during a subsequent cycle. There are times, however, when instruction packets cannot be split without violating the semantics of the instruction packet assembled by the compiler. A packet split identification bit is disclosed that allows hardware to efficiently determine when it is permissible to split an instruction packet.Type: GrantFiled: March 30, 2000Date of Patent: December 2, 2003Assignee: Agere Systems Inc.Inventors: Alan David Berenbaum, Nevin Heintze, Tor E. Jeremiassen, Stefanos Kaxiras
-
Patent number: 6654869Abstract: A microprocessor includes a fetch unit, an instruction cracking unit, and dispatch and completion control logic. The fetch unit retrieves a set of instructions from an instruction cache. The instruction cracking unit receives the set of fetched instructions and organizes the set of instructions into an instruction group. The dispatch and completion logic assigns a group tag to the instruction group and records the group tag in an entry of the completion table for tracking the completion status of the instructions comprising the instruction group. The dispatch and control logic may record a single instruction address in the completion table entry corresponding to the each instruction group. Preferably, the single instruction address is the instruction address of the first instruction in the instruction group. The processor may flush the instruction group in response to detecting an exception generated by an instruction in the instruction group.Type: GrantFiled: October 28, 1999Date of Patent: November 25, 2003Assignee: International Business Machines CorporationInventors: James Allan Kahle, Hung Qui Le, Charles Roberts Moore
-
Patent number: 6654870Abstract: Port priorities are defined on a 32-bit word, 16-bit half-word, and 8-bit byte basis to control the write enable signals to a compute register file (CRF). With a manifold array (ManArray) reconfigurable register file, it is possible to have double-word 64-bit and single word 32-bit data-type instructions mixed with other double-word, single-word, half-word, or byte data-type instructions within the same very long instruction word (VLIW). By resolving a write priority conflict on the byte, half-word, or word that is in conflict during the VLIW execution, it is possible to have partial operations complete that provide a useful function. For. example, a load half-word to the half-word H0 portion of a 32-bit register R0 can have priority to complete its operation while a 64-bit shift of the register pair R0 and R1 will complete its operation on the non-conflicting half-word portions of the 64-bit register R0 and R1.Type: GrantFiled: June 21, 2000Date of Patent: November 25, 2003Assignee: PTS CorporationInventors: Edwin Frank Barry, Edward A. Wolff, Patrick Rene Marchand, David Carl Strube
-
Patent number: 6643768Abstract: A dyadic digital signal processing (DSP) instruction processor including a first DSP functional block to execute a main operation of a dyadic DSP instruction and a second DSP functional block to execute a sub operation of the dyadic DSP instruction with data paths of each selectively configured to execute the main operation and the sub operation of the dyadic DSP instruction. A voice and data communication system has a first gateway and a second gateway coupled to a packetized network, each gateway having a network interface including the dyadic DSP instruction processor. An application specific signal processor with a signal processor having a first DSP functional block to execute a main operation of a dyadic DSP instruction and a second DSP functional block to execute a sub operation with multiplexers coupled to the first DSP functional block and the second DSP functional block to selectively configure data paths thereto.Type: GrantFiled: August 9, 2002Date of Patent: November 4, 2003Assignee: Intel CorporationInventors: Kumar Ganapathy, Ruban Kanapathipillai
-
Publication number: 20030200420Abstract: A hierarchical instruction set architecture (ISA) provides pluggable instruction set capability and support of array processors. The term pluggable is from the programmer's viewpoint and relates to groups of instructions that can easily be added to a processor architecture for code density and performance enhancements. One specific aspect addressed herein is the unique compacted instruction set which allows the programmer the ability to dynamically create a set of compacted instructions on a task by task basis for the primary purpose of improving control and parallel code density. These compacted instructions are parallelizable in that they are not specifically restricted to control code application but can be executed in the processing elements (PEs) in an array processor. The ManArray family of processors is designed for this dynamic compacted instruction set capability and also supports a scalable array of from one to N PEs.Type: ApplicationFiled: April 28, 2003Publication date: October 23, 2003Applicant: PTS CorporationInventors: Gerald G. Pechanek, Edwin F. Barry, Juan Guillermo Revilla, Larry D. Larsen
-
Patent number: 6631461Abstract: An instruction set architecture (ISA) for application specific signal processor (ASSP) is tailored to digital signal processing applications. The instruction set architecture implemented with the ASSP, is adapted to DSP algorithmic structures. The instruction word of the ISA is typically 20 bits but can be expanded to 40-bits to control two instructions to be executed in series or parallel. All DSP instructions of the ISA are dyadic DSP instructions performing two operations with one instruction in one cycle. The DSP instructions or operations in the preferred embodiment include a multiply instruction (MULT), an addition instruction (ADD), a minimize/maximize instruction (MIN/MAX) also referred to as an extrema instruction, and a no operation instruction (NOP) each having an associated operation code (“opcode”). The present invention efficiently executes DSP instructions by means of the instruction set architecture and the hardware architecture of the application specific signal processor.Type: GrantFiled: August 8, 2002Date of Patent: October 7, 2003Assignee: Intel CorporationInventors: Kumar Ganapathy, Ruban Kanapathipillai
-
Patent number: 6622234Abstract: Techniques for adding more complex instructions and their attendant multi-cycle execution units with a single instruction multiple data, stream (SIMD) very long instruction word (VLIW) processing framework are described. In one aspect, an initiation mechanism also acts as a resynchronization mechanism to read the results of multi-cycle execution. This multi-purpose mechanism operates with a short instruction word (SIW) issue of the multi-cycle instruction, in a sequence processor (SP) alone, with a VLIW, and across all processing elements (PEs) individually or as an array of PEs. A number of advantageous floating point instructions are also described.Type: GrantFiled: June 21, 2000Date of Patent: September 16, 2003Assignee: PTS CorporationInventors: Gerald G. Pechanek, David Carl Strube, Edward A. Wolff, Edwin Frank Barry, Grayson Morris, Carl Donald Busboom, Dale Edward Schneider
-
Patent number: 6615338Abstract: A Very Long Instruction Word (VLIW) processor has a clustered architecture including a plurality of independent functional units and a multi-ported register file that is divided into a plurality of separate register file segments, the register file segments being individually associated with the plurality of independent functional units. The functional units access the respective associated register file segments using read operations that are local to the functional unit/ register file segment pairs. In contrast, the functional units access the register file segments using write operations that are broadcast to a plurality of register file segments. Independence between clusters is attained since the separate clustered functional unit/ register file segment pairs have local (internal) bypassing that allows internal computations to proceed, but have only limited bypassing between different functional unit/ register file segment pair clusters.Type: GrantFiled: December 3, 1998Date of Patent: September 2, 2003Assignee: Sun Microsystems, Inc.Inventors: Marc Tremblay, William Joy
-
Patent number: 6615339Abstract: A VLIW processor includes an instruction decode unit selecting one of parallel execution and consecutive execution and decoding a plurality of operation instructions included in an instruction word, and a program counter control unit controlling a value of a program counter for providing an indication for the instruction decode unit to provide as no-operation an operation instruction provided in a consecutive execution and executed prior to an operation instruction executed during a consecutive execution when branching to the operation instruction executed during the consecutive execution is introduced. This renders it possible to branch to an operation instruction executed during a consecutive execution and thus provide an enhanced efficiency of instruction-code compression.Type: GrantFiled: January 18, 2000Date of Patent: September 2, 2003Assignee: Mitsubishi Denki Kabushiki KaishaInventors: Hironobu Ito, Hisakazu Sato
-
Publication number: 20030163669Abstract: The invention provides a processor that processes bundles of instructions preferentially through clusters or execution units according to thread characteristics. The cluster architectures of the invention preferably include capability to process “multi-threaded” instructions. Selectively, the architecture either (a) processes singly-threaded instructions through a single cluster to avoid bypassing and to increase throughput, or (b) processes singly-threaded instructions through multiple processes to increase “per thread” performance. The architecture may be “configurable” to operate in one of two modes: in a “wide” mode of operation, the processor's internal clusters collectively process bundled instructions of one thread of a program at the same time; in a “throughput” mode of operation, those clusters independently process instruction bundles of separate program threads.Type: ApplicationFiled: February 27, 2002Publication date: August 28, 2003Inventor: Eric DeLano
-
Publication number: 20030154358Abstract: Apparatus and method for dispatching a very long instruction word (VLIW) instruction having a variable length are provided. The apparatus for dispatching a VLIW instruction includes a packet buffer for storing at least one or more VLIW instructions, and a decoding unit configured to constitute a VLIW instruction to be currently executed among the VLIW instructions stored in the packet buffer and decode predetermined bits of each sub-instruction contained in the VLIW instruction. The apparatus dispatches a corresponding sub-instruction to an FU which corresponds to each sub-instruction, based on the results of decoding performed in the decoding unit, position information on the sub-instructions that are placed on the packet buffer, and position information on the sub-instructions that are placed in the current VLIW instruction. Sub-instructions can be effectively dispatched to corresponding FUs using simple decoding logic even in a case where the length of the VLIW instruction is not fixed.Type: ApplicationFiled: December 3, 2002Publication date: August 14, 2003Applicant: Samsung Electronics Co., Ltd.Inventors: Nak-Hee Seong, Kyoung-Mook Lim, Seh-Woong Jeong, Jae-Hong Park, Hyung-Jun Im, Gun-Young Bae, Young-Duck Kim
-
Patent number: 6606699Abstract: An apparatus for concurrently executing controller single instruction single data (SISD) instructions and single instruction multiple data (SIMD) processing element instructions comprising a combined controller and processing element. At least first and second simplex instructions each comprise a mode of operation bit, said mode of operation bit in the first simplex instruction specifying a controller SISD operation for execution by the controller, and the mode of operation bit in the second simplex instruction specifying a procesing element SIMD operation for execution by the processsing element. A very long instruction word (VLIW) contains said at least first and second simplex instructions.Type: GrantFiled: February 14, 2001Date of Patent: August 12, 2003Assignee: Bops, Inc.Inventors: Gerald G. Pechanek, Juan G. Revilla
-
Patent number: 6604188Abstract: Instructions asserted in the instruction pipeline (3) of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (15) of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units (7), to interpret that control information and take appropriate action.Type: GrantFiled: October 20, 1999Date of Patent: August 5, 2003Assignee: Transmeta CorporationInventors: Brett Coon, Godfrey D'Souza, Paul Serris
-
Patent number: 6581131Abstract: A method and apparatus for efficient cache mapping of compressed Very Long Instruction Word (VLIW) instructions. In the present invention, efficient cache mapping of compressed variable length cache lines is performed by decompressing a sequence of compressed instructions to obtain decompressed cache lines and storing the decompressed cache lines in the same sequence in the cache memory. The present invention decouples the program counter based cache mapping from the memory address. In this way, a fixed increment cache pointer and variable size compressed cache line can be achieved, and, in doing so, decompressed cache lines fit nicely within the cache, in sequential order, while variable length compressed cache lines can be directly accessed without the use of a translation table.Type: GrantFiled: January 9, 2001Date of Patent: June 17, 2003Assignee: Hewlett-Packard Development Company, L.P.Inventor: Gary L Vondran, Jr.
-
Patent number: 6581152Abstract: An indirect VLIW (iVLIW) architecture is described which contains a minimum of two instruction memories. The first instruction memory (SIM) contains short-instruction-words (SIWs) of a fixed length. The second instruction memory (VIM), contains very-long-instruction-words (VLIWs) which allow execution of multiple instructions in parallel. Each SIW may be fetched and executed as an independent instruction by one of the available execution units. A special class of SIW is used to reference the VIM indirectly to either execute or load a specified VLIW instruction (called an “XV” instruction for “eXecute VLIW”, or LV for “Load VLIW”). In these cases, the SIW instruction specifies how the location of the VLIW is to be accessed. Other aspects of this approach relate to the application of data memory addressing techniques for execution or loading of VLIWs that parallel the addressing modes used for data memory accesses.Type: GrantFiled: February 11, 2002Date of Patent: June 17, 2003Assignee: BOPS, Inc.Inventors: Edwin F. Barry, Gerald G. Pechanek
-
Publication number: 20030088758Abstract: Methods and systems are disclosed for calculating the number of valid instructions in a microprocessor instruction bundle. One method advances the instructions along the pipeline and edge detects the number of valid instructions within the pipeline. Another method fetches a bundle of instructions, shifts instructions within the bundle, and edge detects the valid instructions. Still another method fetches the bundle of instructions and detects a complex instruction within the bundle. Instructions occurring after the complex instruction are shifted, and the number of valid instructions occurring after the complex instruction are edge detected.Type: ApplicationFiled: November 8, 2001Publication date: May 8, 2003Inventors: Matthew Becker, Masooma Bhaiwala
-
Patent number: 6557094Abstract: A hierarchical instruction set architecture (ISA) provides pluggable instruction set capability and support of array processors. The term pluggable is from the programmer's viewpoint and relates to groups of instructions that can easily be added to a processor architecture for code density and performance enhancements. One specific aspect addressed herein is the unique compacted instruction set which allows the programmer the ability to dynamically create a set of compacted instructions on a task by task basis for the primary purpose of improving control and parallel code density. These compacted instructions are parallelizable in that they are not specifically restricted to control code application but can be executed in the processing elements (PEs) in an array processor. The ManArray family of processors is designed for this dynamic compacted instruction set capability and also supports a scalable array of from one to N PEs.Type: GrantFiled: September 28, 2001Date of Patent: April 29, 2003Assignee: Bops, Inc.Inventors: Gerald G. Pechanek, Edwin F. Barry, Juan Guillermo Revilla, Larry D. Larsen
-
Publication number: 20030079109Abstract: A pipelined data processing unit includes an instruction sequencer and n functional units capable of executing n operations in parallel. The instruction sequencer includes a random access memory for storing very-long-instruction-words (VLIWs) used in operations involving the execution of two or more functional units in parallel. Each VLIW comprises a plurality of short-instruction-words (SIWs) where each SIW corresponds to a unique type of instruction associated with a unique functional unit. VLIWs are composed in the VLIW memory by loading and concatenating SIWs in each address, or entry. VLIWs are executed via the execute-VLIW (XV) instruction. The iVLIWs can be compressed at a VLIW memory address by use of a mask field contained within the XV1 instruction which specifies which functional units are enabled, or disabled, during the execution of the VLIW. The mask can be changed each time the XV1 instruction is executed, effectively modifying the VLIW every time it is executed.Type: ApplicationFiled: September 24, 2002Publication date: April 24, 2003Applicant: BOPS, Inc.Inventors: Gerald G. Pechanek, Juan Guillermo Revilla, Edwin F. Barry
-
Publication number: 20030074654Abstract: A digital computer system automatically creates an Instruction Set Architecture (ISA) that potentially exploits VLIW instructions, vector operations, fused operations, and specialized operations with the goal of increasing the performance of a set of applications while keeping hardware cost below a designer specified limit, or with the goal of minimizing hardware cost given a required level of performance.Type: ApplicationFiled: October 16, 2001Publication date: April 17, 2003Inventors: David William Goodwin, Dror Maydan, Ding-Kai Chen, Darin Stamenov Petkov, Steven Weng-Kiang Tjiang, Peng Tu, Christopher Rowen
-
Patent number: 6542862Abstract: An apparatus and method for determining register dependency in multiple architecture system. The system includes a microprocessor emulating an emulated instruction set using a native instruction set where the microprocessor contains at least one register. An execution engine provides the native instructions where each native instruction contains at least one register identifier. Flags are provided to each native instruction where each flag indicates whether a register identifier is valid. A bundler checks for dependency among the valid register identifiers in the native instructions.Type: GrantFiled: February 18, 2000Date of Patent: April 1, 2003Assignee: Hewlett-Packard Development Company, L.P.Inventors: Kevin David Safford, Patrick Knebel, Joel D Lamb
-
Patent number: 6542989Abstract: A processor comprises an arithmetic logic unit (ALU) that co-operates with a stack arrangement (STCK). The processor is arranged to execute instructions (INSTR) which include a stack control field (SCF) and an opcode field (OPF) for controlling the stack arrangement (STCK) and the arithmetic logic unit (ALU), respectively.Type: GrantFiled: January 28, 2000Date of Patent: April 1, 2003Assignee: Koninklijke Philips Electronics N.V.Inventor: Marc Duranton
-
Patent number: 6539471Abstract: Method and apparatus for reducing or eliminating retirement logic in an out-of-order processor are disclosed. Instructions are processed using a processing unit capable of out-of-order processing and having architectural registers having an architectural state. Groups of instructions are prepared for processing by processing unit, wherein within each group to be processed the instructions producing the final state of an architectural register are changed so that they write to an output copy of the architectural state, the instructions reading architectural registers are changed to read from an input copy of the architectural state, and the instructions within each group producing results to architectural registers that would be overwritten by another instruction in the group are changed to write their results to temporary registers.Type: GrantFiled: December 23, 1998Date of Patent: March 25, 2003Assignee: Intel CorporationInventor: Gad S. Sheaffer
-
Patent number: 6516407Abstract: To an existing instruction set, newly added are a condition code conversion instruction for converting a first condition code (N, Z, OV, C) to a second condition code (V, S) based on a reference condition code COND, a second conditional instruction having a reference flag SF, and an instruction of operation between two selected second condition codes. A VLIW processor comprises a second condition code register file 163, a condition code conversion circuit 12A, and a logic operation circuit 12E for performing a non-Boolean logic operation between two selected second condition codes.Type: GrantFiled: December 28, 1999Date of Patent: February 4, 2003Assignee: Fujitsu LimitedInventors: Atsuhiro Suga, Toshihiro Ozawa
-
Publication number: 20030023830Abstract: Aspects of a method and system for encoding instructions as a very long instruction word for processing in a plurality of computation units that reduces instruction memory requirements in a processing system are described. The aspects include determining at which stages of instruction processing that an instruction code needs to be executed. Further, an enable signal of the instruction code is utilized to direct execution during the determined stages by controlling storage operations for the instruction code.Type: ApplicationFiled: July 25, 2001Publication date: January 30, 2003Inventor: Eugene B. Hogenauer
-
Patent number: 6499100Abstract: When decoding instructions of a program to be executed in a central processing unit comprising pipelining facilities for fast instruction decoding, part of the decoding is executed or the decoding in pipelining units is prepared in a remapping unit during loading a program into a program or primary memory used by the central processor, the remapping or predecoding operation resulting in operation codes which can be very rapidly interpreted by the pipelining units of the central processor. Thus, the operation code field of an instruction is changed to include information on e.g., instruction length, jumps, parameters, etc., this information indicating the instruction length, whether it is a jump instruction or has a parameter etc. respectively, in a direct way that allows the use of simple combinatorial circuits in the pipelining units.Type: GrantFiled: May 30, 2000Date of Patent: December 24, 2002Assignee: Telefonaktiebolaget LM Ericsson (publ)Inventors: Dan Halvarsson, Tomas Jonsson, Per Holmberg
-
Patent number: 6499096Abstract: A VLIW processor includes a plurality of containers holding a plurality of sub-instructions in a VLIW instruction, an exchanging portion exchanging the plurality of sub-instructions held in the plurality of containers and inputting the instructions to the plurality of containers, a plurality of decoders decoding the sub-instructions held in the plurality of containers, and a plurality of processing units executing the sub-instructions decoded by the plurality of decoders. Since the exchanging portion exchanges a plurality of sub-instructions held in the plurality of containers and inputs the instructions to the plurality of containers, a compressed code can be executed in such an execution sequence that is taken prior to compression.Type: GrantFiled: September 29, 1999Date of Patent: December 24, 2002Assignee: Mitsubishi Denki Kabushiki KaishaInventor: Hiroaki Suzuki
-
Patent number: 6499097Abstract: The present invention provides an instruction fetch unit aligner. In one embodiment, an apparatus for an instruction fetch unit aligner includes selection logic for selecting a non-power of two size instruction from power of two size instruction data, and control logic for controlling the selection logic.Type: GrantFiled: May 31, 2001Date of Patent: December 24, 2002Assignee: Sun Microsystems, Inc.Inventors: Marc Tremblay, Graham R. Murphy, Frank C. Chiu
-
Publication number: 20020194454Abstract: The invention relates to a method and an arrangement for instruction word generation in the driving of functional units in a processor, the instruction words comprising a plurality of instruction word parts. In this case, in a program sequence, under the control of a program word, an instruction word is taken from a row—determined by a reading row number—of an instruction word memory that can be written to row by row, the said instruction word is altered by means of substitution of an instruction word part by the information part of the respective program word and is written back to a row of the instruction word memory, the said row being determined by a writing row number. Afterwards, an instruction word—which is generated in this way and is to be executed in accordance with the program—for driving the functional units is output to the processor.Type: ApplicationFiled: February 14, 2002Publication date: December 19, 2002Inventors: Matthias Weiss, Gerhard Fettweis
-
Patent number: 6496919Abstract: A data processor which includes a first processor for executing a first instruction set and a second processor for executing a second instruction set different from the first instruction set. When the first processor executes a predetermined instruction of the first instruction set the second processor executes an instruction of the second instructions set. The first processor may be a reduced instruction set computer (RISC) type processor, the second processor may be a very long instruction word (VLIW) type processor, the first instruction set may be a RISC instruction set and the second instruction set may be a VLIW instruction set. The predetermined instruction of the RISC instruction set executed by the first processor may be a branch instruction causing a branch to a specific address space at which VLIW instructions are stored. Thereafter, the VLIW instructions at the specific address space are executed by the VLIW type processor.Type: GrantFiled: August 25, 1999Date of Patent: December 17, 2002Assignee: Hitachi, Ltd.Inventors: Junichi Nishimoto, Hideo Maejima
-
Publication number: 20020178345Abstract: General purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PE)s. Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions.Type: ApplicationFiled: April 1, 2002Publication date: November 28, 2002Applicant: BOPS, Inc.Inventors: Thomas L. Drabenstott, Gerald George Pechanek, Edwin Franklin Barry, Charles W. Kurak,
-
Publication number: 20020169942Abstract: The VLIW processor according to the present invention, which executes in parallel a plurality of processings described in parallel in a VLIW instruction using a plurality of execution pipelines, performs pipeline execution of processings selected and designated from among the plurality of processings based on the VLIW instruction in respective steps on a diagonal formed by shifting one step at a time starting with an initial step in the order of parallel arrangement of the plurality of execution pipelines, one by one in the direction of the diagonal.Type: ApplicationFiled: May 3, 2002Publication date: November 14, 2002Inventor: Hideki Sugimoto
-
Publication number: 20020161986Abstract: An instruction processing method for checking an arrangement of basic instructions in a very long instruction word (VLIW) instruction, suitable for language processing systems, an assembler and a compiler, used for processors which execute variable length VLIW instructions designed based on variable length VLIW architecture.Type: ApplicationFiled: January 24, 2002Publication date: October 31, 2002Applicant: FUJITSU LIMITEDInventors: Teruhiko Kamigata, Hideo Miyake