Long Instruction Word Patents (Class 712/24)
-
Publication number: 20080133880
Abstract: The data processing device has a plurality of functional units and issues instructions in successive instruction cycles. Instructions of a first type are each intended for one functional unit at a time. An instruction of a second type causes a combination of functional units to respond in the same instruction execution cycle, a result from one functional unit being used by another as part of the execution of the same instruction. Preferably, the device supports alternative operation at a number of different instruction cycle rates, dependent on whether an executed program segment contains instructions of the second type. The fastest instruction cycle rate does not allow execution of the instruction of the second type, because operation by different functional units does not fit within the instruction execution cycle.
Type: Application
Filed: June 22, 2004
Publication date: June 5, 2008
Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.
Inventors: Carlos Antonio Alba Pinto, Balakrishnan Srinivasan, Ramanathan Sethuraman
-
Patent number: 7383422
Abstract: A Very Long Instruction Word (VLIW) processor having an instruction set with a reduced size resulting in a small number of bits being necessary to specify registers. The VLIW processor includes a register file, and first through third operation units, and executes a very long instruction word. Further, the very long instruction word includes a register specifying field which specifies at least one of the registers in the register file and a plurality of instructions. The operand of each instruction includes bits src1, src2, and dst, which indicate whether or not the registers specified by the register specifying field are to be used as the source register and the destination register.
Type: Grant
Filed: September 27, 2004
Date of Patent: June 3, 2008
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Takahiro Kageyama, Hideshi Nishida, Takeshi Tanaka, Kouji Nakajima
-
Patent number: 7380100
Abstract: The present invention provides a data processing system that includes a plurality of processing units and first, second, and third data transfer means. The first data transfer means connects a plurality of processing units in a network, exchanges first data, and configures at least one reconfigurable data flow by connecting at least two of the plurality of processing units. The second data transfer means supplies control information that loads setting data as second data to the plurality of processing units in parallel. The third data transfer means supplies the setting data to each of the plurality of the processing units individually. Setting data is data for setting a data flow with a different function by directly or indirectly changing another processing unit connected to a processing unit via the first data transfer means, and/or changing a process included in the processing unit.
Type: Grant
Filed: September 6, 2002
Date of Patent: May 27, 2008
Assignee: IPFlex Inc.
Inventors: Hiroshi Shimura, Kenji Ikeda, Tomoyoshi Sato
-
Patent number: 7376813
Abstract: A data processing apparatus execution unit includes a multiplexer having inputs receiving data from sections of a source data register or registers. The multiplexer selects data from one section to store in a destination data register. The execution unit may zero extend or sign extend the remaining most significant bits of the destination data. In an alternative embodiment, the execution unit includes plural multiplexers, one for each section of the destination data. Each multiplexer receives data from each section of the source data register or registers. Special codes in the sections of the second source data register may select 0 fill, 1 fill or sign extension from the next most significant section for each multiplexer.
Type: Grant
Filed: March 4, 2005
Date of Patent: May 20, 2008
Assignee: Texas Instruments Incorporated
Inventor: Jagadeesh Sankaran
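As a rough illustration of the first embodiment in this abstract (select one section of a source register, then zero extend or sign extend it), the following Python sketch may help. The function name, the 8-bit section size and the 32-bit destination width are assumptions chosen for the example; the abstract does not specify them.

```python
def extract_section(src: int, index: int, section_bits: int = 8,
                    width: int = 32, signed: bool = False) -> int:
    """Select section `index` of `src` and extend it to `width` bits.

    Hypothetical model: section 0 is the least significant section.
    """
    mask = (1 << section_bits) - 1
    section = (src >> (index * section_bits)) & mask
    if signed and section & (1 << (section_bits - 1)):
        # Sign extension: fill the remaining most significant bits with ones.
        section |= ((1 << width) - 1) & ~mask
    # Zero extension needs no work: the upper bits are already zero.
    return section
```

With `src = 0x12F4A680`, section 2 is `0xF4`; zero extension leaves it as `0x000000F4`, while sign extension yields `0xFFFFFFF4`.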
-
Patent number: 7376812
Abstract: A processor can achieve high code density while allowing higher performance than existing architectures, particularly for Digital Signal Processing (DSP) applications. In accordance with one aspect, the processor supports three possible instruction sizes while maintaining the simplicity of programming and allowing efficient physical implementation. Most of the application code can be encoded using two sets of narrow size instructions to achieve high code density. Adding a third (and larger, i.e. VLIW) instruction size allows the architecture to encode multiple operations per instruction for the performance critical section of the code. Further, each operation of the VLIW format instruction can optionally be a SIMD operation that operates upon vector data. A scheme for the optimal utilization (highest achievable performance for the given amount of hardware) of multiply-accumulate (MAC) hardware is also provided.
Type: Grant
Filed: May 13, 2002
Date of Patent: May 20, 2008
Assignee: Tensilica, Inc.
Inventors: Himanshu A. Sanghavi, Earl A. Killian, James Robert Kennedy, Darin S. Petkov, Peng Tu, William A. Huffman
-
Patent number: 7370136
Abstract: A very long instruction word processor with sequence control. During each cycle the processor generates control signals to functional units based on the values in fields of an instruction. Each instruction may include an iteration count specifying the number of cycles for which the control signals should be generated based on that instruction. The instruction set further includes flow control instructions allowing for repetitive execution of a single instruction, repetitive execution of a block of instructions or branching within a program. Such a processor is illustrated in connection with a disk controller for a hard drive of a computer. The flexible sequencing allows a hard-drive controller to be readily reprogrammed for use in connection with different types of media or to be dynamically reprogrammed upon detection of a disk read error to increase the ability of the disk controller to recover data from a disk.
Type: Grant
Filed: January 26, 2005
Date of Patent: May 6, 2008
Assignee: STMicroelectronics, Inc.
Inventor: Dillip K. Dash
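The per-instruction iteration count described in this abstract can be pictured with a small Python sketch. The `Instruction` type, its field names and the `run` generator are hypothetical, introduced only for illustration; the flow control instructions (block repeat, branching) are not modelled.

```python
from typing import Iterator, List, NamedTuple


class Instruction(NamedTuple):
    controls: dict   # hypothetical mapping: functional unit -> control signal
    count: int = 1   # number of cycles these control signals are generated


def run(program: List[Instruction]) -> Iterator[dict]:
    """Yield the control signals generated on each cycle."""
    for instr in program:
        # One instruction drives the functional units for `count` cycles.
        for _ in range(instr.count):
            yield instr.controls
```

A program of two instructions with counts 3 and 1 therefore occupies four cycles while taking only two instruction slots.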
-
Patent number: 7370182
Abstract: A processor includes a program memory containing program instructions, and a processor core including several processing units and a central unit. The central unit, upon receipt of a program instruction, issues corresponding instructions to the various processing units. The processor core is clocked by a clock signal. A branching instruction received by the central unit, in the course of a current cycle, is processed in the course of the current cycle.
Type: Grant
Filed: February 25, 2002
Date of Patent: May 6, 2008
Assignee: STMicroelectronics SA
Inventors: Andrew Cofler, Anne Merlande, Sebastien Ferroussat
-
Patent number: 7366874
Abstract: Apparatus and method for dispatching a very long instruction word (VLIW) instruction having a variable length are provided. The apparatus for dispatching a VLIW instruction includes a packet buffer for storing one or more VLIW instructions, and a decoding unit configured to constitute a VLIW instruction to be currently executed among the VLIW instructions stored in the packet buffer and decode predetermined bits of each sub-instruction contained in the VLIW instruction. The apparatus dispatches a corresponding sub-instruction to an FU which corresponds to each sub-instruction, based on the results of decoding performed in the decoding unit, position information on the sub-instructions that are placed on the packet buffer, and position information on the sub-instructions that are placed in the current VLIW instruction. Sub-instructions can be effectively dispatched to corresponding FUs using simple decoding logic even in a case where the length of the VLIW instruction is not fixed.
Type: Grant
Filed: December 3, 2002
Date of Patent: April 29, 2008
Assignee: Samsung Electronics Co., Ltd.
Inventors: Nak-hee Seong, Kyoung-mook Lim, Seh-woong Jeong, Jae-hong Park, Hyung-jun Im, Gun-young Bae, Young-duck Kim
-
Patent number: 7343475
Abstract: A processor including an integer processing unit and a data processing unit. The processor can be operated by a first instruction format or a second instruction format. The first instruction format includes only an instruction for the integer processing unit, and is executed in the integer processing unit alone. The second instruction format includes instructions for the integer processing unit and the data processing unit, the second instructions being executed in both the integer processing unit and the data processing unit in parallel. When an instruction in the first instruction format is to be executed only in the integer processing unit, a control signal is generated in the integer processing unit and is supplied to the data processing unit to halt an operation of the data processing unit.
Type: Grant
Filed: July 12, 2006
Date of Patent: March 11, 2008
Assignee: Kabushiki Kaisha Toshiba
Inventor: Takashi Miyamori
-
Patent number: 7343471
Abstract: Instructions of a program are stored in compressed form in a program memory (12). In a processor which executes the instructions, a program counter (50) identifies a position in the program memory. An instruction cache (40) has cache blocks, each for storing one or more instructions of the program in decompressed form. A cache loading unit (42) includes a decompression section (44) and performs a cache loading operation in which one or more compressed-form instructions are read from the position in the program memory identified by the program counter and are decompressed and stored in one of the said cache blocks of the instruction cache. A cache pointer (52) identifies a position in the instruction cache of an instruction to be fetched for execution. An instruction fetching unit (46) fetches an instruction to be executed from the position identified by the cache pointer.
Type: Grant
Filed: January 12, 2005
Date of Patent: March 11, 2008
Assignee: PTS Corporation
Inventor: Nigel Peter Topham
-
Patent number: 7340591
Abstract: A number of architectural and implementation approaches are described for using extra path (Epath) storage that operates in conjunction with a compute register file to obtain increased instruction level parallelism that more flexibly addresses the requirements of high performance algorithms. A processor that supports a single load data to a register file operation can be doubled in load capability through the use of an extra path storage, an additional independently addressable data memory path, and instruction decode information that specifies two independent load data operations. By allowing the extra path storage to be accessible by arithmetic facilities, the increased data bandwidth can be fully utilized.
Type: Grant
Filed: October 28, 2004
Date of Patent: March 4, 2008
Assignee: Altera Corporation
Inventors: Gerald George Pechanek, Patrick R. Marchand, Larry D. Larsen
-
Publication number: 20080046689
Abstract: A cooperative multithreading architecture includes an instruction cache capable of providing a micro-VLIW instruction; a first cluster connected to the instruction cache to fetch the micro-VLIW instruction; and a second cluster connected to the instruction cache to fetch the micro-VLIW instruction and capable of execution acceleration. The second cluster includes a second front-end module connected to the instruction cache and capable of requesting and dispatching the micro-VLIW instruction; a helper dynamic scheduler connected to the second front-end module and capable of dispatching the micro-VLIW instruction; a non-shared data path connected to the second front-end module and capable of providing a wider data path; and a shared data path connected to the helper dynamic scheduler and capable of assisting a control part of the non-shared data path. The first cluster and the second cluster carry out execution of the respective micro-instructions in parallel.
Type: Application
Filed: August 21, 2006
Publication date: February 21, 2008
Inventors: Tien-Fu Chen, Shu-Hsuan Chou, Chieh-Jen Cheng, Zhi-Heng Kang
-
Patent number: 7325232
Abstract: A compiler for multiple processor and distributed memory architectures is described. The compiler uses a high-level language to represent a task-level network of behaviors that describes an embedded system. The compiler maps a plurality of tasks and data onto a multiple processor, distributed memory hardware architecture. The mapping includes describing a task-level network of behaviors, each of the task-level network of behaviors being related through control and data flow. The mapping further includes predicting a schedule of tasks for the task-level network of behaviors and allocating the plurality of tasks and data to at least one of the multiple processors and to at least one of distributed memory, respectively, in response to the predicted schedule of tasks.
Type: Grant
Filed: January 25, 2002
Date of Patent: January 29, 2008
Assignee: Improv Systems, Inc.
Inventor: Clifford Liem
-
Patent number: 7313646
Abstract: An electronic system comprises an initiator module and a target module addressable by the initiator module, and an interface and control module for interfacing between respective communication protocols of the initiator module and of the target module. The interface and control module is constructed to set a composite instruction detection signal in response to the detection of a composite instruction executed by the initiator module, which composite instruction detection signal is used for the interfacing. The interface and control module is constructed to detect a composite instruction executed by the initiator module when, at a determined clock cycle of the initiator module, a change of the elementary operation executed by the initiator module is detected with respect to the previous clock cycle of the initiator module, while, at the same time, a signal for selecting the target module which was active is kept active.
Type: Grant
Filed: May 26, 2005
Date of Patent: December 25, 2007
Assignee: STMicroelectronics S.A.
Inventors: Hervé Chalopin, Laurent Tabaries
-
Patent number: 7313671
Abstract: Computer architectures consist of a fixed data path, which is controlled by a set of control words. Each control word controls part of the data path. Each set of instructions generates a new set of control words. In case of a VLIW processor, multiple instructions are packaged into one so-called VLIW instruction. A VLIW processor uses multiple, independent functional units to execute these multiple instructions in parallel. Application specific domain tuning of a VLIW processor requires that instructions having varying requirements with respect to the number of instruction bits they require can be encoded in a single VLIW instruction, such that an efficient encoding and decoding of instructions is maintained. The present invention describes a processing apparatus as well as a processing method for processing data, allowing the use of such an asymmetric instruction set.
Type: Grant
Filed: July 18, 2003
Date of Patent: December 25, 2007
Assignee: Koninklijke Philips Electronics, N.V.
Inventor: Jeroen Anton Johan Leijten
-
Patent number: 7310723
Abstract: Methods and systems thereof for exception handling are described. An event to be handled is identified during execution of a code sequence. A bit is set to indicate that handling of the event is to be deferred. An exception corresponding to the event is generated if the bit is set.
Type: Grant
Filed: April 2, 2003
Date of Patent: December 18, 2007
Assignee: Transmeta Corporation
Inventors: Guillermo J. Rozas, Alexander Klaiber
-
Patent number: 7305542
Abstract: Speculatively decoding instruction lengths in order to increase instruction throughput. Instructions are speculatively decoded within a pipelined microprocessor architecture such that up to four instruction lengths may be decoded within a maximum of two processor clock cycles.
Type: Grant
Filed: June 25, 2002
Date of Patent: December 4, 2007
Assignee: Intel Corporation
Inventor: Venkateswara Rao Madduri
-
Patent number: 7302552
Abstract: A processor is described including a plurality of data path elements which independently perform in parallel different data processing operations. Program instructions are provided which are decoded to generate control signals for controlling the data path elements. Multiple instruction sets are supported with the same data processing operation to be performed by the same data path element being differently encoded within different instructions of different instruction sets. This enables code compaction when little parallelism may be achieved and full parallelism to be specified when this is possible.
Type: Grant
Filed: October 14, 2004
Date of Patent: November 27, 2007
Assignee: Arm Limited
Inventors: Jan Guffens, Ludwig Callewaert, Koenraad Van Nieuwenhove
-
Patent number: 7302555
Abstract: Programmable processors are used to transform input data into output data based on program information encoded in instructions. The value of the resulting output data depends, amongst others, on the momentary state of the processor at any given moment in time. This state is composed of temporary data values stored in registers, for example, as well as so-called flags. A disadvantage of the principle of flags is that they cause side effects in the processor, especially in parallel processors. However, when removing the traditional concept of flags, the remaining problem is the implementation of branching. A processing system according to the invention comprises an execution unit (EX1, EX2), a first register file (RF1, RF2) for storing data, a memory (PM) and a second register file (RF3) for storing a program counter. The execution unit conditionally executes dedicated instructions for writing a value of the program counter into the second register file.
Type: Grant
Filed: April 27, 2004
Date of Patent: November 27, 2007
Assignee: Koninklijke Philips Electronics, N.V.
Inventor: Jeroen Anton Johan Leijten
-
Patent number: 7299339
Abstract: A field programmable gate array includes a virtual bus interface that receives a control word from a host processor over a standard I/O bus. A configurable very long instruction word (VLIW) controller receives the control word via virtual bus interface signals mapped from the virtual bus interface. A reconfigurable communication and control fabric controls the data paths and programming modes of single instruction-multiple data (SIMD) processing element cells. The configurable VLIW controller has an interface with the reconfigurable communication and control fabric. SIMD processing element cells are controlled by the configurable VLIW controller through the reconfigurable communication and control fabric via the interface.
Type: Grant
Filed: August 30, 2004
Date of Patent: November 20, 2007
Assignee: The Boeing Company
Inventor: Tirumale K. Ramesh
-
Patent number: 7296120
Abstract: Disclosed is an apparatus, method, and program product that provides atomic, multi-word load support without incurring additional memory utilization. A double-word is atomically loaded without the use of one or more additional fields and without a lock. An invalidity marker is used in connection with a cache miss time to ascertain whether a loaded double-word has been stored and loaded atomically, and is thus valid.
Type: Grant
Filed: November 18, 2004
Date of Patent: November 13, 2007
Assignee: International Business Machines Corporation
Inventors: Michael Joseph Corrigan, Timothy Joseph Torzewski
-
Patent number: 7290122
Abstract: A method and apparatus for power reduction in a processor controlled by multiple-instruction control words. A multiple-instruction control word comprises a number of ordered fields, with each ordered field containing an instruction for an element of the processor. The sequence of instructions for a loop is compressed by identifying a set of aligned fields that contain NOP instructions in all of the control words of the sequence. The sequence of control words is then modified by removing the fields of the identified aligned set containing NOP instructions and adding an identifier that identifies the set of fields removed. The sequence of control words is processed by fetching the identifier at the start of the loop, then, for each control word in the sequence, fetching a control word and reconstructing the corresponding uncompressed control word by inserting NOP instructions into the compressed control word as indicated by the identifier.
Type: Grant
Filed: August 29, 2003
Date of Patent: October 30, 2007
Assignee: Motorola, Inc.
Inventors: Philip E. May, Brian G. Lucas, Kent D. Moat
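The compression and reconstruction steps in this abstract can be sketched in Python as follows. This is a minimal model under stated assumptions: control words are lists of opcode strings, the identifier is a list of per-field flags, and the names `NOP`, `compress_loop` and `expand_word` are hypothetical rather than taken from the patent.

```python
NOP = "nop"  # assumed NOP encoding for this sketch


def compress_loop(words):
    """Drop field positions that hold NOP in every control word of the loop.

    Returns (removed, compressed), where `removed` plays the role of the
    identifier: removed[i] is True if field i was taken out of every word.
    """
    n_fields = len(words[0])
    removed = [all(w[i] == NOP for w in words) for i in range(n_fields)]
    compressed = [[op for op, r in zip(w, removed) if not r] for w in words]
    return removed, compressed


def expand_word(removed, word):
    """Rebuild one uncompressed control word by re-inserting the NOPs."""
    remaining = iter(word)
    return [NOP if r else next(remaining) for r in removed]
```

Only fields that are NOP in every word of the loop are removed, so a NOP that appears in a field used elsewhere in the loop stays in the compressed word. The identifier is fetched once per loop, which is where the reduction in fetched bits comes from.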
-
Patent number: 7290120
Abstract: A microprocessor having a power-saving fetch and decoding unit for fetching and decoding compressed program instructions and having a program instruction sequencer is disclosed. The microprocessor based on the inventive architecture has a power-saving fetch and decoding unit for fetching and decoding program instructions. The fetch and decoding unit has a program instruction memory which receives a sequential program instruction address addressing the next program instruction memory line which is to be read, having at least one program instruction memory line which can store an indicator flag, a long program instruction index, a short program instruction and a first source register address. A directory memory receives the long program instruction index (6) addressing the next directory memory line which is to be read.
Type: Grant
Filed: January 14, 2005
Date of Patent: October 30, 2007
Assignee: Infineon Technologies AG
Inventor: Lorenzo DiGregorio
-
Patent number: 7287151
Abstract: A VLIW processor comprising a plurality of functional units (1, 3, 5, 7), a distributed register file (9, 11, 13, 15) accessible by the functional units (1, 3, 5, 7), a partially connected communication network (17) for coupling the functional units (1, 3, 5, 7) and selected parts of the distributed register file (9, 11, 13, 15), characterized in that the VLIW processor further comprises a communication device (29) for coupling the functional units (1, 3, 5, 7) and the distributed register file (9, 11, 13, 15).
Type: Grant
Filed: March 28, 2002
Date of Patent: October 23, 2007
Assignee: NXP B.V.
Inventors: Marco Jan Gerrit Bekooij, Bernardo Oliveira Kastrup Pereira
-
Patent number: 7281119
Abstract: A computer system supplies instructions simultaneously to a plurality of parallel execution pipelines in either superscalar mode or very long instruction word mode with checks for vertical and horizontal dependency between instructions, the horizontal dependency checks between instructions supplied in the same machine cycle being effective in superscalar mode but disabled in very long instruction word mode.
Type: Grant
Filed: May 2, 2000
Date of Patent: October 9, 2007
Assignee: STMicroelectronics S.A.
Inventors: Andrew Cofler, Bruno Fel, Laurent Ducousso
-
Patent number: 7281117
Abstract: A processor according to the present invention includes a decoding unit 20, an operation unit 40 and others. When the decoding unit 20 decodes Instruction vcchk, the operation unit 40 and the like judge whether vector condition flags VC0~VC3 (110) of a condition flag register (CFR) 32 are all zero or not, and (i) set condition flags C4 and C5 of the condition flag register (CFR) 32 to 1 and 0, respectively, when all of the vector condition flags VC0~VC3 are zero, and (ii) set the condition flags C4 and C5 to 0 and 1, respectively, when not all the vector condition flags are zero. Then, the vector condition flags VC0~VC3 are stored in the condition flags C0~C3.
Type: Grant
Filed: September 24, 2003
Date of Patent: October 9, 2007
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Tetsuya Tanaka, Hazuki Okabayashi, Taketo Heishi, Hajime Ogawa, Tsuneyuki Suzuki, Tokuzo Kiyohara, Takeshi Tanaka, Hideshi Nishida, Masaki Maeda
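The flag updates this abstract attributes to Instruction vcchk can be modelled in a few lines of Python. Representing the condition flag register as a dictionary keyed by flag name is an assumption made for the sketch; only the flag names and the update rules come from the abstract.

```python
def vcchk(cfr: dict) -> dict:
    """Return a copy of the flag register after a vcchk-style update.

    `cfr` is a hypothetical dict holding at least the keys VC0..VC3.
    """
    vc = [cfr[f"VC{i}"] for i in range(4)]
    out = dict(cfr)
    # C4/C5 report whether all vector condition flags are zero.
    if all(v == 0 for v in vc):
        out["C4"], out["C5"] = 1, 0
    else:
        out["C4"], out["C5"] = 0, 1
    # The vector condition flags are then stored in C0..C3.
    for i, v in enumerate(vc):
        out[f"C{i}"] = v
    return out
```

For example, flags VC0 to VC3 of `0, 1, 0, 1` give `C4 = 0`, `C5 = 1` and `C0` to `C3` of `0, 1, 0, 1`.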
-
Patent number: 7269720
Abstract: Techniques are described for dynamically controlling the execution of operations within a multi-operation instruction, such as a very long instruction word (VLIW). A programmable processor fetches and executes a first instruction having an operation mask. Based on the operation mask, the processor selectively executes one or more operations within a second instruction. Individual operations within a multi-operation instruction can be selectively enabled and disabled, which is advantageous in many situations, including event handling and code debugging.
Type: Grant
Filed: April 4, 2005
Date of Patent: September 11, 2007
Assignee: NXP B.V.
Inventors: Marcel J. A. Tromp, Frans W Sijstermans, Sunny C Huang, Rudolf H. J. Bloks
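A small Python sketch can illustrate the idea of an operation mask gating the operations of a following multi-operation instruction. The function name, the use of callables to stand in for operations, and the convention that bit i of the mask enables slot i are all assumptions; the abstract does not describe the mask encoding.

```python
def execute_masked(mask: int, operations):
    """Execute only the operations whose mask bit is set.

    Hypothetical convention: bit i of `mask` enables operation slot i.
    Disabled slots produce None instead of a result.
    """
    results = []
    for slot, op in enumerate(operations):
        if mask & (1 << slot):
            results.append(op())
        else:
            results.append(None)  # operation suppressed by the mask
    return results
```

With a mask of `0b101`, slots 0 and 2 of a three-operation instruction run and slot 1 is skipped, which is the kind of selective disabling the abstract suggests for debugging.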
-
Patent number: 7266671
Abstract: There is disclosed a technique for accessing a register file which comprises defining a first register address as a plurality of bits and using said first register address to access said register file, generating a second register address by using a sequence of said plurality of bits with at least one of said plurality of bits supplied via a unitary operator, the unitary operator being effective to selectively alter the logical value of said bit depending on its logical value in the first register address, and using said second register address to access said register file. A computer system for carrying out such a technique is also disclosed.
Type: Grant
Filed: December 6, 2004
Date of Patent: September 4, 2007
Assignee: Broadcom Corporation
Inventors: Mark Taunton, Sophie Wilson, Timothy Martin Dobson
-
Patent number: 7260709
Abstract: The present invention relates to a processing method and apparatus for implementing a systolic-array-like structure. Input data are stored in a depth-configurable register means (DCF) in a predetermined sequence, and are supplied to a processing means (FU) for processing said input data based on control signals generated from instruction data, wherein the depth of the register means (DCF) is controlled in accordance with the instruction data. Thereby, systolic arrays can be mapped onto a programmable processor, e.g. a VLIW processor, without the need for explicitly issuing operations to implement the register moves that constitute the delay lines of the array.
Type: Grant
Filed: April 1, 2003
Date of Patent: August 21, 2007
Assignee: Koninklijke Philips Electronics N.V.
Inventor: Bernardo De Oliveira Kastrup Pereira
-
Patent number: 7257696
Abstract: Techniques for adding more complex instructions and their attendant multi-cycle execution units with a single instruction multiple data stream (SIMD) very long instruction word (VLIW) processing framework are described. In one aspect, an initiation mechanism also acts as a resynchronization mechanism to read the results of multi-cycle execution. This multi-purpose mechanism operates with a short instruction word (SIW) issue of the multi-cycle instruction, in a sequence processor (SP) alone, with a VLIW, and across all processing elements (PEs) individually or as an array of PEs. A number of advantageous floating point instructions are also described.
Type: Grant
Filed: August 15, 2003
Date of Patent: August 14, 2007
Assignee: Altera Corporation
Inventors: Gerald George Pechanek, David Strube, Edward A. Wolff, Edwin Franklin Barry, Grayson Morris, Carl Donald Busboom, Dale Edward Schneider
-
Patent number: 7254697
Abstract: Dynamic reformatting of a dispatch group by selective activation of inactive Start bits of instructions within the dispatch group at the time the instructions are read from the IBUF. The number of instructions in the reformatted dispatch groups can vary from as few as one instruction per group to a maximum number of instructions read from the IBUF per cycle. The reformatted dispatch groupings can be terminated after a single cycle, or they can remain reformatted for as many cycles as desired, depending upon need.
Type: Grant
Filed: February 11, 2005
Date of Patent: August 7, 2007
Assignee: International Business Machines Corporation
Inventors: James Wilson Bishop, Hung Qui Le, Jafar Nahidi, Dung Quoc Nguyen, Brian William Thompto
-
Publication number: 20070168645
Abstract: Methods and processor architectures for the execution of instructions having a condition are disclosed. Very long instruction words can be loaded from a memory unit into an instruction word decoder and the decoder can separate the VLIW into processable sequences. Each processable sequence can be processable by a processing unit among a plurality of processing units. Each processable sequence can be executed independently in the absence of a condition in the processable sequences, and when the processable sequences contain a condition, processing units can be logically coupled together to add processing resources to a processing intensive condition type code to assist in disposing of the conditional execution quickly by assigning these additional resources.
Type: Application
Filed: January 16, 2007
Publication date: July 19, 2007
Inventors: Karl Heinz Grabner, Robert Klima
-
Patent number: 7243213
Abstract: A procedure for translating ARM instructions of a first set into instructions of a second set for execution on an LX processor comprising a core provides a first set of registers corresponding to the ARM instructions and a second set of registers corresponding to the instructions that can be executed on the LX processor. Each register of the first set is mapped in a corresponding register of the second set designed to emulate the behavior of the first register, obtaining a unique independent translation of the first set into the second set. The translation is performed by a translation device external to the LX core without altering the core, and the translation operating without accessing resources of the core, by the translating device intercepting accesses of the core to the storage area reserved to the ARM instructions.
Type: Grant
Filed: February 10, 2004
Date of Patent: July 10, 2007
Assignee: STMicroelectronics S.r.l.
Inventors: Andrea Pagni, Fabrizio Lucini, Danilo Pietro Pau, Antonio Maria Borneo, Vittorio Zaccaria
-
Patent number: 7234042
Abstract: An instruction set for a computer is described which includes instructions having a common predetermined bit length. That predetermined bit length can define a single operation or two independent operations. The instruction includes designated bits at predetermined bit locations which identify whether the instruction is a long instruction or a dual operation instruction.
Type: Grant
Filed: September 13, 1999
Date of Patent: June 19, 2007
Assignee: Broadcom Corporation
Inventor: Sophie Wilson
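To make the idea of designated format bits concrete, here is a minimal Python sketch of a decoder for a fixed-width instruction that holds either one long operation or two independent operations. The 64-bit width, the two-bit tag in the top bits, the tag value `0b11` for a long instruction and the 31-bit operation size are all invented for illustration and are not taken from the patent.

```python
WIDTH = 64             # assumed common instruction length in bits
TAG_SHIFT = WIDTH - 2  # assumed: the top two bits are the designated bits
LONG_TAG = 0b11        # assumed tag value marking a long instruction


def decode(word: int):
    """Classify a fixed-width instruction word and split out its payload."""
    tag = word >> TAG_SHIFT
    payload = word & ((1 << TAG_SHIFT) - 1)
    if tag == LONG_TAG:
        return ("long", payload)
    # Dual-operation form: two equal-sized, independent operations.
    half = TAG_SHIFT // 2
    return ("dual", payload >> half, payload & ((1 << half) - 1))
```

Because the tag sits at a fixed bit position, the decoder can pick the format before examining any other bits of the word.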
-
Patent number: 7203932
Abstract: A method for using idiom recognition during a software translation process. The method includes accessing non-native instructions of a non-native application, determining whether an instruction pattern of the non-native instructions is recognized from a previous execution, if recognized, retrieving and executing translated instructions corresponding to the non-native instructions.
Type: Grant
Filed: December 30, 2002
Date of Patent: April 10, 2007
Assignee: Transmeta Corporation
Inventors: Dean Gaudet, Brian O'Clair
-
Patent number: 7194734
Abstract: A threaded interpreter executes a program having a series of program instructions stored in a memory. For the execution of a program instruction the threaded interpreter includes a preparatory unit for executing a plurality of preparatory steps making the program instruction available in the threaded interpreter, and an execution unit with one or more machine instructions emulating the program instruction. The threaded interpreter is designed such that during the execution on an instruction-level parallel processor of the series of program instructions, machine instructions implementing a first one of the preparatory steps are executed in parallel with machine instructions implementing a second one of the preparatory steps for respective ones of the series of program instructions.
Type: Grant
Filed: February 13, 2003
Date of Patent: March 20, 2007
Assignee: Koninklijke Philips Electronics N.V.
Inventors: Jan Hoogerbrugge, Alexander Augusteijn
-
Patent number: 7181595Abstract: To decode a composite VLIW packet, assembly code is provided for the bit patterns corresponding to each individual instruction in the packet. The bit pattern for the template in the packet is then matched against a known template. The known template uniquely corresponds to a known syntax corresponding to one of a plurality of known syntaxes, where the plurality of known syntaxes are arranged as a plurality of first level nodes in a tree structure, where each of a plurality of second level nodes in said tree structure includes a combination of instruction types, and where each of a plurality of third level nodes in said tree structure includes an instruction type. The known syntax is then matched to a resolved packet syntax using the tree structure. The resolved packet syntax is then used to provide assembly code associated with the execution of the combination of instructions in the packet.Type: GrantFiled: November 22, 2000Date of Patent: February 20, 2007Assignee: Mindspeed Technologies, Inc.Inventor: Charles P. Siska
-
Patent number: 7174014Abstract: The present invention provides permutation instructions usable in a programmable processor for solving permutation problems in cryptography, multimedia and other applications. PPERM and PPERM3R instructions are defined to perform permutations by a sequence of instructions with each sequence specifying the position in the source for each bit in the destination. In the PPERM instruction bits in the destination register that change are updated and bits in the destination register that do not change are set to zero. In the PPERM3R instruction bits in the destination register that change are updated and bits in the destination register that do not change are copied from intermediate result of previous PPERM3R instructions. Both PPERM and PPERM3R instructions can individually do permutation with bit repetition. Both PPERM and PPERM3R instructions can individually do permutation of bits stored in more than one register. In an alternate embodiment, a GRP instruction is defined to perform permutations.Type: GrantFiled: May 7, 2001Date of Patent: February 6, 2007Assignee: Teleputers, LLCInventors: Ruby B. Lee, Zhijie Shi
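The difference between the two instructions in this abstract, zeroing versus preserving the destination bits that are not written, can be sketched in software. The function signatures, the least-significant-bit-first numbering, and the four-bits-per-instruction chunk size below are assumptions chosen for illustration; they are not the operand encoding defined in the patent.

```python
def pperm(src, dest_start, positions):
    """PPERM-like step: write len(positions) destination bits starting at
    dest_start, each taken from the given source bit position; every other
    destination bit is zero."""
    result = 0
    for offset, pos in enumerate(positions):
        bit = (src >> pos) & 1
        result |= bit << (dest_start + offset)
    return result


def pperm3r(src, dest_start, positions, intermediate):
    """PPERM3R-like step: as pperm, but destination bits that are not
    written are copied from the intermediate result of earlier steps."""
    mask = ((1 << len(positions)) - 1) << dest_start
    return (intermediate & ~mask) | pperm(src, dest_start, positions)


def permute(src, mapping, chunk=4):
    """Build a full permutation from a sequence of pperm3r steps.
    mapping[i] is the source position feeding destination bit i."""
    result = 0
    for start in range(0, len(mapping), chunk):
        result = pperm3r(src, start, mapping[start:start + chunk], result)
    return result
```

For example, `permute(0b00001101, [7, 6, 5, 4, 3, 2, 1, 0])` reverses an 8-bit value and yields `0b10110000`, and listing the same source position more than once, as in `pperm(0b1, 0, [0, 0, 0])`, gives permutation with bit repetition.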
-
Patent number: 7162620Abstract: A multi-processing computer architecture and a method of operating the same are provided. The multi-processing architecture provides a main processor and multiple sub-processors cascaded together to efficiently execute loop operations. The main processor executes operations outside of a loop and controls the loop. The multiple sub-processors are operably interconnected, and are each assigned by the main processor to a given loop iteration. Each sub-processor is operable to receive one or more sub-instructions sequentially, operate on each sub-instruction and propagate the sub-instruction to a subsequent sub-processor.Type: GrantFiled: July 24, 2002Date of Patent: January 9, 2007Assignee: Sony Computer Entertainment Inc.Inventor: Hidetaka Magoshi
-
Patent number: 7146487Abstract: General-purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PEs). Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions.Type: GrantFiled: November 20, 2003Date of Patent: December 5, 2006Assignee: Altera CorporationInventors: Thomas L. Drabenstott, Gerald George Pechanek, Edwin Franklin Barry, Charles W. Kurak, Jr.
-
Patent number: 7139897Abstract: Circuit arrangement and method for dispatching computer instructions. In a processor having a plurality of types of execution units, the computer instructions are grouped in bundles, and each bundle includes a plurality of instructions and an associated index code. Template values are stored in a plurality of template registers, and each template value specifies types of execution units for a bundle of instructions and those instructions in a bundle that are executable in parallel. A dispatch logic circuit is coupled to the template registers and is responsive to an input bundle of instructions and associated index value. The dispatch logic circuit reads a code from a selected one of the plurality of template registers referenced by the index value and issues one or more selected instructions in the bundle to at least one execution unit of a selected type responsive to the code read from the selected one of the plurality of template registers.Type: GrantFiled: April 1, 2002Date of Patent: November 21, 2006Assignee: Hewlett-Packard Development Company, L.P.Inventors: Paul Keltcher, Gary Vondran
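One way to picture the template-register lookup in this abstract is as a small table indexed by the bundle's index code. In the sketch below, the table contents, the unit-type letters, the per-slot "stop" flags used to mark where a parallel group ends, and the name `dispatch` are all illustrative assumptions rather than the circuit the patent describes.

```python
# Hypothetical template registers. Each entry holds, per slot, the type of
# execution unit required and a flag saying whether a parallel group ends
# after that slot.
TEMPLATE_REGISTERS = [
    (("M", "I", "I"), (False, False, True)),  # all three slots in parallel
    (("M", "I", "B"), (True, False, True)),   # slot 0 alone, then slots 1-2
]


def dispatch(bundle, index):
    """Return the issue groups for a bundle as lists of
    (unit type, instruction) pairs, using the indexed template register."""
    unit_types, stops = TEMPLATE_REGISTERS[index]
    groups, current = [], []
    for instr, unit, stop in zip(bundle, unit_types, stops):
        current.append((unit, instr))
        if stop:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups
```

Keeping the template values in registers rather than in the bundle itself means, in this sketch, that changing `TEMPLATE_REGISTERS` changes how an index code is interpreted without re-encoding the bundles.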
-
Patent number: 7134001Abstract: Instructions asserted in a microprocessor's instruction pipeline (3) are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (5) that is synchronized to the instruction pipeline. At the execution stage, the control information is interpreted and appropriate action taken. The control information may indicate that the instruction has been reasserted (asserted again following an initial assertion) and may also indicate the number of times that the instruction has been consecutively asserted in the instruction pipeline. Applied to unaligned memory operations, in which a memory atom is asserted twice, the control information indicates which part of the unaligned data is to be fetched each time the atom is executed.Type: GrantFiled: June 16, 2003Date of Patent: November 7, 2006Assignee: Transmeta CorporationInventors: Brett Coon, Godfrey D'Souza, Paul Serris
-
Patent number: 7127588Abstract: In one exemplary embodiment, the disclosed VLIW processor comprises a number of threads where each thread includes a processing unit. For example, there can be two threads, where each of the two threads has its own processing unit. According to this exemplary embodiment, a number of VLIW packets are divided into a number of issue groups. As an example, two VLIW packets are divided into two issue groups each. The first issue group in the first VLIW packet is provided to a first thread for execution in the first thread processing unit during a first clock cycle. Concurrently, the first issue group in the second VLIW packet is provided to a second thread for execution in the second thread processing unit during the same clock cycle, i.e. during the first clock cycle. Moreover, the second issue group in the first VLIW packet is provided to the first thread for execution in the first thread processing unit during a second clock cycle.Type: GrantFiled: December 5, 2000Date of Patent: October 24, 2006Assignee: Mindspeed Technologies, Inc.Inventors: Moataz A. Mohamed, John R. Spence
-
Patent number: 7124279Abstract: Instructions of a program are stored in compressed form in a program memory. A cache loading unit includes a decompression section and performs a cache loading operation in which one or more compressed-form instructions are read from the position in the program memory identified by the program counter and are decompressed and stored in one of the said cache blocks of the instruction cache. When a cache miss occurs because the instruction to be fetched is not present in the instruction cache, a cache loading unit performs such a cache loading operation. An updating unit updates the program counter and cache pointer in response to the fetching of instructions so as to ensure that the position identified by the said program counter is maintained consistently at the position in the program memory at which the instruction to be fetched from the instruction cache is stored in compressed form.Type: GrantFiled: May 22, 2001Date of Patent: October 17, 2006Assignee: PTS CorporationInventor: Nigel Peter Topham
-
Patent number: 7120780Abstract: A microprocessor for processing instructions comprises multiple clusters for receiving the instructions, each of the clusters having a plurality of functional units for executing the instructions, multiple register sub-files each having multiple registers for storing data for executing the instructions, wherein each of the clusters is associated with corresponding one of the register sub-files so that an instruction dispatched to a cluster is executed by accessing registers in a register sub-file associated with the cluster to which the instruction is dispatched, a register-renaming unit for renaming target registers in an instruction with registers in a register sub-file associated with a cluster to which the instruction is dispatched, and issue-queue units each of which is associated with a corresponding one of the clusters, wherein an issue-queue unit holds instruction renamed by the register-renaming unit until the renamed instruction is issued to be executed in a cluster associated with the issue-queue uType: GrantFiled: March 4, 2002Date of Patent: October 10, 2006Assignee: International Business Machines CorporationInventor: Mayan Moudgill
-
Patent number: 7111152Abstract: Instructions in a computer system are executed in a plurality of parallel execution pipelines. A horizontal dependency check is carried out between instructions supplied to the parallel pipelines and, in response to detecting a horizontal dependency, a control signal of a first or second type is generated depending on whether the dependency can be resolved by activating a by-pass or whether a temporary stall is required in one of the pipelines.Type: GrantFiled: May 2, 2000Date of Patent: September 19, 2006Assignee: STMicroelectronics S.A.Inventors: Andrew Cofler, Bruno Fel, Laurent Ducousso
-
Patent number: 7107432Abstract: A VLIW processor comprising: a plurality of functional units (1, 3); a distributed register file (4) comprising a plurality of segments (5, 7, 9), the distributed register file (4) being accessible by the functional units (1, 3); a communication unit (11) for communication with a memory; a communication network (13) for coupling the functional units (1, 3) and the distributed register file (4); characterized in that the VLIW processor further comprises a spilling device (15) for transferring data between the distributed register file (4) and the communication unit (11).Type: GrantFiled: April 1, 2003Date of Patent: September 12, 2006Assignee: Koninklijke Philips Electronics N.V.Inventors: Tromp Johannes De Vries, Marco Jan Gerrit Bekooij, Alexander Augusteijn, Johan Sebastiaan Henri Van Gageldonk
-
Patent number: 7100022Abstract: In one embodiment, move buses utilized in presently known VLIW processors are eliminated and replaced with a busing scheme which results in transfer of operands from each register file bank to any data path block while also reducing the total bus width and total power consumption associated with transport of operands from register file banks to data path blocks. According to this busing scheme, the speed of the VLIW processor is also improved since the need for one clock cycle to move operands from one register file bank to another is overcome. In another embodiment, a scheduling restriction is used to eliminate the need for the presently required write back buses used by various data path blocks. In yet another embodiment, a scheduling restriction is imposed which results in a reduction of the number of ports, a reduction in the width of buses, and a reduction of power consumption.Type: GrantFiled: February 28, 2002Date of Patent: August 29, 2006Assignee: Mindspeed Technologies, Inc.Inventors: Moataz Mohamed, John Spence, Kevin R. Bowles, Chien-Wei Li
-
Patent number: 7096343Abstract: A method and apparatus are disclosed for allocating functional units in a multithreaded very long instruction word (VLIW) processor. The present invention combines the techniques of conventional VLIW architectures and conventional multithreaded architectures to reduce execution time within an individual program, as well as across a workload. The present invention utilizes instruction packet splitting to recover some efficiency lost with conventional multithreaded architectures. Instruction packet splitting allows an instruction bundle to be partially issued in one cycle, with the remainder of the bundle issued during a subsequent cycle. The allocation hardware assigns as many instructions from each packet as will fit on the available functional units, rather than allocating all instructions in an instruction packet at one time. Those instructions that cannot be allocated to a functional unit are retained in a ready-to-run register.Type: GrantFiled: March 30, 2000Date of Patent: August 22, 2006Assignee: Agere Systems Inc.Inventors: Alan David Berenbaum, Nevin Heintze, Tor E. Jeremiassen, Stefanos Kaxiras
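The packet-splitting allocation in this abstract can be modeled for a single cycle as a greedy assignment. The sketch below is a software simplification under stated assumptions: packets are represented only by the unit type each instruction needs, threads are served in list order, dependencies between instructions are ignored, and the name `allocate_cycle` is hypothetical.

```python
def allocate_cycle(packets, free_units):
    """Allocate one cycle's functional units across per-thread packets.

    packets:    one list per thread, naming the unit type each instruction needs
    free_units: mapping from unit type to the number of units available
    Returns (issued, retained); retained plays the role of the
    ready-to-run register holding the unissued part of each packet.
    """
    free = dict(free_units)  # copy so the caller's mapping is not modified
    issued, retained = [], []
    for packet in packets:
        issued_now, held = [], []
        for unit in packet:
            if free.get(unit, 0) > 0:
                free[unit] -= 1
                issued_now.append(unit)
            else:
                held.append(unit)
        issued.append(issued_now)
        retained.append(held)
    return issued, retained
```

With three ALUs and one memory unit, `allocate_cycle([["alu", "alu", "mem"], ["alu", "mem"]], {"alu": 3, "mem": 1})` issues the first packet whole and splits the second: its `alu` instruction issues and its `mem` instruction is retained for a later cycle.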
-
Patent number: 7089402Abstract: Controlling an order of instructions executed by a VLIW computing architecture comprised of a plurality of computing functional units. A first VLIW instruction sequence is generated based on jobs selected from a job queue, each instruction field of the VLIWs corresponding to one of the computing functional units and containing a sequential instruction. The first VLIW sequence is sequentially executed by the computing architecture, and a detection is made if any of the computing functional units is in a free state. When at least one free computing functional unit is detected, a second sequence of long instruction words is generated including instructions from a selected new job from the job queue for each such free computing functional unit. The second sequence of long instruction words is copied into the first sequence of long instruction words, and execution of the first sequence of long instruction words is resumed if it was halted.Type: GrantFiled: December 12, 2001Date of Patent: August 8, 2006Assignee: Canon Kabushiki KaishaInventor: Sadahiro Tanaka