Long Instruction Word Patents (Class 712/24)
  • Patent number: 7873794
    Abstract: Disclosed is an apparatus, method, and program product that provides atomic, multi-word load support without incurring additional memory utilization. A double-word is atomically loaded without the use of one or more additional fields and without a lock. An invalidity marker is used in connection with a cache miss time to ascertain whether a loaded double-word has been stored and loaded atomically, and is thus, valid.
    Type: Grant
    Filed: August 21, 2007
    Date of Patent: January 18, 2011
    Assignee: International Business Machines Corporation
    Inventors: Michael Joseph Corrigan, Timothy Joseph Torzewski
  • Patent number: 7861061
    Abstract: A processor and a method for executing VLIW instructions by first fetching a VLIW instruction and then identifying from option bits encoded in a first one of the instructions within the fetched VLIW instruction packet which, if any, of the remaining instructions within the VLIW instruction are to be executed in the same execution cycle as the first instruction. Finally, executing the first instruction and any remaining instructions identified from the encoded option bits.
    Type: Grant
    Filed: May 23, 2003
    Date of Patent: December 28, 2010
    Assignee: STMicroelectronics (R&D) Ltd.
    Inventor: Zahid Hussain
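
The option-bit selection described in the abstract of 7861061 can be sketched as follows. This is only an illustration: the `Instr` type, the `select_for_cycle` helper, and the choice that bit i of the option bits enables the i-th remaining slot are all assumptions, not details taken from the patent.

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    name: str
    option_bits: int = 0  # assumed to be meaningful only in the first slot


def select_for_cycle(packet: List[Instr]) -> List[Instr]:
    """Return the instructions issued in the same cycle as packet[0]."""
    first, rest = packet[0], packet[1:]
    issued = [first]
    for i, instr in enumerate(rest):
        # Assumed encoding: bit i of the option bits enables remaining slot i.
        if (first.option_bits >> i) & 1:
            issued.append(instr)
    return issued


packet = [Instr("add", 0b101), Instr("mul"), Instr("ld"), Instr("st")]
print([i.name for i in select_for_cycle(packet)])  # ['add', 'mul', 'st']
```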
  • Patent number: 7831804
    Abstract: A processor architecture includes a number of processing elements for treating input signals. The architecture is organized according to a matrix including rows and columns, the columns of which each include at least one microprocessor block having a computational part and a set of associated processing elements that are able to receive the same input signals. The number of associated processing elements is selectively variable in the direction of the column so as to exploit the parallelism of said signals. Additionally, the processor architecture of the present invention enables dynamic switching between instruction parallelism and data parallel processing typical of vectorial functionality. The architecture can be scaled in various dimensions in an optimal configuration for the algorithm to be executed.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: November 9, 2010
    Assignee: ST Microelectronics S.R.L.
    Inventors: Francesco Pappalardo, Giuseppe Notarangelo, Elio Guidetti
  • Publication number: 20100268862
    Abstract: A technology for controlling a reconfigurable processor is provided. The reconfigurable processor dynamically loads configuration data from a peripheral memory to a configuration memory while a program is being executed, in place of loading all compiled configuration data in advance into the configuration memory when booting commences. Accordingly, a reduction in capacity of a configuration memory may be achieved.
    Type: Application
    Filed: March 2, 2010
    Publication date: October 21, 2010
    Inventors: Jae-un PARK, Ki-seok KWON, Sang-suk LEE
  • Publication number: 20100211759
    Abstract: An apparatus and method for generating a very long instruction word (VLIW) command that supports predicated execution, and a VLIW processor and method for processing a VLIW are provided herein. The VLIW command includes an instruction bundle formed of a plurality of instructions to be executed in parallel and a single value indicating predicated execution, and is generated using the apparatus and method for generating a VLIW command. The VLIW processor decodes the instruction bundle and executes the instructions, which are included in the decoded instruction bundle, in parallel, according to the value indicating predicated execution.
    Type: Application
    Filed: February 16, 2010
    Publication date: August 19, 2010
    Inventors: Bernhard Egger, Soo-jung Ryu, Dong-hoon Yoo, Il-hyun Park
  • Patent number: 7774581
    Abstract: An apparatus and a method are provided for a parallel processing very long instruction word (VLIW) computer. The apparatus includes: an index code generation unit sequentially generating an index code, which is associated with a number of no operation (NOP) instruction words between effective instruction words, with respect to each of instruction word groups to be executed in a VLIW computer; an instruction compression unit sequentially deleting the NOP instruction word which corresponds to the index code with respect to each of instruction word groups; and an instruction word conversion unit converting the effective instruction words to include the index code, the effective instruction words corresponding to the NOP instruction words.
    Type: Grant
    Filed: August 14, 2007
    Date of Patent: August 10, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Chang-Woo Baek, Hong-Seok Kim, Hee Seok Kim, Jeongwook Kim
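
The NOP-compression idea in the abstract of 7774581 can be illustrated with a small round-trip sketch. The encoding is an assumption made for illustration: each effective word carries an index equal to the number of NOPs that preceded it in its group, and trailing NOPs are recovered from a fixed group width. The names `compress` and `expand` are hypothetical.

```python
NOP = "nop"


def compress(group):
    """Delete NOPs; tag each effective word with the count of NOPs before it."""
    out, pending = [], 0
    for word in group:
        if word == NOP:
            pending += 1
        else:
            out.append((pending, word))
            pending = 0
    return out


def expand(compressed, width):
    """Rebuild the original group; trailing NOPs are implied by the width."""
    group = []
    for nops_before, word in compressed:
        group.extend([NOP] * nops_before)
        group.append(word)
    group.extend([NOP] * (width - len(group)))
    return group


group = ["add", NOP, NOP, "mul", NOP, "ld", NOP, NOP]
packed = compress(group)
print(packed)  # [(0, 'add'), (2, 'mul'), (1, 'ld')]
```

Here eight instruction slots shrink to three tagged words, and `expand(packed, 8)` restores the original group.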
  • Patent number: 7747843
    Abstract: A computer system with a processor architecture having more than one execution channel is described. The processor architecture contains at least one load/store unit for loading and storing data objects, and at least one data cache memory associated to the processor holding data objects accessed by the processor. The processor's load/store unit includes a load/store memory directly interfacing the load/store unit to the data cache.
    Type: Grant
    Filed: June 2, 2004
    Date of Patent: June 29, 2010
    Assignee: Broadcom Corporation
    Inventors: Sophie Wilson, John E. Redford
  • Patent number: 7739479
    Abstract: A method of providing physics data within a game program or simulation using a hardware-based physics processing unit having unique architecture designed to efficiently calculate physics related data.
    Type: Grant
    Filed: November 19, 2003
    Date of Patent: June 15, 2010
    Assignee: NVIDIA Corporation
    Inventors: Jean Pierre Bordes, Curtis Davis, Monier Maher, Manju Hegde, Otto A. Schmid
  • Patent number: 7694301
    Abstract: A method for supporting input/output for a virtual machine. The method includes the step of executing virtual machine application instructions, wherein the application instructions are executed using micro architecture code of a processor architecture. An I/O access is received from the virtual machine application. Virtual memory protection is used to generate an exception, wherein the exception is caused by the I/O access. A single step mode is entered to perform the I/O access using a host operating system. State data for the virtual machine application is updated in accordance with the I/O access. Subsequently, execution of the virtual machine application is resumed.
    Type: Grant
    Filed: June 27, 2003
    Date of Patent: April 6, 2010
    Inventors: Nathan Laredo, Linus Torvalds
  • Patent number: 7685403
    Abstract: Instructions asserted in the instruction pipeline (3) of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (15) of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units (7), to interpret that control information and take appropriate action.
    Type: Grant
    Filed: June 16, 2003
    Date of Patent: March 23, 2010
    Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
  • Patent number: 7681046
    Abstract: A system with secure cryptographic capabilities using a hardware specific digital secret.
    Type: Grant
    Filed: September 26, 2003
    Date of Patent: March 16, 2010
    Inventors: Andrew Morgan, H. Peter Anvin
  • Patent number: 7673119
    Abstract: This invention is useful in a very long instruction word data processor that fetches a predetermined plural number of instructions each operation cycle. A predetermined one of these instructions is used as a special header. This special header has a unique encoding different from any normal instruction. When decoded this special header instructs decode hardware to decode this fetch packet in a special way. In one embodiment a bit field in the header signals the decode hardware whether to decode each instruction word normally or in an alternative way. The header may include extension opcode bits corresponding to each of the other instruction slots. In another embodiment another bit field signals whether to decode an instruction field as one normal length instruction or as two half-length instructions.
    Type: Grant
    Filed: May 8, 2006
    Date of Patent: March 2, 2010
    Assignee: Texas Instruments Incorporated
    Inventors: Michael D. Asal, Eric J. Stotzer, Todd T. Hahn
  • Patent number: 7669041
    Abstract: A processor having a zero-overhead operand copy capability. The processor includes multiple execution units to execute instructions in parallel and multiple register files each associated with one or more of the execution units. The processor further includes circuitry to select either an instruction execution result from a first one of the execution units or content of a register within a first one of the register files associated with the first one of the execution units to be stored within a register within a second one of the register files.
    Type: Grant
    Filed: April 30, 2007
    Date of Patent: February 23, 2010
    Assignee: Stream Processors, Inc.
    Inventors: Brucek Khailany, Ujval J. Kapasi
  • Patent number: 7664929
    Abstract: A program of instruction words is executed with a VLIW data processing apparatus. The apparatus comprises a plurality of functional units capable of executing a plurality of instructions from each instruction word in parallel. The instructions from each of at least some of the instruction words are fetched from respective memory units in parallel, addressed with an instruction address that is common for the functional units. Translation of the instruction address into a physical address can be modified for one or more particular ones of the memory units. Modification is controlled by modification update instructions in the program. Thus, it can be selected dependent on program execution which instructions from the memory units will be combined into the instruction word in response to the instruction address.
    Type: Grant
    Filed: September 17, 2003
    Date of Patent: February 16, 2010
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Carlos Antonio Alba Pinto, Ramanathan Sethuraman, Srinivasan Balakrishnan, Harm Johannes Antonius Maria Peters, Rafael Peset Llopis
  • Publication number: 20100037039
    Abstract: It is possible to increase the processor instruction set design job efficiency and reduce workload on designers in investigation of an instruction set. An instruction operation code generation system includes an operation code bit width decision means, an instruction sorting means, and an operation code value decision means. The operation code bit width decision means decides a bit width that can be assigned for an operation code of each instruction according to specification data associated with a processor instruction set. The instruction sorting means sorts the instructions according to the operation code bit width. The operation code value decision means decides the value of the operation code of each instruction.
    Type: Application
    Filed: November 19, 2007
    Publication date: February 11, 2010
    Inventor: Takahiro Kumura
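
The three steps named in the abstract of 20100037039 (decide opcode widths, sort by width, decide opcode values) can be sketched as below. The abstract does not say how the values are chosen; the prefix-free assignment used here is an assumption borrowed from canonical code construction, and `assign_opcodes`, `operand_bits`, and `word_bits` are hypothetical names.

```python
def assign_opcodes(operand_bits, word_bits=16):
    """Map instruction name -> opcode bit string (prefix-free, an assumption)."""
    # Step 1: opcode width is whatever the operands leave free in the word.
    widths = {name: word_bits - bits for name, bits in operand_bits.items()}
    # Step 2: sort instructions by opcode width, shortest first.
    order = sorted(widths, key=lambda n: (widths[n], n))
    # Step 3: hand out values so that no opcode is a prefix of another.
    codes, code, prev = {}, 0, None
    for name in order:
        w = widths[name]
        if prev is not None:
            code = (code + 1) << (w - prev)
        if code >> w:
            raise ValueError("opcode space exhausted")
        codes[name] = format(code, "0{}b".format(w))
        prev = w
    return codes


spec = {"jmp": 14, "addi": 13, "add": 12, "nop": 12}  # operand bits per instruction
print(assign_opcodes(spec))
```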
  • Patent number: 7647473
    Abstract: An instruction processing method for checking an arrangement of basic instructions in a very long instruction word (VLIW) instruction, suitable for language processing systems, an assembler and a compiler, used for processors which execute variable length VLIW instructions designed based on variable length VLIW architecture.
    Type: Grant
    Filed: January 24, 2002
    Date of Patent: January 12, 2010
    Assignee: Fujitsu Limited
    Inventors: Teruhiko Kamigata, Hideo Miyake
  • Patent number: 7647474
    Abstract: Embodiments of a method and system for saving system context after a power outage are disclosed herein. A power agent operates to reduce the possibility of data corruption due to partially written data during an unexpected power outage. The power agent can determine an amount of time remaining before a power store is depleted. Based on the amount of time, the power agent can store system context information. Correspondingly, the power agent can operate to save complete system context, partial system context, or flush (I/O) buffers. Once power is restored, the power agent can restore the system context based on the nature of the save. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 27, 2005
    Date of Patent: January 12, 2010
    Assignee: Intel Corporation
    Inventors: Mallik Bulusu, Vincent J. Zimmer, Michael A. Rothman
  • Patent number: 7640417
    Abstract: Methods and apparatus relating to speculatively decoding instruction lengths in order to increase instruction throughput are described. In an embodiment, instructions are speculatively decoded within a pipelined microprocessor architecture such that up to four instruction lengths may be decoded within a maximum of two processor clock cycles. Other embodiments are also disclosed.
    Type: Grant
    Filed: October 1, 2007
    Date of Patent: December 29, 2009
    Assignee: Intel Corporation
    Inventor: Venkateswara Rao Madduri
  • Patent number: 7627735
    Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 21, 2005
    Date of Patent: December 1, 2009
    Assignee: Intel Corporation
    Inventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
  • Publication number: 20090240914
    Abstract: A pipelined microprocessor configured for long operand instructions is disclosed. The microprocessor includes a memory unit and a load-store unit. The load-store unit is coupled to the memory unit and includes a data formatter receiving information from the memory unit and including an operand selector and a shift register portion. The microprocessor also includes an execution unit coupled to the load-store unit and receiving operand information therefrom. The execution unit includes output latches coupled to a storage location within the execution unit for storing output information from the execution unit.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Edward T. Malley, Khary J. Alexander, Fadi Y. Busaba, Vimal M. Kapadia, Jeffrey S. Plate, John G. Rell, Jr., Chung-Lung Kevin Shum
  • Patent number: 7590824
    Abstract: Techniques for processing transmissions in a communications (e.g., CDMA) system. A method and system for issuing and executing mixed architecture instructions in a multiple-issue digital signal processor receives in a mixed instruction listing a plurality of digital signal processor instructions. The plurality of digital signal processor instructions includes a plurality of parallel executable instructions (e.g., VLIW instructions or instruction packets) mixed among a plurality of series executable instructions (e.g., superscalar instructions). The series executable instructions are associated by various instruction dependencies. The method and system further identify in the mixed instruction listing the plurality of parallel executable instructions. Once identified, the parallel executable instructions are first executed in parallel irrespective of any such instruction's relative order in the mixed instruction listing.
    Type: Grant
    Filed: March 29, 2005
    Date of Patent: September 15, 2009
    Assignee: QUALCOMM Incorporated
    Inventors: Muhammad Ahmed, Erich Plondke, Lucian Codrescu, William C. Anderson
  • Patent number: 7584405
    Abstract: A method for detecting computational errors in a digital processor executing a program. Initially, the program is divided into computation segments, and source code for at least one of the segments is compiled to generate two redundant code sections. Comparison code is generated for comparing results produced by execution of the two code sections. Each of the code sections is then executed in a different computational domain to generate respective results. The results of the computation are executed to alter further flow of the program only if the respective results are identical.
    Type: Grant
    Filed: December 3, 2003
    Date of Patent: September 1, 2009
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Benjamin Daniel Osecky, Blaine Douglas Gaither
  • Patent number: 7574583
    Abstract: Differences in encoding efficiency of instructions may arise if certain operations require very large immediate values as operands, as opposed to others requiring no immediate values or small immediate values. The present invention describes a processing apparatus, a compiler as well as a method for processing data, allowing the use of instructions that require large immediate data, while simultaneously maintaining an efficient encoding and decoding of instructions. The processing apparatus comprises a plurality of issue slots (UC0, UC1, UC2, UC3), wherein each issue slot comprises a plurality of functional units (FU20, FU21, FU22). The processing apparatus is arranged for processing data, based on control signals generated from a set of instructions being executed in parallel. The processing apparatus further comprises a dedicated issue slot (UC4) arranged for loading an immediate value (IMV1) in dependence upon a dedicated instruction (IMM).
    Type: Grant
    Filed: August 8, 2003
    Date of Patent: August 11, 2009
    Assignee: Silicon Hive B.V.
    Inventors: Jeroen Anton Johan Leijten, Willem Charles Mallon
  • Publication number: 20090193226
    Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.
    Type: Application
    Filed: April 6, 2009
    Publication date: July 30, 2009
    Applicant: PANASONIC CORPORATION
    Inventors: Shuichi TAKAYAMA, Nobuo Higaki
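
The field layout in the abstract of 20090193226 (a 4-bit format field, a 4-bit operation field, and two 12-bit operation fields in a 32-bit instruction) can be unpacked with shifts and masks. The bit ordering below, with the format field in the most significant bits, is an assumption, as are the function and key names.

```python
def decode(word):
    """Split a 32-bit instruction into its four fields (assumed MSB-first)."""
    return {
        "format": (word >> 28) & 0xF,    # 4-bit format field 51
        "op4": (word >> 24) & 0xF,       # 4-bit operation field 52
        "op12_a": (word >> 12) & 0xFFF,  # 12-bit operation field 59
        "op12_b": word & 0xFFF,          # 12-bit operation field 60
    }


fields = decode(0x1A2B3C4D)
print({k: hex(v) for k, v in fields.items()})
```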
  • Publication number: 20090177862
    Abstract: An input device for executing an instruction code and method and interface for generating the instruction code are disclosed. The method for generating an instruction code which is executed by an input device includes the steps of: opening a specific purpose programming interface which is used for simulating to show a plurality of corresponding buttons according to the input device; selecting a corresponding button waiting for being defined and entering a programming window; selecting any instruction for the corresponding button waiting for being defined to form a combined operation instruction in proper sequence; compiling the combined operation instruction to form an instruction code which can be executed by the input device; and downloading the instruction code to the input device. Accordingly, the input device can show a continuous operation action when the corresponding button waiting for being defined is pressed.
    Type: Application
    Filed: January 7, 2008
    Publication date: July 9, 2009
    Inventor: Kuo-Shu Cheng
  • Publication number: 20090164753
    Abstract: An Operation, Compare, Branch (OCB) VLIW instruction word has a memory address, a respective operation code, a respective comparison and branch code; and at least two respective branch pointers. A plurality of OCB VLIW instructions are contained in memory. The branch pointers of a given instruction word connect to a memory address determined by a comparison analysis. The branch pointers form a linked list structure connecting the OCB instructions together, thus no program counter is required. The OCB instructions can be scrambled to realize a branch obfuscated program with built in software protections in that software protection mechanisms can be placed in the lengths and positions of the pointers. The processor architecture allows multiple branching without branch penalties. Other prior art obfuscation techniques may be applied to software programs for the OCB processor.
    Type: Application
    Filed: December 20, 2007
    Publication date: June 25, 2009
    Applicant: United States of America as Represented by the Secretary of the Army
    Inventor: Patrick W. Jungwirth
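
The linked-list control flow in the abstract of 20090164753 can be illustrated with a toy interpreter in which each instruction names its own successors, so nothing is ever incremented to find the next instruction. The `OCB` record, the `run` loop, and the use of `None` as a halt pointer are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OCB:
    operate: Callable[[dict], None]  # the operation part
    compare: Callable[[dict], bool]  # the comparison part
    if_true: Optional[int]           # branch pointer taken when compare holds
    if_false: Optional[int]          # branch pointer taken otherwise


def run(memory, start, state):
    """Follow branch pointers until one of them is None (assumed halt)."""
    addr = start
    while addr is not None:
        instr = memory[addr]
        instr.operate(state)
        addr = instr.if_true if instr.compare(state) else instr.if_false
    return state


# Instructions sit at arbitrary ("scrambled") addresses; only pointers link them.
memory = {
    7: OCB(lambda s: s.update(acc=s["acc"] + s["n"], n=s["n"] - 1),
           lambda s: s["n"] > 0, if_true=7, if_false=40),
    40: OCB(lambda s: s.update(done=True), lambda s: True, None, None),
}
print(run(memory, 7, {"acc": 0, "n": 4}))
```

Because the addresses 7 and 40 carry no meaning of their own, the instructions could be relocated freely as long as the pointers are updated, which is the property the abstract relies on for obfuscation.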
  • Patent number: 7533244
    Abstract: Network-on-Chip Dataflow Architecture is the new microprocessor architecture. It consists of many processing elements connecting together via two distinct networks namely instruction network and data network. Instructions are fetched into the processing elements through instruction network which uses packet switching scheme. Then the instructions will configure the processing elements and connections of the data network to create a dataflow graph. After that data are transferred and processed by the graph in a dataflow manner. Our architecture has one special characteristic in which instructions within loops are fetched only once but they are used many times.
    Type: Grant
    Filed: May 9, 2006
    Date of Patent: May 12, 2009
    Inventor: Le Nguyen Tran
  • Patent number: 7533243
    Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.
    Type: Grant
    Filed: May 22, 2002
    Date of Patent: May 12, 2009
    Assignee: Panasonic Corporation
    Inventors: Shuichi Takayama, Nobuo Higaki
  • Patent number: 7506137
    Abstract: Techniques for adding more complex instructions and their attendant multi-cycle execution units with a single instruction multiple data stream (SIMD) very long instruction word (VLIW) processing framework are described. In one aspect, an initiation mechanism also acts as a resynchronization mechanism to read the results of multi-cycle execution. This multi-purpose mechanism operates with a short instruction word (SIW) issue of the multi-cycle instruction, in a sequence processor (SP) alone, with a VLIW, and across all processing elements (PEs) individually or as an array of PEs. A number of advantageous floating point instructions are also described.
    Type: Grant
    Filed: July 16, 2007
    Date of Patent: March 17, 2009
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, David Strube, Edward A. Wolff, Edwin Franklin Barry, Grayson Morris, Carl Donald Busboom, Dale Edward Schneider
  • Patent number: 7496656
    Abstract: A method for processing an instruction word in a data processing system, the instruction word comprising a plurality of instruction bit positions, each bit position corresponding to an instruction action, and a status of the bit at an instruction bit position indicating whether the instruction action corresponding to that bit position should be performed; the method comprising: forming a plurality of single action words, each single action word corresponding to one of the instruction actions and having a bit set at the bit position corresponding to that instruction action and all its other bits un-set; forming a common action word having bits set at the bit positions corresponding to the instruction actions of any of the single action words and all its other bits un-set; comparing the instruction word and the common action word, and if the instruction word and the common action word have no bits set in common terminating processing of the instruction, and otherwise: repeating for successive single action wo
    Type: Grant
    Filed: September 19, 2001
    Date of Patent: February 24, 2009
    Assignee: STMicroelectronics Limited
    Inventor: Steven Haydock
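
The masking steps in the abstract of 7496656 can be sketched as follows. The abstract is cut off at the repeat step, so the per-action loop here is an assumption about how it continues; `process` and the `handlers` mapping (bit position to callable) are hypothetical.

```python
def process(instruction_word, handlers):
    """Run the handler for every set bit that has one; return the positions run."""
    # One single action word per action: only that action's bit is set.
    single = {pos: 1 << pos for pos in handlers}
    # The common action word has every handled action's bit set.
    common = 0
    for mask in single.values():
        common |= mask
    # Early exit: the instruction requests none of the handled actions.
    if (instruction_word & common) == 0:
        return []
    performed = []
    for pos in sorted(single):  # assumed continuation of the truncated abstract
        if instruction_word & single[pos]:
            handlers[pos]()
            performed.append(pos)
    return performed


log = []
handlers = {0: lambda: log.append("flush"), 3: lambda: log.append("sync")}
print(process(0b1000, handlers), log)  # [3] ['sync']
```

The single comparison against the common action word lets an instruction with no relevant bits be rejected without testing each action in turn.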
  • Publication number: 20090049276
    Abstract: Sourcing immediate values from a very long instruction word includes determining if a VLIW sub-instruction expansion condition exists. If the sub-instruction expansion condition exists, operation of a portion of a first arithmetic logic unit component is minimized. In addition, a part of a second arithmetic logic unit component is expanded by utilizing a block of a very long instruction word, which is normally utilized by the first arithmetic logic unit component, for the second arithmetic logic unit component if the sub-instruction expansion condition exists.
    Type: Application
    Filed: August 15, 2007
    Publication date: February 19, 2009
    Inventors: Tyson J. Bergland, Craig M. Okruhlica, Michael J.M. Toksvig, Justin M. Mahan, Edward A. Hutchins
  • Patent number: 7484075
    Abstract: Effective remote register file access time can be reduced in a clustered VLIW processor using partitioned register files and some additional hardware for pre-fetching remote registers. An instruction pre-fetcher and an instruction pre-decoder are used for pre-fetching and partially decoding instructions in order to pre-fetch the remote registers required for executing VLIWs at run-time, thus substantially reducing the number of inter-cluster copy instructions. The instructions (VLIWs) are scheduled taking into account the various hardware constraints such as limited inter-cluster communication bandwidth, inter-cluster communication delay, etc.
    Type: Grant
    Filed: December 16, 2002
    Date of Patent: January 27, 2009
    Assignee: International Business Machines Corporation
    Inventor: Krishnan K. Kailas
  • Patent number: 7475222
    Abstract: A processor comprises a memory, an instruction decoder coupled to the memory for decoding instructions retrieved therefrom, and a plurality of execution units for executing the decoded instructions. One or more of the instructions are in a compound instruction format in which a single instruction comprises multiple operation fields, with one or more of the operation fields each comprising at least an operation code field and a function field. The operation code field and the function field together specify a particular operation to be performed by one or more of the execution units.
    Type: Grant
    Filed: April 1, 2005
    Date of Patent: January 6, 2009
    Assignee: Sandbridge Technologies, Inc.
    Inventors: C. John Glossner, Erdem Hokenek, Mayan Moudgill, Michael J. Schulte
  • Patent number: 7472257
    Abstract: Processor (100) has a plurality of registers (120) for storing instructions for execution by the plurality of execution units (160). The plurality of registers (120) are coupled to the plurality of execution units (160) via distribution means (140). Distribution means (140) have a plurality of dispatch units (144) coupled to the plurality of execution units (160) and a reroutable network, e.g. a data communication bus (142), coupling the plurality of registers (120) to the plurality of dispatch units (144). The data communication bus (142) is controlled by control unit (148). Dispatch units (144) are arranged to detect dedicated instructions in the instruction flow, which signal the beginning of an inactive period of an execution unit (160a, 160b, 160c, 160d) in the plurality of execution units (160).
    Type: Grant
    Filed: November 20, 2002
    Date of Patent: December 30, 2008
    Assignee: NXP B.V.
    Inventor: Francesco Pessolano
  • Publication number: 20080270750
    Abstract: A processor having a zero-overhead operand copy capability. The processor includes multiple execution units to execute instructions in parallel and multiple register files each associated with one or more of the execution units. The processor further includes circuitry to select either an instruction execution result from a first one of the execution units or content of a register within a first one of the register files associated with the first one of the execution units to be stored within a register within a second one of the register files.
    Type: Application
    Filed: April 30, 2007
    Publication date: October 30, 2008
    Inventors: Brucek Khailany, Ujval J. Kapasi
  • Patent number: 7444276
    Abstract: A logic simulation processor stores in a shift register intermediate values generated during the logic simulation. The simulation processor includes multiple processor units and an interconnect system that communicatively couples the processor units to each other. Each of the processor units includes a processor element configurable to simulate at least a logic gate, and a shift register associated with the processor element. The shift register includes multiple entries to store the intermediate values, and is coupled to receive the output of the processor element. Each of the processor units further includes one or more multiplexers for selecting one of the entries of the shift register as outputs to be coupled to the interconnect system. Each of the processor units may further include a local memory for storing data from, and loading the data to, the simulation processor.
    Type: Grant
    Filed: September 28, 2005
    Date of Patent: October 28, 2008
    Assignee: Liga Systems, Inc.
    Inventors: William Watt, Henry T. Verheyen
  • Publication number: 20080256330
    Abstract: Compiling a source code program for a heterogeneous multi-core processor having a first instruction sequencer, having a first instruction set architecture, an accelerator to the first instruction sequencer, wherein the accelerator comprises a heterogeneous resource with respect to the first instruction sequencer having a second instruction set architecture, the source code program having specified therein a region of source code for the first instruction set architecture of the processor and a region of source code for the second instruction set architecture of the processor.
    Type: Application
    Filed: April 13, 2007
    Publication date: October 16, 2008
    Inventors: Perry Wang, Jamison Collins, Gautham Chinya, Hong Jiang, Hong Wang, Xinmin Tian, Guei-Yuan Lueh
  • Patent number: 7437534
    Abstract: A Very Long Instruction Word (VLIW) processor having a plurality of functional units includes a multi-ported register file that is divided into a plurality of separate register file segments, each of the register file segments being associated to one of the plurality of functional units. The register file segments are partitioned into local registers and global registers. The global registers are read and written by all functional units. The local registers are read and written only by a functional unit associated with a particular register file segment. The local registers and global registers are addressed using register addresses in an address space that is separately defined for a register file segment/functional unit pair. The global registers are addressed within a selected global register range using the same register addresses for the plurality of register file segment/functional unit pairs.
    Type: Grant
    Filed: September 19, 2006
    Date of Patent: October 14, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Marc Tremblay, William N. Joy
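The local/global register split in patent 7437534 above can be illustrated with a small model. This is only a sketch under stated assumptions: the unit count, the register count, the choice of addresses 0-7 as the global range, and the class name `SegmentedRegisterFile` are all hypothetical and do not come from the patent. Global registers are modeled as one shared list; how hardware keeps per-segment copies consistent is not modeled.

```python
NUM_UNITS = 4                 # assumed number of functional units
REGS_PER_UNIT = 32            # assumed size of each unit's register address space
GLOBAL_RANGE = range(0, 8)    # assumed: addresses 0-7 name global registers


class SegmentedRegisterFile:
    """Toy model: one local bank per functional unit plus shared globals."""

    def __init__(self):
        self.global_regs = [0] * len(GLOBAL_RANGE)
        self.local_regs = [[0] * REGS_PER_UNIT for _ in range(NUM_UNITS)]

    def _check(self, unit, addr):
        if not 0 <= unit < NUM_UNITS:
            raise ValueError(f"no such functional unit: {unit}")
        if not 0 <= addr < REGS_PER_UNIT:
            raise ValueError(f"register address out of range: {addr}")

    def write(self, unit, addr, value):
        self._check(unit, addr)
        if addr in GLOBAL_RANGE:
            # Same address reaches the same global register from every unit.
            self.global_regs[addr - GLOBAL_RANGE.start] = value
        else:
            # Outside the global range, the address is private to this unit.
            self.local_regs[unit][addr] = value

    def read(self, unit, addr):
        self._check(unit, addr)
        if addr in GLOBAL_RANGE:
            return self.global_regs[addr - GLOBAL_RANGE.start]
        return self.local_regs[unit][addr]
```

In this model a value written to address 3 by unit 0 is visible to unit 2 at address 3, while address 20 holds an independent value for each unit.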
  • Publication number: 20080235492
    Abstract: An apparatus and a method are provided for a parallel processing very long instruction word (VLIW) computer. The apparatus includes: an index code generation unit sequentially generating an index code, which is associated with the number of no operation (NOP) instruction words between effective instruction words, with respect to each of the instruction word groups to be executed in a VLIW computer; an instruction compression unit sequentially deleting the NOP instruction words which correspond to the index code with respect to each of the instruction word groups; and an instruction word conversion unit converting the effective instruction words to include the index code, the effective instruction words corresponding to the NOP instruction words.
    Type: Application
    Filed: August 14, 2007
    Publication date: September 25, 2008
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Chang-Woo BAEK, Hong-Seok KIM, Hee Seok KIM, Jeongwook KIM
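The NOP compression idea in publication 20080235492 above can be sketched in a few lines. The abstract does not say whether the index code counts the NOPs before or after an effective instruction word, so this sketch assumes it counts the NOPs immediately preceding one. The names `compress_group` and `expand_group`, the string encoding of instruction words, and the fixed group width are hypothetical.

```python
NOP = "nop"


def compress_group(group):
    """Drop NOPs; tag each effective word with the count of NOPs before it."""
    packed = []
    pending_nops = 0
    for word in group:
        if word == NOP:
            pending_nops += 1
        else:
            packed.append((pending_nops, word))
            pending_nops = 0
    return packed


def expand_group(packed, width):
    """Rebuild a full-width instruction word group from the packed form."""
    group = []
    for nops_before, word in packed:
        group.extend([NOP] * nops_before)
        group.append(word)
    # Trailing NOPs carry no index code, so pad up to the assumed group width.
    group.extend([NOP] * (width - len(group)))
    return group
```

For a six-slot group `["add", "nop", "nop", "mul", "nop", "ld"]`, the packed form holds three tagged words instead of six slots, and expanding it with `width=6` restores the original group.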
  • Patent number: 7424594
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Grant
    Filed: June 3, 2004
    Date of Patent: September 9, 2008
    Assignee: Altera Corporation
    Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
  • Publication number: 20080215851
    Abstract: A method is provided for the functional control of program and/or data flows in digital signal processors and processors, which have respective closed and separated modules for program and data flow control, working in parallel with computers. The method enables a power-efficient adaptation of the signal processing with the applied SIMD command-type in the individual paths and minimizes the occurrence of NOP commands with which the VLIW architecture of the processor must be supplied. The adaptation of the signal processing is achieved by individually controlling the parallel signal processing of the processor in the data paths (DP) which respectively belong to a first and second slice. For this purpose, a single slice halt output from an SSM register bank switches the register clockline according to state-dependent signal processing.
    Type: Application
    Filed: May 5, 2008
    Publication date: September 4, 2008
    Inventors: Uwe Porst, Wolfram Drescher
  • Patent number: 7418575
    Abstract: A system for adding reconfigurable computational instructions to a computer, the system comprising a processor operable to execute a set of instructions of a computer program comprising a set of computational instructions and long instruction word instructions with at least one of the long instruction word instructions comprising an instruction extension, an extension adapter coupled to the processor and operable to detect the execution of the instruction extension, and programmable logic coupled to the extension adapter and operable to receive configuration data for defining the instruction extension and execute the instruction extension.
    Type: Grant
    Filed: May 12, 2005
    Date of Patent: August 26, 2008
    Assignee: Stretch, Inc.
    Inventors: Ricardo E. Gonzalez, Scott Johnson, Derek Taylor
  • Publication number: 20080201554
    Abstract: A method, system and program product for executing a multi-function instruction in a computer system by specifying, via the multi-function instruction, either a capability query or execution of a selected function of one or more optional functions, wherein the selected function is an installed optional function, wherein the capability query determines which optional functions of the one or more optional functions are installed on the computer system.
    Type: Application
    Filed: March 28, 2007
    Publication date: August 21, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shawn D. Lundvall, Ronald M. Smith, Phil Chi-Chung Yeh
  • Patent number: 7412591
    Abstract: An apparatus for switchable conditional execution in a VLIW processor is provided, comprising one or more decoders, one or more ALUs with control units, and a register file. The decoders load instructions from a fetch unit, decode them, and send the decoded instructions to the ALUs with control units for execution. The register file stores the results and forwards them on result buses to the decoders. The execution of a VLIW instruction includes a fetch stage, a decode stage, plural execution stages and a write-back stage. The invention features approximate ASIC timing through conditional write-back with compiler support, condition resolution just before write-back, software-selectable conditional issue and conditional write-back modes, and no hardware interlock or dependence checking for the VLIW processor.
    Type: Grant
    Filed: June 18, 2005
    Date of Patent: August 12, 2008
    Assignee: Industrial Technology Research Institute
    Inventors: Yung-Cheng Ma, Tengh-Yih Wang, Hsien-Feng Kuo, Chi-Lung Wang
  • Patent number: 7409530
    Abstract: A VLIW instruction format is introduced having a set of control bits which identify subinstruction sharing conditions. At compilation the VLIW instruction is analyzed to identify subinstruction sharing opportunities. Such opportunities are encoded in the control bits of the instruction. Before the instruction is moved into the instruction cache, the instruction is compressed into the new format to delete select redundant occurrences of a subinstruction. Specifically, where a subinstruction is to be shared by corresponding functional processing units of respective clusters, the subinstruction need only appear in the instruction once. The redundant appearance is deleted. The control bits are decoded at instruction parsing time to route a shared subinstruction to the associated functional processing units.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: August 5, 2008
    Assignee: University of Washington
    Inventors: Donglok Kim, Stefan G. Berg, Weiyun Sun, Yongmin Kim
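The subinstruction sharing scheme in patent 7409530 above can be sketched as follows. This is a simplified illustration, not the patented encoding: it assumes exactly two clusters with the same number of functional units, one control bit per unit position, and a payload layout chosen for readability. The function names `compress` and `route` are hypothetical.

```python
def compress(clusters):
    """Set a control bit where both clusters run the same subinstruction,
    and keep only one copy of each shared subinstruction."""
    first, second = clusters
    share_bits = [a == b for a, b in zip(first, second)]
    payload = list(first)
    # Only the second cluster's non-shared subinstructions are stored.
    payload += [sub for sub, shared in zip(second, share_bits) if not shared]
    return share_bits, payload


def route(share_bits, payload):
    """Decode the control bits and hand each unit its subinstruction."""
    n = len(share_bits)
    first = payload[:n]
    rest = iter(payload[n:])
    second = [first[i] if shared else next(rest)
              for i, shared in enumerate(share_bits)]
    return [first, second]
```

With clusters `[["add", "mul", "ld"], ["add", "sub", "ld"]]`, two of the three unit positions are shared, so the payload shrinks from six subinstructions to four, and `route` reproduces the original per-cluster assignment.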
  • Patent number: 7401204
    Abstract: A parallel processor performs efficient parallel processing of one or more basic instructions contained in each of a plurality of instruction words delimited by instruction delimiting information. The processor includes: a plurality of instruction execution units performing processes in accordance with corresponding, supplied basic instructions in parallel; an instruction fetch unit fetching the instruction words one by one in accordance with the instruction delimiting information; and an instruction issue unit recognizing and, in accordance therewith, selecting each of the basic instructions contained in each of the instruction words fetched by the instruction fetch unit to a corresponding instruction execution unit to execute the basic instruction.
    Type: Grant
    Filed: September 1, 2000
    Date of Patent: July 15, 2008
    Assignee: Fujitsu Limited
    Inventors: Hideo Miyake, Atsuhiro Suga, Yasuki Nakamura, Yoshimasa Takebe
  • Patent number: 7398372
    Abstract: Fusing a load micro-operation (uop) together with an arithmetic uop. Intra-instruction fusing can increase cache memory storage efficiency and computer instruction processing bandwidth within a microprocessor without incurring significant computer system cost. Uops are fused, stored in a cache memory, un-fused, executed in parallel, and retired in order to optimize cost and performance.
    Type: Grant
    Filed: June 25, 2002
    Date of Patent: July 8, 2008
    Assignee: Intel Corporation
    Inventors: Nicholas G. Samra, Stephan J. Jourdan, David J. Sager, Glenn J. Hinton
  • Patent number: 7395408
    Abstract: The parallel execution processor 100 fetches a piece of instruction data. When the piece of instruction data includes only one instruction, the instruction decoding unit 120 assigns the one instruction to all the PEs. When the piece of instruction data includes two instructions, the instruction decoding unit 120 forms all the PEs into two groups, so as to assign one instruction to each group. By making it possible to execute, in parallel, not only one type of instruction but also instructions that are different from each other, it is possible to improve the utilization efficiency of the parallel execution processor 100.
    Type: Grant
    Filed: October 16, 2003
    Date of Patent: July 1, 2008
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Takeshi Tanaka, Satoshi Takashima, Hideshi Nishida, Kozo Kimura, Tokuzo Kiyohara
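The instruction assignment rule in patent 7395408 above (one instruction goes to all PEs, two instructions split the PEs into two groups) can be sketched as below. The abstract does not say how the groups are formed, so splitting the PEs into a lower and an upper half is an assumption, and the function name `assign_instructions` is hypothetical.

```python
def assign_instructions(instructions, num_pes):
    """Return a list with the instruction each PE executes this cycle."""
    if len(instructions) == 1:
        # A single instruction is broadcast to every PE.
        return [instructions[0]] * num_pes
    if len(instructions) == 2:
        # Assumed grouping: first half of the PEs, then the remainder.
        half = num_pes // 2
        return [instructions[0]] * half + [instructions[1]] * (num_pes - half)
    raise ValueError("a piece of instruction data holds one or two instructions")
```

Calling `assign_instructions(["add"], 4)` gives every PE the same instruction, while `assign_instructions(["add", "mul"], 4)` gives two PEs `add` and two PEs `mul`.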
  • Patent number: RE41012
    Abstract: A double indirect method of accessing a block of data in a register file is used to allow efficient implementations without the use of specialized vector processing hardware. In addition, the automatic modification of the register addressing is not tied to a single vector instruction nor to repeat or loop instructions. Rather, the technique, termed register file indexing (RFI) allows full programmer flexibility in control of the block data operational facility and provides the capability to mix non-RFI instructions with RFI instructions. The block-data operation facility is embedded in the iVLIW ManArray architecture allowing its generalized use across the instruction set architecture without specialized vector instructions or being limited in use only with repeat or loop instructions.
    Type: Grant
    Filed: June 3, 2004
    Date of Patent: November 24, 2009
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Gerald George Pechanek, Patrick R. Marchand
  • Patent number: RE41703
    Abstract: A SIMD machine employing a plurality of parallel processing elements (PEs) in which communications hazards are eliminated in an efficient manner. An indirect Very Long Instruction Word instruction memory (VIM) is employed along with execute and delimiter instructions. A masking mechanism may be employed to control which PEs have their VIMs loaded. Further, a receive model of operation is preferably employed. In one aspect, each PE operates to control a switch that selects from which PE it receives. The present invention addresses a better machine organization for execution of parallel algorithms that reduces hardware cost and complexity while maintaining the best characteristics of both SIMD and MIMD machines and minimizing communication latency. This invention brings a level of MIMD computational autonomy to SIMD indirect Very Long Instruction Word (iVLIW) processing elements while maintaining the single thread of control used in the SIMD machine organization.
    Type: Grant
    Filed: June 21, 2004
    Date of Patent: September 14, 2010
    Assignee: Altera Corp.
    Inventors: Gerald George Pechanek, Thomas L. Drabenstott, Juan Guillermo Revilla, David Strube, Grayson Morris