Long Instruction Word Patents (Class 712/24)
  • Patent number: 8850170
    Abstract: An apparatus and method for dynamically determining the execution mode of a reconfigurable array are provided. Performance information of a loop may be obtained before and/or during the execution of the loop. The performance information may be used to determine whether to operate the apparatus in a very long instruction word (VLIW) mode or in a coarse grained array (CGA) mode.
    Type: Grant
    Filed: August 25, 2011
    Date of Patent: September 30, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Bernhard Egger, Dong-Hoon Yoo, Tai-Song Jin, Won-Sub Kim, Min-Wook Ahn, Jin-Seok Lee, Hee-Jin Ahn
  • Publication number: 20140215183
    Abstract: A system and method for efficiently processing instructions in hardware parallel execution lanes within a processor. In response to a given divergent point within an identified loop, a compiler arranges instructions within the identified loop into very large instruction words (VLIW's). At least one VLIW includes instructions intermingled from different basic blocks between the given divergence point and a corresponding convergence point. The compiler generates code wherein when executed assigns at runtime instructions within a given VLIW to multiple parallel execution lanes within a target processor. The target processor includes a single instruction multiple data (SIMD) micro-architecture. The assignment for a given lane is based on branch direction found at runtime for the given lane at the given divergent point. The target processor includes a vector register for storing indications indicating which given instruction within a fetched VLIW for an associated lane to execute.
    Type: Application
    Filed: January 29, 2013
    Publication date: July 31, 2014
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventor: Reza Yazdani
  • Patent number: 8775147
    Abstract: An algorithm and architecture are disclosed for performing multi-argument associative operations. The algorithm and architecture can be used to schedule operations on multiple facilities for computations or can be used in the development of a model in a modeling environment. The algorithm and architecture resulting from the algorithm use the latency of the components that are used to process the associative operations. The algorithm minimizes the number of components necessary to produce an output of multi-argument associative operations and also can minimize the number of inputs each component receives.
    Type: Grant
    Filed: May 31, 2006
    Date of Patent: July 8, 2014
    Assignee: The MathWorks, Inc.
    Inventors: Alireza Pakyari, Brian K. Ogilvie
  • Patent number: 8775777
    Abstract: Sourcing immediate values from a very long instruction word includes determining if a VLIW sub-instruction expansion condition exists. If the sub-instruction expansion condition exists, operation of a portion of a first arithmetic logic unit component is minimized. In addition, a part of a second arithmetic logic unit component is expanded by utilizing a block of a very long instruction word, which is normally utilized by the first arithmetic logic unit component, for the second arithmetic logic unit component if the sub-instruction expansion condition exists.
    Type: Grant
    Filed: August 15, 2007
    Date of Patent: July 8, 2014
    Assignee: NVIDIA Corporation
    Inventors: Tyson J. Bergland, Craig M. Okruhlica, Michael J. M. Toksvig, Justin M. Mahan, Edward A. Hutchins
  • Patent number: 8769245
    Abstract: A very long instruction word (VLIW) processor and an apparatus with power management and a method of power management therefor are provided in consistent with the exemplary embodiments of the disclosure. The power management method includes the following steps. Valid instruction(s) and no operation (NOP) instruction(s) of an input instruction package are rearranged to output a transcoded instruction package, wherein the transcoded instruction package by the rearrangement has its NOP instruction(s) corresponding to at least one execution unit, which is to be placed in power reduction state, of a VLIW processor. Power reduction control is selectively performed on at least one execution unit corresponding to at least one NOP instruction of the transcoded instruction package according to the transcoded instruction package.
    Type: Grant
    Filed: May 20, 2011
    Date of Patent: July 1, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Hsien-Ching Hsieh, Po-Han Huang, Shing-Wu Tung
  • Publication number: 20140181468
    Abstract: A processor device is disclosed and includes a memory and a sequencer that is responsive to the memory. The sequencer supports very long instruction word (VLIW) type instructions and at least one VLIW instruction packet uses a number of operands during execution. The processor device further includes a plurality of instruction execution units responsive to the sequencer and a plurality of register files. Each of the plurality of register files includes a plurality of registers and the plurality of register files are coupled to the plurality of instruction execution units. Further, each of the plurality of register files includes a number of data read ports and the number of data read ports of each of the plurality of register files is less than the number of operands used by the at least one VLIW instruction packet.
    Type: Application
    Filed: February 25, 2014
    Publication date: June 26, 2014
    Applicant: QUALCOMM Incorporated
    Inventors: Muhammad Ahmed, Erich James Plondke, Lucian Codrescu, William C. Anderson
  • Patent number: 8745359
    Abstract: A VLIW processor executes a very long instruction word containing a plurality of instructions, and executes a plurality of instruction streams at low cost. A processor executing a very long instruction word containing a plurality of instructions fetches concurrently the very long instruction words of up to M instruction streams, from N instruction caches including a plurality of memory banks to store the very long instruction words of the M instruction streams.
    Type: Grant
    Filed: February 3, 2009
    Date of Patent: June 3, 2014
    Assignee: NEC Corporation
    Inventor: Shohei Nomoto
  • Publication number: 20140149714
    Abstract: A reconfigurable processor and an operation method of the reconfigurable processor may include: a status register configured to store a status value used to determine at least one execution mode in a processor; a parallel processing scheduler configured to schedule at least one of a very long instruction word (VLIW) logic and a coarse grained architecture (CGA) logic to be used based on the stored status value; a VLIW register configured to store processed data according to the VLIW logic; and a CGA register configured to store processed data according to the CGA logic.
    Type: Application
    Filed: November 27, 2013
    Publication date: May 29, 2014
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Doo Hyun Kim, Joon Ho Song, Do Hyung Kim, Shi Hwa Lee
  • Patent number: 8738892
    Abstract: A Very Long Instruction Word (VLIW) processor having an instruction set with a reduced size resulting in a small number of bits being necessary to specify registers. The VLIW processor includes a register file, and first through third operation units, and executes a very long instruction word. Further, the very long instruction word includes a register specifying field which specifies a least one of the registers in the register file and a plurality of instructions. The operand of each instruction includes bits src1, src2, and dst, which indicate whether or not the registers specified by the register specifying field are to be used as the source register and the destination register.
    Type: Grant
    Filed: April 15, 2008
    Date of Patent: May 27, 2014
    Assignee: Panasonic Corporation
    Inventors: Takahiro Kageyama, Hideshi Nishida, Takeshi Tanaka, Kouji Nakajima
  • Patent number: 8713286
    Abstract: A processor device is disclosed and includes a memory and a sequencer that is responsive to the memory. The sequencer supports very long instruction word (VLIW) type instructions and at least one VLIW instruction packet uses a number of operands during execution. The processor device further includes a plurality of instruction execution units responsive to the sequencer and a plurality of register files. Each of the plurality of register files includes a plurality of registers and the plurality of register files are coupled to the plurality of instruction execution units. Further, each of the plurality of register files includes a number of data read ports and the number of data read ports of each of the plurality of register files is less than the number of operands used by the at least one VLIW instruction packet.
    Type: Grant
    Filed: April 26, 2005
    Date of Patent: April 29, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Muhammad Ahmed, Erich Plondke, Lucian Codrescu, William C. Anderson
  • Patent number: 8707012
    Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 12, 2012
    Date of Patent: April 22, 2014
    Assignee: Intel Corporation
    Inventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
  • Patent number: 8707013
    Abstract: In accordance with at least some embodiments, a digital signal processor (DSP) includes an instruction fetch unit and an instruction decode unit in communication with the instruction fetch unit. The DSP also includes a register set and a plurality of work units in communication with the instruction decode unit. The register set includes a plurality of legacy predicate registers. Separate from the legacy predicate registers, a plurality of on-demand predicate registers are selectively signaled without changing the opcode space for the DSP.
    Type: Grant
    Filed: July 13, 2010
    Date of Patent: April 22, 2014
    Assignee: Texas Instruments Incorporated
    Inventors: Jagadeesh Sankaran, Joseph R. Zbiciak, Steven D. Krueger
  • Patent number: 8671266
    Abstract: Techniques are described for decoupling fetching of an instruction stored in a main program memory from earliest execution of the instruction. An indirect execution method and program instructions to support such execution are addressed. In addition, an improved indirect deferred execution processor (DXP) VLIW architecture is described which supports a scalable array of memory centric processor elements that do not require local load and store units.
    Type: Grant
    Filed: May 18, 2011
    Date of Patent: March 11, 2014
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Stamatis Vassiliadis
  • Publication number: 20140059324
    Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.
    Type: Application
    Filed: February 20, 2013
    Publication date: February 27, 2014
    Inventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, JR., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo Rodriguez, Marco Jacobs
  • Publication number: 20140052960
    Abstract: An apparatus and method for generating a very long instruction word (VLIW) command that supports predicated execution, and a VLIW processor and method for processing a VLIW are provided herein. The VLIW command includes an instruction bundle formed of a plurality of instructions to be executed in parallel and a single value indicating predicated execution, and is generated using the apparatus and method for generating a VLIW command. The VLIW processor decodes the instruction bundle and executes the instructions, which are included in the decoded instruction bundle, in parallel, according to the value indicating predicated execution.
    Type: Application
    Filed: October 28, 2013
    Publication date: February 20, 2014
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Bernhard Egger, Soo-jung Ryu, Dong-hoon Yoo, Il-hyun Park
  • Patent number: 8656141
    Abstract: An integrated circuit includes a plurality of tiles. Each tile includes a pipelined processor configured to process multiple streams of instructions for the processor; and a switch including switching circuitry to forward data over data paths from other tiles to one or more pipeline stages of the processor and to switches of other tiles. At least some of the data is forwarded based on one or more streams of instructions for the switch.
    Type: Grant
    Filed: December 13, 2005
    Date of Patent: February 18, 2014
    Assignee: Massachusetts Institute of Technology
    Inventor: Anant Agarwal
  • Patent number: 8601244
    Abstract: An apparatus and method for generating a very long instruction word (VLIW) command that supports predicated execution, and a VLIW processor and method for processing a VLIW are provided herein. The VLIW command includes an instruction bundle formed of a plurality of instructions to be executed in parallel and a single value indicating predicated execution, and is generated using the apparatus and method for generating a VLIW command. The VLIW processor decodes the instruction bundle and executes the instructions, which are included in the decoded instruction bundle, in parallel, according to the value indicating predicated execution.
    Type: Grant
    Filed: February 16, 2010
    Date of Patent: December 3, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Bernhard Egger, Soo-jung Ryu, Dong-hoon Yoo, Il-hyun Park
  • Patent number: 8583895
    Abstract: A compressed instruction format for a VLIW processor allows greater efficiency in use of cache and memory. Instructions are byte aligned and variable length. Branch targets are uncompressed. Format bits specify how many issue slots are used in a following instruction. NOPS are not stored in memory. Individual operations are compressed according to features such as whether they are resultless, guarded, short, zeroary, unary, or binary. Instructions are stored in compressed form in memory and in cache. Instructions are decompressed on the fly after being read out from cache.
    Type: Grant
    Filed: January 22, 2004
    Date of Patent: November 12, 2013
    Assignee: Nytell Software LLC
    Inventors: Eino Jacobs, Michael Ang
  • Patent number: 8516224
    Abstract: Instructions asserted in the instruction pipeline of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units, to interpret that control information and take appropriate action.
    Type: Grant
    Filed: January 31, 2012
    Date of Patent: August 20, 2013
    Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
  • Patent number: 8510539
    Abstract: A spilling method in register files for a processor is proposed. The processor with Parallel Architecture Core structure includes multiple clusters and a memory. Each cluster includes multiple function units (M-Unit and I-Unit), multiple local register files and a global register file. The local register files are used by the multiple function units, respectively. For a specified live range, the method includes calculating communication costs of the local register files and the global register file in each cluster, and communication cost of the memory for spilling the live range when spilling occurs; calculating use ratios of the local register files and the global register file in each cluster, and use ratio of the memory for the live range; and selecting one of the local register files and the global register file in each cluster and the memory for spilling the live range based on the communication costs and use ratios.
    Type: Grant
    Filed: July 2, 2010
    Date of Patent: August 13, 2013
    Assignee: National Tsing Hua University
    Inventors: Chia Han Lu, Chung Ju Wu, Jenq Kuen Lee
  • Publication number: 20130138918
    Abstract: A circuit arrangement, method, and program product for compressing and decompressing data in a node of a system including a plurality of nodes interconnected via an on-chip network. Compressed data may be received and stored at an input buffer of a node, and in parallel with moving the compressed data to an execution register of the node, decompression logic of the node may decompress the data to generate uncompressed data, such that uncompressed data is stored in the execution register for utilization by an execution unit of the node. Uncompressed data may be output by the execution unit into the execution register, and in parallel with moving the uncompressed data to an output buffer of the node connected to the on-chip network, compression logic may compress the uncompressed data to generate compressed data, such that compressed data is stored at the output buffer.
    Type: Application
    Filed: November 30, 2011
    Publication date: May 30, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8447961
    Abstract: A system to implement a zero overhead software pipelined (SFP) loop includes a Very Long Instruction Word (VLIW) processor having an N number of execution slots. The VLIW processor executes a plurality of instructions in parallel without any limitation of an instruction buffer size. A program memory receives a Program Memory address to fetch an instruction packet. The program memory is closely coupled with the instruction buffer size to implement the zero overhead software pipelined (SFP) loop. The size of the zero overhead software pipelined (SFP) loop can exceed the instruction buffer size. A CPU control register includes a block count and an iteration count. The block count is loaded into a block counter and counts the plurality of instructions executed in the SFP loop, and the iteration count is loaded into an iteration counter and counts a number of iterations of the SFP loop based on the block count.
    Type: Grant
    Filed: February 18, 2010
    Date of Patent: May 21, 2013
    Assignee: Saankhya Labs Pvt Ltd
    Inventors: Anindya Saha, Manish Kumar, Hemant Mallapur, Santhosh Billava, Viji Rajangam
  • Patent number: 8433553
    Abstract: A programmed computer and method are described for generating a processor design. The method carried out by the programmed computer comprises providing an initial model for the processor, specifying a plurality of resources in terms of resource parameters and their mutual relations. Furthermore, statistics are provided indicative of the required use of the resources by a selected application. Thereafter, a reduced resource design is generated by the programmed computer by relaxing at least one resource parameter and/or limiting an amount of resources specified in the initial specification on the basis of the statistics.
    Type: Grant
    Filed: November 3, 2008
    Date of Patent: April 30, 2013
    Assignee: Intel Benelux B.V.
    Inventors: Alexander Augusteijn, Jeroen Anton Johan Leijten
  • Publication number: 20130080738
    Abstract: In a particular embodiment, a very long instruction word (VLIW) processor is operable to execute VLIW instructions. At least one of the VLIW instructions includes a first load or store instruction and a second load or store instruction. The first instruction and the second instruction are executed as a single atomic unit. At least one of the first and second instructions is a store-conditional instruction.
    Type: Application
    Filed: September 23, 2011
    Publication date: March 28, 2013
    Applicant: QUALCOMM INCORPORATED
    Inventors: Erich J. Plondke, Ajay A. Ingle, Lucian Codrescu
  • Patent number: 8397000
    Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: March 12, 2013
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
  • Publication number: 20130061022
    Abstract: A method for providing intrinsic supports for a VLIW DSP processor with distributed register files comprises the steps of: generating a program representation with cluster information on instructions of the DSP processor, wherein the cluster information is provided by a program with cluster intrinsic coding; identifying data stream operations indicating parallel instruction sequences applied on different data sets in the program representation; identifying data sharing relations indicating data shared by the data stream operations in the program representation; identifying data aggregation relations indicating results aggregated from the data stream operations in the program representation; and performing register allocation for the DSP processor according to the identified data stream operations, the data sharing relations and the data aggregation relations.
    Type: Application
    Filed: September 1, 2011
    Publication date: March 7, 2013
    Applicant: NATIONAL TSING HUA UNIVERSITY
    Inventors: JENQ KUEN LEE, CHI BANG KUAN
  • Patent number: 8364935
    Abstract: A data processing apparatus has an instruction memory system arranged to output an instruction word addressed by an instruction address. An instruction execution unit, processes a plurality of instructions from the instruction word in parallel. A detection unit, detects in which of a plurality of ranges the instruction address lies. The detection unit is coupled to the instruction execution unit and/or the instruction memory system, to control a way in which the instruction execution unit parallelizes processing of the instructions from the instruction word, dependent on a detected range. In an embodiment the instruction execution unit and/or the instruction memory system adjusts a width of the instruction word that determines a number of instructions from the instruction word that is processed in parallel, dependent on the detected range.
    Type: Grant
    Filed: October 1, 2003
    Date of Patent: January 29, 2013
    Assignee: Nytell Software LLC
    Inventors: Ramanathan Sethuraman, Balakrishnan Srinivasan, Carlos Antonio Alba Pinto, Harm Johannes Antonius Maria Peters, Rafael Peset Llopis
  • Patent number: 8316216
    Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: November 20, 2012
    Assignee: Intel Corporation
    Inventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
  • Patent number: 8296479
    Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.
    Type: Grant
    Filed: January 5, 2012
    Date of Patent: October 23, 2012
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
  • Patent number: 8250340
    Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.
    Type: Grant
    Filed: February 12, 2010
    Date of Patent: August 21, 2012
    Assignee: Panasonic Corporation
    Inventors: Shuichi Takayama, Nobuo Higaki
  • Patent number: 8191085
    Abstract: A method for operating a data processing system includes providing an application binary interface (ABI) which determines a set of non-contiguous volatile registers and a set of non-volatile registers. The set of non-contiguous volatile registers includes a plurality of general purpose registers (GPRs) and a plurality of special purpose registers (SPRs). The method includes providing less than three instructions which collectively load or store all of the set of non-contiguous volatile registers determined by the ABI. A system includes a set of volatile registers including a plurality of volatile GPRs, a plurality of volatile supervisor SPRs, and a plurality of volatile user SPRs, and execution circuitry for executing a first instruction that loads or stores the plurality of volatile supervisor SPRs, for executing a second instruction that loads or stores the plurality of volatile GPRs, and for executing a third instruction that loads or stores the plurality of volatile user SPRs.
    Type: Grant
    Filed: August 29, 2006
    Date of Patent: May 29, 2012
    Assignee: Freescale Semiconductor, Inc.
    Inventor: William C. Moyer
  • Patent number: 8145888
    Abstract: A data processing circuit has an execution circuit (18) with a plurality of functional units (20). An instruction decoder (17) is operable in a first and a second instruction mode. In the first instruction mode instructions have respective fields for controlling each of the functional units (20), and in the second instruction mode instructions control one functional unit. A mode control circuit (12) controls the selection of the instruction modes. In an embodiment, the instruction decoder uses time-stationary decoding of the selection of operations to be executed by the execution circuit (18) and the selection of destination registers from the set of registers (19). Mode switching is a more efficient way of reducing instruction time for time stationary processors than indicating functional units for which the instruction contains commands.
    Type: Grant
    Filed: September 6, 2007
    Date of Patent: March 27, 2012
    Assignee: Silicon Hive B.V.
    Inventors: Jeroen Anton Johan Leijten, Hendrik Tjeerd Joannes Zwartenkot
  • Patent number: 8127115
    Abstract: Disclosed are a method and a system for grouping processor instructions for execution by a processor, where the group of processor instructions includes at least two branch processor instructions. In one or more embodiments, an instruction buffer can decouple an instruction fetch operation from an instruction decode operation by storing fetched processor instructions in the instruction buffer until the fetched processor instructions are ready to be decoded. Group formation can involve removing processor instructions from the instruction buffer and routing the processor instruction to latches that convey the processor instructions to decoders. Processor instructions that are removed from instruction buffer in a single clock cycle can be called a group of processor instructions. In one or more embodiments, the first instruction in the group must be the oldest instruction in the instruction buffer and instructions must be removed from the instruction buffer ordered from oldest to youngest.
    Type: Grant
    Filed: April 3, 2009
    Date of Patent: February 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Richard William Doing, Kevin Neal Magil, Balaram Sinharoy, Jeffrey R. Summers, James Albert Van Norstrand, Jr.
  • Patent number: 8117357
    Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.
    Type: Grant
    Filed: May 12, 2011
    Date of Patent: February 14, 2012
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, David Carl Strube, Edwin Frank Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
  • Patent number: 8117423
    Abstract: Instructions asserted in the instruction pipeline (3) of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (15) of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units (7), to interpret that control information and take appropriate action.
    Type: Grant
    Filed: March 4, 2008
    Date of Patent: February 14, 2012
    Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
  • Patent number: 8108658
    Abstract: A data processing circuit comprises a register file (14) having read ports and write ports. A plurality of functional units (21a-c), is coupled to receive operand data from a same combination of read ports. Each functional unit is coupled to a respective one of the write ports for writing a respective result. An instruction issue slot has outputs (11) for supplying register selection information to said combination read ports and to the respective ones of the write ports. The output of the issue slot also supplies an operation code. The functional units (21a-c) in the plurality are arranged to respond to at least to one value of the operation code by each executing a respective operation using the same operands from said same combination and each functional unit producing a respective result at a respective ones of the write ports.
    Type: Grant
    Filed: September 21, 2005
    Date of Patent: January 31, 2012
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Antonius Adrianus Maria Van Wel
  • Publication number: 20120017067
    Abstract: In accordance with at least some embodiments, a digital signal processor (DSP) includes an instruction fetch unit and an instruction decode unit in communication with the instruction fetch unit. The DSP also includes a register set and a plurality of work units in communication with the instruction decode unit. The register set includes a plurality of legacy predicate registers. Separate from the legacy predicate registers, a plurality of on-demand predicate registers are selectively signaled without changing the opcode space for the DSP.
    Type: Application
    Filed: July 13, 2010
    Publication date: January 19, 2012
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jagadeesh SANKARAN, Joseph R. ZBICIAK, Steven D. KRUEGER
  • Patent number: 8095775
    Abstract: During operation of a VLIW processor, a very long instruction word is fetched. A portion of the very long instruction word that includes a pointer to an instruction is identified, and the instruction pointed to by the pointer is retrieved from a location of an instruction window. The retrieved instruction is input into a functional unit for execution.
    Type: Grant
    Filed: November 19, 2008
    Date of Patent: January 10, 2012
    Assignee: Marvell International Ltd.
    Inventors: Moinul H. Khan, Anitha Kona, Mark N. Fullerton
  • Publication number: 20110307684
    Abstract: An image processing system including a vector processor and a memory adapted for attaching to the vector processor. The memory is adapted to store multiple image frames. The vector processor includes an address generator operatively attached to the memory to access the memory. The address generator is adapted for calculating addresses of the memory over the multiple image frames. The addresses may be calculated over the image frames based upon an image parameter. The image parameter may specify which of the image frames are processed simultaneously. A scalar processor may be attached to the vector processor. The scalar processor provides the image parameter(s) to the address generator for address calculation over the multiple image frames. An input register may be attached to the vector processor. The input register may be adapted to receive a very long instruction word (VLIW) instruction.
    Type: Application
    Filed: June 10, 2010
    Publication date: December 15, 2011
    Inventors: Yosef Kreinin, Gil Dogon, Emmanuel Sixsou, Yosi Arbeli, Mois Navon, Roman Sajman
  • Patent number: 8069335
    Abstract: A processing system for executing instructions comprises a first part (11) having address information and a plurality of data bits, E0 to EN. According to one embodiment, each data bit E0 to EN directly selects a corresponding element 130 to 13N forming a second part of the instruction set (for example a VLIW). In this manner, the first part (11) is used to only select elements that do not comprise NOP instructions, thereby avoiding power being consumed unnecessarily. According to an alternative embodiment, different groups of elements in the second part (13) may be selected by a number encoded in the first part (11), using data bits Eo to EN. Preferably, these different groups reflect the most likely used combinations in a program.
    Type: Grant
    Filed: November 13, 2006
    Date of Patent: November 29, 2011
    Assignee: NXP B.V.
    Inventors: Peter Kievits, Jean-Paul C. F. H. Smeets
  • Patent number: 8046568
    Abstract: The present invention relates to the field of (micro)computer design and architecture, and in particular to microarchitecture associated with moving data values between a (micro)processor and memory components. Particularly, the present invention relates to a computer system with an processor architecture in which register addresses are generated with more than one execution channel controlled by one central processing unit with at least one load/store unit for loading and storing data objects, and at least one cache memory associated to the processor holding data objects accessed by the processor, wherein said processor's load/store unit contains a high speed memory directly interfacing said load/store unit to the cache. The present invention improves the of architectures with dual ported microprocessor implementations comprising two execution pipelines capable of two load/store data transactions per cycle.
    Type: Grant
    Filed: June 28, 2010
    Date of Patent: October 25, 2011
    Assignee: Broadcom Corporation
    Inventors: Sophie Wilson, John E. Redford
  • Patent number: 8019971
    Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.
    Type: Grant
    Filed: April 6, 2009
    Date of Patent: September 13, 2011
    Assignee: Panasonic Corporation
    Inventors: Shuichi Takayama, Nobuo Higaki
  • Publication number: 20110219210
    Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.
    Type: Application
    Filed: May 12, 2011
    Publication date: September 8, 2011
    Applicant: ALTERA CORPORATION
    Inventors: Gerald G. Pechanek, David Carl Strube, Edwin Frank Barry, Charles W. Kurak, JR., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
  • Patent number: 8015391
    Abstract: A processor simultaneously issues instructions to multiple threads in a same instruction execution cycle. An instruction issuer controls issuance of an instruction for each of the multiple threads. A detector detects, for each of the multiple threads, whether a loop processing is currently being executed. A unit causes the instruction issuer to increase a number of instructions to be issued when the detector detects that the loop processing is currently being executed.
    Type: Grant
    Filed: October 8, 2010
    Date of Patent: September 6, 2011
    Assignee: Panasonic Corporation
    Inventor: Takenobu Tani
  • Publication number: 20110213948
    Abstract: An apparatus includes a processor. The processor includes two memories. The first memory stores one set of instructions. The second memory stores another set of instructions that are longer than the set of instructions in the first memory. An instruction in the set of instructions in the first memory is used as a pointer to a corresponding instruction in the set of instructions in the second memory.
    Type: Application
    Filed: February 1, 2010
    Publication date: September 1, 2011
    Inventor: Steven Perry
  • Patent number: 7971197
    Abstract: A digital computer system automatically creates an Instruction Set Architecture (ISA) that potentially exploits VLIW instructions, vector operations, fused operations, and specialized operations with the goal of increasing the performance of a set of applications while keeping hardware cost below a designer specified limit, or with the goal of minimizing hardware cost given a required level of performance.
    Type: Grant
    Filed: August 18, 2005
    Date of Patent: June 28, 2011
    Assignee: Tensilica, Inc.
    Inventors: David William Goodwin, Dror Maydan, Ding-Kai Chen, Darin Stamenov Petkov, Steven Weng-Kiang Tjiang, Peng Tu, Christopher Rowen
  • Patent number: 7962719
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Grant
    Filed: August 7, 2008
    Date of Patent: June 14, 2011
    Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
  • Publication number: 20110072151
    Abstract: A reconfigurable, protocol indifferent bit stream-processing engine, and related systems and data communication methodologies, are adapted to achieve the goal of providing inter-fabric interoperability among high-speed networks operating a speeds of at least 10 gigabits per second. The bit-stream processing engine operates as an omni-protocol, multi-stage processor that can be configured with appropriate switches and related network elements to create a seamless network fabric that permits interoperability not only among existing communication protocols, but also with the ability to accommodate future communication protocols. The method and systems of the present invention are applicable to networks that include storage networks, communication networks and processor networks.
    Type: Application
    Filed: August 24, 2010
    Publication date: March 24, 2011
    Inventors: Viswa Sharma, Roger Holschbach, Bart Stuck, William Chu
  • Patent number: 7895413
    Abstract: A microprocessor for processing instructions comprises multiple clusters for receiving the instructions, each of the clusters having a plurality of functional units for executing the instructions, multiple register sub-files each having multiple registers for storing data for executing the instructions, wherein each of the clusters is associated with corresponding one of the register sub-files so that an instruction dispatched to a cluster is executed by accessing registers in a register sub-file associated with the cluster to which the instruction is dispatched, a register-renaming unit for renaming target registers in an instruction with registers in a register sub-file associated with a cluster to which the instruction is dispatched, and issue-queue units each of which is associated with a corresponding one of the clusters, wherein an issue-queue unit holds instruction renamed by the register-renaming unit until the renamed instruction is issued to be executed in a cluster associated with the issue-queue u
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: February 22, 2011
    Assignee: International Business Machines Corporation
    Inventor: Mayan Moudgill
  • Patent number: 7886135
    Abstract: Instructions asserted in a microprocessors instruction pipeline (3) are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (5) that is synchronized to the instruction pipeline. At the execution stage, the control information is interpreted and appropriate action taken. The control information may indicate that the instruction has been reasserted (asserted again following an initial assertion) and may also indicate the number of times that the instruction has been consecutively asserted in the instruction pipeline. Applied to unaligned memory operations, in which a memory atom is asserted twice, the control information indicates which part of the unaligned data is to be fetched each time the atom is executed.
    Type: Grant
    Filed: November 7, 2006
    Date of Patent: February 8, 2011
    Inventors: Brett Coon, Godfrey D'Souza, Paul Serris