Long Instruction Word Patents (Class 712/24)
-
Patent number: 8850170Abstract: An apparatus and method for dynamically determining the execution mode of a reconfigurable array are provided. Performance information of a loop may be obtained before and/or during the execution of the loop. The performance information may be used to determine whether to operate the apparatus in a very long instruction word (VLIW) mode or in a coarse grained array (CGA) mode.Type: GrantFiled: August 25, 2011Date of Patent: September 30, 2014Assignee: Samsung Electronics Co., Ltd.Inventors: Bernhard Egger, Dong-Hoon Yoo, Tai-Song Jin, Won-Sub Kim, Min-Wook Ahn, Jin-Seok Lee, Hee-Jin Ahn
-
Publication number: 20140215183Abstract: A system and method for efficiently processing instructions in hardware parallel execution lanes within a processor. In response to a given divergent point within an identified loop, a compiler arranges instructions within the identified loop into very large instruction words (VLIW's). At least one VLIW includes instructions intermingled from different basic blocks between the given divergence point and a corresponding convergence point. The compiler generates code wherein when executed assigns at runtime instructions within a given VLIW to multiple parallel execution lanes within a target processor. The target processor includes a single instruction multiple data (SIMD) micro-architecture. The assignment for a given lane is based on branch direction found at runtime for the given lane at the given divergent point. The target processor includes a vector register for storing indications indicating which given instruction within a fetched VLIW for an associated lane to execute.Type: ApplicationFiled: January 29, 2013Publication date: July 31, 2014Applicant: ADVANCED MICRO DEVICES, INC.Inventor: Reza Yazdani
-
Patent number: 8775147Abstract: An algorithm and architecture are disclosed for performing multi-argument associative operations. The algorithm and architecture can be used to schedule operations on multiple facilities for computations or can be used in the development of a model in a modeling environment. The algorithm and architecture resulting from the algorithm use the latency of the components that are used to process the associative operations. The algorithm minimizes the number of components necessary to produce an output of multi-argument associative operations and also can minimize the number of inputs each component receives.Type: GrantFiled: May 31, 2006Date of Patent: July 8, 2014Assignee: The MathWorks, Inc.Inventors: Alireza Pakyari, Brian K. Ogilvie
-
Patent number: 8775777Abstract: Sourcing immediate values from a very long instruction word includes determining if a VLIW sub-instruction expansion condition exists. If the sub-instruction expansion condition exists, operation of a portion of a first arithmetic logic unit component is minimized. In addition, a part of a second arithmetic logic unit component is expanded by utilizing a block of a very long instruction word, which is normally utilized by the first arithmetic logic unit component, for the second arithmetic logic unit component if the sub-instruction expansion condition exists.Type: GrantFiled: August 15, 2007Date of Patent: July 8, 2014Assignee: NVIDIA CorporationInventors: Tyson J. Bergland, Craig M. Okruhlica, Michael J. M. Toksvig, Justin M. Mahan, Edward A. Hutchins
-
Patent number: 8769245Abstract: A very long instruction word (VLIW) processor and an apparatus with power management and a method of power management therefor are provided in consistent with the exemplary embodiments of the disclosure. The power management method includes the following steps. Valid instruction(s) and no operation (NOP) instruction(s) of an input instruction package are rearranged to output a transcoded instruction package, wherein the transcoded instruction package by the rearrangement has its NOP instruction(s) corresponding to at least one execution unit, which is to be placed in power reduction state, of a VLIW processor. Power reduction control is selectively performed on at least one execution unit corresponding to at least one NOP instruction of the transcoded instruction package according to the transcoded instruction package.Type: GrantFiled: May 20, 2011Date of Patent: July 1, 2014Assignee: Industrial Technology Research InstituteInventors: Hsien-Ching Hsieh, Po-Han Huang, Shing-Wu Tung
-
REGISTER FILES FOR A DIGITAL SIGNAL PROCESSOR OPERATING IN AN INTERLEAVED MULTI-THREADED ENVIRONMENT
Publication number: 20140181468Abstract: A processor device is disclosed and includes a memory and a sequencer that is responsive to the memory. The sequencer supports very long instruction word (VLIW) type instructions and at least one VLIW instruction packet uses a number of operands during execution. The processor device further includes a plurality of instruction execution units responsive to the sequencer and a plurality of register files. Each of the plurality of register files includes a plurality of registers and the plurality of register files are coupled to the plurality of instruction execution units. Further, each of the plurality of register files includes a number of data read ports and the number of data read ports of each of the plurality of register files is less than the number of operands used by the at least one VLIW instruction packet.Type: ApplicationFiled: February 25, 2014Publication date: June 26, 2014Applicant: QUALCOMM IncorporatedInventors: Muhammad Ahmed, Erich James Plondke, Lucian Codrescu, William C. Anderson -
Patent number: 8745359Abstract: A VLIW processor executes a very long instruction word containing a plurality of instructions, and executes a plurality of instruction streams at low cost. A processor executing a very long instruction word containing a plurality of instructions fetches concurrently the very long instruction words of up to M instruction streams, from N instruction caches including a plurality of memory banks to store the very long instruction words of the M instruction streams.Type: GrantFiled: February 3, 2009Date of Patent: June 3, 2014Assignee: NEC CorporationInventor: Shohei Nomoto
-
Publication number: 20140149714Abstract: A reconfigurable processor and an operation method of the reconfigurable processor may include: a status register configured to store a status value used to determine at least one execution mode in a processor; a parallel processing scheduler configured to schedule at least one of a very long instruction word (VLIW) logic and a coarse grained architecture (CGA) logic to be used based on the stored status value; a VLIW register configured to store processed data according to the VLIW logic; and a CGA register configured to store processed data according to the CGA logic.Type: ApplicationFiled: November 27, 2013Publication date: May 29, 2014Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventors: Doo Hyun Kim, Joon Ho Song, Do Hyung Kim, Shi Hwa Lee
-
Patent number: 8738892Abstract: A Very Long Instruction Word (VLIW) processor having an instruction set with a reduced size resulting in a small number of bits being necessary to specify registers. The VLIW processor includes a register file, and first through third operation units, and executes a very long instruction word. Further, the very long instruction word includes a register specifying field which specifies a least one of the registers in the register file and a plurality of instructions. The operand of each instruction includes bits src1, src2, and dst, which indicate whether or not the registers specified by the register specifying field are to be used as the source register and the destination register.Type: GrantFiled: April 15, 2008Date of Patent: May 27, 2014Assignee: Panasonic CorporationInventors: Takahiro Kageyama, Hideshi Nishida, Takeshi Tanaka, Kouji Nakajima
-
Register files for a digital signal processor operating in an interleaved multi-threaded environment
Patent number: 8713286Abstract: A processor device is disclosed and includes a memory and a sequencer that is responsive to the memory. The sequencer supports very long instruction word (VLIW) type instructions and at least one VLIW instruction packet uses a number of operands during execution. The processor device further includes a plurality of instruction execution units responsive to the sequencer and a plurality of register files. Each of the plurality of register files includes a plurality of registers and the plurality of register files are coupled to the plurality of instruction execution units. Further, each of the plurality of register files includes a number of data read ports and the number of data read ports of each of the plurality of register files is less than the number of operands used by the at least one VLIW instruction packet.Type: GrantFiled: April 26, 2005Date of Patent: April 29, 2014Assignee: QUALCOMM IncorporatedInventors: Muhammad Ahmed, Erich Plondke, Lucian Codrescu, William C. Anderson -
Patent number: 8707012Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: GrantFiled: October 12, 2012Date of Patent: April 22, 2014Assignee: Intel CorporationInventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Patent number: 8707013Abstract: In accordance with at least some embodiments, a digital signal processor (DSP) includes an instruction fetch unit and an instruction decode unit in communication with the instruction fetch unit. The DSP also includes a register set and a plurality of work units in communication with the instruction decode unit. The register set includes a plurality of legacy predicate registers. Separate from the legacy predicate registers, a plurality of on-demand predicate registers are selectively signaled without changing the opcode space for the DSP.Type: GrantFiled: July 13, 2010Date of Patent: April 22, 2014Assignee: Texas Instruments IncorporatedInventors: Jagadeesh Sankaran, Joseph R. Zbiciak, Steven D. Krueger
-
Patent number: 8671266Abstract: Techniques are described for decoupling fetching of an instruction stored in a main program memory from earliest execution of the instruction. An indirect execution method and program instructions to support such execution are addressed. In addition, an improved indirect deferred execution processor (DXP) VLIW architecture is described which supports a scalable array of memory centric processor elements that do not require local load and store units.Type: GrantFiled: May 18, 2011Date of Patent: March 11, 2014Assignee: Altera CorporationInventors: Gerald George Pechanek, Stamatis Vassiliadis
-
Publication number: 20140059324Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.Type: ApplicationFiled: February 20, 2013Publication date: February 27, 2014Inventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, JR., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo Rodriguez, Marco Jacobs
-
Publication number: 20140052960Abstract: An apparatus and method for generating a very long instruction word (VLIW) command that supports predicated execution, and a VLIW processor and method for processing a VLIW are provided herein. The VLIW command includes an instruction bundle formed of a plurality of instructions to be executed in parallel and a single value indicating predicated execution, and is generated using the apparatus and method for generating a VLIW command. The VLIW processor decodes the instruction bundle and executes the instructions, which are included in the decoded instruction bundle, in parallel, according to the value indicating predicated execution.Type: ApplicationFiled: October 28, 2013Publication date: February 20, 2014Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventors: Bernhard Egger, Soo-jung Ryu, Dong-hoon Yoo, Il-hyun Park
-
Patent number: 8656141Abstract: An integrated circuit includes a plurality of tiles. Each tile includes a pipelined processor configured to process multiple streams of instructions for the processor; and a switch including switching circuitry to forward data over data paths from other tiles to one or more pipeline stages of the processor and to switches of other tiles. At least some of the data is forwarded based on one or more streams of instructions for the switch.Type: GrantFiled: December 13, 2005Date of Patent: February 18, 2014Assignee: Massachusetts Institute of TechnologyInventor: Anant Agarwal
-
Patent number: 8601244Abstract: An apparatus and method for generating a very long instruction word (VLIW) command that supports predicated execution, and a VLIW processor and method for processing a VLIW are provided herein. The VLIW command includes an instruction bundle formed of a plurality of instructions to be executed in parallel and a single value indicating predicated execution, and is generated using the apparatus and method for generating a VLIW command. The VLIW processor decodes the instruction bundle and executes the instructions, which are included in the decoded instruction bundle, in parallel, according to the value indicating predicated execution.Type: GrantFiled: February 16, 2010Date of Patent: December 3, 2013Assignee: Samsung Electronics Co., Ltd.Inventors: Bernhard Egger, Soo-jung Ryu, Dong-hoon Yoo, Il-hyun Park
-
Patent number: 8583895Abstract: A compressed instruction format for a VLIW processor allows greater efficiency in use of cache and memory. Instructions are byte aligned and variable length. Branch targets are uncompressed. Format bits specify how many issue slots are used in a following instruction. NOPS are not stored in memory. Individual operations are compressed according to features such as whether they are resultless, guarded, short, zeroary, unary, or binary. Instructions are stored in compressed form in memory and in cache. Instructions are decompressed on the fly after being read out from cache.Type: GrantFiled: January 22, 2004Date of Patent: November 12, 2013Assignee: Nytell Software LLCInventors: Eino Jacobs, Michael Ang
-
Patent number: 8516224Abstract: Instructions asserted in the instruction pipeline of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units, to interpret that control information and take appropriate action.Type: GrantFiled: January 31, 2012Date of Patent: August 20, 2013Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
-
Patent number: 8510539Abstract: A spilling method in register files for a processor is proposed. The processor with Parallel Architecture Core structure includes multiple clusters and a memory. Each cluster includes multiple function units (M-Unit and I-Unit), multiple local register files and a global register file. The local register files are used by the multiple function units, respectively. For a specified live range, the method includes calculating communication costs of the local register files and the global register file in each cluster, and communication cost of the memory for spilling the live range when spilling occurs; calculating use ratios of the local register files and the global register file in each cluster, and use ratio of the memory for the live range; and selecting one of the local register files and the global register file in each cluster and the memory for spilling the live range based on the communication costs and use ratios.Type: GrantFiled: July 2, 2010Date of Patent: August 13, 2013Assignee: National Tsing Hua UniversityInventors: Chia Han Lu, Chung Ju Wu, Jenq Kuen Lee
-
Publication number: 20130138918Abstract: A circuit arrangement, method, and program product for compressing and decompressing data in a node of a system including a plurality of nodes interconnected via an on-chip network. Compressed data may be received and stored at an input buffer of a node, and in parallel with moving the compressed data to an execution register of the node, decompression logic of the node may decompress the data to generate uncompressed data, such that uncompressed data is stored in the execution register for utilization by an execution unit of the node. Uncompressed data may be output by the execution unit into the execution register, and in parallel with moving the uncompressed data to an output buffer of the node connected to the on-chip network, compression logic may compress the uncompressed data to generate compressed data, such that compressed data is stored at the output buffer.Type: ApplicationFiled: November 30, 2011Publication date: May 30, 2013Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
-
Patent number: 8447961Abstract: A system to implement a zero overhead software pipelined (SFP) loop includes a Very Long Instruction Word (VLIW) processor having an N number of execution slots. The VLIW processor executes a plurality of instructions in parallel without any limitation of an instruction buffer size. A program memory receives a Program Memory address to fetch an instruction packet. The program memory is closely coupled with the instruction buffer size to implement the zero overhead software pipelined (SFP) loop. The size of the zero overhead software pipelined (SFP) loop can exceed the instruction buffer size. A CPU control register includes a block count and an iteration count. The block count is loaded into a block counter and counts the plurality of instructions executed in the SFP loop, and the iteration count is loaded into an iteration counter and counts a number of iterations of the SFP loop based on the block count.Type: GrantFiled: February 18, 2010Date of Patent: May 21, 2013Assignee: Saankhya Labs Pvt LtdInventors: Anindya Saha, Manish Kumar, Hemant Mallapur, Santhosh Billava, Viji Rajangam
-
Patent number: 8433553Abstract: A programmed computer and method are described for generating a processor design. The method carried out by the programmed computer comprises providing an initial model for the processor, specifying a plurality of resources in terms of resource parameters and their mutual relations. Furthermore, statistics are provided indicative of the required use of the resources by a selected application. Thereafter, a reduced resource design is generated by the programmed computer by relaxing at least one resource parameter and/or limiting an amount of resources specified in the initial specification on the basis of the statistics.Type: GrantFiled: November 3, 2008Date of Patent: April 30, 2013Assignee: Intel Benelux B.V.Inventors: Alexander Augusteijn, Jeroen Anton Johan Leijten
-
Publication number: 20130080738Abstract: In a particular embodiment, a very long instruction word (VLIW) processor is operable to execute VLIW instructions. At least one of the VLIW instructions includes a first load or store instruction and a second load or store instruction. The first instruction and the second instruction are executed as a single atomic unit. At least one of the first and second instructions is a store-conditional instruction.Type: ApplicationFiled: September 23, 2011Publication date: March 28, 2013Applicant: QUALCOMM INCORPORATEDInventors: Erich J. Plondke, Ajay A. Ingle, Lucian Codrescu
-
Patent number: 8397000Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.Type: GrantFiled: September 12, 2012Date of Patent: March 12, 2013Assignee: Altera CorporationInventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
-
Publication number: 20130061022Abstract: A method for providing intrinsic supports for a VLIW DSP processor with distributed register files comprises the steps of: generating a program representation with cluster information on instructions of the DSP processor, wherein the cluster information is provided by a program with cluster intrinsic coding; identifying data stream operations indicating parallel instruction sequences applied on different data sets in the program representation; identifying data sharing relations indicating data shared by the data stream operations in the program representation; identifying data aggregation relations indicating results aggregated from the data stream operations in the program representation; and performing register allocation for the DSP processor according to the identified data stream operations, the data sharing relations and the data aggregation relations.Type: ApplicationFiled: September 1, 2011Publication date: March 7, 2013Applicant: NATIONAL TSING HUA UNIVERSITYInventors: JENQ KUEN LEE, CHI BANG KUAN
-
Patent number: 8364935Abstract: A data processing apparatus has an instruction memory system arranged to output an instruction word addressed by an instruction address. An instruction execution unit, processes a plurality of instructions from the instruction word in parallel. A detection unit, detects in which of a plurality of ranges the instruction address lies. The detection unit is coupled to the instruction execution unit and/or the instruction memory system, to control a way in which the instruction execution unit parallelizes processing of the instructions from the instruction word, dependent on a detected range. In an embodiment the instruction execution unit and/or the instruction memory system adjusts a width of the instruction word that determines a number of instructions from the instruction word that is processed in parallel, dependent on the detected range.Type: GrantFiled: October 1, 2003Date of Patent: January 29, 2013Assignee: Nytell Software LLCInventors: Ramanathan Sethuraman, Balakrishnan Srinivasan, Carlos Antonio Alba Pinto, Harm Johannes Antonius Maria Peters, Rafael Peset Llopis
-
Patent number: 8316216Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: GrantFiled: October 21, 2009Date of Patent: November 20, 2012Assignee: Intel CorporationInventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Patent number: 8296479Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.Type: GrantFiled: January 5, 2012Date of Patent: October 23, 2012Assignee: Altera CorporationInventors: Gerald George Pechanek, David Strube, Edwin Franklin Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
-
Patent number: 8250340Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.Type: GrantFiled: February 12, 2010Date of Patent: August 21, 2012Assignee: Panasonic CorporationInventors: Shuichi Takayama, Nobuo Higaki
-
Patent number: 8191085Abstract: A method for operating a data processing system includes providing an application binary interface (ABI) which determines a set of non-contiguous volatile registers and a set of non-volatile registers. The set of non-contiguous volatile registers includes a plurality of general purpose registers (GPRs) and a plurality of special purpose registers (SPRs). The method includes providing less than three instructions which collectively load or store all of the set of non-contiguous volatile registers determined by the ABI. A system includes a set of volatile registers including a plurality of volatile GPRs, a plurality of volatile supervisor SPRs, and a plurality of volatile user SPRs, and execution circuitry for executing a first instruction that loads or stores the plurality of volatile supervisor SPRs, for executing a second instruction that loads or stores the plurality of volatile GPRs, and for executing a third instruction that loads or stores the plurality of volatile user SPRs.Type: GrantFiled: August 29, 2006Date of Patent: May 29, 2012Assignee: Freescale Semiconductor, Inc.Inventor: William C. Moyer
-
Patent number: 8145888Abstract: A data processing circuit has an execution circuit (18) with a plurality of functional units (20). An instruction decoder (17) is operable in a first and a second instruction mode. In the first instruction mode instructions have respective fields for controlling each of the functional units (20), and in the second instruction mode instructions control one functional unit. A mode control circuit (12) controls the selection of the instruction modes. In an embodiment, the instruction decoder uses time-stationary decoding of the selection of operations to be executed by the execution circuit (18) and the selection of destination registers from the set of registers (19). Mode switching is a more efficient way of reducing instruction time for time stationary processors than indicating functional units for which the instruction contains commands.Type: GrantFiled: September 6, 2007Date of Patent: March 27, 2012Assignee: Silicon Hive B.V.Inventors: Jeroen Anton Johan Leijten, Hendrik Tjeerd Joannes Zwartenkot
-
Patent number: 8127115Abstract: Disclosed are a method and a system for grouping processor instructions for execution by a processor, where the group of processor instructions includes at least two branch processor instructions. In one or more embodiments, an instruction buffer can decouple an instruction fetch operation from an instruction decode operation by storing fetched processor instructions in the instruction buffer until the fetched processor instructions are ready to be decoded. Group formation can involve removing processor instructions from the instruction buffer and routing the processor instruction to latches that convey the processor instructions to decoders. Processor instructions that are removed from instruction buffer in a single clock cycle can be called a group of processor instructions. In one or more embodiments, the first instruction in the group must be the oldest instruction in the instruction buffer and instructions must be removed from the instruction buffer ordered from oldest to youngest.Type: GrantFiled: April 3, 2009Date of Patent: February 28, 2012Assignee: International Business Machines CorporationInventors: Richard William Doing, Kevin Neal Magil, Balaram Sinharoy, Jeffrey R. Summers, James Albert Van Norstrand, Jr.
-
Patent number: 8117357Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.Type: GrantFiled: May 12, 2011Date of Patent: February 14, 2012Assignee: Altera CorporationInventors: Gerald George Pechanek, David Carl Strube, Edwin Frank Barry, Charles W. Kurak, Jr., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
-
Patent number: 8117423Abstract: Instructions asserted in the instruction pipeline (3) of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (15) of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units (7), to interpret that control information and take appropriate action.Type: GrantFiled: March 4, 2008Date of Patent: February 14, 2012Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
-
Patent number: 8108658Abstract: A data processing circuit comprises a register file (14) having read ports and write ports. A plurality of functional units (21a-c), is coupled to receive operand data from a same combination of read ports. Each functional unit is coupled to a respective one of the write ports for writing a respective result. An instruction issue slot has outputs (11) for supplying register selection information to said combination read ports and to the respective ones of the write ports. The output of the issue slot also supplies an operation code. The functional units (21a-c) in the plurality are arranged to respond to at least to one value of the operation code by each executing a respective operation using the same operands from said same combination and each functional unit producing a respective result at a respective ones of the write ports.Type: GrantFiled: September 21, 2005Date of Patent: January 31, 2012Assignee: Koninklijke Philips Electronics N.V.Inventor: Antonius Adrianus Maria Van Wel
-
Publication number: 20120017067Abstract: In accordance with at least some embodiments, a digital signal processor (DSP) includes an instruction fetch unit and an instruction decode unit in communication with the instruction fetch unit. The DSP also includes a register set and a plurality of work units in communication with the instruction decode unit. The register set includes a plurality of legacy predicate registers. Separate from the legacy predicate registers, a plurality of on-demand predicate registers are selectively signaled without changing the opcode space for the DSP.Type: ApplicationFiled: July 13, 2010Publication date: January 19, 2012Applicant: TEXAS INSTRUMENTS INCORPORATEDInventors: Jagadeesh SANKARAN, Joseph R. ZBICIAK, Steven D. KRUEGER
-
Patent number: 8095775Abstract: During operation of a VLIW processor, a very long instruction word is fetched. A portion of the very long instruction word that includes a pointer to an instruction is identified, and the instruction pointed to by the pointer is retrieved from a location of an instruction window. The retrieved instruction is input into a functional unit for execution.Type: GrantFiled: November 19, 2008Date of Patent: January 10, 2012Assignee: Marvell International Ltd.Inventors: Moinul H. Khan, Anitha Kona, Mark N. Fullerton
-
Publication number: 20110307684Abstract: An image processing system including a vector processor and a memory adapted for attaching to the vector processor. The memory is adapted to store multiple image frames. The vector processor includes an address generator operatively attached to the memory to access the memory. The address generator is adapted for calculating addresses of the memory over the multiple image frames. The addresses may be calculated over the image frames based upon an image parameter. The image parameter may specify which of the image frames are processed simultaneously. A scalar processor may be attached to the vector processor. The scalar processor provides the image parameter(s) to the address generator for address calculation over the multiple image frames. An input register may be attached to the vector processor. The input register may be adapted to receive a very long instruction word (VLIW) instruction.Type: ApplicationFiled: June 10, 2010Publication date: December 15, 2011Inventors: Yosef Kreinin, Gil Dogon, Emmanuel Sixsou, Yosi Arbeli, Mois Navon, Roman Sajman
-
Patent number: 8069335Abstract: A processing system for executing instructions comprises a first part (11) having address information and a plurality of data bits, E0 to EN. According to one embodiment, each data bit E0 to EN directly selects a corresponding element 130 to 13N forming a second part of the instruction set (for example a VLIW). In this manner, the first part (11) is used to only select elements that do not comprise NOP instructions, thereby avoiding power being consumed unnecessarily. According to an alternative embodiment, different groups of elements in the second part (13) may be selected by a number encoded in the first part (11), using data bits Eo to EN. Preferably, these different groups reflect the most likely used combinations in a program.Type: GrantFiled: November 13, 2006Date of Patent: November 29, 2011Assignee: NXP B.V.Inventors: Peter Kievits, Jean-Paul C. F. H. Smeets
-
Patent number: 8046568Abstract: The present invention relates to the field of (micro)computer design and architecture, and in particular to microarchitecture associated with moving data values between a (micro)processor and memory components. Particularly, the present invention relates to a computer system with an processor architecture in which register addresses are generated with more than one execution channel controlled by one central processing unit with at least one load/store unit for loading and storing data objects, and at least one cache memory associated to the processor holding data objects accessed by the processor, wherein said processor's load/store unit contains a high speed memory directly interfacing said load/store unit to the cache. The present invention improves the of architectures with dual ported microprocessor implementations comprising two execution pipelines capable of two load/store data transactions per cycle.Type: GrantFiled: June 28, 2010Date of Patent: October 25, 2011Assignee: Broadcom CorporationInventors: Sophie Wilson, John E. Redford
-
Patent number: 8019971Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.Type: GrantFiled: April 6, 2009Date of Patent: September 13, 2011Assignee: Panasonic CorporationInventors: Shuichi Takayama, Nobuo Higaki
-
Publication number: 20110219210Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.Type: ApplicationFiled: May 12, 2011Publication date: September 8, 2011Applicant: ALTERA CORPORATIONInventors: Gerald G. Pechanek, David Carl Strube, Edwin Frank Barry, Charles W. Kurak, JR., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
-
Patent number: 8015391Abstract: A processor simultaneously issues instructions to multiple threads in a same instruction execution cycle. An instruction issuer controls issuance of an instruction for each of the multiple threads. A detector detects, for each of the multiple threads, whether a loop processing is currently being executed. A unit causes the instruction issuer to increase a number of instructions to be issued when the detector detects that the loop processing is currently being executed.Type: GrantFiled: October 8, 2010Date of Patent: September 6, 2011Assignee: Panasonic CorporationInventor: Takenobu Tani
-
Publication number: 20110213948Abstract: An apparatus includes a processor. The processor includes two memories. The first memory stores one set of instructions. The second memory stores another set of instructions that are longer than the set of instructions in the first memory. An instruction in the set of instructions in the first memory is used as a pointer to a corresponding instruction in the set of instructions in the second memory.Type: ApplicationFiled: February 1, 2010Publication date: September 1, 2011Inventor: Steven Perry
-
Patent number: 7971197Abstract: A digital computer system automatically creates an Instruction Set Architecture (ISA) that potentially exploits VLIW instructions, vector operations, fused operations, and specialized operations with the goal of increasing the performance of a set of applications while keeping hardware cost below a designer specified limit, or with the goal of minimizing hardware cost given a required level of performance.Type: GrantFiled: August 18, 2005Date of Patent: June 28, 2011Assignee: Tensilica, Inc.Inventors: David William Goodwin, Dror Maydan, Ding-Kai Chen, Darin Stamenov Petkov, Steven Weng-Kiang Tjiang, Peng Tu, Christopher Rowen
-
Patent number: 7962719Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.Type: GrantFiled: August 7, 2008Date of Patent: June 14, 2011Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
-
Publication number: 20110072151Abstract: A reconfigurable, protocol indifferent bit stream-processing engine, and related systems and data communication methodologies, are adapted to achieve the goal of providing inter-fabric interoperability among high-speed networks operating a speeds of at least 10 gigabits per second. The bit-stream processing engine operates as an omni-protocol, multi-stage processor that can be configured with appropriate switches and related network elements to create a seamless network fabric that permits interoperability not only among existing communication protocols, but also with the ability to accommodate future communication protocols. The method and systems of the present invention are applicable to networks that include storage networks, communication networks and processor networks.Type: ApplicationFiled: August 24, 2010Publication date: March 24, 2011Inventors: Viswa Sharma, Roger Holschbach, Bart Stuck, William Chu
-
Patent number: 7895413Abstract: A microprocessor for processing instructions comprises multiple clusters for receiving the instructions, each of the clusters having a plurality of functional units for executing the instructions, multiple register sub-files each having multiple registers for storing data for executing the instructions, wherein each of the clusters is associated with corresponding one of the register sub-files so that an instruction dispatched to a cluster is executed by accessing registers in a register sub-file associated with the cluster to which the instruction is dispatched, a register-renaming unit for renaming target registers in an instruction with registers in a register sub-file associated with a cluster to which the instruction is dispatched, and issue-queue units each of which is associated with a corresponding one of the clusters, wherein an issue-queue unit holds instruction renamed by the register-renaming unit until the renamed instruction is issued to be executed in a cluster associated with the issue-queue uType: GrantFiled: May 15, 2008Date of Patent: February 22, 2011Assignee: International Business Machines CorporationInventor: Mayan Moudgill
-
Patent number: 7886135Abstract: Instructions asserted in a microprocessors instruction pipeline (3) are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (5) that is synchronized to the instruction pipeline. At the execution stage, the control information is interpreted and appropriate action taken. The control information may indicate that the instruction has been reasserted (asserted again following an initial assertion) and may also indicate the number of times that the instruction has been consecutively asserted in the instruction pipeline. Applied to unaligned memory operations, in which a memory atom is asserted twice, the control information indicates which part of the unaligned data is to be fetched each time the atom is executed.Type: GrantFiled: November 7, 2006Date of Patent: February 8, 2011Inventors: Brett Coon, Godfrey D'Souza, Paul Serris