Multimode (e.g., MIMD to SIMD, etc.) Patents (Class 712/20)
-
Patent number: 7487302
Abstract: A memory subsystem includes a memory controller operable to generate first control signals according to a standard interface. A memory interface adapter is coupled to the memory controller and is operable responsive to the first control signals to develop second control signals adapted to be applied to a memory subsystem to access desired storage locations within the memory subsystem.
Type: Grant
Filed: October 3, 2005
Date of Patent: February 3, 2009
Assignee: Lockheed Martin Corporation
Inventors: Brent I. Gouldey, Joel J. Fuster, John Rapp, Mark Jones
-
Publication number: 20090024830
Abstract: Executing Multiple Instructions Multiple Data (‘MIMD’) programs on a Single Instruction Multiple Data (‘SIMD’) machine, the SIMD machine including a plurality of compute nodes, each compute node capable of executing only a single thread of execution, the compute nodes initially configured exclusively for SIMD operations, the SIMD machine further comprising a data communications network, the network comprising synchronous data communications links among the compute nodes, including establishing a SIMD partition comprising a plurality of the compute nodes; booting the SIMD partition in MIMD mode; executing by launcher programs a plurality of MIMD programs on compute nodes in the SIMD partition; and re-executing a launcher program by an operating system on a compute node in the SIMD partition upon termination of the MIMD program executed by the launcher program.
Type: Application
Filed: July 19, 2007
Publication date: January 22, 2009
Inventors: Thomas A. Budnik, Alan J. King, Patrick J. McCarthy, Michael B. Mundy, Amanda Peters, James C. Sexton, Gordon G. Stewart
-
Publication number: 20090024831
Abstract: Executing MIMD programs on a SIMD machine, including establishing on the SIMD machine a plurality of SIMD partitions; booting a first SIMD partition in MIMD mode; executing, on a compute node of the first SIMD partition booted in MIMD mode, a MIMD accelerator program; executing a SIMD program in a second SIMD partition, one instance of the SIMD program executing on each compute node of the second SIMD partition, each instance of the SIMD program carrying out a portion of the data processing effected by the SIMD program; and accelerating, by an instance of the SIMD program through the MIMD accelerator program, a portion of the data processing of the instance of the SIMD program.
Type: Application
Filed: July 19, 2007
Publication date: January 22, 2009
Inventors: Todd A. Inglett, Alan J. King, Patrick J. McCarthy, Amanda Peters, James C. Sexton
-
Patent number: 7467286
Abstract: A method and apparatus are provided for executing packed data instructions. According to one aspect of the invention, a processor includes registers, a register renaming unit coupled to the registers, a decoder coupled to the register renaming unit, and a partial-width execution unit coupled to the decoder. The register renaming unit provides an architectural register file to store packed data operands that include data elements. The decoder is to decode a first and second set of instructions that each specify one or more registers in the architectural register file. Each of the instructions in the first set specify operations to be performed on all of the data elements. In contrast, each of the instructions in the second set specify operations to be performed on only a subset of the data elements. The partial-width execution unit is to execute operations specified by either the first or second set of instructions.
Type: Grant
Filed: May 9, 2005
Date of Patent: December 16, 2008
Assignee: Intel Corporation
Inventors: Mohammad Abdallah, James Coke, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
-
Publication number: 20080301482
Abstract: A computer array 100 including a field of processors 101-124 each processor having a separate memory. The processors 101-124 are connected to their immediate neighbors with links 200. Several configurations of the links are described including differing types of data lines 210 and control lines 215. Along lines 215 Process Command Words (PCW) to initiate processing tasks and Routing Connection Words (RCW) to initiate routing tasks pass between the processors 101-124 to provide a method for altering the mode of hybrid processors 107-118 in the array.
Type: Application
Filed: May 31, 2007
Publication date: December 4, 2008
Inventor: Lonnie C. Goff
-
Patent number: 7457938
Abstract: In one embodiment, the present invention includes a method for executing an operation on low order portions of first and second source operands using a first execution stack of a processor and executing the operation on high order portions of the first and second source operands using a second execution stack of the processor, where the operation in the second execution stack is staggered by one or more cycles from the operation in the first execution stack. Other embodiments are described and claimed.
Type: Grant
Filed: September 30, 2005
Date of Patent: November 25, 2008
Assignee: Intel Corporation
Inventors: Stephan Jourdan, Avinash Sodani, Michael Fetterman, Per Hammarlund, Ronak Singhal, Glenn Hinton
-
Publication number: 20080288748
Abstract: A core switching system includes a mode switching module that receives a switch signal to switch operation between a first mode and a second mode. During the first mode, instructions associated with applications are executed by a first asymmetric core, and a second asymmetric core is inactive. During the second mode, the instructions are executed by the second asymmetric core, and the first asymmetric core is inactive. A core activation module stops processing of the applications by the first asymmetric core after interrupts are disabled. A state transfer module transfers a state of the first asymmetric core to the second asymmetric core. The core activation module allows the second asymmetric core to resume execution of the instructions and the interrupts are enabled.
Type: Application
Filed: June 30, 2008
Publication date: November 20, 2008
Inventors: Sehat Sutardja, Hong-Yi Chen, Premanand Sakarda, Mark N. Fullerton, Jay Heeb
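The switching sequence in this abstract (disable interrupts, stop the first core, transfer its state, resume on the second core, re-enable interrupts) can be illustrated with a small Python sketch. The `Core` and `CoreSwitcher` names, the dictionary used as core state, and the event log are all hypothetical and are not taken from the application; this only models the ordering of the steps.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical model of one asymmetric core."""
    name: str
    active: bool = False
    state: dict = field(default_factory=dict)  # e.g. registers, program counter


class CoreSwitcher:
    """Illustrative ordering of the steps named in the abstract."""

    def __init__(self, first: Core, second: Core):
        self.running, self.idle = first, second
        self.running.active = True
        self.interrupts_enabled = True
        self.log = []

    def switch(self) -> None:
        # 1. Interrupts are disabled before the running core is stopped.
        self.interrupts_enabled = False
        self.log.append("interrupts disabled")
        # 2. Core activation: stop processing on the running core.
        self.running.active = False
        self.log.append(f"{self.running.name} stopped")
        # 3. State transfer: copy the state to the other core.
        self.idle.state = dict(self.running.state)
        self.log.append("state transferred")
        # 4. The other core resumes execution from the transferred state.
        self.idle.active = True
        self.log.append(f"{self.idle.name} resumed")
        self.running, self.idle = self.idle, self.running
        # 5. Interrupts are enabled again.
        self.interrupts_enabled = True
        self.log.append("interrupts enabled")


big = Core("big", state={"pc": 0x40})
little = Core("little")
switcher = CoreSwitcher(big, little)
switcher.switch()
print(switcher.log)
```

Only one core is marked active at any time, which mirrors the first-mode/second-mode description; a real system would also have to handle caches and in-flight instructions, which this sketch ignores.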
-
Publication number: 20080288746
Abstract: Executing MIMD programs on a SIMD machine, the SIMD machine including a plurality of compute nodes, each compute node capable of executing only a single thread of execution, the compute nodes initially configured exclusively for SIMD operations, the SIMD machine further comprising a data communications network, the network comprising synchronous data communications links among the compute nodes, including establishing one or more SIMD partitions, booting one or more SIMD partitions in MIMD mode; establishing a MIMD partition; executing by launcher programs a plurality of MIMD programs on two or more of the compute nodes of the MIMD partition; and re-executing a launcher program by an operating system on a compute node in the MIMD partition upon termination of the MIMD program executed by the launcher program.
Type: Application
Filed: May 16, 2007
Publication date: November 20, 2008
Inventors: Todd A. Inglett, Patrick J. McCarthy, Amanda Peters
-
Publication number: 20080288747
Abstract: Executing MIMD programs on a SIMD machine, including establishing SIMD partitions on the SIMD machine; booting SIMD partitions in MIMD mode; executing MIMD programs on the compute nodes of a first SIMD partition booted in MIMD mode; re-executing a launcher program by an operating system on a compute node in the first SIMD partition booted in MIMD mode upon termination of the MIMD program executed by the launcher program; determining by a scheduler that the first SIMD partition booted in MIMD mode is required to establish a new SIMD partition large enough to run a SIMD program that is scheduled for execution; moving by the scheduler data processing operations from the first SIMD partition booted in MIMD mode to the second SIMD partition booted in MIMD mode; and establishing by the scheduler the new SIMD partition.
Type: Application
Filed: May 18, 2007
Publication date: November 20, 2008
Inventors: Todd A. Inglett, Patrick J. McCarthy, Amanda Peters
-
Publication number: 20080282062
Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. The sleeping computer (12) can be awaiting data or instructions (12). In the case of instructions, the sleeping computer (12) can be waiting to store the instructions or to immediately execute the instructions. In the latter case, the instructions are placed in an instruction register (30a) when they are received and executed therefrom, without first placing the instructions first into memory. The instructions can include a crawler (201) which is capable of traversing multiple processors along a predefined path (202) and performing a series of operations in preselected computers.
Type: Application
Filed: May 7, 2007
Publication date: November 13, 2008
Inventors: Michael B. Montvelishsky, Charles H. Moore, Jeffrey Arthur Fox
-
Publication number: 20080276068
Abstract: An IP multimedia subsystem (IMS) network includes (i) a plurality of network elements that are directly or indirectly interconnected for carrying out communications and (ii) an integrated IMS network control unit interfaced with the other network elements. The control unit integrates three IMS network functions into one network node: a multimedia resource function (MRF) module, which incorporates a multimedia resource function controller (MRFC) and/or one or more multimedia resource function processors (MRFP), e.g., media servers; a media gateway control function (MGCF); and a media gateway (MGW). The external physical interface of the control unit mimics the typical interfaces of the MRF/MRFC, MGCF, and MGW, were they to be provided as separate network elements/nodes. As such, the control unit is both physically and logically transparent to the rest of the network, as relating to the integrated MRF, MGCF, and MGW functions.
Type: Application
Filed: May 3, 2007
Publication date: November 6, 2008
Inventors: Syed Reaz Ashraf, Behzad Davachi Mottahed
-
Publication number: 20080270747
Abstract: A method for a switchover in a computer system having at least two execution units, a switchover being performed between at least two operating modes, and a first operating mode corresponding to a comparison mode, and a second operating mode corresponding to a performance mode, wherein the switchover is triggered by at least one signal, which is generated outside the computer system.
Type: Application
Filed: October 25, 2005
Publication date: October 30, 2008
Inventors: Wolfgang Pfeiffer, Reinhard Weiberle, Bernd Mueller, Florian Hartwich, Werner Harter, Ralf Angerbauer, Eberhard Boehl, Thomas Kottke, Yorch von Collani, Rainer Gmehlich, Karsten Graebitz
-
Publication number: 20080270746
Abstract: A method and a device for performing switchover operations and for comparing signals in a computer system having at least two processing units, a switchover device being provided, and switchover operations being carried out between at least two operating modes, a comparator being provided, and a first operating mode corresponding to a comparison mode, and a second operating mode corresponding to a performance mode. At least two analog signals of the processing units are compared in such a way that, as a function of these signals, a difference is formed.
Type: Application
Filed: October 25, 2005
Publication date: October 30, 2008
Inventors: Bernd Mueller, Eberhard Boehl
-
Publication number: 20080270748
Abstract: A system and method for design verification and, more particularly, a hardware simulation accelerator design and method that exploits a parallel structure of user models to support a large user model size. The method includes a computer including N number of logic evaluation units (LEUs) that share a common pool of instruction memory (IM). The computer infrastructure is operable to: partition a number of parallel operations in a netlist; and send a same instruction stream of the partitioned number of parallel operations to N number of LEUs from a single IM. The system is a hardware simulation accelerator having a computer infrastructure operable to provide a stream of instructions to multiple LEUs from a single IM. The multiple LEUs are clustered together with multiple IMs such that each LEU is configured to use instructions from any of the multiple IMs thereby allowing a same instruction stream to drive the multiple LEUs.
Type: Application
Filed: April 30, 2007
Publication date: October 30, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Daniel R. CROUSE, Gernot E. GUENTHER, Viktor GYURIS, Harrell HOFFMAN, Kevin A. PASNIK, Thomas J. TRYT, John H. WESTERMANN
-
Publication number: 20080263317
Abstract: An integrated circuit (102) in communication with a host circuit (104) includes an interconnect bus (344) and a plurality of programmable elements (116-130). Each of the programmable elements (116-130) includes a control interface (354) for receiving a control signal, the control signal causing the memory element (338) to selectively operate in one of a plurality of modes. In a first mode, the memory element (338) communicates stored data to the output port upon receiving the control signal; in a second mode the memory element (338) communicates stored data to the output port upon detecting valid data at the input port; in a third mode the memory element stores a first data value consisting of at least a portion of a single data word received at the input port; and in a fourth mode the memory element (338) stores a second data value consisting of at least a portion of each of two separate input values received at the input port.
Type: Application
Filed: April 19, 2007
Publication date: October 23, 2008
Applicant: L3 COMMUNICATIONS INTEGRATED SYSTEMS, L.P.
Inventors: JERRY WILLIAM YANCEY, YEA ZONG KUO
-
Patent number: 7441098
Abstract: A method of executing instructions in a computer system on operands containing a plurality of packed objects in respective lanes of the operand is described. Each instruction defines an operation and contains a condition setting indicator settable independently of the operation. The status of the condition setting indicator determines whether or not multibit condition codes are set. When they are to be set, they are set depending on the results for carrying out the operation for each lane.
Type: Grant
Filed: May 6, 2005
Date of Patent: October 21, 2008
Assignee: Broadcom Corporation
Inventor: Sophie Wilson
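The idea of per-lane condition codes controlled by an indicator that is independent of the operation can be sketched in Python. The lane width (16 bits), lane count (4), the N/Z/C flag set, and all function names are assumptions chosen for illustration; the patent abstract does not specify them.

```python
LANE_BITS = 16   # assumed lane width
LANES = 4        # assumed number of lanes in a 64-bit operand
MASK = (1 << LANE_BITS) - 1


def split_lanes(word):
    """Return the packed objects of a word, lowest lane first."""
    return [(word >> (i * LANE_BITS)) & MASK for i in range(LANES)]


def join_lanes(lanes):
    """Pack a list of lane values (lowest lane first) into one word."""
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & MASK) << (i * LANE_BITS)
    return word


def packed_add(a, b, set_cc, old_cc):
    """Lane-wise add. Condition codes change only when set_cc is true.

    Each lane gets its own multibit code (hypothetical N/Z/C flags)
    derived from that lane's result.
    """
    results = []
    codes = list(old_cc)
    for i, (x, y) in enumerate(zip(split_lanes(a), split_lanes(b))):
        total = x + y
        r = total & MASK
        results.append(r)
        if set_cc:
            codes[i] = {
                "N": bool(r >> (LANE_BITS - 1)),  # sign bit of the lane
                "Z": r == 0,                      # lane result is zero
                "C": total > MASK,                # carry out of the lane
            }
    return join_lanes(results), codes


a = join_lanes([0xFFFF, 0x0001, 0x7FFF, 0x0000])
b = join_lanes([0x0001, 0x0001, 0x0001, 0x0000])
result, cc = packed_add(a, b, set_cc=True, old_cc=[None] * LANES)
print([hex(v) for v in split_lanes(result)], cc)
```

Calling `packed_add` with `set_cc=False` performs the same arithmetic but returns the previous codes unchanged, which is the behaviour the condition setting indicator is meant to select.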
-
Publication number: 20080250226
Abstract: A multi-mode register rename mechanism which allows a simultaneous multi-threaded processor to support full out-of-order thread execution when the number of threads is low and in-order thread execution when the number of threads increases. Responsive to changing an execution mode of a processor to operate in in-order thread execution mode, the illustrative embodiments switch a physical register in the data processing system to an architected facility, thereby forming a switched physical register. When an instruction is issued to an execution unit, wherein the issued instruction comprises a thread bit, the thread bit is examined to determine if the instruction accesses an architected facility. If the issued instruction accesses an architected facility, the instruction is executed, and the results of the executed instruction are written to the switched physical register.
Type: Application
Filed: April 4, 2007
Publication date: October 9, 2008
Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy
-
Patent number: 7418575
Abstract: A system for adding reconfigurable computational instructions to a computer, the system comprising a processor operable to execute a set of instructions of a computer program comprising a set of computational instructions and long instruction word instructions with at least one of the long instruction word instructions comprising an instruction extension, an extension adapter coupled to the processor and operable to detect the execution of the instruction extension, and programmable logic coupled to the extension adapter and operable to receive configuration data for defining the instruction extension and execute the instruction extension.
Type: Grant
Filed: May 12, 2005
Date of Patent: August 26, 2008
Assignee: Stretch, Inc.
Inventors: Ricardo E. Gonzalez, Scott Johnson, Derek Taylor
-
Patent number: 7404066
Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU command include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.
Type: Grant
Filed: January 24, 2007
Date of Patent: July 22, 2008
Assignee: Micron Technology, Inc.
Inventor: Graham Kirsch
-
Patent number: 7401333
Abstract: The present invention provides an array of parallel programmable processing engines interconnected by a switching network. At least some of the processing engines execute a thread, and at least some threads communicate with each other through communication objects either internally within one processing engine or through the network. A scheduling step of the parallel programmable processing engines is initiated by one or more events, an event being defined by a change of a state variable of a communication object. The array comprises: means for scheduling a scheduling step of the processing engines, the scheduling means comprising means for executing at least a first set of threads in parallel, means for updating state values of communications objects in response to the parallel executing step, and means for repeatedly and sequentially scheduling the executing means and the updating means until no more events occur. The present invention also provides a deterministic method of operating such an array.
Type: Grant
Filed: August 8, 2001
Date of Patent: July 15, 2008
Assignee: TranSwitch Corporation
Inventor: Ivo Vandeweerd
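The execute/update loop described in this abstract resembles a two-phase, event-driven simulation step. The Python sketch below is a simplified illustration under several assumptions that are not in the patent: threads are plain functions, communication objects are dictionary entries, every thread runs in every round, and each object has a single writer (which keeps the result independent of thread order).

```python
def run_scheduling_step(state, threads, max_rounds=100):
    """Alternate execute and update phases until no event occurs.

    state:   dict mapping communication-object names to values
    threads: callables taking a read-only snapshot and returning
             a dict of values they want to write
    Returns the number of execute/update rounds that were run.
    """
    rounds = 0
    while rounds < max_rounds:
        # Execute phase: all threads read the same snapshot, so the
        # outcome does not depend on the order in which they run.
        snapshot = dict(state)
        pending = {}
        for thread in threads:
            pending.update(thread(snapshot))
        # Update phase: commit the writes; a changed value is an event.
        events = [k for k, v in pending.items() if state.get(k) != v]
        state.update(pending)
        rounds += 1
        if not events:
            break
    return rounds


def producer(s):
    return {"b": s["a"] + 1}


def consumer(s):
    return {"c": s["b"] * 2}


state = {"a": 1, "b": 0, "c": 0}
rounds = run_scheduling_step(state, [producer, consumer])
print(rounds, state)
```

Separating reads (snapshot) from writes (pending) is what makes the step deterministic in this sketch; a fuller model would re-run only the threads sensitive to the objects that changed.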
-
Patent number: 7392329
Abstract: In accordance with one embodiment of the present invention, a method of applying an action initiated for a portion of a plurality of devices to all of the plurality of devices is provided. The method comprises establishing a status block for a plurality of devices that are implemented on a system, and initiating an action for a portion of the plurality of devices. The method further comprises writing information to the status block identifying that the action was initiated, and based at least in part on the information written to the status block, applying the action to all of the plurality of devices.
Type: Grant
Filed: March 28, 2003
Date of Patent: June 24, 2008
Assignee: Hewlett-Packard Development, L.P.
Inventors: Scott Lynn Michaelis, Marvin J. Spinhirne
-
Patent number: 7383427
Abstract: A method is provided for executing a plurality of parallel executable sequences of instructions on a processor having a plurality of execution units operated by a single instruction unit. The method includes a) detecting a plurality of sequences of instructions adapted for parallel execution from instructions being provided to the processor, wherein each sequence is adapted for execution by a subset of the plurality of execution units and b) storing information representing a stall status of the execution units. Then, a step c) is performed, wherein, for each unexecuted sequence of the plurality of sequences: i) all of the plurality of execution units other than the subset which corresponds to the unexecuted sequence are stalled, and ii) the sequence of instructions is executed by the corresponding subset. Thereafter, it is determined in a step d) whether a current stall status of the plurality of execution units matches the stall status represented by the stored information.
Type: Grant
Filed: April 20, 2005
Date of Patent: June 3, 2008
Assignee: Sony Computer Entertainment Inc.
Inventor: Takeshi Yamazaki
-
Patent number: 7360005
Abstract: An electrically programmable multiple selectable function integrated circuit module has a plurality of optionally selectable function circuits, which receive and manipulate a plurality of input data signals. The outputs of the plurality of optionally selectable function circuits are either interconnected to each other or connected to a plurality of output connectors to transmit manipulated output data signals to external circuitry. The electrically programmable multiple selectable function integrated circuit module has at least one configuration connector, which may be multiplexed with input control and timing signals, connected to a function configuration circuit to receive electrical configuration signals indicating the activation of a program mode and which of the optionally selectable function circuits are to be elected to manipulate the input data signals.
Type: Grant
Filed: March 11, 2003
Date of Patent: April 15, 2008
Inventor: Mou-Shiung Lin
-
Publication number: 20080046685
Abstract: A control processor is used for fetching and distributing single instruction multiple data (SIMD) instructions to a plurality of processing elements (PEs). One of the SIMD instructions is a thread start (Tstart) instruction, which causes the control processor to pause its instruction fetching. A local PE instruction memory (PE Imem) is associated with each PE and contains local PE instructions for execution on the local PE. Local PE Imem fetch, decode, and execute logic are associated with each PE. Instruction path selection logic in each PE is used to select between control processor distributed instructions and local PE instructions fetched from the local PE Imem. Each PE is also initialized to receive control processor distributed instructions. In addition, local hold generation logic is associated with each PE. A PE receiving a Tstart instruction causes the instruction path selection logic to switch to fetch local PE Imem instructions.
Type: Application
Filed: April 18, 2007
Publication date: February 21, 2008
Inventors: Gerald George Pechanek, Edwin Franklin Barry, Mihailo M. Stojancic
-
Patent number: 7313646
Abstract: An electronic system comprises an initiator module and a target module addressable by the initiator module, and an interface and control module for interfacing between respective communication protocols of the initiator module and of the target module. The interface and control module is constructed to set a composite instruction detection signal in response to the detection of a composite instruction executed by the initiator module, which composite instruction detection signal is used for the interfacing. The interface and control module is constructed to detect a composite instruction executed by the initiator module when, at a determined clock cycle of the initiator module, a change of the elementary operation executed by the initiator module is detected with respect to the previous clock cycle of the initiator module, while, at the same time, a signal for selecting the target module which was active is kept active.
Type: Grant
Filed: May 26, 2005
Date of Patent: December 25, 2007
Assignee: STMicroelectronics S.A.
Inventors: Hervé Chalopin, Laurent Tabaries
-
Patent number: 7035991
Abstract: A surface computer includes an address generator for generating an address for adjusting surface region data concerning at least a storage region and a concurrent computer, provided at a subsequent stage of the address generator, having a plurality of unit computers.
Type: Grant
Filed: October 2, 2003
Date of Patent: April 25, 2006
Assignee: Sony Computer Entertainment Inc.
Inventor: Akio Ohba
-
Patent number: 7028107
Abstract: A system for communication between a plurality of functional elements in a cell arrangement and a higher-level unit is described. The system may include, for example, a configuration memory arranged between the functional elements and the higher-level unit; and a control unit configured to move at least one position pointer to a configuration memory location in response to at least one event reported by a functional element. At run time, a configuration word in the configuration memory pointed to by at least one of the position pointers is transferred to the functional element in order to perform reconfiguration without the configuration word being managed by a central logic.
Type: Grant
Filed: October 7, 2002
Date of Patent: April 11, 2006
Assignee: Pact XPP Technologies AG
Inventors: Martin Vorbach, Robert Münch
-
Patent number: 6944744
Abstract: A functional unit of a processor may be configured to operate on instructions as either a single, wide functional unit or as multiple, independent narrower units. For example, an execution unit may be scheduled to execute an instruction as a single double-wide execution unit or as two independently schedulable single-wide execution units. Functional unit portions may be independently schedulable for execution of instructions operating on a first data type (e.g. SISD instructions). For single-wide instructions, functional unit portions may be scheduled independently. An issue lock mechanism may lock functional unit portions together so that they form a single multi-wide functional unit. For certain multi-wide instructions (e.g. certain SIMD instructions), an instruction operating on a multi-wide or vector data type may be scheduled so that the full multi-wide operation is performed concurrently by functional unit portions locked together as a one wide functional unit.
Type: Grant
Filed: August 27, 2002
Date of Patent: September 13, 2005
Assignee: Advanced Micro Devices, Inc.
Inventors: Ashraf Ahmed, Michael A. Filippo, James K. Pickett
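The difference between scheduling two single-wide portions independently and locking them together for a multi-wide instruction can be shown with a toy issue model in Python. Everything here is an assumption made for illustration: two unit portions, single-cycle operations, in-order issue, and the `issue_plan` name. It is not the patented scheduler.

```python
def issue_plan(instrs):
    """Assign each instruction an issue cycle and unit portion(s).

    instrs: list of "narrow" or "wide" (hypothetical instruction kinds)
    Returns a list of (cycle, portions) tuples in program order.
    """
    free_at = [0, 0]  # next free cycle of each single-wide portion
    plan = []
    for kind in instrs:
        if kind == "wide":
            # Issue lock: both portions must start in the same cycle.
            cycle = max(free_at)
            free_at = [cycle + 1, cycle + 1]
            plan.append((cycle, (0, 1)))
        else:
            # Narrow instructions take whichever portion frees up first.
            half = 0 if free_at[0] <= free_at[1] else 1
            cycle = free_at[half]
            free_at[half] = cycle + 1
            plan.append((cycle, (half,)))
    return plan


print(issue_plan(["narrow", "narrow", "wide", "narrow"]))
```

In this model two narrow instructions can share cycle 0 on separate portions, while the wide instruction waits until both portions are free and then occupies them together.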
-
Patent number: 6928535
Abstract: An image input section and a signal processing section are provided. The image input section includes an array of pixel in which a plurality of pixels having a CMOS type photoelectric converting element for converting incident light to an electric signal are arranged in a matrix, and a data read-out circuit having the same number of A/D converters as the number of the pixels arranged in one row of the array of pixel and serving to convert the analog signal converted by the pixels into a digital signal and to output the digital signal. The signal processing section includes plurality of processors. Each of the processors includes a plurality of processing elements (PE) provided on the A/D converter provided in the data read-out circuit by one to one. Moreover, a plurality of PEs provided in each of the processors have the same data processing function in the same processor. Furthermore, the PEs in the processor carry out a signal processing in parallel in response to an instruction.
Type: Grant
Filed: July 16, 2002
Date of Patent: August 9, 2005
Assignee: Kabushiki Kaisha Toshiba
Inventors: Hirofumi Yamashita, Charles G. Sodini
-
Patent number: 6925548
Abstract: A data processor can assign a greater number of operations to instruction codes with shorter length, thereby implementing high performance, high code efficiency and low cost data processor. The data processor is a VLIW (Very Long Instruction Word) system that can execute a plurality of operations in parallel, and specify the execution sequence of the operations. It can assign a plurality of operations to the same operation code, and the operations that are executed in a second or subsequent sequence are limited to only predetermined operations among the plurality of operations.
Type: Grant
Filed: October 9, 2001
Date of Patent: August 2, 2005
Assignee: Renesas Technology Corp.
Inventor: Masahito Matsuo
-
Patent number: 6848041
Abstract: A hierarchical instruction set architecture (ISA) provides pluggable instruction set capability and support of array processors. The term pluggable is from the programmer's viewpoint and relates to groups of instructions that can easily be added to a processor architecture for code density and performance enhancements. One specific aspect addressed herein is the unique compacted instruction set which allows the programmer the ability to dynamically create a set of compacted instructions on a task by task basis for the primary purpose of improving control and parallel code density. These compacted instructions are parallelizable in that they are not specifically restricted to control code application but can be executed in the processing elements (PEs) in an array processor. The ManArray family of processors is designed for this dynamic compacted instruction set capability and also supports a scalable array of from one to N PEs.
Type: Grant
Filed: April 28, 2003
Date of Patent: January 25, 2005
Assignee: PTS Corporation
Inventors: Gerald G. Pechanek, Edwin F. Barry, Juan Guillermo Revilla, Larry D. Larsen
-
Patent number: 6839828
Abstract: There is provided a processor designed to operate in a plurality of modes for processing vector and scalar instructions. Register files are each for storing scalar and vector data and address information. A parallel vector unit, coupled to the register files, includes functional units configurable to operate in a vector operation mode and a scalar operation mode. The vector unit includes an apparatus for tightly coupling the functional units to perform an operation specified by a current instruction. Under a vector operation mode, the vector unit performs, in parallel, a single vector operation on a plurality of data elements. The operations performed on the plurality of data elements are each performed by a different functional unit of the vector unit. Under a scalar operation mode, the vector unit performs a scalar operation on a data element received from the register files in a functional unit within the vector unit.
Type: Grant
Filed: August 14, 2001
Date of Patent: January 4, 2005
Assignee: International Business Machines Corporation
Inventors: Michael Karl Gschwind, Harm Peter Hofstee, Martin Edward Hopkins
-
Patent number: 6836837
Abstract: There is disclosed a technique for accessing a register file which comprises defining a first register address as a plurality of bits and using said first register address to access said register file generating a second register address by using a sequence of said plurality of bits with at least one of said plurality of bits supplied via a unitary operator, the unitary operator being effective to selectively alter the logical value of said bit depending on its logical value in the first register address, and using said second register address to access said register file. A computer system for carrying out such a technique is also enclosed.
Type: Grant
Filed: June 19, 2003
Date of Patent: December 28, 2004
Assignee: Broadcom Corporation
Inventors: Mark Taunton, Sophie Wilson, Timothy Martin Dobson
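One way to read this abstract is that a second register address is derived from the first by passing a single address bit through a one-input operator, so that one instruction field can name a pair of registers. The Python sketch below illustrates that reading only; the choice of bit 0, the "set" and "invert" operators, and the function names are assumptions, not details confirmed by the patent.

```python
def second_register_address(first, bit=0, op="set"):
    """Derive a second register address from the first.

    All bits are copied except `bit`, which passes through a one-input
    operator (both operator choices here are hypothetical):
      "set":    force the bit to 1, altering it only when it was 0
      "invert": flip the bit
    """
    current = (first >> bit) & 1
    if op == "set":
        new = 1
    elif op == "invert":
        new = current ^ 1
    else:
        raise ValueError(f"unknown operator: {op}")
    return (first & ~(1 << bit)) | (new << bit)


def read_pair(regfile, first):
    """Read the registers named by the first and the derived address."""
    return regfile[first], regfile[second_register_address(first)]


regfile = list(range(100, 132))  # 32 illustrative registers
print(read_pair(regfile, 6))
```

With the "set" operator an even first address yields the even/odd register pair, while an odd first address yields the same register twice, which shows how the bit is altered selectively depending on its original value.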
-
Patent number: 6785799
Abstract: A multiprocessor includes M banks storing a plurality of instructions; and N processors each having N instruction fetch stages, wherein each of the N processors processes one of the plurality of instructions in a pipelined manner, where N is an integer equal to or greater than 2, and M is an integer equal to or greater than N, wherein each of the N processors fetches one of the plurality of instructions at a different instruction fetch stage from instruction fetch stages used by other processors.
Type: Grant
Filed: February 25, 2000
Date of Patent: August 31, 2004
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Masayuki Yamasaki, Katsuhiko Ueda
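The requirement that each processor fetch at a different instruction fetch stage from the others can be illustrated with a simple rotating schedule in Python. The rotation rule `(cycle + processor) % n` and the function names are assumptions made for this sketch; the abstract does not say how stages are assigned.

```python
def fetch_stage(processor, cycle, n):
    """Fetch stage occupied by `processor` at `cycle` (assumed rotation)."""
    return (cycle + processor) % n


def fetch_schedule(n, cycles):
    """Rows are cycles; each row lists the fetch stage of every processor."""
    return [[fetch_stage(p, c, n) for p in range(n)] for c in range(cycles)]


for cycle, row in enumerate(fetch_schedule(4, 4)):
    # Within one cycle every processor is in a distinct fetch stage.
    assert len(set(row)) == len(row)
    print(cycle, row)
```

If each fetch stage were tied to its own instruction bank (a further assumption, consistent with M being at least N), a schedule of this kind would keep two processors from accessing the same bank in the same cycle.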
-
Patent number: 6785800Abstract: A SIMD processor includes plural processor elements (PEs) each having a processing unit for data processing, a register for holding data to be processed or already processed by the processing unit, a data transfer bus interconnecting with other PEs, and a register controller for inputting a read or write signal to the register. Read and write processing steps in the processor are carried out by the register controller in response to the signals which are sent from the register controller and inputted into a register of specific processor elements responding to an addressing signal from an external interface. The processor is capable of transferring data directly to a specific processor element, thereby achieving higher speeds of data transfer and resultant data processing and makes flexible use of registers to thereby attain efficient data processing utilizing arbitrary combinations of the register depending on bit width of the data.Type: GrantFiled: September 8, 2000Date of Patent: August 31, 2004Assignee: Ricoh Company, Ltd.Inventors: Shin-ichi Yamaura, Kazuhiko Hara, Takao Katayama, Kazuhiko Iwanaga, Hiroshi Takafuji
-
Patent number: 6782463Abstract: Disclosed is a device comprising a core processing circuit coupled to a single memory array which is partitioned into at least a first portion as a cache memory of the core processing circuit, and a second portion as a memory accessible by the one or more data transmission devices through a data bus independently of the core processing circuit.Type: GrantFiled: September 14, 2001Date of Patent: August 24, 2004Assignee: Intel CorporationInventors: Mark A. Schmisseur, Jeff McCoskey, Timothy J. Jehl
-
Patent number: 6775766Abstract: A ManArray processor pipeline design addresses an indirect VLIW memory access problem without increasing branch latency by providing a dynamically reconfigurable instruction pipeline for SIWs requiring a VLIW to be fetched. By introducing an additional cycle in the pipeline only when a VLIW fetch is required, the present invention solves the VLIW memory access problem. The pipeline stays in an expanded state, in general, until a branch type or load VLIW memory type operation is detected returning the pipeline to a compressed pipeline operation. By compressing the pipeline when a branch type operation is detected, the need for an additional cycle for the branch operation is avoided. Consequently, the shorter compressed pipeline provides more efficient performance for branch intensive control code as compared to a fixed pipeline with an expanded number of pipeline stages.Type: GrantFiled: February 28, 2001Date of Patent: August 10, 2004Assignee: PTS CorporationInventors: Juan Guillermo Revilla, Edwin F. Barry, Patrick Rene Marchand, Gerald G. Pechanek
-
Patent number: 6772368Abstract: In one embodiment a multiprocessing apparatus includes a first processor and a second processor. Each of the processors has its own data and instruction caches to support independent operation. In a normal mode the processors independently execute separate instruction streams. Each of the processors has a respective signature generator. The system also includes a compare unit coupled to the signature generators. In a high reliability mode, both processors execute the same instruction stream. That is, each processor computes a version of a result for ones of the instructions in the stream. Responsive to the respective versions, the respective signature generators assert signatures to the compare unit, so that a faulting instruction may be detected. In another aspect, each processor has its own respective commit logic.Type: GrantFiled: December 11, 2000Date of Patent: August 3, 2004Assignee: International Business Machines CorporationInventors: Sang Hoo Dhong, Harm Peter Hofstee, Ravi Nair, Steven Douglas Posluszny
-
Patent number: 6766437Abstract: Instruction and data registers of processors of a multiprocessing computing system are joined and forked to allow processing in multiple modes of operation. When joined, the registers of the processors each contain a same piece of information, hence generating single instruction and data streams. In contrast, when forked, the registers of the processors contain different pieces of information, thereby generating multiple instruction and data streams. Additionally, information may be stored into partitions of memory and fetched and broadcast by processors local to the particular memory sections thereby resulting in a faster cycle time.Type: GrantFiled: February 28, 2000Date of Patent: July 20, 2004Assignee: International Business Machines CorporationInventors: Anthony S. Coscarella, Joseph L. Temple, III
-
Patent number: 6728871Abstract: A cascadable arithmetic and logic unit (ALU) which is configurable in function and interconnection. No decoding of commands is needed during execution of the algorithm. The ALU can be reconfigured at run time without any effect on surrounding ALUs, processing units or data streams. The volume of configuration data is very small, which has positive effects on the space required and the configuration speed. Broadcasting is supported through the internal bus systems in order to distribute large volumes of data rapidly and efficiently. The ALU is equipped with a power-saving mode to shut down power consumption completely. There is also a clock rate divider which makes it possible to operate the ALU at a slower clock rate. Special mechanisms are available for feedback on the internal states to the external controllers.Type: GrantFiled: June 9, 1999Date of Patent: April 27, 2004Assignee: PACT XPP Technologies AGInventors: Martin Vorbach, Robert Münch
-
Patent number: 6684318Abstract: A programmable integrated circuit utilizes a large number of intermediate-grain processing elements which are multibit processing units arranged in a configurable mesh. The coarse-grain resources, such as memory and processing, are deployable in a way that takes advantage of the opportunities for optimization present in given problems. To accomplish this, the interconnect supports three different modes of operation: a static value in which a value set by the configuration data is provided to a functional unit, static source in which another functional unit serves as the value source, and a dynamic source mode in which the source is determined by the value from another functional unit.Type: GrantFiled: November 12, 2002Date of Patent: January 27, 2004Assignee: Massachusetts Institute of TechnologyInventors: André DeHon, Ethan Mirsky, Thomas F. Knight, Jr.
-
Patent number: 6643763Abstract: Method, system and program storage device are provided for implementing a register pipe between processing engines of a multiprocessor computing system. A register pipe includes at least one first register of a first processing engine and at least one second register of a second processing engine. Data is transferred between the first processing engine and the second processing engine through the register pipe without passing through memory. In one embodiment, general purpose registers within the first processing engine and within the second processing engine are employed to implement the register pipe. A control mechanism is provided within each processing engine to dynamically enable or disable the register pipe coupling any two processing engines of the multiprocessor computer system. A technique for broadcasting to multiple register pipes and for implementing barrier synchronization using a register pipe addressed to a processing engine itself are also provided.Type: GrantFiled: February 28, 2000Date of Patent: November 4, 2003Assignee: International Business Machines CorporationInventors: William J. Starke, Joseph L. Temple, III
-
Patent number: 6618698Abstract: Clusters of processors are interconnected as an emulation engine such that processors share input and data stacks, and the setup and storing of results are done in parallel, but the output of one evaluation unit is connected to the input of the next evaluation unit. A set of ‘cascade’ connections provides access to the intermediate values. By tapping intermediate values from one processor, and feeding them to the next, a significant emulation speedup is achieved.Type: GrantFiled: August 12, 1999Date of Patent: September 9, 2003Assignee: Quickturn Design Systems, Inc.Inventors: William F. Beausoleil, Tak-kwong Ng, Helmut Roth, Peter Tannenbaum, N. James Tomassetti
-
Patent number: 6606699Abstract: An apparatus for concurrently executing controller single instruction single data (SISD) instructions and single instruction multiple data (SIMD) processing element instructions comprising a combined controller and processing element. At least first and second simplex instructions each comprise a mode of operation bit, said mode of operation bit in the first simplex instruction specifying a controller SISD operation for execution by the controller, and the mode of operation bit in the second simplex instruction specifying a processing element SIMD operation for execution by the processing element. A very long instruction word (VLIW) contains said at least first and second simplex instructions.Type: GrantFiled: February 14, 2001Date of Patent: August 12, 2003Assignee: Bops, Inc.Inventors: Gerald G. Pechanek, Juan G. Revilla
-
Patent number: 6601157Abstract: There is disclosed a technique for accessing a register file which comprises defining a first register address as a plurality of bits and using said first register address to access said register file generating a second register address by using a sequence of said plurality of bits with at least one of said plurality of bits supplied via a unitary operator, the unitary operator being effective to selectively alter the logical value of said bit depending on its logical value in the first register address, and using said second register address to access said register file. A computer system for carrying out such a technique is also enclosed.Type: GrantFiled: November 1, 2000Date of Patent: July 29, 2003Assignee: Broadcom CorporationInventors: Mark Taunton, Sophie Wilson, Timothy Martin Dobson
-
Patent number: 6581152Abstract: An indirect VLIW (iVLIW) architecture is described which contains a minimum of two instruction memories. The first instruction memory (SIM) contains short-instruction-words (SIWs) of a fixed length. The second instruction memory (VIM) contains very-long-instruction-words (VLIWs) which allow execution of multiple instructions in parallel. Each SIW may be fetched and executed as an independent instruction by one of the available execution units. A special class of SIW is used to reference the VIM indirectly to either execute or load a specified VLIW instruction (called an “XV” instruction for “eXecute VLIW”, or LV for “Load VLIW”). In these cases, the SIW instruction specifies how the location of the VLIW is to be accessed. Other aspects of this approach relate to the application of data memory addressing techniques for execution or loading of VLIWs that parallel the addressing modes used for data memory accesses.Type: GrantFiled: February 11, 2002Date of Patent: June 17, 2003Assignee: BOPS, Inc.Inventors: Edwin F. Barry, Gerald G. Pechanek
-
Patent number: 6553479Abstract: A method and apparatus for providing local control of processing elements in a network of multiple context processing elements are provided. A multiple context processing element is configured to store a number of configuration memory contexts. This multiple context processing element maintains data of a current configuration. State information is received from at least one other multiple context processing element. At least one configuration control signal is generated in response to the state information and the data of a current configuration. One of multiple configuration memory contexts is selected in response to the configuration control signal, the selected configuration memory context controlling the multiple context processing element. Each multiple context processing element in the networked array of multiple context processing elements has an assigned physical and virtual identification.Type: GrantFiled: July 31, 2002Date of Patent: April 22, 2003Assignee: Broadcom CorporationInventors: Ethan Mirsky, Robert French, Ian Eslick
-
Publication number: 20030074542Abstract: A multiprocessor system capable of responding to various types of processing to improve the processing efficiency of the entire system. Each of a plurality of processors holds information indicating the program control mode, a VLIW mode or a multithread mode, in a program synchronization flag of a program controller. A master processor, responsible for program control of the entire system, notifies an instruction memory section for storing instructions in a program of updated information when the program synchronization flag information is updated.Type: ApplicationFiled: August 29, 2002Publication date: April 17, 2003Applicant: Matsushita Electric Industrial Co., Ltd.Inventor: Yukihiro Sasagawa
-
Publication number: 20030056083Abstract: An apparatus, program product, and method utilize routine cloning to optimize the performance of a compiled computer program. Within a compiled representation of a computer program, an implementation of a called routine is generated that has the same external response as the original routine (i.e., has the same output or result in response to the same input), but is modified from the original routine to calculate the result of an expression, which was originally provided as an input parameter to the routine, within the body of the routine. In addition, the signature of the new implementation of the routine is modified to accept, in lieu of the input parameter that originally received the result of the expression, one or more input parameters representative of the argument(s) to be operated upon by the expression.Type: ApplicationFiled: September 20, 2001Publication date: March 20, 2003Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Cary Lee Bates, John Matthew Santosuosso, William Jon Schmidt
-
Patent number: 6526461Abstract: A method and apparatus for interconnecting multiple programmable logic devices. In a preferred embodiment of the invention, an interconnect chip couples one programmable logic device to another programmable logic device. The interface between devices takes place within the interconnect chip, which can be configured using available routing software, thereby sparing the user the task of routing the connections between devices on the board.Type: GrantFiled: July 17, 1997Date of Patent: February 25, 2003Assignee: Altera CorporationInventor: Richard G Cliff