Array Processor Element Interconnection Patents (Class 712/11)
-
Patent number: 7899962
Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).
Type: Grant
Filed: December 3, 2009
Date of Patent: March 1, 2011
Inventors: Martin Vorbach, Robert Münch
-
Patent number: 7889725
Abstract: A computer cluster arranged at a lattice point in a lattice-like interconnection network contains four nodes and an internal communication network. Two nodes can transmit packets to adjacent computer clusters located along the X direction, and the two other nodes can transmit packets to adjacent computer clusters located along the Y direction. Each node directly transmits a packet to an adjacent computer cluster in the direction in which the node can transmit packets, when the destination of the packet is located in the direction. When the destination of a packet to be transmitted from a node is not located in the direction in which the receiving node can transmit packets, the node transfers the packet to one of the other nodes through the internal communication network for transmitting the packet to the destination of the packet through the one of the other nodes.
Type: Grant
Filed: March 27, 2007
Date of Patent: February 15, 2011
Assignee: Fujitsu Limited
Inventor: Yuichiro Ajima
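The forwarding rule in this abstract can be illustrated with a small sketch. It is not the patented implementation: the abstract only says that two nodes serve the X direction and two serve Y, so the one-node-per-signed-direction roles (`x+`, `x-`, `y+`, `y-`), the X-before-Y ordering, and the function names `needed_direction` and `route` are all assumptions made for illustration.

```python
# Sketch of the direction-based forwarding rule described in the abstract.
# Assumptions (not from the patent text): one node per signed direction,
# X is resolved before Y, and a packet arrives at the same-role node
# of the neighbouring cluster.
STEP = {"x+": (1, 0), "x-": (-1, 0), "y+": (0, 1), "y-": (0, -1)}


def needed_direction(cluster, dest):
    """Direction a packet must travel next, or None if it has arrived."""
    (cx, cy), (dx, dy) = cluster, dest
    if dx != cx:
        return "x+" if dx > cx else "x-"
    if dy != cy:
        return "y+" if dy > cy else "y-"
    return None


def route(cluster, node, dest):
    """Return the list of (action, cluster, node) steps taken by a packet."""
    trace = []
    while True:
        need = needed_direction(cluster, dest)
        if need is None:
            trace.append(("deliver", cluster, node))
            return trace
        if need != node:
            # This node cannot send in the needed direction: hand the packet
            # to a sibling node over the cluster's internal network.
            node = need
            trace.append(("internal", cluster, node))
        dx, dy = STEP[node]
        cluster = (cluster[0] + dx, cluster[1] + dy)
        trace.append(("forward", cluster, node))
```

Calling `route((0, 0), "x+", (2, 1))` forwards twice along X, makes one internal hand-off to the `y+` node, forwards once along Y, and then delivers.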
-
Patent number: 7886128
Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.
Type: Grant
Filed: June 3, 2009
Date of Patent: February 8, 2011
Inventor: Gerald George Pechanek
-
Patent number: 7865695
Abstract: An integrated circuit in communication with a host circuit includes an interconnect bus and a plurality of programmable elements. Each of the programmable elements includes a control interface for receiving a control signal, the control signal causing the memory element to selectively operate in one of a plurality of modes. In a first mode, the memory element communicates stored data to the output port upon receiving the control signal; in a second mode the memory element communicates stored data to the output port upon detecting valid data at the input port; in a third mode the memory element stores a first data value consisting of at least a portion of a single data word received at the input port; and in a fourth mode the memory element stores a second data value consisting of at least a portion of each of two separate input values received at the input port. Each programmable element may write data to and read data from a memory element of any of the other programmable elements.
Type: Grant
Filed: April 19, 2007
Date of Patent: January 4, 2011
Assignee: L3 Communications Integrated Systems, L.P.
Inventors: Jerry William Yancey, Yea Zong Kuo
-
Patent number: 7865694
Abstract: A multi-layer silicon stack architecture includes one or more processing layers including one or more computing elements; one or more networking layers disposed between the processing layers, the network layer includes one or more networking elements, wherein each computing element includes a plurality of network connections to adjacently disposed networking elements.
Type: Grant
Filed: May 12, 2006
Date of Patent: January 4, 2011
Assignee: International Business Machines Corporation
Inventors: Kerry Bernstein, Timothy J. Dalton, Marc R. Faucher, Peter A. Sandon
-
Patent number: 7840779
Abstract: Methods, apparatus, and products are disclosed for line-plane broadcasting in a data communications network of a parallel computer, the parallel computer comprising a plurality of compute nodes connected together through the network, the network optimized for point to point data communications and characterized by at least a first dimension, a second dimension, and a third dimension, that include: initiating, by a broadcasting compute node, a broadcast operation, including sending a message to all of the compute nodes along an axis of the first dimension for the network; sending, by each compute node along the axis of the first dimension, the message to all of the compute nodes along an axis of the second dimension for the network; and sending, by each compute node along the axis of the second dimension, the message to all of the compute nodes along an axis of the third dimension for the network.
Type: Grant
Filed: August 22, 2007
Date of Patent: November 23, 2010
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Jeremy E. Berg, Michael A. Blocksome, Brian E. Smith
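The three broadcast phases (line, then plane, then volume) can be simulated in a few lines. This is a minimal sketch under stated assumptions: nodes are modelled as coordinate tuples in a simple 3D grid, each send is counted as one point-to-point message, and the names `line_plane_broadcast` and `all_nodes` are hypothetical rather than taken from the patent.

```python
from itertools import product


def all_nodes(dims):
    """Every coordinate tuple in a grid with the given (X, Y, Z) sizes."""
    return set(product(*(range(d) for d in dims)))


def line_plane_broadcast(dims, root):
    """Simulate the three broadcast phases; return (nodes reached, messages sent)."""
    have = {root}
    sent = 0
    for axis in range(3):  # first, second, then third dimension
        new = set()
        for node in have:
            # Every node that already holds the message sends it to all
            # other nodes on its own line along the current axis.
            for coord in range(dims[axis]):
                if coord == node[axis]:
                    continue
                target = node[:axis] + (coord,) + node[axis + 1:]
                new.add(target)
                sent += 1
        have |= new
    return have, sent
```

In this model every node other than the root receives the message exactly once, so a grid of X by Y by Z nodes needs X·Y·Z − 1 sends in total.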
-
Patent number: 7836276
Abstract: A SIMD processor efficiently utilizes its hardware resources to achieve higher data processing throughput. The effective width of a SIMD processor is extended by clocking the instruction processing side of the SIMD processor at a fraction of the rate of the data processing side and by providing multiple execution pipelines, each with multiple data paths. As a result, higher data processing throughput is achieved while an instruction is fetched and issued once per clock. This configuration also allows a large group of threads to be clustered and executed together through the SIMD processor so that greater memory efficiency can be achieved for certain types of operations like texture memory accesses performed in connection with graphics processing.
Type: Grant
Filed: December 2, 2005
Date of Patent: November 16, 2010
Assignee: NVIDIA Corporation
Inventors: Brett W. Coon, John Erik Lindholm
-
Publication number: 20100287318
Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).
Type: Application
Filed: July 21, 2010
Publication date: November 11, 2010
Inventors: Martin Vorbach, Robert Münch
-
Patent number: 7830872
Abstract: To provide a signal processing section of a software radio device or the like which can dynamically change connection itself of an internal function structure at the time of execution. A switching module ISM1(2) or the like selects and uses one of the plurality of the routing tables (60) or the like prepared according to the signal processing and executes routing control to respective processing modules a11 or the like based on the input data packet. The processing module a11 or the like executes each processing by using a parameter table or the like indicating the processing to be performed in accordance with the data packet.
Type: Grant
Filed: June 3, 2005
Date of Patent: November 9, 2010
Assignees: Toyota Infotechnology Center Co., Ltd., National Institute of Information and Communications Technology
Inventors: Akihisa Yokoyama, Hiroshi Harada, Hitoshi Inoue, Makoto Honda
-
Patent number: 7831801
Abstract: A direct memory access (“DMA”)-based multi-processor array architecture that may be implemented in a single integrated circuit is described. The integrated circuit includes a plurality of processing units. A first processing unit and a second processing unit of the plurality of processing units are topologically coupled via a first DMA block. The first DMA block includes a first dual-ported random access memory and a first decoder. A multiple-processor array is provided by topologically coupling the first processing unit and the second processing unit via the first direct memory access block.
Type: Grant
Filed: August 30, 2006
Date of Patent: November 9, 2010
Assignee: Xilinx, Inc.
Inventor: James Bryan Anderson
-
Publication number: 20100274975
Abstract: In one embodiment, link logic of a multi-chip processor (MCP) formed using multiple processors may interface with a first point-to-point (PtP) link coupled between the MCP and an off-package agent and another PtP link coupled between first and second processors of the MCP, where the on-package PtP link operates at a greater bandwidth than the first PtP link. Other embodiments are described and claimed.
Type: Application
Filed: April 27, 2009
Publication date: October 28, 2010
Inventors: Krishnakanth Sistla, Ganapati Srinivasa
-
Patent number: 7822889
Abstract: A mechanism is provided for transmitting data in a data network. A first processor of the data network receives data to be transmitted to a second processor within the data network. A determination is made if the data has previously been routed through an indirect communication link from a source processor, the indirect communication link being a communication link that does not directly couple the source processor to a final destination processor which is to receive the data. A communication link is selected over which to transmit the data from the first processor to the second processor based on results of determining if the data has previously been routed through an indirect communication link. Finally, the data is transmitted from the first processor to the second processor using the selected communication link.
Type: Grant
Filed: August 27, 2007
Date of Patent: October 26, 2010
Assignee: International Business Machines Corporation
Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony
-
Patent number: 7805546
Abstract: Methods, systems, and products are disclosed for chaining DMA data transfer operations for compute nodes in a parallel computer that include: receiving, by an origin DMA engine on an origin node in an origin injection FIFO buffer for the origin DMA engine, a RGET data descriptor specifying a DMA transfer operation data descriptor on the origin node and a second RGET data descriptor on the origin node, the second RGET data descriptor specifying a target RGET data descriptor on the target node, the target RGET data descriptor specifying an additional DMA transfer operation data descriptor on the origin node; creating, by the origin DMA engine, an RGET packet in dependence upon the RGET data descriptor, the RGET packet containing the DMA transfer operation data descriptor and the second RGET data descriptor; and transferring, by the origin DMA engine to a target DMA engine on the target node, the RGET packet.
Type: Grant
Filed: July 27, 2007
Date of Patent: September 28, 2010
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Michael A. Blocksome
-
Patent number: 7796166
Abstract: A hand-held digital camera device includes a programmable processing circuitry incorporating a four-way parallel VLIW vector processor; an image sensor interface connected to the programmable processing circuitry and configured to receive signals from an image sensor and pass data representing the signals to the programmable processing circuitry; and a printhead interface connected to the programmable processing circuitry and configured to receive data from the programmable processing circuitry and generate control signals to be received by a printhead of a printing mechanism. The programmable processing circuitry, the image sensor interface and the printhead interface all form part of CMOS integrated circuitry provided on a common wafer. An instruction set of the VLIW vector processor is tuned for image manipulation processing.
Type: Grant
Filed: August 13, 2009
Date of Patent: September 14, 2010
Assignee: Silverbrook Research Pty Ltd
Inventor: Kia Silverbrook
-
Publication number: 20100229020
Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.
Type: Application
Filed: May 17, 2010
Publication date: September 9, 2010
Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
-
Patent number: 7788465
Abstract: A processing system according to the invention comprises a plurality of processing elements (PE1, . . . , PE7). The processing elements comprise a controller and computation means. The plurality of processing elements is dynamically reconfigurable as mutually independently operating task units (TU1, TU2, TU3), which task units comprise one processing element (PE7) or a cluster of two or more processing elements (PE3, PE4, PE5, PE6). The processing elements within a cluster are arranged to execute instructions under a common thread of program control. In this way the processing system is capable of using the same sub-set of data-path elements to exploit instruction level parallelism or task level parallelism or a combination thereof, dependent on the application.
Type: Grant
Filed: December 4, 2003
Date of Patent: August 31, 2010
Assignee: Silicon Hive B.V.
Inventors: Orlando Miguel Pires Dos Reis Moreira, Alexander Augusteijn, Bernardo De Oliveira Kastrup Pereira, Wim Feike Dominicus Yedema, Paul Ferenc Hoogendijk, Willem Charles Mallon
-
Patent number: 7788467
Abstract: Methods and apparatus provide for a multiprocessor system including: a plurality of sub-processors operatively coupled to one another over a ring bus, whereby data may be transmitted over one or more paths on the ring bus between pairs of the sub-processors; and a plurality of programmable delay circuits, each associated with at least one of the sub-processors, and each being operable to alter a delay of data transfer at least one of into and out of its associated sub-processor in order to alter one or more latencies associated with the paths on the ring bus between pairs of the sub-processors.
Type: Grant
Filed: May 9, 2007
Date of Patent: August 31, 2010
Assignee: Sony Computer Entertainment Inc.
Inventor: Akiyuki Hatakeyama
-
Patent number: 7788466
Abstract: A plurality of digital signal processors (10), each contains a signal processing core (22), a memory (20) coupled to the processing core (22) and a multiplexed data input (16) coupled to the memory (20). Each digital signal processor has a plurality of outputs for outputting data from the signal processing core (22). A remote write only structure (14a-d) couples outputs of respective groups of the digital signal processors (10) each to the multiplexed data input (16) of respective particular digital signal processor (10), the respective group for the particular digital signal processor (10) not including the particular digital signal processor (10). Thus, each processor (10) writes data for other processors directly from the processor, without storing the data in memory first for handling by an I/O processor, and reads data from other processors (10) via memory, where it is received via an input that does not share resources with the output of the processor (10).
Type: Grant
Filed: September 3, 2004
Date of Patent: August 31, 2010
Assignee: NXP B.V.
Inventors: Henricus Hubertus Van Den Berg, Evert-Jan Daniël Pol
-
Patent number: 7783861
Abstract: When an instruction code “MVLR” is sent from a control processor in a PE having a mask register MR in operation setting, when the direction register F is ON, if a counter and transfer result storing buffer T is ≥M, a value of T−M is stored in buffer T, and if T is less than M, content of a first transport register L of a PE whose PE number counted from the left inside a PE block is T, is selected by a first selector and stored in buffer T and the mask register is set to non-operation. When the direction register is OFF, if T is ≤−M, a value of T+M is stored in buffer T, and if T is greater than −M, content of R of a PE whose PE number is −T, counted from the right inside the PE block, is selected by a second selector and stored in buffer T, and MR is set to non-operation. Entire PEs transfer content of L and R to M-adjacent left and right PEs, and data transferred from M-adjacent right and M-adjacent left PEs are stored in L and R respectively.
Type: Grant
Filed: February 27, 2007
Date of Patent: August 24, 2010
Assignee: NEC Corporation
Inventor: Shorin Kyo
-
Patent number: 7779229
Abstract: A processor arrangement having a strip structure for parallel data processing is configured so that local data from the individual processing units or strips is brought together in a rapid manner. Input data, intermediate data and/or output data from various processing units are linked together in an operation which is at least partially combinatory. The data linking operation is not clock controlled. The linking of the local data from various strips in this manner reduces delays in parallel data processing in the processor arrangement. The combinatory data linking operation can provide an overall data linking outcome within an individual clock cycle.
Type: Grant
Filed: February 12, 2003
Date of Patent: August 17, 2010
Assignee: NXP B.V.
Inventor: Wolfram Drescher
-
Patent number: 7779177
Abstract: A reconfigurable multi-processor computing system including a plurality of configurable processing elements each having a plurality of integrated high-speed serial input/output ports. Interconnects link the plurality of processing elements, wherein at least one of the integrated high-speed serial input/output ports of each processing element is connected by at least one interconnect to at least one of the integrated high-speed serial input/output ports of each other processing element, thereby creating a full mesh network. The full mesh network is located on a processor card, multiples of which may be grouped in a shelf having a backplane card with a shelf controller card for providing cross-connects between processor cards. Multiple shelves may be interconnected to form a large computer system.
Type: Grant
Filed: August 2, 2005
Date of Patent: August 17, 2010
Assignee: Arches Computing Systems
Inventor: Paul Chow
-
Patent number: 7774579
Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor, and a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles. The tile is configured to control access to a resource of the tile based on access information associated with the resource.
Type: Grant
Filed: April 14, 2006
Date of Patent: August 10, 2010
Assignee: Tilera Corporation
Inventors: David Wentzlaff, Anant Agarwal
-
Publication number: 20100191911
Abstract: An integrated circuit having an array of programmable processing elements and a memory interface linked by an on-chip communication network. Each processing element includes a plurality of processing cores and a local memory. The memory interface block is operably coupled to external memory and to the on-chip communication network. The memory interface supports accessing the external memory in response to messages communicated from the processing elements of the array over the on-chip communication network. A portion of the local memory for a plurality of the processing elements of the array as well as a portion of the external memory are both allocated to store data shared by a plurality of processing elements of the array during execution of programmed operations distributed thereon.
Type: Application
Filed: December 16, 2009
Publication date: July 29, 2010
Inventors: Marco Heddes, Massimo Ravasi, Rakesh Kumar Malik, Timothy M. Shanley, Michael Singngee Yeo
-
Patent number: 7765338
Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
Type: Grant
Filed: July 9, 2007
Date of Patent: July 27, 2010
Assignee: Altera Corporation
Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
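Bit-reversed addressing, mentioned in this abstract, is the standard index permutation used by radix-2 FFTs. The sketch below shows only that permutation in software; it does not model the DMA unit or the distribution across PEs, and the function names are chosen here for illustration.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(samples):
    """Return `samples` permuted into bit-reversed index order (radix-2 case)."""
    n = len(samples)
    bits = n.bit_length() - 1
    if n == 0 or 1 << bits != n:
        raise ValueError("length must be a power of two")
    return [samples[bit_reverse(i, bits)] for i in range(n)]
```

For eight samples the resulting index order is 0, 4, 2, 6, 1, 5, 3, 7. Radix-4 and radix-8 variants reverse digits rather than single bits and are not covered here.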
-
Patent number: 7765382
Abstract: A semiconductor device includes a plurality of processing clusters that operate synchronously internally and arranged in a M×N matrix. Each processing cluster is formed as a plurality of processing elements and clocked buses that interconnect the processing elements within each processing cluster. A self-synchronous cluster wrapper is operative with the processing elements such that each processing cluster forms a programmable module. Self-synchronous global and local buses interconnect the processing clusters for communicating externally. An input/output circuit interconnects the global and local buses.
Type: Grant
Filed: April 4, 2007
Date of Patent: July 27, 2010
Assignee: Harris Corporation
Inventor: David B. Chester
-
Patent number: 7761687
Abstract: A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
Type: Grant
Filed: June 26, 2007
Date of Patent: July 20, 2010
Assignee: International Business Machines Corporation
Inventors: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken
-
Publication number: 20100174883
Abstract: A processor includes a compute array comprising a first plurality of compute engines serially connected along a data flow path such that data flows between successive compute engines at successive times. The first plurality of compute engines includes an initial compute engine and a final compute engine. The data flow path includes a recirculation path connecting the final compute engine to the initial compute engine with no compute engine therebetween.
Type: Application
Filed: February 5, 2010
Publication date: July 8, 2010
Inventors: Boris Lerner, Douglas Garde
-
Publication number: 20100169896
Abstract: An electronic device is provided which comprises a plurality of processing units (IP; IP1-IP4), an interconnect (IPCU; N) for coupling the processing units (IP; IP1-IP4) to enable a communication between the processing units (IP; IP1-IP4) and at least one event monitor (EM) for detecting events in the communication in the electronic device. The electronic device furthermore comprises a first controller unit for controlling the interconnect (IPCU; N) according to one or more of the events detected by the at least one event monitor (EM).
Type: Application
Filed: August 7, 2007
Publication date: July 1, 2010
Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.
Inventors: Martinus Theodorus Bennebroek, Kees Gerard Willem Goossens, Hubertus Gerardus Hendrikus Vermeulen
-
Publication number: 20100158076
Abstract: A method and apparatus for correlation of a received DSSS signal with a PN sequence, thus significantly reducing the processing time and operating power needed to acquire phase information for DSSS de-spreading and demodulation. The apparatus utilizes a multiprocessor array 10. In one embodiment, multiple processors 15 are located on a single-die 25, connected by single drop busses 20 to form low-operating-power apparatus. The method provides for fast sequential correlation of a received digital signal. In an alternate embodiment, the present invention is a single-die, low-operating-power apparatus and method for fast parallel correlation of a received digital signal. In yet another alternate embodiment, the present invention is a single-die, low-operating-power apparatus and method for fast correlation of a received digital signal using a hybrid of parallel and sequential methods.
Type: Application
Filed: December 19, 2008
Publication date: June 24, 2010
Applicant: VNS PORTFOLIO LLC
Inventors: Les O. Snively, Paul Michael Ebert
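Correlating a received chip stream against a PN sequence to recover its phase can be sketched as a cyclic search over all candidate shifts. This is a plain sequential illustration, not the multiprocessor method claimed in the application; it assumes noise-free chips valued +1 or −1, one sample per chip, and a received block exactly one PN period long. The function names are hypothetical.

```python
def correlate(received, pn, shift):
    """Cyclic correlation of `received` with `pn` delayed by `shift` chips."""
    n = len(pn)
    return sum(received[i] * pn[(i - shift) % n] for i in range(n))


def find_phase(received, pn):
    """Try every cyclic shift and return (best shift, correlation peak)."""
    scores = [correlate(received, pn, s) for s in range(len(pn))]
    best = max(range(len(pn)), key=scores.__getitem__)
    return best, scores[best]
```

Each candidate shift is scored independently of the others, which is why the search can be run sequentially, in parallel, or as a mix of the two.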
-
Publication number: 20100161938
Abstract: An integrated circuit having an array of programmable processing elements linked by an on-chip communication network. Each processing element includes a plurality of processing cores, a local memory, and thread scheduling means for scheduling execution of threads on the processing cores of the given processing element. The thread scheduling means assigns threads to the processing cores of the given processing element in a configurable manner. The configuration of the thread scheduling means defines one or more logical symmetric multiprocessors for executing threads on the given processing element. A logical symmetric multiprocessor is realized by a defined set of processing cores assigned to a group of threads executing on the given processing element.
Type: Application
Filed: December 23, 2008
Publication date: June 24, 2010
Inventors: Marco Heddes, Massimo Ravasi
-
Patent number: 7734894
Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor including a storage module, wherein the processor is configured to process multiple streams of instructions, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, and coupling circuitry configured to couple data resulting from processing an instruction from at least one of the streams of instructions to the storage module and to the switch.
Type: Grant
Filed: April 28, 2008
Date of Patent: June 8, 2010
Assignee: Tilera Corporation
Inventors: David Wentzlaff, Anant Agarwal
-
Patent number: 7734896
Abstract: A reconfigurable integrated circuit device which converts an arbitrary calculation state dynamically, based on configuration data, includes a plurality of processor elements, each of which has an input terminal, an output terminal, a plurality of arithmetic units which are provided in parallel and each of which performs calculation processing in synchronous with a clock signal, and an intra-processor network which connects them in an arbitrary state; and an inter-processor network which connects between processor elements in an arbitrary state. Based on configuration data, the intra-processor network is reconfigurable to a desired connection state, and further, based on the configuration data, the inter-processor network is reconfigurable to a desired connection state.
Type: Grant
Filed: March 28, 2006
Date of Patent: June 8, 2010
Assignee: Fujitsu Microelectronics Limited
Inventor: Hiroshi Furukawa
-
Patent number: 7725304
Abstract: A method and apparatus for coupling data between discrete processor-based emulation chips is described. The apparatus is a processor-based hardware emulation integrated circuits (chips) element comprising a plurality of discrete hardware emulation chips, each emulation chip coupled to another emulation chip by a crossbar for coupling data between the plurality of chips. The method comprises providing data to a crossbar from a first discrete emulation chip, selecting the data from the crossbar using a discrete second emulation chip, and storing the data in a data array in the second discrete emulation chip.
Type: Grant
Filed: May 22, 2006
Date of Patent: May 25, 2010
Assignee: Cadence Design Systems, Inc.
Inventors: William F. Beausoleil, Beshara G. Elmufdi
-
Publication number: 20100100703
Abstract: A system and a method for parallel computing for solving complex problems is envisaged. Particularly, hierarchical parallel computing system is envisaged by this invention, which is formed by multiple levels of groups, where each group consists of multiple processing elements. Each group of the parallel computing system models as processing element to its immediate upper layer. Thus, each processing element is hierarchically tagged to its immediate upper level, and a multi-level tier of groups are formed. In accordance with this invention, the parallel computing system operates by breaking any problem hierarchically, first across the groups and then within the groups. This hierarchical breakup of the problem helps in significantly improving the time required for processing a problem.
Type: Application
Filed: October 15, 2009
Publication date: April 22, 2010
Applicant: Computational Research Laboratories Ltd.
Inventors: Chandan Basu, Mandar Nadgir, Avinash Pandey
-
Patent number: 7694064
Abstract: In an embodiment, a multi-processor computer system includes multiple cells, where a cell may include one or more processors and memory resources. The system may further include a global crossbar network and multiple cell-to-global-crossbar connectors, to connect the multiple cells with the global crossbar network. In an embodiment, the system further includes at least one cell-to-cell connector, to directly connect at least one pair of the multiple cells. In another embodiment, the system further includes one or more local crossbar networks, multiple cell-to-local-crossbar connectors, and local input/output backplanes connected to the one or more local crossbar networks.
Type: Grant
Filed: December 29, 2004
Date of Patent: April 6, 2010
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Mark Shaw, Russ William Herrell, Stuart Allen Berke
-
Publication number: 20100082863
Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).
Type: Application
Filed: December 3, 2009
Publication date: April 1, 2010
Inventors: Martin Vorbach, Robert Münch
-
Patent number: 7676661
Abstract: A fast linked multiprocessor network including a plurality of processing modules implemented on a field programmable gate array and a plurality of configurable uni-directional links coupled among at least two of the plurality processing modules provide a streaming communication channel between at least two of the plurality of processing modules. Such configuration provides a function accelerator that can feed at least one processor with data values using one custom instruction to put data values on at least one uni-directional serial link and that can extract data values from at least one processor using one custom instruction to get data values from the at least one uni-directional serial link.
Type: Grant
Filed: October 5, 2004
Date of Patent: March 9, 2010
Assignee: Xilinx, Inc.
Inventors: Sundararajarao Mohan, Satish R. Ganesan, Goran Bilski
-
Patent number: 7661006Abstract: A computer implemented method, apparatus, and computer program product for managing symmetric multiprocessor interconnects. The process identifies functional communication connections between each processor in a plurality of processors on a multiprocessor to form identified functional communication connections. The process maps every functional communication connection between any two processors in the plurality of processors, based on the identified functional communication connections, to form an interconnect matrix. The process creates a path map using the interconnect matrix. The path map comprises a sequence of communication connections between the plurality of processors. The process initializes the plurality of processors using the path map.Type: GrantFiled: January 9, 2007Date of Patent: February 9, 2010Assignee: International Business Machines CorporationInventors: Luai A. Abou-Emara, Mark David McLaughlin, Jorge N. Yanez
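Illustrative note: the abstract above describes mapping functional connections into an interconnect matrix and then deriving a path map, but it does not say how the path map is computed. The following is a minimal sketch under that caveat, assuming a breadth-first walk from processor 0; the function names `build_interconnect_matrix` and `build_path_map` and the sample link list are hypothetical and not taken from the patent.

```python
from collections import deque


def build_interconnect_matrix(num_procs, functional_links):
    """Return a symmetric matrix; matrix[a][b] is True when a working link joins a and b."""
    matrix = [[False] * num_procs for _ in range(num_procs)]
    for a, b in functional_links:
        matrix[a][b] = True
        matrix[b][a] = True
    return matrix


def build_path_map(matrix, start=0):
    """Breadth-first walk from `start` (an assumed strategy).

    Returns the (from, to) connections in the order they are first used,
    which gives one possible initialization sequence.
    """
    visited = {start}
    queue = deque([start])
    path_map = []
    while queue:
        current = queue.popleft()
        for neighbor, linked in enumerate(matrix[current]):
            if linked and neighbor not in visited:
                visited.add(neighbor)
                path_map.append((current, neighbor))
                queue.append(neighbor)
    return path_map


# Hypothetical 4-processor system; the 2-3 link is assumed non-functional and is omitted.
links = [(0, 1), (1, 2), (0, 3)]
matrix = build_interconnect_matrix(4, links)
path_map = build_path_map(matrix)
```

In this sketch every processor reachable over functional links appears exactly once as the target of a connection, so the sequence could drive initialization one processor at a time.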
-
Patent number: 7650434Abstract: A system and method for enabling high-speed, low-latency global tree network communications among processing nodes interconnected according to a tree network structure. The global tree network enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the tree via links to facilitate performance of low-latency global processing operations at nodes of the virtual tree and sub-tree structures. The global operations performed include one or more of: broadcast operations downstream from a root node to leaf nodes of a virtual tree, reduction operations upstream from leaf nodes to the root node in the virtual tree, and point-to-point message passing from any node to the root node.Type: GrantFiled: February 25, 2002Date of Patent: January 19, 2010Assignee: International Business Machines CorporationInventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
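Illustrative note: the two collective operations named in the abstract above (reduction upstream from leaves to root, broadcast downstream from root to leaves) can be sketched in software as follows. This is a simplified model under stated assumptions, not the patented router hardware; the tree layout, the values, and the names `reduce_up` and `broadcast_down` are hypothetical.

```python
def reduce_up(tree, values, node, op):
    """Combine a node's value with the reduced values of its subtrees (leaf-to-root)."""
    result = values[node]
    for child in tree.get(node, []):
        result = op(result, reduce_up(tree, values, child, op))
    return result


def broadcast_down(tree, root, value):
    """Deliver `value` from the root to every node below it (root-to-leaf)."""
    received = {}
    stack = [root]
    while stack:
        node = stack.pop()
        received[node] = value
        stack.extend(tree.get(node, []))
    return received


# Hypothetical tree: node 0 is the root, nodes 2, 3 and 4 are leaves.
tree = {0: [1, 2], 1: [3, 4]}
values = {0: 5, 1: 1, 2: 2, 3: 3, 4: 4}

total = reduce_up(tree, values, 0, lambda a, b: a + b)
everyone = broadcast_down(tree, 0, total)
```

Chaining the two calls, as in the last two lines, models an all-reduce: the root obtains the combined value and then distributes it to every node.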
-
Patent number: 7650448Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).Type: GrantFiled: January 10, 2008Date of Patent: January 19, 2010Assignee: Pact XPP Technologies AGInventors: Martin Vorbach, Robert Münch
-
Patent number: 7650483Abstract: A data processing apparatus and method are provided for handling execution of instructions within a data processing apparatus having a plurality of processing units. Each processing unit is operable to execute a sequence of instructions so as to perform associated operations, and at least a subset of the processing units form a cluster. Instruction forwarding logic is provided which for at least one instruction executed by at least one of the processing units in the cluster causes that instruction to be executed by each of the other processing units in the cluster, for example by causing that instruction to be inserted into the sequences of instructions executed by each of the other processing units in the cluster.Type: GrantFiled: November 3, 2006Date of Patent: January 19, 2010Assignee: ARM LimitedInventors: Elodie Charra, Frederic Claude Marie Piry, Richard Roy Grisenthwaite, Mélanie Emanuelle Lucie Vincent, Norbert Bernard Eugéne Lataille, Jocelyn Francois Orion Jaubert, Stuart David Biles
-
Patent number: 7636835Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor, and a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles. The integrated circuit further comprises one or more interface modules including circuitry to transfer data to and from a device external to the tiles; and a sub-port routing network including circuitry to route data between a port of a switch and a plurality of sub-ports coupled to one or more interface modules.Type: GrantFiled: April 14, 2006Date of Patent: December 22, 2009Assignee: Tilera CorporationInventors: Carl G. Ramey, David Wentzlaff, Anant Agarwal
-
Patent number: 7624248Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises: a processor, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, according to a switch instruction indicating an input port to which each of multiple output ports of the switch is to be coupled, and a translation lookaside buffer coupled to the switch to translate virtual memory addresses of switch instructions to physical memory addresses of the switch instructions.Type: GrantFiled: April 14, 2006Date of Patent: November 24, 2009Assignee: Tilera CorporationInventors: David Wentzlaff, Anant Agarwal
-
Patent number: 7624250Abstract: The disclosure describes a processor having processor cores integrated on the same die that have different functional operationality. The processor also includes a chain of multiple dedicated unidirectional connections spanning processor cores. The multiple dedicated unidirectional connections terminate in registers within the respective processor cores. The registers may form a queue such as a ring queue.Type: GrantFiled: December 5, 2005Date of Patent: November 24, 2009Assignee: Intel CorporationInventors: Sinn Wee Lau, Choon Yee Loh, Kar Meng Chan
-
Patent number: 7602423Abstract: A monolithic integrated circuit includes programmable processing circuitry. An image sensor interface is connected to the processing circuitry and is configured to receive signals from an image sensor and to pass data representing the signals to the programmable processing circuitry. A printhead interface is connected to the processing circuitry and is configured to receive data from the processing circuitry and to generate control signals to be received by a printhead of a printing mechanism.Type: GrantFiled: October 14, 2003Date of Patent: October 13, 2009Assignee: Silverbrook Research Pty LtdInventor: Kia Silverbrook
-
Patent number: 7599998Abstract: A data processing apparatus comprises at least one source processor core, at least two destination processor cores, a message handler and a bus arrangement providing a data communication path between the source core, the destination cores and the message handler. The message handler has a plurality of message-handling modules. At least one of the message-handling modules has a message receipt indicator that is modifiable by each of the destination processor cores to indicate that a message has been received at its destination. This message-handling module also has a transmission completion detector operable to detect, in dependence upon a message receipt indicator value that a message has been received by all of the at least two destination processor cores and to initiate transmission of an acknowledgement signal to the source processor core.Type: GrantFiled: July 7, 2004Date of Patent: October 6, 2009Assignee: ARM LimitedInventors: Mark James Galbraith, Harry Samuel Thomas Fearnhamm, Nicholas Esca Smith, Bruce James Mathewson
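Illustrative note: the receipt-indicator behaviour described in the abstract above can be modelled roughly as below. This is a sketch assuming the indicator is a set of destinations that have not yet confirmed receipt; the class name `MessageHandlingModule`, the method `mark_received`, and the core IDs are hypothetical, and the actual hardware encoding is not given in the abstract.

```python
class MessageHandlingModule:
    """Tracks which destination cores still have to confirm receipt of a message."""

    def __init__(self, destination_ids):
        # Assumed representation of the message receipt indicator.
        self.pending = set(destination_ids)
        self.ack_sent = False

    def mark_received(self, core_id):
        """Called by a destination core; returns True when the acknowledgement fires.

        The acknowledgement to the source core is triggered once, at the moment
        the last outstanding destination confirms receipt.
        """
        self.pending.discard(core_id)
        if not self.pending and not self.ack_sent:
            self.ack_sent = True
            return True
        return False


module = MessageHandlingModule([1, 2])
first = module.mark_received(1)   # one destination still outstanding
second = module.mark_received(2)  # all destinations have confirmed
repeat = module.mark_received(2)  # duplicate confirmation, no second acknowledgement
```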
-
Patent number: 7595659Abstract: A logic cell array having a number of logic cells and a segmented bus system for logic cell communication, the bus system including different segment lines having shorter and longer segments for connecting two points in order to be able to minimize the number of bus elements traversed between separate communication start and end points.Type: GrantFiled: October 8, 2001Date of Patent: September 29, 2009Assignee: Pact XPP Technologies AGInventors: Martin Vorbach, Frank May, Dirk Reichardt, Frank Lier, Gerd Ehlers, Armin Nückel, Volker Baumgarte, Prashant Rao, Jens Oertel
-
Patent number: 7594060Abstract: Data buffering allocation in a microprocessor complex for a request of memory allocation is supported through a remote buffer batch allocation protocol. The separation of control and data placement allows simultaneous maximization of microprocessor complex load sharing, and minimization of inter-processor signaling/metadata migration. Separating processing control from data placement allows the location of data buffering to be chosen so as to maximize bus bandwidth utilization and achieve non-blocking switch behavior. This separation reduces the need for inter-processor communication and associated interrupts thus improving computation efficiency and performance.Type: GrantFiled: August 23, 2006Date of Patent: September 22, 2009Assignee: Sun Microsystems, Inc.Inventors: Andrew W. Wilson, John Acton, Charles Binford, Daniel R. Cassiday, Raymond J. Lanza
-
Patent number: 7590821Abstract: A digital signal processing integrated circuit contains an array of interconnected and programmed or programmable digital signal processors (10). Configurable multiplexing circuits (12) are placed between IO connections (11a,b) and the IO ports of at least a plurality of the digital signal processors (10). The multiplexing circuits (12) are configured under control of configuration data, so that the multiplexing circuits (12) give the effect of accessing the IO connection only to IO signals from the IO port or ports of one or ones of the respective plurality of digital signal processors (10) that are selected by the configuration data. Preferably, each digital signal processor (10) has its IO port coupled in common to a plurality of the multiplexing circuits (12) separately from the other digital signal processing circuits.Type: GrantFiled: January 31, 2005Date of Patent: September 15, 2009Assignee: NXP B.V.Inventors: Henricus Hubertus Van Den Berg, Harpreet Singh Bhullar, Pieter Voorthuijsen
-
Patent number: 7581079Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.Type: GrantFiled: March 26, 2006Date of Patent: August 25, 2009Inventor: Gerald George Pechanek