Array Processor Element Interconnection Patents (Class 712/11)
  • Patent number: 7899962
    Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).
    Type: Grant
    Filed: December 3, 2009
    Date of Patent: March 1, 2011
    Inventors: Martin Vorbach, Robert Münch
  • Patent number: 7889725
    Abstract: A computer cluster arranged at a lattice point in a lattice-like interconnection network contains four nodes and an internal communication network. Two nodes can transmit packets to adjacent computer clusters located along the X direction, and the two other nodes can transmit packets to adjacent computer clusters located along the Y direction. Each node directly transmits a packet to an adjacent computer cluster in the direction in which the node can transmit packets, when the destination of the packet is located in the direction. When the destination of a packet to be transmitted from a node is not located in the direction in which the receiving node can transmit packets, the node transfers the packet to one of the other nodes through the internal communication network for transmitting the packet to the destination of the packet through the one of the other nodes.
    Type: Grant
    Filed: March 27, 2007
    Date of Patent: February 15, 2011
    Assignee: Fujitsu Limited
    Inventor: Yuichiro Ajima
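The forwarding rule in the abstract of 7889725 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each node is labelled with one axis ("X" or "Y"), that a node can send in either direction along its axis, and that clusters are addressed by integer (x, y) coordinates. The function name `forward` and the returned tuples are hypothetical.

```python
def forward(node_axis, cluster, dest):
    """Return the action a node takes for a packet bound for cluster `dest`.

    node_axis: "X" or "Y", the axis this node can transmit along (assumed labelling).
    cluster, dest: (x, y) lattice coordinates of the current and destination clusters.
    """
    dx = dest[0] - cluster[0]
    dy = dest[1] - cluster[1]
    if dx == 0 and dy == 0:
        return ("deliver", cluster)
    if node_axis == "X" and dx != 0:
        # Destination lies along this node's axis: send to the adjacent cluster.
        step = 1 if dx > 0 else -1
        return ("send", (cluster[0] + step, cluster[1]))
    if node_axis == "Y" and dy != 0:
        step = 1 if dy > 0 else -1
        return ("send", (cluster[0], cluster[1] + step))
    # Destination is not along this node's axis: hand the packet to a node
    # of the other axis over the cluster's internal communication network.
    other = "Y" if node_axis == "X" else "X"
    return ("internal", other)
```

For example, an X node at cluster (2, 0) holding a packet for (2, 3) returns `("internal", "Y")`, and the Y node then sends it toward (2, 1).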
  • Patent number: 7886128
    Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.
    Type: Grant
    Filed: June 3, 2009
    Date of Patent: February 8, 2011
    Inventor: Gerald George Pechanek
  • Patent number: 7865695
    Abstract: An integrated circuit in communication with a host circuit includes an interconnect bus and a plurality of programmable elements. Each of the programmable elements includes a control interface for receiving a control signal, the control signal causing the memory element to selectively operate in one of a plurality of modes. In a first mode, the memory element communicates stored data to the output port upon receiving the control signal; in a second mode the memory element communicates stored data to the output port upon detecting valid data at the input port; in a third mode the memory element stores a first data value consisting of at least a portion of a single data word received at the input port; and in a fourth mode the memory element stores a second data value consisting of at least a portion of each of two separate input values received at the input port. Each programmable element may write data to and read data from a memory element of any of the other programmable elements.
    Type: Grant
    Filed: April 19, 2007
    Date of Patent: January 4, 2011
    Assignee: L3 Communications Integrated Systems, L.P.
    Inventors: Jerry William Yancey, Yea Zong Kuo
  • Patent number: 7865694
    Abstract: A multi-layer silicon stack architecture includes one or more processing layers including one or more computing elements; one or more networking layers disposed between the processing layers, the network layer includes one or more networking elements, wherein each computing element includes a plurality of network connections to adjacently disposed networking elements.
    Type: Grant
    Filed: May 12, 2006
    Date of Patent: January 4, 2011
    Assignee: International Business Machines Corporation
    Inventors: Kerry Bernstein, Timothy J. Dalton, Marc R. Faucher, Peter A. Sandon
  • Patent number: 7840779
    Abstract: Methods, apparatus, and products are disclosed for line-plane broadcasting in a data communications network of a parallel computer, the parallel computer comprising a plurality of compute nodes connected together through the network, the network optimized for point to point data communications and characterized by at least a first dimension, a second dimension, and a third dimension, that include: initiating, by a broadcasting compute node, a broadcast operation, including sending a message to all of the compute nodes along an axis of the first dimension for the network; sending, by each compute node along the axis of the first dimension, the message to all of the compute nodes along an axis of the second dimension for the network; and sending, by each compute node along the axis of the second dimension, the message to all of the compute nodes along an axis of the third dimension for the network.
    Type: Grant
    Filed: August 22, 2007
    Date of Patent: November 23, 2010
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Jeremy E. Berg, Michael A. Blocksome, Brian E. Smith
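The three phases of the line-plane broadcast in 7840779 can be simulated in a few lines. This is a sketch under stated assumptions: nodes are addressed by integer (x, y, z) coordinates, each phase is modelled as a set of nodes holding the message, and the function name `line_plane_broadcast` is hypothetical. It does not model the point-to-point links or message timing.

```python
from itertools import product

def line_plane_broadcast(root, dims):
    """Simulate the three phases and return the nodes holding the message after each.

    root: (x, y, z) of the broadcasting node; dims: (X, Y, Z) network extents.
    """
    X, Y, Z = dims
    rx, ry, rz = root
    # Phase 1: the root sends along the axis of the first dimension (a line).
    line = {(x, ry, rz) for x in range(X)}
    # Phase 2: every node on that line sends along the second dimension (a plane).
    plane = {(x, y, rz) for (x, _, _) in line for y in range(Y)}
    # Phase 3: every node in that plane sends along the third dimension (the volume).
    volume = {(x, y, z) for (x, y, _) in plane for z in range(Z)}
    return line, plane, volume

line, plane, volume = line_plane_broadcast((1, 2, 0), (4, 3, 2))
# Every node of the 4 x 3 x 2 network should hold the message after phase 3.
assert volume == set(product(range(4), range(3), range(2)))
```

In this example the message reaches 4 nodes after the first phase, 12 after the second, and all 24 after the third.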
  • Patent number: 7836276
    Abstract: A SIMD processor efficiently utilizes its hardware resources to achieve higher data processing throughput. The effective width of a SIMD processor is extended by clocking the instruction processing side of the SIMD processor at a fraction of the rate of the data processing side and by providing multiple execution pipelines, each with multiple data paths. As a result, higher data processing throughput is achieved while an instruction is fetched and issued once per clock. This configuration also allows a large group of threads to be clustered and executed together through the SIMD processor so that greater memory efficiency can be achieved for certain types of operations like texture memory accesses performed in connection with graphics processing.
    Type: Grant
    Filed: December 2, 2005
    Date of Patent: November 16, 2010
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm
  • Publication number: 20100287318
    Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).
    Type: Application
    Filed: July 21, 2010
    Publication date: November 11, 2010
    Inventors: Martin Vorbach, Robert Münch
  • Patent number: 7830872
    Abstract: A signal processing section of a software radio device or the like is provided which can dynamically change the connection of its internal function structure at the time of execution. A switching module ISM1(2) or the like selects and uses one of a plurality of routing tables (60) or the like prepared according to the signal processing and executes routing control to respective processing modules a11 or the like based on the input data packet. The processing module a11 or the like executes each processing by using a parameter table or the like indicating the processing to be performed in accordance with the data packet.
    Type: Grant
    Filed: June 3, 2005
    Date of Patent: November 9, 2010
    Assignees: Toyota Infotechnology Center Co., Ltd., National Institute of Information and Communications Technology
    Inventors: Akihisa Yokoyama, Hiroshi Harada, Hitoshi Inoue, Makoto Honda
  • Patent number: 7831801
    Abstract: A direct memory access (“DMA”)-based multi-processor array architecture that may be implemented in a single integrated circuit is described. The integrated circuit includes a plurality of processing units. A first processing unit and a second processing unit of the plurality of processing units are topologically coupled via a first DMA block. The first DMA block includes a first dual-ported random access memory and a first decoder. A multiple-processor array is provided by topologically coupling the first processing unit and the second processing unit via the first direct memory access block.
    Type: Grant
    Filed: August 30, 2006
    Date of Patent: November 9, 2010
    Assignee: Xilinx, Inc.
    Inventor: James Bryan Anderson
  • Publication number: 20100274975
    Abstract: In one embodiment, link logic of a multi-chip processor (MCP) formed using multiple processors may interface with a first point-to-point (PtP) link coupled between the MCP and an off-package agent and another PtP link coupled between first and second processors of the MCP, where the on-package PtP link operates at a greater bandwidth than the first PtP link. Other embodiments are described and claimed.
    Type: Application
    Filed: April 27, 2009
    Publication date: October 28, 2010
    Inventors: Krishnakanth Sistla, Ganapati Srinivasa
  • Patent number: 7822889
    Abstract: A mechanism is provided for transmitting data in a data network. A first processor of the data network receives data to be transmitted to a second processor within the data network. A determination is made if the data has previously been routed through an indirect communication link from a source processor, the indirect communication link being a communication link that does not directly couple the source processor to a final destination processor which is to receive the data. A communication link is selected over which to transmit the data from the first processor to the second processor based on results of determining if the data has previously been routed through an indirect communication link. Finally, the data is transmitted from the first processor to the second processor using the selected communication link.
    Type: Grant
    Filed: August 27, 2007
    Date of Patent: October 26, 2010
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony
  • Patent number: 7805546
    Abstract: Methods, systems, and products are disclosed for chaining DMA data transfer operations for compute nodes in a parallel computer that include: receiving, by an origin DMA engine on an origin node in an origin injection FIFO buffer for the origin DMA engine, a RGET data descriptor specifying a DMA transfer operation data descriptor on the origin node and a second RGET data descriptor on the origin node, the second RGET data descriptor specifying a target RGET data descriptor on the target node, the target RGET data descriptor specifying an additional DMA transfer operation data descriptor on the origin node; creating, by the origin DMA engine, an RGET packet in dependence upon the RGET data descriptor, the RGET packet containing the DMA transfer operation data descriptor and the second RGET data descriptor; and transferring, by the origin DMA engine to a target DMA engine on the target node, the RGET packet.
    Type: Grant
    Filed: July 27, 2007
    Date of Patent: September 28, 2010
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome
  • Patent number: 7796166
    Abstract: A hand-held digital camera device includes a programmable processing circuitry incorporating a four-way parallel VLIW vector processor; an image sensor interface connected to the programmable processing circuitry and configured to receive signals from an image sensor and pass data representing the signals to the programmable processing circuitry; and a printhead interface connected to the programmable processing circuitry and configured to receive data from the programmable processing circuitry and generate control signals to be received by a printhead of a printing mechanism. The programmable processing circuitry, the image sensor interface and the printhead interface all form part of CMOS integrated circuitry provided on a common wafer. An instruction set of the VLIW vector processor is tuned for image manipulation processing.
    Type: Grant
    Filed: August 13, 2009
    Date of Patent: September 14, 2010
    Assignee: Silverbrook Research Pty Ltd
    Inventor: Kia Silverbrook
  • Publication number: 20100229020
    Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.
    Type: Application
    Filed: May 17, 2010
    Publication date: September 9, 2010
    Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
  • Patent number: 7788465
    Abstract: A processing system according to the invention comprises a plurality of processing elements (PE1, . . . , PE7). The processing elements comprise a controller and computation means. The plurality of processing elements is dynamically reconfigurable as mutually independently operating task units (TU1, TU2, TU3), which task units comprise one processing element (PE7) or a cluster of two or more processing elements (PE3, PE4, PE5, PE6). The processing elements within a cluster are arranged to execute instructions under a common thread of program control. In this way the processing system is capable of using the same sub-set of data-path elements to exploit instruction level parallelism or task level parallelism or a combination thereof, dependent on the application.
    Type: Grant
    Filed: December 4, 2003
    Date of Patent: August 31, 2010
    Assignee: Silicon Hive B.V.
    Inventors: Orlando Miguel Pires Dos Reis Moreira, Alexander Augusteijn, Bernardo De Oliveira Kastrup Pereira, Wim Feike Dominicus Yedema, Paul Ferenc Hoogendijk, Willem Charles Mallon
  • Patent number: 7788467
    Abstract: Methods and apparatus provide for a multiprocessor system including: a plurality of sub-processors operatively coupled to one another over a ring bus, whereby data may be transmitted over one or more paths on the ring bus between pairs of the sub-processors; and a plurality of programmable delay circuits, each associated with at least one of the sub-processors, and each being operable to alter a delay of data transfer at least one of into and out of its associated sub-processor in order to alter one or more latencies associated with the paths on the ring bus between pairs of the sub-processors.
    Type: Grant
    Filed: May 9, 2007
    Date of Patent: August 31, 2010
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Akiyuki Hatakeyama
  • Patent number: 7788466
    Abstract: A plurality of digital signal processors (10), each contains a signal processing core (22), a memory (20) coupled to the processing core (22) and a multiplexed data input (16) coupled to the memory (20). Each digital signal processor has a plurality of outputs for outputting data from the signal processing core (22). A remote write only structure (14a-d) couples outputs of respective groups of the digital signal processors (10) each to the multiplexed data input (16) of respective particular digital signal processor (10), the respective group for the particular digital signal processor (10) not including the particular digital signal processor (10). Thus, each processor (10) writes data for other processors directly from the processor, without storing the data in memory first for handling by an I/O processor, and reads data from other processors (10) via memory, where it is received via an input that does not share resources with the output of the processor (10).
    Type: Grant
    Filed: September 3, 2004
    Date of Patent: August 31, 2010
    Assignee: NXP B.V.
    Inventors: Henricus Hubertus Van Den Berg, Evert-Jan Daniël Pol
  • Patent number: 7783861
    Abstract: When an instruction code “MVLR” is sent from a control processor in a PE having a mask register MR in operation setting, when the direction register F is ON, if a counter and transfer result storing buffer T is ≥M, a value of T−M is stored in buffer T, and if T is less than M, content of a first transport register L of a PE whose PE number counted from the left inside a PE block is T, is selected by a first selector and stored in buffer T and the mask register is set to non-operation. When the direction register is OFF, if T is ≤−M, a value of T+M is stored in buffer T, and if T is greater than −M, content of R of a PE whose PE number is −T, counted from the right inside the PE block, is selected by a second selector and stored in buffer T, and MR is set to non-operation. Entire PEs transfer content of L and R to M-adjacent left and right PEs, and data transferred from M-adjacent right and M-adjacent left PEs are stored in L and R respectively.
    Type: Grant
    Filed: February 27, 2007
    Date of Patent: August 24, 2010
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Patent number: 7779229
    Abstract: A processor arrangement having a strip structure for parallel data processing is configured so that local data from the individual processing units or strips is brought together in a rapid manner. Input data, intermediate data and/or output data from various processing units are linked together in an operation which is at least partially combinatory. The data linking operation is not clock controlled. The linking of the local data from various strips in this manner reduces delays in parallel data processing in the processor arrangement. The combinatory data linking operation can provide an overall data linking outcome within an individual clock cycle.
    Type: Grant
    Filed: February 12, 2003
    Date of Patent: August 17, 2010
    Assignee: NXP B.V.
    Inventor: Wolfram Drescher
  • Patent number: 7779177
    Abstract: A reconfigurable multi-processor computing system including a plurality of configurable processing elements each having a plurality of integrated high-speed serial input/output ports. Interconnects link the plurality of processing elements, wherein at least one of the integrated high-speed serial input/output ports of each processing element is connected by at least one interconnect to at least one of the integrated high-speed serial input/output ports of each other processing element, thereby creating a full mesh network. The full mesh network is located on a processor card, multiples of which may be grouped in a shelf having a backplane card with a shelf controller card for providing cross-connects between processor cards. Multiple shelves may be interconnected to form a large computer system.
    Type: Grant
    Filed: August 2, 2005
    Date of Patent: August 17, 2010
    Assignee: Arches Computing Systems
    Inventor: Paul Chow
  • Patent number: 7774579
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor, and a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles. The tile is configured to control access to a resource of the tile based on access information associated with the resource.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: August 10, 2010
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal
  • Publication number: 20100191911
    Abstract: An integrated circuit having an array of programmable processing elements and a memory interface linked by an on-chip communication network. Each processing element includes a plurality of processing cores and a local memory. The memory interface block is operably coupled to external memory and to the on-chip communication network. The memory interface supports accessing the external memory in response to messages communicated from the processing elements of the array over the on-chip communication network. A portion of the local memory for a plurality of the processing elements of the array as well as a portion of the external memory are both allocated to store data shared by a plurality of processing elements of the array during execution of programmed operations distributed thereon.
    Type: Application
    Filed: December 16, 2009
    Publication date: July 29, 2010
    Inventors: Marco Heddes, Massimo Ravasi, Rakesh Kumar Malik, Timothy M. Shanley, Michael Singngee Yeo
  • Patent number: 7765338
    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
    Type: Grant
    Filed: July 9, 2007
    Date of Patent: July 27, 2010
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
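The bit-reversed addressing mentioned in the abstract of 7765338 can be illustrated with the radix-2 index mapping below. This is a software sketch only: the patent describes the reordering being performed by a DMA unit across multiple PEs, which is not modelled here, and the function names are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def bit_reversed_order(samples):
    """Reorder a power-of-two-length sequence into bit-reversed index order."""
    n = len(samples)
    bits = n.bit_length() - 1
    if n == 0 or 1 << bits != n:
        raise ValueError("length must be a power of two")
    return [samples[bit_reverse(i, bits)] for i in range(n)]
```

For eight samples indexed 0 to 7, `bit_reversed_order(list(range(8)))` returns `[0, 4, 2, 6, 1, 5, 3, 7]`, the input order expected by a radix-2 decimation-in-time FFT.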
  • Patent number: 7765382
    Abstract: A semiconductor device includes a plurality of processing clusters that operate synchronously internally and are arranged in an M×N matrix. Each processing cluster is formed as a plurality of processing elements and clocked buses that interconnect the processing elements within each processing cluster. A self-synchronous cluster wrapper is operative with the processing elements such that each processing cluster forms a programmable module. Self-synchronous global and local buses interconnect the processing clusters for communicating externally. An input/output circuit interconnects the global and local buses.
    Type: Grant
    Filed: April 4, 2007
    Date of Patent: July 27, 2010
    Assignee: Harris Corporation
    Inventor: David B. Chester
  • Patent number: 7761687
    Abstract: A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
    Type: Grant
    Filed: June 26, 2007
    Date of Patent: July 20, 2010
    Assignee: International Business Machines Corporation
    Inventors: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken
  • Publication number: 20100174883
    Abstract: A processor includes a compute array comprising a first plurality of compute engines serially connected along a data flow path such that data flows between successive compute engines at successive times. The first plurality of compute engines includes an initial compute engine and a final compute engine. The data flow path includes a recirculation path connecting the final compute engine to the initial compute engine with no compute engine therebetween.
    Type: Application
    Filed: February 5, 2010
    Publication date: July 8, 2010
    Inventors: Boris Lerner, Douglas Garde
  • Publication number: 20100169896
    Abstract: An electronic device is provided which comprises a plurality of processing units (IP; IP1-IP4), an interconnect (IPCU; N) for coupling the processing units (IP; IP1-IP4) to enable a communication between the processing units (IP; IP1-IP4) and at least one event monitor (EM) for detecting events in the communication in the electronic device. The electronic device furthermore comprises a first controller unit for controlling the interconnect (IPCU; N) according to one or more of the events detected by the at least one event monitor (EM).
    Type: Application
    Filed: August 7, 2007
    Publication date: July 1, 2010
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.
    Inventors: Martinus Theodorus Bennebroek, Kees Gerard Willem Goossens, Hubertus Gerardus Hendrikus Vermeulen
  • Publication number: 20100158076
    Abstract: A method and apparatus for correlation of a received DSSS signal with a PN sequence, thus significantly reducing the processing time and operating power needed to acquire phase information for DSSS de-spreading and demodulation. The apparatus utilizes a multiprocessor array 10. In one embodiment, multiple processors 15 are located on a single-die 25, connected by single drop busses 20 to form low-operating-power apparatus. The method provides for fast sequential correlation of a received digital signal. In an alternate embodiment, the present invention is a single-die, low-operating-power apparatus and method for fast parallel correlation of a received digital signal. In yet another alternate embodiment, the present invention is a single-die, low-operating-power apparatus and method for fast correlation of a received digital signal using a hybrid of parallel and sequential methods.
    Type: Application
    Filed: December 19, 2008
    Publication date: June 24, 2010
    Applicant: VNS PORTFOLIO LLC
    Inventors: Les O. Snively, Paul Michael Ebert
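The sequential correlation of a received DSSS signal against a PN sequence, as described in the abstract of 20100158076, can be sketched as below. This is a single-threaded illustration, not the multiprocessor-array method of the application: it assumes chips are represented as +1/-1 values, that one full period of the sequence has been received, and that the phase is found by circular correlation. The function name `correlate_phase` is hypothetical.

```python
def correlate_phase(received, pn):
    """Return (best_phase, scores) for a circular correlation of +/-1 chips.

    received, pn: equal-length sequences of +1/-1 chip values.
    """
    n = len(pn)
    if len(received) != n:
        raise ValueError("sequences must have the same length")
    scores = []
    for phase in range(n):
        # Correlate the received chips against the PN sequence shifted by `phase`.
        scores.append(sum(received[i] * pn[(i - phase) % n] for i in range(n)))
    # The phase with the largest correlation score is taken as the code phase.
    best = max(range(n), key=scores.__getitem__)
    return best, scores
```

Each candidate phase is independent of the others, which is why the per-phase sums could in principle be evaluated on separate processors in a parallel or hybrid variant.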
  • Publication number: 20100161938
    Abstract: An integrated circuit having an array of programmable processing elements linked by an on-chip communication network. Each processing element includes a plurality of processing cores, a local memory, and thread scheduling means for scheduling execution of threads on the processing cores of the given processing element. The thread scheduling means assigns threads to the processing cores of the given processing element in a configurable manner. The configuration of the thread scheduling means defines one or more logical symmetric multiprocessors for executing threads on the given processing element. A logical symmetric multiprocessor is realized by a defined set of processing cores assigned to a group of threads executing on the given processing element.
    Type: Application
    Filed: December 23, 2008
    Publication date: June 24, 2010
    Inventors: Marco Heddes, Massimo Ravasi
  • Patent number: 7734894
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor including a storage module, wherein the processor is configured to process multiple streams of instructions, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, and coupling circuitry configured to couple data resulting from processing an instruction from at least one of the streams of instructions to the storage module and to the switch.
    Type: Grant
    Filed: April 28, 2008
    Date of Patent: June 8, 2010
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal
  • Patent number: 7734896
    Abstract: A reconfigurable integrated circuit device which converts an arbitrary calculation state dynamically, based on configuration data, includes a plurality of processor elements, each of which has an input terminal, an output terminal, a plurality of arithmetic units which are provided in parallel and each of which performs calculation processing in synchronization with a clock signal, and an intra-processor network which connects them in an arbitrary state; and an inter-processor network which connects between processor elements in an arbitrary state. Based on configuration data, the intra-processor network is reconfigurable to a desired connection state, and further, based on the configuration data, the inter-processor network is reconfigurable to a desired connection state.
    Type: Grant
    Filed: March 28, 2006
    Date of Patent: June 8, 2010
    Assignee: Fujitsu Microelectronics Limited
    Inventor: Hiroshi Furukawa
  • Patent number: 7725304
    Abstract: A method and apparatus for coupling data between discrete processor-based emulation chips is described. The apparatus is a processor-based hardware emulation integrated circuits (chips) element comprising a plurality of discrete hardware emulation chips, each emulation chip coupled to another emulation chip by a crossbar for coupling data between the plurality of chips. The method comprises providing data to a crossbar from a first discrete emulation chip, selecting the data from the crossbar using a discrete second emulation chip, and storing the data in a data array in the second discrete emulation chip.
    Type: Grant
    Filed: May 22, 2006
    Date of Patent: May 25, 2010
    Assignee: Cadence Design Systems, Inc.
    Inventors: William F. Beausoleil, Beshara G. Elmufdi
  • Publication number: 20100100703
    Abstract: A system and a method for parallel computing for solving complex problems is envisaged. Particularly, hierarchical parallel computing system is envisaged by this invention, which is formed by multiple levels of groups, where each group consists of multiple processing elements. Each group of the parallel computing system models as processing element to its immediate upper layer. Thus, each processing element is hierarchically tagged to its immediate upper level, and a multi-level tier of groups are formed. In accordance with this invention, the parallel computing system operates by breaking any problem hierarchically, first across the groups and then within the groups. This hierarchical breakup of the problem helps in significantly improving the time required for processing a problem.
    Type: Application
    Filed: October 15, 2009
    Publication date: April 22, 2010
    Applicant: Computational Research Laboratories Ltd.
    Inventors: Chandan Basu, Mandar Nadgir, Avinash Pandey
  • Patent number: 7694064
    Abstract: In an embodiment, a multi-processor computer system includes multiple cells, where a cell may include one or more processors and memory resources. The system may further include a global crossbar network and multiple cell-to-global-crossbar connectors, to connect the multiple cells with the global crossbar network. In an embodiment, the system further includes at least one cell-to-cell connector, to directly connect at least one pair of the multiple cells. In another embodiment, the system further includes one or more local crossbar networks, multiple cell-to-local-crossbar connectors, and local input/output backplanes connected to the one or more local crossbar networks.
    Type: Grant
    Filed: December 29, 2004
    Date of Patent: April 6, 2010
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Mark Shaw, Russ William Herrell, Stuart Allen Berke
  • Publication number: 20100082863
    Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).
    Type: Application
    Filed: December 3, 2009
    Publication date: April 1, 2010
    Inventors: Martin Vorbach, Robert Münch
  • Patent number: 7676661
    Abstract: A fast linked multiprocessor network includes a plurality of processing modules implemented on a field programmable gate array and a plurality of configurable uni-directional links coupled among at least two of the plurality of processing modules, which provide a streaming communication channel between at least two of the plurality of processing modules. Such configuration provides a function accelerator that can feed at least one processor with data values using one custom instruction to put data values on at least one uni-directional serial link and that can extract data values from at least one processor using one custom instruction to get data values from the at least one uni-directional serial link.
    Type: Grant
    Filed: October 5, 2004
    Date of Patent: March 9, 2010
    Assignee: Xilinx, Inc.
    Inventors: Sundararajarao Mohan, Satish R. Ganesan, Goran Bilski
  • Patent number: 7661006
    Abstract: A computer implemented method, apparatus, and computer program product for managing symmetric multiprocessor interconnects. The process identifies functional communication connections between each processor in a plurality of processors on a multiprocessor to form identified functional communication connections. The process maps every functional communication connection between any two processors in the plurality of processors, based on the identified functional communication connections, to form an interconnect matrix. The process creates a path map using the interconnect matrix. The path map comprises a sequence of communication connections between the plurality of processors. The process initializes the plurality of processors using the path map.
    Type: Grant
    Filed: January 9, 2007
    Date of Patent: February 9, 2010
    Assignee: International Business Machines Corporation
    Inventors: Luai A. Abou-Emara, Mark David McLaughlin, Jorge N. Yanez
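    Illustration (not part of the patent record): a minimal sketch of the sequence in the abstract, going from identified functional connections to an interconnect matrix and then to a path map. The function names and the sample link list are hypothetical, and the breadth-first walk is one plausible way to order the connections, not the method the patent claims.

```python
from collections import deque


def build_interconnect_matrix(num_processors, functional_links):
    """Return a symmetric 0/1 matrix; entry [a][b] is 1 when a working link joins a and b."""
    matrix = [[0] * num_processors for _ in range(num_processors)]
    for a, b in functional_links:
        matrix[a][b] = 1
        matrix[b][a] = 1
    return matrix


def build_path_map(matrix, start=0):
    """Walk the matrix breadth-first and list the links used to reach each processor."""
    seen = {start}
    queue = deque([start])
    path_map = []
    while queue:
        current = queue.popleft()
        for neighbor, linked in enumerate(matrix[current]):
            if linked and neighbor not in seen:
                seen.add(neighbor)
                path_map.append((current, neighbor))
                queue.append(neighbor)
    return path_map


# Hypothetical four-processor system in which the 2-3 link is not functional.
links = [(0, 1), (1, 2), (0, 3)]
matrix = build_interconnect_matrix(4, links)
path_map = build_path_map(matrix)
print(path_map)
```

    Processors would then be initialized in the order the path map lists them, so that none is reached over a non-functional link.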
  • Patent number: 7650434
    Abstract: A system and method for enabling high-speed, low-latency global tree network communications among processing nodes interconnected according to a tree network structure. The global tree network enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the tree via links to facilitate performance of low-latency global processing operations at nodes of the virtual tree and sub-tree structures. The global operations performed include one or more of: broadcast operations downstream from a root node to leaf nodes of a virtual tree, reduction operations upstream from leaf nodes to the root node in the virtual tree, and point-to-point message passing from any node to the root node.
    Type: Grant
    Filed: February 25, 2002
    Date of Patent: January 19, 2010
    Assignee: International Business Machines Corporation
    Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
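    Illustration (not part of the patent record): a minimal software sketch of two of the global operations named in the abstract, a reduction upstream from the leaves to the root and a broadcast downstream from the root. The index-based binary tree layout, the choice of addition as the reduction, and the function names are assumptions for this example; the patent describes hardware routers, not this code.

```python
def children(node, num_nodes):
    """Children of `node` in a binary tree laid out by index (an assumed layout)."""
    return [c for c in (2 * node + 1, 2 * node + 2) if c < num_nodes]


def reduce_up(values, node=0):
    """Combine values from the leaves toward the root; returns the subtree sum at `node`."""
    return values[node] + sum(reduce_up(values, c) for c in children(node, len(values)))


def broadcast_down(value, num_nodes, node=0, received=None):
    """Deliver `value` from the root to every node below it; returns node -> value."""
    received = {} if received is None else received
    received[node] = value
    for c in children(node, num_nodes):
        broadcast_down(value, num_nodes, c, received)
    return received


node_values = [3, 1, 4, 1, 5]  # one hypothetical contribution per node
total = reduce_up(node_values)
everyone = broadcast_down(total, len(node_values))
print(total, everyone)
```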
  • Patent number: 7650448
    Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).
    Type: Grant
    Filed: January 10, 2008
    Date of Patent: January 19, 2010
    Assignee: Pact XPP Technologies AG
    Inventors: Martin Vorbach, Robert Münch
  • Patent number: 7650483
    Abstract: A data processing apparatus and method are provided for handling execution of instructions within a data processing apparatus having a plurality of processing units. Each processing unit is operable to execute a sequence of instructions so as to perform associated operations, and at least a subset of the processing units form a cluster. Instruction forwarding logic is provided which for at least one instruction executed by at least one of the processing units in the cluster causes that instruction to be executed by each of the other processing units in the cluster, for example by causing that instruction to be inserted into the sequences of instructions executed by each of the other processing units in the cluster.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: January 19, 2010
    Assignee: ARM Limited
    Inventors: Elodie Charra, Frederic Claude Marie Piry, Richard Roy Grisenthwaite, Mélanie Emanuelle Lucie Vincent, Norbert Bernard Eugéne Lataille, Jocelyn Francois Orion Jaubert, Stuart David Biles
  • Patent number: 7636835
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor, and a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles. The integrated circuit further comprises one or more interface modules including circuitry to transfer data to and from a device external to the tiles; and a sub-port routing network including circuitry to route data between a port of a switch and a plurality of sub-ports coupled to one or more interface modules.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: December 22, 2009
    Assignee: Tilera Corporation
    Inventors: Carl G. Ramey, David Wentzlaff, Anant Agarwal
  • Patent number: 7624248
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises: a processor, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, according to a switch instruction indicating an input port to which each of multiple output ports of the switch is to be coupled, and a translation lookaside buffer coupled to the switch to translate virtual memory addresses of switch instructions to physical memory addresses of the switch instructions.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: November 24, 2009
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal
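    Illustration (not part of the patent record): a minimal sketch of a switch instruction that indicates, for each output port, the input port it is coupled to. The port names, the dictionary encoding, and `apply_switch_instruction` are hypothetical; the patent does not define the instruction format shown here.

```python
def apply_switch_instruction(switch_instruction, inputs):
    """Drive each output port with the value on the input port named for it."""
    return {out_port: inputs[in_port] for out_port, in_port in switch_instruction.items()}


# Hypothetical instruction: output port -> input port it is coupled to.
instruction = {"north": "processor", "east": "west", "processor": "south"}
# Hypothetical values currently present on the switch's input ports.
inputs = {"processor": 7, "west": 11, "south": 13, "north": 0, "east": 0}

outputs = apply_switch_instruction(instruction, inputs)
print(outputs)
```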
  • Patent number: 7624250
    Abstract: The disclosure describes a processor having processor cores integrated on the same die that have different functional operationality. The processor also includes a chain of multiple dedicated unidirectional connections spanning processor cores. The multiple dedicated unidirectional connections terminate in registers within the respective processor cores. The registers may form a queue such as a ring queue.
    Type: Grant
    Filed: December 5, 2005
    Date of Patent: November 24, 2009
    Assignee: Intel Corporation
    Inventors: Sinn Wee Lau, Choon Yee Loh, Kar Meng Chan
  • Patent number: 7602423
    Abstract: A monolithic integrated circuit includes programmable processing circuitry. An image sensor interface is connected to the processing circuitry and is configured to receive signals from an image sensor and to pass data representing the signals to the programmable processing circuitry. A printhead interface is connected to the processing circuitry and is configured to receive data from the processing circuitry and to generate control signals to be received by a printhead of a printing mechanism.
    Type: Grant
    Filed: October 14, 2003
    Date of Patent: October 13, 2009
    Assignee: Silverbrook Research Pty Ltd
    Inventor: Kia Silverbrook
  • Patent number: 7599998
    Abstract: A data processing apparatus comprises at least one source processor core, at least two destination processor cores, a message handler and a bus arrangement providing a data communication path between the source core, the destination cores and the message handler. The message handler has a plurality of message-handling modules. At least one of the message-handling modules has a message receipt indicator that is modifiable by each of the destination processor cores to indicate that a message has been received at its destination. This message-handling module also has a transmission completion detector operable to detect, in dependence upon a message receipt indicator value, that a message has been received by all of the at least two destination processor cores and to initiate transmission of an acknowledgement signal to the source processor core.
    Type: Grant
    Filed: July 7, 2004
    Date of Patent: October 6, 2009
    Assignee: ARM Limited
    Inventors: Mark James Galbraith, Harry Samuel Thomas Fearnhamm, Nicholas Esca Smith, Bruce James Mathewson
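    Illustration (not part of the patent record): a minimal sketch of a message receipt indicator and a completion check, assuming the indicator is a bit field with one bit per destination core. The class name, the core identifiers, and the bit-field encoding are hypothetical choices for this example.

```python
class MessageHandlingModule:
    """Tracks which destination cores have received a message (assumed bit-field encoding)."""

    def __init__(self, destination_ids):
        self.destination_ids = list(destination_ids)
        self.receipt_bits = 0          # message receipt indicator, one bit per destination
        self.ack_sent = False          # set once the source core has been acknowledged
        self._all_mask = (1 << len(self.destination_ids)) - 1

    def mark_received(self, destination_id):
        """Called by a destination core; returns True once every destination has reported."""
        bit = self.destination_ids.index(destination_id)
        self.receipt_bits |= 1 << bit
        # Completion check: all bits set means all destinations have the message.
        if self.receipt_bits == self._all_mask and not self.ack_sent:
            self.ack_sent = True       # stands in for sending the acknowledgement signal
        return self.ack_sent


module = MessageHandlingModule(["core_a", "core_b"])
first = module.mark_received("core_a")
second = module.mark_received("core_b")
print(first, second)
```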
  • Patent number: 7595659
    Abstract: A logic cell array having a number of logic cells and a segmented bus system for logic cell communication, the bus system including different segment lines having shorter and longer segments for connecting two points in order to be able to minimize the number of bus elements traversed between separate communication start and end points.
    Type: Grant
    Filed: October 8, 2001
    Date of Patent: September 29, 2009
    Assignee: Pact XPP Technologies AG
    Inventors: Martin Vorbach, Frank May, Dirk Reichardt, Frank Lier, Gerd Ehlers, Armin Nückel, Volker Baumgarte, Prashant Rao, Jens Oertel
  • Patent number: 7594060
    Abstract: Data buffering allocation in a microprocessor complex for a request of memory allocation is supported through a remote buffer batch allocation protocol. The separation of control and data placement allows simultaneous maximization of microprocessor complex load sharing, and minimization of inter-processor signaling/metadata migration. Separating processing control from data placement allows the location of data buffering to be chosen so as to maximize bus bandwidth utilization and achieve non-blocking switch behavior. This separation reduces the need for inter-processor communication and associated interrupts thus improving computation efficiency and performance.
    Type: Grant
    Filed: August 23, 2006
    Date of Patent: September 22, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Andrew W. Wilson, John Acton, Charles Binford, Daniel R. Cassiday, Raymond J. Lanza
  • Patent number: 7590821
    Abstract: A digital signal processing integrated circuit contains an array of interconnected and programmed or programmable digital signal processors (10). Configurable multiplexing circuits (12) are placed between IO connections (11a,b) and the IO ports of at least a plurality of the digital signal processors (10). The multiplexing circuits (12) are configured under control of configuration data, so that the multiplexing circuits (12) give the effect of accessing the IO connection only to IO signals from the IO port or ports of one or ones of the respective plurality of digital signal processors (10) that are selected by the configuration data. Preferably, each digital signal processor (10) has its IO port coupled in common to a plurality of the multiplexing circuits (12) separately from the other digital signal processing circuits.
    Type: Grant
    Filed: January 31, 2005
    Date of Patent: September 15, 2009
    Assignee: NXP B.V.
    Inventors: Henricus Hubertus Van Den Berg, Harpreet Singh Bhullar, Pieter Voorthuijsen
  • Patent number: 7581079
    Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.
    Type: Grant
    Filed: March 26, 2006
    Date of Patent: August 25, 2009
    Inventor: Gerald George Pechanek