Array Processor Element Interconnection Patents (Class 712/11)
  • Patent number: 8190855
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor, and a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles. The integrated circuit further comprises one or more interface modules including circuitry to transfer data to and from a device external to the tiles; and a sub-port routing network including circuitry to route data between a port of a switch and a plurality of sub-ports coupled to one or more interface modules.
    Type: Grant
    Filed: February 25, 2008
    Date of Patent: May 29, 2012
    Assignee: Tilera Corporation
    Inventors: Carl G. Ramey, David Wentzlaff, Anant Agarwal
  • Patent number: 8185719
    Abstract: Each possessor node in an array of nodes has a respective local node address, and each local node address comprises a plurality of components having an order of addressing significance from most to least significant. Each node comprises: mapping means configured to map each component of the local node address onto a respective routing direction, and a switch arranged to receive a message having a destination node address identifying a destination node. The switch comprises: means for comparing the local node address to the destination node address to identify a the most significant non-matching component; and means for routing the message to another node, on the condition that the local node address does not match the destination node address, in the direction mapped to the most significant non-matching component.
    Type: Grant
    Filed: November 18, 2010
    Date of Patent: May 22, 2012
    Assignee: XMOS Limited
    Inventor: Michael David May
  • Patent number: 8181003
    Abstract: Improved instruction set and core design, control and communication for programmable microprocessors is disclosed, involving the strategy for replacing centralized program sequencing in present-day and prior art processors with a novel distributed program sequencing wherein each functional unit has its own instruction fetch and decode block, and each functional unit has its own local memory for program storage; and wherein computational hardware execution units and memory units are flexibly pipelined as programmable embedded processors with reconfigurable pipeline stages of different order in response to varying application instruction sequences that establish different configurations and switching interconnections of the hardware units.
    Type: Grant
    Filed: May 29, 2008
    Date of Patent: May 15, 2012
    Assignee: Axis Semiconductor, Inc.
    Inventors: Xiaolin Wang, Qian Wu, Benjamin Marshall, Fugui Wang, Gregory Pitarys, Ke Ning
  • Patent number: 8169440
    Abstract: A method of processing data relating to geometrical primitives is disclosed. Each of the primitives has a plurality of vertices. The method uses a plurality of processing elements in parallel with one another, and comprises assigning respective vertex data to the processing elements, on each processing element, and in parallel with one another, performing at least one processing step on vertex data to produce processed vertex data, and transferring processed vertex data between processing elements so as to assemble primitive data.
    Type: Grant
    Filed: May 29, 2007
    Date of Patent: May 1, 2012
    Assignee: Rambus Inc.
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 8169434
    Abstract: An octree GPU construction system and method for constructing a complete octree data structure on a graphics processing unit (GPU). Embodiments of the octree GPU construction system and method first defines a complete octree data structure as forming a complete partition of the 3-D space and including a vertex, edge, face, and node arrays, and neighborhood information. Embodiments of the octree GPU construction system and method input a point cloud and construct a node array. Next, neighboring nodes are computed for each of the nodes in the node arrays by using at least two pre-computed look-up tables (such as a parent look-up table and a child look-up table). Embodiments of the octree GPU construction system and method then use the neighboring nodes and neighborhood information to compute a vertex array, edge array, and face array are computed by determining owner information and self-ownership information based on the neighboring nodes.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: May 1, 2012
    Assignee: Microsoft Corporation
    Inventors: Kun Zhou, Minmin Gong, Baining Guo
  • Patent number: 8156311
    Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.
    Type: Grant
    Filed: November 27, 2010
    Date of Patent: April 10, 2012
    Inventor: Gerald George Pechanek
  • Patent number: 8151088
    Abstract: A plurality of processor tiles are provided, each processor tile including a processor core. An interconnection network interconnects the processor cores and enables transfer of data among the processor cores. The interconnection network has a plurality of dimensions and is configurable to transmit data from an initial processor core or an input/output device to an intermediate processor core based on a first dimension ordering policy, and from the intermediate processor core to a destination processor core. The first dimension ordering policy specifies an ordering of the dimensions of the interconnection network when routing data through the interconnection network.
    Type: Grant
    Filed: July 8, 2008
    Date of Patent: April 3, 2012
    Assignee: Tilera Corporation
    Inventors: Liewei Bao, Ian Rudolf Bratt
  • Patent number: 8145880
    Abstract: According to some embodiments, an integrated circuit comprises a microprocessor matrix of mesh-interconnected matrix processors. Each processor comprises a data switch including a data switch link register and matrix routing logic. The data switch link register includes one or more matrix link-enable register fields specifying a link enable status (e.g. a message-independent, p-to-p, and/or broadcast link enable status) for each inter-processor matrix link of the processor. The matrix routing logic routes inter-processor messages according to the matrix link-enable register field(s). A particular link may be selected by a current matrix processor by selecting an ordered list of matrix links according to a relationship between ?H and ?V, and choosing the first enabled link in the selected list for routing.
    Type: Grant
    Filed: July 7, 2008
    Date of Patent: March 27, 2012
    Assignee: Ovics
    Inventors: Sorin C Cismas, Ilie Garbacea
  • Patent number: 8140827
    Abstract: A system and method which provides for efficient data transmission between multiple microprocessors in a computer system is disclosed. A physical data path is divided into one or more data queues which may be virtual connection queues. The virtual connection queues are configured to adaptively split or merge based on traffic conditions therein.
    Type: Grant
    Filed: June 19, 2007
    Date of Patent: March 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Victor K. Liang, Shiang-Feng Lee
  • Patent number: 8131975
    Abstract: In some embodiments, an integrated circuit comprises a microprocessor matrix including a plurality of mesh-interconnected matrix processors, wherein each matrix processor comprises a data switch configured to direct inter-processor communications within the matrix, and a mapping unit configured to generate a data switch functionality map for a plurality of data switches in the microprocessor matrix. The data switch functionality map is generated by sending a first message through the matrix, and, setting a first functionality status designation for the first data switch in the data switch functionality map upon receiving a reply to the first message from a first data switch through the matrix.
    Type: Grant
    Filed: July 7, 2008
    Date of Patent: March 6, 2012
    Assignee: Ovics
    Inventors: Sorin C Cismas, Ilie Garbacea
  • Patent number: 8127111
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor including a storage module, wherein the processor is configured to process multiple streams of instructions, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, and coupling circuitry configured to couple data resulting from processing an instruction from at least one of the streams of instructions to the storage module and to the switch.
    Type: Grant
    Filed: April 28, 2008
    Date of Patent: February 28, 2012
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal
  • Patent number: 8127061
    Abstract: A processor chip includes data processing elements that each has dedicated to it a respective switch for dynamically establishing an interconnection between the data processing elements conditional upon verification of a validity of the interconnection, which verification is automatically performed by at least one of the data processing elements to be interconnected.
    Type: Grant
    Filed: February 18, 2003
    Date of Patent: February 28, 2012
    Inventors: Martin Vorbach, Volker Baumgarte, Gerd Ehlers
  • Publication number: 20120017066
    Abstract: Data processing device comprising a multidimensional array of ALUs, having at least two dimension where the number of ALUs in the dimension is greater or equal to 2, adapted to process data without register caused latency between at least some of the ALUs in the corresponding array.
    Type: Application
    Filed: February 14, 2011
    Publication date: January 19, 2012
    Inventors: Martin Vorbach, Frank May
  • Patent number: 8099583
    Abstract: A new signal processor technique and apparatus combining microprocessor technology with switch fabric telecommunication technology to achieve a programmable processor architecture wherein the processor and the connections among its functional blocks are configured by software for each specific application by communication through a switch fabric in a dynamic, parallel and flexible fashion to achieve a reconfigurable pipeline, wherein the length of the pipeline stages and the order of the stages varies from time to time and from application to application, admirably handling the explosion of varieties of diverse signal processing needs in single devices such as handsets, set-top boxes and the like with unprecedented performance, cost and power savings, and with full application flexibility.
    Type: Grant
    Filed: October 6, 2007
    Date of Patent: January 17, 2012
    Assignee: Axis Semiconductor, Inc.
    Inventor: Xiaolin Wang
  • Patent number: 8082372
    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
    Type: Grant
    Filed: May 23, 2011
    Date of Patent: December 20, 2011
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
  • Patent number: 8065456
    Abstract: An advanced processor comprises a plurality of multithreaded processor cores configured to support a plurality of software generated read or write instructions for interfacing with a star topology serial bus interface. The multiple-core processor has at least one of an internal fast messaging network or an interface switch interconnect configured to link the processor cores together such that each processor core has a data pathway to each of the other processor cores without going through memory. The fast messaging network or interface switch is also configured to be operably coupled to the star topology serial bus interface. In one aspect of an embodiment of the invention, the fast messaging network connects to a high-bandwidth star-topology serial bus such as a PCI express (PCIe) interface capable of supporting multiple high-bandwidth PCIe lanes.
    Type: Grant
    Filed: January 24, 2008
    Date of Patent: November 22, 2011
    Assignee: NetLogic Microsystems, Inc.
    Inventors: Julianne Jiang Zhu, David T. Hass
  • Patent number: 8050256
    Abstract: A processor includes a plurality of processor tiles, each tile including a processor core, and an interconnection network interconnects the processor cores and enables transfer of data among the processor cores. The interconnection network has a plurality of dimensions in which an ordering of dimensions for routing data is configurable.
    Type: Grant
    Filed: July 8, 2008
    Date of Patent: November 1, 2011
    Assignee: Tilera Corporation
    Inventors: Leiwei Bao, Ian Rudolf Bratt
  • Patent number: 8045546
    Abstract: A plurality of processor tiles are provided, each processor tile including a processor core. An interconnection network interconnects the processor cores and enables transfer of data among the processor cores. An extension network connects input/output ports of the interconnection network to input/output ports of one or more peripheral devices, each input/output port of the interconnection network being associated with one of the processor tiles such that each input/output port of the interconnection network sends input data to the corresponding processor tile and receives output data from the corresponding processor tile. The extension network is configurable such that a mapping between input/output ports of the interconnection network and input/output ports of the one or more peripheral devices is configurable.
    Type: Grant
    Filed: July 8, 2008
    Date of Patent: October 25, 2011
    Assignee: Tilera Corporation
    Inventors: Leiwei Bao, Ian Rudolf Bratt
  • Patent number: 8041925
    Abstract: A reconfigurable integrated circuit includes a plurality of function blocks and a plurality of programmable switches to switchably connect between function blocks included in the plurality of function blocks. The plurality of function blocks each includes at least one operation unit or one memory unit. The plurality of function blocks each includes at least one data input port connected to at least on of the plurality of programmable switches and at least one data output port connected to at least one of the plurality of programmable switches. Further, at least a pair of function blocks included in the plurality of function blocks is connected without intervening the programmable switch and data being output from a direct output port included in one of the pair of function blocks can be input to a direct input port included in the other of the pair of function blocks.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: October 18, 2011
    Assignee: Renesas Electronics Corporation
    Inventors: Toshirou Kitaoka, Taro Fujii, Kouichirou Furuta, Masato Motomura, Toru Awashima, Takao Toi
  • Patent number: 8037224
    Abstract: An advanced processor comprises a plurality of multithreaded processor cores each having a data cache and instruction cache. A data switch interconnect is coupled to each of the processor cores and configured to pass information among the processor cores. A messaging network is coupled to each of the processor cores and a plurality of communication ports. The data switch interconnect is coupled to each of the processor cores by its respective data cache, and the messaging network is coupled to each of the processor cores by its respective message station. In one aspect of an embodiment of the invention, the messaging network connects to a high-bandwidth star-topology serial bus such as a PCI express (PCIe) interface capable of supporting multiple high-bandwidth PCIe lanes. Advantages of the invention include the ability to provide high bandwidth communications between computer systems and memory in an efficient and cost-effective manner.
    Type: Grant
    Filed: July 31, 2007
    Date of Patent: October 11, 2011
    Assignee: NetLogic Microsystems, Inc.
    Inventors: Julianne Jiang Zhu, David T. Hass
  • Patent number: 8019970
    Abstract: A design structure embodied in a machine readable medium used in a design process includes a multi-layer silicon stack architecture having one or more processing layers comprised of one or more computing elements; one or more networking layers disposed between the processing layers, the network layer comprised of one or more networking elements, wherein each computing element comprises a plurality of network connections to adjacently disposed networking elements and each networking element may provide network access to a plurality of other computing elements through a single hop of the network.
    Type: Grant
    Filed: November 28, 2007
    Date of Patent: September 13, 2011
    Assignee: International Business Machines Corporation
    Inventors: Kerry Bernstein, Timothy J. Dalton, Marc R. Faucher, Peter A. Sandon
  • Patent number: 8019915
    Abstract: The invention relates to a method and a device for controlling access to multiple applications which are each implemented as a client application in an operating system environment of a data processing device from a shared memory system. The problem addressed by the invention is that of providing an improved method and an improved device for controlling access to multiple applications which are each implemented as a client application in an operating system environment of a data processing device from a shared memory system, which allow an efficient exchange of data for input/output. In particular, interaction with multimedia data in such an operating environment should be optimized.
    Type: Grant
    Filed: April 9, 2008
    Date of Patent: September 13, 2011
    Assignee: Tixel GmbH
    Inventors: Lars Eric Fuerst, Ralf Einhom, Carsten Herpel, Ralf Koehler
  • Patent number: 8018849
    Abstract: The flow of data in an integrated circuit is controlled. The integrated circuit comprising a plurality of tiles, each tile comprising a processor, a switch including switching circuitry to forward data over data paths from other tiles to the processor and to switches of other tiles, and a receive buffer to store data from the switch. At a first tile, a count is maintained of data that has been sent to a second tile without receiving an acknowledgement up to a credit limit. At the second tile, data that arrives from the first tile when the receive buffer is full is sent to a memory outside of the tile.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: September 13, 2011
    Assignee: Tilera Corporation
    Inventor: David Wentzlaff
  • Publication number: 20110219208
    Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).
    Type: Application
    Filed: January 10, 2011
    Publication date: September 8, 2011
    Applicant: International Business Machines Corporation
    Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
  • Publication number: 20110213946
    Abstract: A parallel computing system includes a plurality of processors multi-dimensionally commented by an interconnection network, wherein each of the processors in the parallel computing system determines, in dimensional order, communication channels to other processors in the interconnection network, each of the processors sets, as relative coordinates of destination processors with respect to the plurality of processors in data communications performed at a same timing, relative coordinates common to all of the processors, and each of the processors performs data communications with destination processors having the set relative coordinates.
    Type: Application
    Filed: August 31, 2010
    Publication date: September 1, 2011
    Applicant: FUJITSU LIMITED
    Inventors: Yuichiro Ajima, Toshiyuki Shimizu, Hiroaki Ishihata
  • Patent number: 8010917
    Abstract: Disclosed is an improved method and system for implementing parallelism for execution of electronic design automation (EDA) tools, such as layout processing tools. Examples of EDA layout processing tools are placement and routing tools. Efficient locking mechanism are described for facilitating parallel processing and to minimize blocking.
    Type: Grant
    Filed: December 26, 2007
    Date of Patent: August 30, 2011
    Assignee: Cadence Design Systems, Inc.
    Inventors: David Cross, Eric Nequist
  • Patent number: 8010771
    Abstract: The invention provides a communication system including a plurality of communication nodes respectively arranged at predetermined lattice points in lattice space forming a three-dimensional rectangular solid, a communication link that interconnects communication nodes arranged at adjacent lattice points, and a shortcut link that connects, for at least two faces that are not an end face on the lattice space among faces formed of communication nodes of which any adjacent lattice points do not have communication nodes, a communication node constituting one face and a communication node constituting another face.
    Type: Grant
    Filed: December 4, 2006
    Date of Patent: August 30, 2011
    Assignee: International Business Machines Corporation
    Inventors: Takeshi Inagaki, Aya Minami, Yohichi Miwa
  • Patent number: 7991978
    Abstract: Data processing on a network on chip (‘NOC’) that includes integrated processor (‘IP’) blocks, each of a plurality of the IP blocks including at least one computer processor, each such computer processor implementing a plurality of hardware threads of execution; low latency, high bandwidth application messaging interconnects; memory communications controllers; network interface controllers; and routers; each of the IP blocks adapted to a router through a separate one of the low latency, high bandwidth application messaging interconnects, a separate one of the memory communications controllers, and a separate one of the network interface controllers; each application messaging interconnect abstracting into an architected state of each processor, for manipulation by computer programs executing on the processor, hardware inter-thread communications among the hardware threads of execution; each memory communications controller controlling communication between an IP block and memory; each network interface contro
    Type: Grant
    Filed: May 9, 2008
    Date of Patent: August 2, 2011
    Assignee: International Business Machines Corporation
    Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Eric O. Mejdrich, Paul E. Schardt
  • Patent number: 7987338
    Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.
    Type: Grant
    Filed: May 17, 2010
    Date of Patent: July 26, 2011
    Assignee: Coherent Logix, Incorporated
    Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
  • Patent number: 7987339
    Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.
    Type: Grant
    Filed: June 30, 2010
    Date of Patent: July 26, 2011
    Assignee: Coherent Logix, Incorporated
    Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
  • Publication number: 20110173413
    Abstract: Embodiments of the invention provide a method, system and computer program product for embedding a global barrier and global interrupt network in a parallel computer system organized as a torus network. The computer system includes a multitude of nodes. In one embodiment, the method comprises taking inputs from a set of receivers of the nodes, dividing the inputs from the receivers into a plurality of classes, combining the inputs of each of the classes to obtain a result, and sending said result to a set of senders of the nodes. Embodiments of the invention provide a method, system and computer program product for embedding a collective network in a parallel computer system organized as a torus network. In one embodiment, the method comprises adding to a torus network a central collective logic to route messages among at least a group of nodes in a tree structure.
    Type: Application
    Filed: March 12, 2010
    Publication date: July 14, 2011
    Applicant: International Business Machines Corporation
    Inventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Philip Heidleberger, Robert M. Senger, Valentina Salapura, Burkhard Steinmacher-Burow, Yutaka Sugawara, Todd E. Takken
  • Patent number: 7975080
    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
    Type: Grant
    Filed: June 21, 2010
    Date of Patent: July 5, 2011
    Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
  • Publication number: 20110161625
    Abstract: A Wings array system for communicating between nodes using store and load instructions is described. Couplings between nodes are made according to a 1 to N adjacency of connections in each dimension of a G×H matrix of nodes, where G?N and H?N and N is a positive odd integer. Also, a 3D Wings neural network processor is described as a 3D G×H×K network of neurons, each neuron with an N×N×N array of synaptic weight values stored in coupled memory nodes, where G?N, H?N, K?N, and N is determined from a 1 to N adjacency of connections used in the G×H×K network. Further, a hexagonal processor array is organized according to an INFORM coordinate system having axes at 60 degree spacing. Nodes communicate on row paths parallel to an FM dimension of communication, column paths parallel to an IO dimension of communication, and diagonal paths parallel to an NR dimension of communication.
    Type: Application
    Filed: February 28, 2011
    Publication date: June 30, 2011
    Inventor: Gerald George Pechanek
  • Publication number: 20110153936
    Abstract: An aggregate symmetric multiprocessor (SMP) data processing system includes a first SMP computer including at least first and second processing units and a first system memory pool and a second SMP computer including at least third and fourth processing units and second and third system memory pools. The second system memory pool is a restricted access memory pool inaccessible to the fourth processing unit and accessible to at least the second and third processing units, and the third system memory pool is accessible to both the third and fourth processing units. An interconnect couples the second processing unit in the first SMP computer for load-store coherent, ordered access to the second system memory pool in the second SMP computer, such that the second processing unit in the first SMP computer and the second system memory pool in the second SMP computer form a synthetic third SMP computer.
    Type: Application
    Filed: December 21, 2009
    Publication date: June 23, 2011
    Applicant: International Business Machines Corporation
    Inventor: William J. Starke
  • Publication number: 20110145544
    Abstract: Multi-level hierarchical routing matrices for pattern-recognition processors are provided. One such routing matrix may include one or more programmable and/or non-programmable connections in and between levels of the matrix. The connections may couple routing lines to feature cells, groups, rows, blocks, or any other arrangement of components of the pattern-recognition processor.
    Type: Application
    Filed: December 15, 2009
    Publication date: June 16, 2011
    Applicant: Micron Technology, Inc.
    Inventors: Harold B Noyes, David R. Brown
  • Patent number: 7962716
    Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.
    Type: Grant
    Filed: November 17, 2004
    Date of Patent: June 14, 2011
    Assignee: QST Holdings, Inc.
    Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
  • Patent number: 7962717
    Abstract: Each possessor node in an array of nodes has a respective local node address, and each local node address comprises a plurality of components having an order of addressing significance from most to least significant. Each node comprises: mapping means configured to map each component of the local node address onto a respective routing direction, and a switch arranged to receive a message having a destination node address identifying a destination node. The switch comprises: means for comparing the local node address to the destination node address to identify a the most significant non-matching component; and means for routing the message to another node, on the condition that the local node address does not match the destination node address, in the direction mapped to the most significant non-matching component.
    Type: Grant
    Filed: March 14, 2007
    Date of Patent: June 14, 2011
    Assignee: XMOS Limited
    Inventor: Michael David May
  • Patent number: 7962721
    Abstract: There is provided an information processing apparatus. The apparatus comprises: a processor; at least one I2C device; and a processor support chip. The processor support chip comprises a local service controller and a jointly addressable memory space and has an interface to the processor for the transfer of information therebetween. The processor support chip also has an interface for communication with a service processor; and an I2C interface for communication with the at least one I2C device. The local service controller has exclusive read and write access to the I2C interface; and is operable to maintain a data structure indicating a current value associated with the I2C device in the jointly addressable memory space for access by the processor and the service processor.
    Type: Grant
    Filed: May 4, 2004
    Date of Patent: June 14, 2011
    Assignee: Oracle America, Inc.
    Inventors: James E. King, Rhod J. Jones, Paul J. Garnett
  • Patent number: 7953956
    Abstract: A reconfigurable circuit of reduced circuit scale. The reconfigurable circuit of the present invention comprises a plurality of ALUs capable of changing functions. The plurality of ALUs are arranged in a matrix. At least one connection unit capable of establishing connection between the ALUs selectively is provided between the stages of the ALUs. This connection unit is not intended to allow connection between all the logic circuits in adjoining stages, but is configured so that the logic circuits are each connectable with only some of the logic circuits pertaining to the other stages. The connection limitation allows a reduction in circuit scale.
    Type: Grant
    Filed: December 21, 2004
    Date of Patent: May 31, 2011
    Assignee: Sanyo Electric Co., Ltd.
    Inventors: Makoto Okada, Tatsuo Hiramatsu, Hiroshi Nakajima, Makoto Ozone
  • Patent number: 7944932
    Abstract: A data processing system includes a plurality of processing units each having a respective point-to-point communication link with each of multiple others of the plurality of processing units but fewer than all of the plurality of processing units. Each of the plurality of processing units includes interconnect logic, coupled to each point-to-point communication link of that processing unit, that broadcasts operations received from one of the multiple others of the plurality of processing units to one or more of the plurality of processing units.
    Type: Grant
    Filed: April 1, 2008
    Date of Patent: May 17, 2011
    Assignee: International Business Machines Corporation
    Inventors: Leo J. Clark, James S. Fields, Jr., Guy L. Guthrie, William J. Starke
  • Patent number: 7940755
    Abstract: An architecture for a specialized electronic computer for high-speed data lookup employs a set of tiles each with independent processors and lookup memory portions. The tiles may be programmed to interconnect to form different memory topologies optimized for the particular task.
    Type: Grant
    Filed: March 19, 2009
    Date of Patent: May 10, 2011
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Cristian Estan, Karthikeyan Sankaralingam
  • Patent number: 7941634
    Abstract: Specialized image processing circuitry is usually implemented in hardware in a massively parallel way as a single instruction multiple data (SIMD) architecture. The invention prevents long and complicated connection paths between a processing element and the memory subsystem, and improves maximum operating frequency. An optimized architecture for image processing has processing elements that are arranged in a two-dimensional structure, and each processing element has a local storage containing a plurality of reference pixels that are not neighbors in the reference image. Instead, the reference pixels belong to different blocks of the reference image, which may vary for different encoding schemes.
    Type: Grant
    Filed: November 14, 2007
    Date of Patent: May 10, 2011
    Assignee: Thomson Licensing
    Inventors: Marco Georgi, Klaus Gaedke, Malte Borsum
  • Patent number: 7937557
    Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
    Type: Grant
    Filed: March 16, 2004
    Date of Patent: May 3, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
  • Patent number: 7937558
    Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.
    Type: Grant
    Filed: February 8, 2008
    Date of Patent: May 3, 2011
    Assignee: Coherent Logix, Incorporated
    Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
  • Patent number: 7934031
    Abstract: An asynchronous logic family of circuits which communicate on delay-insensitive flow-controlled channels with 4-phase handshakes and 1 of N encoding, compute output data directly from input data using domino logic, and use the state-holding ability of the domino logic to implement pipelining without additional latches.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: April 26, 2011
    Assignee: California Institute of Technology
    Inventors: Andrew M. Lines, Alain J. Martin, Uri Cummings
  • Publication number: 20110072237
    Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.
    Type: Application
    Filed: November 27, 2010
    Publication date: March 24, 2011
    Inventor: Gerald George Pechanek
  • Patent number: 7908422
    Abstract: A system and method for single hop, processor-to-processor communication in a multiprocessing system over a plurality of crossbars are disclosed. Briefly described, one embodiment is a multiprocessing system comprising a plurality of processors having a plurality of high-bandwidth point-to-point links; a plurality of processor clusters, each processor cluster having a predefined number of the processors residing therein; and a plurality of crossbars, one of the crossbars coupling each of the processors of one of the plurality of processor clusters to each of the processors of another of the plurality of processor clusters, such that all processors are coupled to each of the other processors, and such that the number of crossbars is equal to [X*(X?1)/2], wherein X equals the number of processor clusters.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: March 15, 2011
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Gary B. Gostin, Mark E. Shaw
  • Patent number: 7904695
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. A slot sequencer (42) in each of the computers produces a timing pulse to cause the computer (12) to execute a next instruction. However, when the present instruction is a read or write type instruction, the slot sequencer does not produce the pulse until an acknowledge signal (86) starts it. The acknowledge signal (86) is produced when it is recognized that the communication has been completed by the other computer (12).
    Type: Grant
    Filed: February 16, 2006
    Date of Patent: March 8, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
  • Patent number: 7904288
    Abstract: A hardware emulator having a variable input emulation group is described. Each emulation group comprises two or more processors, where one of the processors (a first processor) is coupled to a data input selector and another one of the processors (a second processor) processes a first amount of data received from a data array. The data input selector receives the first amount of data and a second amount of data from the data array, and selects a third amount of data from among the first and second amounts of data. The third amount of data is provided to the first processor for evaluation.
    Type: Grant
    Filed: November 6, 2006
    Date of Patent: March 8, 2011
    Assignee: Cadence Design Systems, Inc.
    Inventors: William F. Beausoleil, Beshara G. Elmufdi, Mitchell G. Poplack, Tai Su
  • Patent number: 7904615
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. A plurality of read lines (18), write lines (20) and data lines (22) interconnect the computers (12). When one computer (12) sets a read line (18) high and the other computer sets a corresponding write line (20) then data is transferred on the data lines (22). When both the read line (18) and corresponding write line (20) go low this allows both communicating computers (12) to know that the communication is completed. An acknowledge line (72) goes high to restart the computers (12).
    Type: Grant
    Filed: February 16, 2006
    Date of Patent: March 8, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore