Array Processor Patents (Class 712/10)
  • Patent number: 7734896
    Abstract: A reconfigurable integrated circuit device which converts an arbitrary calculation state dynamically, based on configuration data, includes a plurality of processor elements, each of which has an input terminal, an output terminal, a plurality of arithmetic units which are provided in parallel and each of which performs calculation processing in synchronous with a clock signal, and an intra-processor network which connects them in an arbitrary state; and an inter-processor network which connects between processor elements in an arbitrary state. Based on configuration data, the intra-processor network is reconfigurable to a desired connection state, and further, based on the configuration data, the inter-processor network is reconfigurable to a desired connection state.
    Type: Grant
    Filed: March 28, 2006
    Date of Patent: June 8, 2010
    Assignee: Fujitsu Microelectronics Limited
    Inventor: Hiroshi Furukawa
  • Patent number: 7734827
    Abstract: Secure operation of cell processors is disclosed. A cell processor receives a secure file image from a client device at a cell processor of a host device (host cell processor), wherein the secure file image includes an encrypted SPU image.
    Type: Grant
    Filed: October 24, 2005
    Date of Patent: June 8, 2010
    Assignee: Sony Computer Entertainment, Inc.
    Inventor: Tatsuya Iwamoto
  • Patent number: 7725680
    Abstract: An application specific integrated circuit (ASIC) comprises a first bus that communicates with inputs and outputs of N processing modules, where N is an integer greater than 1. A control module communicates with the first bus and a second bus that is different than the first bus, and that generates first control signals. A routing module communicates with the first bus, receives data via the second bus from a first memory, selectively routes the data to a first of the inputs, and selectively routes one of the outputs to a second of the inputs. The routing module selects the first and second of the inputs based on the first control signals.
    Type: Grant
    Filed: July 5, 2007
    Date of Patent: May 25, 2010
    Assignee: Marvell International Ltd.
    Inventors: William R. Schmidt, Douglas G. Keithley
  • Patent number: 7725679
    Abstract: The present invention provides a method and system to implement a distributed array using the distributed property as an attribute attachable to an array. The present invention maintains the top level array implementation so as to avoid making the top level users to learn how to use a brand new class for creating and manipulating distributed arrays.
    Type: Grant
    Filed: June 30, 2005
    Date of Patent: May 25, 2010
    Assignee: The MathWorks, Inc.
    Inventors: Penelope Anderson, Cleve Moler, Jos Martin, Loren Shure
  • Patent number: 7716393
    Abstract: A system includes a plurality of integrated circuits for propagating data between at least one central processing unit and another component of the system. The plurality of integrated circuits are configured for proximity I/O communication. The plurality of integrated circuits is configured such that data propagation through the plurality of integrated circuits is unaffected by a rotation of at least one of the plurality of integrated circuits by 90 degrees.
    Type: Grant
    Filed: June 9, 2005
    Date of Patent: May 11, 2010
    Assignee: Oracle America, Inc.
    Inventors: Xavier-Francois Vigouroux, Bernard Tourancheau, Cedric Koch-Hofer
  • Publication number: 20100115235
    Abstract: A technique for optimizing grace period detection in a uniprocessor environment. An update operation is performed on a data element that is shared with non-preemptible readers of the data element. A call is issued to a synchronous grace period detection method. The synchronous grace period detection method performs synchronous grace period detection and returns from the call if the data processing system implements a multi-processor environment at the time of the call. The synchronous grace period detection determines the end of a grace period in which the readers have passed through a quiescent state and cannot be maintaining references to the pre-update view of the shared data. The synchronous grace period detection method returns from the call without performing grace period detection if the data processing system implements a uniprocessor environment at the time of the call.
    Type: Application
    Filed: November 3, 2008
    Publication date: May 6, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Joshua A. Triplett
  • Patent number: 7712080
    Abstract: The present invention relates generally to computer programming, and more particularly to systems and methods for parallel distributed programming. Generally, a parallel distributed program is configured to operate across multiple processors and multiple memories. In one aspect of the invention, a parallel distributed program includes a distributed shared variable located across the multiple memories and distributed programs capable of operating across multiple processors.
    Type: Grant
    Filed: May 21, 2004
    Date of Patent: May 4, 2010
    Assignee: The Regents of the University of California
    Inventors: Lei Pan, Lubomir R. Bic, Michael B. Dillencourt
  • Patent number: 7707388
    Abstract: In one embodiment, a serial processor is configured to execute software instructions in a software program in serial. A serial memory is configured to store data for use by the serial processor in executing the software instructions in serial. A plurality of parallel processors are configured to execute software instructions in the software program in parallel. A plurality of partitioned memory modules are provided and configured to store data for use by the plurality of parallel processors in executing software instructions in parallel. Accordingly, a processor/memory structure is provided that allows serial programs to use quick local serial memories and parallel programs to use partitioned parallel memories. The system may switch between a serial mode and a parallel mode. The system may incorporate pre-fetching commands of several varieties.
    Type: Grant
    Filed: November 29, 2006
    Date of Patent: April 27, 2010
    Assignee: XMTT Inc.
    Inventor: Uzi Vishkin
  • Patent number: 7694106
    Abstract: A multiprocessor system includes a judging unit judging whether a read command inputted to a global address crossbar is a read command to a memory on an own system board, an executing unit speculatively executing, when the judging unit judges that the read command is a read command to the memory on the own system board, the read command before global access based on an address notified from the global address crossbar, a setting unit setting for queuing data read from the memory in a data queue provided on a CPU without queuing the data in a data queue provided on the memory, and an instructing unit instructing, based on notification from the global address crossbar, the data queue provided on the CPU to discard the data or transmit the data to the CPU.
    Type: Grant
    Filed: April 20, 2007
    Date of Patent: April 6, 2010
    Assignee: Fujitsu Limited
    Inventors: Toshikazu Ueki, Takaharu Ishizuka, Makoto Hataida, Takashi Yamamoto, Yuka Hosokawa, Takeshi Owaki, Daisuke Itou
  • Patent number: 7689808
    Abstract: A data processor includes a reader for reading a bit stream stored in a storage if there is free space of 8 bits or more in a buffer and outputting to a first array changer, the first array changer for changing an array sequence of the 8 bits in reversed sequence for a PNG bit stream but does not change the array sequence for a JPEG bit stream, a second array changer for further changing the array sequence of the 8 bits to output in case of PNG but outputting as it is in case of JPEG when reading out fixed length data of 8 bits from the buffer to the second processor, and a first processor for reading bits of VLC by 10 bits each from the buffer.
    Type: Grant
    Filed: April 24, 2007
    Date of Patent: March 30, 2010
    Assignee: NEC Electronics Corporation
    Inventor: Akihisa Ono
  • Patent number: 7680962
    Abstract: An array type processor comprises a data path unit to execute processing, and a state management unit to control the state of the data path unit in accordance with a command that specifies processing on the data. An input DMA circuit reads from a memory information and data to be processed including a command corresponding to the data. The input DMA circuit first transfers the command to the state management unit, and then transfers the data to be processed to the data path unit.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: March 16, 2010
    Assignee: NEC Electronics Corporation
    Inventors: Kenichiro Anjo, Katsumi Togawa, Ryoko Sasaki, Taro Fujii, Masato Motomura
  • Patent number: 7673118
    Abstract: This present invention brings to the multiprocessor what vectorization brought to the single processor. It provides similar tools to speed communication that have traditionally been used to speed computation; namely, the capability to program optimal communication algorithms on an architecture that can replicate their performance in terms of wall clock time. In addition to the usual complement of logic and arithmetic units, each processor contains a programmable communication unit that orchestrates traffic between the network and registers that communicate directly with comparable registers in neighboring processors. Communication tasks are performed out of these registers like computational tasks on a vector uniprocessor. The architecture is balanced and the hardware/software combination is scalable to any number of processors.
    Type: Grant
    Filed: June 1, 2006
    Date of Patent: March 2, 2010
    Inventor: Paul N. Swarztrauber
  • Patent number: 7673117
    Abstract: An operation apparatus able to continuously perform processing involving computations differing according to the input conditions, able to keep down useless processing when for example the processing may be interrupted in the middle if certain conditions are satisfied, and able to achieve an improvement of transfer efficiency of course and also able to keep down any increase of the system cost and able to reduce the processing time and power consumption, including an address generator for generating first source data and outputting it together with a control signal, an address generator for generating second source data and outputting it together with control signals, and an operation element for performing predetermined operation with respect to the first source data from the first generator and the second source data by the second generator while switching the type of operation in accordance with a control signal and having registers for temporarily holding operation results, wherein reading and writing of
    Type: Grant
    Filed: January 30, 2006
    Date of Patent: March 2, 2010
    Assignee: Sony Corporation
    Inventor: Tomoo Nagai
  • Patent number: 7661006
    Abstract: A computer implemented method, apparatus, and computer program product for managing symmetric multiprocessor interconnects. The process identifies functional communication connections between each processor in a plurality of processors on a multiprocessor to form identified functional communication connections. The process maps every functional communication connection between any two processors in the plurality of processors, based on the identified functional communication connections, to form an interconnect matrix. The process creates a path map using the interconnect matrix. The path map comprises a sequence of communication connections between the plurality of processors. The process initializes the plurality of processors using the path map.
    Type: Grant
    Filed: January 9, 2007
    Date of Patent: February 9, 2010
    Assignee: International Business Machines Corporation
    Inventors: Luai A. Abou-Emara, Mark David McLaughlin, Jorge N. Yanez
  • Patent number: 7650484
    Abstract: An array-type computer processor including a data path unit communicating with a state control unit obtains data of a predetermined number of cooperative partial instruction codes, and operates with temporarily holding only a predetermined number of data-obtained instruction codes comprising cooperative partial instruction codes corresponding to contexts and operation states for the data path unit and the state control unit, respectively, from an external program memory which stores data of a computer program.
    Type: Grant
    Filed: February 3, 2005
    Date of Patent: January 19, 2010
    Assignees: NEC Corporation, NEC Electronics Corporation
    Inventors: Takeshi Inuo, Nobuki Kajihara, Takao Toi, Tooru Awashima, Hirokazu Kami, Taro Fujii, Kenichiro Anjo, Kouichiro Furuta, Masato Motomura
  • Patent number: 7647472
    Abstract: An integrated circuit (203) for use in processing streams of data generally and streams of packets in particular. The integrated circuit (203) includes a number of packet processors (307, 313, 303), a table look up engine (301), a queue management engine (305) and a buffer management engine (315). The packet processors (307, 313, 303) include a receive processor (421), a transmit processor (427) and a risc core processor (401), all of which are programmable. The receive processor (421) and the core processor (401) cooperate to receive and route packets being received and the core processor (401) and the transmit processor (427) cooperate to transmit packets. Routing is done by using information from the table look up engine (301) to determine a queue (215) in the queue management engine (305) which is to receive a descriptor (217) describing the received packet's payload.
    Type: Grant
    Filed: August 25, 2006
    Date of Patent: January 12, 2010
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Thomas B. Brightman, Andrew D. Funk, David J. Husak, Edward J. McLellan, Andrew T. Brown, John F. Brown, James A. Farrell, Donald A. Priore, Mark A. Sankey, Paul Schmitt
  • Patent number: 7647485
    Abstract: A data processing device for debugging code for a parallel arithmetic device that includes a plurality of data processing circuits arranged in a matrix and that causes, for each operating cycle, successive transitions of operation states in accordance with object code includes: operation execution means for causing the parallel arithmetic device to execute state transitions by means of the object code; device halt means for temporarily halting the state transitions for each operating cycle; a result output means for reading and supplying as output at least a portion of held data, connection relations, and operation commands of the plurality of data processing circuits of the halted parallel arithmetic device; a resume input means for receiving as input a resume command of the state transitions; and an operation resumption means for causing the operation execution means to resume the state transitions upon input of a resume command.
    Type: Grant
    Filed: August 27, 2004
    Date of Patent: January 12, 2010
    Assignees: NEC Corporation, NEC Electronics Corporation
    Inventors: Hirokazu Kami, Takao Toi, Toru Awashima, Kenichiro Anjo, Koichiro Furuta, Taro Fujii, Masato Motomura
  • Patent number: 7644255
    Abstract: Methods and apparatus provide for disabling at least some data path processing circuits of a SIMD processing pipeline, in which the processing circuits are organized into a matrix of slices and stages, in response to one or more enable flags during a given cycle.
    Type: Grant
    Filed: January 13, 2005
    Date of Patent: January 5, 2010
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Yonetaro Totsuka
  • Patent number: 7644254
    Abstract: A method and apparatus for dynamically rerouting node processes on the compute nodes of a massively parallel computer system using hint bits to route around failed nodes or congested networks without restarting applications executing on the system. When a node has a failure or there are indications that it may fail, the application software on the system is suspended while the data on the failed node is moved to a backup node. The torus network traffic is routed around the failed node and traffic for the failed node is rerouted to the backup node. The application can then resume operation without restarting from the beginning.
    Type: Grant
    Filed: April 18, 2007
    Date of Patent: January 5, 2010
    Assignee: International Business Machines Corporation
    Inventors: David L. Darrington, Patrick Joseph McCarthy, Amanda Peters, Albert Sidelnik, Brian Edward Smith, Brent Allen Swartz
  • Patent number: 7640155
    Abstract: A target interface system for interfacing selected components of a communication system and methods for manufacturing and using same. The target interface system includes target interface logic that is distributed among a plurality of reconfigurable logic devices. Being coupled via a serial link, the reconfigurable logic devices each have an input connection for receiving incoming data packets and an output connection for providing outgoing data packets. The serial link couples the input and output connections of successive reconfigurable logic devices to form a dataring structure for distributing the data packets among the reconfigurable logic devices. Thereby, the dataring structure maintains data synchronization among the reconfigurable logic devices such that the distribution of the target interface logic among the reconfigurable logic devices is transparent to software.
    Type: Grant
    Filed: May 31, 2005
    Date of Patent: December 29, 2009
    Assignee: QuickTurn Design Systems, Inc.
    Inventors: Mitchell G. Poplack, John A. Maher
  • Patent number: 7636835
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor, and a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles. The integrated circuit further comprises one or more interface modules including circuitry to transfer data to and from a device external to the tiles; and a sub-port routing network including circuitry to route data between a port of a switch and a plurality of sub-ports coupled to one or more interface modules.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: December 22, 2009
    Assignee: Tilera Corporation
    Inventors: Carl G. Ramey, David Wentzlaff, Anant Agarwal
  • Patent number: 7634159
    Abstract: An array transform system for parallel computation of a plurality of elements of an array transform includes a memory for storing an array of data elements. Each column of data elements from the memory is copied to a shifter that shifts the column of data elements in accordance with a shift value to produce a shifted column of data elements. The shifted columns of data elements are accumulated in a plurality of accumulators, with each accumulator producing an element of the array transform. A controller controls the shift value dependent upon the position of the column of data elements in the array of data elements.
    Type: Grant
    Filed: December 8, 2004
    Date of Patent: December 15, 2009
    Assignee: Motorola, Inc.
    Inventors: Malcolm R. Dwyer, James E. Crenshaw, Zhiyuan Li
  • Patent number: 7627736
    Abstract: A data processing apparatus includes a plurality of processing elements arranged in a single instruction multiple data array. The apparatus is operable to process multiple instructions streams in parallel with one another.
    Type: Grant
    Filed: May 18, 2007
    Date of Patent: December 1, 2009
    Assignee: ClearSpeed Technology plc
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Publication number: 20090259824
    Abstract: A reconfigurable integrated circuit is provided wherein the available hardware resources can be optimised for a particular application. Dynamically reconfiguring (in both real-time and non real-time) the available resources and sharing a plurality of processing elements with a plurality of controller elements achieve this. In a preferred embodiment the integrated circuit includes a plurality of processing blocks, which interface to a reconfigurable interconnection means. A processing block has two forms, namely a shared resource block and a dedicated resource block. Each processing block consists of one or a plurality of controller elements and a plurality of processing elements. The controller element and processing element generally comprise diverse rigid coarse and fine grained circuits and are interconnected through dedicated and reconfigurable interconnect. The processing blocks can be configured as a hierarchy of blocks and or fractal architecture.
    Type: Application
    Filed: June 24, 2009
    Publication date: October 15, 2009
    Applicant: AKYA (HOLDINGS) LIMITED
    Inventors: Graeme Roy SMITH, Dyson WILKES
  • Patent number: 7595659
    Abstract: A logic cell array having a number of logic cells and a segmented bus system for logic cell communication, the bus system including different segment lines having shorter and longer segments for connecting two points in order to be able to minimize the number of bus elements traversed between separate communication start and end points.
    Type: Grant
    Filed: October 8, 2001
    Date of Patent: September 29, 2009
    Assignee: Pact XPP Technologies AG
    Inventors: Martin Vorbach, Frank May, Dirk Reichardt, Frank Lier, Gerd Ehlers, Armin NĂĽckel, Volker Baumgarte, Prashant Rao, Jens Oertel
  • Patent number: 7581079
    Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.
    Type: Grant
    Filed: March 26, 2006
    Date of Patent: August 25, 2009
    Inventor: Gerald George Pechanek
  • Patent number: 7577823
    Abstract: The present invention relates to a multi-processor computer system comprising at least two processors for parallel execution of processes, at least two cache memory units, each being associated with and connected to a separate processor, a connection bus connecting said processors and said cache memory units, and a process list unit connected to said connection line for storing a process list of processes to be available for execution by said processors.
    Type: Grant
    Filed: June 23, 2003
    Date of Patent: August 18, 2009
    Assignee: NXP B.V.
    Inventor: Jan Hoogerbrugge
  • Patent number: 7577820
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor including a storage module, wherein the processor is configured to process multiple streams of instructions, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, and coupling circuitry configured to couple data resulting from processing an instruction from at least one of the streams of instructions to the storage module and to the switch.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: August 18, 2009
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal
  • Patent number: 7571300
    Abstract: A memory system includes a plurality of memory blocks, each having a dedicated local arithmetic logic unit (ALU). A data value having a plurality of bytes is stored such that each of the bytes is stored in a corresponding one of the memory blocks. In a read-modify-write operation, each byte of the data value is read from the corresponding memory block, and is provided to the corresponding ALU. Similarly, each byte of a modify data value is provided to a corresponding ALU on a memory data bus. Each ALU combines the read byte with the modify byte to create a write byte. Because the write bytes are all generated locally within the ALUs, long signal delay paths are avoided. Each ALU also generates two possible carry bits in parallel, and then uses the actual received carry bit to select from the two possible carry bits.
    Type: Grant
    Filed: January 8, 2007
    Date of Patent: August 4, 2009
    Assignee: Integrated Device Technologies, Inc.
    Inventor: Tak Kwong Wong
  • Patent number: 7568084
    Abstract: A basic cell capable of a fixed operating frequency regardless of the configuration information, which is also capable of effectively utilizing the arithmetic logic circuit within the cell in a LSI semiconductor integrated circuit, is capable of dynamic changes in configuration information. The circuit has an input switch ISW connected to multiple data input nodes, an output switch OSW connected to multiple data output nodes, a first data path containing an arithmetic logic circuit ALU and a result storage flip-flop CFF0 between the input switch ISW and output switch OSW. The second data path containing a data transfer flip-flop between an input switch ISW and an output switch OSW, and the result storage flip-flop CFF stores the calculated result data from the arithmetic logic circuit ALU, and the data transfer flip-flop holds data input from any of the multiple data input nodes.
    Type: Grant
    Filed: July 9, 2004
    Date of Patent: July 28, 2009
    Assignee: Hitachi, Ltd.
    Inventors: Hiroshi Tanaka, Yohei Akita, Tetsuro Honmura, Fumio Arakawa, Takanobu Tsunoda
  • Patent number: 7565287
    Abstract: Techniques for implementing vocoders in parallel digital signal processors are described. A preferred approach is implemented in conjunction with the BOPS® Manifold Array (ManArray™) processing architecture so that in an array of N parallel processing elements, N channels of voice communication are processed in parallel. Techniques for forcing vocoder processing of one data-frame to take the same number of cycles are described. Improved throughput and lower clock rates can be achieved.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: July 21, 2009
    Assignee: Altera Corporation
    Inventors: Ali Soheil Sadri, Navin Jaffer, Anissim A. Silivra, Bin Huang, Matthew Plonski
  • Patent number: 7558943
    Abstract: A processing unit includes a control processor and a plurality of element processors having register files. At least two of the element processors pre-receive different parameters, store the parameter data in the register files, receive the same memory address and the same instruction broadcast by the control processor, read the same data from the external memory via a memory port based on the memory address, and perform at least one of logic computation and arithmetic computation for the same data in accordance with the same instruction and based on the different parameters.
    Type: Grant
    Filed: July 26, 2005
    Date of Patent: July 7, 2009
    Assignee: Riken
    Inventors: Toshikazu Ebisuzaki, Jun-ichiro Makino
  • Patent number: 7555566
    Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node may be used individually or simultaneously to work on any combination of computation or communication as required by the particular algorithm being solved or executed at any point in time. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency.
    Type: Grant
    Filed: February 25, 2002
    Date of Patent: June 30, 2009
    Assignee: International Business Machines Corporation
    Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
  • Patent number: 7555637
    Abstract: A computer (12) having multiple data paths (38a-d) connecting to other devices, which may be similar computers. A register (40d) is provided that has bits (110) programmatically settable to address each of the data paths such that the computer can communicate via multiple of the data paths based on which bits are concurrently set in the register. The bits respectively represent instances of the other devices as source devices that the computer can read data from and instances of the other devices as destination devices that the computer can write data to. A single address in the register can represent both a source device and a destination device for data communicated by the computer. Optionally, multiple of the computers can be connected in series (termed a pipeline) or to form an array (10).
    Type: Grant
    Filed: April 27, 2007
    Date of Patent: June 30, 2009
    Assignee: VNS Portfolio LLC
    Inventor: John W. Rible
  • Patent number: 7555630
    Abstract: A context forwarding bus efficiently communicates control and data between processing elements in a processor unit having a plurality of processing elements. Control and data information is transferred over a first bus from processing element to processing element.
    Type: Grant
    Filed: December 21, 2004
    Date of Patent: June 30, 2009
    Assignee: Intel Corporation
    Inventors: Sanjeev Jain, Gilbert M. Wolrich, Mark B. Rosenbluth
  • Patent number: 7552312
    Abstract: Methods, parallel computers, and products are provided for identifying messaging completion on a parallel computer. The parallel computer includes a plurality of compute nodes, the compute nodes coupled for data communications by at least two independent data communications networks including a binary tree data communications network optimal for collective operations that organizes the nodes as a tree and a torus data communications network optimal for point to point operations that organizes the nodes as a torus. Embodiments include reading all counters at each node of the torus data communications network; calculating at each node a current node value in dependence upon the values read from the counters at each node; and determining for all nodes whether the current node value for each node is the same as a previously calculated node value for each node.
    Type: Grant
    Filed: February 9, 2007
    Date of Patent: June 23, 2009
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Camesha R. Hardwick, Patrick J. McCarthy, Brian P. Wallenfelt
  • Patent number: 7539845
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor, and a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles. The integrated circuit further comprises an interface coupled to a plurality of the tiles to transfer data between one or more switches of the tiles and one or more switches of tiles in an externally coupled integrated circuit.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: May 26, 2009
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Carl G. Ramey, Anant Agarwal
  • Patent number: 7532244
    Abstract: A high-speed vision sensor includes: an analog-to-digital converter array 13, in which one analog-to-digital converter 210 is provided in correspondence with all the photodetector elements 120 that are located on each row in a photodetector array 11; a parallel processing system 14 that includes processor elements 400 and shift registers 410, both of which form a one-to-one correspondence with the photodetector elements 120; and data buses 17, 18 and data buffers 19 and 20 for data transfer to processing elements 400. The processing elements 400 perform high-speed image processing between adjacent pixels by parallel processings. By using the data buses 17, 18, it is possible to attain, at a high rate of speed, such calculation processing that requires data supplied from outside.
    Type: Grant
    Filed: August 17, 2005
    Date of Patent: May 12, 2009
    Assignee: Hamamatsu Photonics K.K.
    Inventors: Masatoshi Ishikawa, Haruyoshi Toyoda
  • Publication number: 20090119479
    Abstract: An integrated circuit arrangement has a processor array (2) with processor elements (4) and a memory (6) with memory elements (8) arranged in rows (32) and columns (30). The columns (30) of memory elements (8) are addressed by respective processor elements (4). An input sequencer (14) and feedback path (24) cooperate to reorder input data in the memory (6) to carry out both block and line based processing.
    Type: Application
    Filed: May 16, 2007
    Publication date: May 7, 2009
    Applicant: NXP B.V.
    Inventors: Richard P. Kleihorst, Anteneh A. Abbo, Vishal S. Choudhary
  • Patent number: 7526630
    Abstract: A controller operable to control an array of processing elements comprises a retrieval unit operable to retrieve instruction items for each of a plurality of instructions streams, each instruction stream having a plurality of instructions items, a combining unit operable to combine the plurality of instruction streams into a serial instruction stream, and a distribution unit operable to distribute the serial instruction stream to an array of processing elements.
    Type: Grant
    Filed: January 4, 2007
    Date of Patent: April 28, 2009
    Assignee: Clearspeed Technology, PLC
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russel David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 7523292
    Abstract: A multiplicity of processor elements, which both individually execute data processing in accordance with instruction codes that have been set as data and for which mutual connection relations are switch-controlled, are arranged in matrix form, and the instruction codes of this multiplicity of processor elements are successively switched by a state control unit. The state control units are composed of a plurality of units that intercommunicate to realize linked operation, and the multiplicity of processor elements is divided into a number of element areas that corresponds to the number of state control units. The plurality of state control units are arranged for each of the plurality of element areas and are connected to the processor elements, whereby the plurality of state control units can individually control a plurality of small-scale state transitions, or the plurality of state control units can cooperate to control a single large-scale state transition.
    Type: Grant
    Filed: October 10, 2003
    Date of Patent: April 21, 2009
    Assignee: NEC Electronics Corporation
    Inventors: Taro Fujii, Koichiro Furuta, Masato Motomura, Kenichiro Anjo, Yoshikazu Yabe, Toru Awashima, Takao Toi, Noritsugu Nakamura
  • Patent number: 7519975
    Abstract: A method and apparatus for exception handling in a multi-processor environment are described. In an embodiment, a method for handling a number of exceptions within a processor in a multi-processing system includes receiving an exception within the processor, wherein each processor in the multi-processor system shares a same memory. The method also includes executing a number of instructions at an address within a common interrupt handling vector address space of the same memory. The number of instructions cause the processor to determine an identification of the processor based on a query that is internal to the processor. Additionally, the method includes modifying execution flow of the exception to execute an interrupt handler located within one of a number of different interrupt handling vector address spaces.
    Type: Grant
    Filed: July 24, 2006
    Date of Patent: April 14, 2009
    Assignee: Redback Networks, Inc.
    Inventor: Sanjay Lal
  • Patent number: 7515899
    Abstract: Additional computing power is captured using the idle processing power of mobile phones incorporated into a grid computing system, wherein the system is capable of pushing projects out to available mobile phones for processing during idle operation times. To further efficiently utilize the unused processing cycles of mobile phones, a unique protocol is utilized to coordinate processing tasks which makes use of existing short messages techniques to communicate projects. The unique protocol is combination of bootstrapping using standard compression techniques along with an adaptive compression scheme.
    Type: Grant
    Filed: April 23, 2008
    Date of Patent: April 7, 2009
    Assignee: International Business Machines Corporation
    Inventors: Hollie Carr, Peter Mattison, Christopher E. Sharp
  • Patent number: 7516300
    Abstract: An integrated active memory device includes an array of processing elements coupled to a dynamic random access memory device and to a component supplying instructions to the processing elements. The processing elements are logically arranged in a plurality of logical rows and logical columns. The array is logically folded to minimize the length of the longest path between processing elements by physically interleaving the processing elements so that the processing elements in different logical rows a physically interleaved with each other and the processing elements in different logical columns a physically interleaved with each other.
    Type: Grant
    Filed: October 5, 2006
    Date of Patent: April 7, 2009
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Publication number: 20090070549
    Abstract: The connection architecture of a network on a chip (NoC) is described in which (a) nodes in octahedron sections are connected in an arc Benes network, (b) a hierarchy of node clusters are connected using a globally asynchronous locally asynchronous (GALA) configuration, (c) a double wishbone 2D torus ring is applied to connection between network layers and (d) data is routed using buffer modulation.
    Type: Application
    Filed: September 12, 2008
    Publication date: March 12, 2009
    Applicant: Solomon Research LLC
    Inventor: Neal Solomon
  • Publication number: 20090063811
    Abstract: A system is provided for implementing a multi-tiered full-graph interconnect architecture. In order to implement a multi-tiered full-graph interconnect architecture, a plurality of processors are coupled to one another to create a plurality of processor books. The plurality of processor books are coupled together to create a plurality of supernodes. Then, the plurality of supernodes are coupled together to create the multi-tiered full-graph interconnect architecture. Data is then transmitted from one processor to another within the multi-tiered full-graph interconnect architecture based on an addressing scheme that specifies at least a supernode and a processor book associated with a target processor to which the data is to be transmitted.
    Type: Application
    Filed: August 27, 2007
    Publication date: March 5, 2009
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, Edward J. Seminaro, William E. Speight
  • Patent number: 7487271
    Abstract: A multiprocessor system (100) for sharing memory has a memory (102), and two or more processors (104). The processors are programmed to establish (202) memory buffer pools between the processors, and for each memory buffer pool, establish (204) an array of buffer pointers that point to corresponding memory buffers. The processors are further programmed to, for each array of buffer pointers, establish (206) a consumption pointer for the processor owning the memory buffer pool, and a release pointer for another processor sharing said memory buffer pool, each pointer initially pointing to a predetermined location of the array, and adjust (208-236) the consumption and release pointers according to buffers consumed and released.
    Type: Grant
    Filed: September 22, 2005
    Date of Patent: February 3, 2009
    Assignee: Motorola, Inc.
    Inventors: Charbel Khawand, Jean Khawand, Bin Liu
  • Patent number: 7478222
    Abstract: Disclosed is an array of programmable data-processing cells configured as a plurality of cross-connected pipelines. An apparatus includes cells capable of performing data-processing functions selectable by a presented instruction. A first set of cells includes an input cell, an output cell, and a series of at least one interior cell providing an acyclic data processing path from the input cell to the output cell. Additional cells are similarly configured. Memory presents configuration instructions to cells in response to a configuration code. Data advances through ranks of the cells. The configuration code advances to memory associated with a rank in tandem with the data.
    Type: Grant
    Filed: March 28, 2006
    Date of Patent: January 13, 2009
    Inventor: Karl M. Fant
  • Patent number: 7471643
    Abstract: A heterogeneous array includes clusters of processing elements. The clusters include a combination of ALUs and multiplexers linked by direct connections and various general-purpose routing networks. The multiplexers are controlled by the ALUs in the same cluster, or alternatively by ALUs in other clusters, via a dedicated multiplexer control network. Components of applications configured onto the array are selectively implemented in either multiplexers or ALUs, as determined by the relative efficiency of implementing the component in one or the other type of processing element, and by the relative availability of the processing element types. Multiplexer control signals are generated from combinations of ALU status signals, and optionally routed to control multiplexers in different clusters.
    Type: Grant
    Filed: July 1, 2002
    Date of Patent: December 30, 2008
    Assignee: Panasonic Corporation
    Inventor: Anthony I. Stansfield
  • Patent number: 7472392
    Abstract: One aspect of the present invention relates to a method for balancing the load of an n-dimensional array of processing elements (PEs), wherein each dimension of the array includes the processing elements arranged in a plurality of lines and wherein each of the PEs has a local number of tasks associated therewith. The method comprises balancing at least one line of PEs in a first dimension, balancing at least one line of PEs in a next dimension, and repeating the balancing at least one line of PEs in a next dimension for each dimension of the n-dimensional array. The method may further comprise selecting one or more lines within said first dimension and shifting the number of tasks assigned to PEs in said selected one or more lines.
    Type: Grant
    Filed: October 20, 2003
    Date of Patent: December 30, 2008
    Assignee: Micron Technology, Inc.
    Inventor: Mark Beaumont