Array Processor Operation Patents (Class 712/16)

Application specific (Class 712/17)

Data flow array processor (Class 712/18)

Systolic array processor (Class 712/19)

Multimode (e.g., mimd to simd, etc.) (Class 712/20)

Multiple instruction, multiple data (mimd) (Class 712/21)

Single instruction, multiple data (simd) (Class 712/22)

Matrix processor data switch routing systems and methods

Patent number: 8145880

Abstract: According to some embodiments, an integrated circuit comprises a microprocessor matrix of mesh-interconnected matrix processors. Each processor comprises a data switch including a data switch link register and matrix routing logic. The data switch link register includes one or more matrix link-enable register fields specifying a link enable status (e.g. a message-independent, p-to-p, and/or broadcast link enable status) for each inter-processor matrix link of the processor. The matrix routing logic routes inter-processor messages according to the matrix link-enable register field(s). A particular link may be selected by a current matrix processor by selecting an ordered list of matrix links according to a relationship between ΔH and ΔV, and choosing the first enabled link in the selected list for routing.
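The link selection described in this abstract can be illustrated with a minimal sketch. It assumes that ΔH and ΔV denote the horizontal and vertical offsets from the current processor to the destination, that links are named N, E, S and W, and that the axis with the larger remaining offset is tried first. None of these details is stated in the abstract; the function names are hypothetical.

```python
from typing import Dict, List, Optional


def candidate_links(dh: int, dv: int) -> List[str]:
    """Return the links that move toward the destination, in preference order.

    Assumption: positive dh means the destination lies to the east and
    positive dv means it lies to the south.
    """
    horiz = "E" if dh > 0 else "W" if dh < 0 else None
    vert = "S" if dv > 0 else "N" if dv < 0 else None
    # Assumed ordering rule: try the axis with the larger remaining offset first.
    ordered = [horiz, vert] if abs(dh) >= abs(dv) else [vert, horiz]
    return [link for link in ordered if link is not None]


def select_link(dh: int, dv: int, link_enable: Dict[str, bool]) -> Optional[str]:
    """Pick the first enabled link from the ordered list, or None if none is usable."""
    for link in candidate_links(dh, dv):
        if link_enable.get(link, False):
            return link
    return None
```

For example, `select_link(3, 1, {"E": False, "S": True})` falls through the disabled east link and returns the south link, which is the "first enabled link in the selected list" behaviour.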

Type: Grant

Filed: July 7, 2008

Date of Patent: March 27, 2012

Assignee: Ovics

Inventors: Sorin C Cismas, Ilie Garbacea
Image processing method and device

Patent number: 8135229

Abstract: An image processing method and device for processing multiple rows of pixels of an image simultaneously with a single instruction. The processing includes selecting a pixel window having a plurality of pixels of an image spanning across multiple rows and columns, building vertical and horizontal load registers to include the plurality of pixels of the selected pixel window, and simultaneously processing selected pixels of the plurality of pixels included in the vertical and horizontal load registers using a single instruction, wherein the vertical and horizontal load registers are shifted when the selected pixels are processed. Accordingly, a method and device for efficient processing of an image is provided.

Type: Grant

Filed: January 8, 2009

Date of Patent: March 13, 2012

Assignee: Marvell International Technology Ltd.

Inventors: Douglas G. Keithley, Roy G. Moss
Matrix processor initialization systems and methods

Patent number: 8131975

Abstract: In some embodiments, an integrated circuit comprises a microprocessor matrix including a plurality of mesh-interconnected matrix processors, wherein each matrix processor comprises a data switch configured to direct inter-processor communications within the matrix, and a mapping unit configured to generate a data switch functionality map for a plurality of data switches in the microprocessor matrix. The data switch functionality map is generated by sending a first message through the matrix and setting a first functionality status designation for the first data switch in the data switch functionality map upon receiving a reply to the first message from a first data switch through the matrix.

Type: Grant

Filed: July 7, 2008

Date of Patent: March 6, 2012

Assignee: Ovics

Inventors: Sorin C Cismas, Ilie Garbacea
Processing system with interspersed processors using selective data transfer through communication elements

Patent number: 8112612

Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.

Type: Grant

Filed: May 17, 2010

Date of Patent: February 7, 2012

Assignee: Coherent Logix, Incorporated

Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
Data processing apparatus and method of controlling the data processing apparatus

Patent number: 8108661

Abstract: Provided are a data processing apparatus and a method of controlling the data processing apparatus. The data processing apparatus may select a single stream processor from a plurality of stream processors based on stream processor status information, and input data into the selected stream processor. The stream processor status information may include first status information of a processor core and second status information of at least one internal memory.

Type: Grant

Filed: April 2, 2009

Date of Patent: January 31, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Won Jong Lee, Chan Min Park, Shi Hwa Lee
Linking functional blocks for sequential operation by DONE and GO components of respective blocks pointing to same memory location to store completion indicator read as start indicator

Patent number: 8103855

Abstract: The present disclosure provides a methodology for reducing congestion of a processing unit, preferably by configuring a plurality of functional blocks to run in parallel or in series without the influence or input from the processing unit. In an embodiment, the present method chains a plurality of functional blocks together by software so that one functional block starts after the completion of another functional block. The configuration of the chain can be series, parallel, and any combination thereof, arranged to meet the circuit's objective. The chaining can be configured and re-configured, preferably by software input. The chaining can also be performed at design time or at run time. The chaining can also be modified, preferably at design time, but can also be modified at run time.

Type: Grant

Filed: June 29, 2008

Date of Patent: January 24, 2012

Assignee: Navosha Corporation

Inventors: Hirak Mitra, Raj Kulkarni, Richard Wicks, Michael Moon
Coherency groups of serially coupled processing cores propagating coherency information containing write packet to memory

Patent number: 8090913

Abstract: A system has a first plurality of cores in a first coherency group. Each core transfers data in packets. The cores are directly coupled serially to form a serial path. The data packets are transferred along the serial path. The serial path is coupled at one end to a packet switch. The packet switch is coupled to a memory. The first plurality of cores and the packet switch are on an integrated circuit. The memory may or may not be on the integrated circuit. In another aspect a second plurality of cores in a second coherency group is coupled to the packet switch. The cores of the first and second pluralities may be reconfigured to form or become part of coherency groups different from the first and second coherency groups.

Type: Grant

Filed: December 20, 2010

Date of Patent: January 3, 2012

Assignee: Freescale Semiconductor, Inc.

Inventors: Perry H. Pelley, III, George P. Hoekstra, Lucio F. Pessoa
Stream processing system having a reconfigurable memory module

Patent number: 8086824

Abstract: A stream processing system includes a stream processing module coupled to a memory module and operable so as to fetch stream elements from the memory module, to process the stream elements fetched thereby, and to store processed stream elements in the memory module. The stream processing module includes a number (N) of stream processing units, and the memory module is configured with a number (N) of memory bank units each corresponding to a respective one of the stream processing units. The memory module is reconfigurable based on a desired inter-level configuration so that each of the memory bank units is configured to have a memory size sufficient to meet processing requirement of the respective one of the stream processing units.

Type: Grant

Filed: May 5, 2009

Date of Patent: December 27, 2011

Assignee: National Taiwan University

Inventors: You-Ming Tsao, Liang-Gee Chen, Shao-Yi Chien
MESSAGE BROADCAST WITH ROUTER BYPASSING

Publication number: 20110314255

Abstract: A processor and method for broadcasting data among a plurality of processing cores is disclosed. The processor includes a plurality of processing cores connected by point-to-point connections. A first of the processing cores includes a router that includes at least an allocation unit and an output port. The allocation unit is configured to determine that respective input buffers on at least two others of the processing cores are available to receive given data. The output port is usable by the router to send the given data across one of the point-to-point connections. The router is configured to send the given data contingent on determining that the respective input buffers are available. Furthermore, the processor is configured to deliver the data to the at least two other processing cores in response to the first processing core sending the data once across the point-to-point connection.

Type: Application

Filed: June 17, 2010

Publication date: December 22, 2011

Inventors: Tushar Krishna, Bradford M. Beckmann, Steven K. Reinhardt
Processor for Large Graph Algorithm Computations and Matrix Operations

Publication number: 20110307685

Abstract: A multiprocessor system and method for performing matrix operations includes multiple processors cooperatively performing a sparse matrix operation. Distributed among the processors are non-zero matrix elements of first and second sparse matrices. Mapped across the processors are the matrix elements of a results matrix. Each processor receives, from the other processors, non-zero matrix elements of the first matrix that had been distributed to those other processors and generates partial results based on the received non-zero matrix elements of the first matrix and on the non-zero matrix elements of the second matrix distributed to that processor. Each processor receives those partial results generated by other processors and associated with the matrix elements of the results matrix mapped to that processor.

Type: Application

Filed: June 6, 2011

Publication date: December 15, 2011

Inventor: William S. Song
Reconfigurable array processor for floating-point operations

Patent number: 8078835

Abstract: A processor for performing floating-point operations includes an array of processing elements arranged to enable a floating-point operation. Each processing element includes an arithmetic logic unit to receive two input values and perform integer arithmetic on the received input values. The processing elements in the array are connected together in groups of two or more processing elements to enable floating-point operation.

Type: Grant

Filed: May 23, 2008

Date of Patent: December 13, 2011

Assignee: Core Logic, Inc.

Inventors: Hoon Mo Yang, Man Hwee Jo, Il Hyun Park, Ki Young Choi
Processor architectures for enhanced computational capability

Patent number: 8078834

Abstract: A digital signal processor includes a control block configured to issue instructions based on a stored program, and a compute array including two or more compute engines configured such that each of the issued instructions executes in successive compute engines of at least a subset of the compute engines at successive times. The digital signal processor may be utilized with a control processor or as a stand-alone processor. The compute array may be configured such that each of the issued instructions flows through successive compute engines of at least a subset of the compute engines at successive times.

Type: Grant

Filed: January 9, 2008

Date of Patent: December 13, 2011

Assignee: Analog Devices, Inc.

Inventor: Douglas Garde
Method and arrangement for cache memory management, related processor architecture

Patent number: 8078804

Abstract: A data cache memory coupled to a processor including processor clusters are adapted to operate simultaneously on scalar and vectorial data by providing data locations in the data cache memory for storing data for processing. The data locations are accessed either in a scalar mode or in a vectorial mode. This is done by explicitly mapping the data locations that are scalar and the data locations that are vectorial.

Type: Grant

Filed: June 26, 2007

Date of Patent: December 13, 2011

Assignees: STMicroelectronics S.r.l., STMicroelectronics N.V.

Inventors: Francesco Pappalardo, Giuseppe Notarangelo, Elena Salurso, Elio Guidetti
Performing A Deterministic Reduction Operation In A Parallel Computer

Publication number: 20110296137

Abstract: A parallel computer includes compute nodes having computer processors and a CAU (Collectives Acceleration Unit) that couples processors to one another for data communications. In embodiments of the present invention, a deterministic reduction operation includes: organizing processors of the parallel computer and a CAU into a branched tree topology, where the CAU is a root of the branched tree topology and the processors are children of the root CAU; establishing a receive buffer that includes receive elements associated with processors and configured to store the associated processor's contribution data; receiving, in any order from the processors, each processor's contribution data; tracking receipt of each processor's contribution data; and reducing the contribution data in a predefined order, only after receipt of contribution data from all processors in the branched tree topology.
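The steps listed in this abstract (buffer each contribution, track receipt, reduce in a fixed order once everything has arrived) can be sketched as follows. The class name, the use of the processor rank as the predefined order, and the single-level tree are assumptions made for illustration, not details from the application.

```python
from typing import Callable, List, Optional


class DeterministicReducer:
    """Root of a one-level tree that reduces child contributions in rank order."""

    def __init__(self, num_processors: int, op: Callable[[float, float], float]):
        # One receive element per child processor; None marks "not yet received".
        self.slots: List[Optional[float]] = [None] * num_processors
        self.received = 0
        self.op = op

    def receive(self, rank: int, value: float) -> Optional[float]:
        """Store one contribution; return the reduced value once all have arrived."""
        if self.slots[rank] is not None:
            raise ValueError(f"duplicate contribution from processor {rank}")
        self.slots[rank] = value
        self.received += 1
        if self.received < len(self.slots):
            return None  # still waiting for other processors
        # All contributions are in: reduce in rank order (assumed predefined
        # order), regardless of the order in which they arrived.
        result = self.slots[0]
        for contribution in self.slots[1:]:
            result = self.op(result, contribution)
        return result
```

Fixing the reduction order matters for operations such as floating-point addition, where different groupings can give different rounded results; with this sketch the result depends only on the values and their ranks, not on arrival order.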

Type: Application

Filed: May 28, 2010

Publication date: December 1, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
Shared-memory multiprocessor system and method for processing information

Patent number: 8065337

Abstract: Large-scale table data stored in a shared memory are sorted by a plurality of processors in parallel. According to the present invention, the records subjected to processing are first divided for allocation to the plurality of processors. Then, each processor counts the numbers of local occurrences of the field value sequence numbers associated with the records to be processed. The numbers of local occurrences of the field value sequence numbers counted by each processor are then converted into global cumulative numbers, i.e., the cumulative numbers used in common by the plurality of processors. Finally, each processor utilizes the global cumulative numbers as pointers to rearrange the order of the allocated records.
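The four steps in this abstract resemble a partitioned counting sort. The sketch below simulates the processors sequentially and assumes that the field value sequence numbers are small integers in `range(num_values)` and that records are split into contiguous chunks; the function name and parameters are hypothetical.

```python
from typing import List


def partitioned_counting_sort(keys: List[int], num_values: int, num_workers: int) -> List[int]:
    """Return record indices ordered by key, keeping equal keys in input order."""
    n = len(keys)
    # Step 1: divide the records into contiguous chunks, one per worker.
    bounds = [n * w // num_workers for w in range(num_workers + 1)]

    # Step 2: each worker counts local occurrences of every key value.
    local = [[0] * num_values for _ in range(num_workers)]
    for w in range(num_workers):
        for i in range(bounds[w], bounds[w + 1]):
            local[w][keys[i]] += 1

    # Step 3: convert local counts into global cumulative numbers, i.e. the
    # output position where each (worker, value) pair starts writing.
    offsets = [[0] * num_values for _ in range(num_workers)]
    running = 0
    for v in range(num_values):
        for w in range(num_workers):
            offsets[w][v] = running
            running += local[w][v]

    # Step 4: each worker uses its offsets as pointers to place its records.
    out = [0] * n
    for w in range(num_workers):
        for i in range(bounds[w], bounds[w + 1]):
            k = keys[i]
            out[offsets[w][k]] = i
            offsets[w][k] += 1
    return out
```

Because every worker writes to a disjoint range of output positions in step 4, steps 2 and 4 could run in parallel without locking; only the prefix computation in step 3 needs the counts from all workers.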

Type: Grant

Filed: August 13, 2010

Date of Patent: November 22, 2011

Assignee: Turbo Data Laboratories, Inc.

Inventor: Shinji Furusho
Programmable arithmetic logic unit cluster

Patent number: 8051277

Abstract: A Programmable Arithmetic Logic Unit Cluster is claimed. The plurality of Programmable Logic Blocks (50) in the cluster are in a physically linear sequence; but, will process data in parallel when the data pathways permit. A physically linear and operationally parallel design is possible mainly due to an Internal Register Bus (30). A small subset of data from the Internal Register Bus (30) is modified by each Arithmetic Logic Unit (53). The greater chunk of information of the Internal Register Bus (30) passes, without changes made to the data, through a plurality of two-to-one-multiplexers (54) as data inputs, bypassing the Data Selection (52) and Arithmetic Logic Unit's (53) circuitry. Only the data specified as the Accumulator (41) and Carry History (31) are modified by the blocks (50). The Accumulator (41) and the Carry Output Line (44) are distributed back onto the Internal Register Bus (30) for subsequent blocks or for the Output Register File (79) by said two-to-one-multiplexers (54).

Type: Grant

Filed: October 12, 2010

Date of Patent: November 1, 2011

Inventor: Greg S. Callen
APPARATUS AND METHOD FOR EXECUTING MEDIA PROCESSING APPLICATIONS

Publication number: 20110258413

Abstract: An apparatus and method for executing media processing applications in a heterogeneous multicore system are provided. The media processing application executing apparatus includes a configuration deciding unit to decide a configuration for a combination of computational kernels and cores in which the computation kernels are to be executed. The computation kernels are media processing components included in a media processing application. The media processing application executing apparatus also includes an execution unit including multiple heterogeneous cores, to execute the media processing application based on the determined configuration.

Type: Application

Filed: December 30, 2010

Publication date: October 20, 2011

Applicant: Samsung Electronics Co., Ltd.

Inventors: Seung-Mo Cho, Hyo-Jung Song, Sung-Hak Lee, Dong-Woo Im, Oh-Young Jang, Sung-Jong Seo
Techniques for asynchronous command interface for scalable and active data warehousing

Patent number: 8027962

Abstract: Techniques for asynchronous command processing within a parallel processing environment are provided. A command is raised or received within a parallel processing data warehousing environment. A job or a component of the job is dynamically monitored, controlled, or modified in response to the real-time processing of the command. The job is actively processing within the parallel processing data warehousing environment when the command is received and processed against the job or the component of the job.

Type: Grant

Filed: September 14, 2007

Date of Patent: September 27, 2011

Assignee: Teradata US, Inc.

Inventors: Alex P Yung, Clovis Franklin Lofton
Runtime instruction decoding modification in a multi-processing array

Patent number: 8028150

Abstract: A method and system for decoding and modifying processor instructions in runtime according to certain rules in order to separately control processing elements embedded within a multi-processor array using a single instruction. The present invention allows multiple processing elements and/or execution units in a multi-processor array to perform different operations, based upon a variable or variables such as their location in the multi-processor array, while accepting a single instruction as an input.

Type: Grant

Filed: November 16, 2007

Date of Patent: September 27, 2011

Inventors: Shlomo Selim Rakib, Yoram Zarai
Multiple-core processor supporting multiple instruction set architectures

Patent number: 8028290

Abstract: Multiple instruction set architectures are supported in a system that provides a power-efficient and flexible platform for virtual machine environments requiring support for multiple instruction set architectures (ISAs). A processor includes multiple cores having disparate native ISAs and that may be selectively enabled for operation, so that power is conserved when support for a particular ISA is not required of the processor. A hypervisor controls operation of the cores, locates a core and enables it if necessary when a request to instantiate a virtual machine having a specified ISA is received. The ISA may be specified by a particular operating system and/or application program requirements.

Type: Grant

Filed: August 30, 2006

Date of Patent: September 27, 2011

Assignee: International Business Machines Corporation

Inventors: James Walter Rymarczyk, Michael Ignatowski, Thomas J. Heller, Jr.
DYNAMIC ATOMIC BITSETS

Publication number: 20110219209

Abstract: Embodiments of the present invention provide techniques, including systems, methods, and computer readable medium, for dynamic atomic bitsets. A dynamic atomic bitset is a data structure that provides a bitset that can grow or shrink in size as required. The dynamic atomic bitset is non-blocking, wait-free, and thread-safe.

Type: Application

Filed: March 4, 2010

Publication date: September 8, 2011

Applicant: Oracle International Corporation

Inventor: Nathan Reynolds
System and Method for Power Optimization

Publication number: 20110213947

Abstract: A technique for reducing the power consumption required to execute processing operations. A processing complex, such as a CPU or a GPU, includes a first set of cores comprising one or more fast cores and a second set of cores comprising one or more slow cores. A processing mode of the processing complex can switch between a first mode of operation and a second mode of operation based on one or more of the workload characteristics, performance characteristics of the first and second sets of cores, power characteristics of the first and second sets of cores, and operating conditions of the processing complex. A controller causes the processing operations to be executed by either the first set of cores or the second set of cores to achieve the lowest total power consumption.

Type: Application

Filed: May 25, 2010

Publication date: September 1, 2011

Inventors: John George Mathieson, Phil Carmack, Brian Smith
System and Method for Power Optimization

Publication number: 20110213998

Abstract: A technique for reducing the power consumption required to execute processing operations. A processing complex, such as a CPU or a GPU, includes a first set of cores comprising one or more fast cores and a second set of cores comprising one or more slow cores. A processing mode of the processing complex can switch between a first mode of operation and a second mode of operation based on one or more of the workload characteristics, performance characteristics of the first and second sets of cores, power characteristics of the first and second sets of cores, and operating conditions of the processing complex. A controller causes the processing operations to be executed by either the first set of cores or the second set of cores to achieve the lowest total power consumption.

Type: Application

Filed: May 25, 2010

Publication date: September 1, 2011

Inventors: John George Mathieson, Phil Carmack, Brian Smith
SYSTEM AND METHOD FOR PROCESSING DATA USING A MATRIX OF PROCESSING UNITS

Publication number: 20110202587

Abstract: A system and method for processing data utilizes a matrix of processing units using an array of commands stored in memory to process input data words to generate output data words, which can be used in various applications.

Type: Application

Filed: October 9, 2009

Publication date: August 18, 2011

Applicant: NXP B.V.

Inventor: Xavier Chabot
Data Processing Architecture

Publication number: 20110185151

Abstract: A parallel processor is described which is operated in a SIMD manner. The processor comprises: a plurality of processing elements connected in a string and grouped into a plurality of processing units, wherein each processing unit comprises a plurality of processing elements which each have direct interconnections with all of the other processing elements within the respective processing unit, the interconnections enabling data transfer between any two elements within a unit to be effected in a single clock cycle.

Type: Application

Filed: May 20, 2009

Publication date: July 28, 2011

Inventors: Martin Whitaker, John Lancaster
Processing system with interspersed processors and dynamic pathway creation

Patent number: 7987339

Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.

Type: Grant

Filed: June 30, 2010

Date of Patent: July 26, 2011

Assignee: Coherent Logix, Incorporated

Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
Processing system with interspersed processors using shared memory of communication elements

Patent number: 7987338

Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.

Type: Grant

Filed: May 17, 2010

Date of Patent: July 26, 2011

Assignee: Coherent Logix, Incorporated

Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
Communications in a processor array

Patent number: 7987340

Abstract: Data is transmitted from a sending processor over a network to one or more receiving processors in a forward direction during an allocated slot, and acknowledge signals are sent in a reverse direction during the same allocated slot, to indicate whether the receiving processor is able to receive data. If one or more of the receiving processors indicates that it is unable to receive the data, the data is retransmitted during the next allocated slot. This means that the sending processor is able to determine within the slot period whether a retransmission is necessary, but that the slot period only needs to be long enough for one-way communication.

Type: Grant

Filed: February 19, 2004

Date of Patent: July 26, 2011

Inventors: Gajinder Panesar, Anthony Peter John Claydon, William Philip Robbins, Alex Orr, Andrew Duller
POWER SAVING ASYNCHRONOUS COMPUTER

Publication number: 20110179251

Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. The sleeping computer (12) can be awaiting data or instructions. In the case of instructions, the sleeping computer (12) can be waiting to store the instructions or to immediately execute the instructions. In the latter case, the instructions are placed in an instruction register (30a) when they are received and executed therefrom, without first placing the instructions into memory. The instructions can include a micro-loop (100) which is capable of performing a series of operations repeatedly.

Type: Application

Filed: March 21, 2011

Publication date: July 21, 2011

Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
Integrated computer array with independent functional configurations

Patent number: 7984266

Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.

Type: Grant

Filed: June 5, 2007

Date of Patent: July 19, 2011

Assignee: VNS Portfolio LLC

Inventor: Charles H. Moore
Method of rotating data in a plurality of processing elements

Publication number: 20110167240

Abstract: A method of rotating data in a plurality of processing elements comprises a plurality of shifting operations and a plurality of storing operations, with the shifting and storing operations coordinated to enable a three shears operation to be performed on the data. The plurality of storing operations is responsive to the processing elements' positions.
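The abstract does not spell out the three shears operation. A common formulation decomposes a rotation by an angle θ into a horizontal shear, a vertical shear, and a second horizontal shear, and the sketch below checks that identity numerically. It covers only the underlying arithmetic, not the per-element shifting and storing of the claimed method, and the function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def shear_x(a: float) -> Matrix:
    """Horizontal shear: x' = x + a*y, y' = y."""
    return [[1.0, a], [0.0, 1.0]]


def shear_y(b: float) -> Matrix:
    """Vertical shear: x' = x, y' = y + b*x."""
    return [[1.0, 0.0], [b, 1.0]]


def matmul(p: Matrix, q: Matrix) -> Matrix:
    """Multiply two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rotation_by_three_shears(theta: float) -> Matrix:
    """Compose shear_x(-tan(theta/2)), shear_y(sin(theta)), shear_x(-tan(theta/2))."""
    a = -math.tan(theta / 2.0)
    b = math.sin(theta)
    return matmul(matmul(shear_x(a), shear_y(b)), shear_x(a))
```

On a grid of data, each shear moves a whole row or column by an amount proportional to its index, which is one reason a shear-based rotation maps naturally onto shift operations between neighbouring elements.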

Type: Application

Filed: March 15, 2011

Publication date: July 7, 2011

Inventor: Mark Beaumont
SYSTEMS AND METHODS FOR COLLECTING DATA FROM MULTIPLE CORE PROCESSORS

Publication number: 20110153982

Abstract: Systems and methods are disclosed for collecting data from cores of a multi-core processor using collection packets. A collection packet can traverse through cores of the multi-core processor while accumulating requested data. Upon completing the accumulation of the requested data from all required cores, the collection packet can be transmitted to a system operator for system maintenance and/or monitoring.

Type: Application

Filed: December 21, 2009

Publication date: June 23, 2011

Applicant: BBN TECHNOLOGIES CORP.

Inventor: Craig Partridge
Multicore Processor Including Two or More Collision Domain Networks

Publication number: 20110154345

Abstract: Implementations and techniques for multicore processors having a domain interconnection network configured to associate a first collision domain network with a second collision domain network in communication are generally disclosed.

Type: Application

Filed: December 21, 2009

Publication date: June 23, 2011

Inventor: Ezekiel Kruglick
Parallel data processing apparatus

Patent number: 7966475

Abstract: A data processor comprises a plurality of processing elements arranged for parallel processing of data, and a controller for controlling the plurality of processing elements. The controller is operable to determine respective status information for a plurality of processing threads, and to control processing of the processing threads by the plurality of processors in dependence upon such status information.

Type: Grant

Filed: January 10, 2007

Date of Patent: June 21, 2011

Assignee: Rambus Inc.

Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
Message routing scheme

Patent number: 7962717

Abstract: Each processor node in an array of nodes has a respective local node address, and each local node address comprises a plurality of components having an order of addressing significance from most to least significant. Each node comprises: mapping means configured to map each component of the local node address onto a respective routing direction, and a switch arranged to receive a message having a destination node address identifying a destination node. The switch comprises: means for comparing the local node address to the destination node address to identify the most significant non-matching component; and means for routing the message to another node, on the condition that the local node address does not match the destination node address, in the direction mapped to the most significant non-matching component.
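The routing rule in this abstract can be sketched in a few lines. The sketch assumes addresses are given as sequences of components listed most significant first, and that the per-component direction map is a parallel sequence of direction labels; the labels and the function name are hypothetical.

```python
from typing import Optional, Sequence


def route(local: Sequence[int], dest: Sequence[int],
          direction_map: Sequence[str]) -> Optional[str]:
    """Return the direction for the most significant non-matching component.

    Components are compared from most to least significant. None means the
    addresses match, so the message has reached its destination node.
    """
    for local_part, dest_part, direction in zip(local, dest, direction_map):
        if local_part != dest_part:
            return direction
    return None
```

For instance, `route((1, 0, 1), (1, 1, 0), ("link0", "link1", "link2"))` returns `"link1"`, because the second component is the first one, in order of significance, where the two addresses differ.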

Type: Grant

Filed: March 14, 2007

Date of Patent: June 14, 2011

Assignee: XMOS Limited

Inventor: Michael David May
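The routing rule in the abstract of patent 7962717 can be illustrated with a minimal sketch. It assumes addresses are tuples ordered from most to least significant component and that each node holds a table from component index to direction. The names `first_mismatch`, `route`, and `directions`, and the direction labels, are hypothetical and do not come from the patent.

```python
from typing import Dict, Optional, Tuple

Address = Tuple[int, ...]  # components ordered most significant first (assumption)


def first_mismatch(local: Address, dest: Address) -> Optional[int]:
    """Return the index of the most significant non-matching component, or None."""
    for index, (a, b) in enumerate(zip(local, dest)):
        if a != b:
            return index
    return None


def route(local: Address, dest: Address, direction_map: Dict[int, str]) -> Optional[str]:
    """Pick the outgoing direction, or None when the message has reached its node."""
    index = first_mismatch(local, dest)
    if index is None:
        return None  # addresses match: no further routing needed
    return direction_map[index]


# Hypothetical per-node mapping from address component to routing direction.
directions = {0: "north", 1: "east", 2: "south"}
```

With this mapping, a node at `(1, 0, 1)` forwards a message addressed to `(1, 1, 1)` in the direction assigned to component 1, because component 0 already matches.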
Parallel data processing apparatus

Patent number: 7958332

Abstract: A controller operable to control an array of processing elements comprises a retrieval unit operable to retrieve instruction items for each of a plurality of instruction streams, each instruction stream having a plurality of instruction items, a combining unit operable to combine the plurality of instruction streams into a serial instruction stream, and a distribution unit operable to distribute the serial instruction stream to an array of processing elements.

Type: Grant

Filed: March 13, 2009

Date of Patent: June 7, 2011

Assignee: Rambus Inc.

Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
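The retrieve, combine, and distribute stages described in the abstract of patent 7958332 can be sketched roughly as follows. The abstract does not say how streams are merged, so the round-robin interleaving here is an assumption, as are the function names `combine_streams` and `distribute` and the use of strings as instruction items.

```python
from itertools import zip_longest
from typing import Iterable, List, Sequence


def combine_streams(streams: Sequence[Sequence[str]]) -> List[str]:
    """Interleave instruction items round-robin into one serial stream (assumed policy)."""
    serial: List[str] = []
    for group in zip_longest(*streams):
        # Shorter streams run out first; skip their padding entries.
        serial.extend(item for item in group if item is not None)
    return serial


def distribute(serial: Iterable[str], num_elements: int) -> List[List[str]]:
    """Give every processing element its own copy of the serial instruction stream."""
    items = list(serial)
    return [list(items) for _ in range(num_elements)]
```

For example, `combine_streams([["a1", "a2"], ["b1"]])` yields a single stream that alternates between the two inputs until the shorter one is exhausted.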
Method and apparatus for scalable and super-scalable information processing using binary gate circuits structured by code-selected pass transistors

Publication number: 20110131392

Abstract: A processing space comprises an array of transistors empowered by forming connections through circuit pass transistors to power and data input/output means and connections therebetween through signal pass transistors. By structuring the needed circuits at the site(s) of the data the von Neumann bottleneck is eliminated, which increases the computing power of the apparatus substantially, thus to enable non-stop Information Processing on steady streams of data and code, with no repetitive instruction and data transfers required. That code will identify the physical locations of every transistor in the processing space, and will enable only the pass transistors therein needed to structure the circuits of any arithmetical/logical algorithm in a processing space of any size, speed, and level of computer power. By joining one processing space to another the apparatus also exhibits super-scalability.

Type: Application

Filed: January 7, 2011

Publication date: June 2, 2011

Inventor: William Stuart Lovell
Groups of serially coupled processor cores propagating memory write packet while maintaining coherency within each group towards a switch coupled to memory partitions

Patent number: 7941637

Abstract: A system has a first plurality of cores in a first coherency group. Each core transfers data in packets. The cores are directly coupled serially to form a serial path. The data packets are transferred along the serial path. The serial path is coupled at one end to a packet switch. The packet switch is coupled to a memory. The first plurality of cores and the packet switch are on an integrated circuit. The memory may or may not be on the integrated circuit. In another aspect a second plurality of cores in a second coherency group is coupled to the packet switch. The cores of the first and second pluralities may be reconfigured to form or become part of coherency groups different from the first and second coherency groups.

Type: Grant

Filed: April 15, 2008

Date of Patent: May 10, 2011

Assignee: Freescale Semiconductor, Inc.

Inventors: Perry H. Pelley, III, George P. Hoekstra, Lucio F. C. Pessoa
Processor memory system

Publication number: 20110107058

Abstract: A plurality of processing elements (PEs) include memory local to at least one of the processing elements in a data packet-switched network interconnecting the processing elements and the memory to enable any of the PEs to access the memory. The network consists of nodes arranged linearly or in a grid to connect the PEs and their local memories to a common controller. The processor performs memory accesses on data stored in the memory in response to control signals sent by the controller to the memory. The local memories share the same memory map or space. The packet-switched network supports multiple concurrent transfers between PEs and memory. Memory accesses include block and/or broadcast read and write operations, in which data can be replicated within the nodes and, according to the operation, written into the shared memory or into the local PE memory.

Type: Application

Filed: January 11, 2011

Publication date: May 5, 2011

Inventor: Ray McConnell
System and method for intercommunication between computers in an array

Patent number: 7937557

Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.

Type: Grant

Filed: March 16, 2004

Date of Patent: May 3, 2011

Assignee: VNS Portfolio LLC

Inventor: Charles H. Moore
Processing system with interspersed processors and communication elements

Patent number: 7937558

Abstract: A processing system comprising processors and dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.

Type: Grant

Filed: February 8, 2008

Date of Patent: May 3, 2011

Assignee: Coherent Logix, Incorporated

Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
Method and apparatus for monitoring inputs to an asynchronous, homogenous, reconfigurable computer array

Patent number: 7934075

Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously and operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. The instructions executed by the computers (12) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In one application, the sleeping computer (12) is awakened by an input such that it commences an action that would otherwise require an interrupt of an otherwise active computer. For example, one computer (12f) can be used to monitor an input/output port of the computer array (10).

Type: Grant

Filed: May 26, 2006

Date of Patent: April 26, 2011

Assignee: VNS Portfolio LLC

Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
Programmable pipeline array

Patent number: 7930517

Abstract: An array of programmable data-processing cells configured as a plurality of cross-connected pipelines. An apparatus includes cells capable of performing data-processing functions selectable by a presented instruction. A first set of cells includes an input cell, an output cell, and a series of at least one interior cell providing an acyclic data processing path from the input cell to the output cell. Additional cells are similarly configured. Memory presents configuration instructions to cells in response to a configuration code. Data advances through ranks of the cells. The configuration code advances to memory associated with a rank in tandem with the data.

Type: Grant

Filed: January 9, 2009

Date of Patent: April 19, 2011

Assignee: Wave Semiconductor, Inc.

Inventor: Karl M. Fant
Plural SIMD arrays processing threads fetched in parallel and prioritized by thread manager sequentially transferring instructions to array controller for distribution

Patent number: 7925861

Abstract: A data processor comprises a plurality of processing elements arranged in a first plurality of single instruction multiple data (SIMD) processing arrays, and comprises a second plurality of controllers for transferring instructions to the processing arrays. Each controller is operable to retrieve a plurality of incoming instruction streams in parallel with one another and operable to supply incoming instruction streams to one of a plurality of processing arrays.

Type: Grant

Filed: January 31, 2007

Date of Patent: April 12, 2011

Assignee: Rambus Inc.

Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
Processor and method for executing a program loop within an instruction word

Patent number: 7913069

Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. Instruction words (48) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In a particular example, the series of operations are included in a single instruction word (48). The micro-loop (100) in combination with the ability of the computers (12) to send instruction words (48) to a neighboring computer (12) provides a powerful tool for allowing a computer (12) to utilize the resources of a neighboring computer (12).

Type: Grant

Filed: May 26, 2006

Date of Patent: March 22, 2011

Assignee: VNS Portfolio LLC

Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
Message routing scheme

Publication number: 20110066825

Abstract: Each processor node in an array of nodes has a respective local node address, and each local node address comprises a plurality of components having an order of addressing significance from most to least significant. Each node comprises: mapping means configured to map each component of the local node address onto a respective routing direction, and a switch arranged to receive a message having a destination node address identifying a destination node. The switch comprises: means for comparing the local node address to the destination node address to identify the most significant non-matching component; and means for routing the message to another node, on the condition that the local node address does not match the destination node address, in the direction mapped to the most significant non-matching component.

Type: Application

Filed: November 18, 2010

Publication date: March 17, 2011

Applicant: XMOS LTD.

Inventor: Michael David May
Virtual world simulation systems and methods utilizing parallel coprocessors, and computer program products thereof

Patent number: 7908462

Abstract: The current invention provides a virtual world simulation system capable of hosting a massive number of concurrent players by integrating commodity parallel co-processors into servers. The current invention proposes novel parallel processing algorithms that use commodity parallel co-processors, such as a graphics processing unit (GPU), or any specialized hardware with a parallel architecture design, such as a field-programmable gate array (FPGA), to accelerate virtual world simulation.

Type: Grant

Filed: June 9, 2010

Date of Patent: March 15, 2011

Assignee: Zillians Incorporated

Inventor: Mu Chi Sung
Methods and apparatus for providing data transfer control

Patent number: 7908409

Abstract: A variety of advantageous mechanisms for improved data transfer control within a data processing system are described. A DMA controller is described which is implemented as a multiprocessing transfer engine supporting multiple transfer controllers which may work independently or in cooperation to carry out data transfers, with each transfer controller acting as an autonomous processor, fetching and dispatching DMA instructions to multiple execution units. In particular, mechanisms for initiating and controlling the sequence of data transfers are provided, as are processes for autonomously fetching DMA instructions which are decoded sequentially but executed in parallel.

Type: Grant

Filed: August 6, 2009

Date of Patent: March 15, 2011

Assignee: Altera Corporation

Inventors: Edwin Franklin Barry, Edward A. Wolff
Asynchronous computer communication

Patent number: 7904615

Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. A plurality of read lines (18), write lines (20) and data lines (22) interconnect the computers (12). When one computer (12) sets a read line (18) high and the other computer sets a corresponding write line (20) high, data is transferred on the data lines (22). When both the read line (18) and the corresponding write line (20) go low, both communicating computers (12) know that the communication is completed. An acknowledge line (72) goes high to restart the computers (12).

Type: Grant

Filed: February 16, 2006

Date of Patent: March 8, 2011

Assignee: VNS Portfolio LLC

Inventor: Charles H. Moore
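The read-line and write-line handshake described in the abstract of patent 7904615 can be modeled in software as a rough sketch. The `Link` class, its field names, and the use of a boolean return value to stand in for the acknowledge line are all assumptions made for illustration; the patent describes hardware signals, not this API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Link:
    """Hypothetical model of one read/write/data line group between two computers."""
    read_line: bool = False
    write_line: bool = False
    pending: Optional[int] = None  # value the writer has placed on the data lines
    received: List[int] = field(default_factory=list)

    def request_read(self) -> bool:
        """Reader raises its read line, then waits (sleeps) for a writer."""
        self.read_line = True
        return self._try_transfer()

    def request_write(self, value: int) -> bool:
        """Writer raises its write line, then waits (sleeps) for a reader."""
        self.write_line = True
        self.pending = value
        return self._try_transfer()

    def _try_transfer(self) -> bool:
        """Move data only when both lines are high; return the acknowledge state."""
        if not (self.read_line and self.write_line):
            return False  # one side is still asleep, waiting for its partner
        if self.pending is not None:
            self.received.append(self.pending)
        self.pending = None
        # Both lines dropping low tells both computers the transfer is complete.
        self.read_line = False
        self.write_line = False
        return True  # acknowledge: both computers may resume
```

In this model whichever side arrives first gets `False` and stays asleep; the second arrival completes the transfer, drops both lines, and returns `True`.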
Asynchronous power saving computer

Patent number: 7904695

Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. A slot sequencer (42) in each of the computers produces a timing pulse to cause the computer (12) to execute a next instruction. However, when the present instruction is a read or write type instruction, the slot sequencer does not produce the pulse until an acknowledge signal (86) starts it. The acknowledge signal (86) is produced when it is recognized that the communication has been completed by the other computer (12).

Type: Grant

Filed: February 16, 2006

Date of Patent: March 8, 2011

Assignee: VNS Portfolio LLC

Inventor: Charles H. Moore
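The slot sequencer behavior in the abstract of patent 7904695 can be approximated with a small timing sketch. It assumes one time step per instruction and instruction names `read` and `write` for the communication-type instructions; the function `run` and its `ack_at` parameter are hypothetical.

```python
from typing import Dict, List

IO_OPS = {"read", "write"}  # assumed names for communication-type instructions


def run(program: List[str], ack_at: Dict[int, int]) -> List[int]:
    """Return the time step at which each instruction's next-instruction pulse fires.

    ack_at maps an instruction index to the time step when its acknowledge arrives.
    """
    time = 0
    pulses: List[int] = []
    for index, op in enumerate(program):
        time += 1  # nominal time to execute one instruction (assumption)
        if op in IO_OPS:
            # Hold the pulse until the partner's acknowledge has arrived.
            time = max(time, ack_at[index])
        pulses.append(time)
    return pulses
```

Ordinary instructions advance the sequencer on every step, while a `read` or `write` stalls it until the acknowledge time for that instruction, which is the period during which the computer sleeps.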
