Array Processor Patents (Class 712/10)
  • Patent number: 7996585
    Abstract: Disclosed are a method and system of tracking real time use of I/O control blocks on a processing unit basis, in a multiprocessing system, such that in the case of a processing unit failure, a list accurately and concisely identifies the control blocks that need to be recovered. This eliminates the need to scan all the I/O control blocks, greatly reducing the overall system recovery time and minimizing impact to the rest of the running system. The preferred embodiment of the invention uses a task control block structure to record which I/O control blocks are in use by each Processing Unit. Also, the lock word structure defined in the I/O control blocks is provided with an index back into the task control block to facilitate managing the task control block entries.
    Type: Grant
    Filed: September 9, 2005
    Date of Patent: August 9, 2011
    Assignee: International Business Machines Corporation
    Inventors: Janet R. Easton, Elke Nass, Kenneth J. Oakes, Andrew W. Piechowski, Martin Taubert, John S. Trotter, Ambrose Verdibello, Joachim von Buttlar, Robert Whalen, Jr.
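A minimal sketch of the bookkeeping described in the abstract of patent 7996585 above, assuming hypothetical structure and field names: each processing unit's task control block lists the I/O control blocks it has in use, and each control block's lock word carries an index back into that list, so recovery after a failure visits only the listed blocks instead of scanning them all.
```c
#include <stdio.h>

#define NUM_PUS       4      /* processing units                          */
#define IOCBS_PER_PU  8      /* task-control-block slots per PU           */

/* Hypothetical I/O control block: its lock word records which PU
 * holds it and which task-control-block slot points back at it.   */
typedef struct {
    int owner_pu;            /* -1 when not in use                        */
    int tcb_index;           /* index back into the owner's TCB entries   */
} iocb_t;

/* Hypothetical per-PU task control block: a short list of the
 * control blocks currently in use by that PU.                     */
typedef struct {
    iocb_t *in_use[IOCBS_PER_PU];
    int     count;
} tcb_t;

static tcb_t tcb[NUM_PUS];

/* Record that a PU has taken an I/O control block into use. */
static void iocb_acquire(int pu, iocb_t *cb)
{
    int slot = tcb[pu].count++;
    tcb[pu].in_use[slot] = cb;
    cb->owner_pu  = pu;
    cb->tcb_index = slot;    /* lock word points back into the TCB        */
}

/* On a PU failure, only the blocks listed in that PU's TCB need
 * recovery -- no scan of every I/O control block in the system.   */
static void recover_pu(int pu)
{
    for (int i = 0; i < tcb[pu].count; i++) {
        iocb_t *cb = tcb[pu].in_use[i];
        printf("recovering control block owned by PU %d (slot %d)\n",
               cb->owner_pu, cb->tcb_index);
        cb->owner_pu = -1;
    }
    tcb[pu].count = 0;
}

int main(void)
{
    static iocb_t pool[32];
    iocb_acquire(2, &pool[5]);
    iocb_acquire(2, &pool[9]);
    recover_pu(2);           /* visits exactly the two blocks PU 2 held   */
    return 0;
}
```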
  • Patent number: 7984266
    Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: July 19, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
  • Patent number: 7975080
    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
    Type: Grant
    Filed: June 21, 2010
    Date of Patent: July 5, 2011
    Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
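The bit-reversed addressing mentioned in the abstract of patent 7975080 above is the standard data-reordering step for radix-2 FFTs. The sketch below only illustrates that reordering on a single array; in the patented scheme the reordering is carried out by a DMA unit across multiple PE local memories, concurrently with FFT computation.
```c
#include <stdio.h>

/* Reverse the low 'bits' bits of index i (e.g. for an N = 2^bits FFT). */
static unsigned bit_reverse(unsigned i, unsigned bits)
{
    unsigned r = 0;
    for (unsigned b = 0; b < bits; b++) {
        r = (r << 1) | (i & 1u);
        i >>= 1;
    }
    return r;
}

int main(void)
{
    const unsigned bits = 3, n = 1u << bits;   /* 8-point example */
    float in[8]  = {0, 1, 2, 3, 4, 5, 6, 7};
    float out[8];

    /* Distribute element i to position bit_reverse(i); with multiple PEs
     * the destination index would also select the target PE's local
     * memory, and a DMA unit would do this while the FFT runs.         */
    for (unsigned i = 0; i < n; i++)
        out[bit_reverse(i, bits)] = in[i];

    for (unsigned i = 0; i < n; i++)
        printf("%g ", out[i]);                 /* 0 4 2 6 1 5 3 7 */
    printf("\n");
    return 0;
}
```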
  • Patent number: 7966475
    Abstract: A data processor comprises a plurality of processing elements arranged for parallel processing of data, and a controller for controlling the plurality of processing elements. The controller is operable to determine respective status information for a plurality of processing threads, and to control processing of the processing threads by the plurality of processors in dependence upon such status information.
    Type: Grant
    Filed: January 10, 2007
    Date of Patent: June 21, 2011
    Assignee: Rambus Inc.
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 7962716
    Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.
    Type: Grant
    Filed: November 17, 2004
    Date of Patent: June 14, 2011
    Assignee: QST Holdings, Inc.
    Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
  • Patent number: 7962717
    Abstract: Each processor node in an array of nodes has a respective local node address, and each local node address comprises a plurality of components having an order of addressing significance from most to least significant. Each node comprises: mapping means configured to map each component of the local node address onto a respective routing direction, and a switch arranged to receive a message having a destination node address identifying a destination node. The switch comprises: means for comparing the local node address to the destination node address to identify the most significant non-matching component; and means for routing the message to another node, on the condition that the local node address does not match the destination node address, in the direction mapped to the most significant non-matching component.
    Type: Grant
    Filed: March 14, 2007
    Date of Patent: June 14, 2011
    Assignee: XMOS Limited
    Inventor: Michael David May
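A minimal sketch of the routing rule in the abstract of patent 7962717 above, with hypothetical address components and direction names: compare the local and destination addresses component by component from most to least significant, and forward in the direction mapped to the first (most significant) component that differs.
```c
#include <stdio.h>

#define COMPONENTS 3   /* address components, index 0 = most significant */

/* Hypothetical per-node mapping from address component to routing direction. */
static const char *dir_map[COMPONENTS] = { "dir-A", "dir-B", "dir-C" };

/* Return the direction to forward on, or NULL if the message has
 * arrived (local address equals destination address).              */
static const char *route(const int local[COMPONENTS],
                         const int dest[COMPONENTS])
{
    for (int c = 0; c < COMPONENTS; c++)       /* most significant first */
        if (local[c] != dest[c])
            return dir_map[c];
    return NULL;                               /* deliver locally */
}

int main(void)
{
    int local[COMPONENTS] = { 1, 2, 0 };
    int dest [COMPONENTS] = { 1, 3, 7 };
    const char *d = route(local, dest);
    printf("%s\n", d ? d : "deliver locally"); /* dir-B: component 1 differs first */
    return 0;
}
```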
  • Patent number: 7958341
    Abstract: In some embodiments, each matrix processor in a matrix of mesh-interconnected matrix processors includes an instruction processing pipeline, and a hardware data switch capable of streaming data to/from one or more inter-processor matrix links and/or matrix processor local memory links in response to execution of a data streaming instruction by the instruction processing pipeline. The data switch can transfer each data stream, which includes multiple words, at wire speed, one word per cycle. After initiating a data stream, the processing pipeline can execute other instructions, including streaming instructions, while a stream transfer is in progress. Different data streaming instructions may be used to transfer data streams from local memory to one or more inter-processor links, from an inter-processor link to local memory, from an inter-processor link to one or more inter-processor links, and from an inter-processor link to one or more inter-processor links and synchronously to local memory.
    Type: Grant
    Filed: July 7, 2008
    Date of Patent: June 7, 2011
    Assignee: Ovics
    Inventors: Sorin C Cismas, Ilie Garbacea
  • Patent number: 7958332
    Abstract: A controller operable to control an array of processing elements comprises a retrieval unit operable to retrieve instruction items for each of a plurality of instruction streams, each instruction stream having a plurality of instruction items, a combining unit operable to combine the plurality of instruction streams into a serial instruction stream, and a distribution unit operable to distribute the serial instruction stream to an array of processing elements.
    Type: Grant
    Filed: March 13, 2009
    Date of Patent: June 7, 2011
    Assignee: Rambus Inc.
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
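One way to picture the retrieve/combine/distribute split in the abstract of patent 7958332 above is a simple interleaver; the round-robin policy below is an assumption made for illustration, not something stated in the patent.
```c
#include <stdio.h>

#define STREAMS 3
#define ITEMS   4

int main(void)
{
    /* Hypothetical instruction items for three instruction streams. */
    const char *stream[STREAMS][ITEMS] = {
        { "a0", "a1", "a2", "a3" },
        { "b0", "b1", "b2", "b3" },
        { "c0", "c1", "c2", "c3" },
    };

    /* Combining unit: interleave the streams into one serial stream
     * (round-robin here purely for illustration).                   */
    const char *serial[STREAMS * ITEMS];
    int n = 0;
    for (int i = 0; i < ITEMS; i++)
        for (int s = 0; s < STREAMS; s++)
            serial[n++] = stream[s][i];

    /* Distribution unit: issue the serial stream to the PE array.   */
    for (int i = 0; i < n; i++)
        printf("issue %s to PE array\n", serial[i]);
    return 0;
}
```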
  • Patent number: 7953956
    Abstract: A reconfigurable circuit of reduced circuit scale. The reconfigurable circuit of the present invention comprises a plurality of ALUs capable of changing functions. The plurality of ALUs are arranged in a matrix. At least one connection unit, capable of selectively establishing connections between the ALUs, is provided between the stages of the ALUs. This connection unit is not intended to allow connections between all the logic circuits in adjoining stages, but is configured so that each logic circuit is connectable to only some of the logic circuits in the other stages. This connection limitation allows a reduction in circuit scale.
    Type: Grant
    Filed: December 21, 2004
    Date of Patent: May 31, 2011
    Assignee: Sanyo Electric Co., Ltd.
    Inventors: Makoto Okada, Tatsuo Hiramatsu, Hiroshi Nakajima, Makoto Ozone
  • Patent number: 7940755
    Abstract: An architecture for a specialized electronic computer for high-speed data lookup employs a set of tiles each with independent processors and lookup memory portions. The tiles may be programmed to interconnect to form different memory topologies optimized for the particular task.
    Type: Grant
    Filed: March 19, 2009
    Date of Patent: May 10, 2011
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Cristian Estan, Karthikeyan Sankaralingam
  • Patent number: 7941634
    Abstract: Specialized image processing circuitry is usually implemented in hardware in a massively parallel way as a single instruction multiple data (SIMD) architecture. The invention prevents long and complicated connection paths between a processing element and the memory subsystem, and improves maximum operating frequency. An optimized architecture for image processing has processing elements that are arranged in a two-dimensional structure, and each processing element has a local storage containing a plurality of reference pixels that are not neighbors in the reference image. Instead, the reference pixels belong to different blocks of the reference image, which may vary for different encoding schemes.
    Type: Grant
    Filed: November 14, 2007
    Date of Patent: May 10, 2011
    Assignee: Thomson Licensing
    Inventors: Marco Georgi, Klaus Gaedke, Malte Borsum
  • Patent number: 7937594
    Abstract: A digital logic circuit comprises a programmable logic device and a programmable security circuit. The programmable security circuit stores a set of authorized configuration security keys. The programmable security circuit compares the authorized configuration security keys with an incoming configuration request, and selectively enables a new configuration for the programmable logic device in response to the configuration request. In another exemplary embodiment, a programmable security circuit also stores a set of authorized operation security keys. The programmable security circuit compares the authorized operation security keys with an incoming operation request from the programmable logic device, and selectively enables an operation within the programmable logic device in response to the operation request.
    Type: Grant
    Filed: May 16, 2006
    Date of Patent: May 3, 2011
    Assignee: Infineon Technologies AG
    Inventors: Stephen L. Wasson, David K. Varn, John D. Ralston
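A minimal sketch of the gating behavior described in the abstract of patent 7937594 above, assuming hypothetical key values and a simple linear search of the stored authorized-key set: a configuration request is enabled only if it carries an authorized configuration security key.
```c
#include <stdio.h>
#include <stdint.h>

#define NUM_KEYS 3

/* Hypothetical set of authorized configuration security keys. */
static const uint64_t authorized_config_keys[NUM_KEYS] = {
    0x1111222233334444ULL, 0xAAAABBBBCCCCDDDDULL, 0x0123456789ABCDEFULL
};

/* Enable a new configuration only if the incoming request carries
 * one of the authorized keys.                                      */
static int config_request_allowed(uint64_t incoming_key)
{
    for (int i = 0; i < NUM_KEYS; i++)
        if (authorized_config_keys[i] == incoming_key)
            return 1;
    return 0;
}

int main(void)
{
    printf("%d\n", config_request_allowed(0x0123456789ABCDEFULL)); /* 1: enabled */
    printf("%d\n", config_request_allowed(0xDEADBEEFULL));         /* 0: rejected */
    return 0;
}
```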
  • Patent number: 7937557
    Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
    Type: Grant
    Filed: March 16, 2004
    Date of Patent: May 3, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
  • Patent number: 7934075
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously and operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. The instructions executed by the computers (12) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In one application, the sleeping computer (12) is awakened by an input such that it commences an action that would otherwise require an interrupt of an otherwise active computer. For example, one computer (12f) can be used to monitor an input/output port of the computer array (10).
    Type: Grant
    Filed: May 26, 2006
    Date of Patent: April 26, 2011
    Assignee: VNS Portfolio LLC
    Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
  • Patent number: 7934031
    Abstract: An asynchronous logic family of circuits which communicate on delay-insensitive flow-controlled channels with 4-phase handshakes and 1 of N encoding, compute output data directly from input data using domino logic, and use the state-holding ability of the domino logic to implement pipelining without additional latches.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: April 26, 2011
    Assignee: California Institute of Technology
    Inventors: Andrew M. Lines, Alain J. Martin, Uri Cummings
  • Patent number: 7930517
    Abstract: An array of programmable data-processing cells configured as a plurality of cross-connected pipelines. An apparatus includes cells capable of performing data-processing functions selectable by a presented instruction. A first set of cells includes an input cell, an output cell, and a series of at least one interior cell providing an acyclic data processing path from the input cell to the output cell. Additional cells are similarly configured. Memory presents configuration instructions to cells in response to a configuration code. Data advances through ranks of the cells. The configuration code advances to memory associated with a rank in tandem with the data.
    Type: Grant
    Filed: January 9, 2009
    Date of Patent: April 19, 2011
    Assignee: Wave Semiconductor, Inc.
    Inventor: Karl M. Fant
  • Patent number: 7930533
    Abstract: A system for pre-execution environment (PXE) booting a storage processor from a peer storage processor allows for the ability to reboot and/or restart the storage processor without an externally connected PXE server. In response to a reboot request of the storage processor, the peer storage processor pushes an operating system boot image and/or other information to the storage processor for PXE booting the storage processor, and vice versa. The system may also operate with multiple coupled computers.
    Type: Grant
    Filed: September 26, 2007
    Date of Patent: April 19, 2011
    Assignee: EMC Corporation
    Inventors: Ying Guo, Qing Liu, Kevin Richards
  • Patent number: 7925860
    Abstract: In parallel processing devices performing streaming computations, processing each data element of the stream may not be computationally intensive and thus may take relatively little time compared to the memory access times required to read the stream and write the results. Therefore, memory throughput often limits the performance of the streaming computation. Generally stated, provided are methods for achieving improved, optimized, or ultimately, maximized memory throughput in such memory-throughput-limited streaming computations. Streaming computation performance is maximized by improving the aggregate memory throughput across the plurality of processing elements and threads. High aggregate memory throughput is achieved by balancing processing loads between threads and groups of threads and a hardware memory interface coupled to the parallel processing devices.
    Type: Grant
    Filed: May 14, 2007
    Date of Patent: April 12, 2011
    Assignee: NVIDIA Corporation
    Inventors: Norbert Juffa, Brett W. Coon
  • Patent number: 7917703
    Abstract: A network on chip (‘NOC’) comprising integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, each IP block coupled to a router through a memory communications controller and a network interface controller, the NOC also including a port on a router of the network through which is received an invalidate command, the invalidate command including an identification of a cache line, the invalidate command representing an instruction to invalidate the cache line, the router configured to send the invalidate command to an IP block served by the router; the router further configured to send the invalidate command horizontally and vertically to neighboring routers if the port is a vertical port; and the router further configured to send the invalidate command only horizontally to neighboring routers if the port is a horizontal port.
    Type: Grant
    Filed: December 13, 2007
    Date of Patent: March 29, 2011
    Assignee: International Business Machines Corporation
    Inventors: Miguel Comparan, Russell D. Hoover, Jamie R. Kuesel, Eric O. Mejdrich, Alfred T. Watson, III
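The forwarding rule in the abstract of patent 7917703 above can be stated compactly: an invalidate command arriving on a vertical port fans out both horizontally and vertically, while one arriving on a horizontal port continues only horizontally; either way it is delivered to the locally served IP block. A small sketch with hypothetical names:
```c
#include <stdio.h>

typedef enum { PORT_VERTICAL, PORT_HORIZONTAL } port_t;

/* Forward an invalidate command according to the arrival port:
 * vertical arrival   -> send horizontally and vertically;
 * horizontal arrival -> send only horizontally.
 * The command is always delivered to the IP block served by this router. */
static void forward_invalidate(port_t arrival, unsigned cache_line)
{
    printf("deliver invalidate of line %u to local IP block\n", cache_line);
    if (arrival == PORT_VERTICAL) {
        printf("forward to horizontal neighbors\n");
        printf("forward to vertical neighbors\n");
    } else {
        printf("forward to horizontal neighbors only\n");
    }
}

int main(void)
{
    forward_invalidate(PORT_VERTICAL, 42);
    forward_invalidate(PORT_HORIZONTAL, 42);
    return 0;
}
```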
  • Patent number: 7913069
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. Instruction words (48) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In a particular example, the series of operations are included in a single instruction word (48). The micro-loop (100) in combination with the ability of the computers (12) to send instruction words (48) to a neighboring computer (12) provides a powerful tool for allowing a computer (12) to utilize the resources of a neighboring computer (12).
    Type: Grant
    Filed: May 26, 2006
    Date of Patent: March 22, 2011
    Assignee: VNS Portfolio LLC
    Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
  • Patent number: 7913007
    Abstract: Systems, methods, and computer program products for preemption in asynchronous systems using anti-tokens are disclosed. According to one aspect, a configurable system for constructing asynchronous application specific integrated data pipeline circuits with preemption includes a plurality of modular circuit stages that are connectable with each other and with other circuit elements to form multi-stage asynchronous application specific integrated data pipeline circuits for asynchronously sending data and tokens in a forward direction through the pipeline and for asynchronously sending anti-tokens in a backward direction through the pipeline. Each stage is configured to perform a handshaking protocol with other pipeline stages, the protocol including receiving either a token from the previous stage or an anti-token from the next stage, and in response, sending both a token forward to the next stage and an anti-token backward to the previous stage.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: March 22, 2011
    Assignee: The University of North Carolina
    Inventors: Montek Singh, Manoj Kumar Ampalam
  • Patent number: 7904905
    Abstract: A system and method are disclosed for efficiently executing single program multiple data (SPMD) programs in a microprocessor. A micro single instruction multiple data (SIMD) unit is located within the microprocessor. A job buffer that is coupled to the micro SIMD unit dynamically allocates tasks to the micro SIMD unit. The SPMD programs each comprise a plurality of input data streams having moderate diversification of control flows. The system executes each SPMD program once for each input data stream of the plurality of input data streams.
    Type: Grant
    Filed: November 14, 2003
    Date of Patent: March 8, 2011
    Assignee: STMicroelectronics, Inc.
    Inventor: Stefano Cervini
  • Patent number: 7904615
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. A plurality of read lines (18), write lines (20) and data lines (22) interconnect the computers (12). When one computer (12) sets a read line (18) high and the other computer sets a corresponding write line (20) high, data is transferred on the data lines (22). When both the read line (18) and corresponding write line (20) go low, both communicating computers (12) know that the communication is completed. An acknowledge line (72) goes high to restart the computers (12).
    Type: Grant
    Filed: February 16, 2006
    Date of Patent: March 8, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
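The handshake in the abstract of patent 7904615 above can be walked through step by step. The sketch below just sequences the described line transitions for one word transfer; the signal names come from the abstract, while the single-threaded sequencing of the model is an assumption.
```c
#include <stdio.h>
#include <stdbool.h>

/* Shared lines between two communicating computers. */
static bool read_line, write_line, ack_line;
static int  data_lines;

static void transfer_word(int word)
{
    read_line = true;                  /* the reading computer raises its read line */
    printf("read line high\n");

    write_line = true;                 /* the writer raises the matching write line */
    data_lines = word;                 /* and places the word on the data lines     */
    printf("write line high, data = %d\n", data_lines);

    /* Both lines drop, so both computers know the transfer completed. */
    read_line = write_line = false;
    printf("read and write lines low: transfer complete\n");

    ack_line = true;                   /* acknowledge line goes high to restart them */
    printf("acknowledge line high\n");
}

int main(void)
{
    transfer_word(0x5A);
    return 0;
}
```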
  • Patent number: 7889951
    Abstract: In an embodiment, an apparatus includes a first processor that includes a first processor element. The apparatus also includes a second processor that includes a second processor element. The first processor is configured to transmit data to the second processor through a third processor, wherein no processor element within the third processor is configured to perform a process operation on the data as part of the transmission of the data from the first processor to the second processor.
    Type: Grant
    Filed: June 19, 2003
    Date of Patent: February 15, 2011
    Assignee: Intel Corporation
    Inventor: Louis A. Lippincott
  • Patent number: 7890733
    Abstract: A data processor comprises a plurality of processing elements (PEs), with memory local to at least one of the processing elements, and a data packet-switched network interconnecting the processing elements and the memory to enable any of the PEs to access the memory. The network consists of nodes arranged linearly or in a grid, e.g., in a SIMD array, so as to connect the PEs and their local memories to a common controller. Transaction-enabled PEs and nodes set flags, which are maintained until the transaction is completed, and signal status to the controller, e.g., over a series of OR-gates. The processor performs memory accesses on data stored in the memory in response to control signals sent by the controller to the memory. The local memories share the same memory map or space. External memory may also be connected to the “end” nodes interfacing with the network, e.g., to provide a cache.
    Type: Grant
    Filed: August 11, 2005
    Date of Patent: February 15, 2011
    Assignee: Rambus Inc.
    Inventor: Ray McConnell
  • Patent number: 7889725
    Abstract: A computer cluster arranged at a lattice point in a lattice-like interconnection network contains four nodes and an internal communication network. Two nodes can transmit packets to adjacent computer clusters located along the X direction, and the other two nodes can transmit packets to adjacent computer clusters located along the Y direction. Each node directly transmits a packet to an adjacent computer cluster when the destination of the packet lies in a direction in which the node can transmit packets. When the destination of a packet to be transmitted from a node does not lie in a direction in which that node can transmit packets, the node transfers the packet through the internal communication network to one of the other nodes, which then transmits the packet toward its destination.
    Type: Grant
    Filed: March 27, 2007
    Date of Patent: February 15, 2011
    Assignee: Fujitsu Limited
    Inventor: Yuichiro Ajima
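A minimal sketch of the per-node decision in the abstract of patent 7889725 above: a node transmits a packet directly if the destination lies along the axis it serves, and otherwise hands the packet over the internal network to a sibling node serving the other axis. The axis assignments and node names below are assumptions.
```c
#include <stdio.h>

typedef enum { AXIS_X, AXIS_Y } axis_t;

typedef struct {
    const char *name;
    axis_t      axis;        /* direction this node can transmit in */
} node_t;

/* Hypothetical cluster of four nodes: two serve the X direction,
 * two serve the Y direction.                                       */
static const node_t cluster[4] = {
    { "n0", AXIS_X }, { "n1", AXIS_X }, { "n2", AXIS_Y }, { "n3", AXIS_Y }
};

static void send(int node, axis_t dest_axis)
{
    if (cluster[node].axis == dest_axis) {
        printf("%s transmits directly to the adjacent cluster\n",
               cluster[node].name);
        return;
    }
    /* Destination lies on the other axis: forward over the internal
     * network to a sibling node that serves that axis.             */
    for (int i = 0; i < 4; i++)
        if (cluster[i].axis == dest_axis) {
            printf("%s forwards internally to %s, which transmits\n",
                   cluster[node].name, cluster[i].name);
            return;
        }
}

int main(void)
{
    send(0, AXIS_X);   /* direct */
    send(0, AXIS_Y);   /* via an internal hop */
    return 0;
}
```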
  • Patent number: 7870395
    Abstract: In an array of groups of cryptographic processors, the processors in each group operate together but are securely connected through an external shared memory. The processors in each group include cryptographic engines capable of operating in a pipelined fashion. Instructions in the form of request blocks are supplied to the array in a balanced fashion to assure that the processors are occupied processing instructions.
    Type: Grant
    Filed: October 20, 2006
    Date of Patent: January 11, 2011
    Assignee: International Business Machines Corporation
    Inventors: Thomas J. Dewkett, Camil Fayad, John K. Li, Siegfried K. H. Sutter, Phil C. Yeh
  • Publication number: 20100325386
    Abstract: In arithmetic/logic units (ALU) provided corresponding to entries, an MIMD instruction decoder generating a group of control signals in accordance with a Multiple Instruction-Multiple Data (MIMD) instruction and an MIMD register storing data designating the MIMD instruction are provided, and an inter-ALU communication circuit is provided. The amount and direction of movement of the inter-ALU communication circuit are set by data bits stored in a movement data register. Data movement and arithmetic/logic operations can thus be executed with the amount of movement and the operation instruction set individually for each ALU unit. Therefore, in a Single Instruction-Multiple Data type processing device, Multiple Instruction-Multiple Data operation can be executed at high speed in a flexible manner.
    Type: Application
    Filed: June 23, 2010
    Publication date: December 23, 2010
    Applicant: Renesas Technology Corp.
    Inventors: Toshinori Sueyoshi, Masahiro Iida, Mitsutaka Nakano, Fumiaki Senoue, Katsuya Mizumoto
  • Publication number: 20100318765
    Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.
    Type: Application
    Filed: August 20, 2010
    Publication date: December 16, 2010
    Applicant: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7853774
    Abstract: An integrated circuit including a plurality of tiles. Each tile comprises a processor; a switch including switching circuitry to forward data words over data paths from other tiles to the processor and to switches of other tiles; and memory coupled to the switch to buffer data transmitted among the tiles. The switches form a plurality of networks among the tiles. At least one of the networks is configured to transmit data among the tiles using an approach that reserves sufficient buffer space in the memories coupled to the switches to avoid deadlock conditions, and at least one of the networks is configured to transmit data among the tiles using an approach to detect and recover from deadlock conditions.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: December 14, 2010
    Assignee: Tilera Corporation
    Inventor: David Wentzlaff
  • Patent number: 7849241
    Abstract: The disclosed heterogeneous processor compresses information to more efficiently store the information in a system memory coupled to the processor. The heterogeneous processor includes a general purpose processor core coupled to one or more processor cores that exhibit an architecture different from the architecture of the general purpose processor core. In one embodiment, the processor dedicates a processor core other than the general purpose processor core to memory compression and decompression tasks. In another embodiment, system memory stores both compressed information and uncompressed information.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: December 7, 2010
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Barry L Minor
  • Patent number: 7844801
    Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.
    Type: Grant
    Filed: July 31, 2003
    Date of Patent: November 30, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen
  • Patent number: 7840778
    Abstract: A parallel processing architecture comprising a cluster of embedded processors that share a common code distribution bus. Pages or blocks of code are concurrently loaded into respective program memories of some or all of these processors (typically all processors assigned to a particular task) over the code distribution bus, and are executed in parallel by these processors. A task control processor determines when all of the processors assigned to a particular task have finished executing the current code page, and then loads a new code page (e.g., the next sequential code page within a task) into the program memories of these processors for execution. The processors within the cluster preferably share a common memory (1 per cluster) that is used to receive data inputs from, and to provide data outputs to, a higher level processor. Multiple interconnected clusters may be integrated within a common integrated circuit device.
    Type: Grant
    Filed: August 31, 2006
    Date of Patent: November 23, 2010
    Inventors: Richard F. Hobson, Bill Ressl, Allan R. Dyck
  • Patent number: 7836276
    Abstract: A SIMD processor efficiently utilizes its hardware resources to achieve higher data processing throughput. The effective width of a SIMD processor is extended by clocking the instruction processing side of the SIMD processor at a fraction of the rate of the data processing side and by providing multiple execution pipelines, each with multiple data paths. As a result, higher data processing throughput is achieved while an instruction is fetched and issued once per clock. This configuration also allows a large group of threads to be clustered and executed together through the SIMD processor so that greater memory efficiency can be achieved for certain types of operations like texture memory accesses performed in connection with graphics processing.
    Type: Grant
    Filed: December 2, 2005
    Date of Patent: November 16, 2010
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm
  • Patent number: 7831804
    Abstract: A processor architecture includes a number of processing elements for treating input signals. The architecture is organized according to a matrix including rows and columns, the columns of which each include at least one microprocessor block having a computational part and a set of associated processing elements that are able to receive the same input signals. The number of associated processing elements is selectively variable in the direction of the column so as to exploit the parallelism of said signals. Additionally, the processor architecture of the present invention enables dynamic switching between instruction parallelism and the data-parallel processing typical of vectorial functionality. The architecture can be scaled in various dimensions to an optimal configuration for the algorithm to be executed.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: November 9, 2010
    Assignee: ST Microelectronics S.R.L.
    Inventors: Francesco Pappalardo, Giuseppe Notarangelo, Elio Guidetti
  • Publication number: 20100281234
    Abstract: A method includes providing a processor configured to execute instructions. The method may further include providing a first set of registers in the processor to store first data and first instructions associated with a first thread, and providing a second set of registers in the processor to store second data and second instructions associated with a second thread. The method may further include transmitting the first data and first instructions associated with the first thread to the first set of registers, and executing the first instructions in order to process the first data. The method may further include transmitting the second data and second instructions to the second set of registers while executing the first instructions and processing the first data. A corresponding apparatus is also disclosed and claimed herein.
    Type: Application
    Filed: April 30, 2009
    Publication date: November 4, 2010
    Applicant: Novafora, Inc.
    Inventors: MUHAMMAD AHMED, Marc Schaub, Shlomo Selim Rakib
  • Patent number: 7822881
    Abstract: In a data-processing method, first result data may be obtained using a plurality of configurable coarse-granular elements, the first result data may be written into a memory that includes spatially separate first and second memory areas and that is connected via a bus to the plurality of configurable coarse-granular elements, the first result data may be subsequently read out from the memory, and the first result data may be subsequently processed using the plurality of configurable coarse-granular elements. In a first configuration, the first memory area may be configured as a write memory, and the second memory area may be configured as a read memory. Subsequent to writing to and reading from the memory in accordance with the first configuration, the first memory area may be configured as a read memory, and the second memory area may be configured as a write memory.
    Type: Grant
    Filed: October 7, 2005
    Date of Patent: October 26, 2010
    Inventors: Martin Vorbach, Robert Münch
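The alternating write/read roles of the two memory areas described in the abstract of patent 7822881 above amount to classic double buffering; a minimal sketch, with placeholder processing steps:
```c
#include <stdio.h>

#define AREA_SIZE 4

static int memory[2][AREA_SIZE];   /* two spatially separate memory areas */

int main(void)
{
    int write_area = 0;            /* first configuration: area 0 = write, area 1 = read */

    for (int pass = 0; pass < 3; pass++) {
        int read_area = 1 - write_area;

        /* Coarse-granular elements write this pass's result data ...      */
        for (int i = 0; i < AREA_SIZE; i++)
            memory[write_area][i] = pass * 10 + i;

        /* ... while the previous pass's results are read back for further
         * processing.                                                     */
        printf("pass %d: writing area %d, reading area %d (first word %d)\n",
               pass, write_area, read_area, memory[read_area][0]);

        write_area = read_area;    /* swap roles for the next configuration */
    }
    return 0;
}
```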
  • Patent number: 7818541
    Abstract: A data processing architecture comprising: an input device for receiving an incoming stream of data packets; and a plurality of processing elements which are operable to process data received thereby; wherein the input device is operable to distribute data packets in whole or in part to the processing elements in dependence upon the data processing bandwidth of the processing elements.
    Type: Grant
    Filed: May 23, 2007
    Date of Patent: October 19, 2010
    Assignee: Clearspeed Technology Limited
    Inventors: John Rhoades, Ken Cameron, Paul Winser, Ray McConnell, Gordon Faulds, Simon McIntosh-Smith, Anthony Spencer, Jeff Bond, Matthias Dejaegher, Danny Halamish, Gajinder Panesar
  • Patent number: 7802073
    Abstract: The present disclosure provides methods and systems adapted for use with a processor having one or more physical cores. The methods and systems include a virtual core management component adapted to map one or more virtual cores to at least one of the physical cores to enable execution of one or more programs by the at least one physical core. The one or more virtual cores include one or more logical states associated with the execution of the one or more programs. The methods and systems may include a memory component adapted to store the one or more virtual cores. The virtual core management component may be adapted to transfer the one or more virtual cores from the memory component to the at least one physical core.
    Type: Grant
    Filed: July 23, 2007
    Date of Patent: September 21, 2010
    Assignee: Oracle America, Inc.
    Inventors: Yu Qing Cheng, John Gregory Favor, Carlos Puchol, Seungyoon Peter Song, Peter Glaskowsky, Laurent Moll, Joe Rowlands, Donald Alpert
  • Patent number: 7797512
    Abstract: A virtual core management system including one or more physical cores and one or more virtual cores. Each virtual core respectively includes a collection of logical states associated with execution of a corresponding program. The virtual core management system further includes one or more interrupt controllers configured to send one or more interrupt signals to interrupt execution of a corresponding program associated with at least one of the one or more virtual cores, and a virtual core management component configured to map the at least one virtual core to one of the one or more physical cores and route the one or more interrupt signals to the corresponding physical core.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: September 14, 2010
    Assignee: Oracle America, Inc.
    Inventors: Yu Qing Cheng, John Gregory Favor, Peter N. Glaskowsky, Laurent R. Moll, Carlos Puchol, Seungyoon Peter Song
  • Patent number: 7793075
    Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.
    Type: Grant
    Filed: July 9, 2008
    Date of Patent: September 7, 2010
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7793074
    Abstract: An apparatus comprises a plurality of processor cores, and an interconnection network to route data among the processor cores based on destination information in the data. The processor cores are configured to forward the data to a final destination if the destination information indicates that a destination processor core has been reached, or to forward the data to other processor cores if the destination information indicates that a destination processor core has not been reached. The final destination is one of a plurality of destinations indicated by the destination information, the destinations including a plurality of portions of the destination processor core.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: September 7, 2010
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal
  • Patent number: 7769980
    Abstract: In arithmetic/logic units (ALU) provided corresponding to entries, an MIMD instruction decoder generating a group of control signals in accordance with a Multiple Instruction-Multiple Data (MIMD) instruction and an MIMD register storing data designating the MIMD instruction are provided, and an inter-ALU communication circuit is provided. The amount and direction of movement of the inter-ALU communication circuit are set by data bits stored in a movement data register. Data movement and arithmetic/logic operations can thus be executed with the amount of movement and the operation instruction set individually for each ALU unit. Therefore, in a Single Instruction-Multiple Data type processing device, Multiple Instruction-Multiple Data operation can be executed at high speed in a flexible manner.
    Type: Grant
    Filed: August 16, 2007
    Date of Patent: August 3, 2010
    Assignee: Renesas Technology Corp.
    Inventors: Toshinori Sueyoshi, Masahiro Iida, Mitsutaka Nakano, Fumiaki Senoue, Katsuya Mizumoto
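A minimal sketch of the per-entry control described in the abstract of patent 7769980 above (and in publication 20100325386 earlier in this list): each ALU entry holds its own movement amount/direction and operation, so one SIMD step moves data and operates differently per entry. The register layout and opcodes below are assumptions.
```c
#include <stdio.h>

#define ENTRIES 4

typedef enum { OP_ADD, OP_SUB } op_t;

/* Per-entry "MIMD register": movement (signed entry offset) and the
 * operation that entry should perform.                              */
typedef struct {
    int  move;      /* which neighbouring entry to fetch an operand from */
    op_t op;
} mimd_reg_t;

int main(void)
{
    int data[ENTRIES] = { 10, 20, 30, 40 };
    mimd_reg_t reg[ENTRIES] = {
        { +1, OP_ADD }, { -1, OP_SUB }, { +1, OP_ADD }, { -1, OP_SUB }
    };
    int result[ENTRIES];

    /* One "SIMD step": every entry moves data and operates, but the
     * amount/direction of movement and the opcode are set per entry. */
    for (int e = 0; e < ENTRIES; e++) {
        int src     = (e + reg[e].move + ENTRIES) % ENTRIES;  /* inter-ALU move */
        int operand = data[src];
        result[e]   = (reg[e].op == OP_ADD) ? data[e] + operand
                                            : data[e] - operand;
    }

    for (int e = 0; e < ENTRIES; e++)
        printf("entry %d -> %d\n", e, result[e]);
    return 0;
}
```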
  • Patent number: 7765338
    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
    Type: Grant
    Filed: July 9, 2007
    Date of Patent: July 27, 2010
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
  • Publication number: 20100174868
    Abstract: A coupling of a traditional processor, in particular a sequential processor, with a reconfigurable field of data processing units, in particular a runtime-reconfigurable field of data processing units, is described.
    Type: Application
    Filed: March 22, 2010
    Publication date: July 8, 2010
    Inventor: Martin Vorbach
  • Patent number: 7752361
    Abstract: A system including a storage processing device with an input/output module. The input/output module has port processors to receive and transmit network traffic. The input/output module also has a switch connecting the port processors. Each port processor categorizes the network traffic as fast path network traffic or control path network traffic. The switch routes fast path network traffic from an ingress port processor to a specified egress port processor. The storage processing device also includes a control module to process the control path network traffic received from the ingress port processor. The control module routes processed control path network traffic to the switch for routing to a defined egress port processor. The control module is connected to the input/output module. The input/output module and the control module are configured to interactively support data virtualization, data migration, data journaling, and snapshotting.
    Type: Grant
    Filed: October 28, 2003
    Date of Patent: July 6, 2010
    Assignee: Brocade Communications Systems, Inc.
    Inventors: Venkat Rangan, Edward D. McClanahan, Michael B. Schmitz
  • Patent number: 7752337
    Abstract: The present invention utilizes a “small-world” network architecture, in which a relatively small number of random cross-links between nodes or vertices in a network can result in small characteristic path lengths, for the transfer of messages between nodes or vertices in a telecommunications/computer network regardless of their location. The “small world” principle is usually considered to apply to many biological and social networks, as these systems generally exhibit properties that are not completely regular or completely random but somewhere in between. The present invention applies this small-world principle to telecommunications/computer networks.
    Type: Grant
    Filed: June 30, 2000
    Date of Patent: July 6, 2010
    Assignee: Jifmar Pty Ltd
    Inventors: Fergus O'Brien, Matthew Roughan
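The property relied on in the abstract of patent 7752337 above, that a few random cross-links sharply reduce the characteristic path length of an otherwise regular network, can be checked numerically. The sketch below builds a ring lattice, adds a handful of random shortcuts, and compares average shortest-path lengths via breadth-first search; the sizes and counts are arbitrary.
```c
#include <stdio.h>
#include <stdlib.h>

#define N 100            /* nodes in a ring lattice                   */
#define SHORTCUTS 10     /* random cross-links added to the ring      */

static int adj[N][N];    /* adjacency matrix                          */

/* Average shortest-path length over all node pairs (BFS from each node). */
static double avg_path_length(void)
{
    long total = 0, pairs = 0;
    for (int s = 0; s < N; s++) {
        int dist[N], queue[N], head = 0, tail = 0;
        for (int i = 0; i < N; i++) dist[i] = -1;
        dist[s] = 0;
        queue[tail++] = s;
        while (head < tail) {
            int u = queue[head++];
            for (int v = 0; v < N; v++)
                if (adj[u][v] && dist[v] < 0) {
                    dist[v] = dist[u] + 1;
                    queue[tail++] = v;
                }
        }
        for (int v = 0; v < N; v++)
            if (v != s) { total += dist[v]; pairs++; }
    }
    return (double)total / pairs;
}

int main(void)
{
    /* Regular ring: each node linked to its two nearest neighbours.  */
    for (int i = 0; i < N; i++)
        adj[i][(i + 1) % N] = adj[(i + 1) % N][i] = 1;
    printf("ring only:        average path length %.2f\n", avg_path_length());

    /* A relatively small number of random cross-links ("shortcuts"). */
    srand(1);
    for (int k = 0; k < SHORTCUTS; k++) {
        int a = rand() % N, b = rand() % N;
        if (a != b) adj[a][b] = adj[b][a] = 1;
    }
    printf("with %d shortcuts: average path length %.2f\n",
           SHORTCUTS, avg_path_length());
    return 0;
}
```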
  • Patent number: 7739479
    Abstract: A method of providing physics data within a game program or simulation using a hardware-based physics processing unit having unique architecture designed to efficiently calculate physics related data.
    Type: Grant
    Filed: November 19, 2003
    Date of Patent: June 15, 2010
    Assignee: NVIDIA Corporation
    Inventors: Jean Pierre Bordes, Curtis Davis, Monier Maher, Manju Hegde, Otto A. Schmid
  • Patent number: 7739647
    Abstract: The present invention provides a configurable domain specific abstract core (DSAC) for implementing applications within any domain. The DSAC comprises at least one function specific abstract module (FSAM) configurable at a plurality of stages for implementing a predetermined function belonging to one or more applications in the domain. The FSAM comprises a function specific abstract logic (FSAL) for implementing functional logic and a micro state engine (MSE) for generating and monitoring one or more control signals, at least one of the control signals being generated by execution of a dynamic script for controlling the FSAL.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: June 15, 2010
    Assignee: Infosys Technologies Ltd.
    Inventors: Guruprasad Ramananda Athani, Ranju Philip Abraham, Shashi Basavappa Chinnikatte
  • Patent number: 7734894
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor including a storage module, wherein the processor is configured to process multiple streams of instructions, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, and coupling circuitry configured to couple data resulting from processing an instruction from at least one of the streams of instructions to the storage module and to the switch.
    Type: Grant
    Filed: April 28, 2008
    Date of Patent: June 8, 2010
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal