Array Processor Patents (Class 712/10)
  • Patent number: 7996585
    Abstract: Disclosed are a method and system of tracking real time use of I/O control blocks on a processing unit basis, in a multiprocessing system, such that in the case of a processing unit failure, a list accurately and concisely identifies the control blocks that need to be recovered. This eliminates the need to scan all the I/O control blocks, greatly reducing the overall system recovery time and minimizing impact to the rest of the running system. The preferred embodiment of the invention uses a task control block structure to record which I/O control blocks are in use by each Processing Unit. Also, the lock word structure defined in the I/O control blocks is provided with an index back into the task control block to facilitate managing the task control block entries.
    Type: Grant
    Filed: September 9, 2005
    Date of Patent: August 9, 2011
    Assignee: International Business Machines Corporation
    Inventors: Janet R. Easton, Elke Nass, Kenneth J. Oakes, Andrew W. Piechowski, Martin Taubert, John S. Trotter, Ambrose Verdibello, Joachim von Buttlar, Robert Whalen, Jr.
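A minimal sketch of the bookkeeping described in the abstract of patent 7996585 above, assuming hypothetical structure and field names: each processing unit's task control block lists the I/O control blocks it has in use, and each control block's lock word carries an index back into that list, so recovery after a failure visits only the listed blocks instead of scanning them all.
```c
#include <stdio.h>

#define NUM_PUS       4      /* processing units                          */
#define IOCBS_PER_PU  8      /* task-control-block slots per PU           */

/* Hypothetical I/O control block: its lock word records which PU
 * holds it and which task-control-block slot points back at it.   */
typedef struct {
    int owner_pu;            /* -1 when not in use                        */
    int tcb_index;           /* index back into the owner's TCB entries   */
} iocb_t;

/* Hypothetical per-PU task control block: a short list of the
 * control blocks currently in use by that PU.                     */
typedef struct {
    iocb_t *in_use[IOCBS_PER_PU];
    int     count;
} tcb_t;

static tcb_t tcb[NUM_PUS];

/* Record that a PU has taken an I/O control block into use. */
static void iocb_acquire(int pu, iocb_t *cb)
{
    int slot = tcb[pu].count++;
    tcb[pu].in_use[slot] = cb;
    cb->owner_pu  = pu;
    cb->tcb_index = slot;    /* lock word points back into the TCB        */
}

/* On a PU failure, only the blocks listed in that PU's TCB need
 * recovery -- no scan of every I/O control block in the system.   */
static void recover_pu(int pu)
{
    for (int i = 0; i < tcb[pu].count; i++) {
        iocb_t *cb = tcb[pu].in_use[i];
        printf("recovering control block owned by PU %d (slot %d)\n",
               cb->owner_pu, cb->tcb_index);
        cb->owner_pu = -1;
    }
    tcb[pu].count = 0;
}

int main(void)
{
    static iocb_t pool[32];
    iocb_acquire(2, &pool[5]);
    iocb_acquire(2, &pool[9]);
    recover_pu(2);           /* visits exactly the two blocks PU 2 held   */
    return 0;
}
```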
  • Patent number: 7984266
    Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: July 19, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
  • Patent number: 7975080
    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
    Type: Grant
    Filed: June 21, 2010
    Date of Patent: July 5, 2011
    Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
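The bit-reversed addressing mentioned in the abstract of patent 7975080 above is the standard data-reordering step for radix-2 FFTs. The sketch below only illustrates that reordering on a single array; in the patented scheme the reordering is carried out by a DMA unit across multiple PE local memories, concurrently with FFT computation.
```c
#include <stdio.h>

/* Reverse the low 'bits' bits of index i (e.g. for an N = 2^bits FFT). */
static unsigned bit_reverse(unsigned i, unsigned bits)
{
    unsigned r = 0;
    for (unsigned b = 0; b < bits; b++) {
        r = (r << 1) | (i & 1u);
        i >>= 1;
    }
    return r;
}

int main(void)
{
    const unsigned bits = 3, n = 1u << bits;   /* 8-point example */
    float in[8]  = {0, 1, 2, 3, 4, 5, 6, 7};
    float out[8];

    /* Distribute element i to position bit_reverse(i); with multiple PEs
     * the destination index would also select the target PE's local
     * memory, and a DMA unit would do this while the FFT runs.         */
    for (unsigned i = 0; i < n; i++)
        out[bit_reverse(i, bits)] = in[i];

    for (unsigned i = 0; i < n; i++)
        printf("%g ", out[i]);                 /* 0 4 2 6 1 5 3 7 */
    printf("\n");
    return 0;
}
```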
  • Patent number: 7966475
    Abstract: A data processor comprises a plurality of processing elements arranged for parallel processing of data, and a controller for controlling the plurality of processing elements. The controller is operable to determine respective status information for a plurality of processing threads, and to control processing of the processing threads by the plurality of processors in dependence upon such status information.
    Type: Grant
    Filed: January 10, 2007
    Date of Patent: June 21, 2011
    Assignee: Rambus Inc.
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 7962716
    Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.
    Type: Grant
    Filed: November 17, 2004
    Date of Patent: June 14, 2011
    Assignee: QST Holdings, Inc.
    Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
  • Patent number: 7962717
    Abstract: Each processor node in an array of nodes has a respective local node address, and each local node address comprises a plurality of components having an order of addressing significance from most to least significant. Each node comprises: mapping means configured to map each component of the local node address onto a respective routing direction, and a switch arranged to receive a message having a destination node address identifying a destination node. The switch comprises: means for comparing the local node address to the destination node address to identify the most significant non-matching component; and means for routing the message to another node, on the condition that the local node address does not match the destination node address, in the direction mapped to the most significant non-matching component.
    Type: Grant
    Filed: March 14, 2007
    Date of Patent: June 14, 2011
    Assignee: XMOS Limited
    Inventor: Michael David May
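A minimal sketch of the routing rule in the abstract of patent 7962717 above, with hypothetical address components and direction names: compare the local and destination addresses component by component from most to least significant, and forward in the direction mapped to the first (most significant) component that differs.
```c
#include <stdio.h>

#define COMPONENTS 3   /* address components, index 0 = most significant */

/* Hypothetical per-node mapping from address component to routing direction. */
static const char *dir_map[COMPONENTS] = { "dir-A", "dir-B", "dir-C" };

/* Return the direction to forward on, or NULL if the message has
 * arrived (local address equals destination address).              */
static const char *route(const int local[COMPONENTS],
                         const int dest[COMPONENTS])
{
    for (int c = 0; c < COMPONENTS; c++)       /* most significant first */
        if (local[c] != dest[c])
            return dir_map[c];
    return NULL;                               /* deliver locally */
}

int main(void)
{
    int local[COMPONENTS] = { 1, 2, 0 };
    int dest [COMPONENTS] = { 1, 3, 7 };
    const char *d = route(local, dest);
    printf("%s\n", d ? d : "deliver locally"); /* dir-B: component 1 differs first */
    return 0;
}
```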
  • Patent number: 7958341
    Abstract: In some embodiments, each matrix processor in a matrix of mesh-interconnected matrix processors includes an instruction processing pipeline, and a hardware data switch capable of streaming data to/from one or more inter-processor matrix links and/or matrix processor local memory links in response to execution of a data streaming instruction by the instruction processing pipeline. The data switch can transfer each data stream, which includes multiple words, at wire speed, one word per cycle. After initiating a data stream, the processing pipeline can execute other instructions, including streaming instructions, while a stream transfer is in progress. Different data streaming instructions may be used to transfer data streams from local memory to one or more inter-processor links, from an inter-processor link to local memory, from an inter-processor link to one or more inter-processor links, and from an inter-processor link to one or more inter-processor links and synchronously to local memory.
    Type: Grant
    Filed: July 7, 2008
    Date of Patent: June 7, 2011
    Assignee: Ovics
    Inventors: Sorin C Cismas, Ilie Garbacea
  • Patent number: 7958332
    Abstract: A controller operable to control an array of processing elements comprises a retrieval unit operable to retrieve instruction items for each of a plurality of instruction streams, each instruction stream having a plurality of instruction items, a combining unit operable to combine the plurality of instruction streams into a serial instruction stream, and a distribution unit operable to distribute the serial instruction stream to an array of processing elements.
    Type: Grant
    Filed: March 13, 2009
    Date of Patent: June 7, 2011
    Assignee: Rambus Inc.
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
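One way to picture the retrieve/combine/distribute split in the abstract of patent 7958332 above is a simple interleaver; the round-robin policy below is an assumption made for illustration, not something stated in the patent.
```c
#include <stdio.h>

#define STREAMS 3
#define ITEMS   4

int main(void)
{
    /* Hypothetical instruction items for three instruction streams. */
    const char *stream[STREAMS][ITEMS] = {
        { "a0", "a1", "a2", "a3" },
        { "b0", "b1", "b2", "b3" },
        { "c0", "c1", "c2", "c3" },
    };

    /* Combining unit: interleave the streams into one serial stream
     * (round-robin here purely for illustration).                   */
    const char *serial[STREAMS * ITEMS];
    int n = 0;
    for (int i = 0; i < ITEMS; i++)
        for (int s = 0; s < STREAMS; s++)
            serial[n++] = stream[s][i];

    /* Distribution unit: issue the serial stream to the PE array.   */
    for (int i = 0; i < n; i++)
        printf("issue %s to PE array\n", serial[i]);
    return 0;
}
```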
  • Patent number: 7953956
    Abstract: A reconfigurable circuit of reduced circuit scale. The reconfigurable circuit of the present invention comprises a plurality of ALUs capable of changing functions. The plurality of ALUs are arranged in a matrix. At least one connection unit, capable of selectively establishing connections between the ALUs, is provided between the stages of the ALUs. This connection unit is not intended to allow connections between all the logic circuits in adjoining stages, but is configured so that each logic circuit is connectable to only some of the logic circuits in the other stages. This connection limitation allows a reduction in circuit scale.
    Type: Grant
    Filed: December 21, 2004
    Date of Patent: May 31, 2011
    Assignee: Sanyo Electric Co., Ltd.
    Inventors: Makoto Okada, Tatsuo Hiramatsu, Hiroshi Nakajima, Makoto Ozone
  • Patent number: 7940755
    Abstract: An architecture for a specialized electronic computer for high-speed data lookup employs a set of tiles each with independent processors and lookup memory portions. The tiles may be programmed to interconnect to form different memory topologies optimized for the particular task.
    Type: Grant
    Filed: March 19, 2009
    Date of Patent: May 10, 2011
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Cristian Estan, Karthikeyan Sankaralingam
  • Patent number: 7941634
    Abstract: Specialized image processing circuitry is usually implemented in hardware in a massively parallel way as a single instruction multiple data (SIMD) architecture. The invention prevents long and complicated connection paths between a processing element and the memory subsystem, and improves maximum operating frequency. An optimized architecture for image processing has processing elements that are arranged in a two-dimensional structure, and each processing element has a local storage containing a plurality of reference pixels that are not neighbors in the reference image. Instead, the reference pixels belong to different blocks of the reference image, which may vary for different encoding schemes.
    Type: Grant
    Filed: November 14, 2007
    Date of Patent: May 10, 2011
    Assignee: Thomson Licensing
    Inventors: Marco Georgi, Klaus Gaedke, Malte Borsum
  • Patent number: 7937594
    Abstract: A digital logic circuit comprises a programmable logic device and a programmable security circuit. The programmable security circuit stores a set of authorized configuration security keys. The programmable security circuit compares the authorized configuration security keys with an incoming configuration request, and selectively enables a new configuration for the programmable logic device in response to the configuration request. In another exemplary embodiment, a programmable security circuit also stores a set of authorized operation security keys. The programmable security circuit compares the authorized operation security keys with an incoming operation request from the programmable logic device, and selectively enables an operation within the programmable logic device in response to the operation request.
    Type: Grant
    Filed: May 16, 2006
    Date of Patent: May 3, 2011
    Assignee: Infineon Technologies AG
    Inventors: Stephen L. Wasson, David K. Varn, John D. Ralston
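A minimal sketch of the gating behavior described in the abstract of patent 7937594 above, assuming hypothetical key values and a simple linear search of the stored authorized-key set: a configuration request is enabled only if it carries an authorized configuration security key.
```c
#include <stdio.h>
#include <stdint.h>

#define NUM_KEYS 3

/* Hypothetical set of authorized configuration security keys. */
static const uint64_t authorized_config_keys[NUM_KEYS] = {
    0x1111222233334444ULL, 0xAAAABBBBCCCCDDDDULL, 0x0123456789ABCDEFULL
};

/* Enable a new configuration only if the incoming request carries
 * one of the authorized keys.                                      */
static int config_request_allowed(uint64_t incoming_key)
{
    for (int i = 0; i < NUM_KEYS; i++)
        if (authorized_config_keys[i] == incoming_key)
            return 1;
    return 0;
}

int main(void)
{
    printf("%d\n", config_request_allowed(0x0123456789ABCDEFULL)); /* 1: enabled */
    printf("%d\n", config_request_allowed(0xDEADBEEFULL));         /* 0: rejected */
    return 0;
}
```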
  • Patent number: 7937557
    Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
    Type: Grant
    Filed: March 16, 2004
    Date of Patent: May 3, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
  • Patent number: 7934075
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously and operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. The instructions executed by the computers (12) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In one application, the sleeping computer (12) is awakened by an input such that it commences an action that would otherwise require an interrupt of an otherwise active computer. For example, one computer (12f) can be used to monitor an input/output port of the computer array (10).
    Type: Grant
    Filed: May 26, 2006
    Date of Patent: April 26, 2011
    Assignee: VNS Portfolio LLC
    Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
  • Patent number: 7934031
    Abstract: An asynchronous logic family of circuits which communicate on delay-insensitive flow-controlled channels with 4-phase handshakes and 1 of N encoding, compute output data directly from input data using domino logic, and use the state-holding ability of the domino logic to implement pipelining without additional latches.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: April 26, 2011
    Assignee: California Institute of Technology
    Inventors: Andrew M. Lines, Alain J. Martin, Uri Cummings
  • Patent number: 7930517
    Abstract: An array of programmable data-processing cells configured as a plurality of cross-connected pipelines. An apparatus includes cells capable of performing data-processing functions selectable by a presented instruction. A first set of cells includes an input cell, an output cell, and a series of at least one interior cell providing an acyclic data processing path from the input cell to the output cell. Additional cells are similarly configured. Memory presents configuration instructions to cells in response to a configuration code. Data advances through ranks of the cells. The configuration code advances to memory associated with a rank in tandem with the data.
    Type: Grant
    Filed: January 9, 2009
    Date of Patent: April 19, 2011
    Assignee: Wave Semiconductor, Inc.
    Inventor: Karl M. Fant
  • Patent number: 7930533
    Abstract: A system for pre-execution environment (PXE) booting a storage processor from a peer storage processor allows for the ability to reboot and/or restart the storage processor without an externally connected PXE server. In response to a reboot request of the storage processor, the peer storage processor pushes an operating system boot image and/or other information to the storage processor for PXE booting the storage processor, and vice versa. The system may also operate with multiple coupled computers.
    Type: Grant
    Filed: September 26, 2007
    Date of Patent: April 19, 2011
    Assignee: EMC Corporation
    Inventors: Ying Guo, Qing Liu, Kevin Richards
  • Patent number: 7925860
    Abstract: In parallel processing devices performing streaming computations, processing each data element of the stream may not be computationally intensive and thus may take relatively little time compared to the memory access times required to read the stream and write the results. Therefore, memory throughput often limits the performance of the streaming computation. Generally stated, provided are methods for achieving improved, optimized, or ultimately, maximized memory throughput in such memory-throughput-limited streaming computations. Streaming computation performance is maximized by improving the aggregate memory throughput across the plurality of processing elements and threads. High aggregate memory throughput is achieved by balancing processing loads between threads and groups of threads and a hardware memory interface coupled to the parallel processing devices.
    Type: Grant
    Filed: May 14, 2007
    Date of Patent: April 12, 2011
    Assignee: NVIDIA Corporation
    Inventors: Norbert Juffa, Brett W. Coon
  • Patent number: 7917703
    Abstract: A network on chip (‘NOC’) comprising integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, each IP block coupled to a router through a memory communications controller and a network interface controller, the NOC also including a port on a router of the network through which is received an invalidate command, the invalidate command including an identification of a cache line, the invalidate command representing an instruction to invalidate the cache line, the router configured to send the invalidate command to an IP block served by the router; the router further configured to send the invalidate command horizontally and vertically to neighboring routers if the port is a vertical port; and the router further configured to send the invalidate command only horizontally to neighboring routers if the port is a horizontal port.
    Type: Grant
    Filed: December 13, 2007
    Date of Patent: March 29, 2011
    Assignee: International Business Machines Corporation
    Inventors: Miguel Comparan, Russell D. Hoover, Jamie R. Kuesel, Eric O. Mejdrich, Alfred T. Watson, III
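The forwarding rule in the abstract of patent 7917703 above can be stated compactly: an invalidate command arriving on a vertical port fans out both horizontally and vertically, while one arriving on a horizontal port continues only horizontally; either way it is delivered to the locally served IP block. A small sketch with hypothetical names:
```c
#include <stdio.h>

typedef enum { PORT_VERTICAL, PORT_HORIZONTAL } port_t;

/* Forward an invalidate command according to the arrival port:
 * vertical arrival   -> send horizontally and vertically;
 * horizontal arrival -> send only horizontally.
 * The command is always delivered to the IP block served by this router. */
static void forward_invalidate(port_t arrival, unsigned cache_line)
{
    printf("deliver invalidate of line %u to local IP block\n", cache_line);
    if (arrival == PORT_VERTICAL) {
        printf("forward to horizontal neighbors\n");
        printf("forward to vertical neighbors\n");
    } else {
        printf("forward to horizontal neighbors only\n");
    }
}

int main(void)
{
    forward_invalidate(PORT_VERTICAL, 42);
    forward_invalidate(PORT_HORIZONTAL, 42);
    return 0;
}
```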
  • Patent number: 7913069
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. Instruction words (48) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In a particular example, the series of operations are included in a single instruction word (48). The micro-loop (100) in combination with the ability of the computers (12) to send instruction words (48) to a neighboring computer (12) provides a powerful tool for allowing a computer (12) to utilize the resources of a neighboring computer (12).
    Type: Grant
    Filed: May 26, 2006
    Date of Patent: March 22, 2011
    Assignee: VNS Portfolio LLC
    Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
  • Patent number: 7913007
    Abstract: Systems, methods, and computer program products for preemption in asynchronous systems using anti-tokens are disclosed. According to one aspect, a configurable system for constructing asynchronous application specific integrated data pipeline circuits with preemption includes a plurality of modular circuit stages that are connectable with each other and with other circuit elements to form multi-stage asynchronous application specific integrated data pipeline circuits for asynchronously sending data and tokens in a forward direction through the pipeline and for asynchronously sending anti-tokens in a backward direction through the pipeline. Each stage is configured to perform a handshaking protocol with other pipeline stages, the protocol including receiving either a token from the previous stage or an anti-token from the next stage, and in response, sending both a token forward to the next stage and an anti-token backward to the previous stage.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: March 22, 2011
    Assignee: The University of North Carolina
    Inventors: Montek Singh, Manoj Kumar Ampalam
  • Patent number: 7904905
    Abstract: A system and method are disclosed for efficiently executing single program multiple data (SPMD) programs in a microprocessor. A micro single instruction multiple data (SIMD) unit is located within the microprocessor. A job buffer that is coupled to the micro SIMD unit dynamically allocates tasks to the micro SIMD unit. The SPMD programs each comprise a plurality of input data streams having moderate diversification of control flows. The system executes each SPMD program once for each input data stream of the plurality of input data streams.
    Type: Grant
    Filed: November 14, 2003
    Date of Patent: March 8, 2011
    Assignee: STMicroelectronics, Inc.
    Inventor: Stefano Cervini
  • Patent number: 7904615
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. A plurality of read lines (18), write lines (20) and data lines (22) interconnect the computers (12). When one computer (12) sets a read line (18) high and the other computer sets a corresponding write line (20) high, data is transferred on the data lines (22). When both the read line (18) and corresponding write line (20) go low, both communicating computers (12) know that the communication is completed. An acknowledge line (72) goes high to restart the computers (12).
    Type: Grant
    Filed: February 16, 2006
    Date of Patent: March 8, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
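The handshake in the abstract of patent 7904615 above can be walked through step by step. The sketch below just sequences the described line transitions for one word transfer; the signal names come from the abstract, while the single-threaded sequencing of the model is an assumption.
```c
#include <stdio.h>
#include <stdbool.h>

/* Shared lines between two communicating computers. */
static bool read_line, write_line, ack_line;
static int  data_lines;

static void transfer_word(int word)
{
    read_line = true;                  /* the reading computer raises its read line */
    printf("read line high\n");

    write_line = true;                 /* the writer raises the matching write line */
    data_lines = word;                 /* and places the word on the data lines     */
    printf("write line high, data = %d\n", data_lines);

    /* Both lines drop, so both computers know the transfer completed. */
    read_line = write_line = false;
    printf("read and write lines low: transfer complete\n");

    ack_line = true;                   /* acknowledge line goes high to restart them */
    printf("acknowledge line high\n");
}

int main(void)
{
    transfer_word(0x5A);
    return 0;
}
```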
  • Patent number: 7889951
    Abstract: In an embodiment, an apparatus includes a first processor that includes a first processor element. The apparatus also includes a second processor that includes a second processor element. The first processor is configured to transmit data to the second processor through a third processor, wherein no processor element within the third processor is configured to perform a process operation on the data as part of the transmission of the data from the first processor to the second processor.
    Type: Grant
    Filed: June 19, 2003
    Date of Patent: February 15, 2011
    Assignee: Intel Corporation
    Inventor: Louis A. Lippincott
  • Patent number: 7890733
    Abstract: A data processor comprises a plurality of processing elements (PEs), with memory local to at least one of the processing elements, and a data packet-switched network interconnecting the processing elements and the memory to enable any of the PEs to access the memory. The network consists of nodes arranged linearly or in a grid, e.g., in a SIMD array, so as to connect the PEs and their local memories to a common controller. Transaction-enabled PEs and nodes set flags, which are maintained until the transaction is completed, and signal status to the controller, e.g., over a series of OR-gates. The processor performs memory accesses on data stored in the memory in response to control signals sent by the controller to the memory. The local memories share the same memory map or space. External memory may also be connected to the “end” nodes interfacing with the network, e.g., to provide a cache.
    Type: Grant
    Filed: August 11, 2005
    Date of Patent: February 15, 2011
    Assignee: Rambus Inc.
    Inventor: Ray McConnell
  • Patent number: 7889725
    Abstract: A computer cluster arranged at a lattice point in a lattice-like interconnection network contains four nodes and an internal communication network. Two nodes can transmit packets to adjacent computer clusters located along the X direction, and the other two nodes can transmit packets to adjacent computer clusters located along the Y direction. Each node directly transmits a packet to an adjacent computer cluster when the destination of the packet lies in a direction in which the node can transmit packets. When the destination of a packet to be transmitted from a node does not lie in a direction in which that node can transmit packets, the node transfers the packet through the internal communication network to one of the other nodes, which then transmits the packet toward its destination.
    Type: Grant
    Filed: March 27, 2007
    Date of Patent: February 15, 2011
    Assignee: Fujitsu Limited
    Inventor: Yuichiro Ajima
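A minimal sketch of the per-node decision in the abstract of patent 7889725 above: a node transmits a packet directly if the destination lies along the axis it serves, and otherwise hands the packet over the internal network to a sibling node serving the other axis. The axis assignments and node names below are assumptions.
```c
#include <stdio.h>

typedef enum { AXIS_X, AXIS_Y } axis_t;

typedef struct {
    const char *name;
    axis_t      axis;        /* direction this node can transmit in */
} node_t;

/* Hypothetical cluster of four nodes: two serve the X direction,
 * two serve the Y direction.                                       */
static const node_t cluster[4] = {
    { "n0", AXIS_X }, { "n1", AXIS_X }, { "n2", AXIS_Y }, { "n3", AXIS_Y }
};

static void send(int node, axis_t dest_axis)
{
    if (cluster[node].axis == dest_axis) {
        printf("%s transmits directly to the adjacent cluster\n",
               cluster[node].name);
        return;
    }
    /* Destination lies on the other axis: forward over the internal
     * network to a sibling node that serves that axis.             */
    for (int i = 0; i < 4; i++)
        if (cluster[i].axis == dest_axis) {
            printf("%s forwards internally to %s, which transmits\n",
                   cluster[node].name, cluster[i].name);
            return;
        }
}

int main(void)
{
    send(0, AXIS_X);   /* direct */
    send(0, AXIS_Y);   /* via an internal hop */
    return 0;
}
```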
  • Patent number: 7870395
    Abstract: In an array of groups of cryptographic processors, the processors in each group operate together but are securely connected through an external shared memory. The processors in each group include cryptographic engines capable of operating in a pipelined fashion. Instructions in the form of request blocks are supplied to the array in a balanced fashion to assure that the processors are occupied processing instructions.
    Type: Grant
    Filed: October 20, 2006
    Date of Patent: January 11, 2011
    Assignee: International Business Machines Corporation
    Inventors: Thomas J. Dewkett, Camil Fayad, John K. Li, Siegfried K. H. Sutter, Phil C. Yeh
  • Publication number: 20100325386
    Abstract: In arithmetic/logic units (ALU) provided corresponding to entries, an MIMD instruction decoder generating a group of control signals in accordance with a Multiple Instruction-Multiple Data (MIMD) instruction and an MIMD register storing data designating the MIMD instruction are provided, and an inter-ALU communication circuit is provided. The amount and direction of movement of the inter-ALU communication circuit are set by data bits stored in a movement data register. Data movement and arithmetic/logic operations can thus be executed with the amount of movement and the operation instruction set individually for each ALU unit. Therefore, in a Single Instruction-Multiple Data type processing device, Multiple Instruction-Multiple Data operation can be executed at high speed in a flexible manner.
    Type: Application
    Filed: June 23, 2010
    Publication date: December 23, 2010
    Applicant: Renesas Technology Corp.
    Inventors: Toshinori Sueyoshi, Masahiro Iida, Mitsutaka Nakano, Fumiaki Senoue, Katsuya Mizumoto
  • Publication number: 20100318765
    Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.
    Type: Application
    Filed: August 20, 2010
    Publication date: December 16, 2010
    Applicant: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7853774
    Abstract: An integrated circuit including a plurality of tiles. Each tile comprises a processor; a switch including switching circuitry to forward data words over data paths from other tiles to the processor and to switches of other tiles; and memory coupled to the switch to buffer data transmitted among the tiles. The switches form a plurality of networks among the tiles. At least one of the networks is configured to transmit data among the tiles using an approach that reserves sufficient buffer space in the memories coupled to the switches to avoid deadlock conditions, and at least one of the networks is configured to transmit data among the tiles using an approach to detect and recover from deadlock conditions.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: December 14, 2010
    Assignee: Tilera Corporation
    Inventor: David Wentzlaff
  • Patent number: 7849241
    Abstract: The disclosed heterogeneous processor compresses information to more efficiently store the information in a system memory coupled to the processor. The heterogeneous processor includes a general purpose processor core coupled to one or more processor cores that exhibit an architecture different from the architecture of the general purpose processor core. In one embodiment, the processor dedicates a processor core other than the general purpose processor core to memory compression and decompression tasks. In another embodiment, system memory stores both compressed information and uncompressed information.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: December 7, 2010
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Barry L Minor
  • Patent number: 7844801
    Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.
    Type: Grant
    Filed: July 31, 2003
    Date of Patent: November 30, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen
  • Patent number: 7840778
    Abstract: A parallel processing architecture comprising a cluster of embedded processors that share a common code distribution bus. Pages or blocks of code are concurrently loaded into respective program memories of some or all of these processors (typically all processors assigned to a particular task) over the code distribution bus, and are executed in parallel by these processors. A task control processor determines when all of the processors assigned to a particular task have finished executing the current code page, and then loads a new code page (e.g., the next sequential code page within a task) into the program memories of these processors for execution. The processors within the cluster preferably share a common memory (1 per cluster) that is used to receive data inputs from, and to provide data outputs to, a higher level processor. Multiple interconnected clusters may be integrated within a common integrated circuit device.
    Type: Grant
    Filed: August 31, 2006
    Date of Patent: November 23, 2010
    Inventors: Richard F. Hobson, Bill Ressl, Allan R. Dyck
  • Patent number: 7836276
    Abstract: A SIMD processor efficiently utilizes its hardware resources to achieve higher data processing throughput. The effective width of a SIMD processor is extended by clocking the instruction processing side of the SIMD processor at a fraction of the rate of the data processing side and by providing multiple execution pipelines, each with multiple data paths. As a result, higher data processing throughput is achieved while an instruction is fetched and issued once per clock. This configuration also allows a large group of threads to be clustered and executed together through the SIMD processor so that greater memory efficiency can be achieved for certain types of operations like texture memory accesses performed in connection with graphics processing.
    Type: Grant
    Filed: December 2, 2005
    Date of Patent: November 16, 2010
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm
  • Patent number: 7831804
    Abstract: A processor architecture includes a number of processing elements for treating input signals. The architecture is organized according to a matrix including rows and columns, the columns of which each include at least one microprocessor block having a computational part and a set of associated processing elements that are able to receive the same input signals. The number of associated processing elements is selectively variable in the direction of the column so as to exploit the parallelism of said signals. Additionally, the processor architecture of the present invention enables dynamic switching between instruction parallelism and the data-parallel processing typical of vectorial functionality. The architecture can be scaled in various dimensions to an optimal configuration for the algorithm to be executed.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: November 9, 2010
    Assignee: ST Microelectronics S.R.L.
    Inventors: Francesco Pappalardo, Giuseppe Notarangelo, Elio Guidetti
  • Publication number: 20100281234
    Abstract: A method includes providing a processor configured to execute instructions. The method may further include providing a first set of registers in the processor to store first data and first instructions associated with a first thread, and providing a second set of registers in the processor to store second data and second instructions associated with a second thread. The method may further include transmitting the first data and first instructions associated with the first thread to the first set of registers, and executing the first instructions in order to process the first data. The method may further include transmitting the second data and second instructions to the second set of registers while executing the first instructions and processing the first data. A corresponding apparatus is also disclosed and claimed herein.
    Type: Application
    Filed: April 30, 2009
    Publication date: November 4, 2010
    Applicant: Novafora, Inc.
    Inventors: MUHAMMAD AHMED, Marc Schaub, Shlomo Selim Rakib
  • Patent number: 7822881
    Abstract: In a data-processing method, first result data may be obtained using a plurality of configurable coarse-granular elements, the first result data may be written into a memory that includes spatially separate first and second memory areas and that is connected via a bus to the plurality of configurable coarse-granular elements, the first result data may be subsequently read out from the memory, and the first result data may be subsequently processed using the plurality of configurable coarse-granular elements. In a first configuration, the first memory area may be configured as a write memory, and the second memory area may be configured as a read memory. Subsequent to writing to and reading from the memory in accordance with the first configuration, the first memory area may be configured as a read memory, and the second memory area may be configured as a write memory.
    Type: Grant
    Filed: October 7, 2005
    Date of Patent: October 26, 2010
    Inventors: Martin Vorbach, Robert Münch
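The alternating write/read roles of the two memory areas described in the abstract of patent 7822881 above amount to classic double buffering; a minimal sketch, with placeholder processing steps:
```c
#include <stdio.h>

#define AREA_SIZE 4

static int memory[2][AREA_SIZE];   /* two spatially separate memory areas */

int main(void)
{
    int write_area = 0;            /* first configuration: area 0 = write, area 1 = read */

    for (int pass = 0; pass < 3; pass++) {
        int read_area = 1 - write_area;

        /* Coarse-granular elements write this pass's result data ...      */
        for (int i = 0; i < AREA_SIZE; i++)
            memory[write_area][i] = pass * 10 + i;

        /* ... while the previous pass's results are read back for further
         * processing.                                                     */
        printf("pass %d: writing area %d, reading area %d (first word %d)\n",
               pass, write_area, read_area, memory[read_area][0]);

        write_area = read_area;    /* swap roles for the next configuration */
    }
    return 0;
}
```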
  • Patent number: 7818541
    Abstract: A data processing architecture comprising: an input device for receiving an incoming stream of data packets; and a plurality of processing elements which are operable to process data received thereby; wherein the input device is operable to distribute data packets in whole or in part to the processing elements in dependence upon the data processing bandwidth of the processing elements.
    Type: Grant
    Filed: May 23, 2007
    Date of Patent: October 19, 2010
    Assignee: Clearspeed Technology Limited
    Inventors: John Rhoades, Ken Cameron, Paul Winser, Ray McConnell, Gordon Faulds, Simon McIntosh-Smith, Anthony Spencer, Jeff Bond, Matthias Dejaegher, Danny Halamish, Gajinder Panesar
  • Patent number: 7802073
    Abstract: The present disclosure provides methods and systems adapted for use with a processor having one or more physical cores. The methods and systems include a virtual core management component adapted to map one or more virtual cores to at least one of the physical cores to enable execution of one or more programs by the at least one physical core. The one or more virtual cores include one or more logical states associated with the execution of the one or more programs. The methods and systems may include a memory component adapted to store the one or more virtual cores. The virtual core management component may be adapted to transfer the one or more virtual cores from the memory component to the at least one physical core.
    Type: Grant
    Filed: July 23, 2007
    Date of Patent: September 21, 2010
    Assignee: Oracle America, Inc.
    Inventors: Yu Qing Cheng, John Gregory Favor, Carlos Puchol, Seungyoon Peter Song, Peter Glaskowsky, Laurent Moll, Joe Rowlands, Donald Alpert
  • Patent number: 7797512
    Abstract: A virtual core management system including one or more physical cores and one or more virtual cores. Each virtual core respectively includes a collection of logical states associated with execution of a corresponding program. The virtual core management system further includes one or more interrupt controllers configured to send one or more interrupt signals to interrupt execution of a corresponding program associated with at least one of the one or more virtual cores, and a virtual core management component configured to map the at least one virtual core to one of the one or more physical cores and route the one or more interrupt signals to the corresponding physical core.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: September 14, 2010
    Assignee: Oracle America, Inc.
    Inventors: Yu Qing Cheng, John Gregory Favor, Peter N. Glaskowsky, Laurent R. Moll, Carlos Puchol, Seungyoon Peter Song
  • Patent number: 7793075
    Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.
    Type: Grant
    Filed: July 9, 2008
    Date of Patent: September 7, 2010
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7793074
    Abstract: An apparatus comprises a plurality of processor cores, and an interconnection network to route data among the processor cores based on destination information in the data. The processor cores are configured to forward the data to a final destination if the destination information indicates that a destination processor core has been reached, or to forward the data to other processor cores if the destination information indicates that a destination processor core has not been reached. The final destination is one of a plurality of destinations indicated by the destination information, the destinations including a plurality of portions of the destination processor core.
    Type: Grant
    Filed: April 14, 2006
    Date of Patent: September 7, 2010
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal
  • Patent number: 7769980
    Abstract: In arithmetic/logic units (ALU) provided corresponding to entries, an MIMD instruction decoder generating a group of control signals in accordance with a Multiple Instruction-Multiple Data (MIMD) instruction and an MIMD register storing data designating the MIMD instruction are provided, and an inter-ALU communication circuit is provided. The amount and direction of movement of the inter-ALU communication circuit are set by data bits stored in a movement data register. Data movement and arithmetic/logic operations can thus be executed with the amount of movement and the operation instruction set individually for each ALU unit. Therefore, in a Single Instruction-Multiple Data type processing device, Multiple Instruction-Multiple Data operation can be executed at high speed in a flexible manner.
    Type: Grant
    Filed: August 16, 2007
    Date of Patent: August 3, 2010
    Assignee: Renesas Technology Corp.
    Inventors: Toshinori Sueyoshi, Masahiro Iida, Mitsutaka Nakano, Fumiaki Senoue, Katsuya Mizumoto
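A minimal sketch of the per-entry control described in the abstract of patent 7769980 above (and in publication 20100325386 earlier in this list): each ALU entry holds its own movement amount/direction and operation, so one SIMD step moves data and operates differently per entry. The register layout and opcodes below are assumptions.
```c
#include <stdio.h>

#define ENTRIES 4

typedef enum { OP_ADD, OP_SUB } op_t;

/* Per-entry "MIMD register": movement (signed entry offset) and the
 * operation that entry should perform.                              */
typedef struct {
    int  move;      /* which neighbouring entry to fetch an operand from */
    op_t op;
} mimd_reg_t;

int main(void)
{
    int data[ENTRIES] = { 10, 20, 30, 40 };
    mimd_reg_t reg[ENTRIES] = {
        { +1, OP_ADD }, { -1, OP_SUB }, { +1, OP_ADD }, { -1, OP_SUB }
    };
    int result[ENTRIES];

    /* One "SIMD step": every entry moves data and operates, but the
     * amount/direction of movement and the opcode are set per entry. */
    for (int e = 0; e < ENTRIES; e++) {
        int src     = (e + reg[e].move + ENTRIES) % ENTRIES;  /* inter-ALU move */
        int operand = data[src];
        result[e]   = (reg[e].op == OP_ADD) ? data[e] + operand
                                            : data[e] - operand;
    }

    for (int e = 0; e < ENTRIES; e++)
        printf("entry %d -> %d\n", e, result[e]);
    return 0;
}
```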
  • Patent number: 7765338
    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
    Type: Grant
    Filed: July 9, 2007
    Date of Patent: July 27, 2010
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
  • Publication number: 20100174868
    Abstract: A coupling of a traditional processor, in particular a sequential processor, with a reconfigurable field of data processing units, in particular a runtime-reconfigurable field of data processing units, is described.
    Type: Application
    Filed: March 22, 2010
    Publication date: July 8, 2010
    Inventor: Martin Vorbach
  • Patent number: 7752361
    Abstract: A system including a storage processing device with an input/output module. The input/output module has port processors to receive and transmit network traffic. The input/output module also has a switch connecting the port processors. Each port processor categorizes the network traffic as fast path network traffic or control path network traffic. The switch routes fast path network traffic from an ingress port processor to a specified egress port processor. The storage processing device also includes a control module to process the control path network traffic received from the ingress port processor. The control module routes processed control path network traffic to the switch for routing to a defined egress port processor. The control module is connected to the input/output module. The input/output module and the control module are configured to interactively support data virtualization, data migration, data journaling, and snapshotting.
    Type: Grant
    Filed: October 28, 2003
    Date of Patent: July 6, 2010
    Assignee: Brocade Communications Systems, Inc.
    Inventors: Venkat Rangan, Edward D. McClanahan, Michael B. Schmitz
  • Patent number: 7752337
    Abstract: The present invention utilizes a “small-world” network architecture, in which a relatively small number of random cross-links between nodes or vertices in a network can result in small characteristic path lengths, for the transfer of messages between nodes or vertices in a telecommunications/computer network regardless of their location. The “small world” principle is usually considered to apply to many biological and social networks, as these systems generally exhibit properties that are not completely regular or completely random but somewhere in between. The present invention applies this small-world principle to telecommunications/computer networks.
    Type: Grant
    Filed: June 30, 2000
    Date of Patent: July 6, 2010
    Assignee: Jifmar Pty Ltd
    Inventors: Fergus O'Brien, Matthew Roughan
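The property relied on in the abstract of patent 7752337 above, that a few random cross-links sharply reduce the characteristic path length of an otherwise regular network, can be checked numerically. The sketch below builds a ring lattice, adds a handful of random shortcuts, and compares average shortest-path lengths via breadth-first search; the sizes and counts are arbitrary.
```c
#include <stdio.h>
#include <stdlib.h>

#define N 100            /* nodes in a ring lattice                   */
#define SHORTCUTS 10     /* random cross-links added to the ring      */

static int adj[N][N];    /* adjacency matrix                          */

/* Average shortest-path length over all node pairs (BFS from each node). */
static double avg_path_length(void)
{
    long total = 0, pairs = 0;
    for (int s = 0; s < N; s++) {
        int dist[N], queue[N], head = 0, tail = 0;
        for (int i = 0; i < N; i++) dist[i] = -1;
        dist[s] = 0;
        queue[tail++] = s;
        while (head < tail) {
            int u = queue[head++];
            for (int v = 0; v < N; v++)
                if (adj[u][v] && dist[v] < 0) {
                    dist[v] = dist[u] + 1;
                    queue[tail++] = v;
                }
        }
        for (int v = 0; v < N; v++)
            if (v != s) { total += dist[v]; pairs++; }
    }
    return (double)total / pairs;
}

int main(void)
{
    /* Regular ring: each node linked to its two nearest neighbours.  */
    for (int i = 0; i < N; i++)
        adj[i][(i + 1) % N] = adj[(i + 1) % N][i] = 1;
    printf("ring only:        average path length %.2f\n", avg_path_length());

    /* A relatively small number of random cross-links ("shortcuts"). */
    srand(1);
    for (int k = 0; k < SHORTCUTS; k++) {
        int a = rand() % N, b = rand() % N;
        if (a != b) adj[a][b] = adj[b][a] = 1;
    }
    printf("with %d shortcuts: average path length %.2f\n",
           SHORTCUTS, avg_path_length());
    return 0;
}
```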
  • Patent number: 7739479
    Abstract: A method of providing physics data within a game program or simulation using a hardware-based physics processing unit having unique architecture designed to efficiently calculate physics related data.
    Type: Grant
    Filed: November 19, 2003
    Date of Patent: June 15, 2010
    Assignee: NVIDIA Corporation
    Inventors: Jean Pierre Bordes, Curtis Davis, Monier Maher, Manju Hegde, Otto A. Schmid
  • Patent number: 7739647
    Abstract: The present invention provides a configurable domain specific abstract core (DSAC) for implementing applications within any domain. The DSAC comprises at least one function specific abstract module (FSAM) configurable at a plurality of stages for implementing a predetermined function belonging to one or more applications in the domain. The FSAM comprises a function specific abstract logic (FSAL) for implementing functional logic and a micro state engine (MSE) for generating and monitoring one or more control signals, at least one of the control signals being generated by execution of a dynamic script for controlling the FSAL.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: June 15, 2010
    Assignee: Infosys Technologies Ltd.
    Inventors: Guruprasad Ramananda Athani, Ranju Philip Abraham, Shashi Basavappa Chinnikatte
  • Patent number: 7734894
    Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor including a storage module, wherein the processor is configured to process multiple streams of instructions, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, and coupling circuitry configured to couple data resulting from processing an instruction from at least one of the streams of instructions to the storage module and to the switch.
    Type: Grant
    Filed: April 28, 2008
    Date of Patent: June 8, 2010
    Assignee: Tilera Corporation
    Inventors: David Wentzlaff, Anant Agarwal