Array Processor Patents (Class 712/10)

Array processor element interconnection (Class 712/11)

Array processor operation (Class 712/16)

Digital signal processor for wireless baseband processing

Patent number: 7007155

Abstract: A circuit employing an array of reconfigurable processing elements for wireless baseband processing. The circuit includes a first linear array of reconfigurable processing elements for processing signals from a first channel, and a second linear array of reconfigurable processing elements, coupled in parallel with the first linear array of reconfigurable processing elements, for processing signals from a second channel that is concurrent with the first channel. The circuit also includes a frame buffer array having a number of frame buffers that corresponds to a number of reconfigurable processing elements in the first and second linear arrays of processing elements. A point-to-point data bus is connected between each reconfigurable processor and an associated frame buffer. A shared data bus is connected between the first and second linear arrays of reconfigurable processing elements and the frame buffer array.

Type: Grant

Filed: September 17, 2002

Date of Patent: February 28, 2006

Assignee: Morpho Technologies

Inventors: Behzad Barjesteh Mohebbi, Fadi Joseph Kurdahi
Multiprocessor data processing system having a data routing mechanism regulated through control communication

Patent number: 7007128

Abstract: A data interconnect and routing mechanism reduces data communication latency, supports dynamic route determination based upon processor activity level/traffic, and implements an architecture that supports scalable improvements in communication frequencies. In one implementation, a data processing system includes at least first through third processing units, data storage coupled to the plurality of processing units, and an interconnect fabric. The interconnect fabric includes at least a first data bus coupling the first processing unit to the second processing unit and a second data bus coupling the third processing unit to the second processing unit so that the first and third processing units can transmit data traffic to the second processing unit. The data processing system further includes a control channel coupling the first and third processing units.

Type: Grant

Filed: January 7, 2004

Date of Patent: February 28, 2006

Assignee: International Business Machines Corporation

Inventors: Ravi Kumar Arimilli, Jerry Don Lewis, Vicente Enrique Chung, Jody Bern Joyner
Flow of streaming data through multiple processing modules

Patent number: 7000022

Abstract: Frame-based streaming data flows through a graph of multiple interconnected processing modules. The modules have a set of performance parameters whose values specify the sensitivity of each module to the selection of certain resources of a system. A user specifies overall goals for an actual graph for processing a given type of data for a particular purpose. A flow manager constructs the graph as a sequence of module interconnections required for processing the data, in response to the parameter values of the individual modules in the graph in view of the goals for the overall graph as a whole, and divides it into pipes each having one or more modules and each assigned to a memory manager for handling data frames in the pipe.

Type: Grant

Filed: June 7, 2004

Date of Patent: February 14, 2006

Assignee: Microsoft Corporation

Inventors: Rafael S. Lisitsa, George H. J. Shaw, Dale A. Sather, Bryan A. Woodruff
Buffered coscheduling for parallel programming and enhanced fault tolerance

Patent number: 6993764

Abstract: A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval.

Type: Grant

Filed: June 28, 2001

Date of Patent: January 31, 2006

Assignee: The Regents of the University of California

Inventors: Fabrizio Petrini, Wu-chun Feng
Methods and apparatus for providing bit-reversal and multicast functions utilizing DMA controller

Patent number: 6986020

Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
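
Illustrative sketch (not part of the patent record): the bit-reversed addressing mentioned in the abstract can be pictured in plain software as reordering a radix-2 FFT input by reversing the bits of each index. The function names below are hypothetical, and this Python version only shows the index mapping; it does not model the DMA unit or the multi-PE distribution the patent describes.

```python
def bit_reverse(index, bits):
    """Return `index` with its lowest `bits` bits in reversed order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # shift in the lowest bit
        index >>= 1
    return result


def bit_reversed_order(data):
    """Reorder a power-of-two-length sequence into bit-reversed index order."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    return [data[bit_reverse(i, bits)] for i in range(n)]


print(bit_reversed_order(list(range(8))))  # prints [0, 4, 2, 6, 1, 5, 3, 7]
```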

Type: Grant

Filed: September 21, 2004

Date of Patent: January 10, 2006

Assignee: PTS Corporation

Inventors: Edwin F. Barry, Nikos P. Pitsianis, Kevin Coopman
Highly versatile process control system controller

Patent number: 6973508

Abstract: A versatile controller that can be used as either a stand-alone controller in a relatively small process plant or as one of numerous controllers in a distributed process control system depending on the needs of the process plant includes a processor adapted to be programmed to execute one or more programming routines and a memory, such as a non-volatile memory, coupled to the processor and adapted to store the one or more programming routines to be executed on the processor. The versatile controller also includes a plurality of field device input/output ports communicatively connected to the processor, a configuration communication port connected to the processor and to the memory to enable the controller to be configured with the programming routines and a second communication port which enables a user interface to be intermittently connected to the controller to view information stored within the controller memory.

Type: Grant

Filed: February 12, 2002

Date of Patent: December 6, 2005

Assignee: Fisher-Rosemount Systems, Inc.

Inventors: Rusty Shepard, Ken Krivoshein, Dan Christensen, Gary Law, Kent Burr, Mark Nixon
Apparatus and method for accessing a mass storage device in a fault-tolerant server

Patent number: 6971043

Abstract: An apparatus and method for accessing a first local mass storage device or a second local mass storage device in a fault-tolerant server. In one embodiment, the fault-tolerant server establishes communication between a first computing element and a first local mass storage device. The fault-tolerant server also establishes communications between a second computing element and a second local mass storage device. In one embodiment, the first computing element and the second computing element issue substantially similar instruction streams to one of the local mass storage devices.

Type: Grant

Filed: April 11, 2001

Date of Patent: November 29, 2005

Assignee: Stratus Technologies Bermuda LTD

Inventors: Michael McLoughlin, Gerry Griffin
High-speed vision sensor with image processing function

Patent number: 6970196

Abstract: A high-speed vision sensor includes: an analog-to-digital converter array 13, in which one analog-to-digital converter 210 is provided in correspondence with all the photodetector elements 120 that are located on each row in a photodetector array 11; a parallel processing system 14 that includes processor elements 400 and shift registers 410, both of which form a one-to-one correspondence with the photodetector elements 120; and data buses 17, 18 and data buffers 19 and 20 for data transfer to processing elements 400. The processing elements 400 perform high-speed image processing between adjacent pixels by parallel processing. By using the data buses 17, 18, it is possible to perform, at high speed, calculation processing that requires data supplied from outside.

Type: Grant

Filed: March 10, 2000

Date of Patent: November 29, 2005

Assignee: Hamamatsu Photonics K.K.

Inventors: Masatoshi Ishikawa, Haruyoshi Toyoda
Processor cluster architecture and associated parallel processing methods

Patent number: 6959372

Abstract: A parallel processing architecture comprising a cluster of embedded processors that share a common code distribution bus. Pages or blocks of code are concurrently loaded into respective program memories of some or all of these processors (typically all processors assigned to a particular task) over the code distribution bus, and are executed in parallel by these processors. A task control processor determines when all of the processors assigned to a particular task have finished executing the current code page, and then loads a new code page (e.g., the next sequential code page within a task) into the program memories of these processors for execution. The processors within the cluster preferably share a common memory (1 per cluster) that is used to receive data inputs from, and to provide data outputs to, a higher level processor. Multiple interconnected clusters may be integrated within a common integrated circuit device.

Type: Grant

Filed: February 18, 2003

Date of Patent: October 25, 2005

Assignee: Cogent Chipware Inc.

Inventors: Richard F. Hobson, Bill Ressl, Allan R. Dyck
Methods and apparatus for providing data transfer control

Patent number: 6944683

Abstract: A variety of advantageous mechanisms for improved data transfer control within a data processing system are described. A DMA controller is described which is implemented as a multiprocessing transfer engine supporting multiple transfer controllers which may work independently or in cooperation to carry out data transfers, with each transfer controller acting as an autonomous processor, fetching and dispatching DMA instructions to multiple execution units. In particular, mechanisms for initiating and controlling the sequence of data transfers are provided, as are processes for autonomously fetching DMA instructions which are decoded sequentially but executed in parallel.

Type: Grant

Filed: February 19, 2004

Date of Patent: September 13, 2005

Assignee: PTS Corporation

Inventors: Edwin Frank Barry, Edward A. Wolff
Display module driving system and digital to analog converter for driving display

Patent number: 6940496

Abstract: A display module driving system wherein digital pixel data for an image to be displayed is provided to a plurality of column drivers on a row by row basis in serial format over a plurality of dedicated bus lines rather than a single parallel bus line. Digital pixel data for a complete image row is divided into segments, wherein the number of segments is equal to the number of column drivers. Each segment is then serialized and transmitted to a corresponding column driver such that the digital pixel data for an entire row is transferred to each of the plurality of column drivers at the same time. The column drivers receive the segments and rearrange the data into parallel. The pixels are then transferred to a digital to analog converter, preferably two pixels at a time, where each pixel is converted into analog red, green and blue signals.

Type: Grant

Filed: June 4, 1999

Date of Patent: September 6, 2005

Assignee: Silicon Image, Inc.

Inventor: Eun-Gu Kim
Display apparatus in which recovery time is short in fault occurrence

Patent number: 6933942

Abstract: In a display apparatus, a display instruction generating unit outputs a display instruction. A plurality of display processing units are arranged in parallel, and each of the plurality of display processing units generates display data in response to the display instruction from the display instruction generating unit. A display switching unit selects one of the plurality of display processing units and outputs the display data from the selected display processing unit to the display unit. Thus, a display unit displays the display data.

Type: Grant

Filed: July 16, 2002

Date of Patent: August 23, 2005

Assignee: NEC Corporation

Inventor: Junichi Tamai
Methods and apparatus for pipelined bus

Patent number: 6912608

Abstract: Techniques for a pipelined bus which provides a very high performance interface to computing elements, such as processing elements, host interfaces, memory controllers, and other application-specific coprocessors and external interface units. The pipelined bus is a robust interconnected bus employing a scalable, pipelined, multi-client topology, with a fully synchronous, packet-switched, split-transaction data transfer model. Multiple non-interfering transfers may occur concurrently since there is no single point of contention on the bus. An aggressive packet transfer model with local conflict resolution in each client and packet-level retries allows recovery from collisions and buffer backups. Clients are assigned unique IDs, based upon a mapping from the system address space allowing identification needed for quick routing of packets among clients.

Type: Grant

Filed: April 25, 2002

Date of Patent: June 28, 2005

Assignee: PTS Corporation

Inventors: Edward A. Wolff, David Baker, Bryan Garnett Cope, Edwin Franklin Barry
High speed software driven emulator comprised of a plurality of emulation processors with a method to allow high speed bulk read/write operation synchronous DRAM while refreshing the memory

Patent number: 6901359

Abstract: A system and method for bulk transfer to and from the SRAMs in which a starting memory address is latched and is then incremented every clock cycle to generate a new memory address. The addresses are decoded and memory requests are pipelined to the SRAM memory, one every clock cycle. When the memory controller detects transfer of the boundary of a predetermined number of clock cycles or words (e.g. 64 words or four clock cycles) the burst mode of data transfer is stopped and the memory controller waits for a “done” signal before resuming another cycle of the burst transfer mode. The memory controller on detecting a request on this address boundary first does a memory refresh followed by a requested operation; e.g. a continuation of the transfer operation.
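
Illustrative sketch (not part of the patent record): the address sequencing in the abstract, a latched start address incremented once per cycle with a refresh inserted when a request lands on a fixed boundary, can be approximated in Python as below. The function name, the 64-word default, and the tuple format are assumptions made for illustration; the "done" handshake and the real timing are not modeled.

```python
def burst_schedule(start, count, boundary=64):
    """List the operations for a bulk transfer of `count` words from `start`.

    Assumption: a refresh is issued before any access whose address falls
    on a multiple of `boundary`; all other accesses are issued back to back.
    """
    ops = []
    for addr in range(start, start + count):  # address increments each cycle
        if addr % boundary == 0:
            ops.append(("refresh", None))     # refresh first, then the access
        ops.append(("access", addr))
    return ops


for op in burst_schedule(62, 4):
    print(op)
```

With a start address of 62 and four words, the refresh falls between the accesses to addresses 63 and 64.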

Type: Grant

Filed: September 6, 2000

Date of Patent: May 31, 2005

Assignee: Quickturn Design Systems, Inc.

Inventors: William F. Beausoleil, R. Bryan Cook, Tak-kwong Ng, Helmut Roth, Peter Tannenbaum, Lawrence A. Thomas, Norton J. Tomassetti
Method and apparatus for integration of communication links with a remote direct memory access protocol

Patent number: 6901491

Abstract: In one embodiment, a server is provided. The server includes multiple application processor chips. Each of the multiple application processor chips includes multiple processing cores. Multiple memories corresponding to the multiple processor chips are included. The multiple memories are configured such that one processor chip is associated with one memory. A plurality of fabric chips enabling each of the multiple application processor chips to access any of the multiple memories are included. The data associated with one of the multiple application processor chips is stored across each of the multiple memories. In one embodiment, the application processor chips include a remote direct memory access (RDMA) and striping engine. The RDMA and striping engine is configured to store data in a striped manner across the multiple memories. A method for allowing multiple processors to exchange information through horizontal scaling is also provided.

Type: Grant

Filed: October 16, 2002

Date of Patent: May 31, 2005

Assignee: Sun Microsystems, Inc.

Inventors: Leslie D. Kohn, Michael K. Wong
Rearranging data between vector and matrix forms in a SIMD matrix processor

Patent number: 6898691

Abstract: This invention discloses a group of instructions, block4 and block4v, in a matrix processor 16 that rearranges data between vector and matrix forms of an A×B matrix of data 120 where the data matrix includes one or more 4×4 sub-matrices of data 160-166. The instructions of this invention simultaneously swap rows or columns between the first 140, second 142, third 144, and fourth 146 matrix registers according to the instructions that perform predefined matrix tensor operations on the data matrix that includes one of the following group of operations: swapping rows between the different individual matrix registers, or swapping columns between the different individual matrix registers. Additionally, successive iterations or combinations of the block4 and/or block4v instructions perform standard tensor matrix operations from the following group of matrix operations: transpose, shuffle, and deal.

Type: Grant

Filed: June 6, 2002

Date of Patent: May 24, 2005

Assignee: Intrinsity, Inc.

Inventors: James S. Blomgren, Timothy A. Olson, Christophe Harle
Tightly coupled and scalable memory and execution unit architecture

Patent number: 6895452

Abstract: An architecture is shown where an execution unit is tightly coupled to a shared, reconfigurable memory system. Sequence control signals drive a DMA controller and address generator to control the transfer of data from the shared memory to a bus interface unit (BIU). The sequence control signals also drive a data controller and address generator which controls transfer of data from the shared memory to an execution unit interface (EUI). The EUI is connected to the execution unit and operates under control of the data controller and address generator to transfer vector data to and from the shared memory. The shared memory is configured to swap memory space in between the BIU and the execution unit so as to support continuous execution and I/O. A local fast memory is coupled to the execution unit. A local address generator controls the transfer of scalar data between the local fast memory and the execution unit.

Type: Grant

Filed: October 16, 1998

Date of Patent: May 17, 2005

Assignee: Marger Johnson & McCollom, P.C.

Inventors: Ron Coleman, Brent LeBack, Stuart Hawkinson, Richard Rubinstein
Signal processing arrangement

Patent number: 6873287

Abstract: The present invention relates to a method and an arrangement suitable for embedded signal processing, comprising a number of computational units (100), each computational unit comprising a number of processing elements (20) capable of working independently and transmitting data simultaneously. Said computational units are arranged in clusters, work independently, and transmit data simultaneously, and said processing elements (20) are globally and regularly inter-connected optically in a hypercube topology and transformed into a planar waveguide.

Type: Grant

Filed: November 1, 2001

Date of Patent: March 29, 2005

Assignee: Telefonaktiebolaget LM Ericsson

Inventor: Håkan Forsberg
Methods and circuits for measuring clock skew on programmable logic devices

Patent number: 6862548

Abstract: Described are methods for accurately measuring the skew of clock distribution networks on programmable logic devices. Clock distribution networks are modeled using a sequence of oscillators formed on the device using configurable logic. Each oscillator includes a portion of the network, and consequently oscillates at a frequency that depends on the signal propagation delay associated with the included portion of the network. The various oscillator configurations are defined mathematically as the sum of a series of delays, with the period of each oscillator representing the sum. The respective equations of the oscillators are combined to solve for the delay contribution of the included portion of the clock network. The delay associated with the included portion of the clock network can be combined with similar measurements for other portions of the clock network to more completely describe the network.

Type: Grant

Filed: October 30, 2001

Date of Patent: March 1, 2005

Assignee: Xilinx, Inc.

Inventor: Siuki Chan
Data processing system

Patent number: 6859869

Abstract: A data processing system, wherein a data flow processor (DFP) integrated circuit chip is provided which comprises a plurality of orthogonally arranged homogeneously structured cells, each cell having a plurality of logically same and structurally identically arranged modules. The cells are combined and facultatively grouped using lines and columns and connected to the input/output ports of the DFP. A compiler programs and configures the cells, each by itself and facultatively-grouped, such that random logic functions and/or linkages among the cells can be realized. The manipulation of the DFP configuration is performed during DFP operation such that modification of function parts (MACROs) of the DFP can take place without requiring other function parts to be deactivated or being impaired.

Type: Grant

Filed: April 12, 1999

Date of Patent: February 22, 2005

Assignee: PACT XPP Technologies AG

Inventor: Martin Vorbach
Methods and apparatus for providing bit-reversal and multicast functions utilizing DMA controller

Patent number: 6834295

Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.

Type: Grant

Filed: February 23, 2001

Date of Patent: December 21, 2004

Assignee: PTS Corporation

Inventors: Edwin F. Barry, Nikos P. Pitsianis, Kevin Coopman
Methods and apparatus for extended packet communications between multiprocessor clusters

Publication number: 20040255002

Abstract: According to the present invention, methods and apparatus are provided for increasing the efficiency and effectiveness of communications between multiprocessor clusters. Mechanisms for improving the accuracy of information available to an interconnection controller are implemented in order to allow the interconnection controller to increase reliability and reduce latency in a multiple cluster system. Protocol extensions and link layer extensions are provided with packets to convey information between interconnection controllers of separate multiprocessor clusters.

Type: Application

Filed: June 12, 2003

Publication date: December 16, 2004

Applicant: Newisys, Inc., A Delaware corporation

Inventors: Rajesh Kota, Shashank Newawarker, Guru Prasadh, Carl Zeitler, David B. Glasco
Split embedded DRAM processor

Publication number: 20040250045

Abstract: A processing architecture includes a first CPU core portion coupled to a second embedded dynamic random access memory (DRAM) portion. These architectural components jointly implement a single processor and instruction set. Advantageously, the embedded logic on the DRAM chip implements the memory intensive processing tasks, thus reducing the amount of traffic that needs to be bussed back and forth between the CPU core and the embedded DRAM chips. The embedded DRAM logic monitors and manipulates the instruction stream into the CPU core. The architecture of the instruction set, data paths, addressing, control, caching, and interfaces are developed to allow the system to operate using a standard programming model. Specialized video and graphics processing systems are developed. Also, an extended very long instruction word (VLIW) architecture implemented as a primary VLIW processor coupled to an embedded DRAM VLIW extension processor efficiently deals with memory intensive tasks.

Type: Application

Filed: July 2, 2004

Publication date: December 9, 2004

Inventor: Eric M. Dowling
Apparatus and a method to provide higher bandwidth or processing power on a bus

Patent number: 6826645

Abstract: A method and apparatus in which an arbiter links to a processor having a flexible architecture, and the processor connects to a device through a point to point bus.

Type: Grant

Filed: December 13, 2000

Date of Patent: November 30, 2004

Assignee: Intel Corporation

Inventor: Chakravarthy Kosaraju
Handling interrupts in a system having multiple data processing units

Publication number: 20040236879

Abstract: An interrupt controller is provided for processing interrupt requests in a system having a plurality of data processing units operable to service those interrupt requests, each interrupt request having an associated priority level. The interrupt controller comprises request logic operable to receive an indication of unserviced interrupt requests, to apply predetermined criteria to determine which of said plurality of data processing units are candidate data processing units for servicing at least one of said unserviced interrupt requests, and to issue a request signal to each said candidate data processing unit. Priority encoding logic is operable to determine a highest priority unserviced interrupt request based on the associated priority levels of the unserviced interrupt requests.
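
Illustrative sketch (not part of the patent record): the priority-encoding step in the abstract, picking the highest-priority unserviced request, might look like the Python below. The function name, the dictionary representation, and the convention that a lower number means a higher priority are all assumptions; the abstract does not state the encoding or the candidate-selection criteria, so those are left out.

```python
def highest_priority_request(pending):
    """Return the ID of the highest-priority unserviced interrupt, or None.

    `pending` maps interrupt ID -> priority level.
    Assumption: a lower number means a higher priority, and ties are
    broken in favor of the lower interrupt ID.
    """
    if not pending:
        return None
    return min(pending, key=lambda irq: (pending[irq], irq))


print(highest_priority_request({5: 2, 3: 1, 9: 1}))  # prints 3
```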

Type: Application

Filed: May 23, 2003

Publication date: November 25, 2004

Inventors: Daren Croxford, Man Cheung Joseph Yiu
Method and system for processing program for parallel processing purposes, storage medium having stored thereon program getting program processing executed for parallel processing purposes, and storage medium having stored thereon instruction set to be executed in parallel

Publication number: 20040230770

Abstract: In a program processing procedure specially designed to perform compilation for parallel processing purposes, a method and system for increasing the program execution rate of a target machine is provided. A compiler front end translates source code into intermediate code that has been divided into basic blocks. A parallelizer converts the intermediate code, which has been generated by the compiler front end, into a parallelly executable form. An execution order determiner determines the order of the basic blocks to be executed. An expanded basic block parallelizer subdivides the intermediate code, which has already been divided into the basic blocks, into execution units, each of which is made up of parallelly executable instructions, following the order determined and on the basic block basis.

Type: Application

Filed: June 23, 2004

Publication date: November 18, 2004

Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.

Inventors: Kensuke Odani, Taketo Heishi
Method for forming a single instruction multiple data massively parallel processor system on a chip

Publication number: 20040221135

Abstract: A single chip active memory includes a plurality of memory stripes, each coupled to a full word interface and one of a plurality of processing element (PE) sub-arrays. The large number of couplings between a PE sub-array and its associated memory stripe are managed by placing the PE sub-arrays so that their data paths run at right angles to the data paths of the plurality of memory stripes. The data lines exiting the memory stripes are run across the PE sub-arrays on one metal layer. At the appropriate locations, the data lines are coupled to another orthogonally oriented metal layer to complete the coupling between the memory stripe and its associated PE sub-array. The plurality of PE sub-arrays are mapped to form a large logical array, in which each PE is coupled to four other PEs. Physically distant PEs are coupled using current mode differential logical couplings and drivers to ensure good signal integrity at high operational speeds. Each PE contains a small DRAM register array.

Type: Application

Filed: June 4, 2004

Publication date: November 4, 2004

Inventor: Graham Kirsch
Method for load balancing an n-dimensional array of parallel processing elements

Publication number: 20040216119

Abstract: One aspect of the present invention relates to a method for balancing the load of an n-dimensional array of processing elements (PEs), wherein each dimension of the array includes the processing elements arranged in a plurality of lines and wherein each of the PEs has a local number of tasks associated therewith. The method comprises balancing at least one line of PEs in a first dimension, balancing at least one line of PEs in a next dimension, and repeating the balancing at least one line of PEs in a next dimension for each dimension of the n-dimensional array. The method may further comprise selecting one or more lines within said first dimension and shifting the number of tasks assigned to PEs in said selected one or more lines.
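
Illustrative sketch (not part of the patent record): the dimension-by-dimension balancing in the abstract can be pictured for a two-dimensional array as balancing every row, then every column. The Python below is a minimal sketch under stated assumptions: the function names are hypothetical, the array is limited to 2-D, and the per-line rule (spread the line's total as evenly as integer counts allow) is a stand-in, since the abstract does not spell out how a single line is balanced.

```python
def balance_line(line):
    """Spread a line's total task count as evenly as integers allow.

    Assumption: PE r in the line receives floor((total + r) / n) tasks,
    which keeps the total unchanged and the counts within 1 of each other.
    """
    n = len(line)
    total = sum(line)
    return [(total + r) // n for r in range(n)]


def balance_grid(grid):
    """Balance a 2-D grid of task counts: first along rows, then columns."""
    rows = [balance_line(row) for row in grid]              # first dimension
    cols = [balance_line(list(col)) for col in zip(*rows)]  # next dimension
    return [list(row) for row in zip(*cols)]                # back to row-major


grid = [[8, 0, 0],
        [0, 0, 0],
        [0, 0, 1]]
for row in balance_grid(grid):
    print(row)
```

The total number of tasks is the same before and after; only their placement changes.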

Type: Application

Filed: October 20, 2003

Publication date: October 28, 2004

Inventor: Mark Beaumont
Method for rounding values for a plurality of parallel processing elements

Publication number: 20040215925

Abstract: A method for calculating a local mean number of tasks for each processing element (PEr) in a parallel processing system, wherein each processing element (PEr) has a local number of tasks associated therewith and wherein r represents the number for a selected processing element, the method comprising assigning a value (Er) to the each processing element (PEr), summing a total number of tasks present on the parallel processing system and the value (Er) for the each processing element (PEr), dividing the sum of the total number of tasks present on the parallel processing system and the value (Er) for the each processing element (PEr) by a total number of processing elements in the parallel processing system and truncating a fractional portion of the divided sum for the each processing element.
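
Illustrative sketch (not part of the patent record): the claimed computation, add a per-element value Er to the total task count, divide by the number of processing elements, and truncate, can be written out in Python as below. The function name is hypothetical, and the choice Er = r for r = 0 .. N-1 is an assumption made here; with that choice the truncated local means always sum back to the original total (Hermite's identity), so no tasks are lost or invented by rounding.

```python
def local_means(task_counts):
    """Return a truncated local mean for each processing element.

    Assumption: E_r = r (the element's index, 0 .. N-1). Each PE computes
    floor((V + E_r) / N), where V is the total number of tasks and N is
    the number of PEs.
    """
    n = len(task_counts)
    total = sum(task_counts)
    return [(total + r) // n for r in range(n)]


means = local_means([3, 0, 7, 1])
print(means)       # prints [2, 3, 3, 3]
print(sum(means))  # prints 11
```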

Type: Application

Filed: October 20, 2003

Publication date: October 28, 2004

Inventor: Mark Beaumont
Graphics system configured to parallel-process graphics data using multiple pipelines

Patent number: 6801202

Abstract: A method and computer graphics system capable of implementing multiple pipelines for the parallel processing of graphics data. For certain data, a requirement may exist that the data be processed in order. The graphics system may use a set of tokens to reliably switch between ordered and unordered data modes. Furthermore, the graphics system may be capable of super-sampling and performing real-time convolution. In one embodiment, the computer graphics system may comprise a graphics processor, a sample buffer, and a sample-to-pixel calculation unit. The graphics processor may be configured to receive graphics data and to generate a plurality of samples for each of a plurality of frames. The sample buffer, which is coupled to the graphics processor, may be configured to store the samples. The sample-to-pixel calculation unit is programmable to generate a plurality of output pixels by filtering the rendered samples using a filter.

Type: Grant

Filed: June 28, 2001

Date of Patent: October 5, 2004

Assignee: Sun Microsystems, Inc.

Inventors: Scott R. Nelson, Lisa Grenier, Michael F. Deering
Active memory command engine and method

Publication number: 20040193840

Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.

Type: Application

Filed: July 28, 2003

Publication date: September 30, 2004

Inventor: Graham Kirsch
Matrix processing device in SMP node distributed memory type parallel computer

Publication number: 20040193841

Abstract: In the LU decomposition of a matrix composed of blocks, the blocks to be updated of the matrix are vertically divided in each SMP node connected through a network and each of the divided blocks is allocated to each node. This process is also repeatedly applied to new blocks to be updated later, and the newly divided blocks are also cyclically allocated to each node. Each node updates allocated divided blocks in the original order of blocks. Since by sequentially updating blocks, the amount of processed blocks of each node equally increases, load can be equally distributed.

Type: Application

Filed: March 12, 2004

Publication date: September 30, 2004

Applicant: FUJITSU LIMITED

Inventor: Makoto Nakanishi
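
The cyclic allocation described in the abstract above can be illustrated with a short sketch. It is only an assumption-laden model: the function name allocate_steps is hypothetical, each "divided block" is treated as one unit of work, and the cycle is assumed to continue from where the previous update step stopped, which the abstract does not state explicitly.

```python
def allocate_steps(strips_per_step: list[int], num_nodes: int):
    """Assign divided blocks to nodes round-robin across successive update steps.

    strips_per_step[i] is the number of divided blocks produced at step i.
    Returns (plan, loads): plan[i] lists the owning node of each block at
    step i, and loads[n] is the total number of blocks given to node n.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    loads = [0] * num_nodes
    plan = []
    next_node = 0  # assumption: the cycle carries over between steps
    for count in strips_per_step:
        owners = []
        for _ in range(count):
            owners.append(next_node)
            loads[next_node] += 1
            next_node = (next_node + 1) % num_nodes
        plan.append(owners)
    return plan, loads


if __name__ == "__main__":
    plan, loads = allocate_steps([6, 5, 4, 3], num_nodes=3)
    print(plan)
    print(loads)
```

Because the cycle carries over between steps, the per-node totals never differ by more than one block, which is the load-balancing property the abstract points to.
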
System and method for encoding processing element commands in an active memory device

Publication number: 20040193784

Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DRAM control unit (“DCU”) commands to a DRAM control unit or array control unit (“ACU”) commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in the ACU where processing array instructions are stored. The processing array instructions are used to address a decode SRAM containing microinstructions that are used to control the operation of an array of processing elements. The number of bits in each of the microinstructions is substantially greater than the number of bits in the corresponding processing array instruction. The decode SRAM is preferably loaded prior to operation of the active memory based on the operations to be performed by the processing elements.

Type: Application

Filed: July 28, 2003

Publication date: September 30, 2004

Inventor: Graham Kirsch
Processing apparatus for performing preconditioning process through multilevel block incomplete factorization

Patent number: 6799194

Abstract: In a preconditioning process for an iteration method to solve simultaneous linear equations through multilevel block incomplete factorization of a coefficient matrix, a set of variable numbers of variables to be removed is determined at each level of the factorization such that a block matrix comprising coefficients of the variables can be diagonal dominant. The approximate inverse matrix of the block matrix is obtained in iterative computation, and non-zero elements of a coefficient matrix at a coarse level are reduced.

Type: Grant

Filed: June 26, 2001

Date of Patent: September 28, 2004

Assignees: Fujitsu Limited, Australian National University

Inventors: Lutz Grosz, Makoto Nakanishi
Synchronization of vertical retrace for multiple participating graphics computers

Patent number: 6791551

Abstract: A system and method for synchronizing image display and buffer swapping in a multiple processor-multiple display environment. In a master-slave dichotomy, one processor or system is deemed the master and the others act as slaves. The master generates signals used to control vertical retrace and buffer swapping for itself and the slaves. In addition, a synchronization signal generator is provided to synchronize a timing signal between the master and slave systems.

Type: Grant

Filed: November 27, 2001

Date of Patent: September 14, 2004

Assignee: Silicon Graphics, Inc.

Inventors: Shrijeet Mukherjee, Kanoj Sarcar, James Tornes
Affine transformation analysis system and method for image matching

Publication number: 20040175057

Abstract: An affine transformation analysis system and method is provided for matching two images. The novel systolic array image affine transformation analysis system comprising a linear rf-processing means, an affine parameter incremental updating means, and a least square error fitting means is based on a Lie transformation group model of cortical visual motion and stereo processing. Image data is provided to a plurality of component linear rf-processing means each comprising a Gabor receptive field, a dynamical Gabor receptive field, and six Lie germs. The Gabor coefficients of images and affine Lie derivatives are extracted from responses of linear receptive fields, respectively. The differences and affine Lie-derivatives of these Gabor coefficients obtained from each parallel pipelined linear rf-processing components are then input to a least square error fitting means, a systolic array comprising a QR decomposition means and a backward substitution means.

Type: Application

Filed: March 4, 2003

Publication date: September 9, 2004

Inventors: Thomas Tsao, Stanley Yuen
Shared memory array

Patent number: 6782463

Abstract: Disclosed is a device comprising a core processing circuit coupled to a single memory array which is partitioned into at least a first portion as a cache memory of the core processing circuit, and a second portion as a memory accessible by the one or more data transmission devices through a data bus independently of the core processing circuit.

Type: Grant

Filed: September 14, 2001

Date of Patent: August 24, 2004

Assignee: Intel Corporation

Inventors: Mark A. Schmisseur, Jeff McCoskey, Timothy J. Jehl
Method and apparatus for image processing

Publication number: 20040156547

Abstract: An image processing system includes, in part, an image processing engine adapted to perform object-independent processing corresponding to a first processing layer of the image processing system, a post processing engine adapted to perform object-dependent processing corresponding to a second processing layer of the image processing system, and a processing engine adapted to perform object composition, recognition and association corresponding to a third processing layer of the image processing system. The image processing engine includes a multitude of processors each associated with a different one of the pixels of the image. The post processing engine includes an N-way symmetric multi-processing system (SMP) having disposed therein N DFT engines and N matrix multiplication engines, where N is an integer greater than 1. The multitude of the processors of the image processing engine are formed on a semiconductor substrate different from the semiconductor substrate on which images are captured.

Type: Application

Filed: January 15, 2004

Publication date: August 12, 2004

Applicant: Parimics, Inc.

Inventor: Axel K. Kloth
Method and apparatus for image processing

Publication number: 20040156546

Abstract: An image processing system processes images via a first processing layer adapted to perform object-independent processing, a second processing layer adapted to perform object-dependent processing, and a third processing layer adapted to perform object composition, recognition and association. The image processing system performs object-independent processing using a plurality of processors each of which is associated with a different one of the pixels of the image. The image processing system performs object-independent processing using a symmetric multi-processor. The plurality of processors may form a massively parallel processor of a systolic array type and configured as a single-instruction multiple-data system. Each of the plurality of the processors is further configured to perform object-independent processing using a unified and symmetric processing of N dimensions in space and one dimension in time.

Type: Application

Filed: January 15, 2004

Publication date: August 12, 2004

Applicant: Parimics, Inc.

Inventor: Axel K. Kloth
Methods and apparatus for manifold array processing

Patent number: 6769056

Abstract: A manifold array topology includes processing elements, nodes, memories or the like arranged in clusters. Clusters are connected by cluster switch arrangements which advantageously allow changes of organization without physical rearrangement of processing elements. A significant reduction in the typical number of interconnections for preexisting arrays is also achieved. Fast, efficient and cost effective processing and communication result with the added benefit of ready scalability.

Type: Grant

Filed: September 24, 2002

Date of Patent: July 27, 2004

Assignee: PTS Corporation

Inventors: Edwin F. Barry, Thomas L. Drabenstott, Gerald G. Pechanek, Nikos P. Pitsianis
Method and device

Publication number: 20040128474

Abstract: The invention concerns a cell array having an intercell structure and specifies how a favorable segmenting of the intercell structure may be executed in order to improve the interaction of the cells.

Type: Application

Filed: January 20, 2004

Publication date: July 1, 2004

Inventor: Martin Vorbach
Broadcast invalidate scheme

Patent number: 6751721

Abstract: A directory-based multiprocessor cache control scheme for distributing invalidate messages to change the state of shared data in a computer system. The plurality of processors are grouped into a plurality of clusters. A directory controller tracks copies of shared data sent to processors in the clusters. Upon receiving an exclusive request from a processor requesting permission to modify a shared copy of the data, the directory controller generates invalidate messages requesting that other processors sharing the same data invalidate that data. These invalidate messages are sent via a point-to-point transmission only to master processors in clusters actually containing a shared copy of the data. Upon receiving the invalidate message, the master processors broadcast the invalidate message in an ordered fan-in/fan-out process to each processor in the cluster.

Type: Grant

Filed: August 31, 2000

Date of Patent: June 15, 2004

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: David A. J. Webb, Jr., Richard E. Kessler, Steve Lang, Aaron T. Spink
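
The two-stage invalidate delivery in the abstract above (point-to-point to cluster masters, then a broadcast inside each cluster) can be modeled roughly as below. This is a simplified sketch, not the patented design: the class and method names are hypothetical, clusters are assumed to be equal-sized ranges of processor ids, the master is assumed to be the lowest id in a cluster, and the ordered fan-in/fan-out is reduced to a plain list of recipients.

```python
from collections import defaultdict


class Directory:
    """Toy directory controller tracking which processors share each address."""

    def __init__(self, cluster_size: int):
        self.cluster_size = cluster_size
        self.sharers = defaultdict(set)  # address -> ids holding a shared copy

    def cluster_of(self, proc: int) -> int:
        return proc // self.cluster_size

    def master_of(self, cluster: int) -> int:
        # Assumption: the lowest-numbered processor acts as cluster master.
        return cluster * self.cluster_size

    def record_share(self, addr: int, proc: int) -> None:
        self.sharers[addr].add(proc)

    def exclusive_request(self, addr: int, requester: int):
        """Return [(master, recipients)] for clusters that hold a shared copy."""
        clusters = sorted(
            {self.cluster_of(p) for p in self.sharers[addr] if p != requester}
        )
        deliveries = []
        for cluster in clusters:
            first = cluster * self.cluster_size
            # The master rebroadcasts to every processor in its cluster
            # (the requester is skipped here; that is an assumption).
            recipients = [
                p for p in range(first, first + self.cluster_size) if p != requester
            ]
            deliveries.append((self.master_of(cluster), recipients))
        self.sharers[addr] = {requester}  # requester now owns the line
        return deliveries


if __name__ == "__main__":
    d = Directory(cluster_size=4)
    for proc in (1, 6, 7):
        d.record_share(0x40, proc)
    print(d.exclusive_request(0x40, requester=1))
```

Only clusters that actually contain a sharer receive a point-to-point message, so the directory sends one message per affected cluster rather than one per sharer.
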
Field programmable gate array and microcontroller system-on-a-chip

Patent number: 6751723

Abstract: A system-on-a-chip integrated circuit has a field programmable gate array core having logic clusters, static random access memory modules, and routing resources, a field programmable gate array virtual component interface translator having inputs and outputs, wherein the inputs are connected to the field programmable gate array core, a microcontroller, a microcontroller virtual component interface translator having inputs and outputs, wherein the inputs are connected to the microcontroller, a system bus connected to the outputs of the field programmable gate array virtual component interface translator and also to the outputs of said microcontroller virtual component interface translator, and direct connections between the microcontroller and the routing resources of the field programmable gate array core.

Type: Grant

Filed: September 2, 2000

Date of Patent: June 15, 2004

Assignee: Actel Corporation

Inventors: Arunangshu Kundu, Arnold Goldfein, William C. Plants, David Hightower
Causality-based memory ordering in a multiprocessing environment

Publication number: 20040111586

Abstract: Causality-based memory ordering in a multiprocessing environment. A disclosed embodiment includes a plurality of processors and arbitration logic coupled to the plurality of processors. The processors and arbitration logic maintain processor consistency yet allow stores generated in a first order by any two or more of the processors to be observed consistent with a different order of stores by at least one of the other processors. Causality monitoring logic coupled to the arbitration logic monitors any causal relationships with respect to observed stores.

Type: Application

Filed: December 2, 2003

Publication date: June 10, 2004

Inventor: Deborah T. Marr
Array type processor with state transition controller identifying switch configuration and processing element instruction address

Patent number: 6738891

Abstract: To execute all processing in an array section of an array-type processor, each processor must execute processing of different types, i.e., processing of an operating unit and processing of a random logic circuit, which limits its size and processing performance. A data path section, in which processors arranged in an array are connected via programmable switches to primarily execute processing of operation, and a state transition controller, configured to easily implement a state transition function to control state transitions, are independently disposed. These sections are configured in customized structure for respective processing purposes to efficiently implement and achieve the processing of operation and the control operation.

Type: Grant

Filed: February 23, 2001

Date of Patent: May 18, 2004

Assignee: NEC Corporation

Inventors: Taro Fujii, Masato Motomura, Koichiro Furuta
Arrangement with a plurality of processors having an interface for a collective memory

Patent number: 6738840

Abstract: A data processing arrangement comprises a plurality of processors and a memory interface via which the processors can access a collective memory. The memory interface comprises an interface memory (SRAM) for temporarily storing data belonging to different processors. The memory interface also comprises a control circuit for controlling the interface memory in such a manner that it forms a FIFO memory for each of the different processors. This makes it possible to realize implementations at a comparatively low cost in comparison with a memory interface comprising a separate FIFO memory for each processor.

Type: Grant

Filed: August 17, 2000

Date of Patent: May 18, 2004

Assignee: Koninklijke Philips Electronics N.V.

Inventors: Thierry Nouvet, Hugues De Perthuis, Stéphane Mutz
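
The idea in the abstract above, one interface memory controlled so that it behaves as a FIFO per processor, can be sketched as a single array with per-processor read and fill counters. This is an illustrative model only: the class name SharedFifoMemory, the fixed equal-sized regions, and the error behavior are assumptions, not details taken from the patent.

```python
class SharedFifoMemory:
    """One backing array partitioned into a ring buffer per processor."""

    def __init__(self, num_processors: int, depth: int):
        if num_processors <= 0 or depth <= 0:
            raise ValueError("num_processors and depth must be positive")
        self.depth = depth
        # A single physical memory shared by all FIFOs.
        self.sram = [None] * (num_processors * depth)
        # Control state kept per processor: read position and fill level.
        self.head = [0] * num_processors
        self.count = [0] * num_processors

    def push(self, proc: int, value) -> None:
        if self.count[proc] == self.depth:
            raise OverflowError(f"FIFO for processor {proc} is full")
        slot = (self.head[proc] + self.count[proc]) % self.depth
        self.sram[proc * self.depth + slot] = value
        self.count[proc] += 1

    def pop(self, proc: int):
        if self.count[proc] == 0:
            raise IndexError(f"FIFO for processor {proc} is empty")
        value = self.sram[proc * self.depth + self.head[proc]]
        self.head[proc] = (self.head[proc] + 1) % self.depth
        self.count[proc] -= 1
        return value


if __name__ == "__main__":
    mem = SharedFifoMemory(num_processors=2, depth=4)
    mem.push(0, "a")
    mem.push(1, "x")
    mem.push(0, "b")
    print(mem.pop(0), mem.pop(1), mem.pop(0))
```

Only the small head and count tables are duplicated per processor; the storage itself is one memory, which is the cost argument the abstract makes.
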
Parallel-processing apparatus and method

Patent number: 6735684

Abstract: A parallel-processing apparatus includes a plurality of cells, variable-delay circuits, a signal output unit, a delay counter, and an accumulation unit. Each cell has a processing circuit for performing arbitrary processing. The variable-delay circuits change the signal propagation delay in accordance with the processing results of the processing circuits. The signal output unit outputs a measurement input signal to the first variable-delay circuit of a variable-delay circuit array. The delay counter receives the measurement input signal output from the signal output unit and a measurement output signal output from the variable-delay circuit array, and obtains the signal propagation delay time of the variable-delay circuit array upon the basis of the measurement input and output signals. The accumulation unit accumulates the processing results of the processing circuits. A parallel processing method is also disclosed.

Type: Grant

Filed: September 13, 2000

Date of Patent: May 11, 2004

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Satoshi Shigematsu, Hiroki Morimura, Katsuyuki Machida
Parity checking device and method in data communication system

Patent number: 6718514

Abstract: There is provided a parity checking device and method in a data communication system. In the parity checking device, a controller determines loop occurring times according to the length of the data and the number of bits to be shifted according to the data or XOR operation results and determines whether the data has an error based on a final XOR operation result. A first register and a second register store the data or the XOR operation results under the control of the controller. A shifter receives the output of the first register and shifts the received bits by the shift bit number received from the controller. An operation unit receives the outputs of the shifter and the second register, performs an XOR operation between the received data, and outputs an XOR operation result under the control of the controller.

Type: Grant

Filed: December 29, 2000

Date of Patent: April 6, 2004

Assignee: Samsung Electronics Co., Ltd.

Inventor: Myung-Goo Kang
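
The shift-and-XOR loop described in the abstract above resembles the well-known parity-folding technique, sketched below. This is a generic illustration rather than the patented circuit: the function name parity_by_folding is hypothetical, and the shift schedule (halving from the smallest power of two that covers the data width) is an assumption about how the loop count and shift amounts might follow from the data length.

```python
def parity_by_folding(data: int, width: int) -> int:
    """Return 1 if `data` (a `width`-bit word) has an odd number of 1 bits, else 0."""
    if width <= 0:
        raise ValueError("width must be positive")
    if data < 0 or data >> width:
        raise ValueError("data does not fit in width bits")
    # Smallest power of two that covers the word; it fixes the loop count.
    shift = 1
    while shift < width:
        shift <<= 1
    acc = data
    while shift > 1:
        shift >>= 1
        # XOR the word with a shifted copy of itself: after this step the
        # low `shift` bits carry the parity of the bits folded onto them.
        acc ^= acc >> shift
    return acc & 1


if __name__ == "__main__":
    word = 0b1011_0010
    print(parity_by_folding(word, 8))
```

An 8-bit word needs three shift/XOR iterations (shifts of 4, 2 and 1), so the loop count grows with the logarithm of the data length instead of with the length itself.
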
Semiconductor integrated circuit device having pipeline stage and designing method therefor

Patent number: 6711724

Abstract: Logic circuits are arranged to constitute a pipeline with a clock signal cycle period set longer than a target cycle period by a gain obtained when replacing a flip-flop circuit by latch circuits. Then, the clock signal cycle period is changed to the target cycle period, to detect a critical path, on which a setup condition error occurs in the pipeline. After replacing the flip-flop circuit related to this error path by complementarily operating latch circuits, related logic circuits are rearranged according to the replacing latch circuits, to meet various operating parameters. In this way, it becomes possible to readily design a pipeline that accurately operates synchronously with a high-speed clock signal.

Type: Grant

Filed: December 16, 2002

Date of Patent: March 23, 2004

Assignee: Mitsubishi Denki Kabushiki Kaisha

Inventor: Atsushi Yoshikawa
System and method using differential branch latency processing elements

Publication number: 20040030872

Abstract: The invention is a system and method for executing a program that comprises a plurality of basic blocks on a computer system that comprises a plurality of processing elements. The invention generates a branch instruction by one processing element of the plurality of processing elements and sends the branch instruction to the plurality of processing elements. The invention then independently branches to a target of the branch instruction by each of the processing elements of the plurality of processing elements when each processing element receives the sent branch instruction. At least one processing element of the plurality of processing elements receives the branch instruction at a time later than another processing element of the plurality of processing elements.

Type: Application

Filed: August 8, 2002

Publication date: February 12, 2004

Inventor: Michael S. Schlansker
