Array Processor Operation Patents (Class 712/16)

Application specific (Class 712/17)

Data flow array processor (Class 712/18)

Systolic array processor (Class 712/19)

Multimode (e.g., MIMD to SIMD, etc.) (Class 712/20)

Multiple instruction, multiple data (MIMD) (Class 712/21)

Single instruction, multiple data (SIMD) (Class 712/22)

Parallel data processing apparatus

Patent number: 7526630

Abstract: A controller operable to control an array of processing elements comprises a retrieval unit operable to retrieve instruction items for each of a plurality of instruction streams, each instruction stream having a plurality of instruction items, a combining unit operable to combine the plurality of instruction streams into a serial instruction stream, and a distribution unit operable to distribute the serial instruction stream to an array of processing elements.

Type: Grant

Filed: January 4, 2007

Date of Patent: April 28, 2009

Assignee: Clearspeed Technology, PLC

Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russel David, Ray McConnell, Tim Day, Trey Greer
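The retrieve, combine, and distribute stages named in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the streams are combined, so the round-robin interleaving below is an assumption, and the function names `combine_streams` and `distribute` are hypothetical.

```python
from itertools import zip_longest


def combine_streams(streams):
    """Merge several instruction streams into one serial stream.

    Assumption: items are interleaved round-robin; the abstract does not
    specify the combining policy.
    """
    skip = object()  # sentinel marking streams that have run out of items
    serial = []
    for group in zip_longest(*streams, fillvalue=skip):
        serial.extend(item for item in group if item is not skip)
    return serial


def distribute(serial, num_elements):
    """Give every processing element its own copy of the serial stream."""
    return [list(serial) for _ in range(num_elements)]


streams = [["a0", "a1"], ["b0"], ["c0", "c1", "c2"]]
serial = combine_streams(streams)
per_element = distribute(serial, num_elements=4)
```

Each of the four modelled processing elements receives the same serial stream, which is the broadcast behaviour the abstract describes.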
Distributed grid computing method utilizing processing cycles of mobile phones

Patent number: 7515899

Abstract: Additional computing power is captured using the idle processing power of mobile phones incorporated into a grid computing system, wherein the system is capable of pushing projects out to available mobile phones for processing during idle operation times. To further efficiently utilize the unused processing cycles of mobile phones, a unique protocol is utilized to coordinate processing tasks which makes use of existing short message techniques to communicate projects. The unique protocol is a combination of bootstrapping using standard compression techniques along with an adaptive compression scheme.

Type: Grant

Filed: April 23, 2008

Date of Patent: April 7, 2009

Assignee: International Business Machines Corporation

Inventors: Hollie Carr, Peter Mattison, Christopher E. Sharp
Informational-signal-processing apparatus, functional block, and method of controlling the functional block

Patent number: 7509442

Abstract: An informational-signal-processing apparatus has a plurality of functional blocks and a control block that controls operations of the functional blocks. Each of the functional blocks performs a series of items of processing. The control block or a predetermined block among the control block and the functional blocks distributes a global command. Each of the functional blocks receives the global command and operates adaptively based on the received global command. The functional blocks output a block-to-block synchronizing signal at an output timing of a processed informational signal that has been performed on the basis of the global command.

Type: Grant

Filed: November 30, 2006

Date of Patent: March 24, 2009

Assignee: Sony Corporation

Inventors: Seiji Wada, Tetsujiro Kondo, Yoshihiro Wakita, Takuya Oshima
Hardware resource based mapping of cooperative thread arrays (CTA) to result matrix tiles for efficient matrix multiplication in computing system comprising plurality of multiprocessors

Patent number: 7506134

Abstract: The present invention enables efficient matrix multiplication operations on parallel processing devices. One embodiment is a method for mapping CTAs to result matrix tiles for matrix multiplication operations. Another embodiment is a second method for mapping CTAs to result tiles. Yet other embodiments are methods for mapping the individual threads of a CTA to the elements of a tile for result tile computations, source tile copy operations, and source tile copy and transpose operations. The present invention advantageously enables result matrix elements to be computed on a tile-by-tile basis using multiple CTAs executing concurrently on different streaming multiprocessors, enables source tiles to be copied to local memory to reduce the number of accesses from the global memory when computing a result tile, and enables coalesced read operations from the global memory as well as write operations to the local memory without bank conflicts.

Type: Grant

Filed: June 16, 2006

Date of Patent: March 17, 2009

Assignee: NVIDIA Corporation

Inventors: Norbert Juffa, Radoslav Danilak
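The tile-by-tile computation described in the abstract above can be sketched sequentially. This is not the patented mapping: each `(tile_row, tile_col)` pair merely stands in for one CTA owning one result tile, and the copied sub-lists stand in for source tiles held in local memory. The function name `tiled_matmul` and the `tile` parameter are hypothetical.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (n x k) by b (k x m), computing one result tile at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    # Each (tile_row, tile_col) pair models one CTA that owns one result tile.
    for tile_row in range(0, n, tile):
        for tile_col in range(0, m, tile):
            for tile_k in range(0, k, tile):
                # Copy the source tiles once and reuse them for the whole
                # result tile, modelling the copy into local memory.
                a_tile = [row[tile_k:tile_k + tile]
                          for row in a[tile_row:tile_row + tile]]
                b_tile = [row[tile_col:tile_col + tile]
                          for row in b[tile_k:tile_k + tile]]
                for i, a_row in enumerate(a_tile):
                    for j in range(len(b_tile[0])):
                        c[tile_row + i][tile_col + j] += sum(
                            a_row[p] * b_tile[p][j] for p in range(len(b_tile))
                        )
    return c


product = tiled_matmul([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]])
```

On a GPU the two outer loops would run concurrently across CTAs; the sketch only shows which data each tile computation touches.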
Methodology for scheduling, partitioning and mapping computational tasks onto scalable, high performance, hybrid FPGA networks

Patent number: 7506297

Abstract: An automatically reconfigurable high performance FPGA system that includes a hybrid FPGA network and an automated scheduling, partitioning and mapping software tool adapted to configure the hybrid FPGA network in order to implement a functional task. The hybrid FPGA network includes a plurality of field programmable gate arrays, at least one processor, and at least one memory. The automated software tool is adapted to carry out the steps of scheduling portions of a functional task in a time sequence, partitioning a plurality of elements of the hybrid FPGA network by allocating or assigning network resources to the scheduled portions of the functional task, mapping the partitioned elements into a physical hardware design for implementing the functional task on the plurality of elements of the hybrid FPGA network, and iteratively repeating the scheduling, partitioning and mapping steps to reach an optimal physical hardware design.

Type: Grant

Filed: June 15, 2005

Date of Patent: March 17, 2009

Assignee: University of North Carolina at Charlotte

Inventors: Arindam Mukherjee, Arun Ravindran
System and method using embedded microprocessor as a node in an adaptable computing machine

Patent number: 7502915

Abstract: The present invention provides an adaptive computing engine (ACE) that includes processing nodes having different capabilities such as arithmetic nodes, bit-manipulation nodes, finite state machine nodes, input/output nodes and a programmable scalar node (PSN). In accordance with one embodiment of the present invention, a common architecture is adaptable to function either as a kernel node, or k-node, or as a general-purpose RISC node. The k-node acts as a system controller responsible for adapting other nodes to perform selected functions. As a RISC node, the PSN is configured to perform computationally intensive applications such as signal processing.

Type: Grant

Filed: September 29, 2003

Date of Patent: March 10, 2009

Assignee: NVIDIA Corporation

Inventors: Rojit Jacob, Dan Minglun Chuang
Instruction vector-mode processing in multi-lane processor by multiplex switch replicating instruction in one lane to select others along with updated operand address

Patent number: 7493475

Abstract: An improved superscalar processor. The processor includes multiple lanes, allowing multiple instructions in a bundle to be executed in parallel. In vector mode, the parallel lanes may be used to execute multiple instances of a bundle, representing multiple iterations of the bundle in a vector run. Scheduling logic determines whether, for each bundle, multiple instances can be executed in parallel. If multiple instances can be executed in parallel, coupling circuitry couples an instance of the bundle from one lane into one or more other lanes. In each lane, register addresses are renamed to ensure proper execution of the bundles in the vector run. Additionally, the processor may include a register bank separate from the architectural register file. Renaming logic can generate addresses to this separate register bank that are longer than used to address architectural registers, allowing longer vectors and more efficient processor operation.

Type: Grant

Filed: November 15, 2006

Date of Patent: February 17, 2009

Assignee: STMicroelectronics, Inc.

Inventor: Osvaldo M. Colavin
MECHANISM FOR IMPLEMENTING A MICROCODE PATCH DURING FABRICATION

Publication number: 20090031103

Abstract: A patch apparatus in a microprocessor is provided. The patch apparatus includes a plurality of fuse banks and an array controller. The plurality of fuse banks is configured to store associated patch records that are employed to patch microcode or circuits in the microprocessor. The array controller is coupled to the plurality of fuse banks, and is configured to read the associated patch records, and is configured to provide the associated patch records to a patch loader, where the patch loader provides patches corresponding to the associated patch records, as prescribed, to designated target patch mechanisms in the microprocessor. The patch loader provides the patches to the designated target patch mechanisms following transition of a microprocessor reset signal and prior to execution of instructions stored in a BIOS ROM.

Type: Application

Filed: July 24, 2007

Publication date: January 29, 2009

Applicant: VIA TECHNOLOGIES

Inventors: G. GLENN HENRY, TERRY PARKS
Image processing method and device

Patent number: 7483595

Abstract: An image processing method and device for processing multiple rows of pixels of an image simultaneously with a single instruction. The processing includes selecting a pixel window having a plurality of pixels of an image spanning across multiple rows and columns, building vertical and horizontal load registers to include the plurality of pixels of the selected pixel window, and simultaneously processing selected pixels of the plurality of pixels included in the vertical and horizontal load registers using a single instruction, wherein the vertical and horizontal load registers are shifted when the selected pixels are processed. Accordingly, a method and device for efficient processing of an image is provided.

Type: Grant

Filed: September 16, 2004

Date of Patent: January 27, 2009

Assignee: Marvell International Technology Ltd.

Inventors: Douglas Gene Keithley, Roy Gideon Moss
Parallel processing device and parallel processing method

Patent number: 7480785

Abstract: A row decoding circuit (171) outputs a select signal to a row set in a row range setting unit (172) to select a select signal line (103), processing results from processing circuits (102) on this row are output to a data output line (104), and a row adder (106) adds processing results output to a data output line (104) of a column set in a column range selector (105).

Type: Grant

Filed: February 13, 2004

Date of Patent: January 20, 2009

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Toshishige Shimamura, Hiroki Morimura, Koji Fujii, Satoshi Shigematsu, Katsuyuki Machida
Method for load balancing an n-dimensional array of parallel processing elements

Patent number: 7472392

Abstract: One aspect of the present invention relates to a method for balancing the load of an n-dimensional array of processing elements (PEs), wherein each dimension of the array includes the processing elements arranged in a plurality of lines and wherein each of the PEs has a local number of tasks associated therewith. The method comprises balancing at least one line of PEs in a first dimension, balancing at least one line of PEs in a next dimension, and repeating the balancing at least one line of PEs in a next dimension for each dimension of the n-dimensional array. The method may further comprise selecting one or more lines within said first dimension and shifting the number of tasks assigned to PEs in said selected one or more lines.

Type: Grant

Filed: October 20, 2003

Date of Patent: December 30, 2008

Assignee: Micron Technology, Inc.

Inventor: Mark Beaumont
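The dimension-by-dimension balancing in the abstract above can be illustrated for a two-dimensional array. The abstract does not state how a single line is balanced, so the even split in `balance_line` is an assumption, and both function names are hypothetical.

```python
def balance_line(line):
    """Spread the tasks on one line of PEs as evenly as possible.

    Assumption: an even split with any remainder given to the first PEs.
    """
    base, extra = divmod(sum(line), len(line))
    return [base + 1 if i < extra else base for i in range(len(line))]


def balance_2d(grid):
    """Balance every row (first dimension), then every column (next dimension)."""
    rows = [balance_line(row) for row in grid]
    cols = [balance_line(list(col)) for col in zip(*rows)]
    # zip(*cols) transposes the balanced columns back into row-major order.
    return [list(row) for row in zip(*cols)]


balanced = balance_2d([[8, 0], [0, 0]])
```

The total task count is preserved at every step; only its distribution across PEs changes.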
Integrated Processor Array, Instruction Sequencer And I/O Controller

Publication number: 20080307196

Abstract: A computer processor having an integrated instruction sequencer, array of processing engines, and I/O controller. The instruction sequencer sequences instructions from a host, and transfers these instructions to the processing engines, thus directing their operation. The I/O controller controls the transfer of I/O data to and from the processing engines in parallel with the processing controlled by the instruction sequencer. The processing engines themselves are constructed with an integer arithmetic and logic unit (ALU), a 1-bit ALU, a decision unit, and registers. Instructions from the instruction sequencer direct the integer ALU to perform integer operations according to logic states stored in the 1-bit ALU and data stored in the decision unit. The 1-bit ALU and the decision unit can modify their stored information in the same clock cycle as the integer ALU carries out its operation. The processing engines also contain a local memory for storing instructions and data.

Type: Application

Filed: May 28, 2008

Publication date: December 11, 2008

Inventors: Bogdan Mitu, Gheorghe Stefan, Dan Tomescu
Loosely-biased heterogeneous reconfigurable arrays

Patent number: 7461234

Abstract: A heterogeneous array includes clusters of processing elements. The clusters include a combination of ALUs and multiplexers linked by direct connections and various general-purpose routing networks. The multiplexers are controlled by the ALUs in the same cluster, or alternatively by ALUs in other clusters, via a special purpose routing network. Components of applications configured onto the array are selectively implemented in either multiplexers or ALUs, as determined by the relative efficiency of implementing the component in one or the other type of processing element, and by the relative availability of the processing element types. Multiplexer control signals are generated from combinations of ALU status signals, and optionally routed to control multiplexers in different clusters.

Type: Grant

Filed: May 16, 2005

Date of Patent: December 2, 2008

Assignee: Panasonic Corporation

Inventors: Nicholas John Charles Ray, Andrea Olgiati, Anthony I. Stansfield, Alan D Marshall
Processing system with dedicated local memories and busy identification

Patent number: 7457939

Abstract: A computer architecture and programming model for high speed processing over broadband networks are provided. The architecture employs a consistent modular structure, a common computing module and uniform software cells. The common computing module includes a control processor, a plurality of processing units, a plurality of local memories from which the processing units process programs, a direct memory access controller and a shared main memory. A processing system is provided for processing programs and data. The processing system has a processing unit and multiple sub-processing units. Each sub-processing unit includes a dedicated local memory for storing programs and data. The dedicated local memory of each respective sub-processing unit is not a cache memory. In an alternative, multiple computing devices may connect to one another via a communications network, and each computing device may include at least one processing element having the processing unit and sub-processing units.

Type: Grant

Filed: October 18, 2004

Date of Patent: November 25, 2008

Assignee: Sony Computer Entertainment Inc.

Inventors: Masakazu Suzuoki, Takeshi Yamazaki
Row and column enable signal activation of processing array elements with interconnection logic to simulate bus effect

Patent number: 7454593

Abstract: The present invention relates to the control of an array of processing elements in a parallel processor using row and column select lines. For each column in the array, a column select line connects to all of the processing elements in the column. For each row in the array, a row select line connects to all of the processing elements in the row. A processing element in the array may be selected by activation of its row and column select lines. The processing elements are connected to adjacent processing elements by respective segments of a row bus for each row and by respective segments of a column bus for each column. Each row of the array includes a respective column edge register coupled to a processing element at one end of the respective row and to a processing element at the other end of the respective row.

Type: Grant

Filed: April 11, 2003

Date of Patent: November 18, 2008

Assignee: Micron Technology, Inc.

Inventor: Graham Kirsch
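The selection rule in the abstract above, where a processing element is selected when both its row line and its column line are active, can be modelled in a few lines. The function name `selected_elements` and the use of `(row, column)` index pairs are assumptions made for illustration.

```python
def selected_elements(active_rows, active_cols):
    """Return the (row, column) positions whose row and column lines are both active."""
    return {(row, col) for row in active_rows for col in active_cols}


one_element = selected_elements({1}, {2})
whole_column = selected_elements({0, 1, 2}, {2})
```

Activating one row line and one column line picks a single element, while activating every row line with one column line picks that entire column.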
Array Type Operation Device

Publication number: 20080282061

Abstract: An array calculation device that includes a processor array composed of a plurality of processor elements having been assigned with orders, acquires an instruction in each cycle, generates, in each cycle, operation control information for controlling an operation of a processor element of a first order, and then generates an instruction to the processor element of the first order in accordance with the operation control information and the acquired instruction, and also generates, in each cycle, operation control information for controlling an operation of each processor element of a next order and onwards, in accordance with operation control information generated for controlling an operation of a processor element of an immediately preceding order, and then generates an instruction to each processor element of the next order and onwards, in accordance with the operation control information generated and the acquired instruction.

Type: Application

Filed: August 2, 2005

Publication date: November 13, 2008

Inventors: Hiroyuki Morishita, Takeshi Tanaka, Masaki Maeda, Yorihiko Wakayama
Array of Boolean logic controlled processing elements with concurrent I/O processing and instruction sequencing

Patent number: 7451293

Abstract: A computer processor having an integrated instruction sequencer, array of processing engines, and I/O controller. The instruction sequencer sequences instructions from a host, and transfers these instructions to the processing engines, thus directing their operation. The I/O controller controls the transfer of I/O data to and from the processing engines in parallel with the processing controlled by the instruction sequencer. The processing engines themselves are constructed with an integer arithmetic and logic unit (ALU), a 1-bit ALU, a decision unit, and registers. Instructions from the instruction sequencer direct the integer ALU to perform integer operations according to logic states stored in the 1-bit ALU and data stored in the decision unit. The 1-bit ALU and the decision unit can modify their stored information in the same clock cycle as the integer ALU carries out its operation. The processing engines also contain a local memory for storing instructions and data.

Type: Grant

Filed: October 19, 2006

Date of Patent: November 11, 2008

Assignee: Brightscale Inc.

Inventors: Bogdan Mitu, Gheorghe Stefan, Dan Tomescu
Methods for transmitting data across quantum interfaces and quantum gates using same

Patent number: 7451292

Abstract: Quantum gaps exist between an origin and a destination that heretofore have prevented reliably utilizing the advantages of quantum computing. To predict the outcome of instructions with precision, the input data, preferably a qubit, is collapsed to a point value within the quantum gap based on a software instruction. After collapse the input data is restructured at the destination, wherein dynamics of restructuring are governed by a plurality of gap factors as follows: computational self-awareness; computational decision logic; computational processing logic; computational and network protocol and logic exchange; computational and network components, logic and processes; provides the basis for excitability of the Gap junction and its ability to transmit electronic and optical impulses, integrates them properly, and depends on feedback loop logic; computational and network component and system interoperability; and embodiment substrate and network computational physical topology.

Type: Grant

Filed: August 8, 2003

Date of Patent: November 11, 2008

Inventor: Thomas J Routt
Inter-chip processor control plane communication

Patent number: 7447872

Abstract: An inter-chip communication (ICC) mechanism enables any processor in a pipelined arrayed processing engine to communicate directly with any other processor of the engine over a low-latency communication path. The ICC mechanism includes a unidirectional control plane path that is separate from a data plane path of the engine and that accommodates control information flow among the processors. The mechanism thus enables inter-processor communication without sending messages over the data plane communication path extending through processors of each pipeline.

Type: Grant

Filed: May 30, 2002

Date of Patent: November 4, 2008

Assignee: Cisco Technology, Inc.

Inventors: Russell Schroter, John William Marshall, Kenneth H. Potter
Processor synchronization in a multi-processor computer system

Patent number: 7441100

Abstract: A method for synchronizing a plurality of processors of a multi-processor computer system on a synchronization point is disclosed. The method includes triggering a first set of processors, using a lead processor of the plurality of processors when the lead processor encounters the synchronization point, to enter an exit holding loop. The first set of processors represents the plurality of processors except the lead processor. The triggering of the first set of processors is performed without accessing a shared memory area of the multi-processor system. There is also included triggering the plurality of processors, using a tail processor of the plurality of processors when the tail processor encounters the synchronization point, to leave the exit holding loop. The triggering of the plurality of processors is performed without accessing the shared memory area of the multi-processor system.

Type: Grant

Filed: February 27, 2004

Date of Patent: October 21, 2008

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Chenghung Justin Chen, John W. Curry, Robert Seymour
Message routing scheme

Publication number: 20080229059

Abstract: Each processor node in an array of nodes has a respective local node address, and each local node address comprises a plurality of components having an order of addressing significance from most to least significant. Each node comprises: mapping means configured to map each component of the local node address onto a respective routing direction, and a switch arranged to receive a message having a destination node address identifying a destination node. The switch comprises: means for comparing the local node address to the destination node address to identify the most significant non-matching component; and means for routing the message to another node, on the condition that the local node address does not match the destination node address, in the direction mapped to the most significant non-matching component.

Type: Application

Filed: March 14, 2007

Publication date: September 18, 2008

Inventor: Michael David May
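The routing rule in the abstract above can be sketched as follows. Addresses are modelled as tuples ordered from most to least significant component, and `directions` is a hypothetical per-node table mapping each component position to an outgoing direction; both representations are assumptions.

```python
def route(local, dest, directions):
    """Pick the direction mapped to the most significant non-matching component.

    Returns None when the addresses match, meaning the message has arrived.
    """
    for index, (mine, theirs) in enumerate(zip(local, dest)):
        if mine != theirs:
            return directions[index]
    return None


directions = ("north", "east", "up")  # hypothetical mapping for this node
next_hop = route((1, 0, 1), (1, 1, 0), directions)
```

Here the first components match, so the second component decides the direction even though the third also differs.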
Method and apparatus for diagnosing broken scan chain based on leakage light emission

Patent number: 7426448

Abstract: A mechanism for diagnosing broken scan chains based on leakage light emission is provided. An image capture mechanism detects light emission from leakage current in complementary metal oxide semiconductor (CMOS) devices. The diagnosis mechanism identifies devices with unexpected light emission. An unexpected amount of light emission may indicate that a transistor is turned off when it should be turned on or vice versa. All possible inputs may be tested to determine whether a problem exists with transistors in latches or with transistors in clock buffers. Broken points in the scan chain may then be determined based on the locations of unexpected light emission.

Type: Grant

Filed: February 3, 2004

Date of Patent: September 16, 2008

Assignee: International Business Machines Corporation

Inventors: Peilin Song, Tian Xia, Alan J. Weger, Franco Stellari, Stanislav V. Polonsky
Method for indirect access to a support interface for memory-mapped resources to reduce system connectivity from out-of-band support processor

Patent number: 7418541

Abstract: A method and apparatus are provided for a support interface for memory-mapped resources. A support processor sends a sequence of commands over an FSI interface to a memory-mapped support interface on a processor chip. The memory-mapped support interface updates memory, memory-mapped registers or memory-mapped resources. The interface uses fabric packet generation logic to generate a single command packet in a protocol for the coherency fabric which consists of an address, command and/or data. Fabric commands are converted to FSI protocol and forwarded to attached support chips to access the memory-mapped resource, and responses from the support chips are converted back to fabric response packets. Fabric snoop logic monitors the coherency fabric and decodes responses for packets previously sent by fabric packet generation logic. The fabric snoop logic updates a status register and/or writes response data to a read data register. The system also reports any errors that are encountered.

Type: Grant

Filed: February 10, 2005

Date of Patent: August 26, 2008

Assignee: International Business Machines Corporation

Inventors: James Stephen Fields, Jr., Paul Frank Lecocq, Brian Chan Monwai, Thomas Pflueger, Kevin Franklin Reick, Timothy M. Skergan, Scott Barnett Swaney
Array of parallel programmable processing engines and deterministic method of operating the same

Patent number: 7401333

Abstract: The present invention provides an array of parallel programmable processing engines interconnected by a switching network. At least some of the processing engines execute a thread, and at least some threads communicate with each other through communication objects either internally within one processing engine or through the network. A scheduling step of the parallel programmable processing engines is initiated by one or more events, an event being defined by a change of a state variable of a communication object. The array comprises: means for scheduling a scheduling step of the processing engines, the scheduling means comprising means for executing at least a first set of threads in parallel, means for updating state values of communications objects in response to the parallel executing step, and means for repeatedly and sequentially scheduling the executing means and the updating means until no more events occur. The present invention also provides a deterministic method of operating such an array.

Type: Grant

Filed: August 8, 2001

Date of Patent: July 15, 2008

Assignee: TranSwitch Corporation

Inventor: Ivo Vandeweerd
Method to operate cache-inhibited memory mapped commands to access registers

Patent number: 7392350

Abstract: In a multiprocessor environment, by executing cache-inhibited reads or writes to registers, a scan communication is used to rapidly access registers inside and outside a chip originating the command. Cumbersome locking of the memory location may be thus avoided. Setting of busy latches at the outset virtually eliminates the chance of collisions, and status bits are set to inform the requesting core processor that a command is done and free of error, if that is the case.

Type: Grant

Filed: February 10, 2005

Date of Patent: June 24, 2008

Assignee: International Business Machines Corporation

Inventors: James Stephen Fields, Jr., Michael Stephen Floyd, Paul Frank Lecocq, Larry Scott Leitner, Kevin Franklin Reick
SEMICONDUCTOR INTEGRATED CIRCUIT

Publication number: 20080148010

Abstract: The system design is facilitated by eliminating the increase in data transfer volume of the whole system. In order to facilitate the system design, there are provided an operation unit array, a memory array, a data transfer circuit, and a switch circuit. There are also provided a configuration data management unit for managing the configuration data defining the logical behaviors of the operation unit array, the memory array, the data transfer circuit, and the switch circuit, as well as a state transition management unit capable of controlling the switching of the configuration data. The data transfer circuit includes a control circuit capable of autonomously sorting the data by determining the timing of the data sorting according to the setting included in the configuration data.

Type: Application

Filed: December 14, 2007

Publication date: June 19, 2008

Inventor: Tomoyuki KODAMA
Imaging device

Patent number: 7369683

Abstract: In an imaging device of the present invention, an imaging element 2 is driven in a thinning read-out mode for reading out signal charges from a subset of pixels, or in an all-pixels read-out mode for reading out signal charges from all pixels. When the imaging element 2 is driven in the thinning read-out mode, the imaging device processes and records a series of first image data that is obtained by reading out signal charges from the subset of pixels and that constitutes the moving images. When the imaging element 2 is driven in the all-pixels read-out mode, the imaging device processes and records a series of second image data constituting moving images after the number of pixels of the second image data is thinned, and processes and records a portion of the second image data as a still image without thinning when an instruction to pick up the still image is given while picking up the moving images.

Type: Grant

Filed: August 4, 2004

Date of Patent: May 6, 2008

Assignee: Sanyo Electric Co., Ltd.

Inventors: Akio Kobayashi, Shigeru Miki
Task distribution

Patent number: 7356819

Abstract: Methods, signals, devices and systems are provided for matching tasks with processing units. A region within a multi-faceted task space is allocated to a processing unit. A point in the multi-faceted task space is assigned to a task. The task is then associated with the processing unit if the region allocated to the processing unit is close to the point assigned to the task. The region allocated to a processing unit may be changed. If no assigned point for a task is sufficiently close to any allocated processing unit region, the task is suspended. Overlapping regions may be assigned to different processing units. In some implementations, the union of the allocated regions covers the task space, while in others it does not. Regions may also be allocated to wait conditions and one or more dimensions of a region may be allocated to conventional processor allocators.

Type: Grant

Filed: August 7, 2003

Date of Patent: April 8, 2008

Assignee: Novell, Inc.

Inventors: Glenn Ricart, Del Jensen, Stephen R. Carter
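The matching rule in the abstract above can be sketched under some simplifying assumptions: regions are axis-aligned boxes given as `(low, high)` bounds per dimension, closeness is Euclidean distance, and a task is suspended when no region lies within `max_distance`. The function names, the `regions` dictionary, and the `max_distance` parameter are all hypothetical.

```python
def distance_to_region(point, region):
    """Euclidean distance from a point to a box; zero if the point is inside."""
    return sum(
        max(low - x, 0, x - high) ** 2 for x, (low, high) in zip(point, region)
    ) ** 0.5


def assign(point, regions, max_distance=0.0):
    """Return the processing unit whose region is closest to the task's point.

    Returns None (task suspended) when no region is within max_distance.
    """
    best = min(
        regions,
        key=lambda name: distance_to_region(point, regions[name]),
        default=None,
    )
    if best is None or distance_to_region(point, regions[best]) > max_distance:
        return None
    return best


regions = {"pu0": [(0, 1), (0, 1)], "pu1": [(2, 3), (0, 1)]}
owner = assign((2.5, 0.5), regions)
```

Overlapping regions are allowed in this sketch; the first unit at the minimum distance wins, which is one of several possible tie-breaking choices.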
Re-configurable circuit and configuration switching method

Patent number: 7315933

Abstract: The present invention is a re-configurable circuit capable of reducing latency by selecting a route for skipping the FF of an operation unit and outputting data to a connection destination operation unit if an accumulated process time is below an operation cycle allocated to the operation unit. The operation unit comprises at least a selector, a flip-flop and an operator. In a program for generating configuration information for switching the configuration of the operation unit of the re-configurable circuit, the selector selects the use/non-use of the flip-flop, based on the configuration information and selector switching condition is reflected in the configuration information for determining whether to take a route for transferring data inputted to the selector to the operator or a route for transferring the data to the operator skipping the flip-flop.

Type: Grant

Filed: October 6, 2005

Date of Patent: January 1, 2008

Assignee: Fujitsu Limited

Inventor: Seiichi Nishijima
Distributed multi-sample convolution

Patent number: 7266255

Abstract: A multi-chip system is disclosed for distributing the convolution process. Rather than having multiple convolution chips working in parallel with each chip working on a different portion of the screen, a new design utilizes chips working in series. Each chip is responsible for a different interleaved region of screen space. Each chip performs part of the convolution process for a pixel and sends a partial result on to the next chip. The final chip completes the convolution and stores the filtered pixel. An alternate design interconnects chips in groups. The chips within a group operate in series, whereas the groups may operate in parallel.

Type: Grant

Filed: September 26, 2003

Date of Patent: September 4, 2007

Assignee: Sun Microsystems, Inc.

Inventors: Michael A. Wasserman, Paul R. Ramsey, Nathaniel David Naegle
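
The series arrangement described in the abstract above, where each chip adds its share and forwards a partial result, might look like this in outline. The weighted-sum filter, the final normalization step, and the names `chip_stage` and `convolve_in_series` are assumptions made for illustration only.

```python
def chip_stage(partial, samples, weights):
    """One chip: add its own weighted samples to the running partial result."""
    acc, wsum = partial
    for sample, weight in zip(samples, weights):
        acc += sample * weight
        wsum += weight
    return acc, wsum


def convolve_in_series(per_chip_samples, per_chip_weights):
    """Pass a partial result down the chain of chips; the last step finishes the pixel."""
    partial = (0.0, 0.0)
    for samples, weights in zip(per_chip_samples, per_chip_weights):
        partial = chip_stage(partial, samples, weights)
    acc, wsum = partial
    # Assumed final step: normalize by the total weight to get the filtered pixel value.
    return acc / wsum


# Three chips, each holding the samples from its own interleaved region.
pixel = convolve_in_series([[1.0, 2.0], [3.0], [4.0]], [[1.0, 1.0], [1.0], [1.0]])
```
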
System and method for directing the flow of data and instructions into at least one functional unit

Patent number: 7176914

Abstract: A system and method are provided for directing the flow of data and instructions into at least one functional unit. In one embodiment of a system of components defining a plurality of nodes, a queue network manager (QNM) forming a part of each node, is provided. In this embodiment, the QNM comprises an interface to a network that supports intercommunication among the plurality of nodes, an interface configured to pass messages with a functional unit within the node, a random access memory (RAM) configured to store at least one of a message and a programmable instruction, and logic configured to control an operational aspect of a functional unit based on contents of the programmable instruction.

Type: Grant

Filed: May 16, 2002

Date of Patent: February 13, 2007

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Darel N. Emmot
Policy-based management of a redundant array of independent nodes

Patent number: 7155466

Abstract: An archive cluster application runs in a distributed manner across a redundant array of independent nodes. Each node preferably runs a complete archive cluster application instance. A given node provides a data repository, which stores up to a large amount (e.g., a terabyte) of data, while also acting as a portal that enables access to archive files. Each symmetric node has a set of software processes, e.g., a request manager, a storage manager, a metadata manager, and a policy manager. The request manager manages requests to the node for data (i.e., file data), the storage manager manages data read/write functions from a disk associated with the node, and the metadata manager facilitates metadata transactions and recovery across the distributed database. The policy manager implements one or more policies, which are operations that determine the behavior of an “archive object” within the cluster. The archive cluster application provides object-based storage.

Type: Grant

Filed: October 27, 2004

Date of Patent: December 26, 2006

Assignee: Archivas, Inc.

Inventors: Andres Rodriguez, Jack A. Orenstein, David M. Shaw, Benjamin K. D. Bernhard
Methods and apparatus for providing data transfer control

Patent number: 7130934

Abstract: A variety of advantageous mechanisms for improved data transfer control within a data processing system are described. A DMA controller is described which is implemented as a multiprocessing transfer engine supporting multiple transfer controllers which may work independently or in cooperation to carry out data transfers, with each transfer controller acting as an autonomous processor, fetching and dispatching DMA instructions to multiple execution units. In particular, mechanisms for initiating and controlling the sequence of data transfers are provided, as are processes for autonomously fetching DMA instructions which are decoded sequentially but executed in parallel.

Type: Grant

Filed: April 7, 2005

Date of Patent: October 31, 2006

Assignee: Altera Corporation

Inventors: Edwin Franklin Barry, Edward A. Wolff
Digital communications processor

Patent number: 7100020

Abstract: An integrated circuit (203) for use in processing streams of data generally and streams of packets in particular. The integrated circuit (203) includes a number of packet processors (307, 313, 303), a table look up engine (301), a queue management engine (305) and a buffer management engine (315). The packet processors (307, 313, 303) include a receive processor (421), a transmit processor (427) and a RISC core processor (401), all of which are programmable. The receive processor (421) and the core processor (401) cooperate to receive and route packets being received and the core processor (401) and the transmit processor (427) cooperate to transmit packets. Routing is done by using information from the table look up engine (301) to determine a queue (215) in the queue management engine (305) which is to receive a descriptor (217) describing the received packet's payload.

Type: Grant

Filed: May 7, 1999

Date of Patent: August 29, 2006

Assignee: Freescale Semiconductor, Inc.

Inventors: Thomas B. Brightman, Andrew T. Brown, John F. Brown, James A. Farrell, Andrew D. Funk, David J. Husak, Edward J. McLellan, Mark A. Sankey, Paul Schmitt, Donald A. Priore
Method for forming a single instruction multiple data massively parallel processor system on a chip

Patent number: 7069416

Abstract: A single chip active memory includes a plurality of memory stripes, each coupled to a full word interface and one of a plurality of processing element (PE) sub-arrays. The large number of couplings between a PE sub-array and its associated memory stripe are managed by placing the PE sub-arrays so that their data paths run at right angles to the data paths of the plurality of memory stripes. The data lines exiting the memory stripes are run across the PE sub-arrays on one metal layer. At the appropriate locations, the data lines are coupled to another orthogonally oriented metal layer to complete the coupling between the memory stripe and its associated PE sub-array. The plurality of PE sub-arrays are mapped to form a large logical array, in which each PE is coupled to four other PEs. Physically distant PEs are coupled using current mode differential logical couplings and drivers to ensure good signal integrity at high operational speeds. Each PE contains a small DRAM register array.

Type: Grant

Filed: June 4, 2004

Date of Patent: June 27, 2006

Assignee: Micron Technology, Inc.

Inventor: Graham Kirsch
Network processor which defines virtual paths without using logical path descriptors

Patent number: 7069557

Abstract: A virtual path feature in which several virtual channels share an assigned amount of bandwidth is implemented in a network processor. The network processor maintains a schedule indicative of respective times at which a plurality of virtual channels are to be serviced. An entry is read from the schedule. The entry corresponds to a current transmit cycle and includes a pointer to a channel descriptor for a virtual channel to be serviced in the current transmit cycle. A data cell for the virtual channel to be serviced in the current cycle is transmitted. An entry is added to the schedule to point to a channel descriptor that is pointed to by the channel descriptor for the virtual channel serviced in the current transmit cycle.

Type: Grant

Filed: May 23, 2002

Date of Patent: June 27, 2006

Assignee: International Business Machines Corporation

Inventor: Merwin Herscher Alferness
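
The scheduling loop in the abstract above can be sketched roughly as follows. The dictionary-based schedule, the fixed `interval` between services of the same virtual path, and the names `ChannelDescriptor` and `run_schedule` are assumptions for illustration; the abstract does not say where in the schedule the new entry is placed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelDescriptor:
    """Descriptor for one virtual channel; `next` points to the next channel sharing the path."""
    name: str
    next: Optional["ChannelDescriptor"] = None


def run_schedule(schedule, cycles, interval):
    """Service at most one channel per transmit cycle and reschedule the channel it points to."""
    sent = []
    for cycle in range(cycles):
        descriptor = schedule.pop(cycle, None)  # entry for the current transmit cycle
        if descriptor is None:
            continue
        sent.append((cycle, descriptor.name))   # stands in for transmitting a data cell
        # Add an entry for the descriptor pointed to by the one just serviced.
        schedule[cycle + interval] = descriptor.next
    return sent


# Three virtual channels linked in a ring share one virtual path's bandwidth.
a, b, c = ChannelDescriptor("vc-a"), ChannelDescriptor("vc-b"), ChannelDescriptor("vc-c")
a.next, b.next, c.next = b, c, a
log = run_schedule({0: a}, cycles=7, interval=2)
```

Because only one schedule entry exists for the whole ring at any time, the three channels together consume one slot every `interval` cycles, which is the shared-bandwidth behavior the abstract describes.
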
Irregular network

Patent number: 7043562

Abstract: Irregularities are provided in at least one dimension of a torus or mesh network for lower average path length and lower maximum channel load while increasing tolerance for omitted end-around connections. In preferred embodiments, all nodes supported on each backplane are connected in a single cycle which includes nodes on opposite sides of lower dimension tori. The cycles in adjacent backplanes hop different numbers of nodes.

Type: Grant

Filed: June 9, 2003

Date of Patent: May 9, 2006

Assignee: Avici Systems, Inc.

Inventors: William J. Dally, William F. Mann, Philip P. Carvey
Process for automatic dynamic reloading of data flow processors (DFPs) and units with two- or three-dimensional programmable cell architectures (FPGAs, DPGAs, and the like)

Patent number: 7028107

Abstract: A system for communication between a plurality of functional elements in a cell arrangement and a higher-level unit is described. The system may include, for example, a configuration memory arranged between the functional elements and the higher-level unit; and a control unit configured to move at least one position pointer to a configuration memory location in response to at least one event reported by a functional element. At run time, a configuration word in the configuration memory pointed to by at least one of the position pointers is transferred to the functional element in order to perform reconfiguration without the configuration word being managed by a central logic.

Type: Grant

Filed: October 7, 2002

Date of Patent: April 11, 2006

Assignee: Pact XPP Technologies AG

Inventors: Martin Vorbach, Robert Münch
Blocking processing restrictions based on page indices

Patent number: 7020761

Abstract: Processing restrictions of a computing environment are filtered and blocked, in certain circumstances, such that processing continues despite the restrictions. One restriction includes an indication that address translation is prohibited, in response to a buffer miss. When a processing unit of the computing environment is met with this restriction, it performs a comparison of page indices, which indicates whether the address translation can continue. If address translation can continue, the restriction is ignored. The processing unit includes a processor or a pageable entity, as examples.

Type: Grant

Filed: May 12, 2003

Date of Patent: March 28, 2006

Assignee: International Business Machines Corporation

Inventors: Timothy J. Siegel, Bruce A. Wagar, Ute Gaertner, Lisa C. Heller, Erwin F. Pfeffer
Fully scalable computer architecture

Patent number: 6996504

Abstract: A scalable computer architecture capable of performing fully scalable simulations includes a plurality of processing elements (PEs) and a plurality of interconnections between the PEs. In this regard, the interconnections can interconnect each processing element to each neighboring processing element located adjacent the respective processing element, and further interconnect at least one processing element to at least one other processing element located remote from the respective at least one processing element. For example, the interconnections can interconnect the plurality of processing elements according to a fractal-type method or a quenched random method. Further, the plurality of interconnections can include at least one interconnection at each length scale of the plurality of processing elements.

Type: Grant

Filed: November 14, 2001

Date of Patent: February 7, 2006

Assignee: Mississippi State University

Inventors: Mark A. Novotny, Gyorgy Korniss
Buffered coscheduling for parallel programming and enhanced fault tolerance

Patent number: 6993764

Abstract: A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval.

Type: Grant

Filed: June 28, 2001

Date of Patent: January 31, 2006

Assignee: The Regents of the University of California

Inventors: Fabrizio Petrini, Wu-chun Feng
Multi-channel bi-directional bus network with direction sideband bit for multiple context processing elements

Patent number: 6990566

Abstract: A method and an apparatus for configuration of multiple context processing elements (MCPEs) are described. The method and apparatus are capable of selectively transmitting data over a bidirectional shared bus network including a plurality of channels between pairs of MCPEs in the networked array. The method and apparatus then selectively transmit a sideband bit indicating a direction in which the data is transmitted in the shared bus network.

Type: Grant

Filed: April 20, 2004

Date of Patent: January 24, 2006

Assignee: Broadcom Corporation

Inventors: Ethan Mirsky, Robert French, Ian Eslick
Parallel computer with improved access to adjacent processor and memory elements

Patent number: 6968442

Abstract: A parallel computer of this invention includes a plurality of memory elements and a plurality of processing elements and each of the processing elements is connected to logically adjacent memory elements. For example, the processing element which corresponds to a logical position (i, j) is connected to the memory elements which correspond to a plurality of logical positions (i, j), (i, j+1), (i+1, j) and (i+1, j+1). It is preferable if each of the memory elements can be accessed from the exterior. According to this invention, efficient memory access can be made and the parallel processing can be performed at high speed without increasing the hardware amount and making the control operation complicated. Further, the operation speed of the image processing can be enhanced by constructing an image memory by use of a plurality of memory elements and causing the processing element to effect the image processing in a distributed and cooperative manner.

Type: Grant

Filed: June 19, 2002

Date of Patent: November 22, 2005

Assignee: Kabushiki Kaisha Toshiba

Inventors: Kenichi Maeda, Nobuyuki Takeda, Yasukazu Okamoto
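
The connection pattern given in the abstract above, where the processing element at (i, j) reaches memory elements (i, j), (i, j+1), (i+1, j) and (i+1, j+1), can be written down directly. The function name `memory_neighbors` and the clipping at the grid edge are assumptions; the abstract does not say how boundary elements are handled.

```python
def memory_neighbors(i, j, rows, cols):
    """Memory elements connected to the processing element at logical position (i, j).

    Positions outside a rows x cols memory grid are dropped (an assumed boundary rule).
    """
    candidates = [(i, j), (i, j + 1), (i + 1, j), (i + 1, j + 1)]
    return [(r, c) for r, c in candidates if r < rows and c < cols]


interior = memory_neighbors(0, 0, rows=3, cols=3)
corner = memory_neighbors(2, 2, rows=3, cols=3)
```
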
Pull transfers and transfer receipt confirmation in a datapipe routing bridge

Patent number: 6967950

Abstract: In a network of digital signal processor nodes connected in a peer-to-peer relationship, a data packet sent to a node causes a return transmission from that node. The requester digital signal processor sends a data packet to a target digital signal processor. Upon arrival at the target digital signal processor, its receiver drives the arriving request packet into an I/O memory and triggers a transmitter interrupt. Next, the pull interrupt causes the transmitter to execute on a next packet boundary the pull request packet. Finally, the execution of the pull request causes the transmitter to pull a portion of the local I/O memory and send it back to the requester digital signal processor. The same physical portion of the I/O memory is overlaid with two logical uses, a receiver channel and a transmitter code block.

Type: Grant

Filed: July 13, 2001

Date of Patent: November 22, 2005

Assignee: Texas Instruments Incorporated

Inventors: Peter Galicki, Cheryl S. Shepherd, Jonathan H. Thorn
Apparatus and method for matrix data processing

Patent number: 6944747

Abstract: A matrix data processor is implemented wherein data elements are stored in physical registers and mapped to logical registers. After being stored in the logical registers, the data elements are then treated as matrix elements. By using a series of variable matrix parameters to define the size and location of the various matrix source and destination elements, as well as the operation(s) to be performed on the matrices, the performance of digital signal processing operations can be significantly enhanced.

Type: Grant

Filed: December 9, 2002

Date of Patent: September 13, 2005

Assignee: GemTech Systems, LLC

Inventors: Gopalan N Nair, Gouri G. Nair
Multiprocessor application interface requiring no ultilization of a multiprocessor operating system

Patent number: 6928539

Abstract: A test monitor loaded into a multiprocessor machine comprises a program (31) designed to interpret a script language for writing tests, a program (29) that constitutes a kernel part for conducting the tests according to the scripts, and a library (30) of functions that constitutes an application program interface with the firmware of the machine (1). This monitor implements a method for executing instruction sequences simultaneously in several processors (3, 4, 5) of a multiprocessor machine (1). The method comprises a first step (8) in which a single processor operating system is booted in a first processor (2) and a second step (9) in which the first processor (2) orders at least one other processor (3) of the machine, called an application processor, to execute one or more instruction sequences (17, 18, 19) under the control of said first processor.

Type: Grant

Filed: May 17, 2001

Date of Patent: August 9, 2005

Assignee: Bull S.A.

Inventors: Claude Brassac, Alain Vigor
Unified memory distributed across multiple nodes in a computer graphics system

Patent number: 6919894

Abstract: A system is described that is broadly directed to a system of integrated circuit components. The system comprises a plurality of nodes that are interconnected by communication links. A random access memory (RAM) is connected to each node. At least one functional unit is integrated into each node, and each functional unit is configured to carry out a predetermined processing function. Finally, each RAM includes a coherency mechanism configured to permit only read access to the RAM by other nodes, the coherency mechanism further configured to permit write access to the RAM only by functional units that are local to the node.

Type: Grant

Filed: July 21, 2003

Date of Patent: July 19, 2005

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Darel N. Emmot, Byron A. Alcorn
Method and system for efficient use of a multi-dimensional sharing vector in a computer system

Patent number: 6915388

Abstract: A multiprocessor computer system includes a plurality of processor nodes, a memory, and an interconnect network connecting the plurality of processor nodes to the memory. The memory includes a plurality of lines and a cache coherence directory structure. The plurality of lines includes a first line. The cache coherence directory structure includes a plurality of directory structure entries. Each directory structure entry includes processor pointer information indicating the processor nodes that have cached copies of the first line. The processor pointer information includes a plurality n of bit vectors, where n is an integer greater than one. The n bit vectors define a matrix having a number of locations equal to the product of the number of bits in each of the n bit vectors.

Type: Grant

Filed: July 20, 2001

Date of Patent: July 5, 2005

Assignee: Silicon Graphics, Inc.

Inventor: William A. Huffman
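
For n = 2, the bit-vector matrix in the abstract above can be illustrated as follows. The class name `SharingVector2D`, the row-major mapping from node number to (row, column), and the reading of the matrix as a superset of possible sharers are assumptions for this sketch; the abstract states only that the n bit vectors define a matrix whose size is the product of their lengths.

```python
class SharingVector2D:
    """Two bit vectors (rows and columns) that together describe a rows x cols matrix of nodes."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.row_bits = 0
        self.col_bits = 0

    def matrix_locations(self):
        # Number of locations equals the product of the bit counts of the two vectors.
        return self.rows * self.cols

    def add_sharer(self, node):
        # Assumed mapping: node number -> (row, column) in row-major order.
        r, c = divmod(node, self.cols)
        self.row_bits |= 1 << r
        self.col_bits |= 1 << c

    def possible_sharers(self):
        # Every (row, column) pair with both bits set may hold a cached copy.
        return [r * self.cols + c
                for r in range(self.rows) if (self.row_bits >> r) & 1
                for c in range(self.cols) if (self.col_bits >> c) & 1]


sv = SharingVector2D(rows=4, cols=4)
sv.add_sharer(1)    # row 0, column 1
sv.add_sharer(14)   # row 3, column 2
```

In this sketch, 8 directory bits cover 16 nodes, at the cost of occasionally naming nodes (here 2 and 13) that never cached the line.
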
Reconfigurable data path processor

Patent number: 6883084

Abstract: A reconfigurable data path processor comprises a plurality of independent processing elements. Each of the processing elements advantageously comprises an identical architecture. Each processing element comprises a plurality of data processing means for generating a potential output. Each processor is also capable of through-putting an input as a potential output with little or no processing. Each processing element comprises a conditional multiplexer having a first conditional multiplexer input, a second conditional multiplexer input and a conditional multiplexer output. A first potential output value is transmitted to the first conditional multiplexer input, and a second potential output value is transmitted to the second conditional multiplexer input. The conditional multiplexer couples either the first conditional multiplexer input or the second conditional multiplexer input to the conditional multiplexer output, according to an output control command.

Type: Grant

Filed: July 25, 2002

Date of Patent: April 19, 2005

Assignee: University of New Mexico

Inventor: Gregory Donohoe
Cross-chip communication mechanism in distributed node topology

Publication number: 20040215929

Abstract: A method of communicating between processing units on different integrated circuit chips in a multi-processor computer system by issuing a command from a source processing unit to a destination processing unit, receiving the command at the destination processing unit while the destination processing unit is processing program instructions, and accessing registers in clock-controlled components of the destination processing unit without interrupting processing of the program instructions by the destination processing unit. The access may be a read from status or mode registers of the destination processing unit, or write to control or mode registers. Many processing units can be interconnected in a ring topology, and the access command can be passed from the source processing unit through several other processing units before reaching the destination processing unit.

Type: Application

Filed: April 28, 2003

Publication date: October 28, 2004

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael Stephen Floyd, Larry Scott Leitner, Kevin Franklin Reick, Kevin Dennis Woodling
