Array Processor Operation Patents (Class 712/16)
  • Patent number: 7526630
    Abstract: A controller operable to control an array of processing elements comprises a retrieval unit operable to retrieve instruction items for each of a plurality of instructions streams, each instruction stream having a plurality of instructions items, a combining unit operable to combine the plurality of instruction streams into a serial instruction stream, and a distribution unit operable to distribute the serial instruction stream to an array of processing elements.
    Type: Grant
    Filed: January 4, 2007
    Date of Patent: April 28, 2009
    Assignee: Clearspeed Technology, PLC
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russel David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 7515899
    Abstract: Additional computing power is captured using the idle processing power of mobile phones incorporated into a grid computing system, wherein the system is capable of pushing projects out to available mobile phones for processing during idle operation times. To further efficiently utilize the unused processing cycles of mobile phones, a unique protocol is utilized to coordinate processing tasks which makes use of existing short messages techniques to communicate projects. The unique protocol is combination of bootstrapping using standard compression techniques along with an adaptive compression scheme.
    Type: Grant
    Filed: April 23, 2008
    Date of Patent: April 7, 2009
    Assignee: International Business Machines Corporation
    Inventors: Hollie Carr, Peter Mattison, Christopher E. Sharp
  • Patent number: 7509442
    Abstract: An informational-signal-processing apparatus has a plurality of functional blocks and a control block that controls operations of the functional blocks. Each of the functional blocks performs a series of items of processing. The control block or a predetermined block among the control block and the functional blocks distributes a global command. Each of the functional blocks receives the global command and operates adaptively based on the received global command. The functional blocks output a block-to-block synchronizing signal at an output timing of a processed informational signal that has been performed on the basis of the global command.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: March 24, 2009
    Assignee: Sony Corporation
    Inventors: Seiji Wada, Tetsujiro Kondo, Yoshihiro Wakita, Takuya Oshima
  • Patent number: 7506134
    Abstract: The present invention enables efficient matrix multiplication operations on parallel processing devices. One embodiment is a method for mapping CTAs to result matrix tiles for matrix multiplication operations. Another embodiment is a second method for mapping CTAs to result tiles. Yet other embodiments are methods for mapping the individual threads of a CTA to the elements of a tile for result tile computations, source tile copy operations, and source tile copy and transpose operations. The present invention advantageously enables result matrix elements to be computed on a tile-by-tile basis using multiple CTAs executing concurrently on different streaming multiprocessors, enables source tiles to be copied to local memory to reduce the number accesses from the global memory when computing a result tile, and enables coalesced read operations from the global memory as well as write operations to the local memory without bank conflicts.
    Type: Grant
    Filed: June 16, 2006
    Date of Patent: March 17, 2009
    Assignee: NVIDIA Corporation
    Inventors: Norbert Juffa, Radoslav Danilak
  • Patent number: 7506297
    Abstract: An automatically reconfigurable high performance FPGA system that includes a hybrid FPGA network and an automated scheduling, partitioning and mapping software tool adapted to configure the hybrid FPGA network in order to implement a functional task. The hybrid FPGA network includes a plurality of field programmable gate arrays, at least one processor, and at least one memory. The automated software tool adapted to carry out the steps of scheduling portions of a functional task in a time sequence, partitioning a plurality of elements of the hybrid FPGA network by allocating or assigning network resources to the scheduled portions of the functional task, mapping the partitioned elements into a physical hardware design for implementing the functional task on the plurality of elements of the hybrid FPGA network, and iteratively repeating the scheduling, partitioning and mapping steps to reach an optimal physical hardware design.
    Type: Grant
    Filed: June 15, 2005
    Date of Patent: March 17, 2009
    Assignee: University of North Carolina at Charlotte
    Inventors: Arindam Mukherjee, Arun Ravindran
  • Patent number: 7502915
    Abstract: The present invention provides an adaptive computing engine (ACE) that includes processing nodes having different capabilities such as arithmetic nodes, bit-manipulation nodes, finite state machine nodes, input/output nodes and a programmable scalar node (PSN). In accordance with one embodiment of the present invention, a common architecture is adaptable to function in either a kernel node, or k-node, or as general purpose RISC node. The k-node acts as a system controller responsible for adapting other nodes to perform selected functions. As a RISC node, the PSN is configured to perform computationally intensive applications such as signal processing.
    Type: Grant
    Filed: September 29, 2003
    Date of Patent: March 10, 2009
    Assignee: NVIDIA Corporation
    Inventors: Rojit Jacob, Dan Minglun Chuang
  • Patent number: 7493475
    Abstract: An improved superscalar processor. The processor includes multiple lanes, allowing multiple instructions in a bundle to be executed in parallel. In vector mode, the parallel lanes may be used to execute multiple instances of a bundle, representing multiple iterations of the bundle in a vector run. Scheduling logic determines whether, for each bundle, multiple instances can be executed in parallel. If multiple instances can be executed in parallel, coupling circuitry couples an instance of the bundle from one lane into one or more other lanes. In each lane, register addresses are renamed to ensure proper execution of the bundles in the vector run. Additionally, the processor may include a register bank separate from the architectural register file. Renaming logic can generate addresses to this separate register bank that are longer than used to address architectural registers, allowing longer vectors and more efficient processor operation.
    Type: Grant
    Filed: November 15, 2006
    Date of Patent: February 17, 2009
    Assignee: STMicroelectronics, Inc.
    Inventor: Osvaldo M. Colavin
  • Publication number: 20090031103
    Abstract: A patch apparatus in a microprocessor is provided. The patch apparatus includes a plurality of fuse banks and an array controller. The plurality of fuse banks is configured to store associated patch records that are employed to patch microcode or circuits in the microprocessor. The array controller is coupled to the plurality of fuse banks, and is configured to read the associated patch records, and is configured to provide the associated patch records to a patch loader, where the patch loader provides patches corresponding to the associated patch records, as prescribed, to designated target patch mechanisms in the microprocessor. The patch loader provides the patches to the designated target patch mechanisms following transition of a microprocessor reset signal and prior to execution of instructions stored in a BIOS ROM.
    Type: Application
    Filed: July 24, 2007
    Publication date: January 29, 2009
    Applicant: VIA TECHNOLOGIES
    Inventors: G. GLENN HENRY, TERRY PARKS
  • Patent number: 7483595
    Abstract: An image processing method and device for processing multiple rows of pixels of an image simultaneously with a single instruction. The processing includes selecting a pixel window having a plurality of pixels of an image spanning across multiple rows and columns, building vertical and horizontal load registers to include the plurality of pixels of the selected pixel window, and simultaneously processing selected pixels of the plurality of pixels included in the vertical and horizontal load registers using a single instruction, wherein the vertical and horizontal load registers are shifted when the selected pixels are processed. Accordingly, a method and device for efficient processing of an image is provided.
    Type: Grant
    Filed: September 16, 2004
    Date of Patent: January 27, 2009
    Assignee: Marvell International Technology Ltd.
    Inventors: Douglas Gene Keithley, Roy Gideon Moss
  • Patent number: 7480785
    Abstract: A row decoding circuit (171) outputs a select signal to a row set in a row range setting unit (172) to select a select signal line (103), processing results from processing circuits (102) on this row are output to a data output line (104), and a row adder (106) adds processing results output to a data output line (104) of a column set in a column range selector (105).
    Type: Grant
    Filed: February 13, 2004
    Date of Patent: January 20, 2009
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Toshishige Shimamura, Hiroki Morimura, Koji Fujii, Satoshi Shigematsu, Katsuyuki Machida
  • Patent number: 7472392
    Abstract: One aspect of the present invention relates to a method for balancing the load of an n-dimensional array of processing elements (PEs), wherein each dimension of the array includes the processing elements arranged in a plurality of lines and wherein each of the PEs has a local number of tasks associated therewith. The method comprises balancing at least one line of PEs in a first dimension, balancing at least one line of PEs in a next dimension, and repeating the balancing at least one line of PEs in a next dimension for each dimension of the n-dimensional array. The method may further comprise selecting one or more lines within said first dimension and shifting the number of tasks assigned to PEs in said selected one or more lines.
    Type: Grant
    Filed: October 20, 2003
    Date of Patent: December 30, 2008
    Assignee: Micron Technology, Inc.
    Inventor: Mark Beaumont
  • Publication number: 20080307196
    Abstract: A computer processor having an integrated instruction sequencer, array of processing engines, and I/O controller. The instruction sequencer sequences instructions from a host, and transfers these instructions to the processing engines, thus directing their operation. The I/O controller controls the transfer of I/O data to and from the processing engines in parallel with the processing controlled by the instruction sequencer. The processing engines themselves are constructed with an integer arithmetic and logic unit (ALU), a 1-bit ALU, a decision unit, and registers. Instructions from the instruction sequencer direct the integer ALU to perform integer operations according to logic states stored in the 1-bit ALU and data stored in the decision unit. The 1-bit ALU and the decision unit can modify their stored information in the same clock cycle as the integer ALU carries out its operation. The processing engines also contain a local memory for storing instructions and data.
    Type: Application
    Filed: May 28, 2008
    Publication date: December 11, 2008
    Inventors: Bogdan Mitu, Gheorghe Stefan, Dan Tomescu
  • Patent number: 7461234
    Abstract: A heterogeneous array includes clusters of processing elements. The clusters include a combination of ALUs and multiplexers linked by direct connections and various general-purpose routing networks. The multiplexers are controlled by the ALUs in the same cluster, or alternatively by ALUs in other clusters, via a special purpose routing network. Components of applications configured onto the array are selectively implemented in either multiplexers or ALUs, as determined by the relative efficiency of implementing the component in one or the other type of processing element, and by the relative availability of the processing element types. Multiplexer control signals are generated from combinations of ALU status signals, and optionally routed to control multiplexers in different clusters.
    Type: Grant
    Filed: May 16, 2005
    Date of Patent: December 2, 2008
    Assignee: Panasonic Corporation
    Inventors: Nicholas John Charles Ray, Andrea Olgiati, Anthony I. Stansfield, Alan D Marshall
  • Patent number: 7457939
    Abstract: A computer architecture and programming model for high speed processing over broadband networks are provided. The architecture employs a consistent modular structure, a common computing module and uniform software cells. The common computing module includes a control processor, a plurality of processing units, a plurality of local memories from which the processing units process programs, a direct memory access controller and a shared main memory. A processing system is provided for processing programs and data. The processing system has a processing unit and multiple sub-processing units. Each sub-processing unit includes a dedicated local memory for storing programs and data. The dedicated local memory of each respective sub-processing unit is not a cache memory. In an alternative, multiple computing devices may connect to one another via a communications network, and each computing device may include at least one processing element having the processing unit and sub-processing units.
    Type: Grant
    Filed: October 18, 2004
    Date of Patent: November 25, 2008
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Masakazu Suzuoki, Takeshi Yamazaki
  • Patent number: 7454593
    Abstract: The present invention relates to the control of an array of processing elements in a parallel processor using row and column select lines. For each column in the array, a column select line connects to all of the processing elements in the column. For each row in the array, a row select line connecting to all of the processing elements in the row. A processing element in the array may be selected by activation of its row and column select lines. The processing elements are connected to adjacent processing elements by respective segments of a row bus for each row and by respective segments of a column bus for each column. Each row of the array includes a respective column edge register coupled to a processing element at one end of the respective row and to a processing element at the other end of the respective row.
    Type: Grant
    Filed: April 11, 2003
    Date of Patent: November 18, 2008
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Publication number: 20080282061
    Abstract: An array calculation device that includes a processor array composed of a plurality of processor elements having been assigned with orders, acquires an instruction in each cycle, generates, in each cycle, operation control information for controlling an operation of a processor element of a first order, and then generates an instruction to the processor element of the first order in accordance with the operation control information and the acquired instruction, and also generates, in each cycle, operation control information for controlling an operation of each processor element of a next order and onwards, in accordance with operation control information generated for controlling an operation of a processor element of an immediately preceding order, and then generates an instruction to each processor element of the next order and onwards, in accordance with the operation control information generated and the acquired instruction.
    Type: Application
    Filed: August 2, 2005
    Publication date: November 13, 2008
    Inventors: Hiroyuki Morishita, Takeshi Tanaka, Masaki Maeda, Yorihiko Wakayama
  • Patent number: 7451293
    Abstract: A computer processor having an integrated instruction sequencer, array of processing engines, and I/O controller. The instruction sequencer sequences instructions from a host, and transfers these instructions to the processing engines, thus directing their operation. The I/O controller controls the transfer of I/O data to and from the processing engines in parallel with the processing controlled by the instruction sequencer. The processing engines themselves are constructed with an integer arithmetic and logic unit (ALU), a 1-bit ALU, a decision unit, and registers. Instructions from the instruction sequencer direct the integer ALU to perform integer operations according to logic states stored in the 1-bit ALU and data stored in the decision unit. The 1-bit ALU and the decision unit can modify their stored information in the same clock cycle as the integer ALU carries out its operation. The processing engines also contain a local memory for storing instructions and data.
    Type: Grant
    Filed: October 19, 2006
    Date of Patent: November 11, 2008
    Assignee: Brightscale Inc.
    Inventors: Bogdan Mitu, Gheorghe Stefan, Dan Tomescu
  • Patent number: 7451292
    Abstract: Quantum gaps exist between an origin and a destination that heretofore have prevented reliably utilizing the advantages of quantum computing. To predict the outcome of instructions with precision, the input data, preferably a qubit, is collapsed to a point value within the quantum gap based on a software instruction. After collapse the input data is restructured at the destination, wherein dynamics of restructuring are governed by a plurality of gap factors as follows: computational self-awareness; computational decision logic; computational processing logic; computational and network protocol and logic exchange; computational and network components, logic and processes; provides the basis for excitability of the Gap junction and its ability to transmit electronic and optical impulses, integrates them properly, and depends on feedback loop logic; computational and network component and system interoperability; and embodiment substrate and network computational physical topology.
    Type: Grant
    Filed: August 8, 2003
    Date of Patent: November 11, 2008
    Inventor: Thomas J Routt
  • Patent number: 7447872
    Abstract: An inter-chip communication (ICC) mechanism enables any processor in a pipelined arrayed processing engine to communicate directly with any other processor of the engine over a low-latency communication path. The ICC mechanism includes a unidirectional control plane path that is separate from a data plane path of the engine and that accommodates control information flow among the processors. The mechanism thus enables inter-processor communication without sending messages over the data plane communication path extending through processors of each pipeline.
    Type: Grant
    Filed: May 30, 2002
    Date of Patent: November 4, 2008
    Assignee: Cisco Technology, Inc.
    Inventors: Russell Schroter, John William Marshall, Kenneth H. Potter
  • Patent number: 7441100
    Abstract: A method for synchronizing a plurality of processors of a multi-processor computer system on a synchronization point is disclosed. The method includes triggering a first set of processors, using a lead processor of the plurality of processors when the lead processor encounters the synchronization point, to enter an exit holding loop. The first set of processors representing the plurality of processors except the lead processor. The triggering the first set of processors is performed without accessing a shared memory area of the multi-processor system. There is also included triggering the plurality of processors, using a tail processor of the plurality of processors when the tail processor encounters the synchronization point, to leave the exit holding loop. The triggering the plurality of processors is performed without accessing the shared memory area of the multi-processor system.
    Type: Grant
    Filed: February 27, 2004
    Date of Patent: October 21, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Chenghung Justin Chen, John W. Curry, Robert Seymour
  • Publication number: 20080229059
    Abstract: Each possessor node in an array of nodes has a respective local node address, and each local node address comprises a plurality of components having an order of addressing significance from most to least significant. Each node comprises: mapping means configured to map each component of the local node address onto a respective routing direction, and a switch arranged to receive a message having a destination node address identifying a destination node. The switch comprises: means for comparing the local node address to the destination node address to identify a the most significant non-matching component; and means for routing the message to another node, on the condition that the local node address does not match the destination node address, in the direction mapped to the most significant non-matching component.
    Type: Application
    Filed: March 14, 2007
    Publication date: September 18, 2008
    Inventor: Michael David May
  • Patent number: 7426448
    Abstract: A mechanism for diagnosing broken scan chains based on leakage light emission is provided. An image capture mechanism detects light emission from leakage current in complementary metal oxide semiconductor (CMOS) devices. The diagnosis mechanism identifies devices with unexpected light emission. An unexpected amount of light emission may indicate that a transistor is turned off when it should be turned on or vice versa. All possible inputs may be tested to determine whether a problem exists with transistors in latches or with transistors in clock buffers. Broken points in the scan chain may then be determined based on the locations of unexpected light emission.
    Type: Grant
    Filed: February 3, 2004
    Date of Patent: September 16, 2008
    Assignee: International Business Machines Corporation
    Inventors: Peilin Song, Tian Xia, Alan J. Weger, Franco Stellari, Stanislav V. Polonsky
  • Patent number: 7418541
    Abstract: A method and apparatus are provided for a support interface for memory-mapped resources. A support processor sends a sequence of commands over and FSI interface to a memory-mapped support interface on a processor chip. The memory-mapped support interface updates memory, memory-mapped registers or memory-mapped resources. The interface uses fabric packet generation logic to generate a single command packet in a protocol for the coherency fabric which consists of an address, command and/or data. Fabric commands are converted to FSI protocol and forwarded to attached support chips to access the memory-mapped resource, and responses from the support chips are converted back to fabric response packets. Fabric snoop logic monitors the coherency fabric and decodes responses for packets previously sent by fabric packet generation logic. The fabric snoop logic updates status register and/or writes response data to a read data register. The system also reports any errors that are encountered.
    Type: Grant
    Filed: February 10, 2005
    Date of Patent: August 26, 2008
    Assignee: International Business Machines Corporation
    Inventors: James Stephen Fields, Jr., Paul Frank Lecocq, Brian Chan Monwai, Thomas Pflueger, Kevin Franklin Reick, Timothy M. Skergan, Scott Barnett Swaney
  • Patent number: 7401333
    Abstract: The present invention provides an array of parallel programmable processing engines interconnected by a switching network. At least some of the processing engines execute a thread, and at least some threads communicate with each other through communication objects either internally within one processing engine or through the network. A scheduling step of the parallel programmable processing engines is initiated by one or more events, an event being defined by a change of a state variable of a communication object. The array comprises: means for scheduling a scheduling step of the processing engines, the scheduling means comprising means for executing at least a first set of threads in parallel, means for updating state values of communications objects in response to the parallel executing step, and means for repeatedly and sequentially scheduling the executing means and the updating means until no more events occur. The present invention also provides a deterministic method of operating such an array.
    Type: Grant
    Filed: August 8, 2001
    Date of Patent: July 15, 2008
    Assignee: TranSwitch Corporation
    Inventor: Ivo Vandeweerd
  • Patent number: 7392350
    Abstract: In a multiprocessor environment, by executing cache-inhibited reads or writes to registers, a scan communication is used to rapidly access registers inside and outside a chip originating the command. Cumbersome locking of the memory location may be thus avoided. Setting of busy latches at the outset virtually eliminates the chance of collisions, and status bits are set to inform the requesting core processor that a command is done and free of error, if that is the case.
    Type: Grant
    Filed: February 10, 2005
    Date of Patent: June 24, 2008
    Assignee: International Business Machines Corporation
    Inventors: James Stephen Fields, Jr., Michael Stephen Floyd, Paul Frank Lecocq, Larry Scott Leitner, Kevin Franklin Reick
  • Publication number: 20080148010
    Abstract: The system design is facilitated by eliminating the increase in data transfer volume of the whole system. In order to facilitate the system design, there are provided an operation unit array, a memory array, a data transfer circuit, and a switch circuit. There are also provided a configuration data management unit for managing the configuration data defining the logical behaviors of the operation unit array, the memory array, the data transfer circuit, and the switch circuit, as well as a state transition management unit capable of controlling the switching of the configuration data. The data transfer circuit includes a control circuit capable of autonomously sorting the data by determining the timing of the data sorting according to the setting included in the configuration data.
    Type: Application
    Filed: December 14, 2007
    Publication date: June 19, 2008
    Inventor: Tomoyuki KODAMA
  • Patent number: 7369683
    Abstract: In an imaging device of the present invention, an imaging element 2 is driven in a thinning read-out mode for reading out signal charges from a subset of pixels, or in an all-pixels read-out mode for reading out signal charges from all pixels. When the imaging element 2 is driven in the thinning read-out mode, the imaging device processes and records a series of first image data that is obtained by reading out signal charges from the subset of pixels and that constitutes the moving images. When the imaging element 2 is driven in the all-pixels read-out mode, the imaging device processes and records a series of second image data constituting moving images after the number of pixels of the second image data is thinned, and processes and records a portion of the second image data as a still image without thinning when an instruction to pick up the still image is given while picking up the moving images.
    Type: Grant
    Filed: August 4, 2004
    Date of Patent: May 6, 2008
    Assignee: Sanyo Electric Co., Ltd.
    Inventors: Akio Kobayashi, Shigeru Miki
  • Patent number: 7356819
    Abstract: Methods, signals, devices and systems are provided for matching tasks with processing units. A region within a multi-faceted task space is allocated to a processing unit. A point in the multi-faceted task space is assigned to a task. The task is then associated with the processing unit if the region allocated to the processing unit is close to the point assigned to the task. The region allocated to a processing unit may be changed. If no assigned point for a task is sufficiently close to any allocated processing unit region, the task is suspended. Overlapping regions may be assigned to different processing units. In some implementations, the union of the allocated regions covers the task space, while in others it does not. Regions may also be allocated to wait conditions and one or more dimensions of a region may be allocated to conventional processor allocators.
    Type: Grant
    Filed: August 7, 2003
    Date of Patent: April 8, 2008
    Assignee: Novell, Inc.
    Inventors: Glenn Ricart, Del Jensen, Stephen R. Carter
  • Patent number: 7315933
    Abstract: The present invention is a re-configurable circuit capable of reducing latency by selecting a route for skipping the FF of an operation unit and outputting data to a connection destination operation unit if an accumulated process time is below an operation cycle allocated to the operation unit. The operation unit comprises at least a selector, a flip-flop and an operator. In a program for generating configuration information for switching the configuration of the operation unit of the re-configurable circuit, the selector selects the use/non-use of the flip-flop, based on the configuration information and selector switching condition is reflected in the configuration information for determining whether to take a route for transferring data inputted to the selector to the operator or a route for transferring the data to the operator skipping the flip-flop.
    Type: Grant
    Filed: October 6, 2005
    Date of Patent: January 1, 2008
    Assignee: Fujitsu Limited
    Inventor: Seiichi Nishijima
  • Patent number: 7266255
    Abstract: A multi-chip system is disclosed for distributing the convolution process. Rather than having multiple convolution chips working in parallel with each chip working on a different portion of the screen, a new design utilizes chips working in series. Each chip is responsible for a different interleaved region of screen space. Each chip performs part of the convolution process for a pixel and sends a partial result on to the next chip. The final chip completes the convolution and stores the filtered pixel. An alternate design interconnects chips in groups. The chips within a group operate in series, whereas the groups may operate in parallel.
    Type: Grant
    Filed: September 26, 2003
    Date of Patent: September 4, 2007
    Assignee: Sun Microsystems, Inc.
    Inventors: Michael A. Wasserman, Paul R. Ramsey, Nathaniel David Naegle
  • Patent number: 7176914
    Abstract: A system and method are provided for directing the flow of data and instructions into at least one functional unit. In one embodiment of a system of components defining a plurality of nodes, a queue network manager (QNM) forming a part of each node, is provided. In this embodiment, the QNM comprises an interface to a network that supports intercommunication among the plurality of nodes, an interface configured to pass messages with a functional unit within the node, a random access memory (RAM) configured to store at least one of a message and a programmable instruction, and logic configured to control an operational aspect of a functional unit based on contents of the programmable instruction.
    Type: Grant
    Filed: May 16, 2002
    Date of Patent: February 13, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Darel N. Emmot
  • Patent number: 7155466
    Abstract: An archive cluster application runs in a distributed manner across a redundant array of independent nodes. Each node preferably runs a complete archive cluster application instance. A given nodes provides a data repository, which stores up to a large amount (e.g., a terabyte) of data, while also acting as a portal that enables access to archive files. Each symmetric node has a set of software processes, e.g., a request manager, a storage manager, a metadata manager, and a policy manager. The request manager manages requests to the node for data (i.e., file data), the storage manager manages data read/write functions from a disk associated with the node, and the metadata manager facilitates metadata transactions and recovery across the distributed database. The policy manager implements one or more policies, which are operations that determine the behavior of an “archive object” within the cluster. The archive cluster application provides object-based storage.
    Type: Grant
    Filed: October 27, 2004
    Date of Patent: December 26, 2006
    Assignee: Archivas, Inc.
    Inventors: Andres Rodriguez, Jack A. Orenstein, David M. Shaw, Benjamin K. D. Bernhard
  • Patent number: 7130934
    Abstract: A variety of advantageous mechanisms for improved data transfer control within a data processing system are described. A DMA controller is described which is implemented as a multiprocessing transfer engine supporting multiple transfer controllers which may work independently or in cooperation to carry out data transfers, with each transfer controller acting as an autonomous processor, fetching and dispatching DMA instructions to multiple execution units. In particular, mechanisms for initiating and controlling the sequence of data transfers are provided, as are processes for autonomously fetching DMA instructions which are decoded sequentially but executed in parallel.
    Type: Grant
    Filed: April 7, 2005
    Date of Patent: October 31, 2006
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Edward A. Wolff
  • Patent number: 7100020
    Abstract: An integrated circuit (203) for use in processing streams of data generally and streams of packets in particular. The integrated circuit (203) includes a number of packet processors (307, 313, 303), a table look up engine (301), a queue management engine (305) and a buffer management engine (315). The packet processors (307, 313, 303) include a receive processor (421), a transmit processor (427) and a risc core processor (401), all of which are programmable. The receive processor (421) and the core processor (401) cooperate to receive and route packets being received and the core processor (401) and the transmit processor (427) cooperate to transmit packets. Routing is done by using information from the table look up engine (301) to determine a queue (215) in the queue management engine (305) which is to receive a descriptor (217) describing the received packet's payload.
    Type: Grant
    Filed: May 7, 1999
    Date of Patent: August 29, 2006
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Thomas B. Brightman, Andrew T. Brown, John F. Brown, James A. Farrell, Andrew D. Funk, David J. Husak, Edward J. McLellan, Mark A. Sankey, Paul Schmitt, Donald A. Priore
  • Patent number: 7069416
    Abstract: A single chip active memory includes a plurality of memory stripes, each coupled to a full word interface and one of a plurality of processing element (PE) sub-arrays. The large number of couplings between a PE sub-array and its associated memory stripe are managed by placing the PE sub-arrays so that their data paths run at right angle to the data paths of the plurality of memory stripes. The data lines exiting the memory stripes are run across the PE sub-arrays on one metal layer. At the appropriate locations, the data lines are coupled to another orthogonally oriented metal layer to complete the coupling between the memory stripe and its associated PE sub-array. The plurality of PE sub-arrays are mapped to form a large logical array, in which each PE is coupled to four other PEs. Physically distant PEs are coupled using current mode differential logical couplings an drivers to insure good signal integrity at high operational speeds. Each PE contains a small DRAM register array.
    Type: Grant
    Filed: June 4, 2004
    Date of Patent: June 27, 2006
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7069557
    Abstract: A virtual path feature in which several virtual channels share an assigned amount of bandwidth is implemented in a network processor. The network processor maintains a schedule indicative of respective times at which a plurality of virtual channels are to be serviced. An entry is read from the schedule. The entry corresponds to a current transmit cycle and includes a pointer to a channel descriptor for a virtual channel to be serviced in the current transmit cycle. A data cell for the virtual channel to be serviced in the current cycle is transmitted. An entry is added to the schedule to point to a channel descriptor that is pointed to by the channel descriptor for the virtual channel serviced in the current transmit cycle.
    Type: Grant
    Filed: May 23, 2002
    Date of Patent: June 27, 2006
    Assignee: International Business Machines Corporation
    Inventor: Merwin Herscher Alferness
  • Patent number: 7043562
    Abstract: Irregularities are provided in at least one dimension of a torus or mesh network for lower average path length and lower maximum channel load while increasing tolerance for omitted end-around connections. In preferred embodiments, all nodes supported on each backplane are connected in a single cycle which includes nodes on opposite sides of lower dimension tori. The cycles in adjacent backplanes hop different numbers of nodes.
    Type: Grant
    Filed: June 9, 2003
    Date of Patent: May 9, 2006
    Assignee: Avivi Systems, Inc.
    Inventors: William J. Dally, William F. Mann, Philip P. Carvey
  • Patent number: 7028107
    Abstract: A system for communication between a plurality of functional elements in a cell arrangement and a higher-level unit is described. The system may include, for example, a configuration memory arranged between the functional elements and the higher-level unit; and a control unit configured to move at least one position pointer to a configuration memory location in response to at least one event reported by a functional element. At run time, a configuration word in the configuration memory pointed to by at least one of the position pointers is transferred to the functional element in order to perform reconfiguration without the configuration word being managed by a central logic.
    Type: Grant
    Filed: October 7, 2002
    Date of Patent: April 11, 2006
    Assignee: Pact XPP Technologies AG
    Inventors: Martin Vorbach, Robert Münch
  • Patent number: 7020761
    Abstract: Processing restrictions of a computing environment are filtered and blocked, in certain circumstances, such that processing continues despite the restrictions. One restriction includes an indication that address translation is prohibited, in response to a buffer miss. When a processing unit of the computing environment is met with this restriction, it performs a comparison of page indices, which indicates whether the address translation can continue. If address translation can continue, the restriction is ignored. The processing unit includes a processor or a pageable entity, as examples.
    Type: Grant
    Filed: May 12, 2003
    Date of Patent: March 28, 2006
    Assignee: International Business Machines Corporation
    Inventors: Timothy J. Siegel, Bruce A. Wagar, Ute Gaertner, Lisa C. Heller, Erwin F. Pfeffer
  • Patent number: 6996504
    Abstract: A scalable computer architecture capable of performing fully scalable simulations includes a plurality of processing elements (PEs) and a plurality of interconnections between the PEs. In this regard, the interconnections can interconnect each processing element to each neighboring processing element located adjacent the respective processing element, and further interconnect at least one processing element to at least one other processing element located remote from the respective at least one processing element. For example, the interconnections can interconnect the plurality of processing elements according to a fractal-type method or a quenched random method. Further, the plurality of interconnections can include at least one interconnection at each length scale of the plurality of processing elements.
    Type: Grant
    Filed: November 14, 2001
    Date of Patent: February 7, 2006
    Assignee: Mississippi State University
    Inventors: Mark A. Novotny, Gyorgy Korniss
  • Patent number: 6993764
    Abstract: A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval is accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval.
    Type: Grant
    Filed: June 28, 2001
    Date of Patent: January 31, 2006
    Assignee: The Regents of the University of California
    Inventors: Fabrizio Petrini, Wu-chun Feng
  • Patent number: 6990566
    Abstract: A method and an apparatus for configuration of multiple context processing elements (MCPEs) are described. The method and an apparatus is capable of selectively transmitting data over a bidirectional shared bus network including a plurality of channels between pairs of MCPEs in the networked array. The method and an apparatus then selectively transmits a sideband bit indicating a direction in which the data is transmitted in the shared bus network.
    Type: Grant
    Filed: April 20, 2004
    Date of Patent: January 24, 2006
    Assignee: Broadcom Corporation
    Inventors: Ethan Mirsky, Robert French, Ian Eslick
  • Patent number: 6968442
    Abstract: A parallel computer of this invention includes a plurality of memory elements and a plurality of processing elements and each of the processing elements is connected to logically adjacent memory elements. For example, the processing element which corresponds to a logical position (i, j) is connected to the memory elements which correspond to a plurality of logical positions (i, j), (i, j+1), (i+1, j) and (i+1, j+1). It is preferable if each of the memory elements can be accessed from the exterior. According to this invention, efficient memory access can be made and the parallel processing can be performed at high speed without increasing the hardware amount and making the control operation complicated. Further, the operation speed of the image processing can be enhanced by constructing an image memory by use of a plurality of memory elements and causing the processing element to effect the image processing in a distributed and cooperative manner.
    Type: Grant
    Filed: June 19, 2002
    Date of Patent: November 22, 2005
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kenichi Maeda, Nobuyuki Takeda, Yasukazu Okamoto
  • Patent number: 6967950
    Abstract: In a network of digital signal processor nodes connected in a peer-to-peer relationship, a data packet sent to a node causes a return transmission from that node. The requester digital signal processor sends a data packet to a target digital signal processor. Upon arrival at the target digital signal processor, its receiver drives the arriving request packet into an I/O memory and triggers a transmitter interrupt. Next, the pull interrupt causes the transmitter to execute on a next packet boundary the pull request packet. Finally, the execution of the pull request causes the transmitter to pull a portion of the local I/O memory and send it back to the requester digital signal processor. The same physical portion of the I/O memory is overlaid with two logical uses, a receiver channel and a transmitter code block.
    Type: Grant
    Filed: July 13, 2001
    Date of Patent: November 22, 2005
    Assignee: Texas Instruments Incorporated
    Inventors: Peter Galicki, Cheryl S. Shepherd, Jonathan H. Thorn
  • Patent number: 6944747
    Abstract: A matrix data processor is implemented wherein data elements are stored in physical registers and mapped to logical registers. After being stored in the logical registers, the data elements are then treated as matrix elements. By using a series of variable matrix parameters to define the size and location of the various matrix source and destination elements, as well as the operation(s) to be performed on the matrices, the performance of digital signal processing operations can be significantly enhanced.
    Type: Grant
    Filed: December 9, 2002
    Date of Patent: September 13, 2005
    Assignee: GemTech Systems, LLC
    Inventors: Gopalan N Nair, Gouri G. Nair
  • Patent number: 6928539
    Abstract: A test monitor loaded into a multiprocessor machine comprises a program (31) designed to interpret a script language for writing tests, a program (29) that constitutes a kernel part for conducting the tests according to the scripts, and a library (30) of functions that constitutes an application program interface with the firmware of the machine 1. This monitor implements a method for executing instruction sequences simultaneously in several processors (3, 4, 5) of a multiprocessor machine (1). The method comprises a first step (8) in which a single processor operating system is booted in a first processor (2) and a second step (9) in which the first processor (1) orders at least one other processor (3) of the machine, called an application processor, to execute one or more instruction sequences (17, 18, 19) under the control of said first processor.
    Type: Grant
    Filed: May 17, 2001
    Date of Patent: August 9, 2005
    Assignee: Bull S.A.
    Inventors: Claude Brassac, Alain Vigor
  • Patent number: 6919894
    Abstract: A system is described that is broadly directed to a system of integrated circuit components. The system comprises a plurality of nodes that are interconnected by communication links. A random access memory (RAM) is connected to each node. At least one functional unit is integrated into each node, and each functional unit is configured to carry out a predetermined processing function. Finally, each RAM includes a coherency mechanism configured to permit only read access to the RAM by other nodes, the coherency mechanism further configured to permit write access to the RAM only by functional units that are local to the node.
    Type: Grant
    Filed: July 21, 2003
    Date of Patent: July 19, 2005
    Assignee: Hewlett Packard Development Company, L.P.
    Inventors: Darel N. Emmot, Byron A. Alcorn
  • Patent number: 6915388
    Abstract: A multiprocessor computer system includes a plurality of processor nodes, a memory, and an interconnect network connecting the plurality of processor nodes to the memory. The memory includes a plurality of lines and a cache coherence directory structure. The plurality of lines includes a first line. The cache coherence directory structure includes a plurality of directory structure entries. Each directory structure entry includes processor pointer information indicating the processor nodes that have cached copies of the first line. The processor pointer information includes a plurality n of bit vectors, where n is an integer greater than one. The n bit vectors define a matrix having a number of locations equal to the product of the number of bits in each of the n bit vectors.
    Type: Grant
    Filed: July 20, 2001
    Date of Patent: July 5, 2005
    Assignee: Silicon Graphics, Inc.
    Inventor: William A. Huffman
  • Patent number: 6883084
    Abstract: A reconfigurable data path processor comprises a plurality of independent processing elements. Each of the processing elements advantageously comprising an identical architecture. Each processing element comprises a plurality of data processing means for generating a potential output. Each processor is also capable of through-putting an input as a potential output with little or no processing. Each processing element comprises a conditional multiplexer having a first conditional multiplexer input, a second conditional multiplexer input and a conditional multiplexer output. A first potential output value is transmitted to the first conditional multiplexer input, and a second potential output value is transmitted to the second conditional multiplexer output. The conditional multiplexer couples either the first conditional multiplexer input or the second conditional multiplexer input to the conditional multiplexer output, according to an output control command.
    Type: Grant
    Filed: July 25, 2002
    Date of Patent: April 19, 2005
    Assignee: University of New Mexico
    Inventor: Gregory Donohoe
  • Publication number: 20040215929
    Abstract: A method of communicating between processing units on different integrated circuit chips in a multi-processor computer system by issuing a command from a source processing unit to a destination processing unit, receiving the command at the destination processing unit while the destination processing unit is processing program instructions, and accessing registers in clock-controlled components of the destination processing unit without interrupting processing of the program instructions by the destination processing unit. The access may be a read from status or mode registers of the destination processing unit, or write to control or mode registers. Many processing units can be interconnected in a ring topology, and the access command can be passed from the source processing unit through several other processing units before reaching the destination processing unit.
    Type: Application
    Filed: April 28, 2003
    Publication date: October 28, 2004
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael Stephen Floyd, Larry Scott Leitner, Kevin Franklin Reick, Kevin Dennis Woodling