Array Processor Operation Patents (Class 712/16)
  • Patent number: 6308251
    Abstract: A parallel processor apparatus capable of reducing the power consumption when converting serial data to parallel data and, at the same time, capable of improving an operating speed, wherein a data input register for converting serial data to parallel data is divided and data inputting means of a plurality of blocks are constituted and wherein detection circuits for detecting the time of input and the time of output of the pointer data in the data inputting means are provided and switch circuits for connecting the related data inputting means and a serial data input line only for a period from the time of input to the time of output of the pointer data detected by the detection circuit are provided.
    Type: Grant
    Filed: March 8, 1999
    Date of Patent: October 23, 2001
    Assignee: Sony Corporation
    Inventor: Akihiko Hashiguchi
  • Patent number: 6298430
    Abstract: A user-configurable ultra-scalar multiprocessor has a predetermined plurality of distributed configurable signal processors (DCSPs) (1) which are computational clusters that each have at least two sub microprocessors (SMs) (2) and one packet bus controller (PBC) (3) that are a unit group. The DCSPs, the SM and the PBC are connected through local network buses (6). The PBC has communication buses (7) that connect the PBC with each of the SM. The communication buses of the PBC that connect the PBC with each SM have serial chains of one hardwired connection (4) and one programmably switchable connector (5). Each communication bus between the SMs is at least one hardwired connection and two programmably switchable connectors.
    Type: Grant
    Filed: March 22, 2000
    Date of Patent: October 2, 2001
    Assignee: Context, Inc. of Delaware
    Inventor: Vladimir P. Roussakov
  • Publication number: 20010018733
    Abstract: To execute all processing in an array section of an array-type processor, each processor must execute processing of different types, i.e., processing of an operating unit and processing of a random logic circuit, which limits its size and processing performance. A data path section, including processors arranged in an array and connected via programmable switches to primarily execute processing of operation, and a state transition controller, configured to easily implement a state transition function to control state transitions, are independently disposed. These sections are configured in customized structure for respective processing purposes to efficiently implement and achieve the processing of operation and the control operation.
    Type: Application
    Filed: February 23, 2001
    Publication date: August 30, 2001
    Inventors: Taro Fujii, Masato Motomura, Koichiro Furuta
  • Patent number: 6279045
    Abstract: An integrated circuit architecture for multimedia processing. A single integrated circuit (IC) operates as a system or subsystem, and is adaptable to processing a variety of multimedia algorithms, whether proprietary or open. Hard macros, either analog or digital, can be incorporated. The IC can also contain audio/video CODECs to suit different standards, as well as other peripheral devices which may be required for multimedia applications. An electronic component (e.g., integrated circuit) incorporating the technique is suitably included in a system or subsystem having electrical functionality, such as general purpose computers, telecommunications devices, and the like.
    Type: Grant
    Filed: October 5, 1998
    Date of Patent: August 21, 2001
    Assignee: Kawasaki Steel Corporation
    Inventors: Kumaraguru Muthujumaraswathy, Michael D. Rostoker
  • Patent number: 6275890
    Abstract: The present invention provides a cross-bar switch which includes a plurality of master bus ports, the master bus ports adapted to receive a plurality of master buses; a plurality of slave bus ports, the slave bus ports adapted to receive a plurality of slave buses; a switching means for selectively coupling the plurality of master bus ports to the plurality of slave bus ports; and a configuration means for prioritizing access requests by the plurality of master buses to the plurality of slave buses via the switching means. The cross-bar switch of the present invention has the capability of prioritizing requests between multiple parallel high speed buses. In a preferred embodiment, this arbitration is accomplished through Configuration Registers on the cross-bar switch. The Configuration Registers are programmable through the Device Control Register bus, which allows the cross-bar switch to be dynamically programmed and changed by a processor in a larger system.
    Type: Grant
    Filed: August 19, 1998
    Date of Patent: August 14, 2001
    Assignee: International Business Machines Corporation
    Inventors: William Robert Lee, David Wallach
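Illustration for patent 6275890: the prioritized arbitration described in the abstract can be sketched roughly as below. This is a minimal model and not the patented circuit. It assumes one request per master per cycle and models the Configuration Registers as a plain priority list; the names `arbitrate`, `requests`, and `priority` are hypothetical.

```python
def arbitrate(requests, priority):
    """Grant each requested slave bus to the highest-priority requesting master.

    requests: dict mapping master name -> slave name it wants this cycle.
    priority: list of master names, highest priority first (stands in for
              the programmable configuration registers; an assumption).
    Returns a dict mapping slave name -> master name that was granted.
    """
    grants = {}
    for master in priority:
        slave = requests.get(master)
        # A slave port is granted once per cycle; later (lower-priority)
        # masters asking for the same slave must wait.
        if slave is not None and slave not in grants:
            grants[slave] = master
    return grants


if __name__ == "__main__":
    requests = {"M0": "S1", "M1": "S1", "M2": "S0"}
    print(arbitrate(requests, ["M0", "M1", "M2"]))
    # "Reprogramming" the priority changes which master wins slave S1.
    print(arbitrate(requests, ["M1", "M0", "M2"]))
```

Because masters that target different slaves never conflict, several transfers can be granted in the same cycle, which is the usual motivation for a crossbar over a single shared bus.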
  • Publication number: 20010011342
    Abstract: A reconfigurable register file integrated in an instruction set architecture capable of extended precision operations, and also capable of parallel operation on lower precision data is described. A register file is composed of two separate files with each half containing half as many registers as the original. The halves are designated even or odd by virtue of the register addresses which they contain. Single width and double width operands are optimally supported without increasing the register file size and without increasing the number of register file ports. Separate extended registers are also employed to provide extended precision for operations such as multiply-accumulate operations.
    Type: Application
    Filed: February 28, 2001
    Publication date: August 2, 2001
    Inventors: Gerald G. Pechanek, Edwin F. Barry
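Illustration for publication 20010011342: a rough software model of a register file split into even and odd halves, where a double-width operand occupies an even/odd register pair. The class name, the 32-bit width, and the choice of placing the low word in the even register are assumptions made for this sketch; the abstract does not specify them.

```python
class SplitRegisterFile:
    """Register file built from two banks selected by address parity."""

    def __init__(self, num_registers=32, width=32):
        if num_registers % 2:
            raise ValueError("num_registers must be even")
        self.width = width
        self.mask = (1 << width) - 1
        self.even = [0] * (num_registers // 2)  # registers 0, 2, 4, ...
        self.odd = [0] * (num_registers // 2)   # registers 1, 3, 5, ...

    def _bank(self, addr):
        return self.odd if addr & 1 else self.even

    def write(self, addr, value):
        """Single-width write into whichever bank holds this address."""
        self._bank(addr)[addr >> 1] = value & self.mask

    def read(self, addr):
        """Single-width read."""
        return self._bank(addr)[addr >> 1]

    def write_double(self, even_addr, value):
        """Double-width write across an even/odd pair (low word in even: assumed)."""
        if even_addr & 1:
            raise ValueError("double-width access needs an even address")
        index = even_addr >> 1
        self.even[index] = value & self.mask
        self.odd[index] = (value >> self.width) & self.mask

    def read_double(self, even_addr):
        """Double-width read: one access to each bank, combined."""
        if even_addr & 1:
            raise ValueError("double-width access needs an even address")
        index = even_addr >> 1
        return (self.odd[index] << self.width) | self.even[index]
```

In this model a double-width access touches each bank once, which mirrors the abstract's point that wider operands are supported without adding registers or ports.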
  • Patent number: 6230252
    Abstract: A scalable multiprocessor system includes processing element nodes. A scalable interconnect network includes physical communication links interconnecting the processing element nodes in an n-dimensional topology, and routers for routing messages between the processing element nodes on the physical communication links. The routers are capable of routing messages in hypercube topologies of at least up to six dimensions, and further capable of routing messages in at least one n dimensional torus topology having at least one of the n dimensions having a radix greater than four, such as a 4×8×4 torus topology.
    Type: Grant
    Filed: November 17, 1997
    Date of Patent: May 8, 2001
    Assignee: Silicon Graphics, Inc.
    Inventors: Randal S. Passint, Greg Thorson, Michael B. Galles
  • Patent number: 6223239
    Abstract: A multiple use core logic chipset is provided in a computer system that may be configured either as a bridge between an accelerated graphics port (“AGP”) bus and host and memory buses, or as a bridge between a system area network interface and the host bus and the system memory bus. The function of the multiple use chipset is determined at the time of manufacture of the computer system, or in the field whether an AGP bus bridge or a system area network interface is to be implemented. Selection of the type of bus bridge (AGP or system area network interface) in the multiple use core logic chipset may be implemented by a hardware signal input, or by software during computer system configuration or power on self test (“POST”). Software configuration may also be determined upon detection of either an AGP device or a system area network interface connected to the core logic chipset.
    Type: Grant
    Filed: August 12, 1998
    Date of Patent: April 24, 2001
    Assignee: Compaq Computer Corporation
    Inventor: Sompong Paul Olarig
  • Patent number: 6212628
    Abstract: An apparatus for processing data has a Single-Instruction-Multiple-Data (SIMD) architecture, and a number of features that improve performance and programmability. The apparatus includes a rectangular array of processing elements and a controller. In one aspect, each of the processing elements includes one or more addressable storage means and other elements arranged in a pipelined architecture. The controller includes means for receiving a high level instruction, and converting each instruction into a sequence of one or more processing element microinstructions for simultaneously controlling each stage of the processing element pipeline. In doing so, the controller detects and resolves a number of resource conflicts, and automatically generates instructions for registering image operands that are skewed with respect to one another in the processing element array.
    Type: Grant
    Filed: April 9, 1998
    Date of Patent: April 3, 2001
    Assignee: TeraNex, Inc.
    Inventors: Andrew P. Abercrombie, David A. Duncan, Woodrow Meeker, Michele D. Van Dyke-Lewis
  • Patent number: 6208362
    Abstract: In an image processing system or method, an image element memorizing device memorizes image elements which are image data that are subjects of process. An image element processing state memorizing device memorizes present processing states of the image elements in the image element memorizing device. A detecting device detects, in response to the present processing states, a pointer of one of the image elements that is capable of being processed by the image processing system. A temporary pointer memorizing device memorizes the pointer from the detecting device. A calculating device reads the pointer from the temporary pointer memorizing device to process an image in response to the image element of the pointer which is read.
    Type: Grant
    Filed: August 27, 1997
    Date of Patent: March 27, 2001
    Assignee: NEC Corporation
    Inventor: Sholin Kyo
  • Patent number: 6205532
    Abstract: A module connection assembly connects modules in a torus configuration that can be changed remotely. In particular, a single module can be added to or deleted from the configuration by remotely switching from conducting paths that provide end-around electrical paths to conducting paths that provide pass-through electrical paths. The assembly includes two backplanes, a first set of module connectors for electrically connecting modules to one of the backplanes, and a second set of module connectors for electrically connecting modules to the other backplane. The assembly further includes configuration controllers. Each configuration controller selects between end-around electrical paths that electrically connect multiple module connectors of the first set to each other, and pass-through electrical paths that electrically connect module connectors of the first set to module connectors of the second set.
    Type: Grant
    Filed: May 22, 1998
    Date of Patent: March 20, 2001
    Assignee: Avici Systems, Inc.
    Inventors: Philip P. Carvey, William J. Dally, Larry R. Dennison
  • Patent number: 6192384
    Abstract: A processor particularly useful in multimedia applications such as image processing is based on a stream programming model and has a tiered storage architecture to minimize global bandwidth requirements. The processor has a stream register file through which the processor's functional units transfer streams to execute processor operations. Load and store instructions transfer streams between the stream register file and a stream memory; send and receive instructions transfer streams between stream register files of different processors; and operate instructions pass streams between the stream register file and computational kernels. Each of the computational kernels is capable of performing compound vector operations. A compound vector operation performs a sequence of arithmetic operations on data read from the stream register file, i.e., a global storage resource, and generates a result that is written back to the stream register file.
    Type: Grant
    Filed: September 14, 1998
    Date of Patent: February 20, 2001
    Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of Technology
    Inventors: William J. Dally, Scott Whitney Rixner, Jeffrey P. Grossman, Christopher James Buehler
  • Patent number: 6144982
    Abstract: An apparatus for tracking pipeline resources of a processor involves fetching selected ones of the coded instructions and marking the fetched instructions with instruction metadata. The instruction metadata indicates a number of pipeline resources required by each instruction. The marked instructions are issued from the fetch unit and, using the instruction metadata, a count of a number of resources committed to issued instructions in the execution pipelines is maintained. When it is determined that the number of resources committed to issued instructions exceeds a preselected maximum, instructions are prevented from issuing from the fetch unit. As each instruction is retired, the instruction metadata is used to determine a number of resources released by retirement of the issued instruction.
    Type: Grant
    Filed: June 25, 1997
    Date of Patent: November 7, 2000
    Assignee: Sun Microsystems, Inc.
    Inventor: Ramesh Panwar
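Illustration for patent 6144982: the resource-counting idea in the abstract can be sketched as a simple counter that is incremented on issue and decremented on retirement. This is a simplified model, not the patented apparatus. It assumes issue is refused when the new total would exceed the maximum (the abstract only says issue stops once the maximum is exceeded); `ResourceTracker` and its methods are hypothetical names.

```python
class ResourceTracker:
    """Count pipeline resources committed to issued, not yet retired instructions."""

    def __init__(self, max_resources):
        self.max_resources = max_resources
        self.committed = 0

    def issue(self, required):
        """Try to issue an instruction whose metadata says it needs `required` resources.

        Returns True if issued, False if the fetch unit must stall.
        """
        if self.committed + required > self.max_resources:
            return False
        self.committed += required
        return True

    def retire(self, required):
        """Release the resources recorded in the retiring instruction's metadata."""
        if required > self.committed:
            raise ValueError("retiring more resources than are committed")
        self.committed -= required


if __name__ == "__main__":
    tracker = ResourceTracker(max_resources=4)
    print(tracker.issue(3))   # fits within the limit
    print(tracker.issue(2))   # would exceed the limit, so the fetch unit stalls
    tracker.retire(3)
    print(tracker.issue(2))   # fits again after retirement
```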
  • Patent number: 6141422
    Abstract: A system for performing high speed exponentiation in a secure environment. The system includes an interface for receiving encrypted data sent from a host system, a plurality of exponentiators capable of operating concurrently, an encryptor decrypting data received from a host system and encrypting data produced from the exponentiators, and logic circuitry for selecting an available and properly functioning exponentiator to perform exponentiation on the received data.
    Type: Grant
    Filed: June 4, 1997
    Date of Patent: October 31, 2000
    Assignee: Philips Electronics North America Corporation
    Inventors: Charles Robert Rimpo, John Charles Ciccone, Yongyut Yuenyongsgool
  • Patent number: 6128719
    Abstract: An interconnection network used for a multiprocessor system. An indirect n-dimensional rotator graph network having a transmission path of arbitrary nodes in a multiprocessor system including n! nodes includes n! input ports, n! output ports, a first stage switch module including n! demultiplexers, second through (n-1)th stage switch modules each having n! n×n crossbar switches, and an nth stage switch module including n! multiplexers, in which the switches or the demultiplexers composing switch modules of first to (n-1)th stages comprise n generators g_1, g_2, ..., g_n, the g_1 is connected to a switch or multiplexer of a later stage having an identifier identical to that of a demultiplexer or switch, to which the g_1 is included, and the g_i (2 ≤ i ≤
    Type: Grant
    Filed: September 21, 1998
    Date of Patent: October 3, 2000
    Assignee: SamSung Electronics Co., Ltd.
    Inventor: Seong-dong Kim
  • Patent number: 6122719
    Abstract: A method and an apparatus for retiming in a network of multiple context processing elements are provided. A programmable delay element is configured to programmably delay signals between a number of multiple context processing elements of an array without requiring a multiple context processing element to implement the delay. The output of a first multiple context processing element is coupled to a first multiplexer and to the input of a number of serially connected delay registers. The output of each of the serially connected delay registers is coupled to the input of a second multiplexer. The output of the second multiplexer is coupled to the input of the first multiplexer, and the output of the first multiplexer is coupled to a second multiple context processing element. The first and second multiplexers are provided with at least one set of data representative of at least one configuration memory context of a multiple context processing element.
    Type: Grant
    Filed: October 31, 1997
    Date of Patent: September 19, 2000
    Assignee: Silicon Spice
    Inventors: Ethan Mirsky, Robert French, Ian Eslick
  • Patent number: 6108760
    Abstract: A method and an apparatus for position independent reconfiguration in a network of multiple context processing elements are provided. Each multiple context processing element in a networked array of multiple context processing elements has an assigned physical identification. Virtual identifications may also be assigned to a number of the multiple context processing elements. Data is transmitted to at least one of the multiple context processing elements of the array, the data comprising control data, configuration data, an address mask, and a destination identification. The transmitted address mask is applied to either the physical or virtual identification and to a destination identification. The masked physical or virtual identification is compared to the masked destination identification.
    Type: Grant
    Filed: October 31, 1997
    Date of Patent: August 22, 2000
    Assignee: Silicon Spice
    Inventors: Ethan Mirsky, Robert French, Ian Eslick
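Illustration for patent 6108760: the masked identification comparison described in the abstract can be sketched in a few lines. The sketch assumes identifications are small integers and that a 1 bit in the mask means "compare this bit"; the abstract does not state the mask polarity, and the function names are hypothetical.

```python
def matches(element_id, destination_id, address_mask):
    """True when the masked element ID equals the masked destination ID."""
    return (element_id & address_mask) == (destination_id & address_mask)


def select_targets(element_ids, destination_id, address_mask):
    """Return the IDs of all elements that would accept the transmitted data."""
    return [eid for eid in element_ids
            if matches(eid, destination_id, address_mask)]


if __name__ == "__main__":
    ids = list(range(8))                       # eight elements with 3-bit IDs
    print(select_targets(ids, 0b101, 0b111))   # full mask: a single element
    print(select_targets(ids, 0b101, 0b011))   # top bit ignored: a pair of elements
    print(select_targets(ids, 0b101, 0b000))   # empty mask: broadcast to all
```

Clearing mask bits widens the match, so one transmission can address a single element, a group, or the whole array without changing the destination field.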
  • Patent number: 6092174
    Abstract: A dynamically reconfigurable distributed integrated circuit processor has at least one two-layer matrix in which a first layer has operative microcomputer modules (1) with local memory (2) grouped in computational clusters (5) and a second layer has a network of global communications connecting buses (7, 8) with packet decoders in coherence with the first layer. All components of the basic operating units are micro programmable and in universal communication selectively throughout separate operative microcomputer modules and throughout the computational clusters. Electrical conductivity of components is variable for select speed, timing and factors. A use method is described.
    Type: Grant
    Filed: June 1, 1998
    Date of Patent: July 18, 2000
    Assignee: Context, Inc.
    Inventor: Vladimir P. Roussakov
  • Patent number: 6085304
    Abstract: A memory-like I/O system is provided for interfacing a processing element array with a host system. The I/O system includes cornerturn logic for converting data written to the processing element array from horizontal format to vertical format and for converting data read from the processing element array from vertical format to horizontal format. Addressable interface memory is provided and includes a first bank for receiving and storing data which has been output from the cornerturn logic and for outputting that data for delivery to the processing element array. The addressable interface memory includes a second bank for receiving and storing data which has been output from the processing element array and for outputting that data for delivery to the cornerturn logic. The interface of the invention can provide support for concurrent I/O and processing, thereby allowing processing and I/O operations to proceed in parallel.
    Type: Grant
    Filed: November 28, 1997
    Date of Patent: July 4, 2000
    Assignee: TeraNex, Inc.
    Inventors: Carl Morris, Kevin Dennis
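Illustration for patent 6085304: a corner turn can be pictured as a bit-matrix transpose. The sketch below assumes "horizontal format" means one word per data element and "vertical format" means one bit-plane per bit position, which is the common convention for bit-serial processing arrays but is not spelled out in the abstract. The function names are hypothetical.

```python
def corner_turn(words, bits):
    """Convert word-oriented data into bit-planes.

    words: list of non-negative ints, one per data element.
    bits:  number of bits per element.
    Returns a list of `bits` ints; bit i of plane k equals bit k of words[i].
    """
    planes = []
    for k in range(bits):
        plane = 0
        for i, word in enumerate(words):
            plane |= ((word >> k) & 1) << i
        planes.append(plane)
    return planes


def corner_turn_back(planes, count):
    """Inverse conversion: rebuild `count` words from the bit-planes."""
    words = []
    for i in range(count):
        word = 0
        for k, plane in enumerate(planes):
            word |= ((plane >> i) & 1) << k
        words.append(word)
    return words


if __name__ == "__main__":
    data = [0b01, 0b10, 0b11]
    planes = corner_turn(data, bits=2)
    print([bin(p) for p in planes])
    print(corner_turn_back(planes, count=len(data)) == data)
```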
  • Patent number: 6085303
    Abstract: Improved method and apparatus for facilitating barrier and eureka synchronization in a massively parallel processing system. The present barrier/eureka synchronization mechanism provides a partitionable, low-latency, immediately reusable, robust mechanism which can operate on a physical data-communications network and can be used to alert all processor entities (PEs) in a partition when all of the PEs in that partition have reached a designated barrier point in their individual program code, or when any one of the PEs in that partition has reached a designated eureka point in its individual program code, or when either the barrier or eureka requirements have been satisfied, whichever comes first. Multiple overlapping synchronization partitions are available simultaneously through the use of a plurality of parallel synchronization contexts.
    Type: Grant
    Filed: November 17, 1997
    Date of Patent: July 4, 2000
    Assignee: Cray Research, Inc.
    Inventors: Greg Thorson, Randal S. Passint, Steven L. Scott
  • Patent number: 6079008
    Abstract: A parallel processing system or processor has a computing architecture including a plurality of execution units to repeatedly distribute instruction streams within the processor via corresponding buses, and a series of processing units to access the buses and selectively execute the distributed instruction streams. The execution units each retrieve an instruction stream from an associated memory and place the instruction stream on a corresponding bus, while the processing units individually may select and execute any instruction stream placed on the corresponding buses. The processing units autonomously execute conditional instructions (e.g., IF/ENDIF instructions, conditional looping instructions, etc.), whereby an enable flag within the processing unit is utilized to indicate occurrence of conditions specified within a conditional instruction and control selective execution of instructions in response to occurrence of those conditions.
    Type: Grant
    Filed: April 3, 1998
    Date of Patent: June 20, 2000
    Assignee: Patton Electronics Co.
    Inventor: William B. Clery, III
  • Patent number: 6067609
    Abstract: An apparatus for processing data has a Single-Instruction-Multiple-Data (SIMD) architecture, and a number of features that improve performance and programmability. The apparatus includes a rectangular array of processing elements and a controller. The apparatus offers a number of techniques for shifting image data within the array. A first technique, the ROLL option, simultaneously shifts image planes in opposite directions within the array. A second technique, the gated shift option, makes a normal shift of an image plane to neighboring PEs conditional, for each PE, upon a value stored in a mask register of each PE. A third technique, the carry propagate option, combines the computations from multiple PEs in order to complete an n-bit operation in fewer than n clocks by forming "supercells" within the array. The apparatus also includes a multi-bit X Pattern register and a multi-bit Y Pattern register.
    Type: Grant
    Filed: April 9, 1998
    Date of Patent: May 23, 2000
    Assignee: TeraNex, Inc.
    Inventors: Woodrow L. Meeker, Andrew P. Abercrombie
  • Patent number: 6038688
    Abstract: A node disjoint path forming method for a hypercube having a damaged node which is capable of using unused nodes (surplus nodes) in an n-number of node disjoint paths each having a length of n with respect to n-dimensional hypercubes more than 4-cube, so that it is possible to obtain an n-number of node disjoint paths each having a length of n even though there are damaged nodes. The method includes the steps of a first step for forming a linear arrangement consisting of an n-number of integers (0, 1, 2, . . .
    Type: Grant
    Filed: January 15, 1998
    Date of Patent: March 14, 2000
    Assignee: Electronics and Telecommunications Research Institute
    Inventor: Ki Song Yoon
  • Patent number: 6035374
    Abstract: A method of executing coded instructions in a dynamically configurable multiprocessor having shared execution resources including steps of placing a first processor in an active state upon booting of the multiprocessor. In response to a processor create command, a second processor is placed in an active state. When either the first or second processor encounters a cache miss that has to be serviced by off-chip cache, the processor requiring service is placed in a nap state in which instruction fetching for that processor is disabled. When either the first or second processor encounters a cache miss that has to be serviced by main memory, the processor requiring service is placed in a sleep state by flushing all instructions from the processor in the sleep state and disabling instruction fetching for the processor in the sleep state.
    Type: Grant
    Filed: June 25, 1997
    Date of Patent: March 7, 2000
    Assignee: Sun Microsystems, Inc.
    Inventors: Ramesh Panwar, Joseph I. Chamdani
  • Patent number: 6021453
    Abstract: A novel architecture is based on a general purpose microcomputer with an "upstream" bus and a "downstream" bus. The upstream bus interfaces to an integrated multiport RAM that is shared between an upstream processor and the local processor, and possesses both upstream and local (downstream) interrupts associated with dedicated locations in RAM. The upstream bus can be operated in two modes, a standard (EISA) PC bus MASTER mode in which the dual port RAM is compatible with an IBM PC bus and a SLAVE mode in which the upstream bus is compatible with the downstream bus. An indefinitely long chain of such processors can be initialized by one host. Orthogonal channels (decoupled from the main upstream/downstream bus) can be used to achieve unique functionality based on host control of arrays of such processors.
    Type: Grant
    Filed: September 9, 1997
    Date of Patent: February 1, 2000
    Inventor: Edwin E. Klingman
  • Patent number: 5991866
    Abstract: A system and method for generating a program to enable reassignment of data items among processors in a massively-parallel computer to effect a predetermined rearrangement of address bits. The computer has a plurality of processing elements, each including a memory. Each memory includes a plurality of storage locations for storing a data item, each storage location within the computer being identified by an address, comprising a plurality of address bits having a global portion comprising a processing element identification portion and a local portion identifying the storage location within the memory of the particular processing element. The system generates a program to facilitate use of a predetermined set of tools to effect a reassignment of data items among processing elements and storage location to, in turn, effect a predetermined rearrangement of address bits. The system includes a global processing portion and a local processing portion.
    Type: Grant
    Filed: June 7, 1994
    Date of Patent: November 23, 1999
    Assignee: TM Patents, LP
    Inventors: Steven K. Heller, Andrew Shaw
  • Patent number: 5991867
    Abstract: A transmit scheduler and method of operation are provided for an asynchronous transfer mode network. The transmit scheduler is operable to write data to and read data from a scheduler table and a virtual channel identifier ("VCI") table in order to schedule cells for virtual channels. The transmit scheduler calculates a location in the scheduler table in which to schedule a cell for a current virtual channel and determines whether a cell for a prior virtual channel is scheduled in the calculated location in the scheduler table. The transmit scheduler then schedules the cell for the current virtual channel at the calculated location in the scheduler table. If a cell for a prior virtual channel was scheduled in the calculated location in the scheduler table, the transmit scheduler writes a pointer into a next pointer field of a record for the current virtual channel in the VCI table, where the pointer provides a link to a record for the prior virtual channel in the VCI table.
    Type: Grant
    Filed: September 12, 1996
    Date of Patent: November 23, 1999
    Assignee: Efficient Networks, Inc.
    Inventor: Klaus S. Fosmark
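Illustration for patent 5991867: the collision handling in the abstract (a newly scheduled virtual channel takes the slot and links to the channel that was already there) behaves like head insertion into a per-slot linked list. The sketch below is a rough model under stated assumptions: the slot is passed in by the caller because the abstract does not say how the location is calculated, each virtual channel has at most one cell scheduled at a time, and all class and method names are hypothetical.

```python
class TransmitScheduler:
    """Scheduler table plus a VCI table holding 'next pointer' links."""

    def __init__(self, table_size):
        self.slots = [None] * table_size  # scheduler table: head VCI per slot
        self.next_pointer = {}            # VCI table: vci -> next vci in the same slot

    def schedule(self, vci, slot):
        """Schedule a cell for `vci` at `slot`, chaining to any prior occupant."""
        slot %= len(self.slots)
        prior = self.slots[slot]          # None when the slot was empty
        self.slots[slot] = vci
        self.next_pointer[vci] = prior    # link the current VC to the prior VC

    def drain(self, slot):
        """Return the VCIs scheduled in `slot`, newest first, and clear the slot."""
        slot %= len(self.slots)
        chain = []
        vci = self.slots[slot]
        while vci is not None:
            chain.append(vci)
            vci = self.next_pointer.pop(vci, None)
        self.slots[slot] = None
        return chain


if __name__ == "__main__":
    sched = TransmitScheduler(table_size=4)
    sched.schedule(vci=10, slot=1)
    sched.schedule(vci=20, slot=1)   # collides with VCI 10, which stays linked
    sched.schedule(vci=30, slot=2)
    print(sched.drain(1))
    print(sched.drain(2))
```

Keeping the links in the VCI table means the scheduler table needs only one entry per slot, however many channels collide there.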
  • Patent number: 5944811
    Abstract: In a superscalar processor for fetching a prescribed peak number of instructions in parallel in each period until such instructions are fetched to a predetermined peak number, such as ten, an instruction parallel issue and execution administrating device comprises a forward map buffer for a forward map indicative of a result of each instruction for use as an operand by which one of other instructions of the predetermined peak number. The forward map is developed before the result is actually produced and is used, after the actual production, to indicate which one of such results should be used as the operand by the above-mentioned one of the other instructions.
    Type: Grant
    Filed: August 29, 1997
    Date of Patent: August 31, 1999
    Assignee: NEC Corporation
    Inventor: Masato Motomura
  • Patent number: 5935230
    Abstract: At least two clusters of CPUs are present in a multiprocessor computer system. Each CPU cluster has a given number of CPUs, each CPU having an associated ID such as an ID number. An additional ID number, not associated with a CPU in the same cluster, is associated with the opposite CPU cluster that appears to the original cluster as a "phantom" processor. A round-robin bus arbitration scheme allows ordered ownership of a common bus within a first cluster until the ID reaches the "phantom" processor, at which time bus ownership passes to a CPU in the second cluster. This arrangement is preferably symmetric, so that when a CPU from the first cluster requests ownership of the bus, it is granted bus ownership by virtue of the first cluster's appearance to the second cluster as a "phantom" CPU.
    Type: Grant
    Filed: July 9, 1997
    Date of Patent: August 10, 1999
    Assignee: Amiga Development, LLC
    Inventors: Felix Pinai, Manhtien Phan
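Illustration for patent 5935230: the "phantom" processor scheme can be modeled as two round-robin pointers, each with one extra ID that stands for the other cluster. The sketch assumes two clusters of equal size, that every CPU requests the bus continuously, and that the phantom takes the highest ID in each cluster; none of these details come from the abstract, and `bus_owner_sequence` is a hypothetical name.

```python
def bus_owner_sequence(cpus_per_cluster, steps):
    """Return the first `steps` bus owners as (cluster, cpu_id) tuples."""
    if cpus_per_cluster < 1:
        raise ValueError("each cluster needs at least one CPU")
    phantom = cpus_per_cluster   # extra ID that represents the other cluster
    next_id = [0, 0]             # round-robin pointer for each cluster
    cluster = 0                  # cluster that currently controls the bus
    owners = []
    while len(owners) < steps:
        current = next_id[cluster]
        next_id[cluster] = (current + 1) % (cpus_per_cluster + 1)
        if current == phantom:
            # The turn of the phantom CPU: ownership passes to the other cluster.
            cluster = 1 - cluster
        else:
            owners.append((cluster, current))
    return owners


if __name__ == "__main__":
    for owner in bus_owner_sequence(cpus_per_cluster=2, steps=6):
        print(owner)
```

Each cluster only ever runs an ordinary round-robin over its own IDs plus one more, so neither arbiter needs to know how many CPUs the other cluster contains.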
  • Patent number: RE36954
    Abstract: In a parallel computer system using a SIMD method constituted by a controller and a plurality of processor elements, each of the processor elements has a storage unit to store data to be processed, the controller controls operation of the processor elements, and the parallel computer system performs processing of the data based on a calculation control signal transmitted from the controller. The parallel computer system further includes a data collection unit connected between the processor elements and the controller for receiving output data from the processor elements, performing a predetermined calculation, and outputting calculated data to the controller; and a calculation control unit connected between the data collection unit and the controller for transmitting the calculation control signal from the controller to the data calculation unit to make it possible to perform the predetermined calculation in the data collection circuit.
    Type: Grant
    Filed: July 19, 1995
    Date of Patent: November 14, 2000
    Assignee: Fujitsu Ltd.
    Inventors: Tatsuya Shindo, Kaoru Kawamura, Masanobu Umeda, Toshiyuki Shibuya, Hideki Miwatari