Application Specific Patents (Class 712/17)
  • Patent number: 6477697
    Abstract: An automated processor design tool uses a description of customized processor instruction set extensions in a standardized language to develop a configurable definition of a target instruction set, a Hardware Description Language description of circuitry necessary to implement the instruction set, and development tools such as a compiler, assembler, debugger and simulator which can be used to develop applications for the processor and to verify it. The standardized language is capable of handling instruction set extensions which modify processor state or use configurable processors. By providing a constrained domain of extensions and optimizations, the process can be automated to a high degree, thereby facilitating fast and reliable development.
    Type: Grant
    Filed: May 28, 1999
    Date of Patent: November 5, 2002
    Assignee: Tensilica, Inc.
    Inventors: Earl A. Killian, Richard Ruddell, Albert Ren-Rui Wang
  • Patent number: 6366998
    Abstract: The present invention generally relates to a hybrid VLIW-SIMD programming model for a digital signal processor. The hybrid programming model broadcasts a packet of information to a plurality of functional units or processing elements. Each packet contains several instructions having certain characteristics, such as instruction type and instruction length, among others. The hybrid programming model includes functional units which are reconfigurable based upon the instructions within an instruction packet and the availability of the functional units. The model groups the functional units such that the operations specified in the instructions can be efficiently executed and selects which functional units should be utilized for a given operation.
    Type: Grant
    Filed: October 14, 1998
    Date of Patent: April 2, 2002
    Assignee: Conexant Systems, Inc.
    Inventor: Moataz A. Mohamed
  • Patent number: 6366997
    Abstract: Processing element to processing element switch connection control is described using a receive model that precludes communication hazards from occurring in a synchronous MIMD mode of operation. Such control allows different communication topologies and various processing effects such as an array transpose, hypercomplement or the like to be efficiently achieved utilizing architectures, such as the manifold array processing architecture. An encoded instruction method reduces the amount of state information and setup burden on the programmer taking advantage of the recognition that the majority of algorithms will use only a small fraction of all possible mux settings available. Thus, by means of transforming the PE identification based upon a communication path specified by a PE communication instruction an efficient switch control mechanism can be used.
    Type: Grant
    Filed: August 29, 2000
    Date of Patent: April 2, 2002
    Assignee: BOPS, Inc.
    Inventors: Edwin F. Barry, Gerald G. Pechanek, Thomas L. Drabenstott, Edward A. Wolff, Nikos P. Pitsianis, Grayson Morris
  • Patent number: 6351798
    Abstract: The present invention provides an address resolution method for use in a multiprocessor system with distributed shared memory. The method allows users to change a memory configuration and a system configuration to increase system operation flexibility and to isolate errors. A cell controller indexes into an address resolution table using the high-order part of a processor-specified address. A write protection flag specifies whether to permit write access from other cells. An attempt to write-access a cell inhibited for write access causes a logical circuit to output an access exception signal.
    Type: Grant
    Filed: June 15, 1999
    Date of Patent: February 26, 2002
    Assignee: NEC Corporation
    Inventor: Fumio Aono
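The lookup described in the abstract for patent 6351798 can be illustrated with a minimal sketch. The table layout, the `OFFSET_BITS` width, the `base` field, and the names `Entry`, `resolve`, and `AccessException` are assumptions made for illustration; the abstract only states that the high-order address bits index the table and that a write protection flag gates writes from other cells.

```python
from dataclasses import dataclass

OFFSET_BITS = 28  # assumed split point; the abstract does not give a width


@dataclass
class Entry:
    cell: int            # cell that owns this address region
    base: int            # assumed cell-local base address of the region
    write_protect: bool  # True: writes from other cells are refused


class AccessException(Exception):
    """Software stand-in for the access exception signal."""


def resolve(table, address, requester_cell, is_write):
    # Index the table with the high-order part of the address.
    index = address >> OFFSET_BITS
    offset = address & ((1 << OFFSET_BITS) - 1)
    entry = table[index]
    # Refuse a write from another cell when the region is write-protected.
    if is_write and entry.write_protect and requester_cell != entry.cell:
        raise AccessException(f"cell {requester_cell} may not write cell {entry.cell}")
    return entry.cell, entry.base + offset


table = [
    Entry(cell=0, base=0x000, write_protect=False),
    Entry(cell=1, base=0x100, write_protect=True),
]
address = (1 << OFFSET_BITS) | 0x10
print(resolve(table, address, requester_cell=0, is_write=False))  # (1, 272)
```

Because the mapping lives in a table, a hypothetical administrator could change the memory configuration by rewriting entries rather than rewiring address decoding.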
  • Patent number: 6324638
    Abstract: A processor capable of executing vector instructions includes at least an instruction sequencing unit and a vector processing unit that receives vector instructions to be executed from the instruction sequencing unit. The vector processing unit includes a plurality of multiply structures, each containing only a single multiply array, that each correspond to at least one element of a vector input operand. Utilizing the single multiply array, each of the plurality of multiply structures is capable of performing a multiplication operation on one element of a vector input operand and is also capable of performing a multiplication operation on multiple elements of a vector input operand concurrently. In an embodiment in which the maximum length of an element of a vector input operand is N bits, each of the plurality of multiply arrays can handle both N by N bit integer multiplication and M by M bit integer multiplication, where N is a non-unitary integer multiple of M.
    Type: Grant
    Filed: March 31, 1999
    Date of Patent: November 27, 2001
    Assignee: International Business Machines Corporation
    Inventors: Thomas Elmer, Michael Putrino
  • Patent number: 6308250
    Abstract: A method and system for operating a computing system having multiple processing units. According to a new machine instruction, called the iota instruction, the computing system operates on a vector of mask bits to generate an iota vector having a sequence of values. In one form, each value of the iota vector is a sum of a series of the lower order mask bits up to and including the mask bit corresponding to the entry in the iota vector. In another form, each entry in the iota vector is a sum of a series of lower order mask bits but does not include the mask bit corresponding to the particular entry in the iota vector. In order to calculate the iota vector, the multiple processing units of the present invention communicate the mask bits to the other processing units. Advantages of the present invention include the vectorization of software loops having certain data hazards that prevented conventional compilers from vectorizing the software.
    Type: Grant
    Filed: June 23, 1998
    Date of Patent: October 23, 2001
    Assignee: Silicon Graphics, Inc.
    Inventor: Peter Michael Klausler
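The two forms of the iota vector described in the abstract for patent 6308250 correspond to inclusive and exclusive running sums of the mask bits. The sketch below computes both sequentially; the function names and the `compress` example are assumptions added for illustration and say nothing about how the patented hardware communicates mask bits between processing units.

```python
from itertools import accumulate


def iota_inclusive(mask):
    """Each entry sums the mask bits up to and including its own position."""
    return list(accumulate(mask))


def iota_exclusive(mask):
    """Each entry sums only the lower-order mask bits, excluding its own."""
    return [total - bit for total, bit in zip(accumulate(mask), mask)]


def compress(values, mask):
    """Pack the selected values densely, using the exclusive iota as the
    destination index. A scalar loop would carry a counter between
    iterations; the iota vector makes each destination independent."""
    index = iota_exclusive(mask)
    out = [None] * sum(mask)
    for i, keep in enumerate(mask):
        if keep:
            out[index[i]] = values[i]
    return out


mask = [1, 0, 1, 1, 0, 1]
print(iota_inclusive(mask))      # [1, 1, 2, 3, 3, 4]
print(iota_exclusive(mask))      # [0, 1, 1, 2, 3, 3]
print(compress("abcdef", mask))  # ['a', 'c', 'd', 'f']
```

The `compress` loop is one plausible example of the kind of loop-carried counter that the abstract says blocked vectorization.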
  • Patent number: 6263416
    Abstract: In a superscalar processor, multiple instructions are executed in parallel to obtain multiple execution results, and the multiple execution results are stored in a working register file. Each execution result in the working register file has at least one status bit associated therewith which identifies the execution result as valid data. The multiple execution results contained in the working register file are then retired by changing the status bits associated with each execution result to identify the execution result as final data. In this manner, the speculative data is retired as the final data without data movement of the speculative data, thus reducing a number of ports needed in the superscalar processor.
    Type: Grant
    Filed: June 27, 1997
    Date of Patent: July 17, 2001
    Assignee: Sun Microsystems, Inc.
    Inventor: Rajasekhar Cherabuddi
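The retirement scheme in the abstract for patent 6263416 can be pictured as flipping a status field in place instead of copying a value to a separate architectural file. The following is a minimal software sketch; the class names, the tag-keyed dictionary, and the two-state `Status` enum are assumptions for illustration only.

```python
from enum import Enum


class Status(Enum):
    VALID = "valid"  # speculative result, not yet retired
    FINAL = "final"  # retired result


class WorkingRegisterFile:
    def __init__(self):
        self.entries = {}  # hypothetical tag -> [value, status]

    def write_result(self, tag, value):
        # A newly produced execution result starts out as valid data.
        self.entries[tag] = [value, Status.VALID]

    def retire(self, tags):
        # Retirement only changes the status; the value is never moved.
        for tag in tags:
            self.entries[tag][1] = Status.FINAL

    def final_value(self, tag):
        value, status = self.entries[tag]
        return value if status is Status.FINAL else None


wrf = WorkingRegisterFile()
wrf.write_result("r1", 7)
wrf.write_result("r2", 9)
wrf.retire(["r1"])
print(wrf.final_value("r1"), wrf.final_value("r2"))  # 7 None
```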
  • Patent number: 6263415
    Abstract: The present invention provides a new crossbar switch which is implemented by a first plurality of chips. Each chip is completely programmable to couple to every node in the system, e.g., from one node to about one thousand nodes (corresponding to present-day technology limits of about one thousand I/O pins), although conventional systems typically support no more than 32 nodes. If the crossbar switch is implemented to support only one node, then one chip can be used to route all 64 bits in parallel for 64-bit microprocessors. A second plurality of chips in parallel provides the redundancy necessary for a high availability system.
    Type: Grant
    Filed: April 21, 1999
    Date of Patent: July 17, 2001
    Assignee: Hewlett-Packard Co
    Inventor: Padmanabha I. Venkitakrishnan
  • Patent number: 6209077
    Abstract: A general purpose accelerator board and acceleration method comprising use of: one or more programmable logic devices; a plurality of memory blocks; bus interface for communicating data between the memory blocks and devices external to the board; and dynamic programming capabilities for providing logic to the programmable logic device to be executed on data in the memory blocks.
    Type: Grant
    Filed: December 21, 1998
    Date of Patent: March 27, 2001
    Assignee: Sandia Corporation
    Inventors: Perry J. Robertson, Edward L. Witzke
  • Patent number: 6205533
    Abstract: A mechanism for performing parallel computations on an emulated spatial lattice by scheduling memory and communication operations on a static mesh-connected array of synchronized processing nodes. The lattice data are divided up among the array of processing nodes, each having a memory and a plurality of processing elements within each node. The memory is assumed to have a hierarchical granular structure that distinguishes groups of bits that are most efficiently accessed together, such as words or rows. The lattice data is organized in memory so that the sets of bits that interact during processing are always accessed together. Such an organization is based on mapping the lattice data into the granular structure of the memories in a manner that has simple spatial translation properties in the emulated space. The mapping permits data movement in the emulated lattice to be achieved by a combination of scheduled memory access and scheduled communication.
    Type: Grant
    Filed: August 12, 1999
    Date of Patent: March 20, 2001
    Inventor: Norman H. Margolus
  • Publication number: 20010000046
    Abstract: A processor complex architecture facilitates accurate passing of transient data among processor complex stages of a pipelined processing engine. The processor complex comprises a central processing unit (CPU) coupled to an instruction memory and a pair of context data memory structures via a memory manager circuit. The context memories store transient “context” data for processing by the CPU in accordance with instructions stored in the instruction memory. The architecture further comprises data mover circuitry that cooperates with the context memories and memory manager to provide a technique for efficiently passing data among the stages in a manner that maintains data coherency in the processing engine. An aspect of the architecture is the ability of the CPU to operate on the transient data substantially simultaneously with the passing of that data by the data mover.
    Type: Application
    Filed: November 30, 2000
    Publication date: March 15, 2001
    Inventors: Michael L. Wright, Darren Kerr, Kenneth Michael Key, William E. Jennings
  • Patent number: 6154809
    Abstract: A two-dimensional PE (processing element) array that can achieve a small amount of hardware, short transfer time and high flexibility. It includes q × r CAMs, where q and r are any integers equal to or greater than two, and hit-flag lines. Each CAM has one-dimensionally arrayed w words, a hit-flag register capable of shift up and shift down, and an upper shift I/O port and a lower shift I/O port for inputting from and outputting to outside the contents of the hit-flag register. Each of the hit-flag lines connects the lower-shift I/O port of one of two horizontally adjacent CAMs with the upper-shift I/O port of the other of the two. The w words are arranged in m rows and n columns and are connected in a zigzag, and each word is assigned to a PE that performs various types of logical and arithmetic operations.
    Type: Grant
    Filed: September 11, 1998
    Date of Patent: November 28, 2000
    Assignee: Nippon Telegraph & Telephone Corporation
    Inventors: Takeshi Ikenaga, Takeshi Ogura
  • Patent number: 6096091
    Abstract: An integrated circuit comprising a plurality of reconfigurable logic networks, one or more buffers, a configuration control network, and an embedded processor, all comprised as an integral part of the integrated circuit, and a method of operation of the integrated circuit. One or more of the buffers are coupled between two of the plurality of reconfigurable logic networks. The buffers isolate the plurality of reconfigurable logic networks from one another. The configuration control network is coupled to each of the plurality of reconfigurable logic networks, and may also be coupled to one or more buffers. The embedded processor is operable to reconfigure one or more of the plurality of reconfigurable logic networks over the configuration control network. The integrated circuit may also comprise a local memory. The local memory is coupled to the embedded processor, and is operable to store data and/or instructions accessible by the embedded processor.
    Type: Grant
    Filed: February 24, 1998
    Date of Patent: August 1, 2000
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Alfred C. Hartmann
  • Patent number: 6094714
    Abstract: A parallel processing system computer which utilizes the logic programming language Prolog comprising a plurality of processing nodes, each node comprising three central processing units (CPUs), a memory architecture adapted for Prolog execution and interfacing hardware, each processing node being connected to a communication bus and a real-time broadcast bus whereby the real-time data from an input can be broadcast via the real-time broadcast bus to each processing node. One of the CPUs is used to control the communications and scheduling of the node; the other two CPUs are used as sequential Prolog processors (SPPs). Each node can be arranged such that the collection of unused memory is carried out by one SPP while another SPP continues to run a Prolog program, enabling continuous real-time operation. The memory architecture is hybrid and comprises local static RAM and global dynamic RAM. The dynamic RAM comprises a Prolog database of known signals for comparison with the real-time input signals.
    Type: Grant
    Filed: August 14, 1997
    Date of Patent: July 25, 2000
    Assignee: The Secretary of State for Defence in Her Britannic Majesty's Government of the United Kingdom of Great Britain and Northern Ireland
    Inventors: Jonathan Roe, Anthony Pudner, Alan Michael
  • Patent number: 6049859
    Abstract: The subject matter of the application essentially relates to a matrix array of processor units, each processor unit having, in addition to an arithmetic logic unit and a result register bank, a further arithmetic logic unit, a multiplier/adder unit, a storage unit of a distributed screen section buffer and a local general purpose memory. The processor is distinguished by a high processing speed in conjunction with a small chip area and enables real-time processing even in the case of computation-intensive image processing methods such as 2D convolution, Gabor transformation, Gaussian or Laplacian pyramids, block matching, DCT or MPEG2.
    Type: Grant
    Filed: July 15, 1998
    Date of Patent: April 11, 2000
    Assignee: Siemens Aktiengesellschaft
    Inventors: Jorg Gliese, Ulrich Hachmann, Wolfgang Raab, Alexander Schackow, Ulrich Ramacher, Nikolaus Bruls, Rene Schuffny
  • Patent number: 6041422
    Abstract: A fault tolerant semiconductor memory system has a main memory (1) having a first plurality of individually addressable storage locations. The system additionally has means for storing the address of ones of the storage locations which are defective, substitute memory comprising a second plurality of individually addressable storage locations mapped to corresponding ones of the defective storage locations, and control means comprising a plurality of comparators (20, 21, 23) for comparing a received address signal with a respective one of the addresses of the defective storage locations, each comparator being directly coupled to a corresponding one of the substitute storage locations, wherein read and write access can be re-routed from a defective storage location to the corresponding substitute storage locations.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: March 21, 2000
    Assignee: Memory Corporation Technology Limited
    Inventor: Alexander Roger Deas
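The re-routing described in the abstract for patent 6041422 can be sketched in software as follows. The class and method names are hypothetical, and the loop in `_match` merely models the comparators, which in hardware would each compare the incoming address against one stored defective address in parallel.

```python
class FaultTolerantMemory:
    def __init__(self, size, defective_addresses):
        self.main = [0] * size
        # Stored addresses of the defective main-memory locations.
        self.defective = list(defective_addresses)
        # One substitute location per defective address.
        self.substitute = [0] * len(self.defective)

    def _match(self, address):
        # Model of the comparator bank: comparator i checks defective[i]
        # and, on a hit, selects substitute location i.
        for slot, bad in enumerate(self.defective):
            if bad == address:
                return slot
        return None

    def write(self, address, value):
        slot = self._match(address)
        if slot is None:
            self.main[address] = value
        else:
            self.substitute[slot] = value  # re-routed away from the defect

    def read(self, address):
        slot = self._match(address)
        return self.main[address] if slot is None else self.substitute[slot]


memory = FaultTolerantMemory(size=8, defective_addresses=[3, 5])
memory.write(3, 42)    # lands in substitute location 0
memory.write(2, 7)     # ordinary main-memory write
print(memory.read(3), memory.read(2))  # 42 7
```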
  • Patent number: 6038651
    Abstract: A remote resource management system for managing resources in a symmetrical multiprocessing system comprising a plurality of clusters of symmetric multiprocessors having interfaces between cluster nodes of the symmetric multiprocessor system. Each cluster of the system has a local interface and interface controller. There are one or more remote storage controllers each having its local interface controller, and a local-to-remote data bus. The remote resource manager manages the interface between two clusters of symmetric multiprocessors each of which clusters has a plurality of processors, a shared cache memory, a plurality of I/O adapters and a main memory accessible from the cluster. This remote resource manager manages resources with a remote storage controller to distribute work to a remote controller acting as an agent to perform a desired operation without requiring knowledge of a requester who initiated the work request.
    Type: Grant
    Filed: March 23, 1998
    Date of Patent: March 14, 2000
    Assignee: International Business Machines Corporation
    Inventors: Gary Alan VanHuben, Michael A. Blake, Pak-kin Mak
  • Patent number: 6029001
    Abstract: A system for compiling a computer program to implement parallel image processing on a computer having a plurality of arithmetic processors. The program is analyzed to determine whether it contains a parallel image processing identifier, and if so, a plurality of parallel image processing execution codes are generated for use by the arithmetic processors, thereby allowing image processing to be conducted at an increased speed.
    Type: Grant
    Filed: July 22, 1997
    Date of Patent: February 22, 2000
    Assignee: Sony Corporation
    Inventors: Satoshi Katsuo, Taro Shigata
  • Patent number: 6023742
    Abstract: A configurable computing architecture (10) has its functionality controlled by a combination of static and dynamic control, wherein the configuration is referred to as static control and instructions are referred to as dynamic control. A reconfigurable data path (12) has a plurality of elements including functional units (32, 36), registers (30), and memories (34) whose interconnection and functionality is determined by a combination of static and dynamic control. These elements are connected together, using the static configuration, into a pipelined data path that performs a computation of interest. The dynamic control signals (21) are suitably used to change the operation of a functional unit and the routing of signals between functional units. The static control signals (23) are provided each by a static memory cell (62) that is written by a host (13). The controller (14) generates control instructions (16) that are interpreted by a control path (18) that computes the dynamic control signals.
    Type: Grant
    Filed: July 18, 1997
    Date of Patent: February 8, 2000
    Assignee: University of Washington
    Inventors: William Henry Carl Ebeling, Darren Charles Cronquist, Paul David Franklin
  • Patent number: 5935216
    Abstract: A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes within the computing system.
    Type: Grant
    Filed: August 22, 1991
    Date of Patent: August 10, 1999
    Assignee: Sandia Corporation
    Inventors: Robert E. Benner, John L. Gustafson, Gary R. Montry