Application Specific Patents (Class 712/17)
-
Patent number: 6477697
Abstract: An automated processor design tool uses a description of customized processor instruction set extensions in a standardized language to develop a configurable definition of a target instruction set, a Hardware Description Language description of circuitry necessary to implement the instruction set, and development tools such as a compiler, assembler, debugger and simulator which can be used to develop applications for the processor and to verify it. The standardized language is capable of handling instruction set extensions which modify processor state or use configurable processors. By providing a constrained domain of extensions and optimizations, the process can be automated to a high degree, thereby facilitating fast and reliable development.
Type: Grant
Filed: May 28, 1999
Date of Patent: November 5, 2002
Assignee: Tensilica, Inc.
Inventors: Earl A. Killian, Richard Ruddell, Albert Ren-Rui Wang
-
Patent number: 6366998
Abstract: The present invention generally relates to a hybrid VLIW-SIMD programming model for a digital signal processor. The hybrid programming model broadcasts a packet of information to a plurality of functional units or processing elements. Each packet contains several instructions having certain characteristics, such as instruction type and instruction length, among others. The hybrid programming model includes functional units which are reconfigurable based upon the instructions within an instruction packet and the availability of the functional units. The model groups the functional units such that the operations specified in the instructions can be efficiently executed and selects which functional units should be utilized for a given operation.
Type: Grant
Filed: October 14, 1998
Date of Patent: April 2, 2002
Assignee: Conexant Systems, Inc.
Inventor: Moataz A. Mohamed
-
Patent number: 6366997
Abstract: Processing element to processing element switch connection control is described using a receive model that precludes communication hazards from occurring in a synchronous MIMD mode of operation. Such control allows different communication topologies and various processing effects such as an array transpose, hypercomplement or the like to be efficiently achieved utilizing architectures, such as the manifold array processing architecture. An encoded instruction method reduces the amount of state information and setup burden on the programmer, taking advantage of the recognition that the majority of algorithms will use only a small fraction of all possible mux settings available. Thus, by means of transforming the PE identification based upon a communication path specified by a PE communication instruction, an efficient switch control mechanism can be used.
Type: Grant
Filed: August 29, 2000
Date of Patent: April 2, 2002
Assignee: BOPS, Inc.
Inventors: Edwin F. Barry, Gerald G. Pechanek, Thomas L. Drabenstott, Edward A. Wolff, Nikos P. Pitsianis, Grayson Morris
-
Patent number: 6351798
Abstract: The present invention provides an address resolution method for use in a multiprocessor system with distributed shared memory. The method allows users to change a memory configuration and a system configuration to increase system operation flexibility and to isolate errors. A cell controller indexes into an address resolution table using the high-order part of a processor-specified address. A write protection flag specifies whether to permit write access from other cells. An attempt to write-access a cell inhibited for write access causes a logical circuit to output an access exception signal.
Type: Grant
Filed: June 15, 1999
Date of Patent: February 26, 2002
Assignee: NEC Corporation
Inventor: Fumio Aono
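The table lookup described in this abstract can be illustrated with a minimal Python sketch. It is not the patented circuit: the 12-bit offset width, the `(base, remote_write_ok)` entry layout, and the names `CellController`, `resolve` and `AccessException` are all assumptions made for illustration.

```python
OFFSET_BITS = 12  # assumed split between table index and offset


class AccessException(Exception):
    """Stands in for the access exception signal (hypothetical name)."""


class CellController:
    def __init__(self, table):
        # Assumed entry layout: (base address, remote_write_ok flag).
        self.table = table

    def resolve(self, address, is_write, from_other_cell):
        # The high-order part of the address indexes the resolution table.
        index = address >> OFFSET_BITS
        offset = address & ((1 << OFFSET_BITS) - 1)
        base, remote_write_ok = self.table[index]
        # The write protection flag only restricts writes from other cells.
        if is_write and from_other_cell and not remote_write_ok:
            raise AccessException(f"write to protected region {index}")
        return base + offset


ctl = CellController([(0x8000, True), (0x4000, False)])
readable = ctl.resolve(0x1004, is_write=False, from_other_cell=True)
```

Reads from another cell still resolve through a write-protected entry; only a remote write raises the exception.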
-
Patent number: 6324638
Abstract: A processor capable of executing vector instructions includes at least an instruction sequencing unit and a vector processing unit that receives vector instructions to be executed from the instruction sequencing unit. The vector processing unit includes a plurality of multiply structures, each containing only a single multiply array, that each correspond to at least one element of a vector input operand. Utilizing the single multiply array, each of the plurality of multiply structures is capable of performing a multiplication operation on one element of a vector input operand and is also capable of performing a multiplication operation on multiple elements of a vector input operand concurrently. In an embodiment in which the maximum length of an element of a vector input operand is N bits, each of the plurality of multiply arrays can handle both N by N bit integer multiplication and M by M bit integer multiplication, where N is a non-unitary integer multiple of M.
Type: Grant
Filed: March 31, 1999
Date of Patent: November 27, 2001
Assignee: International Business Machines Corporation
Inventors: Thomas Elmer, Michael Putrino
-
Patent number: 6308250
Abstract: A method and system for operating a computing system having multiple processing units. According to a new machine instruction, called the iota instruction, the computing system operates on a vector of mask bits to generate an iota vector having a sequence of values. In one form, each value of the iota vector is a sum of a series of the lower order mask bits up to and including the mask bit corresponding to the entry in the iota vector. In another form, each entry in the iota vector is a sum of a series of lower order mask bits but does not include the mask bit corresponding to the particular entry in the iota vector. In order to calculate the iota vector, the multiple processing units of the present invention communicate the mask bits to the other processing units. Advantages of the present invention include the vectorization of software loops having certain data hazards that prevented conventional compilers from vectorizing the software.
Type: Grant
Filed: June 23, 1998
Date of Patent: October 23, 2001
Assignee: Silicon Graphics, Inc.
Inventor: Peter Michael Klausler
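The two forms of the iota vector described in this abstract amount to inclusive and exclusive running sums over the mask bits. The Python sketch below computes both sequentially, which is an assumption made for readability; the patent describes cooperating processing units. The names `iota` and `compress` are hypothetical, and index 0 is assumed to be the lowest-order mask bit.

```python
def iota(mask, inclusive=True):
    """Running count of set mask bits, lowest-order bit first."""
    out = []
    total = 0
    for bit in mask:
        if inclusive:
            total += bit       # count includes this entry's own mask bit
            out.append(total)
        else:
            out.append(total)  # count covers lower-order bits only
            total += bit
    return out


def compress(values, mask):
    """Pack the masked elements, using the exclusive iota as the target index."""
    dest = iota(mask, inclusive=False)
    packed = [None] * sum(mask)
    for value, bit, index in zip(values, mask, dest):
        if bit:
            packed[index] = value
    return packed


mask = [1, 0, 1, 1, 0]
inclusive_form = iota(mask)                   # [1, 1, 2, 3, 3]
exclusive_form = iota(mask, inclusive=False)  # [0, 1, 1, 2, 3]
packed = compress([10, 20, 30, 40, 50], mask)  # [10, 30, 40]
```

The `compress` loop shows the kind of data hazard the abstract mentions: a scalar version would carry a running output index from one iteration to the next, whereas with the iota vector every destination index is known up front.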
-
Patent number: 6263415
Abstract: The present invention provides a new crossbar switch which is implemented by a first plurality of chips. Each chip is completely programmable to couple to every node in the system, e.g., from one node to about one thousand nodes (corresponding to present-day technology limits of about one thousand I/O pins) although conventional systems typically support no more than 32 nodes. If the crossbar switch is implemented to support only one node, then one chip can be used to route all 64 bits in parallel for 64 bit microprocessors. A second plurality of chips in parallel provides the redundancy necessary for a high availability system.
Type: Grant
Filed: April 21, 1999
Date of Patent: July 17, 2001
Assignee: Hewlett-Packard Co
Inventor: Padmanabha I. Venkitakrishnan
-
Patent number: 6263416
Abstract: In a superscalar processor, multiple instructions are executed in parallel to obtain multiple execution results, and the multiple execution results are stored in a working register file. Each execution result in the working register file has at least one status bit associated therewith which identifies the execution result as valid data. The multiple execution results contained in the working register file are then retired by changing the status bits associated with each execution result to identify the execution result as final data. In this manner, the speculative data is retired as the final data without data movement of the speculative data, thus reducing a number of ports needed in the superscalar processor.
Type: Grant
Filed: June 27, 1997
Date of Patent: July 17, 2001
Assignee: Sun Microsystems, Inc.
Inventor: Rajasekhar Cherabuddi
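A minimal Python sketch of retirement by status-bit change, as this abstract describes it. The class `WorkingRegisterFile`, its methods, and the two status labels are hypothetical names chosen for illustration; the abstract does not specify the encoding of the status bits.

```python
VALID = "valid"  # speculative result that has been written
FINAL = "final"  # retired result


class WorkingRegisterFile:
    def __init__(self):
        # Each entry holds [value, status]; the status stands in for the status bit(s).
        self.entries = {}

    def write_result(self, reg, value):
        self.entries[reg] = [value, VALID]

    def retire(self, regs):
        # Retirement flips the status only; the value is never copied or moved.
        for reg in regs:
            self.entries[reg][1] = FINAL


wrf = WorkingRegisterFile()
wrf.write_result("r1", 42)
wrf.write_result("r2", 7)
wrf.retire(["r1", "r2"])
```

Because the data stays in place, no transfer into a separate architectural register file is modelled, which mirrors the abstract's point about reducing the number of ports.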
-
Patent number: 6209077
Abstract: A general purpose accelerator board and acceleration method comprising use of: one or more programmable logic devices; a plurality of memory blocks; bus interface for communicating data between the memory blocks and devices external to the board; and dynamic programming capabilities for providing logic to the programmable logic device to be executed on data in the memory blocks.
Type: Grant
Filed: December 21, 1998
Date of Patent: March 27, 2001
Assignee: Sandia Corporation
Inventors: Perry J. Robertson, Edward L. Witzke
-
Patent number: 6205533
Abstract: A mechanism for performing parallel computations on an emulated spatial lattice by scheduling memory and communication operations on a static mesh-connected array of synchronized processing nodes. The lattice data are divided up among the array of processing nodes, each having a memory and a plurality of processing elements within each node. The memory is assumed to have a hierarchical granular structure that distinguishes groups of bits that are most efficiently accessed together, such as words or rows. The lattice data is organized in memory so that the sets of bits that interact during processing are always accessed together. Such an organization is based on mapping the lattice data into the granular structure of the memories in a manner that has simple spatial translation properties in the emulated space. The mapping permits data movement in the emulated lattice to be achieved by a combination of scheduled memory access and scheduled communication.
Type: Grant
Filed: August 12, 1999
Date of Patent: March 20, 2001
Inventor: Norman H. Margolus
-
Publication number: 20010000046
Abstract: A processor complex architecture facilitates accurate passing of transient data among processor complex stages of a pipelined processing engine. The processor complex comprises a central processing unit (CPU) coupled to an instruction memory and a pair of context data memory structures via a memory manager circuit. The context memories store transient “context” data for processing by the CPU in accordance with instructions stored in the instruction memory. The architecture further comprises data mover circuitry that cooperates with the context memories and memory manager to provide a technique for efficiently passing data among the stages in a manner that maintains data coherency in the processing engine. An aspect of the architecture is the ability of the CPU to operate on the transient data substantially simultaneously with the passing of that data by the data mover.
Type: Application
Filed: November 30, 2000
Publication date: March 15, 2001
Inventors: Michael L. Wright, Darren Kerr, Kenneth Michael Key, William E. Jennings
-
Patent number: 6154809
Abstract: A two-dimensional PE (processing element) array that can achieve a small amount of hardware, short transfer time and high flexibility. It includes q × r CAMs, where q and r are any integers equal to or greater than two, and hit-flag lines. Each CAM has one-dimensionally arrayed w words, a hit-flag register capable of shift up and shift down, and an upper shift I/O port and a lower shift I/O port for inputting from and outputting to outside the contents of the hit-flag register. Each of the hit-flag lines connects the lower shift I/O port of one of two horizontally adjacent CAMs with the upper shift I/O port of the other of the two. The w words are arranged in m rows and n columns and are connected in a zigzag, and each word is assigned to a PE that performs various types of logical and arithmetic operations.
Type: Grant
Filed: September 11, 1998
Date of Patent: November 28, 2000
Assignee: Nippon Telegraph & Telephone Corporation
Inventors: Takeshi Ikenaga, Takeshi Ogura
-
Patent number: 6096091
Abstract: An integrated circuit comprising a plurality of reconfigurable logic networks, one or more buffers, a configuration control network, and an embedded processor, all comprised as an integral part of the integrated circuit, and a method of operation of the integrated circuit. One or more of the buffers are coupled between two of the plurality of reconfigurable logic networks. The buffers isolate the plurality of reconfigurable logic networks from one another. The configuration control network is coupled to each of the plurality of reconfigurable logic networks, and may also be coupled to one or more buffers. The embedded processor is operable to reconfigure one or more of the plurality of reconfigurable logic networks over the configuration control network. The integrated circuit may also comprise a local memory. The local memory is coupled to the embedded processor, and is operable to store data and/or instructions accessible by the embedded processor.
Type: Grant
Filed: February 24, 1998
Date of Patent: August 1, 2000
Assignee: Advanced Micro Devices, Inc.
Inventor: Alfred C. Hartmann
-
Patent number: 6094714
Abstract: A parallel processing system computer which utilizes the logic programming language Prolog comprising a plurality of processing nodes, each node comprising three central processing units (CPUs), a memory architecture adapted for Prolog execution and interfacing hardware, each processing node being connected to a communication bus and a real-time broadcast bus whereby the real-time data from an input can be broadcast via the real-time broadcast bus to each processing node. One of the CPUs is used to control the communications and scheduling of the node; the other two CPUs are used as sequential Prolog processors (SPPs). Each node can be arranged such that the collection of unused memory is carried out by one SPP while another SPP continues to run a Prolog program, enabling continuous real-time operation. The memory architecture is hybrid and comprises local static RAM and global dynamic RAM. The dynamic RAM comprises a Prolog database of known signals for comparison with the real-time input signals.
Type: Grant
Filed: August 14, 1997
Date of Patent: July 25, 2000
Assignee: The Secretary of State for Defence in Her Britannic Majesty's Government of the United Kingdom of Great Britain and Northern Ireland
Inventors: Jonathan Roe, Anthony Pudner, Alan Michael
-
Patent number: 6049859
Abstract: The subject matter of the application essentially relates to a matrix array of processor units, each processor unit having, in addition to an arithmetic logic unit and a result register bank, a further arithmetic logic unit, a multiplier/adder unit, a storage unit of a distributed screen section buffer and a local general purpose memory. The processor is distinguished by a high processing speed in conjunction with a small chip area and enables real-time processing even in the case of computation-intensive image processing methods such as 2D convolution, Gabor transformation, Gaussian or Laplacian pyramids, block matching, DCT or MPEG2.
Type: Grant
Filed: July 15, 1998
Date of Patent: April 11, 2000
Assignee: Siemens Aktiengesellschaft
Inventors: Jorg Gliese, Ulrich Hachmann, Wolfgang Raab, Alexander Schackow, Ulrich Ramacher, Nikolaus Bruls, Rene Schuffny
-
Patent number: 6041422
Abstract: A fault tolerant semiconductor memory system has a main memory (1) having a first plurality of individually addressable storage locations. The system additionally has means for storing the address of ones of the storage locations which are defective, substitute memory comprising a second plurality of individually addressable storage locations mapped to corresponding ones of the defective storage locations, and control means comprising a plurality of comparators (20, 21, 23) for comparing a received address signal with a respective one of the addresses of the defective storage locations, each comparator being directly coupled to a corresponding one of the substitute storage locations, wherein read and write access can be re-routed from a defective storage location to the corresponding substitute storage locations.
Type: Grant
Filed: October 24, 1997
Date of Patent: March 21, 2000
Assignee: Memory Corporation Technology Limited
Inventor: Alexander Roger Deas
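The re-routing in this abstract can be modelled in a few lines of Python. This is a software analogy under stated assumptions, not the patented hardware: the class `FaultTolerantMemory` and its fields are hypothetical, and the loop in `_match` stands in for comparators that the abstract describes as separate circuits, each tied to one substitute location.

```python
class FaultTolerantMemory:
    def __init__(self, size, defective_addresses):
        self.main = [0] * size
        # One stored defective address per comparator.
        self.defect_addrs = list(defective_addresses)
        # One substitute location per comparator, paired by position.
        self.substitute = [0] * len(self.defect_addrs)

    def _match(self, address):
        # Each comparator checks the incoming address against its stored one.
        for slot, bad in enumerate(self.defect_addrs):
            if address == bad:
                return slot
        return None

    def write(self, address, value):
        slot = self._match(address)
        if slot is None:
            self.main[address] = value
        else:
            self.substitute[slot] = value  # re-routed away from the defect

    def read(self, address):
        slot = self._match(address)
        return self.main[address] if slot is None else self.substitute[slot]


mem = FaultTolerantMemory(8, [2, 5])
mem.write(2, 99)  # address 2 is marked defective, so this lands in substitute[0]
mem.write(3, 7)   # healthy address, stored in main memory
```

After these writes, `mem.read(2)` returns 99 from the substitute location while `mem.main[2]` is left untouched.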
-
Patent number: 6038651
Abstract: A remote resource management system for managing resources in a symmetrical multiprocessing system comprising a plurality of clusters of symmetric multiprocessors having interfaces between cluster nodes of the symmetric multiprocessor system. Each cluster of the system has a local interface and interface controller. There are one or more remote storage controllers each having its local interface controller, and a local-to-remote data bus. The remote resource manager manages the interface between two clusters of symmetric multiprocessors each of which clusters has a plurality of processors, a shared cache memory, a plurality of I/O adapters and a main memory accessible from the cluster. This remote resource manager manages resources with a remote storage controller to distribute work to a remote controller acting as an agent to perform a desired operation without requiring knowledge of a requester who initiated the work request.
Type: Grant
Filed: March 23, 1998
Date of Patent: March 14, 2000
Assignee: International Business Machines Corporation
Inventors: Gary Alan VanHuben, Michael A. Blake, Pak-kin Mak
-
Patent number: 6029001
Abstract: A system for compiling a computer program to implement parallel image processing on a computer having a plurality of arithmetic processors. The program is analyzed to determine whether it contains a parallel image processing identifier, and if so, a plurality of parallel image processing execution codes are generated for use by the arithmetic processors, thereby allowing image processing to be conducted at an increased speed.
Type: Grant
Filed: July 22, 1997
Date of Patent: February 22, 2000
Assignee: Sony Corporation
Inventors: Satoshi Katsuo, Taro Shigata
-
Patent number: 6023742
Abstract: A configurable computing architecture (10) has its functionality controlled by a combination of static and dynamic control, wherein the configuration is referred to as static control and instructions are referred to as dynamic control. A reconfigurable data path (12) has a plurality of elements including functional units (32, 36), registers (30), and memories (34) whose interconnection and functionality is determined by a combination of static and dynamic control. These elements are connected together, using the static configuration, into a pipelined data path that performs a computation of interest. The dynamic control signals (21) are suitably used to change the operation of a functional unit and the routing of signals between functional units. The static control signals (23) are provided each by a static memory cell (62) that is written by a host (13). The controller (14) generates control instructions (16) that are interpreted by a control path (18) that computes the dynamic control signals.
Type: Grant
Filed: July 18, 1997
Date of Patent: February 8, 2000
Assignee: University of Washington
Inventors: William Henry Carl Ebeling, Darren Charles Cronquist, Paul David Franklin
-
Patent number: 5935216
Abstract: A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes within the computing system.
Type: Grant
Filed: August 22, 1991
Date of Patent: August 10, 1999
Assignee: Sandia Corporation
Inventors: Robert E. Benner, John L. Gustafson, Gary R. Montry