Concurrent Patents (Class 712/9)
  • Patent number: 7027446
    Abstract: A method and apparatus for high-speed and memory-efficient rule matching, the rule matching being performed on an m-dimensional universe with each dimension bound by a given range of coordinate values, and a set of rules that apply to an undetermined number of coordinates in that universe. More specifically, a high-speed computer-based packet classification system uses an innovative set intersection memory configuration to provide efficient matching of packets flowing through a network system to a specific process flow based on a packet tuple. The system also provides classification of packets as they flow through a network system. More particularly, this system correlates these flowing packets with previously received packets, along with identifying the packets so that they are handled efficiently. The ability to correlate packets to their corresponding process flows permits the implementation of service aware networks (SAN) that are capable of handling network situations at the application level.
    Type: Grant
    Filed: July 18, 2001
    Date of Patent: April 11, 2006
    Assignee: P-Cube Ltd.
    Inventors: Amir Rosenfeld, Ori Finkelman, Reuven A Marko
  • Patent number: 6963341
    Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.
    Type: Grant
    Filed: May 20, 2003
    Date of Patent: November 8, 2005
    Inventor: Tibet Mimar
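
The zigzag scan and block transpose named in the abstract above can be illustrated in plain software. This is a minimal sketch only: it computes the scan order as an index permutation and applies it, and it does not model the SIMD vector multiplex operations the patent claims. The function names `zigzag_order`, `zigzag_scan`, and `transpose` are hypothetical.

```python
def zigzag_order(n):
    """Return the (row, col) pairs of an n x n block in zigzag scan order."""
    def key(rc):
        r, c = rc
        d = r + c  # index of the anti-diagonal this cell lies on
        # Odd diagonals are walked with the row increasing,
        # even diagonals with the column increasing.
        return (d, r if d % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block (list of rows) into zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def transpose(block):
    """Return the transpose of a block given as a list of rows."""
    return [list(col) for col in zip(*block)]


if __name__ == "__main__":
    block = [[4 * r + c for c in range(4)] for r in range(4)]
    print(zigzag_scan(block))
    print(transpose(block))
```

Because the scan order depends only on the block size, it can be computed once and reused as a fixed permutation for every block of that size.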
  • Patent number: 6954841
    Abstract: A configuration of vector units, digital circuitry and associated instructions is disclosed for the parallel processing of multiple Viterbi decoder butterflies on a programmable digital signal processor (DSP) that is based on single-instruction-multiple-data (SIMD) principles and provides indirect access to vector elements. The disclosed configuration uses a processor with two vector units and associated registers, where the vector units are connected back to back for processing Viterbi decoder state metrics. Viterbi add instructions increment vectors of state metrics from a first register, performing a desired permutation of state metrics while reading them indirectly through vector pointers, and writing intermediate result vectors to a second register.
    Type: Grant
    Filed: September 13, 2002
    Date of Patent: October 11, 2005
    Assignee: International Business Machines Corporation
    Inventors: Jaime Humberto Moreno, Fredy Daniel Neeser
  • Patent number: 6922716
    Abstract: A processor includes a first vector processing unit including a first register file and first vector arithmetic logic unit; a second vector processing unit including a second register file and second vector arithmetic logic unit wherein the first register file has a first plurality of cross connections to the second vector arithmetic logic unit; wherein the second register file has a second plurality of cross connections to the first vector arithmetic logic unit.
    Type: Grant
    Filed: July 13, 2001
    Date of Patent: July 26, 2005
    Assignee: Motorola, Inc.
    Inventors: Vipul Anil Desai, David P. Gurney, Benson Chau
  • Patent number: 6904510
    Abstract: A data processor that can perform instructions in parallel on respective fields in an operand includes a respective multiplexer for each of the respective fields. Each respective multiplexer is controlled by condition data for a particular field, preferably from an addressable storage unit. The condition may take three or more values for each field, which allows multiplexing between three or more values, reflecting a less than, equal to, or greater than relation between respective compare inputs. The inputs of the multiplexers can share read ports to a register file with more than one functional unit connected to only two read ports.
    Type: Grant
    Filed: October 7, 1999
    Date of Patent: June 7, 2005
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Fransiscus W. Sijstermans
  • Patent number: 6782468
    Abstract: A shared memory type vector processing system in which CPUs are connected by a bus for transferring a vector processing instruction generated from any of the CPUs to each of the CPUs, and the respective CPUs are grouped into a master CPU which issues a vector processing instruction to other CPUs and slave CPUs operating as a multi-vector pipeline in synchronization with a vector processing unit in the master CPU, the master CPU including a memory access control unit for issuing said vector processing instruction with issuing source CPU information attached for identifying an issuing source CPU, and transferring said instruction to all the CPUs including its own CPU through a bus, and the master CPU and the slave CPU including a vector processing instruction control unit for comparing issuing source CPU information contained in a vector processing instruction and master CPU information set at its own CPU and conducting instruction issuance based on the vector processing instruction when the information accord
    Type: Grant
    Filed: December 13, 1999
    Date of Patent: August 24, 2004
    Assignee: NEC Corporation
    Inventor: Satoshi Nakazato
  • Patent number: 6560775
    Abstract: A method and system for preparing branch instructions of a computer program, for compiling and execution in a computer system, in which each transfer instruction is split into two instructions: a control transfer preparation instruction and a control transfer instruction, wherein the control transfer preparation instruction contains the transfer address and is placed by the compiler several instructions ahead of the control transfer instruction, so that the number of clock cycles in the pipeline between transfer condition generation and transfer itself would be reduced.
    Type: Grant
    Filed: December 24, 1998
    Date of Patent: May 6, 2003
    Assignee: Elbrus International Limited
    Inventors: Alexander M. Artymov, Boris A. Babaian, Feodor A. Gruzdov, Alexey P. Lizorkin, Yuli K. Sakhin, Evgeny Z. Stolyarsky
  • Patent number: 6504495
    Abstract: A clipping and quantization technique is described for producing clipped numbers in a range of 0 to N−1 (from unclipped numbers in a range of −0.5N to (1.5N−1)), where N is 2^m and m is the bit length of the desired clipped and quantized number. The most significant bit of the unclipped data value indicates whether an overflow of the permitted range has occurred and that clipping is required. The next most significant bit (the (m−1)th) indicates which saturated value should be adopted. These properties of the unclipped data value may be exploited to generate the desired clipped and quantized numbers using logical left shifting and conditionally executed saturating instructions executing upon a general purpose processor. The shifting operations performed to achieve saturation operation may simultaneously yield quantization.
    Type: Grant
    Filed: February 17, 1999
    Date of Patent: January 7, 2003
    Assignee: Arm Limited
    Inventors: Dominic Hugo Symes, Wilco Dijkstra
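
The bit test described in the abstract above can be checked with a short worked example. This sketch assumes the unclipped value is held as an (m+1)-bit wrapped quantity, which is one reading of the abstract rather than a detail it states. It uses explicit bit tests, not the shift and saturating-instruction sequence the patent describes, and it omits quantization. The name `clip_by_bits` is hypothetical.

```python
def clip_by_bits(x, m):
    """Clip x from [-N/2, 3N/2 - 1] into [0, N - 1], where N = 2**m and m >= 1."""
    n = 1 << m
    if m < 1 or not -(n // 2) <= x <= (3 * n) // 2 - 1:
        raise ValueError("x is outside the range assumed by the technique")
    u = x & (2 * n - 1)          # assumed (m+1)-bit wrapped representation
    if not (u >> m) & 1:         # top bit clear: already within 0..N-1
        return u
    # Top bit set: the next bit selects which saturated value applies.
    return 0 if (u >> (m - 1)) & 1 else n - 1


if __name__ == "__main__":
    for value in (-128, -1, 0, 200, 255, 256, 383):
        print(value, clip_by_bits(value, 8))
```

With m = 8, inputs from −128 to −1 wrap to 384..511 (both top bits set) and clip to 0, while inputs from 256 to 383 have only the top bit set and clip to 255.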
  • Patent number: 6446193
    Abstract: A method and apparatus for reducing instruction cycles in a digital signal processor wherein the processor includes a multiplier unit, an adder, a memory, and at least one pair of first and second accumulators. The accumulators include respective guard, high and low parts. The method and apparatus enable vectoring the respective first and second high parts from the accumulators to define a single vectored register responsive to a single instruction cycle and processing the data in the vectored register.
    Type: Grant
    Filed: September 8, 1997
    Date of Patent: September 3, 2002
    Assignee: Agere Systems Guardian Corp.
    Inventors: Mazhar M. Alidina, Sivanand Simanapalli, Larry R. Tate
  • Patent number: 6401194
    Abstract: A vector processor provides a data path divided into smaller slices of data, with each slice processed in parallel with the other slices. Furthermore, an execution unit provides smaller arithmetic and functional units chained together to execute more complex microprocessor instructions requiring multiple cycles by sharing single-cycle operations, thereby reducing both costs and size of the microprocessor. One embodiment handles 288-bit data widths using 36-bit data path slices. Another embodiment executes integer multiply and multiply-and-accumulate and floating point add/subtract and multiply operations using single-cycle arithmetic logic units. Other embodiments support 8-bit, 9-bit, 16-bit, and 32-bit integer data types and 32-bit floating data types.
    Type: Grant
    Filed: January 28, 1997
    Date of Patent: June 4, 2002
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Le Trong Nguyen, Heonchul Park, Roney S. Wong, Ted Nguyen, Edward H. Yu
  • Publication number: 20020019925
    Abstract: A field programmable processor comprises a regular array of processing elements each of which is adapted to perform a fixed arithmetic function on packets of data. The processing elements are interconnected by an array of signal conductors extending adjacent the processing elements. Switching means are provided for selectively connecting the processing elements to the adjacent signal conductors so as to interconnect the processing elements. Program data representing desired processing element interconnections is stored, the switching means is controlled in accordance with the stored program data to achieve the desired processing element interconnections. The packets of data are transmitted between the interconnected processing elements.
    Type: Application
    Filed: December 4, 1998
    Publication date: February 14, 2002
    Inventors: ANDREW DEWHURST, GORDEN WORK
  • Patent number: 6336154
    Abstract: A computer system comprises: a processing system for processing data; a memory for storing data processed by, or to be processed by, the processing system; a memory access controller for controlling access to the memory; and at least one data buffer for buffering data to be written to or read from the memory. A burst controller is provided for issuing burst instructions to the memory access controller, and the memory access controller is responsive to such a burst instruction to transfer a plurality of data words between the memory and the data buffer in a single memory transaction. A burst instruction queue is provided so that such a burst instruction can be made available for execution by the memory access controller immediately after a preceding burst instruction has been executed.
    Type: Grant
    Filed: June 20, 2000
    Date of Patent: January 1, 2002
    Assignee: Hewlett-Packard Company
    Inventors: Dominic Paul McCarthy, Stuart Victor Quick
  • Patent number: 6327668
    Abstract: A multiprocessor computer system which provides fault tolerance includes a number of processing sets. At least one of the processing sets is operable asynchronously of a second processing set. A monitor is connected to receive I/O operations output from the processing sets for identifying faulty operation of those units. The monitor is also operable to synchronise operation of the processing sets by signalling the processing sets on receipt of outputs from those units indicative of a plurality of them being at an equivalent stage of processing. The monitor provides for buffering of I/O operations output from the processing sets and for selective forwarding of those I/O operations to an external I/O bus. The processing set may be formed from a single processor or from multiple processors.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: December 4, 2001
    Assignee: Sun Microsystems, Inc.
    Inventor: Emrys J. Williams
  • Patent number: 6324638
    Abstract: A processor capable of executing vector instructions includes at least an instruction sequencing unit and a vector processing unit that receives vector instructions to be executed from the instruction sequencing unit. The vector processing unit includes a plurality of multiply structures, each containing only a single multiply array, that each correspond to at least one element of a vector input operand. Utilizing the single multiply array, each of the plurality of multiply structures is capable of performing a multiplication operation on one element of a vector input operand and is also capable of performing a multiplication operation on multiple elements of a vector input operand concurrently. In an embodiment in which the maximum length of an element of a vector input operand is N bits, each of the plurality of multiply arrays can handle both N by N bit integer multiplication and M by M bit integer multiplication, where N is a non-unitary integer multiple of M.
    Type: Grant
    Filed: March 31, 1999
    Date of Patent: November 27, 2001
    Assignee: International Business Machines Corporation
    Inventors: Thomas Elmer, Michael Putrino
  • Patent number: 6314471
    Abstract: A method and system in a multithreaded processor for processing events without interrupt notifications. In one aspect of the present invention, an operating system creates a thread to execute on a stream of the processor. During execution of the thread, the thread executes a loop that determines whether an event has occurred and, in response to determining whether an event has occurred, assigns a different thread to process the event so that multiple events can be processed in parallel and so that interrupts are not needed to signal that the event has occurred. Another aspect of the present invention provides a method and system for processing asynchronously occurring events without interrupt notifications. To achieve this processing, a first thread is executed to generate a notification that the event has occurred upon receipt of the asynchronously occurring event.
    Type: Grant
    Filed: November 13, 1998
    Date of Patent: November 6, 2001
    Assignee: Cray Inc.
    Inventors: Gail A. Alverson, Charles David Callahan, II, Susan L. Coatney, Laurence S. Kaplan, Richard D. Korry
  • Patent number: 6308250
    Abstract: A method and system for operating a computing system having multiple processing units. According to a new machine instruction, called the iota instruction, the computing system operates on a vector of mask bits to generate an iota vector having a sequence of values. In one form, each value of the iota vector is a sum of a series of the lower order mask bits up to and including the mask bit corresponding to the entry in the iota vector. In another form, each entry in the iota vector is a sum of a series of lower order mask bits but does not include the mask bit corresponding to the particular entry in the iota vector. In order to calculate the iota vector, the multiple processing units of the present invention communicate the mask bits to the other processing units. Advantages of the present invention include the vectorization of software loops having certain data hazards that prevented conventional compilers from vectorizing the software.
    Type: Grant
    Filed: June 23, 1998
    Date of Patent: October 23, 2001
    Assignee: Silicon Graphics, Inc.
    Inventor: Peter Michael Klausler
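
The two forms of the iota vector described in the abstract above correspond to inclusive and exclusive prefix sums over the mask bits. The sketch below is a sequential software illustration under that reading; it does not model how the processing units exchange mask bits. The names `iota_inclusive`, `iota_exclusive`, and `pack` are hypothetical.

```python
from itertools import accumulate


def iota_inclusive(mask):
    """Each entry sums the mask bits up to and including its own position."""
    return list(accumulate(mask))


def iota_exclusive(mask):
    """Each entry sums the lower-order mask bits, excluding its own position."""
    return [total - bit for total, bit in zip(accumulate(mask), mask)]


def pack(data, mask):
    """Gather the selected elements, using the exclusive iota as output index."""
    out = [None] * sum(mask)
    for value, bit, index in zip(data, mask, iota_exclusive(mask)):
        if bit:
            out[index] = value
    return out


if __name__ == "__main__":
    mask = [1, 0, 1, 1, 0]
    print(iota_inclusive(mask))
    print(iota_exclusive(mask))
    print(pack(["a", "b", "c", "d", "e"], mask))
```

The `pack` helper shows the kind of loop the abstract refers to: the output position of each element depends on how many earlier elements were selected, and the iota vector supplies all of those positions up front.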
  • Patent number: 6269435
    Abstract: A processor implements conditional vector operations in which an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed is divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data has been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication.
    Type: Grant
    Filed: September 14, 1998
    Date of Patent: July 31, 2001
    Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of Technology
    Inventors: William J. Dally, Scott Whitney Rixner, John Owens, Ujval J. Kapasi
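
The idea in the abstract above of steering data into two index vectors according to a condition vector can be sketched as follows. This is an illustrative software analogue, not the patented hardware mechanism; `split_indices` and `conditional_apply` are hypothetical names.

```python
def split_indices(cond):
    """Split element positions into two index vectors by a condition vector."""
    true_idx = [i for i, c in enumerate(cond) if c]
    false_idx = [i for i, c in enumerate(cond) if not c]
    return true_idx, false_idx


def conditional_apply(data, cond, if_true, if_false):
    """Process each group in its own branch-free pass over an index vector."""
    true_idx, false_idx = split_indices(cond)
    out = list(data)
    for i in true_idx:       # every element in this pass takes the same path
        out[i] = if_true(data[i])
    for i in false_idx:
        out[i] = if_false(data[i])
    return out


if __name__ == "__main__":
    values = [3, -1, 4, -5]
    cond = [v >= 0 for v in values]
    print(split_indices(cond))
    print(conditional_apply(values, cond, lambda v: v * 2, lambda v: -v))
```

Once the positions are segregated, each pass runs one operation over a dense index vector, so no per-element condition is evaluated inside the processing loops.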
  • Patent number: 6249858
    Abstract: An information processing apparatus such as a microcomputer consisting of a CPU and a coprocessor is provided. The CPU and the coprocessor are connected through a data bus and an address bus. Switches are disposed in the data bus and the address bus which block communication between the CPU and the coprocessor upon execution of an instruction in the coprocessor, thereby allowing the CPU to operate in parallel with the coprocessor.
    Type: Grant
    Filed: February 16, 1999
    Date of Patent: June 19, 2001
    Assignee: Denso Corporation
    Inventors: Hiroshi Hayakawa, Harutsugu Fukumoto, Hiroaki Tanaka, Hideaki Ishihara
  • Publication number: 20010002484
    Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations. A number of specialized graphics instructions and accompanying hardware for executing them are disclosed to optimize the execution of graphics instruction with minimal additional hardware for a general purpose CPU.
    Type: Application
    Filed: January 4, 2001
    Publication date: May 31, 2001
    Applicant: Sun Microsystems, Inc.
    Inventor: Robert Yung
  • Patent number: 6237066
    Abstract: One embodiment of the present invention provides an apparatus that supports multiple outstanding load and/or store requests from an execution engine to multiple sources of data in a computer system. This apparatus includes a load store unit coupled to the execution engine, a first data source and a second data source. This load store unit includes a load address buffer, which contains addresses for multiple outstanding load requests. The load store unit also includes a controller that coordinates data flow between the load address buffer, a register file, the first data source and the second data source so that multiple load requests can simultaneously be outstanding for both the first data source and the second data source. These load requests return in-order for each of the multiple sources of data in the computer system, except for load requests directed to a data cache which can return out-of-order. Load requests may return out-of-order with respect to load requests from other data sources.
    Type: Grant
    Filed: March 22, 1999
    Date of Patent: May 22, 2001
    Assignee: Sun Microsystems, Inc.
    Inventors: Bi-Yu Pan, Marc Tremblay
  • Patent number: 6209126
    Abstract: A stall detecting apparatus and a stall detecting method reduce labor and time to develop a program. The apparatus has an input portion for reading a source program, an interpreter for interpreting the read source program according to processor specifications, an instruction developing unit for developing the interpreted source program into states in pipeline stages of pipeline processing, and a stall detector for detecting stalls in the pipeline processing according to the states of the source program developed in the pipeline stages and providing stall information representing the detected stalls. The stall detecting method realizes these functions of the stall detecting apparatus. The method and apparatus statically analyze a given source program while the source program is being coded and efficiently detect stalls to occur in the source program. The method and apparatus display the stall information together with the source program and a pipeline image of the pipeline processing of the source program.
    Type: Grant
    Filed: August 27, 1998
    Date of Patent: March 27, 2001
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Yoko Sasaki, Atsushi Kunimatsu
  • Patent number: 6202141
    Abstract: A vector multiplication mechanism is provided that partitions vector multiplication operation into even and odd paths. In an odd path, odd data elements of first and second source vectors are selected, and multiplication operation is performed between each of the selected odd data elements of the first source vector and corresponding one of the selected odd data elements of the second source vector. In an even path, even data elements of the source vectors are selected, and multiplication operation is performed between each of the selected even data elements of the first source vector and corresponding one of the selected even data elements of the second source vector. Elements of resultant data of the two paths are merged together in a merge operation. The vector multiplication mechanism of the present invention preferably uses a single general-purpose register to store the resultant data of the odd path and the even path.
    Type: Grant
    Filed: June 16, 1998
    Date of Patent: March 13, 2001
    Assignee: International Business Machines Corporation
    Inventors: Keith Everett Diefendorff, Pradeep Kumar Dubey, Ronald Ray Hochsprung, Brett Olsson, Hunter Ledbetter Scales, III
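
The even/odd partitioning described in the abstract above can be sketched in software as two independent element-wise multiplies followed by a merge. This is a minimal illustration that ignores operand widths and register allocation, which the abstract leaves to the hardware; `even_odd_multiply` is a hypothetical name.

```python
def even_odd_multiply(a, b):
    """Multiply two vectors element-wise via separate even and odd paths."""
    if len(a) != len(b):
        raise ValueError("source vectors must have the same length")
    # Even path: elements at positions 0, 2, 4, ...
    even = [x * y for x, y in zip(a[0::2], b[0::2])]
    # Odd path: elements at positions 1, 3, 5, ...
    odd = [x * y for x, y in zip(a[1::2], b[1::2])]
    # Merge: interleave the two result streams back into element order.
    merged = [0] * len(a)
    merged[0::2] = even
    merged[1::2] = odd
    return merged


if __name__ == "__main__":
    print(even_odd_multiply([1, 2, 3, 4], [5, 6, 7, 8]))
```

The merged result matches a plain element-wise product; the split only changes how the work is scheduled, since the even and odd paths have no dependency on each other.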
  • Patent number: 6157994
    Abstract: A control bit vector storage is provided. The present control bit vector storage (preferably included within a functional unit) stores control bits indicative of a particular instruction. The control bits are divided into multiple control vectors, each vector indicative of one instruction operation. The control bits control dataflow elements within the functional unit to cause the instruction operation to be performed. Additionally, the present control bit vector storage allows complex instructions (or instructions which produce multiple results) to be divided into simpler operations. The hardware included within the functional unit may be reduced to that employed to perform the simpler operations. In one embodiment, the control bit vector storage comprises a plurality of vector storages. Each vector storage comprises a pair of individual vector storages and a shared vector storage. The shared vector storage stores control bits common to both control vectors.
    Type: Grant
    Filed: July 8, 1998
    Date of Patent: December 5, 2000
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Marty L. Pflum
  • Patent number: 6073158
    Abstract: A system and method for time slicing multiple received data streams utilizing multiple processors in such a manner as to ensure that all processors are running at full capability and are efficiently timesharing a global memory storage area. The received data streams are each divided into fixed portions called spans. The invention is operable for sequencing the movement of the time-sliced spans between the processors, adjusting the scheduling of particular ones of the time-sliced spans as a function of either processor availability or maintenance of real-time transmission of the received real-time time-sliced data streams.
    Type: Grant
    Filed: July 29, 1993
    Date of Patent: June 6, 2000
    Assignee: Cirrus Logic, Inc.
    Inventors: Robert Marshall Nally, John Charles Schafer
  • Patent number: 6061777
    Abstract: One aspect of the invention relates to a method for operating a processor. In one version of the invention, the method includes the steps of dispatching an instruction; determining a presently architected RMAP entry for the architectural register targeted by the dispatched instruction; selecting the RMAP entries which are associated with physical registers that contain operands for the dispatched instruction; updating a use indicator in the selected RMAP entries; determining whether the dispatched instruction is interruptible; and updating an architectural indicator and a historical indicator in the presently architected RMAP entry if the dispatched instruction is uninterruptible.
    Type: Grant
    Filed: October 28, 1997
    Date of Patent: May 9, 2000
    Assignee: International Business Machines Corporation
    Inventors: Hoichi Cheong, Paul Joseph Jordan, Hung Qui Le, Soummya Mallick
  • Patent number: 6055558
    Abstract: A system and method for pacing, or controlling, the processing of multiple producers when a consumer requires results from the producers in natural order. This invention regulates the use of system resources between the producers to ensure that the required results are available to the consumer in natural order with minimal waiting and to prevent unneeded advanced processing by the producers. This invention implements a buffer structure such that each producer writes its results to an associated buffer. Each producer compares its buffer's percentage complete against a next and previous producer's buffer. If a producer produces results too rapidly, the producer suspends itself until it is resumed by the consumer or the previous producer. The consumer reads the results from the buffers in producer order.
    Type: Grant
    Filed: May 28, 1996
    Date of Patent: April 25, 2000
    Assignee: International Business Machines Corporation
    Inventors: Fen-Ling Lin, Bryan F. Smith, Yun Wang
  • Patent number: 6044448
    Abstract: A processor having a sliceable architecture wherein a slice is the minimum configuration of the processor datapath. The processor can instantiate multiple slices and each slice has a separate datapath. The total processor datapath is the sum of the number of slices multiplied by the width of a slice. Accordingly, all general purpose registers in the processor are as wide as the total datapath. A program executing on the processor can determine the maximum number of slices available in a particular processor by reading a register. In addition, a program can select the number of slices it will use by writing to a different register. The processor replicates control signals for each active slice in the processor and supports instructions for transferring data among the slices. Furthermore, the processor supports a set of instructions for fetching and storing data between multiple slices and the memory.
    Type: Grant
    Filed: December 16, 1997
    Date of Patent: March 28, 2000
    Assignee: S3 Incorporated
    Inventors: Nitin Agrawal, Sunil Nanda
  • Patent number: 5996066
    Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations. A number of specialized graphics instructions and accompanying hardware for executing them are disclosed to optimize the execution of graphics instruction with minimal additional hardware for a general purpose CPU.
    Type: Grant
    Filed: October 10, 1996
    Date of Patent: November 30, 1999
    Assignee: Sun Microsystems, Inc.
    Inventor: Robert Yung