Concurrent Patents (Class 712/9)

Method and apparatus for set intersection rule matching

Patent number: 7027446

Abstract: A method and apparatus for high-speed, memory-efficient rule matching, the rule matching being performed on an m-dimensional universe with each dimension bounded by a given range of coordinate values, and a set of rules that apply to an undetermined number of coordinates in that universe. More specifically, a high-speed computer-based packet classification system uses an innovative set intersection memory configuration to provide efficient matching of packets flowing through a network system to a specific process flow based on a packet tuple. The system also provides classification of packets as they flow through a network system. More particularly, this system correlates these flowing packets with previously received packets, along with identifying the packets so that they are handled efficiently. The ability to correlate packets to their corresponding process flows permits the implementation of service aware networks (SAN) that are capable of handling network situations at the application level.
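
As an illustration only, the general idea of matching by set intersection can be sketched as follows. The abstract does not describe the memory layout, so the rule representation (one inclusive range per dimension) and the function name `matching_rules` are assumptions made for this sketch, not the patented design.

```python
def matching_rules(rules, point):
    """Return the ids of all rules whose ranges contain `point`.

    rules: list of rules; each rule is a tuple of (low, high) inclusive
           ranges, one per dimension (hypothetical representation).
    point: tuple of coordinates, one per dimension (e.g. a packet tuple).
    """
    candidates = set(range(len(rules)))
    for dim, coord in enumerate(point):
        # Set of rules that match in this one dimension.
        in_dim = {
            rid for rid, ranges in enumerate(rules)
            if ranges[dim][0] <= coord <= ranges[dim][1]
        }
        # A rule applies only if it matches in every dimension.
        candidates &= in_dim
    return candidates


example_rules = [
    ((0, 10), (0, 10)),
    ((5, 15), (0, 5)),
    ((0, 100), (20, 30)),
]
```

A real classifier would presumably precompute the per-dimension sets rather than scan every rule per packet; the scan here only keeps the sketch short.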

Type: Grant

Filed: July 18, 2001

Date of Patent: April 11, 2006

Assignee: P-Cube Ltd.

Inventors: Amir Rosenfeld, Ori Finkelman, Reuven A Marko
Fast and flexible scan conversion and matrix transpose in a SIMD processor

Patent number: 6963341

Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.
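
As a rough software analogy (not the SIMD instructions claimed in the patent), both zigzag conversion and matrix transpose can be expressed as one gather through a precomputed index vector, which is the role a vector multiplex operation would play. The function names below are hypothetical.

```python
def zigzag_indices(n):
    """Flat source indices of an n x n row-major block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run bottom-left to top-right
        for r in rows:
            order.append(r * n + (s - r))
    return order


def transpose_indices(n):
    """Flat source indices that transpose an n x n row-major block."""
    return [c * n + r for r in range(n) for c in range(n)]


def permute(block, indices):
    """Gather: output element i is block[indices[i]]."""
    return [block[i] for i in indices]
```

Because only the index vector changes between scan patterns, the same `permute` step serves 2×2, 4×4 and 8×8 blocks.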

Type: Grant

Filed: May 20, 2003

Date of Patent: November 8, 2005

Inventor: Tibet Mimar
Viterbi decoding for SIMD vector processors with indirect vector element access

Patent number: 6954841

Abstract: A configuration of vector units, digital circuitry and associated instructions is disclosed for the parallel processing of multiple Viterbi decoder butterflies on a programmable digital signal processor (DSP) that is based on single-instruction-multiple-data (SIMD) principles and provides indirect access to vector elements. The disclosed configuration uses a processor with two vector units and associated registers, where the vector units are connected back to back for processing Viterbi decoder state metrics. Viterbi add instructions increment vectors of state metrics from a first register, performing a desired permutation of state metrics while reading them indirectly through vector pointers, and writing intermediate result vectors to a second register.
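
A minimal scalar sketch of the add step described above, assuming the pointer vectors simply name which old state metric feeds each new state. The compare-select step and all function names are assumptions added to complete a textbook butterfly; the abstract itself only describes the indirect add.

```python
def viterbi_add(state_metrics, pointers, branch_metrics):
    """Read state metrics indirectly through `pointers` and add branch metrics."""
    return [state_metrics[p] + b for p, b in zip(pointers, branch_metrics)]


def add_compare_select(state_metrics, ptr0, ptr1, bm0, bm1):
    """Form two candidate metrics per new state and keep the smaller one."""
    cand0 = viterbi_add(state_metrics, ptr0, bm0)  # first intermediate vector
    cand1 = viterbi_add(state_metrics, ptr1, bm1)  # second intermediate vector
    return [min(a, b) for a, b in zip(cand0, cand1)]
```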

Type: Grant

Filed: September 13, 2002

Date of Patent: October 11, 2005

Assignee: International Business Machines Corporation

Inventors: Jaime Humberto Moreno, Fredy Daniel Neeser
Method and apparatus for vector processing

Patent number: 6922716

Abstract: A processor includes a first vector processing unit including a first register file and first vector arithmetic logic unit; a second vector processing unit including a second register file and second vector arithmetic logic unit, wherein the first register file has a first plurality of cross connections to the second vector arithmetic logic unit, and wherein the second register file has a second plurality of cross connections to the first vector arithmetic logic unit.

Type: Grant

Filed: July 13, 2001

Date of Patent: July 26, 2005

Assignee: Motorola, Inc.

Inventors: Vipul Anil Desai, David P. Gurney, Benson Chau
Data processor having a respective multiplexer for each particular field

Patent number: 6904510

Abstract: A data processor that can perform instructions in parallel on respective fields in an operand includes a respective multiplexer for each of the respective fields. Each respective multiplexer is controlled by condition data for a particular field, preferably from an addressable storage unit. The condition may take three or more values for each field, which allows multiplexing between three or more values, reflecting a less than, equal to, or greater than relation between respective compare inputs. The inputs of the multiplexers can share read ports to a register file with more than one functional unit connected to only two read ports.
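
The three-way per-field selection can be modelled in software as below. This is only a behavioural sketch: the encoding of the condition as -1, 0, +1 and the function names are assumptions, and nothing here reflects the register-file port sharing mentioned in the abstract.

```python
def compare_fields(a, b):
    """Per-field condition: -1 if a < b, 0 if equal, +1 if a > b."""
    return [(x > y) - (x < y) for x, y in zip(a, b)]


def multiplex_fields(conditions, if_less, if_equal, if_greater):
    """For each field, pick one of three inputs according to its condition."""
    sources = {-1: if_less, 0: if_equal, 1: if_greater}
    return [sources[c][i] for i, c in enumerate(conditions)]
```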

Type: Grant

Filed: October 7, 1999

Date of Patent: June 7, 2005

Assignee: Koninklijke Philips Electronics N.V.

Inventor: Fransiscus W. Sijstermans
Shared memory type vector processing system, including a bus for transferring a vector processing instruction, and control method thereof

Patent number: 6782468

Abstract: A shared memory type vector processing system in which CPUs are connected by a bus for transferring a vector processing instruction generated from any of the CPUs to each of the CPUs, and the respective CPUs are grouped into a master CPU which issues a vector processing instruction to other CPUs and slave CPUs operating as a multi-vector pipeline in synchronization with a vector processing unit in the master CPU, the master CPU including a memory access control unit for issuing said vector processing instruction with issuing source CPU information attached for identifying an issuing source CPU, and transferring said instruction to all the CPUs including its own CPU through a bus, and the master CPU and the slave CPU including a vector processing instruction control unit for comparing issuing source CPU information contained in a vector processing instruction and master CPU information set at its own CPU and conducting instruction issuance based on the vector processing instruction when the information accord

Type: Grant

Filed: December 13, 1999

Date of Patent: August 24, 2004

Assignee: NEC Corporation

Inventor: Satoshi Nakazato
Branch preparation

Patent number: 6560775

Abstract: A method and system for preparing branch instructions of a computer program for compilation and execution in a computer system, in which each transfer instruction is split into two instructions: a control transfer preparation instruction and a control transfer instruction. The control transfer preparation instruction contains the transfer address and is placed by the compiler several instructions ahead of the control transfer instruction, so that the number of clock cycles in the pipeline between generation of the transfer condition and the transfer itself is reduced.

Type: Grant

Filed: December 24, 1998

Date of Patent: May 6, 2003

Assignee: Elbrus International Limited

Inventors: Alexander M. Artymov, Boris A. Babaian, Feodor A. Gruzdov, Alexey P. Lizorkin, Yuli K. Sakhin, Evgeny Z. Stolyarsky
Clipping data values in a data processing system

Patent number: 6504495

Abstract: A clipping and quantization technique is described for producing clipped numbers in a range of 0 to N−1 (from unclipped numbers in a range of −0.5N to 1.5N−1), where N is 2^m and m is the bit length of the desired clipped and quantized number. The most significant bit of the unclipped data value indicates whether an overflow of the permitted range has occurred and that clipping is required. The next most significant bit (the (m−1)th) indicates which saturated value should be adopted. These properties of the unclipped data value may be exploited to generate the desired clipped and quantized numbers using logical left shifting and conditionally executed saturating instructions executing upon a general purpose processor. The shifting operations performed to achieve saturation operation may simultaneously yield quantization.
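
The two-bit test described above can be checked with a small sketch. It assumes the unclipped value is held as an (m+1)-bit wrapped (two's complement) number, which is one reading of the abstract; the shifting and saturating instructions of the patent are not modelled, and `clip` is a hypothetical name.

```python
def clip(x, m):
    """Clip x from [-N/2, 1.5N - 1] to [0, N - 1], where N = 2**m and m >= 1."""
    n = 1 << m
    if not -(n // 2) <= x <= n + n // 2 - 1:
        raise ValueError("x outside the range this bit test can handle")
    u = x & (2 * n - 1)          # wrap into m+1 bits
    if (u >> m) & 1 == 0:        # top bit clear: already within 0..N-1
        return u
    if (u >> (m - 1)) & 1:       # next bit set: value was negative
        return 0
    return n - 1                 # next bit clear: value was N or above
```

For m = 8 this accepts inputs from −128 to 383 and returns values from 0 to 255.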

Type: Grant

Filed: February 17, 1999

Date of Patent: January 7, 2003

Assignee: Arm Limited

Inventors: Dominic Hugo Symes, Wilco Dijkstra
Method and apparatus for single cycle processing of data associated with separate accumulators in a dual multiply-accumulate architecture

Patent number: 6446193

Abstract: A method and apparatus for reducing instruction cycles in a digital signal processor wherein the processor includes a multiplier unit, an adder, a memory, and at least one pair of first and second accumulators. The accumulators include respective guard, high and low parts. The method and apparatus enable vectoring the respective first and second high parts from the accumulators to define a single vectored register responsive to a single instruction cycle and processing the data in the vectored register.

Type: Grant

Filed: September 8, 1997

Date of Patent: September 3, 2002

Assignee: Agere Systems Guardian Corp.

Inventors: Mazhar M. Alidina, Sivanand Simanapalli, Larry R. Tate
Execution unit for processing a data stream independently and in parallel

Patent number: 6401194

Abstract: A vector processor provides a data path divided into smaller slices of data, with each slice processed in parallel with the other slices. Furthermore, an execution unit provides smaller arithmetic and functional units chained together to execute more complex microprocessor instructions requiring multiple cycles by sharing single-cycle operations, thereby reducing both costs and size of the microprocessor. One embodiment handles 288-bit data widths using 36-bit data path slices. Another embodiment executes integer multiply and multiply-and-accumulate and floating point add/subtract and multiply operations using single-cycle arithmetic logic units. Other embodiments support 8-bit, 9-bit, 16-bit, and 32-bit integer data types and 32-bit floating data types.

Type: Grant

Filed: January 28, 1997

Date of Patent: June 4, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventors: Le Trong Nguyen, Heonchul Park, Roney S. Wong, Ted Nguyen, Edward H. Yu
Field programmable processor using dedicated arithmetic fixed function processing elements

Publication number: 20020019925

Abstract: A field programmable processor comprises a regular array of processing elements each of which is adapted to perform a fixed arithmetic function on packets of data. The processing elements are interconnected by an array of signal conductors extending adjacent the processing elements. Switching means are provided for selectively connecting the processing elements to the adjacent signal conductors so as to interconnect the processing elements. Program data representing desired processing element interconnections is stored, and the switching means is controlled in accordance with the stored program data to achieve the desired processing element interconnections. The packets of data are transmitted between the interconnected processing elements.

Type: Application

Filed: December 4, 1998

Publication date: February 14, 2002

Inventors: Andrew Dewhurst, Gorden Work
Method of operating a computer system by identifying source code computational elements in main memory

Patent number: 6336154

Abstract: A computer system comprises: a processing system for processing data; a memory for storing data processed by, or to be processed by, the processing system; a memory access controller for controlling access to the memory; and at least one data buffer for buffering data to be written to or read from the memory. A burst controller is provided for issuing burst instructions to the memory access controller, and the memory access controller is responsive to such a burst instruction to transfer a plurality of data words between the memory and the data buffer in a single memory transaction. A burst instruction queue is provided so that such a burst instruction can be made available for execution by the memory access controller immediately after a preceding burst instruction has been executed.

Type: Grant

Filed: June 20, 2000

Date of Patent: January 1, 2002

Assignee: Hewlett-Packard Company

Inventors: Dominic Paul McCarthy, Stuart Victor Quick
Determinism in a multiprocessor computer system and monitor and processor therefor

Patent number: 6327668

Abstract: A multiprocessor computer system which provides fault tolerance includes a number of processing sets. At least one of the processing sets is operable asynchronously of a second processing set. A monitor is connected to receive I/O operations output from the processing sets for identifying faulty operation of those units. The monitor is also operable to synchronise operation of the processing sets by signalling the processing sets on receipt of outputs from those units indicative of a plurality of them being at an equivalent stage of processing. The monitor provides for buffering of I/O operations output from the processing sets and for selective forwarding of those I/O operations to an external I/O bus. The processing set may be formed from a single processor or from multiple processors.

Type: Grant

Filed: June 30, 1998

Date of Patent: December 4, 2001

Assignee: Sun Microsystems, Inc.

Inventor: Emrys J. Williams
Processor having vector processing capability and method for executing a vector instruction in a processor

Patent number: 6324638

Abstract: A processor capable of executing vector instructions includes at least an instruction sequencing unit and a vector processing unit that receives vector instructions to be executed from the instruction sequencing unit. The vector processing unit includes a plurality of multiply structures, each containing only a single multiply array, that each correspond to at least one element of a vector input operand. Utilizing the single multiply array, each of the plurality of multiply structures is capable of performing a multiplication operation on one element of a vector input operand and is also capable of performing a multiplication operation on multiple elements of a vector input operand concurrently. In an embodiment in which the maximum length of an element of a vector input operand is N bits, each of the plurality of multiply arrays can handle both N by N bit integer multiplication and M by M bit integer multiplication, where N is a non-unitary integer multiple of M.

Type: Grant

Filed: March 31, 1999

Date of Patent: November 27, 2001

Assignee: International Business Machines Corporation

Inventors: Thomas Elmer, Michael Putrino
Techniques for an interrupt free operating system

Patent number: 6314471

Abstract: A method and system in a multithreaded processor for processing events without interrupt notifications. In one aspect of the present invention, an operating system creates a thread to execute on a stream of the processor. During execution of the thread, the thread executes a loop that determines whether an event has occurred and, in response to determining that an event has occurred, assigns a different thread to process the event so that multiple events can be processed in parallel and so that interrupts are not needed to signal that the event has occurred. Another aspect of the present invention provides a method and system for processing asynchronously occurring events without interrupt notifications. To achieve this processing, a first thread is executed to generate a notification that the event has occurred upon receipt of the asynchronously occurring event.

Type: Grant

Filed: November 13, 1998

Date of Patent: November 6, 2001

Assignee: Cray Inc.

Inventors: Gail A. Alverson, Charles David Callahan, II, Susan L. Coatney, Laurence S. Kaplan, Richard D. Korry
Method and apparatus for processing a set of data values with plural processing units mask bits generated by other processing units

Patent number: 6308250

Abstract: A method and system for operating a computing system having multiple processing units. According to a new machine instruction, called the iota instruction, the computing system operates on a vector of mask bits to generate an iota vector having a sequence of values. In one form, each value of the iota vector is a sum of a series of the lower order mask bits up to and including the mask bit corresponding to the entry in the iota vector. In another form, each entry in the iota vector is a sum of a series of lower order mask bits but does not include the mask bit corresponding to the particular entry in the iota vector. In order to calculate the iota vector, the multiple processing units of the present invention communicate the mask bits to the other processing units. Advantages of the present invention include the vectorization of software loops having certain data hazards that prevented conventional compilers from vectorizing the software.
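
The two forms of the iota vector described above correspond to inclusive and exclusive prefix sums over the mask bits. The sequential sketch below shows the values only, not the multi-unit hardware; the `compress` helper is a hypothetical example of the kind of loop such a vector lets a compiler vectorize.

```python
def iota_inclusive(mask):
    """Entry i is the sum of mask bits 0..i, including bit i."""
    out, total = [], 0
    for bit in mask:
        total += bit
        out.append(total)
    return out


def iota_exclusive(mask):
    """Entry i is the sum of mask bits 0..i-1, excluding bit i."""
    out, total = [], 0
    for bit in mask:
        out.append(total)
        total += bit
    return out


def compress(values, mask):
    """Pack the selected values densely; the exclusive iota gives each
    selected element its output position without a running counter."""
    positions = iota_exclusive(mask)
    out = [None] * sum(mask)
    for i, bit in enumerate(mask):
        if bit:
            out[positions[i]] = values[i]
    return out
```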

Type: Grant

Filed: June 23, 1998

Date of Patent: October 23, 2001

Assignee: Silicon Graphics, Inc.

Inventor: Peter Michael Klausler
System and method for implementing conditional vector operations in which an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector

Patent number: 6269435

Abstract: A processor implements conditional vector operations in which an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed is divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data has been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication.
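
A small sketch of the splitting idea, assuming the condition vector holds one boolean per element and that the two groups are represented by index vectors as the abstract suggests. The function names and the callback interface are assumptions for illustration.

```python
def split_indices(condition):
    """Steer each element index to one of two index vectors."""
    true_idx, false_idx = [], []
    for i, flag in enumerate(condition):
        (true_idx if flag else false_idx).append(i)
    return true_idx, false_idx


def apply_split(values, condition, on_true, on_false):
    """Process each group in its own loop, with no per-element branch
    inside either loop body."""
    true_idx, false_idx = split_indices(condition)
    out = list(values)
    for i in true_idx:
        out[i] = on_true(values[i])
    for i in false_idx:
        out[i] = on_false(values[i])
    return out
```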

Type: Grant

Filed: September 14, 1998

Date of Patent: July 31, 2001

Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of Technology

Inventors: William J. Dally, Scott Whitney Rixner, John Owens, Ujval J. Kapasi
Information processing apparatus having a CPU and an auxiliary arithmetic unit for achieving high-speed operation

Patent number: 6249858

Abstract: An information processing apparatus such as a microcomputer consisting of a CPU and a coprocessor is provided. The CPU and the coprocessor are connected through a data bus and an address bus. Switches are disposed in the data bus and the address bus which block communication between the CPU and the coprocessor upon execution of an instruction in the coprocessor, thereby allowing the CPU to operate in parallel with the coprocessor.

Type: Grant

Filed: February 16, 1999

Date of Patent: June 19, 2001

Assignee: Denso Corporation

Inventors: Hiroshi Hayakawa, Harutsugu Fukumoto, Hiroaki Tanaka, Hideaki Ishihara
Visual instruction set for CPU with integrated graphics functions

Publication number: 20010002484

Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations. A number of specialized graphics instructions and accompanying hardware for executing them are disclosed to optimize the execution of graphics instructions with minimal additional hardware for a general purpose CPU.

Type: Application

Filed: January 4, 2001

Publication date: May 31, 2001

Applicant: Sun Microsystems, Inc.

Inventor: Robert Yung
Supporting multiple outstanding requests to multiple targets in a pipelined memory system

Patent number: 6237066

Abstract: One embodiment of the present invention provides an apparatus that supports multiple outstanding load and/or store requests from an execution engine to multiple sources of data in a computer system. This apparatus includes a load store unit coupled to the execution engine, a first data source and a second data source. This load store unit includes a load address buffer, which contains addresses for multiple outstanding load requests. The load store unit also includes a controller that coordinates data flow between the load address buffer, a register file, the first data source and the second data source so that multiple load requests can simultaneously be outstanding for both the first data source and the second data source. These load requests return in-order for each of the multiple sources of data in the computer system, except for load requests directed to a data cache which can return out-of-order. Load requests may return out-of-order with respect to load requests from other data sources.

Type: Grant

Filed: March 22, 1999

Date of Patent: May 22, 2001

Assignee: Sun Microsystems, Inc.

Inventors: Bi-Yu Pan, Marc Tremblay
Stall detecting apparatus, stall detecting method, and medium containing stall detecting program

Patent number: 6209126

Abstract: A stall detecting apparatus and a stall detecting method reduce the labor and time needed to develop a program. The apparatus has an input portion for reading a source program, an interpreter for interpreting the read source program according to processor specifications, an instruction developing unit for developing the interpreted source program into states in pipeline stages of pipeline processing, and a stall detector for detecting stalls in the pipeline processing according to the states of the source program developed in the pipeline stages and providing stall information representing the detected stalls. The stall detecting method realizes these functions of the stall detecting apparatus. The method and apparatus statically analyze a given source program while the source program is being coded and efficiently detect stalls that would occur in the source program. The method and apparatus display the stall information together with the source program and a pipeline image of the pipeline processing of the source program.

Type: Grant

Filed: August 27, 1998

Date of Patent: March 27, 2001

Assignee: Kabushiki Kaisha Toshiba

Inventors: Yoko Sasaki, Atsushi Kunimatsu
Method and apparatus for performing vector operation using separate multiplication on odd and even data elements of source vectors

Patent number: 6202141

Abstract: A vector multiplication mechanism is provided that partitions a vector multiplication operation into even and odd paths. In the odd path, odd data elements of first and second source vectors are selected, and a multiplication operation is performed between each of the selected odd data elements of the first source vector and the corresponding one of the selected odd data elements of the second source vector. In the even path, even data elements of the source vectors are selected, and a multiplication operation is performed between each of the selected even data elements of the first source vector and the corresponding one of the selected even data elements of the second source vector. Elements of resultant data of the two paths are merged together in a merge operation. The vector multiplication mechanism of the present invention preferably uses a single general-purpose register to store the resultant data of the odd path and the even path.
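
The even/odd partition and the final merge can be illustrated with list slicing. This is a behavioural sketch only: element widths, register usage and the function name `multiply_even_odd` are assumptions, and elements are numbered from 0 so that index 0 counts as even.

```python
def multiply_even_odd(a, b):
    """Multiply two equal-length vectors via separate even and odd paths."""
    if len(a) != len(b):
        raise ValueError("source vectors must have the same length")
    even = [x * y for x, y in zip(a[0::2], b[0::2])]  # even path
    odd = [x * y for x, y in zip(a[1::2], b[1::2])]   # odd path
    merged = [0] * len(a)
    merged[0::2] = even  # merge step: interleave the two result streams
    merged[1::2] = odd
    return merged
```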

Type: Grant

Filed: June 16, 1998

Date of Patent: March 13, 2001

Assignee: International Business Machines Corporation

Inventors: Keith Everett Diefendorff, Pradeep Kumar Dubey, Ronald Ray Hochsprung, Brett Olsson, Hunter Ledbetter Scales, III
Microprocessor employing and method of using a control bit vector storage for instruction execution

Patent number: 6157994

Abstract: A control bit vector storage is provided. The present control bit vector storage (preferably included within a functional unit) stores control bits indicative of a particular instruction. The control bits are divided into multiple control vectors, each vector indicative of one instruction operation. The control bits control dataflow elements within the functional unit to cause the instruction operation to be performed. Additionally, the present control bit vector storage allows complex instructions (or instructions which produce multiple results) to be divided into simpler operations. The hardware included within the functional unit may be reduced to that employed to perform the simpler operations. In one embodiment, the control bit vector storage comprises a plurality of vector storages. Each vector storage comprises a pair of individual vector storages and a shared vector storage. The shared vector storage stores control bits common to both control vectors.

Type: Grant

Filed: July 8, 1998

Date of Patent: December 5, 2000

Assignee: Advanced Micro Devices, Inc.

Inventor: Marty L. Pflum
System and method for processing multiple received signal sources

Patent number: 6073158

Abstract: A system and method for time slicing multiple received data streams utilizing multiple processors in such a manner as to ensure that all processors are running at full capability and are efficiently timesharing a global memory storage area. The received data streams are each divided into fixed portions called spans. The invention is operable for sequencing the movement of the time-sliced spans between the processors, adjusting the scheduling of particular ones of the time-sliced spans as a function of either processor availability or maintenance of real-time transmission of the received real-time time-sliced data streams.

Type: Grant

Filed: July 29, 1993

Date of Patent: June 6, 2000

Assignee: Cirrus Logic, Inc.

Inventors: Robert Marshall Nally, John Charles Schafer
Apparatus and method for reducing the number of rename registers required in the operation of a processor

Patent number: 6061777

Abstract: One aspect of the invention relates to a method for operating a processor. In one version of the invention, the method includes the steps of dispatching an instruction; determining a presently architected RMAP entry for the architectural register targeted by the dispatched instruction; selecting the RMAP entries which are associated with physical registers that contain operands for the dispatched instruction; updating a use indicator in the selected RMAP entries; determining whether the dispatched instruction is interruptible; and updating an architectural indicator and a historical indicator in the presently architected RMAP entry if the dispatched instruction is uninterruptible.

Type: Grant

Filed: October 28, 1997

Date of Patent: May 9, 2000

Assignee: International Business Machines Corporation

Inventors: Hoichi Cheong, Paul Joseph Jordan, Hung Qui Le, Soummya Mallick
Pacing of multiple producers when information is required in natural order

Patent number: 6055558

Abstract: A system and method for pacing, or controlling, the processing of multiple producers when a consumer requires results from the producers in natural order. This invention regulates the use of system resources between the producers to ensure that the required results are available to the consumer in natural order with minimal waiting and to prevent unneeded advanced processing by the producers. This invention implements a buffer structure such that each producer writes its results to an associated buffer. Each producer compares its buffer's percentage complete against a next and previous producer's buffer. If a producer produces results too rapidly, the producer suspends itself until it is resumed by the consumer or the previous producer. The consumer reads the results from the buffers in producer order.

Type: Grant

Filed: May 28, 1996

Date of Patent: April 25, 2000

Assignee: International Business Machines Corporation

Inventors: Fen-Ling Lin, Bryan F. Smith, Yun Wang
Processor having multiple datapath instances

Patent number: 6044448

Abstract: A processor having a sliceable architecture wherein a slice is the minimum configuration of the processor datapath. The processor can instantiate multiple slices and each slice has a separate datapath. The total processor datapath width is the number of slices multiplied by the width of a slice. Accordingly, all general purpose registers in the processor are as wide as the total datapath. A program executing on the processor can determine the maximum number of slices available in a particular processor by reading a register. In addition, a program can select the number of slices it will use by writing to a different register. The processor replicates control signals for each active slice in the processor and supports instructions for transferring data among the slices. Furthermore, the processor supports a set of instructions for fetching and storing data between multiple slices and the memory.

Type: Grant

Filed: December 16, 1997

Date of Patent: March 28, 2000

Assignee: S3 Incorporated

Inventors: Nitin Agrawal, Sunil Nanda
Partitioned multiply and add/subtract instruction for CPU with integrated graphics functions

Patent number: 5996066

Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations. A number of specialized graphics instructions and accompanying hardware for executing them are disclosed to optimize the execution of graphics instructions with minimal additional hardware for a general purpose CPU.

Type: Grant

Filed: October 10, 1996

Date of Patent: November 30, 1999

Assignee: Sun Microsystems, Inc.

Inventor: Robert Yung
