Vector Processor Patents (Class 712/2)
  • Patent number: 6624819
    Abstract: A method and system for processing graphics data in a computer system are disclosed. The method and system include providing a general-purpose processor and a vector co-processor coupled with the general-purpose processor. The general-purpose processor includes an instruction queue for holding a plurality of instructions. The vector co-processor is for processing at least a portion of the graphics data using a portion of the plurality of instructions. The vector co-processor is capable of performing a plurality of mathematical operations in parallel. The plurality of instructions is provided using software written in a general-purpose programming language.
    Type: Grant
    Filed: June 8, 2000
    Date of Patent: September 23, 2003
    Assignee: Broadcom Corporation
    Inventor: Michael C. Lewis
  • Publication number: 20030115577
    Abstract: An optimal code generator for generating structured assembly language expressions is disclosed. Because of the equivalence between unit structured assembly language expressions and the code implementing them, it is possible to represent complex structured assembly language expressions as a vector of unit structured assembly language expressions. A set of rules for systematic manipulation is utilized to allow logical operations on the vector representation of structured assembly language expressions to result in optimal code. Using the equivalence between the code and unit structured assembly language expressions allows the vector representation of a structured assembly language expression to be translated directly into code.
    Type: Application
    Filed: August 24, 2001
    Publication date: June 19, 2003
    Applicant: International Business Machines Corporation
    Inventor: Joseph Franklin Garvey
  • Publication number: 20030079108
    Abstract: A method of executing instructions in a computer system on operands containing a plurality of packed objects in respective lanes of each operand is described. Each instruction defines an operation and contains a condition setting indicator settable independently of the operation. The status of the condition setting indicator determines whether or not multibit condition codes are set. When they are to be set, they are set depending on the results of carrying out the operation for each lane.
    Type: Application
    Filed: November 25, 2002
    Publication date: April 24, 2003
    Applicant: Broadcom UK Ltd.
    Inventor: Sophie Wilson
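    Editor's sketch (hypothetical C, not from the patent; LANES, packed_add and the condition-code encoding are invented for illustration): a per-lane add in which a condition-setting flag, independent of the operation, decides whether multibit condition codes are updated from each lane's result.
      #include <stdint.h>
      #include <stdio.h>

      #define LANES 4

      /* Per-lane multibit condition codes: bit0 = zero, bit1 = negative. */
      static uint8_t cc[LANES];

      /* Add corresponding 16-bit lanes of a and b; update cc[] only when
         the instruction's condition-setting indicator is set. */
      static void packed_add(const int16_t a[LANES], const int16_t b[LANES],
                             int16_t out[LANES], int set_cc)
      {
          for (int i = 0; i < LANES; i++) {
              out[i] = (int16_t)(a[i] + b[i]);
              if (set_cc)
                  cc[i] = (uint8_t)((out[i] == 0) | ((out[i] < 0) << 1));
          }
      }

      int main(void)
      {
          int16_t a[LANES] = { 1, -2, 0, 7 };
          int16_t b[LANES] = { -1, -2, 0, 1 };
          int16_t r[LANES];
          packed_add(a, b, r, 1);              /* condition setting enabled */
          for (int i = 0; i < LANES; i++)
              printf("lane %d: result %d, cc 0x%x\n", i, r[i], (unsigned)cc[i]);
          return 0;
      }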
  • Patent number: 6542981
    Abstract: A method and apparatus for invoking microcode instructions resident on a processor by executing a special RISC instruction on the processor such that special functions are provided. In one embodiment, the special function invoked may be a feature of the processor not included in the processor's publicly known instruction set. In another embodiment, the special function invoked may cause a set of instructions to be transferred from a memory external to the processor to a memory in the processor. In such an embodiment, the method and apparatus include authenticating and decrypting the instructions before they are transferred from the memory external to the processor to the memory in the processor. In such an embodiment, the method and apparatus may be used for upgrading microcode within a processor by executing the special RISC instruction stored on a writeable non-volatile memory located external to the processor.
    Type: Grant
    Filed: December 28, 1999
    Date of Patent: April 1, 2003
    Assignee: Intel Corporation
    Inventors: Nazar Abbas Zaidi, Gary Hammond, Kin-Yip Liu, Tse-Yu Yeh
  • Patent number: 6523026
    Abstract: A process of identifying terms or sets of terms in target domains having functional relationships (roles) analogous to terms (contained in the query) selected from a source domain, whereby query-relevant but semantically distant (novel) analogies may be retrieved, corresponding to any user-defined query. The process is capable of discovering deep functional analogies between terms in source and target domains, even where there is a misleading superficial matching of terms (i.e. same terms, with different meanings) between the query and the target domains. The process comprises the automated generation of abstract representations of source domain content, and application of the abstract representations of content to the efficient discovery of analogous objects in one or more semantically distant target domains. Said abstract representations of terms are preferably vectors in a high-dimensionality space, encapsulating characteristic occurrence patterns of terms in the source domain.
    Type: Grant
    Filed: October 2, 2000
    Date of Patent: February 18, 2003
    Assignee: Huntsman International LLC
    Inventor: Herbert R. Gillis
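    Editor's sketch (hypothetical C; DIMS, cosine and the sample vectors are invented, and cosine similarity is only one common way to compare occurrence-pattern vectors, not necessarily the patent's own measure): comparing a query term's vector against candidate target-domain terms.
      #include <math.h>
      #include <stdio.h>

      #define DIMS 4   /* toy dimensionality; the patent envisages a high-dimensional space */

      /* Cosine similarity between two term-occurrence vectors. */
      static double cosine(const double a[DIMS], const double b[DIMS])
      {
          double dot = 0.0, na = 0.0, nb = 0.0;
          for (int i = 0; i < DIMS; i++) {
              dot += a[i] * b[i];
              na  += a[i] * a[i];
              nb  += b[i] * b[i];
          }
          return dot / (sqrt(na) * sqrt(nb));
      }

      int main(void)
      {
          /* Hypothetical occurrence-pattern vectors for a query term (source
             domain) and two candidate terms from a target domain. */
          double query[DIMS] = { 0.9, 0.1, 0.7, 0.0 };
          double cand1[DIMS] = { 0.8, 0.2, 0.6, 0.1 };
          double cand2[DIMS] = { 0.0, 0.9, 0.1, 0.8 };
          printf("candidate 1: %.3f\n", cosine(query, cand1));
          printf("candidate 2: %.3f\n", cosine(query, cand2));
          return 0;
      }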
  • Patent number: 6448983
    Abstract: A method for assisting a user in selecting a design of experiment. The method includes obtaining a plurality of attributes associated with a plurality of experimental designs. A series of questions is presented to the user directed to objectives of the design of experiment. The user responds to the questions and one or more user-selected attributes are determined in response to the user's answers to the questions. The process selects or de-selects one or more of the experimental designs in response to the user-selected attributes and notifies the user of the selected experimental designs.
    Type: Grant
    Filed: January 14, 1999
    Date of Patent: September 10, 2002
    Assignee: General Electric Company
    Inventors: Mohamed Ahmed Ali, Ahmed Elasser, Arlie Russell Martin
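    Editor's sketch (hypothetical C; the attribute bitmask, struct design and select_designs are invented, and a bitmask filter is only one plausible reading of "selects or de-selects ... in response to the user-selected attributes"): keeping the experimental designs whose attributes cover those derived from the user's answers.
      #include <stdio.h>

      struct design {
          const char *name;
          unsigned attrs;        /* bitmask of attributes this design offers */
      };

      /* Keep only the designs whose attribute sets cover everything the
         user's answers selected; report the survivors. */
      static void select_designs(const struct design *d, int n, unsigned wanted)
      {
          for (int i = 0; i < n; i++)
              if ((d[i].attrs & wanted) == wanted)
                  printf("selected: %s\n", d[i].name);
      }

      int main(void)
      {
          enum { FEW_RUNS = 1u << 0, SCREENING = 1u << 1, QUADRATIC = 1u << 2 };
          struct design catalog[] = {
              { "full factorial",       SCREENING | QUADRATIC },
              { "fractional factorial", FEW_RUNS | SCREENING },
              { "central composite",    QUADRATIC },
          };
          /* Attributes inferred from the user's answers to the questionnaire. */
          unsigned wanted = FEW_RUNS | SCREENING;
          select_designs(catalog, 3, wanted);
          return 0;
      }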
  • Publication number: 20020040428
    Abstract: A vector computer system includes a plurality of memory banks 40, a vector processor 11, and a plurality of additional processing units 30 each of which is connected to one of the memory banks 40. Each of the additional processing units 30 reads data from the corresponding memory bank 40 by referring to an address designated by the processor 11, and performs a designated operation on the data. Then the additional processing unit 30 stores the result of the operation into the designated address.
    Type: Application
    Filed: September 28, 2001
    Publication date: April 4, 2002
    Inventor: Takumi Washio
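    Editor's sketch (hypothetical C; BANKS, bank_unit_op and the square operation are invented): a software model of per-bank processing units that read from their own bank at a processor-designated address, apply the designated operation, and store the result back.
      #include <stdio.h>

      #define BANKS 4
      #define BANK_SIZE 8

      /* Each memory bank has an attached unit that performs a designated
         operation in place at an address supplied by the vector processor. */
      static int bank[BANKS][BANK_SIZE];

      static void bank_unit_op(int b, int addr, int (*op)(int))
      {
          int v = bank[b][addr];        /* read from the local bank        */
          bank[b][addr] = op(v);        /* store the result back in place  */
      }

      static int square(int v) { return v * v; }

      int main(void)
      {
          for (int b = 0; b < BANKS; b++)
              for (int a = 0; a < BANK_SIZE; a++)
                  bank[b][a] = b + a;

          /* The processor designates address 3 and the "square" operation. */
          for (int b = 0; b < BANKS; b++)
              bank_unit_op(b, 3, square);

          for (int b = 0; b < BANKS; b++)
              printf("bank %d, addr 3: %d\n", b, bank[b][3]);
          return 0;
      }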
  • Patent number: 6349347
    Abstract: A method of configuring peer devices without unnecessary delay in boot-up time using a compatibility bridge. Upon initiating a configuration cycle, a BIOS initialization scans all peer devices located on the host bus. A watchdog timer times out after a predetermined duration when the intended device fails to respond to the configuration cycle. A bit corresponding to the particular device is set in a scorecard register. The compatibility bridge responds to the configuration cycle after the watchdog time-out period.
    Type: Grant
    Filed: June 28, 2000
    Date of Patent: February 19, 2002
    Assignee: Micron Technology, Inc.
    Inventor: A. Kent Porterfield
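    Editor's sketch (hypothetical C; WATCHDOG_LIMIT, config_cycle and the scorecard layout are invented abstractions of the hardware behavior): a configuration cycle that, on watchdog time-out, marks the absent device in a scorecard register and lets the compatibility bridge answer instead.
      #include <stdint.h>
      #include <stdio.h>

      #define WATCHDOG_LIMIT 16   /* hypothetical time-out in ticks */

      static uint32_t scorecard;  /* one bit per peer device that failed to respond */

      /* Issue a configuration cycle to device `dev`; `present` simulates
         whether the device responds before the watchdog expires. */
      static int config_cycle(int dev, int present)
      {
          for (int tick = 0; tick < WATCHDOG_LIMIT; tick++)
              if (present)
                  return 1;                   /* device claimed the cycle */
          scorecard |= 1u << dev;             /* mark the absent device   */
          return 0;                           /* compatibility bridge responds instead */
      }

      int main(void)
      {
          for (int dev = 0; dev < 4; dev++)
              config_cycle(dev, dev != 2);    /* pretend device 2 is missing */
          printf("scorecard: 0x%x\n", scorecard);
          return 0;
      }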
  • Patent number: 6343337
    Abstract: A crossbar is implemented within multimedia facilities of a processor to perform vector permute operations, in which the bytes of a source operand are reordered in the target output. The crossbar is then reused for other instructions requiring multiplexing or shifting operations, particularly those in which the size of additional multiplexers or the size and delay of a barrel shifter is significant. A wide shift operation, for example, may be performed with one cycle latency by the crossbar and one additional layer of multiplexers or a small barrel shifter. The crossbar facility thus gets reused with improved performance of the instructions now sharing the crossbar and a reduction in the total area required by a multimedia facility within a processor.
    Type: Grant
    Filed: May 17, 2000
    Date of Patent: January 29, 2002
    Assignee: International Business Machines Corporation
    Inventors: Pradeep Kumar Dubey, Brett Olsson, Charles Philip Roth, Keith Everett Diefendorf, Ronald Ray Hochsprung, Hunter Ledbetter Scales, III
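    Editor's sketch (hypothetical C; WIDTH, vperm and the selector semantics are invented): the core byte-permute behavior the crossbar provides, selecting one source byte per output byte. A wide shift can then be expressed as such a permute plus a small residual shift, which is the reuse the abstract describes.
      #include <stdint.h>
      #include <stdio.h>

      #define WIDTH 16   /* bytes in the source operand */

      /* Reorder the bytes of `src` into `dst` according to `sel`, the way a
         crossbar-based vector permute selects one source byte per output byte. */
      static void vperm(const uint8_t src[WIDTH], const uint8_t sel[WIDTH],
                        uint8_t dst[WIDTH])
      {
          for (int i = 0; i < WIDTH; i++)
              dst[i] = src[sel[i] % WIDTH];
      }

      int main(void)
      {
          uint8_t src[WIDTH], sel[WIDTH], dst[WIDTH];
          for (int i = 0; i < WIDTH; i++) {
              src[i] = (uint8_t)('A' + i);
              sel[i] = (uint8_t)(WIDTH - 1 - i);   /* a full byte reversal */
          }
          vperm(src, sel, dst);
          fwrite(dst, 1, WIDTH, stdout);
          putchar('\n');
          return 0;
      }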
  • Patent number: 6327651
    Abstract: A crossbar is implemented within multimedia facilities of a processor to perform vector permute operations, in which the bytes of a source operand are reordered in the target output. The crossbar is then reused for other instructions requiring multiplexing or shifting operations, particularly those in which the size of additional multiplexers or the size and delay of a barrel shifter is significant. A wide shift operation, for example, may be performed with one cycle latency by the crossbar and one additional layer of multiplexers or a small barrel shifter. The crossbar facility thus gets reused with improved performance of the instructions now sharing the crossbar and a reduction in the total area required by a multimedia facility within a processor.
    Type: Grant
    Filed: September 8, 1998
    Date of Patent: December 4, 2001
    Assignees: International Business Machines Corporation, IBM Corporation
    Inventors: Pradeep Kumar Dubey, Brett Olsson, Charles Philip Roth, Keith Everett Diefendorf, Ronald Ray Hochsprung, Hunter Ledbetter Scales, III
  • Patent number: 6324600
    Abstract: A method and an apparatus for controlling movement of data between any host and any network including a set of devices in a computing system environment having a main memory with a queuing mechanism having a plurality of queues capable of being shared between a plurality of independent processes running on at least one host and at least one I/O adapter. A finite-state machine (FSM) is provided in the main memory and the FSM is divided into two disjoint sets of states, one of which represents state-values processed by the host and set by the adapter, and the other of which represents state-values processed by the adapter and set by the host. Using each of these sets of states, free-running, non-deadlocking processes are provided within the host and the adapter so that the processes sequence circularly and continuously through a vector related to the FSM.
    Type: Grant
    Filed: February 19, 1999
    Date of Patent: November 27, 2001
    Assignee: International Business Machines Corporation
    Inventors: Frank W. Brice, Richard P. Tarcza, Leslie W. Wyman
  • Publication number: 20010042187
    Abstract: A processor has a flexible architecture that efficiently handles computing applications having a range of instruction-level parallelism from a very low degree to a very high degree of instruction-level parallelism. The processor includes a plurality of processing units, an individual processing unit of the plurality of processing units including a multiple-instruction parallel execution path. For computing applications having a low degree of instruction-level parallelism, the processor includes control logic that controls the plurality of processing units to execute instructions mutually independently in a plurality of independent execution threads. For computing applications having a high degree of instruction-level parallelism, the processor further includes control logic that controls the plurality of processing units with low thread synchronization to operate in combination using spatial software pipelining in the manner of a single wide-issue processor.
    Type: Application
    Filed: December 3, 1998
    Publication date: November 15, 2001
    Inventor: Marc Tremblay
  • Patent number: 6317819
    Abstract: A digital data processor integrated circuit (1) includes a plurality of functionally identical first processor elements (6A) and a second processor element (5). The first processor elements are bidirectionally coupled to a first cache (12) via a crossbar switch matrix (8). The second processor element is coupled to a second cache (11). Each of the first cache and the second cache contain a two-way, set-associative cache memory that uses a least-recently-used (LRU) replacement algorithm and that operates with a use-as-fill mode to minimize a number of wait states said processor elements need experience before continuing execution after a cache-miss. An operation of each of the first processor elements and an operation of the second processor element are locked together during an execution of a single instruction read from the second cache.
    Type: Grant
    Filed: February 24, 1999
    Date of Patent: November 13, 2001
    Inventor: Steven G. Morton
  • Patent number: 6311280
    Abstract: A battery-powered portable radio device saves on the overall power consumed by the whole device by skipping unnecessary read, write, and refresh cycles of the internal main memory DRAM core. Streaming data input from a radio receiver is analyzed by a vector processor. The DRAM main memory and the vector processor itself share real estate on a common semiconductor chip. This allows a very wide row of DRAM memory to communicate 1024 bits wide with an eight-line cache. Six lines of the cache are reserved for memory operations, and two lines are reserved for I/O operations. Streaming data from the radio receiver is stored in the DRAM main memory via the two I/O cache lines. As raw data is needed by the vector processor, whole DRAM rows are downloaded to the six lines of memory cache. The single-instruction multiple data vector processor rolls intermediate data around through the cache without causing it to write back to the DRAM.
    Type: Grant
    Filed: February 22, 1999
    Date of Patent: October 30, 2001
    Assignee: nBand Communications
    Inventor: Sanjay Vishin
  • Patent number: 6266758
    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register.
    Type: Grant
    Filed: March 5, 1999
    Date of Patent: July 24, 2001
    Assignee: MIPS Technologies, Inc.
    Inventors: Timothy J. van Hook, Peter Hsu, William A. Huffman, Henry P. Moreton, Earl A. Killian
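    Editor's sketch (hypothetical C; VBYTES and align_extract are invented): the extraction step described above, building an aligned vector that starts at a given byte of the concatenation of two loaded registers.
      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      #define VBYTES 8

      /* Build an aligned vector starting at byte `start` (0..VBYTES) of the
         concatenation of two registers r1:r2, roughly what the extract step does. */
      static void align_extract(const uint8_t r1[VBYTES], const uint8_t r2[VBYTES],
                                int start, uint8_t out[VBYTES])
      {
          uint8_t both[2 * VBYTES];
          memcpy(both, r1, VBYTES);
          memcpy(both + VBYTES, r2, VBYTES);
          memcpy(out, both + start, VBYTES);   /* replicate into the third register */
      }

      int main(void)
      {
          uint8_t r1[VBYTES] = { 0, 1, 2, 3, 4, 5, 6, 7 };
          uint8_t r2[VBYTES] = { 8, 9, 10, 11, 12, 13, 14, 15 };
          uint8_t out[VBYTES];
          align_extract(r1, r2, 3, out);       /* unaligned vector starting at byte 3 */
          for (int i = 0; i < VBYTES; i++)
              printf("%d ", out[i]);
          putchar('\n');
          return 0;
      }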
  • Patent number: 6263417
    Abstract: In order to implement vector operation at a higher rate, a processor chip, which is provided with a vector unit in addition to a scalar unit, is prepared. A vector operation mode is first determined, among first and second modes, via which the vector operation is implemented under control of the processor chip. The determination of the vector operation mode is carried out in said processor chip. Thereafter, the vector operation is implemented using the vector unit provided in the processor chip if the vector operation mode is the first mode. On the other hand, the vector operation is implemented using a vector unit, which is provided outside the processor chip, if the vector operation mode is the second mode.
    Type: Grant
    Filed: October 30, 1998
    Date of Patent: July 17, 2001
    Assignee: NEC Corporation
    Inventor: Hisao Koyanagi
  • Patent number: 6230264
    Abstract: A non-traditional computing machine having no operands and no linear addressing of code or data is disclosed. A code space having multiple dimensions contains programmed instructions each having a unique position defined with respect to the code space dimensions. A data space having multiple dimensions contains data bits each having a unique position defined with respect to the data space dimensions. A code pointer has a position and a direction within the code space. The code pointer position identifies a present instruction. A data pointer has a position and a direction within the data space. The data pointer position identifies a present data bit. The programmed instructions are selected from an instruction set that includes instructions for navigating the code pointer to select instructions and navigating the data pointer to select data bits. The computing machine operates to manipulate the data in the data space according to the programmed instructions.
    Type: Grant
    Filed: August 30, 1999
    Date of Patent: May 8, 2001
    Assignee: Microsoft Corporation
    Inventors: Kenieth Robert Peery, Timothy David Corrie, Jr., Sanjay D. Jejurikar
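    Editor's sketch (hypothetical C; the struct layout and field names are invented): a minimal model of code and data pointers that each carry a position and a direction within a two-dimensional space, as the abstract describes.
      #include <stdio.h>

      /* Position and direction of a pointer within a two-dimensional space. */
      struct pointer {
          int x, y;      /* position within the space   */
          int dx, dy;    /* current direction of travel */
      };

      static void step(struct pointer *p) { p->x += p->dx; p->y += p->dy; }

      int main(void)
      {
          struct pointer code = { 0, 0, 1, 0 };   /* code pointer moving east  */
          struct pointer data = { 5, 5, 0, -1 };  /* data pointer moving north */
          for (int i = 0; i < 3; i++) { step(&code); step(&data); }
          printf("code pointer at (%d,%d), data pointer at (%d,%d)\n",
                 code.x, code.y, data.x, data.y);
          return 0;
      }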
  • Patent number: 6219775
    Abstract: A massively-parallel computer includes a plurality of processing nodes and at least one control node interconnected by a network. The network facilitates the transfer of data among the processing nodes and of commands from the control node to the processing nodes. Each processing node includes an interface for transmitting data over, and receiving data and commands from, the network, at least one memory module for storing data, a node processor and an auxiliary processor. The node processor receives commands received by the interface and processes data in response thereto, in the process generating memory access requests for facilitating the retrieval of data from or storage of data in the memory module. The node processor further controls the transfer of data over the network by the interface. The auxiliary processor is connected to the memory module and the node processor.
    Type: Grant
    Filed: March 18, 1998
    Date of Patent: April 17, 2001
    Assignee: Thinking Machines Corporation
    Inventors: Jon P. Wade, Daniel R. Cassiday, Robert D. Lordi, Guy Lewis Steele, Jr., Margaret A. St. Pierre, Monica C. Wong-Chan, Zahi S. Abuhamdeh, David C. Douglas, Mahesh N. Ganmukhi, Jeffrey V. Hill, W. Daniel Hillis, Scott J. Smith, Shaw-Wen Yang, Robert C. Zak, Jr.
  • Patent number: 6212622
    Abstract: A processor employs ordering dependencies for load instruction operations upon store address instruction operations. The processor divides store operations into store address instruction operations and store data instruction operations. The store address instruction operations generate the address of the store, and the store data instruction operations route the corresponding data to the load/store unit. The processor maintains a store address dependency vector indicating each of the outstanding store addresses and records ordering dependencies upon the store address instruction operations for each load instruction operation. Accordingly, the load instruction operation is not scheduled until each prior store address instruction operation has been scheduled. Store addresses are available for dependency checking against the load address upon execution of the load instruction operation. If a memory dependency exists, it may be detected upon execution of the load instruction operation.
    Type: Grant
    Filed: August 24, 1998
    Date of Patent: April 3, 2001
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David B. Witt
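    Editor's sketch (hypothetical C; the bitmask representation and function names are invented): a dependency vector of outstanding store address operations that a load records at issue and must see cleared before it can be scheduled.
      #include <stdint.h>
      #include <stdio.h>

      /* Bit i set => outstanding store address operation i has not yet been
         scheduled; a load records this vector and waits for it to clear. */
      static uint32_t outstanding_store_addrs;

      static void store_addr_issued(int i)    { outstanding_store_addrs |= 1u << i; }
      static void store_addr_scheduled(int i) { outstanding_store_addrs &= ~(1u << i); }

      /* A load may be scheduled only when every prior store address it
         depends on has been scheduled. */
      static int load_can_schedule(uint32_t dependency_vector)
      {
          return (outstanding_store_addrs & dependency_vector) == 0;
      }

      int main(void)
      {
          store_addr_issued(0);
          store_addr_issued(1);
          uint32_t dep = outstanding_store_addrs;   /* load records its dependencies */
          printf("schedulable? %d\n", load_can_schedule(dep));
          store_addr_scheduled(0);
          store_addr_scheduled(1);
          printf("schedulable? %d\n", load_can_schedule(dep));
          return 0;
      }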
  • Patent number: 6195747
    Abstract: A system and method for reducing data traffic between the processor and the system controller in a data processing system during the execution of a vector or matrix instruction. When the processor receives an operation command requiring that a large quantity of data be processed, the processor issues to the system controller a local operation request containing the desired operation, addressing information of the operands, and a destination location for the result. The system controller includes a local operation unit for locally executing the local operation request issued from the processor. Because the operand data associated with the operation need not be transferred over the system bus connected between the processor and the system controller, the data traffic between the processor and the system controller is reduced.
    Type: Grant
    Filed: September 28, 1998
    Date of Patent: February 27, 2001
    Assignee: Mentor Arc Inc.
    Inventor: Chien-Tzu Hou
  • Patent number: 6170001
    Abstract: A data processing apparatus and method is provided, wherein in a first mode of operation, data of a first data type is processed, and in a second mode of operation, data of a second data type consisting of an even multiple of data words is processed. The data processing apparatus comprises a register bank having a plurality of data slots for storing data words of data of said first data type and data words of data of said second data type, and transfer logic, responsive to a store instruction, to control the storing of the data words from the register bank to a memory. Further, a format register is provided for storing format data indicating the distribution in the register bank of data words of data of said first data type and data words of data of said second data type.
    Type: Grant
    Filed: May 27, 1998
    Date of Patent: January 2, 2001
    Assignee: Arm Limited
    Inventors: Christopher N. Hinds, David J. Seal
  • Patent number: 6141700
    Abstract: A correct vector address is obtained even if an interrupt occurs during erasing or programming of the data in a built-in ROM 18, by moving a part of a built-in RAM 13 to the vector address area via a bus controller 27. The microcomputer is thereby prevented from running away, and the safety of the system is improved at the time of on-board programming of the built-in ROM 18.
    Type: Grant
    Filed: March 5, 1999
    Date of Patent: October 31, 2000
    Assignee: Hitachi, Ltd.
    Inventor: Katsumi Iwata
  • Patent number: 6115805
    Abstract: A non-aligned double word fetch buffer is integrated into a digital signal processor to handle non-aligned double word (32-bit) fetches. When a misaligned double word fetch is detected, the buffer causes a two cycle non-interruptable instruction to be initiated. The first cycle is a 16-bit misaligned data fetch. The address pointer is incremented by 2 and stored in a temporary pointer register. The second cycle is a 32-bit double word fetch based on the temporary pointer with its least significant bit set to 0 (an aligned fetch). The low word from this fetch is used to satisfy the current misaligned double word fetch and the high word is stored in a temporary buffer register in case it proves useful in subsequent misaligned double word fetch instructions. Finally, the temporary address pointer is incremented by 2 for possible use in subsequent misaligned fetches.
    Type: Grant
    Filed: August 7, 1998
    Date of Patent: September 5, 2000
    Assignee: Lucent Technologies Inc.
    Inventors: Douglas J. Rhodes, Mark Ernest Thierbach, Larry R. Tate
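    Editor's sketch (hypothetical C; the memory model, fetch_dword, hold and temp_ptr are invented, and byte addresses are assumed even): a software rendering of the two-cycle misaligned fetch, where the second, aligned 32-bit fetch supplies the low word of the result and its high word is saved for a possible later misaligned fetch.
      #include <stdint.h>
      #include <stdio.h>

      static uint16_t mem[8] = { 0x1111, 0x2222, 0x3333, 0x4444,
                                 0x5555, 0x6666, 0x7777, 0x8888 };

      static uint16_t hold;       /* temporary buffer register (saved high word) */
      static unsigned temp_ptr;   /* temporary address pointer, in bytes         */

      /* Fetch a 32-bit double word from word-aligned byte address `addr`; if it
         is not 32-bit aligned, use two cycles: a 16-bit fetch, then an aligned
         32-bit fetch whose high word is kept in `hold`. */
      static uint32_t fetch_dword(unsigned addr)
      {
          if ((addr & 2u) == 0)                   /* already 32-bit aligned */
              return (uint32_t)mem[addr / 2] | ((uint32_t)mem[addr / 2 + 1] << 16);

          uint16_t low = mem[addr / 2];           /* cycle 1: misaligned 16-bit fetch */
          temp_ptr = (addr + 2) & ~1u;            /* pointer bumped, LSB forced to 0  */
          uint16_t mid = mem[temp_ptr / 2];       /* cycle 2: aligned 32-bit fetch    */
          hold = mem[temp_ptr / 2 + 1];           /* high word kept for later use     */
          temp_ptr += 2;                          /* ready for the next misaligned fetch */
          return (uint32_t)low | ((uint32_t)mid << 16);
      }

      int main(void)
      {
          printf("aligned    @0: 0x%08x\n", (unsigned)fetch_dword(0));
          printf("misaligned @2: 0x%08x\n", (unsigned)fetch_dword(2));
          return 0;
      }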
  • Patent number: 6098162
    Abstract: Shifting the elements of a vector register by varying amounts in a single operation is achieved in a vector supercomputer processor. A first vector register contains a set of operands, and a second vector register contains a set of shift counts, one shift count for each operand. Operands and shift counts are successively transferred to a vector shift functional unit, which shifts each operand by an amount equal to the value of its shift count. The shifted operands are stored in a third vector register. The vector shift functional unit also achieves word shifting of a predetermined number of vector register elements to different word locations of another vector register.
    Type: Grant
    Filed: August 24, 1998
    Date of Patent: August 1, 2000
    Assignee: Cray Research, Inc.
    Inventors: Alan J. Schiffleger, Ram K. Gupta, Christopher C. Hsiung
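    Editor's sketch (hypothetical C; VLEN, vec_shift and the left-shift choice are invented): one operand register and one shift-count register feeding a shift per element, with results landing in a third register.
      #include <stdint.h>
      #include <stdio.h>

      #define VLEN 4

      /* Shift each operand in v1 left by the corresponding count in v2 and
         store the results in v3, one element pair per functional-unit pass. */
      static void vec_shift(const uint64_t v1[VLEN], const uint64_t v2[VLEN],
                            uint64_t v3[VLEN])
      {
          for (int i = 0; i < VLEN; i++)
              v3[i] = v1[i] << (v2[i] & 63);   /* per-element shift count */
      }

      int main(void)
      {
          uint64_t v1[VLEN] = { 1, 1, 3, 0xff };
          uint64_t v2[VLEN] = { 0, 4, 8, 56 };
          uint64_t v3[VLEN];
          vec_shift(v1, v2, v3);
          for (int i = 0; i < VLEN; i++)
              printf("v3[%d] = 0x%llx\n", i, (unsigned long long)v3[i]);
          return 0;
      }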
  • Patent number: 6094637
    Abstract: A decoding process for an MPEG1 audio subband uses the symmetry of the filter coefficients to reduce the number of multiplications required to decode the subband. The decoding process can be efficiently implemented on a single-instruction-multiple-data (SIMD) processor having vector registers capable of holding multiple samples from the subband. In a particular embodiment, some of the samples are stored in a first vector register in a normal order and other samples are stored in a second vector register in a reverse order. For example, for eight-element vector registers, the first vector register contains samples with index values 0 to 7, and the second vector register contains samples with index values 31 to 24. Such ordering facilitates SIMD instructions which perform parallel operations combining the value at index i with the value at index 31-i.
    Type: Grant
    Filed: December 2, 1997
    Date of Patent: July 25, 2000
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Kicheon Hong
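    Editor's sketch (hypothetical C; the sample values, N and the plain addition are invented, and the real decode combines pairs with filter coefficients rather than simply adding them): how loading one register in normal order and one in reverse order lines up index i against index 31-i in the same lane.
      #include <stdio.h>

      #define N 8   /* elements per vector register in the example */

      int main(void)
      {
          float samples[32];
          for (int i = 0; i < 32; i++)
              samples[i] = (float)i;

          /* First register: samples 0..7 in normal order.
             Second register: samples 31..24 in reverse order, so that lane i
             pairs the value at index i with the value at index 31 - i. */
          float reg_fwd[N], reg_rev[N], sum[N];
          for (int i = 0; i < N; i++) {
              reg_fwd[i] = samples[i];
              reg_rev[i] = samples[31 - i];
              sum[i] = reg_fwd[i] + reg_rev[i];   /* one parallel SIMD combine */
          }
          for (int i = 0; i < N; i++)
              printf("lane %d: %g + %g = %g\n", i, reg_fwd[i], reg_rev[i], sum[i]);
          return 0;
      }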
  • Patent number: 6085305
    Abstract: A processor including at least one execution unit generating out-of-order results and out-of-order condition codes. Precise architectural state of the processor is maintained by providing a results buffer having a number of slots and providing a condition code buffer having the same number of slots as the results buffer, each slot in the condition code buffer in one-to-one correspondence with a slot in the results buffer. Each live instruction in the processor is assigned a slot in the results buffer and the condition code buffer. Each speculative result produced by the execution units is stored in the assigned slot in the results buffer. When an instruction is retired, the results for that instruction are transferred to an architectural result register and any condition codes generated by that instruction are transferred to an architectural condition code register.
    Type: Grant
    Filed: June 25, 1997
    Date of Patent: July 4, 2000
    Assignee: Sun Microsystems, Inc.
    Inventors: Ramesh Panwar, Arjun Prabhu
  • Patent number: 6026477
    Abstract: An improved branch recovery mechanism includes an instruction fetch unit, an instruction decode stage, a branch prediction unit coupled to the decode stage for predicting whether the branch instruction will be taken, and an instruction pool for receiving and storing micro-ops. After a mispredicted branch is detected, micro-ops corresponding to a correct path are loaded into the instruction pool without waiting for the mispredicted branch instruction to be retired. By immediately loading the correct path into the instruction pool, Front End stall time can be reduced. Micro-ops in the instruction pool are distinguished based on path information for each micro-op stored in the instruction pool. The micro-ops corresponding to the mispredicted path are deleted as quickly as possible without committing their execution results to architectural state.
    Type: Grant
    Filed: December 31, 1997
    Date of Patent: February 15, 2000
    Assignee: Intel Corporation
    Inventors: Alan B. Kyker, Darrell D. Boggs
  • Patent number: 6012135
    Abstract: Method and apparatus for a logical address translator which translates a logical address into a physical address in a computer. The computer includes a plurality of address ports. Each address port includes a logical address translator, which includes a plurality of segment-register sets. Each segment-register set holds values which specify address boundaries and translation mapping of a corresponding logical segment. A segment detector is coupled to the plurality of segment-register sets, wherein the segment detector operates to determine whether the logical address is within the specified address boundaries of the logical segment. An address mapper is coupled to the plurality of segment-register sets, wherein the address mapper operates to translate the logical address into a physical address.
    Type: Grant
    Filed: December 1, 1994
    Date of Patent: January 4, 2000
    Assignee: Cray Research, Inc.
    Inventors: George W. Leedom, William T. Moore
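    Editor's sketch (hypothetical C; struct segment, translate and the example boundaries are invented): the segment detector plus address mapper, checking whether a logical address falls within a segment-register set's boundaries and applying that set's translation mapping.
      #include <stdint.h>
      #include <stdio.h>

      struct segment {
          uint64_t lower, upper;   /* logical address boundaries of the segment     */
          int64_t  offset;         /* translation mapping: physical = logical + offset */
      };

      /* Return nonzero and fill *phys if `logical` falls inside one of the
         segment-register sets. */
      static int translate(const struct segment *segs, int n,
                           uint64_t logical, uint64_t *phys)
      {
          for (int i = 0; i < n; i++)
              if (logical >= segs[i].lower && logical < segs[i].upper) {
                  *phys = logical + (uint64_t)segs[i].offset;
                  return 1;
              }
          return 0;   /* address not covered by any segment */
      }

      int main(void)
      {
          struct segment segs[2] = {
              { 0x1000, 0x2000,  0x9000 },   /* maps 0x1000-0x1fff to 0xa000-0xafff */
              { 0x8000, 0x9000, -0x4000 },   /* maps 0x8000-0x8fff to 0x4000-0x4fff */
          };
          uint64_t p;
          if (translate(segs, 2, 0x1234, &p))
              printf("0x1234 -> 0x%llx\n", (unsigned long long)p);
          return 0;
      }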
  • Patent number: 6006315
    Abstract: A method is provided for writing a scalar value to a vector V1 without reading the vector from a storage device. A scalar value to be written into the vector at a specified position and a scalar value (index) representing such position are read from a storage device into an Arithmetic Logic Unit (ALU) of a vector processor. The ALU then generates another vector V2 having multiple copies of the scalar value to be written into V1. ALU also generates a mask representing the index. The vector V2 is then delivered to the storage storing V1, but the mask is applied so that only one or more, but not all, copies of the scalar value are written from V2 to the storage. The rest of the vector V1 remains unchanged. The invention reduces register file read contention. Furthermore, if the updated V1 (i.e. V1 having the scalar value) is to be used in the next instruction, a copy of V1 is read from the storage and is updated from V2 and the mask, simultaneously with V1 being updated in the storage.
    Type: Grant
    Filed: October 18, 1996
    Date of Patent: December 21, 1999
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Heonchul Park
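    Editor's sketch (hypothetical C; VLEN, masked_scalar_write and the mask representation are invented): building a broadcast vector V2 and an index-derived mask so that only the selected copy of the scalar is written into the stored vector, without reading V1 first.
      #include <stdio.h>

      #define VLEN 8

      /* Write `scalar` into position `index` of the stored vector v1 without
         first reading v1: broadcast the scalar into v2, build a mask from the
         index, and let the storage apply only the masked element. */
      static void masked_scalar_write(int v1[VLEN], int scalar, int index)
      {
          int v2[VLEN], mask[VLEN];
          for (int i = 0; i < VLEN; i++) {
              v2[i] = scalar;            /* V2: copies of the scalar value */
              mask[i] = (i == index);    /* mask derived from the index    */
          }
          for (int i = 0; i < VLEN; i++)
              if (mask[i])
                  v1[i] = v2[i];         /* only the selected copy is written */
      }

      int main(void)
      {
          int v1[VLEN] = { 0, 1, 2, 3, 4, 5, 6, 7 };
          masked_scalar_write(v1, 99, 5);
          for (int i = 0; i < VLEN; i++)
              printf("%d ", v1[i]);
          printf("\n");
          return 0;
      }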
  • Patent number: 5978843
    Abstract: A scalable server architecture for use in implementing scaled media servers capable of simultaneous real-time data stream retrieval for large numbers of subscribers. A scalable server includes a plurality of stream pumping engines each accessing a particular storage device of a storage subsystem, and a server processor which receives retrieval requests from subscribers and directs the stream pumping engines to retrieve the requested data streams. Each of the stream pumping engines may include a storage controller coupled to its corresponding storage device for directing retrieval of the requested stream therefrom, a network controller for supplying the retrieved stream to a client network, and a processor for directing the operation of the storage and network controllers. Each of the stream pumping engines may also include a shared memory accessible by the corresponding stream pumping engine processor and the server processor.
    Type: Grant
    Filed: October 23, 1996
    Date of Patent: November 2, 1999
    Assignee: Industrial Technology Research Institute
    Inventors: Chiung-Shien Wu, Gin-Kou Ma, Muh-Rong Yang
  • Patent number: 5961628
    Abstract: An apparatus coupled to a requesting unit and a memory. The apparatus includes a data path and a request control circuit. The data path is coupled to the requesting unit and the memory. The data path is for buffering a vector. The vector includes multiple data elements of a substantially similar data type. The request control circuit is coupled to the data path and the requesting unit. The request control circuit is for receiving a vector memory request from the requesting unit. The request control circuit services the vector memory request by causing the transfer of the vector between the requesting unit and the memory via the data path.
    Type: Grant
    Filed: January 28, 1997
    Date of Patent: October 5, 1999
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Le Trong Nguyen, Heonchul Park, Seong Rai Cho