Patents Examined by Eddie Chan
  • Patent number: 7487338
    Abstract: A MOD_SAT instruction indicating that a 16 bit saturation is to be carried out with respect to the operation of one of instructions executed in parallel is placed in the left container and an ADD instruction is placed in the right container. When the instruction decode unit decodes these instructions, the instruction decode unit indicates that the instruction execution unit executes the ADD instruction accompanying a saturation process. Accordingly, the operation of a great number of instructions can be modified by combining instructions and, therefore, the basic instruction length can be made short and it becomes possible to increase the code efficiency.
    Type: Grant
    Filed: May 23, 2003
    Date of Patent: February 3, 2009
    Assignee: Renesas Technology Corp.
    Inventor: Masahito Matsuo
  • Patent number: 7444498
    Abstract: The present invention allows a microprocessor to identify and speculatively execute future load instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The data for such future load instructions can be prefetched from a distant cache or main memory such that when the load instruction is re-executed (non speculative executed) after the stall condition expires, its data will reside either in the L1 cache, or will be enroute to the processor, resulting in a reduced execution latency. When an extended stall condition is detected, load lookahead prefetch is started allowing speculative execution of instructions that would normally have been stalled.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: October 28, 2008
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Patent number: 7421567
    Abstract: The present invention allows a microprocessor to identify and speculatively execute future instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The execution of such future instructions can initiate a prefetch of data or instructions from a distant cache or main memory, or otherwise make forward progress through the instruction stream. In this manner, when the instructions are re-executed (non speculatively executed) after the stall condition expires, they will execute with a reduced execution latency; e.g. by accessing data prefetched into the L1 cache, or enroute to the processor, or by executing the target instructions following a speculatively resolved mispredicted branch.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: September 2, 2008
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Patent number: 7412589
    Abstract: A computer implemented method, apparatus, and computer usable program code for ensuring forward progress of instructions in a pipeline of a processor. Instructions are received in the pipeline. Instruction flushes are counted in the pipeline to determine a flush count. A single step mode in the pipeline is entered in response to the flush count exceeding a threshold. The single step mode instructions are issued in serial such that an instruction is not issued for execution until a prior instruction has completed execution.
    Type: Grant
    Filed: March 31, 2006
    Date of Patent: August 12, 2008
    Assignee: International Business Machines Corporation
    Inventor: Kurt Alan Feiste
  • Patent number: 7389404
    Abstract: A matrix data processor is implemented wherein data elements are stored in physical registers and mapped to logical registers. After being stored in the logical registers, the data elements are then treated as matrix elements. By using a series of variable matrix parameters to define the size and location of the various matrix source and destination elements, as well as the operation(s) to be performed on the matrices, the performance of digital signal processing operations can be significantly enhanced.
    Type: Grant
    Filed: September 12, 2005
    Date of Patent: June 17, 2008
    Assignee: G4 Matrix Technologies, LLC
    Inventors: Gopalan Nair, Archana Sekhar, Prasanth David, Antony Jose
  • Patent number: 7386703
    Abstract: A processor and method for processing matrix data. The processor includes M independent vector register files which are adapted to collectively store a matrix of L data elements. Each data element has B binary bits. The matrix has N rows and M columns, and L=N*M. Each column has K subcolumns. N?2, M?2, K?1, and B?1. Each row and each subcolumn is addressable. The processor does not duplicatively store the L data elements. The matrix includes a set of arrays such that each array is a row or subcolumn of the matrix. The processor may execute an instruction that performs an operation on a first array of the set of arrays, such that the operation is performed with selectivity with respect to the data elements of the first array.
    Type: Grant
    Filed: November 18, 2003
    Date of Patent: June 10, 2008
    Assignee: International Business Machines Corporation
    Inventors: Peter A. Sandon, R. Michael P. West
  • Patent number: 7383422
    Abstract: A Very Long Instruction Word (VLIW) processor having an instruction set with a reduced size resulting in a small number of bits being necessary to specify registers. The VLIW processor includes a register file, and first through third operation units, and executes a very long instruction word. Further, the very long instruction word includes a register specifying field which specifies a least one of the registers in the register file and a plurality of instructions. The operand of each instruction includes bits src1, src2, and dst, which indicate whether or not the registers specified by the register specifying field are to be used as the source register and the destination register.
    Type: Grant
    Filed: September 27, 2004
    Date of Patent: June 3, 2008
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Takahiro Kageyama, Hideshi Nishida, Takeshi Tanaka, Kouji Nakajima
  • Patent number: 7383427
    Abstract: A method is provided for executing a plurality of parallel executable sequences of instructions on a processor having a plurality of execution units operated by a single instruction unit. The method includes a) detecting a plurality of sequences of instructions adapted for parallel execution from instructions being provided to the processor, wherein each sequence is adapted for execution by a subset of the plurality of execution units and b) storing information representing a stall status of the execution units. Then, a step c) is performed, wherein, for each unexecuted sequence of the plurality of sequences: i) all of the plurality of execution units other than the subset which corresponds to the unexecuted sequence are stalled, and ii) the sequence of instructions is executed by the corresponding subset. Thereafter, it is determined in a step d) whether a current stall status of the plurality of execution units matches the stall status represented by the stored information.
    Type: Grant
    Filed: April 20, 2005
    Date of Patent: June 3, 2008
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Takeshi Yamazaki
  • Patent number: 7380109
    Abstract: An apparatus and method are provided for extending a microprocessor instruction set to allow for extended size addresses. The apparatus includes translation logic and extended execution logic. The translation logic translates an extended instruction into an associated micro instruction sequence for execution by the microprocessor, where the extended instruction has an extended prefix and an extended prefix tag. Extended prefix specifies an extended address mode for an address calculation corresponding to an operation, where the extended address mode not otherwise provided for by instructions in an existing instruction set. The extended prefix tag indicates the extended prefix, where the extended prefix tag is an otherwise architecturally specified opcode within the existing instruction set. The extended execution logic is coupled to the translation logic.
    Type: Grant
    Filed: August 22, 2002
    Date of Patent: May 27, 2008
    Assignee: IP-First, LLC
    Inventors: G. Glenn Henry, Rodney E. Hooker, Terry Parks
  • Patent number: 7380106
    Abstract: A method and a system for transferring data between a register in a processor and a point-to-point communications link. More specifically, blocking and non-blocking methods are described to get and put data between a general purpose register of a soft or hard core processor and a queue connected to a point-to-point communications channel. One implementation example is for a Fast Simplex Link multi-processor network.
    Type: Grant
    Filed: February 28, 2003
    Date of Patent: May 27, 2008
    Assignee: Xilinx, Inc.
    Inventor: Goran Bilski
  • Patent number: 7376816
    Abstract: A method is disclosed for executing a load instruction. Address information of the load instruction is used to generate an address of needed data, and the address is used to search a cache memory for the needed data. If the needed data is found in the cache memory, a cache hit signal is generated. At least a portion of the address is used to search a queue for a previous load instruction specifying the same address. If a previous load instruction specifying the same address is found, the cache hit signal is ignored and the load instruction is stored in the queue. A load/store unit, and a processor implementing the method, are also described.
    Type: Grant
    Filed: November 12, 2004
    Date of Patent: May 20, 2008
    Assignee: International Business Machines Corporation
    Inventors: Brian David Barrick, Kimberly Marie Fernsler, Dwain A. Hicks, Takeki Osanai, David Scott Ray
  • Patent number: 7376819
    Abstract: An apparatus and method for selecting whether a central processing unit (CPU) performs instruction reading in units of 16 bits (a first word length) or in units of 32 bits (a second word length). Depending on whether instruction reading is performed in units of 16 bits or 32 bits, increment values (+2 and +4) by which a program counter (PC) is incremented are switched. Data reading or writing is performed in units of a given data length irrespective of the selecting unit. When the CPU issues a request for instruction reading in units of 16 bits or 32 bits or for data reading or writing, a bus control unit performs reading or writing a predetermined number of times according to a bus width designated for a resource located at an address specified in the request.
    Type: Grant
    Filed: June 11, 2003
    Date of Patent: May 20, 2008
    Assignee: Renesas Technology Corp.
    Inventors: Naoki Mitsuishi, Shinichi Shibahara, Takahiro Okubo
  • Patent number: 7376815
    Abstract: Techniques for ensuring a synchronized predecoding of an instruction string are disclosed. The instruction string contains instructions from a variable length instruction set and embedded data. One technique includes defining a granule to be equal to the smallest length instruction in the instruction set and defining the number of granules that compose the longest length instruction in the instruction set to be MAX. The technique further includes determining the end of an embedded data segment, when a program is compiled or assembled into the instruction string and inserting a padding of length, MAX?1, into the instruction string to the end of the embedded data. Upon predecoding of the padded instruction string, a predecoder maintains synchronization with the instructions in the padded instruction string even if embedded data is coincidentally encoded to resemble an existing instruction in the variable length instruction set.
    Type: Grant
    Filed: February 25, 2005
    Date of Patent: May 20, 2008
    Assignee: QUALCOMM Incorporated
    Inventors: Rodney Wayne Smith, James Norris Dieffenderfer, Jeffrey Todd Bridges, Thomas Andrew Sartorius
  • Patent number: 7376817
    Abstract: In one embodiment, a processor comprises a prediction circuit and another circuit coupled to the prediction circuit. The prediction circuit is configured to predict whether or not a first load instruction will experience a partial store to load forward (PSTLF) event during execution. A PSTLF event occurs if a plurality of bytes, accessed responsive to the first load instruction during execution, include at least a first byte updated responsive to a previous uncommitted store operation and also include at least a second byte not updated responsive to the previous uncommitted store operation. Coupled to receive the first load instruction, the circuit is configured to generate one or more load operations responsive to the first load instruction. The load operations are to be executed in the processor to execute the first load instruction, and a number of the load operations is dependent on the prediction by the prediction circuit.
    Type: Grant
    Filed: August 10, 2005
    Date of Patent: May 20, 2008
    Assignee: P.A. Semi, Inc.
    Inventors: Sudarshan Kadambi, Po-Yung Chang, Eric Hao
  • Patent number: 7373482
    Abstract: One embodiment of the present invention provides a system that improves the effectiveness of prefetching during execution of instructions in scout mode. During operation, the system executes program instructions in a normal-execution mode. Upon encountering a condition which causes the processor to enter scout mode, the system performs a checkpoint and commences execution of instructions in scout mode, wherein the instructions are speculatively executed to prefetch future memory operations, but wherein results are not committed to the architectural state of a processor. During execution of a load instruction during scout mode, if the load instruction is a special load instruction and if the load instruction causes a lower-level cache miss, the system waits for data to be returned from a higher-level cache before resuming execution of subsequent instructions in scout mode, instead of disregarding the result of the load instruction and immediately resuming execution in scout mode.
    Type: Grant
    Filed: May 26, 2005
    Date of Patent: May 13, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Lawrence A. Spracklen, Yuan C. Chou, Santosh G. Abraham
  • Patent number: 7373486
    Abstract: In one embodiment, a renamer comprises a plurality of storage locations and compare circuitry. Each storage location is assigned to a respective renameable resource and is configured to store an identifier corresponding to a youngest instruction operation that writes the respective renameable resource. Coupled to receive an input representing one or more retiring instruction identifiers corresponding to instruction operations that are being retired, the compare circuitry is configured to detect a match between at least a first identifier in a first storage location and one of the retiring identifiers. An encoded form of the identifiers is logically divided into a plurality of fields, and the input comprises a first plurality of bit vectors. Each of the first plurality of bit vectors corresponds to a respective field and includes a bit position for each possible value of the respective field.
    Type: Grant
    Filed: August 29, 2005
    Date of Patent: May 13, 2008
    Assignee: P.A. Semi, Inc.
    Inventors: Wei-Han Lien, John K Yong, Shyam Sundar, Rajat Goel
  • Patent number: 7373489
    Abstract: An apparatus and method for floating point exception prediction and recovery. In one embodiment, a processor may include instruction fetch logic configured to issue a first instruction from one of a plurality of threads and to successively issue a second instruction from another one of the plurality of threads. The processor may also include floating-point arithmetic logic configured to execute a floating-point instruction issued by the instruction fetch logic from a given one of the plurality of threads, and further configured to determine whether the floating-point instruction generates an exception, and may further include exception prediction logic configured to predict whether the floating-point instruction will generate the exception, where the prediction occurs before the floating-point arithmetic logic determines whether the floating-point instruction generates the exception.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: May 13, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Jeffrey S. Brooks, Paul J. Jordan, Rabin A. Sugumar
  • Patent number: 7373485
    Abstract: A clustered superscalar processor for reducing the miss rate of a register cache and reducing the possibility of miss penalties. The processor checks before storing an instruction in an instruction window whether there is a data dependency relationship between the instruction that will be stored in the instruction window and a previous instruction stored in the instruction window. When there is a data dependency relationship, the execution result of the previous instruction of one cluster is communicated to a register cache of another cluster that executes the instruction having a data dependency relationship with the previous instruction.
    Type: Grant
    Filed: March 3, 2005
    Date of Patent: May 13, 2008
    Assignee: National University Corporation Nagoya University
    Inventors: Hideki Ando, Hajime Shimada, Atsushi Mochizuki
  • Patent number: 7370181
    Abstract: Embodiments of apparatuses, systems, and methods for single stepping a virtual machine guest using a reorder buffer are disclosed. In one embodiment, an apparatus includes a sequencer and a reorder buffer. The sequencer is to issue micro-operations. The reorder buffer is to signal the sequencer to signal the sequencer to issue micro-operations corresponding to a monitor trap flag event.
    Type: Grant
    Filed: June 22, 2004
    Date of Patent: May 6, 2008
    Assignee: Intel Corporation
    Inventors: Sanjoy K. Mondal, Venkateswara Rao Madduri
  • Patent number: 7370180
    Abstract: A method of controlling data processing logic which causes a data value to be rotated by a number of bits in order to generate a rotated data value; a number of least significant bits of the rotated data value are masked with other bits of said rotated data value not being masked in order to generate a masked rotated data value; a selected bit of said rotated data value are masked with other bits of said rotated data value not being masked in order to generate a bit preset rotated data value; and said sign-extended bit field extracted data value to be generated by subtracting said masked rotated data value from said bit preset data value or said zero-extended bit field extracted data value to be generated by performing a logical exclusive-OR operation with the masked rotated data value and said bit preset data value.
    Type: Grant
    Filed: March 8, 2004
    Date of Patent: May 6, 2008
    Assignee: ARM Limited
    Inventors: Alexander Edward Nancekievill, David James Seal