Patents Examined by Eddie Chan
-
Patent number: 7487338Abstract: A MOD_SAT instruction indicating that a 16 bit saturation is to be carried out with respect to the operation of one of instructions executed in parallel is placed in the left container and an ADD instruction is placed in the right container. When the instruction decode unit decodes these instructions, the instruction decode unit indicates that the instruction execution unit executes the ADD instruction accompanying a saturation process. Accordingly, the operation of a great number of instructions can be modified by combining instructions and, therefore, the basic instruction length can be made short and it becomes possible to increase the code efficiency.Type: GrantFiled: May 23, 2003Date of Patent: February 3, 2009Assignee: Renesas Technology Corp.Inventor: Masahito Matsuo
-
Patent number: 7444498Abstract: The present invention allows a microprocessor to identify and speculatively execute future load instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The data for such future load instructions can be prefetched from a distant cache or main memory such that when the load instruction is re-executed (non speculative executed) after the stall condition expires, its data will reside either in the L1 cache, or will be enroute to the processor, resulting in a reduced execution latency. When an extended stall condition is detected, load lookahead prefetch is started allowing speculative execution of instructions that would normally have been stalled.Type: GrantFiled: December 17, 2004Date of Patent: October 28, 2008Assignee: International Business Machines CorporationInventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
-
Patent number: 7421567Abstract: The present invention allows a microprocessor to identify and speculatively execute future instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The execution of such future instructions can initiate a prefetch of data or instructions from a distant cache or main memory, or otherwise make forward progress through the instruction stream. In this manner, when the instructions are re-executed (non speculatively executed) after the stall condition expires, they will execute with a reduced execution latency; e.g. by accessing data prefetched into the L1 cache, or enroute to the processor, or by executing the target instructions following a speculatively resolved mispredicted branch.Type: GrantFiled: December 17, 2004Date of Patent: September 2, 2008Assignee: International Business Machines CorporationInventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
-
Patent number: 7412589Abstract: A computer implemented method, apparatus, and computer usable program code for ensuring forward progress of instructions in a pipeline of a processor. Instructions are received in the pipeline. Instruction flushes are counted in the pipeline to determine a flush count. A single step mode in the pipeline is entered in response to the flush count exceeding a threshold. The single step mode instructions are issued in serial such that an instruction is not issued for execution until a prior instruction has completed execution.Type: GrantFiled: March 31, 2006Date of Patent: August 12, 2008Assignee: International Business Machines CorporationInventor: Kurt Alan Feiste
-
Patent number: 7389404Abstract: A matrix data processor is implemented wherein data elements are stored in physical registers and mapped to logical registers. After being stored in the logical registers, the data elements are then treated as matrix elements. By using a series of variable matrix parameters to define the size and location of the various matrix source and destination elements, as well as the operation(s) to be performed on the matrices, the performance of digital signal processing operations can be significantly enhanced.Type: GrantFiled: September 12, 2005Date of Patent: June 17, 2008Assignee: G4 Matrix Technologies, LLCInventors: Gopalan Nair, Archana Sekhar, Prasanth David, Antony Jose
-
Patent number: 7386703Abstract: A processor and method for processing matrix data. The processor includes M independent vector register files which are adapted to collectively store a matrix of L data elements. Each data element has B binary bits. The matrix has N rows and M columns, and L=N*M. Each column has K subcolumns. N?2, M?2, K?1, and B?1. Each row and each subcolumn is addressable. The processor does not duplicatively store the L data elements. The matrix includes a set of arrays such that each array is a row or subcolumn of the matrix. The processor may execute an instruction that performs an operation on a first array of the set of arrays, such that the operation is performed with selectivity with respect to the data elements of the first array.Type: GrantFiled: November 18, 2003Date of Patent: June 10, 2008Assignee: International Business Machines CorporationInventors: Peter A. Sandon, R. Michael P. West
-
Patent number: 7383422Abstract: A Very Long Instruction Word (VLIW) processor having an instruction set with a reduced size resulting in a small number of bits being necessary to specify registers. The VLIW processor includes a register file, and first through third operation units, and executes a very long instruction word. Further, the very long instruction word includes a register specifying field which specifies a least one of the registers in the register file and a plurality of instructions. The operand of each instruction includes bits src1, src2, and dst, which indicate whether or not the registers specified by the register specifying field are to be used as the source register and the destination register.Type: GrantFiled: September 27, 2004Date of Patent: June 3, 2008Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Takahiro Kageyama, Hideshi Nishida, Takeshi Tanaka, Kouji Nakajima
-
Patent number: 7383427Abstract: A method is provided for executing a plurality of parallel executable sequences of instructions on a processor having a plurality of execution units operated by a single instruction unit. The method includes a) detecting a plurality of sequences of instructions adapted for parallel execution from instructions being provided to the processor, wherein each sequence is adapted for execution by a subset of the plurality of execution units and b) storing information representing a stall status of the execution units. Then, a step c) is performed, wherein, for each unexecuted sequence of the plurality of sequences: i) all of the plurality of execution units other than the subset which corresponds to the unexecuted sequence are stalled, and ii) the sequence of instructions is executed by the corresponding subset. Thereafter, it is determined in a step d) whether a current stall status of the plurality of execution units matches the stall status represented by the stored information.Type: GrantFiled: April 20, 2005Date of Patent: June 3, 2008Assignee: Sony Computer Entertainment Inc.Inventor: Takeshi Yamazaki
-
Patent number: 7380109Abstract: An apparatus and method are provided for extending a microprocessor instruction set to allow for extended size addresses. The apparatus includes translation logic and extended execution logic. The translation logic translates an extended instruction into an associated micro instruction sequence for execution by the microprocessor, where the extended instruction has an extended prefix and an extended prefix tag. Extended prefix specifies an extended address mode for an address calculation corresponding to an operation, where the extended address mode not otherwise provided for by instructions in an existing instruction set. The extended prefix tag indicates the extended prefix, where the extended prefix tag is an otherwise architecturally specified opcode within the existing instruction set. The extended execution logic is coupled to the translation logic.Type: GrantFiled: August 22, 2002Date of Patent: May 27, 2008Assignee: IP-First, LLCInventors: G. Glenn Henry, Rodney E. Hooker, Terry Parks
-
Patent number: 7380106Abstract: A method and a system for transferring data between a register in a processor and a point-to-point communications link. More specifically, blocking and non-blocking methods are described to get and put data between a general purpose register of a soft or hard core processor and a queue connected to a point-to-point communications channel. One implementation example is for a Fast Simplex Link multi-processor network.Type: GrantFiled: February 28, 2003Date of Patent: May 27, 2008Assignee: Xilinx, Inc.Inventor: Goran Bilski
-
Patent number: 7376819Abstract: An apparatus and method for selecting whether a central processing unit (CPU) performs instruction reading in units of 16 bits (a first word length) or in units of 32 bits (a second word length). Depending on whether instruction reading is performed in units of 16 bits or 32 bits, increment values (+2 and +4) by which a program counter (PC) is incremented are switched. Data reading or writing is performed in units of a given data length irrespective of the selecting unit. When the CPU issues a request for instruction reading in units of 16 bits or 32 bits or for data reading or writing, a bus control unit performs reading or writing a predetermined number of times according to a bus width designated for a resource located at an address specified in the request.Type: GrantFiled: June 11, 2003Date of Patent: May 20, 2008Assignee: Renesas Technology Corp.Inventors: Naoki Mitsuishi, Shinichi Shibahara, Takahiro Okubo
-
Patent number: 7376815Abstract: Techniques for ensuring a synchronized predecoding of an instruction string are disclosed. The instruction string contains instructions from a variable length instruction set and embedded data. One technique includes defining a granule to be equal to the smallest length instruction in the instruction set and defining the number of granules that compose the longest length instruction in the instruction set to be MAX. The technique further includes determining the end of an embedded data segment, when a program is compiled or assembled into the instruction string and inserting a padding of length, MAX?1, into the instruction string to the end of the embedded data. Upon predecoding of the padded instruction string, a predecoder maintains synchronization with the instructions in the padded instruction string even if embedded data is coincidentally encoded to resemble an existing instruction in the variable length instruction set.Type: GrantFiled: February 25, 2005Date of Patent: May 20, 2008Assignee: QUALCOMM IncorporatedInventors: Rodney Wayne Smith, James Norris Dieffenderfer, Jeffrey Todd Bridges, Thomas Andrew Sartorius
-
Patent number: 7376817Abstract: In one embodiment, a processor comprises a prediction circuit and another circuit coupled to the prediction circuit. The prediction circuit is configured to predict whether or not a first load instruction will experience a partial store to load forward (PSTLF) event during execution. A PSTLF event occurs if a plurality of bytes, accessed responsive to the first load instruction during execution, include at least a first byte updated responsive to a previous uncommitted store operation and also include at least a second byte not updated responsive to the previous uncommitted store operation. Coupled to receive the first load instruction, the circuit is configured to generate one or more load operations responsive to the first load instruction. The load operations are to be executed in the processor to execute the first load instruction, and a number of the load operations is dependent on the prediction by the prediction circuit.Type: GrantFiled: August 10, 2005Date of Patent: May 20, 2008Assignee: P.A. Semi, Inc.Inventors: Sudarshan Kadambi, Po-Yung Chang, Eric Hao
-
Patent number: 7376816Abstract: A method is disclosed for executing a load instruction. Address information of the load instruction is used to generate an address of needed data, and the address is used to search a cache memory for the needed data. If the needed data is found in the cache memory, a cache hit signal is generated. At least a portion of the address is used to search a queue for a previous load instruction specifying the same address. If a previous load instruction specifying the same address is found, the cache hit signal is ignored and the load instruction is stored in the queue. A load/store unit, and a processor implementing the method, are also described.Type: GrantFiled: November 12, 2004Date of Patent: May 20, 2008Assignee: International Business Machines CorporationInventors: Brian David Barrick, Kimberly Marie Fernsler, Dwain A. Hicks, Takeki Osanai, David Scott Ray
-
Patent number: 7373486Abstract: In one embodiment, a renamer comprises a plurality of storage locations and compare circuitry. Each storage location is assigned to a respective renameable resource and is configured to store an identifier corresponding to a youngest instruction operation that writes the respective renameable resource. Coupled to receive an input representing one or more retiring instruction identifiers corresponding to instruction operations that are being retired, the compare circuitry is configured to detect a match between at least a first identifier in a first storage location and one of the retiring identifiers. An encoded form of the identifiers is logically divided into a plurality of fields, and the input comprises a first plurality of bit vectors. Each of the first plurality of bit vectors corresponds to a respective field and includes a bit position for each possible value of the respective field.Type: GrantFiled: August 29, 2005Date of Patent: May 13, 2008Assignee: P.A. Semi, Inc.Inventors: Wei-Han Lien, John K Yong, Shyam Sundar, Rajat Goel
-
Patent number: 7373485Abstract: A clustered superscalar processor for reducing the miss rate of a register cache and reducing the possibility of miss penalties. The processor checks before storing an instruction in an instruction window whether there is a data dependency relationship between the instruction that will be stored in the instruction window and a previous instruction stored in the instruction window. When there is a data dependency relationship, the execution result of the previous instruction of one cluster is communicated to a register cache of another cluster that executes the instruction having a data dependency relationship with the previous instruction.Type: GrantFiled: March 3, 2005Date of Patent: May 13, 2008Assignee: National University Corporation Nagoya UniversityInventors: Hideki Ando, Hajime Shimada, Atsushi Mochizuki
-
Patent number: 7373482Abstract: One embodiment of the present invention provides a system that improves the effectiveness of prefetching during execution of instructions in scout mode. During operation, the system executes program instructions in a normal-execution mode. Upon encountering a condition which causes the processor to enter scout mode, the system performs a checkpoint and commences execution of instructions in scout mode, wherein the instructions are speculatively executed to prefetch future memory operations, but wherein results are not committed to the architectural state of a processor. During execution of a load instruction during scout mode, if the load instruction is a special load instruction and if the load instruction causes a lower-level cache miss, the system waits for data to be returned from a higher-level cache before resuming execution of subsequent instructions in scout mode, instead of disregarding the result of the load instruction and immediately resuming execution in scout mode.Type: GrantFiled: May 26, 2005Date of Patent: May 13, 2008Assignee: Sun Microsystems, Inc.Inventors: Lawrence A. Spracklen, Yuan C. Chou, Santosh G. Abraham
-
Patent number: 7373489Abstract: An apparatus and method for floating point exception prediction and recovery. In one embodiment, a processor may include instruction fetch logic configured to issue a first instruction from one of a plurality of threads and to successively issue a second instruction from another one of the plurality of threads. The processor may also include floating-point arithmetic logic configured to execute a floating-point instruction issued by the instruction fetch logic from a given one of the plurality of threads, and further configured to determine whether the floating-point instruction generates an exception, and may further include exception prediction logic configured to predict whether the floating-point instruction will generate the exception, where the prediction occurs before the floating-point arithmetic logic determines whether the floating-point instruction generates the exception.Type: GrantFiled: June 30, 2004Date of Patent: May 13, 2008Assignee: Sun Microsystems, Inc.Inventors: Jeffrey S. Brooks, Paul J. Jordan, Rabin A. Sugumar
-
Patent number: 7370180Abstract: A method of controlling data processing logic which causes a data value to be rotated by a number of bits in order to generate a rotated data value; a number of least significant bits of the rotated data value are masked with other bits of said rotated data value not being masked in order to generate a masked rotated data value; a selected bit of said rotated data value are masked with other bits of said rotated data value not being masked in order to generate a bit preset rotated data value; and said sign-extended bit field extracted data value to be generated by subtracting said masked rotated data value from said bit preset data value or said zero-extended bit field extracted data value to be generated by performing a logical exclusive-OR operation with the masked rotated data value and said bit preset data value.Type: GrantFiled: March 8, 2004Date of Patent: May 6, 2008Assignee: ARM LimitedInventors: Alexander Edward Nancekievill, David James Seal
-
Patent number: 7370176Abstract: A system and method for a high frequency stall design is presented. An issue unit includes a first instruction stage, a second instruction stage, and issue control logic. During a first instruction cycle, the issue unit performs two tasks, which are 1) the instructions located in the first instruction stage are moved to a second instruction stage, and 2) the issue control logic determines whether to issue or stall the instructions that are moved to the second instruction stage based upon their particular instruction attributes and the issue control unit's previous state. During a second instruction cycle that immediately follows the first instruction cycle, the second instruction stage's instructions are either issued or stalled based upon the issue control logic's decision from the first instruction cycle.Type: GrantFiled: August 16, 2005Date of Patent: May 6, 2008Assignee: International Business Machines CorporationInventors: Jonathan James DeMent, Kurt Alan Feiste, Robert Alan Philhower, David Shippy