Patents Examined by Jesse R Moll
  • Patent number: 7904697
    Abstract: An apparatus and method for executing a Load Register instruction in which the source data of the Load Register instruction is retained in its original physical register while the architected target register is mapped to this same physical target register. In this state the two architected registers alias to one physical register. When the source register of the Load Address instruction is specified as the target address of a subsequent instruction, a free physical register is assigned to the Load Registers source register. And with this assignment the alias is thus broken. Similarly when the target register of the Load Address instruction is the target address of a subsequent instruction, a new physical register is assigned to the Load Registers target address. And with this assignment the alias is thus broken.
    Type: Grant
    Filed: March 7, 2008
    Date of Patent: March 8, 2011
    Assignee: International Business Machines Corporation
    Inventors: Brian David Barrick, Brian William Curran, Lee Evan Eisen
  • Patent number: 7904698
    Abstract: The electronic circuit contains a plurality of processing elements (10), which are supplied with instructions under control of a common program flow, typically for SIMD operation wherein the same instructions are applied to all processing elements and different operand data of the instructions to respective ones of the processing elements (10). Under control of the instructions each processing element (10) determines, whether an operand data dependent condition has occurred. The processing element outputs a condition signal dependent on said determination. The condition signals are summed to form a sum signal. Program flow is controlled by a conditional jump dependent on a value represented by the sum signal.
    Type: Grant
    Filed: February 9, 2006
    Date of Patent: March 8, 2011
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Richard P. Kleihorst, Anteneh A. Abbo, Sebastien F. Mouy
  • Patent number: 7877585
    Abstract: One embodiment of a computing system configured to manage divergent threads in a SIMD thread group includes a stack configured to store state information for processing control instructions. A parallel processing unit is configured to perform the steps of determining if one or more threads diverge during execution of a conditional control instruction. A disable mask allows for the use of conditional return and break instructions in a multithreaded SIMD architecture. Additional control instructions are used to set up thread processing target addresses for synchronization, breaks, and returns.
    Type: Grant
    Filed: August 27, 2007
    Date of Patent: January 25, 2011
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John R. Nickolls, John Erik Lindholm, Svetoslav D. Tzvetkov
  • Patent number: 7877581
    Abstract: A networking application processor is provided. The processor includes an input socket configured to receive data packets. The processor includes a memory for holding instructions and circuitry configured to access data structures associated with the processing stages. The circuitry configured to access data structures enables a single cycle access to an operand from a memory location. An arithmetic logic unit (ALU) is provided. Circuitry for aligning operands to be processed by the ALU is included. The circuitry for aligning the operands causes the operand to be aligned by a lowest significant bit, wherein the circuitry for aligning the operand supplies an extension to the operand to allow the ALU to process different size operands.
    Type: Grant
    Filed: December 2, 2003
    Date of Patent: January 25, 2011
    Assignee: PMC-Sierra US, Inc.
    Inventors: Shridhar Mukund, Mahesh Gopalan, Neeraj Kashalkar
  • Patent number: 7873812
    Abstract: The new system provides for efficient implementation of matrix multiplication in a SIMD processor. The new system provides ability to map any element of a source vector register to be paired with any element of a second source vector register for vector operations, and specifically vector multiply and vector-multiply-accumulate operations to implement a variety of matrix multiplications without the additional permute or data re-ordering instructions. Operations such as DCT and Color-space transformations for video processing could be very efficiently implemented using this system.
    Type: Grant
    Filed: April 5, 2004
    Date of Patent: January 18, 2011
    Inventor: Tibet Mimar
  • Patent number: 7844801
    Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.
    Type: Grant
    Filed: July 31, 2003
    Date of Patent: November 30, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen
  • Patent number: 7844800
    Abstract: A processor 2 utilising register renaming executes program instructions requiring a large number of architectural register specifiers to be renamed by dividing the renaming tasks into an initial set and a remaining set. The initial set are performed first and the results passed via a main channel 32 for further processing. The remaining set are performed in sequence with the results being passed via a background channel 34 for further processing. This technique is particularly useful for performing renaming operations for load/store multiple LDM instructions.
    Type: Grant
    Filed: August 21, 2007
    Date of Patent: November 30, 2010
    Assignee: ARM Limited
    Inventors: Melanie Emanuelle Lucie Vincent, Florent Begon, Cedric Denis Robert Airaud, Norbert Bernard Eugene Lataille
  • Patent number: 7844802
    Abstract: Ordering instructions for specifying the execution order of other instructions improve throughput in a pipelined multiprocessor. Memory write operations local to a CPU are allowed to occur in an arbitrary order, and constraints are placed on shared memory operations. Multiple sets of instructions are provided in which order of execution of the instructions is maintained through the use of CPU registers, write buffers in conjunction with assignment of sequence numbers to the instruction, or a hierarchical ordering system. The system ensures that an earlier designated instruction has reach a specified state of execution prior to a latter instruction reaching a specified state of execution. The ordering of operations allows memory operations local to a CPU to occur in conjunction with other memory operations that are not affected by such execution.
    Type: Grant
    Filed: June 24, 2008
    Date of Patent: November 30, 2010
    Assignee: International Business Machines Corporation
    Inventor: Paul E. McKenney
  • Patent number: 7840786
    Abstract: A memory subsystem includes a first memory, a second memory, a first compressor, and a first decompressor. The first memory is configured to store instruction bytes of a fetch window and to store first predecode information and first branch information that characterizes the instruction bytes of the fetch window. The second memory is configured to store the instruction bytes of the fetch window upon eviction of the instruction bytes from the first memory and to store combined predecode/branch information that also characterizes the instruction bytes of the fetch window. The first compressor is configured to compress the first predecode information and the first branch information into the combined predecode/branch information.
    Type: Grant
    Filed: April 16, 2007
    Date of Patent: November 23, 2010
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David Neal Suggs
  • Patent number: 7840783
    Abstract: A system, method, and computer program product are provided for performing a register renaming operation utilizing hardware which operates in at least two modes. In operation, hardware is operated in at least two modes including a first mode for operating the hardware using a logical register of a first bit width and a second mode for operating the hardware using a logical register of a second bit width. The first bit width is twice a width of the second bit width. Additionally, a register renaming operation is performed, including renaming at least one logical register to at least one physical register of the first bit width, utilizing the hardware.
    Type: Grant
    Filed: September 10, 2007
    Date of Patent: November 23, 2010
    Assignee: Netlogic Microsystems, Inc.
    Inventors: Gaurav Singh, Srivatsan Srinivasan, Ricardo Ramirez, Wei-Hsiang Chen, Hai Ngoc Nguyen
  • Patent number: 7822949
    Abstract: A command supply device supplies a command sequence that forms a loop. A loop command buffer accumulates a first partial command sequence. The first partial command sequence is a head part of a first command sequence repeatedly supplied to a CPU from among command sequences stored in a main memory, and is accumulated before the first command sequence is supplied to the CPU again. A linking command buffer accumulates a second partial command sequence. The second partial command sequence follows the first partial command sequence in the first command sequence, and is accumulated while the accumulated first partial command sequence in the loop command buffer is supplied to the CPU. A selection circuit supplies, to the CPU, a command from the accumulated second partial command sequence in the linking command buffer when the entirety of the first partial command sequence has been supplied to the CPU.
    Type: Grant
    Filed: May 9, 2005
    Date of Patent: October 26, 2010
    Assignee: Panasonic Corporation
    Inventor: Satoshi Ogura
  • Patent number: 7822952
    Abstract: Provided is a context switching device capable of reducing conflicts among accesses due to retrieving and saving of contexts by plural processors. The context switching device has: a transfer unit which transfers context data, according to one of (i) the first transfer mode in which the context data is transferred continuously through cycles by a processor, and (ii) the second transfer mode in which plural pieces of the context data are transferred alternately per cycle by switching respective processors of the context data; and a control unit which (i) decides the processor to be used in the first transfer mode and the processors to be used in the second transfer mode, when there is a conflict in requests of the processors for switching context data, the number of processors being more than M, and (ii) controls the transfer unit based on the decision.
    Type: Grant
    Filed: October 4, 2006
    Date of Patent: October 26, 2010
    Assignee: Panasonic Corporation
    Inventor: Masanori Hemmi
  • Patent number: 7802077
    Abstract: A new class traces for a processing engine, called “extended blocks,” possess an architecture that permits possible many entry points but only a single exit point. These extended blocks may be indexed based upon the address of the last instruction therein. Use of the new trace architecture provides several advantages, including reduction of instruction redundancies, dynamic block extension and a sharing of instructions among various extended blocks.
    Type: Grant
    Filed: June 30, 2000
    Date of Patent: September 21, 2010
    Assignee: Intel Corporation
    Inventors: Stephen J. Jourdan, Lihu Rappoport, Ronny Ronen, Adi Yoaz
  • Patent number: 7797521
    Abstract: A method, system, and computer program product are provided, for maintaining a path history register of register indirect branches. A set of bits is generated based on a set of target address bits using a hit selection and/or a hash function operation, and the generated set of bits is inserted into a path history register by shifting bits in the path history register and/or applying a hash operation, information corresponding to prior history is removed from the path history register, using a shift out operation and/or a hash operation. The path, history register is used to maintain a recent target, table and generate register-indirect branch target address predictions based on path history correlation between register-indirect branches captured by the path history register.
    Type: Grant
    Filed: April 12, 2007
    Date of Patent: September 14, 2010
    Assignee: International Business Machines Corporation
    Inventors: Richard J. Eickemeyer, Michael K. Gschwind, Ravi Nair, Robert A. Philhower
  • Patent number: 7797517
    Abstract: Reference architecture instructions are translated into target architecture operations. Sequences of operations, in a predicted execution order in some embodiments, form traces. In some embodiments, a trace is based on a plurality of basic blocks. In some embodiments, a trace is committed or aborted as a single entity. Sequences of operations are optimized by fusing collections of operations; fused operations specify a same observable function as respective collections, but advantageously enable more efficient processing. In some embodiments, a collection comprises multiple register operations. Fusing a register operation with a branch operation in a trace forms a fused reg-op/branch operation. In some embodiments, branch instructions translate into assert operations. Fusing an assert operation with another operation forms a fused assert operation. In some embodiments, fused operations only set architectural state, such as high-order portions of registers, that is subsequently read before being written.
    Type: Grant
    Filed: November 17, 2006
    Date of Patent: September 14, 2010
    Assignee: Oracle America, Inc.
    Inventor: John Gregory Favor
  • Patent number: 7761697
    Abstract: One embodiment of a computing system configured to manage divergent threads in a thread group includes a stack configured to store at least one token and a multithreaded processing unit. The multithreaded processing unit is configured to perform the steps of fetching a program instruction, determining that the program instruction is an indirect branch instruction, and processing the indirect branch instruction as a sequence of two-way branches to execute an indirect branch instruction with multiple branch addresses. Indirect branch instructions may be used to allow greater flexibility since the branch address or multiple branch addresses do not need to be determined at compile time.
    Type: Grant
    Filed: November 6, 2006
    Date of Patent: July 20, 2010
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm, Peter C. Mills, John R. Nickolls
  • Patent number: 7757067
    Abstract: A processor (e.g., a co-processor) comprising a decoder coupled to a pre-decoder, in which the decoder decodes a current instruction in parallel with the pre-decoder pre-decoding a subsequent instruction. In particular, the pre-decoder examines at least five Bytecodes in parallel with the decoder decoding a current instruction. The pre-decoder determines if a subsequent instruction contains a prefix. If a prefix is detected in at least one of the five Bytecodes, a program counter skips the prefix and changes the behavior of the decoder during the decoding of the subsequent instruction.
    Type: Grant
    Filed: July 31, 2003
    Date of Patent: July 13, 2010
    Assignee: Texas Instruments Incorporated
    Inventors: Gerard Chauvel, Serge Lasserre, Maija Kuusela
  • Patent number: 7747845
    Abstract: Disclosed is a method and apparatus providing the ability to create a multi-level prediction algorithm, whereby branch predictions beyond the first level of prediction are maintained at a secondary level because the prior level was unsuccessfully able to highly predict the direction of the stated branch accurately. A secondary level is smaller in size than the upper level through selected filtering thereby enabling high prediction accuracy of branches while minimizing the amount of hardware required to perform stated predictions.
    Type: Grant
    Filed: May 12, 2004
    Date of Patent: June 29, 2010
    Assignee: International Business Machines Corporation
    Inventors: Brian Robert Prasky, Moinuddin Khalil Ahmed Qureshi
  • Patent number: 7739483
    Abstract: A method and apparatus for dual-target register allocation is described, intended to enable the efficient mapping/renaming of registers associated with instructions within a pipelined microprocessor architecture.
    Type: Grant
    Filed: September 28, 2001
    Date of Patent: June 15, 2010
    Assignee: Intel Corporation
    Inventors: Rajesh Patel, James Dundas, Adi Yoaz
  • Patent number: 7730289
    Abstract: A method for preloading data in a CPU pipeline is provided, which includes the following steps. When a hint instruction is executed, allocate and initiate an entry in a preload table. When a load instruction is fetched, load a piece of data from a memory into the entry according to the entry. When a use instruction which uses the data loaded by the load instruction is executed, forward the data for the use instruction from the entry instead of from the memory. When the load instruction is executed, update the entry according to the load instruction.
    Type: Grant
    Filed: September 27, 2007
    Date of Patent: June 1, 2010
    Assignee: Faraday Technology Corp.
    Inventors: I-Jui Sung, Ming-Chung Kao