Patents Examined by Jesse R Moll

Load register instruction short circuiting method

Patent number: 7904697

Abstract: An apparatus and method for executing a Load Register instruction in which the source data of the Load Register instruction is retained in its original physical register while the architected target register is mapped to this same physical target register. In this state the two architected registers alias to one physical register. When the source register of the Load Address instruction is specified as the target address of a subsequent instruction, a free physical register is assigned to the Load Register's source register. And with this assignment the alias is thus broken. Similarly when the target register of the Load Address instruction is the target address of a subsequent instruction, a new physical register is assigned to the Load Register's target address. And with this assignment the alias is thus broken.

Type: Grant

Filed: March 7, 2008

Date of Patent: March 8, 2011

Assignee: International Business Machines Corporation

Inventors: Brian David Barrick, Brian William Curran, Lee Evan Eisen
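
The aliasing described in the abstract above can be sketched as a toy rename table. This is only an illustration under stated assumptions, not the patented design: the class `RenameMap`, its method names, and the policy of freeing a physical register as soon as no architected register maps to it are all hypothetical (real hardware would typically defer freeing until retirement).

```python
class RenameMap:
    """Toy rename table: architected register name -> physical register id."""

    def __init__(self, arch_regs, num_physical):
        # Assumption: each architected register starts with its own physical register.
        self.mapping = {reg: i for i, reg in enumerate(arch_regs)}
        self.free = list(range(len(arch_regs), num_physical))

    def _release_if_unused(self, phys):
        # Simplification: free a physical register once nothing maps to it.
        if phys not in self.mapping.values():
            self.free.append(phys)

    def load_register(self, target, source):
        # Short-circuit the copy: the target is mapped to the source's
        # physical register, so two architected names alias one register.
        old = self.mapping[target]
        self.mapping[target] = self.mapping[source]
        self._release_if_unused(old)

    def write(self, target):
        # A later instruction that writes `target` gets a fresh physical
        # register, which breaks any alias involving `target`.
        # Raises IndexError if no physical register is free.
        old = self.mapping[target]
        new = self.free.pop(0)
        self.mapping[target] = new
        self._release_if_unused(old)
        return new
```

For example, after `load_register("r2", "r1")` both names share one physical register; a subsequent `write("r1")` or `write("r2")` assigns a new one to the written name and leaves the other untouched.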
Electronic parallel processing circuit for performing jump instructions

Patent number: 7904698

Abstract: The electronic circuit contains a plurality of processing elements (10), which are supplied with instructions under control of a common program flow, typically for SIMD operation wherein the same instructions are applied to all processing elements and different operand data of the instructions to respective ones of the processing elements (10). Under control of the instructions each processing element (10) determines, whether an operand data dependent condition has occurred. The processing element outputs a condition signal dependent on said determination. The condition signals are summed to form a sum signal. Program flow is controlled by a conditional jump dependent on a value represented by the sum signal.

Type: Grant

Filed: February 9, 2006

Date of Patent: March 8, 2011

Assignee: Koninklijke Philips Electronics N.V.

Inventors: Richard P. Kleihorst, Anteneh A. Abbo, Sebastien F. Mouy
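
A minimal software sketch of the summed condition signal described above. The abstract says only that the jump depends on a value represented by the sum; the threshold comparison, the function names `condition_sum` and `next_pc`, and the `pc + 1` fall-through are assumptions made for illustration.

```python
def condition_sum(operands, predicate):
    # Each processing element emits 1 if its operand meets the condition,
    # else 0; the signals are added to form the sum signal.
    return sum(1 for value in operands if predicate(value))


def next_pc(pc, jump_target, operands, predicate, threshold=1):
    # Assumed policy: take the jump when at least `threshold` processing
    # elements report the condition; otherwise fall through.
    if condition_sum(operands, predicate) >= threshold:
        return jump_target
    return pc + 1
```

With operands `[3, -1, 7, -5]` and the predicate `lambda x: x < 0`, two elements report the condition, so a threshold of 2 takes the jump and a threshold of 3 does not.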
Structured programming control flow in a SIMD architecture

Patent number: 7877585

Abstract: One embodiment of a computing system configured to manage divergent threads in a SIMD thread group includes a stack configured to store state information for processing control instructions. A parallel processing unit is configured to perform the steps of determining if one or more threads diverge during execution of a conditional control instruction. A disable mask allows for the use of conditional return and break instructions in a multithreaded SIMD architecture. Additional control instructions are used to set up thread processing target addresses for synchronization, breaks, and returns.

Type: Grant

Filed: August 27, 2007

Date of Patent: January 25, 2011

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, John R. Nickolls, John Erik Lindholm, Svetoslav D. Tzvetkov
Networked processor for a pipeline architecture

Patent number: 7877581

Abstract: A networking application processor is provided. The processor includes an input socket configured to receive data packets. The processor includes a memory for holding instructions and circuitry configured to access data structures associated with the processing stages. The circuitry configured to access data structures enables a single cycle access to an operand from a memory location. An arithmetic logic unit (ALU) is provided. Circuitry for aligning operands to be processed by the ALU is included. The circuitry for aligning the operands causes the operand to be aligned by a lowest significant bit, wherein the circuitry for aligning the operand supplies an extension to the operand to allow the ALU to process different size operands.

Type: Grant

Filed: December 2, 2003

Date of Patent: January 25, 2011

Assignee: PMC-Sierra US, Inc.

Inventors: Shridhar Mukund, Mahesh Gopalan, Neeraj Kashalkar
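
The operand extension mentioned in the abstract above could look roughly like the following. The abstract does not say whether the extension is a zero or sign extension, nor what the ALU width is; the `signed` flag, the 32-bit default, and the name `extend_operand` are assumptions for this sketch.

```python
def extend_operand(value, width, alu_width=32, signed=False):
    # Keep only the low `width` bits, so the operand is aligned at the
    # least significant bit of the ALU input.
    mask = (1 << width) - 1
    value &= mask
    # Assumed behavior: optionally replicate the operand's top bit into
    # the upper ALU bits (sign extension); otherwise the upper bits stay zero.
    if signed and (value >> (width - 1)) & 1:
        value |= ((1 << alu_width) - 1) & ~mask
    return value
```

Under these assumptions an 8-bit `0xFF` extends to `0x000000FF` unsigned and `0xFFFFFFFF` signed, so a single 32-bit ALU can process either.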
Method and system for efficient matrix multiplication in a SIMD processor architecture

Patent number: 7873812

Abstract: The new system provides for efficient implementation of matrix multiplication in a SIMD processor. The new system provides ability to map any element of a source vector register to be paired with any element of a second source vector register for vector operations, and specifically vector multiply and vector-multiply-accumulate operations to implement a variety of matrix multiplications without the additional permute or data re-ordering instructions. Operations such as DCT and Color-space transformations for video processing could be very efficiently implemented using this system.

Type: Grant

Filed: April 5, 2004

Date of Patent: January 18, 2011

Inventor: Tibet Mimar
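
One way to picture the element mapping described above is a multiply-accumulate in which each lane chooses which element of the second source vector it reads. The sketch below is an assumption-laden illustration: `vmac`, its `select` argument, and the column-major `mat_vec` helper are hypothetical names, not the instruction set from the patent.

```python
def vmac(acc, a, b, select):
    # Lane i computes acc[i] + a[i] * b[select[i]]; `select` pairs any
    # element of `b` with lane i, so no separate permute step is needed.
    return [acc[i] + a[i] * b[select[i]] for i in range(len(a))]


def mat_vec(columns, x):
    # Matrix-vector product built from one vmac per matrix column:
    # column j is scaled by x[j], which is selected in every lane.
    acc = [0] * len(columns[0])
    for j, column in enumerate(columns):
        acc = vmac(acc, column, x, [j] * len(column))
    return acc
```

For the matrix with rows `[1, 2]` and `[3, 4]`, stored as columns `[[1, 3], [2, 4]]`, and `x = [5, 6]`, `mat_vec` returns `[17, 39]`.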
Method and apparatus for affinity-guided speculative helper threads in chip multiprocessors

Patent number: 7844801

Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.

Type: Grant

Filed: July 31, 2003

Date of Patent: November 30, 2010

Assignee: Intel Corporation

Inventors: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen
Method for renaming a large number of registers in a data processing system using a background channel

Patent number: 7844800

Abstract: A processor 2 utilising register renaming executes program instructions requiring a large number of architectural register specifiers to be renamed by dividing the renaming tasks into an initial set and a remaining set. The initial set are performed first and the results passed via a main channel 32 for further processing. The remaining set are performed in sequence with the results being passed via a background channel 34 for further processing. This technique is particularly useful for performing renaming operations for load/store multiple LDM instructions.

Type: Grant

Filed: August 21, 2007

Date of Patent: November 30, 2010

Assignee: ARM Limited

Inventors: Melanie Emanuelle Lucie Vincent, Florent Begon, Cedric Denis Robert Airaud, Norbert Bernard Eugene Lataille
Instructions for ordering execution in pipelined processes

Patent number: 7844802

Abstract: Ordering instructions for specifying the execution order of other instructions improve throughput in a pipelined multiprocessor. Memory write operations local to a CPU are allowed to occur in an arbitrary order, and constraints are placed on shared memory operations. Multiple sets of instructions are provided in which order of execution of the instructions is maintained through the use of CPU registers, write buffers in conjunction with assignment of sequence numbers to the instruction, or a hierarchical ordering system. The system ensures that an earlier designated instruction has reached a specified state of execution prior to a latter instruction reaching a specified state of execution. The ordering of operations allows memory operations local to a CPU to occur in conjunction with other memory operations that are not affected by such execution.

Type: Grant

Filed: June 24, 2008

Date of Patent: November 30, 2010

Assignee: International Business Machines Corporation

Inventor: Paul E. McKenney
Techniques for storing instructions and related information in a memory hierarchy

Patent number: 7840786

Abstract: A memory subsystem includes a first memory, a second memory, a first compressor, and a first decompressor. The first memory is configured to store instruction bytes of a fetch window and to store first predecode information and first branch information that characterizes the instruction bytes of the fetch window. The second memory is configured to store the instruction bytes of the fetch window upon eviction of the instruction bytes from the first memory and to store combined predecode/branch information that also characterizes the instruction bytes of the fetch window. The first compressor is configured to compress the first predecode information and the first branch information into the combined predecode/branch information.

Type: Grant

Filed: April 16, 2007

Date of Patent: November 23, 2010

Assignee: Advanced Micro Devices, Inc.

Inventor: David Neal Suggs
System and method for performing a register renaming operation utilizing hardware which is capable of operating in at least two modes utilizing registers of multiple widths

Patent number: 7840783

Abstract: A system, method, and computer program product are provided for performing a register renaming operation utilizing hardware which operates in at least two modes. In operation, hardware is operated in at least two modes including a first mode for operating the hardware using a logical register of a first bit width and a second mode for operating the hardware using a logical register of a second bit width. The first bit width is twice a width of the second bit width. Additionally, a register renaming operation is performed, including renaming at least one logical register to at least one physical register of the first bit width, utilizing the hardware.

Type: Grant

Filed: September 10, 2007

Date of Patent: November 23, 2010

Assignee: Netlogic Microsystems, Inc.

Inventors: Gaurav Singh, Srivatsan Srinivasan, Ricardo Ramirez, Wei-Hsiang Chen, Hai Ngoc Nguyen
Command supply device that supplies a command read out from a main memory to a central processing unit

Patent number: 7822949

Abstract: A command supply device supplies a command sequence that forms a loop. A loop command buffer accumulates a first partial command sequence. The first partial command sequence is a head part of a first command sequence repeatedly supplied to a CPU from among command sequences stored in a main memory, and is accumulated before the first command sequence is supplied to the CPU again. A linking command buffer accumulates a second partial command sequence. The second partial command sequence follows the first partial command sequence in the first command sequence, and is accumulated while the accumulated first partial command sequence in the loop command buffer is supplied to the CPU. A selection circuit supplies, to the CPU, a command from the accumulated second partial command sequence in the linking command buffer when the entirety of the first partial command sequence has been supplied to the CPU.

Type: Grant

Filed: May 9, 2005

Date of Patent: October 26, 2010

Assignee: Panasonic Corporation

Inventor: Satoshi Ogura
Context switching device

Patent number: 7822952

Abstract: Provided is a context switching device capable of reducing conflicts among accesses due to retrieving and saving of contexts by plural processors. The context switching device has: a transfer unit which transfers context data, according to one of (i) the first transfer mode in which the context data is transferred continuously through cycles by a processor, and (ii) the second transfer mode in which plural pieces of the context data are transferred alternately per cycle by switching respective processors of the context data; and a control unit which (i) decides the processor to be used in the first transfer mode and the processors to be used in the second transfer mode, when there is a conflict in requests of the processors for switching context data, the number of processors being more than M, and (ii) controls the transfer unit based on the decision.

Type: Grant

Filed: October 4, 2006

Date of Patent: October 26, 2010

Assignee: Panasonic Corporation

Inventor: Masanori Hemmi
Trace indexing via trace end addresses

Patent number: 7802077

Abstract: A new class of traces for a processing engine, called “extended blocks,” possess an architecture that permits possibly many entry points but only a single exit point. These extended blocks may be indexed based upon the address of the last instruction therein. Use of the new trace architecture provides several advantages, including reduction of instruction redundancies, dynamic block extension and a sharing of instructions among various extended blocks.

Type: Grant

Filed: June 30, 2000

Date of Patent: September 21, 2010

Assignee: Intel Corporation

Inventors: Stephen J. Jourdan, Lihu Rappoport, Ronny Ronen, Adi Yoaz
Method, system, and computer program product for path-correlated indirect address predictions

Patent number: 7797521

Abstract: A method, system, and computer program product are provided for maintaining a path history register of register-indirect branches. A set of bits is generated based on a set of target address bits using a bit selection and/or a hash function operation, and the generated set of bits is inserted into a path history register by shifting bits in the path history register and/or applying a hash operation. Information corresponding to prior history is removed from the path history register using a shift out operation and/or a hash operation. The path history register is used to maintain a recent target table and generate register-indirect branch target address predictions based on path history correlation between register-indirect branches captured by the path history register.

Type: Grant

Filed: April 12, 2007

Date of Patent: September 14, 2010

Assignee: International Business Machines Corporation

Inventors: Richard J. Eickemeyer, Michael K. Gschwind, Ravi Nair, Robert A. Philhower
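
The shift-based variant of the path history update described above can be sketched as follows. The specific widths, the choice of which target address bits to select, and the XOR indexing of the recent target table are assumptions made for illustration; the abstract also allows hash-based alternatives that are not shown here.

```python
def update_path_history(history, target_address, history_bits=16, bits_per_target=2):
    # Bit selection (assumed): skip the two low address bits and keep the
    # next `bits_per_target` bits of the branch target.
    selected = (target_address >> 2) & ((1 << bits_per_target) - 1)
    # Shift the new bits in; masking to `history_bits` shifts the oldest
    # history out of the register.
    return ((history << bits_per_target) | selected) & ((1 << history_bits) - 1)


def table_index(history, branch_address, index_bits=8):
    # Assumed indexing: XOR the path history with branch address bits to
    # pick an entry in a recent target table.
    return (history ^ (branch_address >> 2)) & ((1 << index_bits) - 1)
```

Starting from an empty history, targets `0x1004` and `0x100C` produce the history values 1 and then 7 under these assumptions; with a 4-bit register a further target of `0x1008` yields 14, the oldest bits having been shifted out.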
Trace optimization via fusing operations of a target architecture operation set

Patent number: 7797517

Abstract: Reference architecture instructions are translated into target architecture operations. Sequences of operations, in a predicted execution order in some embodiments, form traces. In some embodiments, a trace is based on a plurality of basic blocks. In some embodiments, a trace is committed or aborted as a single entity. Sequences of operations are optimized by fusing collections of operations; fused operations specify a same observable function as respective collections, but advantageously enable more efficient processing. In some embodiments, a collection comprises multiple register operations. Fusing a register operation with a branch operation in a trace forms a fused reg-op/branch operation. In some embodiments, branch instructions translate into assert operations. Fusing an assert operation with another operation forms a fused assert operation. In some embodiments, fused operations only set architectural state, such as high-order portions of registers, that is subsequently read before being written.

Type: Grant

Filed: November 17, 2006

Date of Patent: September 14, 2010

Assignee: Oracle America, Inc.

Inventor: John Gregory Favor
Processing an indirect branch instruction in a SIMD architecture

Patent number: 7761697

Abstract: One embodiment of a computing system configured to manage divergent threads in a thread group includes a stack configured to store at least one token and a multithreaded processing unit. The multithreaded processing unit is configured to perform the steps of fetching a program instruction, determining that the program instruction is an indirect branch instruction, and processing the indirect branch instruction as a sequence of two-way branches to execute an indirect branch instruction with multiple branch addresses. Indirect branch instructions may be used to allow greater flexibility since the branch address or multiple branch addresses do not need to be determined at compile time.

Type: Grant

Filed: November 6, 2006

Date of Patent: July 20, 2010

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, John Erik Lindholm, Peter C. Mills, John R. Nickolls
Pre-decoding bytecode prefixes selectively incrementing stack machine program counter

Patent number: 7757067

Abstract: A processor (e.g., a co-processor) comprising a decoder coupled to a pre-decoder, in which the decoder decodes a current instruction in parallel with the pre-decoder pre-decoding a subsequent instruction. In particular, the pre-decoder examines at least five Bytecodes in parallel with the decoder decoding a current instruction. The pre-decoder determines if a subsequent instruction contains a prefix. If a prefix is detected in at least one of the five Bytecodes, a program counter skips the prefix and changes the behavior of the decoder during the decoding of the subsequent instruction.

Type: Grant

Filed: July 31, 2003

Date of Patent: July 13, 2010

Assignee: Texas Instruments Incorporated

Inventors: Gerard Chauvel, Serge Lasserre, Maija Kuusela
State machine based filtering of non-dominant branches to use a modified gshare scheme

Patent number: 7747845

Abstract: Disclosed is a method and apparatus providing the ability to create a multi-level prediction algorithm, whereby branch predictions beyond the first level of prediction are maintained at a secondary level because the prior level was unsuccessfully able to highly predict the direction of the stated branch accurately. A secondary level is smaller in size than the upper level through selected filtering thereby enabling high prediction accuracy of branches while minimizing the amount of hardware required to perform stated predictions.

Type: Grant

Filed: May 12, 2004

Date of Patent: June 29, 2010

Assignee: International Business Machines Corporation

Inventors: Brian Robert Prasky, Moinuddin Khalil Ahmed Qureshi
Method and apparatus for increasing load bandwidth

Patent number: 7739483

Abstract: A method and apparatus for dual-target register allocation is described, intended to enable the efficient mapping/renaming of registers associated with instructions within a pipelined microprocessor architecture.

Type: Grant

Filed: September 28, 2001

Date of Patent: June 15, 2010

Assignee: Intel Corporation

Inventors: Rajesh Patel, James Dundas, Adi Yoaz
Method for preloading data in a CPU pipeline

Patent number: 7730289

Abstract: A method for preloading data in a CPU pipeline is provided, which includes the following steps. When a hint instruction is executed, allocate and initiate an entry in a preload table. When a load instruction is fetched, load a piece of data from a memory into the entry according to the entry. When a use instruction which uses the data loaded by the load instruction is executed, forward the data for the use instruction from the entry instead of from the memory. When the load instruction is executed, update the entry according to the load instruction.

Type: Grant

Filed: September 27, 2007

Date of Patent: June 1, 2010

Assignee: Faraday Technology Corp.

Inventors: I-Jui Sung, Ming-Chung Kao
