Patents Examined by Kenneth S. Kim

Lane crossing instruction selecting operand data bits conveyed from register via direct path and lane crossing path for execution

Patent number: 8560811

Abstract: The present invention provides a method and apparatus for handling lane-crossing instructions in an execution pipeline. One embodiment of the method includes conveying bits of an instruction from a register to an execution stage in a pipeline along a first data path that includes a lane crossing stage configured to change a first mapping of the register to the execution stage to a second mapping. The method also includes concurrently conveying the bits along a second data path from the register to the execution stage that bypasses the lane crossing stage. The method further includes selecting the first or second data path to provide the bits to the execution stage.

Type: Grant

Filed: August 5, 2010

Date of Patent: October 15, 2013

Assignee: Advanced Micro Devices, Inc.

Inventor: John M. King
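
The path selection described in the abstract of patent 8560811 can be illustrated with a minimal sketch. It is not the patented circuit: the function names, the four-lane register, and the `mapping` list are assumptions made for illustration only.

```python
def lane_cross(lanes, mapping):
    """Lane crossing stage (hypothetical model): output lane i takes input lane mapping[i]."""
    return [lanes[src] for src in mapping]


def operand_for_execution(register_lanes, mapping, is_lane_crossing):
    """Model both data paths, then select one for the execution stage."""
    # Second data path: bypasses the lane crossing stage, mapping unchanged.
    direct = list(register_lanes)
    # First data path: goes through the lane crossing stage, which remaps lanes.
    crossed = lane_cross(register_lanes, mapping)
    # Both values are available; a final selection picks the one the instruction needs.
    return crossed if is_lane_crossing else direct


# Example: a 4-lane register and a mapping that swaps the two halves (assumed values).
reg = [0xA, 0xB, 0xC, 0xD]
swap_halves = [2, 3, 0, 1]
crossed_result = operand_for_execution(reg, swap_halves, True)
direct_result = operand_for_execution(reg, swap_halves, False)
```

Computing both paths and selecting late mirrors the abstract's "concurrently conveying" wording: an instruction that needs no lane crossing can take the bypass path.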
SIMD processor with each processing element receiving buffered control signal from clocked register positioned in the middle of the group

Patent number: 8024550

Abstract: Disclosed is an SIMD-type microprocessor comprising a processor element group, plural processor elements with an operation part and a register file being arranged therein and a processor element control signal generator configured to output a processor element control signal controlling an operation of the processor element, wherein a feed part configured to feed a processor element control signal output from the processor element control signal generator to the processor element is provided at a center of the processor element group.

Type: Grant

Filed: January 21, 2009

Date of Patent: September 20, 2011

Assignee: Ricoh Company, Ltd.

Inventor: Hidehito Kitamura
Pipeline processor with write control and validity flags for controlling write-back of execution result data stored in pipeline buffer register

Patent number: 8019974

Abstract: A bypass circuit is provided in a pipeline processor. A pipeline register is provided between an instruction execution stage and a write-back stage. The pipeline register stores a data validity flag and a WRITE control flag to control writing data into a general purpose register unit. The data retained in the pipeline register is allowed to be written back into the general purpose register unit when the WRITE control flag indicates “valid”. The pipeline register continues to retain the retained data even after the writing of the retained data into the general purpose register unit. The first pipeline register supplies the retained data to the second stage through the bypass circuit at the time of executing a subsequent instruction having data dependency on a preceding instruction.

Type: Grant

Filed: January 12, 2009

Date of Patent: September 13, 2011

Assignee: Kabushiki Kaisha Toshiba

Inventor: Jun Tanabe
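
The flag behavior described in the abstract of patent 8019974 can be sketched roughly as follows. The `PipelineRegister` fields and the two helper functions are hypothetical names, and the logic is a simplified assumption, not the circuit claimed in the patent.

```python
from dataclasses import dataclass


@dataclass
class PipelineRegister:
    """Hypothetical register between the execution and write-back stages."""
    dest: int = 0               # destination general purpose register number
    data: int = 0               # retained execution result
    data_valid: bool = False    # data validity flag
    write_enable: bool = False  # WRITE control flag


def write_back(preg, regfile):
    """Write the retained data to the register file only when the WRITE flag is valid."""
    if preg.write_enable:
        regfile[preg.dest] = preg.data
        # Assumption: the WRITE flag is cleared after the write, while the data
        # and its validity flag stay in place so later bypassing still works.
        preg.write_enable = False


def read_operand(src, preg, regfile):
    """Bypass: a dependent instruction takes the retained data from the pipeline register."""
    if preg.data_valid and preg.dest == src:
        return preg.data
    return regfile[src]


regfile = [0] * 8
preg = PipelineRegister(dest=3, data=42, data_valid=True, write_enable=True)
before = read_operand(3, preg, regfile)   # forwarded before write-back happens
write_back(preg, regfile)
after = read_operand(3, preg, regfile)    # still forwarded from the retained data
```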
Updating instructions to free core in multi-core processor with core sequence table indicating linking of thread sequences for processing queued packets

Patent number: 8015392

Abstract: A method of updating execution instructions of a multi-core processor comprising receiving execution instructions at a processor including multiple programmable processing cores integrated on a single die, selecting a subset of at least one of the cores, and loading at least a portion of the execution instructions to the subset of cores and replacing existing execution instructions, associated with the first subset of programmable processing cores, with the received execution instructions while at least one of the other cores continues to process received packets, wherein a sequence of threads provided by the cores sequentially retrieve packets to process from at least one queue, the sequence proceeding from a subsequence of at least one thread of one core to a subsequence of at least one thread on another core and wherein the sequence of threads is specified by data identifying, at least, the next core in the sequence.

Type: Grant

Filed: September 29, 2004

Date of Patent: September 6, 2011

Assignee: Intel Corporation

Inventors: Uday Naik, Ching Boon Lee, Ai Bee Lim, Koji Sahara
Loop processing counter with automatic start time set or trigger modes in context reconfigurable PE array

Patent number: 7996661

Abstract: A dynamic reconfigurable circuit that implements optional processing by dynamically switching a processing content of a reconfigurable processing element (PE) and a connection content between the PEs in accordance with a context, includes: a configuration register section for setting a content of loop processing on the basis of the context, the loop processing content including an output source of an output signal from each of a set of the reconfigured PEs, an output destination of the output signal, and a condition for outputting the output signal to the output destination; and at least one counter circuit including a loop control section and an output register section that implement the set loop processing, that count the number of implementations of the loop processing implemented by the loop control section, and that output the output signal to the output destination based on the counted number of implementations and the condition.

Type: Grant

Filed: September 17, 2008

Date of Patent: August 9, 2011

Assignee: Fujitsu Semiconductor Limited

Inventors: Takashi Hanai, Shinichi Sutou, Masaki Arai, Mitsuharu Wakayoshi
Message passing module in hybrid computing system starting and sending operation information to service program for accelerator to execute application program

Patent number: 7984267

Abstract: Executing a service program for an accelerator application program in a hybrid computing environment that includes a host computer and an accelerator, the host computer and the accelerator adapted to one another for data communications by a system level message passing module; where the service program includes a host portion and an accelerator portion and executing a service program for an accelerator includes receiving, from the host portion, operating information for the accelerator portion; starting the accelerator portion on the accelerator; providing, to the accelerator portion, operating information for the accelerator application program; establishing direct data communications between the host portion and the accelerator portion; and, responsive to an instruction communicated directly from the host portion, executing the accelerator application program.

Type: Grant

Filed: September 4, 2008

Date of Patent: July 19, 2011

Assignee: International Business Machines Corporation

Inventors: Michael E. Aho, Ricardo M. Matinata, Amir F. Sanjar, Gordon G. Stewart, Cornell G. Wright, Jr.
Multi-core processors for 3D array transposition by logically retrieving in-place physically transposed sub-array data

Patent number: 7979672

Abstract: A method and system for transposing a multi-dimensional array for a multi-processor system having a main memory for storing the multi-dimensional array and a local memory is provided. One implementation involves partitioning the multi-dimensional array into a number of equally sized portions in the local memory, in each processor performing a transpose function including a logical transpose on one of said portions and then a physical transpose of said portion, and combining the transposed portions and storing back in their original place in the main memory.

Type: Grant

Filed: July 25, 2008

Date of Patent: July 12, 2011

Assignee: International Business Machines Corporation

Inventors: Ahmed H. M. R. El-Mahdy, Ali A. El-Moursy, Hisham ElShishiny
Re-executing launcher program upon termination of launched programs in MIMD mode booted SIMD partitions

Patent number: 7979674

Abstract: Executing MIMD programs on a SIMD machine, the SIMD machine including a plurality of compute nodes, each compute node capable of executing only a single thread of execution, the compute nodes initially configured exclusively for SIMD operations, the SIMD machine further comprising a data communications network, the network comprising synchronous data communications links among the compute nodes, including establishing one or more SIMD partitions, booting one or more SIMD partitions in MIMD mode; establishing a MIMD partition; executing by launcher programs a plurality of MIMD programs on two or more of the compute nodes of the MIMD partition; and re-executing a launcher program by an operating system on a compute node in the MIMD partition upon termination of the MIMD program executed by the launcher program.

Type: Grant

Filed: May 16, 2007

Date of Patent: July 12, 2011

Assignee: International Business Machines Corporation

Inventors: Todd A. Inglett, Patrick J. McCarthy, Amanda Peters, Thomas A. Budnik, Michael B. Mundy, Gordon G. Stewart
Reconfiguration of execution path upon verification of extension security information and disabling upon configuration change in instruction extensible microprocessor

Patent number: 7975126

Abstract: Described is microprocessor architecture that includes at least one reconfigurable execution path (e.g., implemented via FPGAs or CPLDs). When an instruction is fetched, a mechanism determines whether the reconfigurable execution path (and/or which path) will handle that instruction. A content addressable memory may be used to determine the execution path when fed the instruction's operational code, or an arbiter and multiplexer may resolve conflicts if multiple instruction decode blocks recognize the same instruction. The execution path may be dynamically reconfigured, activated or deactivated as needed, such as to extend an instruction set, to optimize instructions for a particular application program, to implement a peripheral device, to provide parallel computing, and/or based on power consumption and/or processing power needs. Security may be provided by having the reconfigurable execution path loaded from an extension file that is associated with metadata, including security information.

Type: Grant

Filed: March 19, 2009

Date of Patent: July 5, 2011

Assignee: Microsoft Corporation

Inventors: Richard Neil Pittman, Alessandro Forin, Nathaniel L. Lynch
Limiting entries in load issued premature part of load reorder queue searched to detect invalid retrieved values to between store safe and snoop safe pointers for the congruence class

Patent number: 7971033

Abstract: A method for reducing the number of load instructions in the load reorder queue (LRQ) that are searched when a load instruction is executed by a processor, including dispatching the load instructions; inserting the load instructions in the LRQ in program order; clearing a load received data field; executing the load instructions; checking load reorder queue (LRQ) entries; re-executing the load instruction of the matching LRQ entry; continuing execution; getting the load data; setting the load received data field; comparing a load sequence number (LSQN) of each load instruction to a snoop_safe register contents; ANDing all the load received data bits if the LSQN is greater in magnitude than the snoop_safe; setting the snoop_safe register to the LSQN of the load instruction; searching the LRQ entry; and setting a load_peril_snoop register to the LRQ index value where the first load instruction younger than the snoop_safe was found.

Type: Grant

Filed: July 14, 2008

Date of Patent: June 28, 2011

Assignee: International Business Machines

Inventors: Erik R. Altman, Vijayalakshmi Srinivasan
Determining length of instruction with escape and addressing form bytes without evaluating opcode

Patent number: 7966476

Abstract: A method, apparatus and system are disclosed for decoding an instruction in a variable-length instruction set. The instruction is one of a set of new types of instructions that uses a new escape code value, which is two bytes in length, to indicate that a third opcode byte includes the instruction-specific opcode for a new instruction. The new instructions are defined such that the length of each instruction in the opcode map for one of the new escape opcode values may be determined using the same set of inputs, where each of the inputs is relevant to determining the length of each instruction in the new opcode map. For at least one embodiment, the length of one of the new instructions is determined without evaluating the instruction-specific opcode.

Type: Grant

Filed: February 28, 2008

Date of Patent: June 21, 2011

Assignee: Intel Corporation

Inventors: James S. Coke, Peter J. Ruscito, Masood Tahir, David B. Jackson, Ves A. Naydenov, Scott D. Rodgers, Bret L. Toll, Frank Binns
Interleaving saturated lower half of data elements from two source registers of packed data

Patent number: 7966482

Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to pack the packed data responsive to a pack instruction received by the decoder. A first packed data element and a second packed data element are received from the first source register. A third packed data element and a fourth packed data element are received from the second source register. The circuit packs a portion of each of the packed data elements into a destination register, with the portion from the second packed data element adjacent to the portion from the first packed data element, and the portion from the fourth packed data element adjacent to the portion from the third packed data element.

Type: Grant

Filed: June 12, 2006

Date of Patent: June 21, 2011

Assignee: Intel Corporation

Inventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
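
The pack operation in the abstract of patent 7966482 can be sketched in a few lines. The element widths (16-bit signed inputs narrowed to 8-bit signed outputs), the signed saturation rule, and the function names are assumptions for illustration; the abstract itself only specifies the ordering of the packed portions.

```python
def saturate_signed(value, bits):
    """Clamp value to the range of a signed integer with the given bit width."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def pack_saturate(src1, src2, out_bits=8):
    """Narrow every element and place src1's results before src2's.

    Within the destination, the portion from the second element sits next to
    the portion from the first, and the fourth next to the third, as in the abstract.
    """
    return [saturate_signed(v, out_bits) for v in list(src1) + list(src2)]


# Two source registers holding two (assumed 16-bit) elements each.
packed = pack_saturate([1, 300], [-200, -5])
```

Here 300 and -200 fall outside the 8-bit signed range, so they clamp to 127 and -128 respectively, while 1 and -5 pass through unchanged.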
Limiting entries in load reorder queue searched for snoop check to between snoop peril and tail pointers

Patent number: 7966478

Abstract: A method for reducing entries searched in a load reorder queue (LRQ) when snoop instructions are executed by a processor, including checking load reorder queue (LRQ) entries located between a load_peril_snoop register and a lrq_tail register for addresses matching the address of the snoop; and setting a snooped bit in the LRQ entry for any matches found.

Type: Grant

Filed: July 14, 2008

Date of Patent: June 21, 2011

Assignee: International Business Machines Corporation

Inventors: Erik R. Altman, Vijayalakshmi Srinivasan
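
The bounded search in the abstract of patent 7966478 can be sketched as below. The circular-buffer layout, the exclusive tail pointer, and the `LrqEntry` fields are assumptions; only the register names `load_peril_snoop` and `lrq_tail` come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class LrqEntry:
    """Hypothetical load reorder queue entry."""
    address: int
    snooped: bool = False


def snoop_check(lrq, load_peril_snoop, lrq_tail, snoop_address):
    """Check only the entries from load_peril_snoop up to (not including) lrq_tail.

    Returns the number of entries searched, to show the reduction in work.
    """
    size = len(lrq)
    searched = 0
    i = load_peril_snoop
    while i != lrq_tail:
        if lrq[i].address == snoop_address:
            lrq[i].snooped = True  # mark the matching entry as snooped
        i = (i + 1) % size         # assumed circular queue
        searched += 1
    return searched


lrq = [LrqEntry(0x10), LrqEntry(0x20), LrqEntry(0x10), LrqEntry(0x30)]
entries_searched = snoop_check(lrq, load_peril_snoop=1, lrq_tail=3, snoop_address=0x10)
```

In this example entry 0 also holds address 0x10, but it lies outside the searched window, so it is neither examined nor marked.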
Dynamic runtime range checking of different types on a register using upper and lower bound value registers for the register

Patent number: 7962729

Abstract: Software defects (e.g., array access out of bounds, stack overflow, infinite loops, and data corruption) occur due to integer values falling outside their expected range. Because programming languages do not include range-checking instructions as part of their language, to detect software defects and ensure that the code runs smoothly, programmers generally use 1) runtime assertions and/or 2) sub-range data types. However, these techniques cause additional conditional branches, incur additional overhead, and decrease processor performance. Processors comprising a range checking hardware feature supported by machine instructions for runtime integer range checking can eliminate the conditional branches generated during runtime integer range checks. Programming language extensions for the range checking hardware can allow dynamic range bounds to be defined during runtime without decreasing the processor's performance. This can allow for easier programming and code that is easier to maintain.

Type: Grant

Filed: January 5, 2009

Date of Patent: June 14, 2011

Assignee: International Business Machines Corporation

Inventor: Jose G. Rivera
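
The idea in the abstract of patent 7962729 can be modeled in software as a register paired with lower and upper bound registers. The class and exception names are hypothetical, and the ordinary `if` below stands in for a check that the described hardware would perform without a conditional branch in the program.

```python
class RangeTrap(Exception):
    """Hypothetical trap raised when a value falls outside the register's bounds."""


class BoundedRegister:
    """A register with associated lower and upper bound registers (assumed model)."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper
        self.value = lower

    def set_bounds(self, lower, upper):
        # Bounds can be redefined at run time, as the abstract describes.
        self.lower = lower
        self.upper = upper

    def write(self, value):
        # Every write is range-checked against the current bounds.
        if not (self.lower <= value <= self.upper):
            raise RangeTrap(f"{value} not in [{self.lower}, {self.upper}]")
        self.value = value


index = BoundedRegister(0, 9)   # e.g. a valid index range for a 10-element array
index.write(5)
try:
    index.write(10)
    trapped = False
except RangeTrap:
    trapped = True
```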
Replaying memory operation assigned a load/store buffer entry occupied by store operation processed beyond exception reporting stage and retired from scheduler

Patent number: 7962730

Abstract: In one embodiment, a processor comprises a retire unit and a load/store unit coupled thereto. The retire unit is configured to retire a first store memory operation responsive to the first store memory operation having been processed at least to a pipeline stage at which exceptions are reported for the first store memory operation. The load/store unit comprises a queue having a first entry assigned to the first store memory operation. The load/store unit is configured to retain the first store memory operation in the first entry subsequent to retirement of the first store memory operation if the first store memory operation is not complete. The queue may have multiple entries, and more than one store may be retained in the queue after being retired by the retire unit.

Type: Grant

Filed: November 25, 2008

Date of Patent: June 14, 2011

Assignee: Apple Inc.

Inventors: Wei-Han Lien, Po-Yung Chang
Processing stream instruction in IC of mesh connected matrix of processors containing pipeline coupled switch transferring messages over consecutive cycles from one link to another link or memory

Patent number: 7958341

Abstract: In some embodiments, each matrix processor in a matrix of mesh-interconnected matrix processors includes an instruction processing pipeline, and a hardware data switch capable of streaming data to/from one or more inter-processor matrix links and/or a matrix processor local memory link in response to execution of a data streaming instruction by the instruction processing pipeline. The data switch can transfer each data stream, which includes multiple words, at wire speed, one word per cycle. After initiating a data stream, the processing pipeline can execute other instructions, including streaming instructions, while a stream transfer is in progress. Different data streaming instructions may be used to transfer data streams from local memory to one or more inter-processor links, from an inter-processor link to local memory, from an inter-processor link to one or more inter-processor links, and from an inter-processor link to one or more inter-processor links and synchronously to local memory.

Type: Grant

Filed: July 7, 2008

Date of Patent: June 7, 2011

Assignee: Ovics

Inventors: Sorin C Cismas, Ilie Garbacea
Branch prediction table storing addresses with compressed high order bits

Patent number: 7949862

Abstract: An address control section includes an encoding section to generate higher-order address information made by compressing a predetermined higher-order bit part from predetermined higher-order and lower-order bit parts included in an instruction address, and a restoring section to restore the higher-order bit part from the higher-order address information. A branch instruction predicting section includes a history memory section that stores the higher-order bit part and the lower-order bit part corresponding to a branch address of a processed branch instruction at either one of a plurality of storing places determined from the higher-order bit part and the lower-order bit part corresponding to a branch address of a processed branch instruction.

Type: Grant

Filed: August 21, 2008

Date of Patent: May 24, 2011

Assignee: Fujitsu Limited

Inventors: Megumi Yokoi, Masaki Ukai, Takashi Suzuki
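
The encode and restore steps in the abstract of patent 7949862 can be sketched with one possible compression scheme. The abstract does not say how the higher-order bits are compressed; the small lookup table, the 12-bit split, and all names below are assumptions for illustration.

```python
class HighBitsCompressor:
    """Hypothetical encoder: replaces the high-order address bits with a short table index."""

    def __init__(self, low_bits=12, slots=4):
        self.low_bits = low_bits
        self.slots = slots
        self.table = []  # table index -> full high-order bit part

    def encode(self, address):
        """Split the address and compress its high-order part into a table index."""
        high = address >> self.low_bits
        low = address & ((1 << self.low_bits) - 1)
        if high not in self.table:
            if len(self.table) >= self.slots:
                # Assumption: a full table is reported as an error in this sketch.
                raise OverflowError("no free slot for new high-order bits")
            self.table.append(high)
        return self.table.index(high), low

    def restore(self, tag, low):
        """Rebuild the full address from the compressed tag and the low-order part."""
        return (self.table[tag] << self.low_bits) | low


comp = HighBitsCompressor(low_bits=12, slots=2)
tag, low = comp.encode(0x7FFF1234)
restored = comp.restore(tag, low)
```

The scheme relies on branch addresses sharing a few high-order patterns, so a prediction table entry can hold a short tag in place of the full upper bits.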
Scheduler in multi-threaded processor prioritizing instructions passing qualification rule

Patent number: 7949855

Abstract: A processor buffers asynchronous threads. Instructions requiring operations provided by a plurality of execution units are divided into phases, each phase having at least one computation operation and at least one memory access operation. Instructions within each phase are qualified and prioritized. The instructions may be qualified based on the status of the execution unit needed to execute one or more of the current instructions. The instructions may also be qualified based on an age of each instruction, status of the execution units, a divergence potential, locality, thread diversity, and resource requirements. Qualified instructions may be prioritized based on execution units needed to execute instructions and the execution units in use. One or more of the prioritized instructions is issued per cycle to the plurality of execution units.

Type: Grant

Filed: April 28, 2008

Date of Patent: May 24, 2011

Assignee: NVIDIA Corporation

Inventors: Peter C. Mills, John Erik Lindholm, Brett W. Coon, Gary M. Tarolli, John Matthew Burgess
Conditional execution of floating point store instruction by simultaneously reading condition code and store data from multi-port register file

Patent number: 7945766

Abstract: A processor capable of executing conditional store instructions without being limited by the number of condition codes is provided. Condition data is stored in floating-point registers, and an operation unit executes a conditional floating-point store instruction of determining whether to store, in cache, store data.

Type: Grant

Filed: November 25, 2008

Date of Patent: May 17, 2011

Assignee: Fujitsu Limited

Inventor: Toshio Yoshida
Groups of serially coupled processor cores propagating memory write packet while maintaining coherency within each group towards a switch coupled to memory partitions

Patent number: 7941637

Abstract: A system has a first plurality of cores in a first coherency group. Each core transfers data in packets. The cores are directly coupled serially to form a serial path. The data packets are transferred along the serial path. The serial path is coupled at one end to a packet switch. The packet switch is coupled to a memory. The first plurality of cores and the packet switch are on an integrated circuit. The memory may or may not be on the integrated circuit. In another aspect a second plurality of cores in a second coherency group is coupled to the packet switch. The cores of the first and second pluralities may be reconfigured to form or become part of coherency groups different from the first and second coherency groups.

Type: Grant

Filed: April 15, 2008

Date of Patent: May 10, 2011

Assignee: Freescale Semiconductor, Inc.

Inventors: Perry H. Pelley, III, George P. Hoekstra, Lucio F. C. Pessoa
