Patents Examined by Eddie P. Chan

System and method for prioritizing store instructions

Patent number: 7865700

Abstract: The present invention provides a system and method for prioritizing store instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one store instruction is in the issue group and, if so, schedule the at least one store instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one store instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit.

Type: Grant

Filed: February 19, 2008

Date of Patent: January 4, 2011

Assignee: International Business Machines Corporation

Inventor: David A. Luick

Processor instruction including option bits encoding which instructions of an instruction packet to execute

Patent number: 7861061

Abstract: A processor and a method for executing VLIW instructions by first fetching a VLIW instruction packet and then identifying, from option bits encoded in a first one of the instructions within the fetched packet, which, if any, of the remaining instructions within the packet are to be executed in the same execution cycle as the first instruction. Finally, the first instruction and any remaining instructions identified from the encoded option bits are executed.

Type: Grant

Filed: May 23, 2003

Date of Patent: December 28, 2010

Assignee: STMicroelectronics (R&D) Ltd.

Inventor: Zahid Hussain
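
The option-bit selection described in the abstract of patent 7861061 can be illustrated with a minimal Python sketch. The bit layout (bit i of the option field selects the (i+1)-th instruction of the packet) and the names `select_for_cycle`, `packet`, and `option_bits` are assumptions made for illustration; the abstract does not specify an encoding.

```python
def select_for_cycle(packet, option_bits):
    """Return the instructions of a fetched packet to execute in this cycle.

    packet      -- list of instructions; packet[0] is the first instruction,
                   which carries the option bits
    option_bits -- int; bit i set means packet[i + 1] executes in the same
                   cycle as packet[0] (hypothetical layout, not from the patent)
    """
    selected = [packet[0]]  # the first instruction always executes
    for i, instr in enumerate(packet[1:]):
        if option_bits & (1 << i):
            selected.append(instr)
    return selected


packet = ["add", "mul", "load", "store"]
chosen = select_for_cycle(packet, 0b101)  # selects "mul" (bit 0) and "store" (bit 2)
```

With option bits of zero, only the first instruction of the packet is returned.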
System and method for handling load and/or store operations in a superscalar microprocessor

Patent number: 7861069

Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.

Type: Grant

Filed: December 19, 2006

Date of Patent: December 28, 2010

Assignee: Seiko-Epson Corporation

Inventors: Cheryl D. Senter, Johannes Wang
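
The out-of-order load condition in the abstract of patent 7861069 (no address collision and no write pending) can be sketched as a simple check. This is a simplified illustration: it compares whole addresses for equality rather than byte ranges, and the function name and the use of `None` for a not-yet-calculated store address are assumptions.

```python
from typing import Optional, Sequence


def can_issue_load_early(load_addr: int,
                         older_store_addrs: Sequence[Optional[int]]) -> bool:
    """Decide whether a load may bypass the older stores still in flight.

    older_store_addrs holds one entry per older store; None marks a store
    whose address has not been calculated yet (a "write pending").
    """
    for store_addr in older_store_addrs:
        if store_addr is None:
            return False  # write pending: the store might hit load_addr
        if store_addr == load_addr:
            return False  # address collision with an older store
    return True


ok = can_issue_load_early(0x1000, [0x2000, 0x3000])        # no conflict
blocked = can_issue_load_early(0x1000, [0x2000, None])     # write pending
```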
Delay slot handling in a processor

Patent number: 7861063

Abstract: In one embodiment, a processor comprises a fetch unit and a pick unit. The fetch unit is configured to fetch instructions for execution by the processor. The pick unit is configured to schedule instructions fetched by the fetch unit for execution in the processor. The pick unit is configured to inhibit scheduling a delayed control transfer instruction (DCTI) until a delay slot instruction of the DCTI is available for scheduling. For example, in some embodiments, the pick unit may inhibit scheduling until the delay slot instruction is written to an instruction buffer, until the delay slot instruction is fetched, etc.

Type: Grant

Filed: June 30, 2004

Date of Patent: December 28, 2010

Assignee: Oracle America, Inc.

Inventors: Robert T. Golla, Paul J. Jordan, Jama I. Barreh

Trace compression method for debug and trace interface wherein differences of register contents between logically adjacent registers are packed and increases of program counter addresses are categorized

Patent number: 7861070

Abstract: The present invention proposes a trace compression method for a debug and trace interface of a microprocessor, in which the debug and trace interface is associated with a plurality of registers for storing data. The trace compression method comprises the steps of: (1) finding register content of each of the registers in a first cycle and register content of each of the registers in a second cycle, in which the second cycle is next to the first cycle; (2) calculating the difference between the register content of each of the registers in the second cycle and the register content of each of the registers in the first cycle; and (3) packing the differences of the register contents into data trace packets, in which the differences of the register contents of adjacent registers are condensed into a single data trace packet when the differences of the register contents of the adjacent registers are zeroes.

Type: Grant

Filed: June 12, 2008

Date of Patent: December 28, 2010

Assignee: National Tsing Hua University

Inventors: Chih Tsun Huang, Yen Ju Ho, Ming Chang Hsieh
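
The three steps in the abstract of patent 7861070 (read register contents in two consecutive cycles, take per-register differences, condense adjacent zero differences into one packet) can be sketched as follows. The packet tuples `("zeros", start, count)` and `("diff", index, value)` are a hypothetical format chosen for illustration, not the patent's actual packet layout.

```python
def pack_register_diffs(prev, curr):
    """Pack per-register differences between two consecutive cycles.

    prev, curr -- register contents in the first and second cycle.
    Runs of adjacent registers whose difference is zero collapse into a
    single ("zeros", start_index, run_length) packet; every other register
    gets its own ("diff", index, difference) packet.
    """
    diffs = [c - p for p, c in zip(prev, curr)]
    packets = []
    i = 0
    while i < len(diffs):
        if diffs[i] == 0:
            start = i
            while i < len(diffs) and diffs[i] == 0:
                i += 1
            packets.append(("zeros", start, i - start))
        else:
            packets.append(("diff", i, diffs[i]))
            i += 1
    return packets


packets = pack_register_diffs([1, 2, 3, 4, 5], [1, 2, 7, 4, 5])
# Only register 2 changed, so five registers are described by three packets.
```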
Parallel data processing systems and methods using cooperative thread arrays and thread identifier values to determine processing behavior

Patent number: 7861060

Abstract: Parallel data processing systems and methods use cooperative thread arrays (CTAs), i.e., groups of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique identifier (thread ID) that can be assigned at thread launch time. The thread ID controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Mechanisms for loading and launching CTAs in a representative processing core and for synchronizing threads within a CTA are also described.

Type: Grant

Filed: December 15, 2005

Date of Patent: December 28, 2010

Assignee: NVIDIA Corporation

Inventors: John R. Nickolls, Stephen D. Lew
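
The role of the thread ID in the abstract of patent 7861060 can be illustrated with a small sequential simulation in Python. Every thread runs the same function and uses its ID to pick which input elements it reads and which output elements it writes. The strided partitioning, the squaring workload, and the names `thread_body` and `launch_cta` are assumptions for illustration; a real CTA would run the threads concurrently on a processing core.

```python
def thread_body(thread_id, num_threads, data, out):
    """Program executed by every thread; the thread ID selects its elements."""
    for i in range(thread_id, len(data), num_threads):
        out[i] = data[i] * data[i]


def launch_cta(num_threads, data):
    """Sequential stand-in for launching a CTA of num_threads threads."""
    out = [None] * len(data)
    for tid in range(num_threads):  # thread IDs assigned at launch time
        thread_body(tid, num_threads, data, out)
    return out


squares = launch_cta(4, [1, 2, 3, 4, 5, 6])
```

Because each output index belongs to exactly one thread ID, the threads never write the same element and need no synchronization for this particular workload.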
FPGA co-processor for accelerated computation

Patent number: 7856545

Abstract: A co-processor module for accelerating computational performance includes a Field Programmable Gate Array (“FPGA”) and a Programmable Logic Device (“PLD”) coupled to the FPGA and configured to control start-up configuration of the FPGA. A non-volatile memory is coupled to the PLD and configured to store a start-up bitstream for the start-up configuration of the FPGA. A mechanical and electrical interface is for being plugged into a microprocessor socket of a motherboard for direct communication with at least one microprocessor capable of being coupled to the motherboard. After completion of a start-up cycle, the FPGA is configured for direct communication with the at least one microprocessor via a microprocessor bus to which the microprocessor socket is coupled.

Type: Grant

Filed: July 27, 2007

Date of Patent: December 21, 2010

Assignee: DRC Computer Corporation

Inventor: Steven Casselman

Prediction of data values read from memory by a microprocessor using a dynamic confidence threshold

Patent number: 7856548

Abstract: Prediction of data values to be read from memory by a microprocessor for load operations. In one aspect, a method for predicting a data value that will result from a load operation to be executed by the microprocessor includes accessing an entry in a load value prediction table that stores a predicted data value corresponding to the load operation. The predicted data value is provided as a result of the load operation without waiting for execution of the load operation to complete based on a confidence parameter stored in the entry compared to a dynamic confidence threshold.

Type: Grant

Filed: December 26, 2006

Date of Patent: December 21, 2010

Assignee: Oracle America, Inc.

Inventors: Chris Nelson, Matthew Ashcraft, John Gregory Favor
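
The lookup described in the abstract of patent 7856548 can be sketched as a table keyed by the load's address, with each entry holding a predicted value and a confidence parameter. The names `LvpEntry` and `predict_load`, the use of `>=` for the comparison, and passing the threshold as a plain argument are assumptions; the abstract does not say how the dynamic threshold is adjusted.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LvpEntry:
    predicted_value: int
    confidence: int


def predict_load(table: Dict[int, LvpEntry], load_pc: int,
                 threshold: int) -> Optional[int]:
    """Return the predicted value if the entry's confidence meets the
    current threshold; otherwise return None and wait for the real load."""
    entry = table.get(load_pc)
    if entry is not None and entry.confidence >= threshold:
        return entry.predicted_value
    return None


table = {0x400: LvpEntry(predicted_value=42, confidence=3)}
value = predict_load(table, 0x400, threshold=2)     # confident enough
no_value = predict_load(table, 0x400, threshold=4)  # threshold raised
```

The same entry yields a prediction under one threshold and none under a stricter one, which is the effect of making the threshold dynamic.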
Configurable processor module accelerator using a programmable logic device

Patent number: 7856546

Abstract: A configurable processor module accelerator using a programmable logic device is described. According to one embodiment, the accelerator module includes a circuit board having coupled thereto a first programmable logic device, a controller, and a first memory. The first programmable logic device has access to a bitstream which is stored in the first memory. Access to the bitstream by the first programmable logic device is controlled by the controller. The bitstream is capable of being instantiated in the first programmable logic device using programmable logic thereof to provide at least a transport interface for communication between the first programmable logic device and one or more other devices associated with the motherboard using the microprocessor interface.

Type: Grant

Filed: July 27, 2007

Date of Patent: December 21, 2010

Assignee: DRC Computer Corporation

Inventors: Steven Casselman, Stephen Sample

Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream

Patent number: 7856543

Abstract: A data processing architecture comprising: an input device for receiving an incoming stream of data packets; and a plurality of processing elements which are operable to process data received thereby; wherein the input device is operable to distribute data packets in whole or in part to the processing elements in dependence upon the data processing bandwidth of the processing elements.

Type: Grant

Filed: February 14, 2002

Date of Patent: December 21, 2010

Assignee: Rambus Inc.

Inventors: John Rhoades, Ken Cameron, Paul Winser, Ray McConnell, Gordon Faulds, Simon McIntosh-Smith, Anthony Spencer, Jeff Bond, Matthias Dejaegher, Danny Halamish, Gajinder Panesar

Managing buffer storage in a parallel processing environment

Patent number: 7853774

Abstract: An integrated circuit including a plurality of tiles. Each tile comprises a processor; a switch including switching circuitry to forward data words over data paths from other tiles to the processor and to switches of other tiles; and memory coupled to the switch to buffer data transmitted among the tiles. The switches form a plurality of networks among the tiles. At least one of the networks is configured to transmit data among the tiles using an approach that reserves sufficient buffer space in the memories coupled to the switches to avoid deadlock conditions, and at least one of the networks is configured to transmit data among the tiles using an approach to detect and recover from deadlock conditions.

Type: Grant

Filed: December 21, 2005

Date of Patent: December 14, 2010

Assignee: Tilera Corporation

Inventor: David Wentzlaff

Instruction/skid buffers in a multithreading microprocessor that store dispatched instructions to avoid re-fetching flushed instructions

Patent number: 7853777

Abstract: An apparatus for reducing instruction re-fetching in a multithreading processor configured to concurrently execute a plurality of threads is disclosed. The apparatus includes a buffer for each thread that stores fetched instructions of the thread, having an indicator for indicating which of the fetched instructions in the buffer have already been dispatched for execution. An input for each thread indicates that one or more of the already-dispatched instructions in the buffer has been flushed from execution. Control logic for each thread updates the indicator to indicate the flushed instructions are no longer already-dispatched, in response to the input. This enables the processor to re-dispatch the flushed instructions from the buffer to avoid re-fetching the flushed instructions. In one embodiment, there are fewer buffers than threads, and they are dynamically allocatable by the threads. In one embodiment, a single integrated buffer is shared by all the threads.

Type: Grant

Filed: February 4, 2005

Date of Patent: December 14, 2010

Assignee: MIPS Technologies, Inc.

Inventors: Darren M. Jones, Ryan C. Kinter, G. Michael Uhler, Sanjay Vishin
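
The per-thread buffer in the abstract of patent 7853777 can be sketched with a single counter serving as the "already dispatched" indicator. Modelling the indicator as a count of dispatched instructions, and the names `SkidBuffer`, `dispatch`, and `flush`, are assumptions for illustration only.

```python
class SkidBuffer:
    """Holds fetched instructions for one thread and tracks which of them
    have already been dispatched."""

    def __init__(self, instructions):
        self.instructions = list(instructions)
        self.dispatched = 0  # instructions[:dispatched] are already dispatched

    def dispatch(self):
        """Dispatch the next undispatched instruction, or None if there is none."""
        if self.dispatched == len(self.instructions):
            return None
        instr = self.instructions[self.dispatched]
        self.dispatched += 1
        return instr

    def flush(self, count):
        """Mark the most recent `count` dispatched instructions as not
        dispatched, so they can be re-dispatched without a re-fetch."""
        self.dispatched = max(0, self.dispatched - count)


buf = SkidBuffer(["i0", "i1", "i2"])
first, second = buf.dispatch(), buf.dispatch()
buf.flush(1)               # "i1" was flushed from execution
replayed = buf.dispatch()  # "i1" comes from the buffer again
```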
Handover between software and hardware accelerator

Patent number: 7853776

Abstract: A bytecode accelerator translates stack-based intermediate language (bytecodes) into register-based CPU instructions. When the bytecode accelerator is started, it transfers plural pieces of internal information from a register file of a CPU to the bytecode accelerator by means of an internal transfer bus between the bytecode accelerator and the CPU and an input selection logic of the bytecode accelerator. When the bytecode accelerator ends its operation, it transfers plural pieces of internal information in the bytecode accelerator to the register file of the CPU by means of the internal transfer bus, an output selector, and an output selector selection logic of the bytecode accelerator. These transfers take place in transition between hardware processing and software processing by a software virtual machine.

Type: Grant

Filed: October 28, 2005

Date of Patent: December 14, 2010

Assignee: Renesas Technology Corp.

Inventors: Tetsuya Yamada, Naohiko Irie

Enhanced processor virtualization mechanism via saving and restoring soft processor/system states

Patent number: 7849298

Abstract: A method and system are disclosed for saving soft state information, which is non-critical for executing a process in a processor, upon a receipt of a process interrupt by the processor. The soft state is transmitted to a memory associated with the processor via a memory interface. Preferably, the soft state is transmitted within the processor to the memory interface via a scan-chain pathway within the processor, which allows functional data pathways to remain unobstructed by the storage of the soft state. Thereafter, the stored soft state can be restored from memory when the process is again executed.

Type: Grant

Filed: January 12, 2009

Date of Patent: December 7, 2010

Assignee: International Business Machines Corporation

Inventors: Ravi Kumar Arimilli, Robert Alan Cargnoni, Guy Lynn Guthrie, William John Starke

Flag optimization of a trace

Patent number: 7849292

Abstract: A method and apparatus for optimizing a sequence of operations adapted for execution by a processor is disclosed to include locating an operation, if any, that is next within the sequence of operations and setting a current operation to be that operation. The current operation is processed as follows: a) de-activating, if not already de-activated, a consumed indicator associated with the current operation; and b) when the current operation is of the producer type, then activating, if not already activated, a producer indicator associated with the current operation, and locating a first set of operations, if any, that i) are earlier in the sequence of operations than the current operation, ii) have their associated producer indicator activated, and iii) have their associated consumed indicator de-activated, and then de-activating the producer indicator associated with each operation in the first set.

Type: Grant

Filed: November 16, 2007

Date of Patent: December 7, 2010

Assignee: Oracle America, Inc.

Inventors: Matthew William Ashcraft, John Gregory Favor, Christopher Patrick Nelson, Ivan Pavle Radivojevic, Joseph Byron Rowlands, Richard Win Thaik
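
The indicator updates in the abstract of patent 7849292 can be sketched as one pass over the operation sequence. Steps (a) and (b) follow the abstract. The handling of consumer operations (marking the most recent active producer as consumed) is not stated in the abstract and is an assumption added so the sketch is complete, as are the `produces` and `consumes` fields and the function name.

```python
def mark_live_producers(ops):
    """Return, per operation, whether its producer indicator is still active.

    ops -- list of dicts with boolean 'produces' and 'consumes' fields
           (hypothetical representation of flag-producing/consuming ops).
    """
    producer = [False] * len(ops)
    consumed = [False] * len(ops)
    for cur, op in enumerate(ops):
        consumed[cur] = False  # step (a): de-activate consumed indicator
        if op["consumes"]:
            # Assumption (not in the abstract): a consumer marks the most
            # recent operation with an active producer indicator as consumed.
            for j in range(cur - 1, -1, -1):
                if producer[j]:
                    consumed[j] = True
                    break
        if op["produces"]:
            producer[cur] = True  # step (b): activate producer indicator
            for j in range(cur):
                # Earlier producers that were never consumed are overwritten
                # by this one, so their producer indicator is de-activated.
                if producer[j] and not consumed[j]:
                    producer[j] = False
    return producer


ops = [
    {"produces": True, "consumes": False},   # result overwritten, unused
    {"produces": True, "consumes": False},   # result used by the next op
    {"produces": False, "consumes": True},
    {"produces": True, "consumes": False},   # last producer stays active
]
live = mark_live_producers(ops)
```

Under these assumptions, an operation whose producer indicator ends up de-activated is one whose produced result is overwritten before anything consumes it.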
Monitoring control for monitoring at least two domains of multi-domain processors

Patent number: 7849296

Abstract: There is provided a method of controlling a monitoring function of a processor, the processor being operable in at least two domains, comprising a first domain and a second domain, the first and second domains each comprising at least one mode, the method comprising the steps of: setting at least one control value, the at least one control value relating to a condition and being indicative of whether the monitoring function is allowable in the first domain; and only allowing initiation of the monitoring function in the first domain when the condition is present if its related control value indicates that the monitoring function is allowable. In some embodiments the first domain is a secure domain and the monitoring function is a debug or trace function.

Type: Grant

Filed: November 17, 2003

Date of Patent: December 7, 2010

Assignee: ARM Limited

Inventors: Simon Charles Watt, Luc Orion

Microprocessor system for simultaneously accessing multiple branch history table entries using a single port

Patent number: 7849299

Abstract: Provided is a means for accessing multiple entries from a branch history table (BHT) in a single clock cycle, in the context of pipelined instruction processing. In a first clock cycle, a plurality of conditional branch instructions is fetched. A value is accessed from a global history record (GHR) of conditional branch resolutions and predictions for a fetched conditional branch instruction. An associated instruction address is hashed with a left-shifted GHR value. The result is used to access a word in an indexed BHT stored in a single-port random access memory (RAM). The word comprises a branch direction count for the plurality of fetched conditional branch instructions. In a second clock cycle a conditional branch instruction is executed at an execute stage and the BHT is written with an updated branch direction count in response to a resolution of the executed conditional branch instruction.

Type: Grant

Filed: May 5, 2008

Date of Patent: December 7, 2010

Assignee: Applied Micro Circuits Corporation

Inventors: Terrence Matthew Potter, Jon A. Loschke

Portable processing device having a modem selectively coupled to a RISC core or a CISC core

Patent number: 7844805

Abstract: A processor for a portable electronic device. The processor includes a RISC (reduced instruction set computing) core, a CISC (complex instruction set computing) core, a video accelerator circuit and an audio accelerator circuit. Each of the video and audio accelerator circuits is coupled to both the RISC and CISC cores, with both cores and both accelerator circuits being incorporated into a single integrated circuit. In a first plurality of operational modes, the RISC core is active, while the CISC core is in one of a sleep state or a power off state. In a second plurality of modes, both the RISC and CISC cores are active.

Type: Grant

Filed: June 6, 2007

Date of Patent: November 30, 2010

Assignee: VIA Technologies, Inc.

Inventor: Chi Chang

Global history branch prediction updating responsive to taken branches

Patent number: 7844806

Abstract: A system and method are provided for updating a global history prediction record in a microprocessor system using pipelined instruction processing. The method accepts a microprocessor instruction of consecutive operations, including a conditional branch operation with an associated branch address, at a first stage in a pipelined microprocessor execution process. A global history record (GHR) of conditional branch resolutions and predictions is accessed and hashed with the branch address, creating a first hash result. The first hash result is used to access an indexed branch history table (BHT) of branch direction counts and the BHT is used to make a branch prediction. If the branch prediction is “taken”, the current GHR value is left-shifted and hashed with the branch address, creating a second hash result which is used in creating an updated GHR.

Type: Grant

Filed: January 31, 2008

Date of Patent: November 30, 2010

Assignee: Applied Micro Circuits Corporation

Inventors: Jon A. Loschke, Timothy A. Olson, Terrence Matthew Potter
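
The prediction and update flow in the abstract of patent 7844806 can be sketched as follows. XOR as the hash function, an 8-bit GHR, 2-bit saturating counts in the BHT, and using the second hash result directly as the new GHR are all assumptions for illustration; the abstract names the steps but not these details.

```python
GHR_BITS = 8                   # assumed history length
MASK = (1 << GHR_BITS) - 1


def predict_and_update(ghr, branch_addr, bht):
    """Predict one conditional branch and return (taken, new_ghr).

    bht -- list of 2**GHR_BITS branch direction counts in the range 0..3;
           a count of 2 or 3 predicts "taken" (assumed 2-bit scheme).
    """
    index = (ghr ^ branch_addr) & MASK           # first hash result
    taken = bht[index] >= 2
    if taken:
        # Left-shift the current GHR and hash it with the branch address;
        # this second hash result is used here as the updated GHR.
        ghr = ((ghr << 1) ^ branch_addr) & MASK
    return taken, ghr


bht = [0] * (1 << GHR_BITS)
bht[0b1010 ^ 0x3] = 3                            # train one entry as "taken"
taken, new_ghr = predict_and_update(0b1010, 0x3, bht)
```

In this sketch a "not taken" prediction leaves the GHR unchanged, in line with the title's wording that the update is responsive to taken branches.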
Expansion of a stacked register file using shadow registers

Patent number: 7844804

Abstract: One or more Shadow Register Files (SRF) are interposed between a Physical Register File (PRF) and a Backing Store (BS) in a shadow register file system. The SRFs comprise dual-port registers connected serially in a chain of arbitrary depth from the PRF. A Register Save Engine has random access to one port of the registers in the final SRF in the chain, and saves/restores data between the final SRF and the BS, e.g., RAM. As PRF registers are deallocated from calling procedures for use by called procedures, data are serially shifted from multi-port registers in the PRF through successive corresponding dual-port registers in SRFs, and are serially shifted back toward the multi-port registers as the PRF registers are reallocated to calling procedures. Since no procedure can access more than the number of registers in the PRF, the effective size of the PRF is increased, using less costly dual-port registers.

Type: Grant

Filed: November 10, 2005

Date of Patent: November 30, 2010

Assignee: QUALCOMM Incorporated

Inventor: Bohuslav Rychlik
