Patents Examined by Eddie P. Chan
  • Patent number: 7865700
    Abstract: The present invention provides a system and method for prioritizing store instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one store instruction is in the issue group, if so scheduling the least one store instruction in a one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolving the issue conflict by scheduling the at least one store instruction in a different execution pipeline; (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit.
    Type: Grant
    Filed: February 19, 2008
    Date of Patent: January 4, 2011
    Assignee: International Business Machines Corporation
    Inventor: David A. Luick
  • Patent number: 7861061
    Abstract: A processor and a method for executing VLIW instructions by first fetching a VLIW instruction and then identifying from option bits encoded in a first one of the instructions within the fetched VLIW instruction packet which, if any, of the remaining instructions within the VLIW instruction are to be executed in the same execution cycle as the first instruction. Finally, executing the first instruction and any remaining instructions identified from the encoded option bits.
    Type: Grant
    Filed: May 23, 2003
    Date of Patent: December 28, 2010
    Assignee: STMicroelectronics (R&D) Ltd.
    Inventor: Zahid Hussain
  • Patent number: 7861069
    Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.
    Type: Grant
    Filed: December 19, 2006
    Date of Patent: December 28, 2010
    Assignee: Seiko-Epson Corporation
    Inventors: Cheryl D. Senter, Johannes Wang
  • Patent number: 7861063
    Abstract: In one embodiment, a processor comprises a fetch unit and a pick unit. The fetch unit is configured to fetch instructions for execution by the processor. The pick unit is configured to schedule instructions fetched by the fetch unit for execution in the processor. The pick unit is configured to inhibit scheduling a delayed control transfer instruction (DCTI) until a delay slot instruction of the DCTI is available for scheduling. For example, in some embodiments, the pick unit may inhibit scheduling until the delay slot instruction is written to an instruction buffer, until the delay slot instruction is fetched, etc.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: December 28, 2010
    Assignee: Oracle America, Inc.
    Inventors: Robert T. Golla, Paul J. Jordan, Jama I. Barreh
  • Patent number: 7861070
    Abstract: The present invention proposed a trace compression method for a debug and trace interface of a microprocessor, in which the debug and trace interface is associated with a plurality of registers for storing data. The trace compression method comprises the steps of: (1) finding register content of each of the registers in a first cycle and register content of each of the registers in a second cycle, in which the second cycle is next to the first cycle; (2) calculating difference of the register content of each of the registers in the second cycle and the register content of each of the registers in the first cycle; and (3) packing the differences of the register contents into data trace packets, in which the differences of the register contents of adjacent registers are condensed into a single data trace packet when the differences of the register contents of the adjacent registers are zeroes.
    Type: Grant
    Filed: June 12, 2008
    Date of Patent: December 28, 2010
    Assignee: National Tsing Hua University
    Inventors: Chih Tsun Huang, Yen Ju Ho, Ming Chang Hsieh
  • Patent number: 7861060
    Abstract: Parallel data processing systems and methods use cooperative thread arrays (CTAs), i.e., groups of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique identifier (thread ID) that can be assigned at thread launch time. The thread ID controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Mechanisms for loading and launching CTAs in a representative processing core and for synchronizing threads within a CTA are also described.
    Type: Grant
    Filed: December 15, 2005
    Date of Patent: December 28, 2010
    Assignee: NVIDIA Corporation
    Inventors: John R. Nickolls, Stephen D. Lew
  • Patent number: 7856545
    Abstract: A co-processor module for accelerating computational performance includes a Field Programmable Gate Array (“FPGA”) and a Programmable Logic Device (“PLD”) coupled to the FPGA and configured to control start-up configuration of the FPGA. A non-volatile memory is coupled to the PLD and configured to store a start-up bitstream for the start-up configuration of the FPGA. A mechanical and electrical interface is for being plugged into a microprocessor socket of a motherboard for direct communication with at least one microprocessor capable of being coupled to the motherboard. After completion of a start-up cycle, the FPGA is configured for direct communication with the at least one microprocessor via a microprocessor bus to which the microprocessor socket is coupled.
    Type: Grant
    Filed: July 27, 2007
    Date of Patent: December 21, 2010
    Assignee: DRC Computer Corporation
    Inventor: Steven Casselman
  • Patent number: 7856548
    Abstract: Prediction of data values to be read from memory by a microprocessor for load operations. In one aspect, a method for predicting a data value that will result from a load operation to be executed by the microprocessor includes accessing an entry in a load value prediction table that stores a predicted data value corresponding to the load operation. The predicted data value is provided as a result of the load operation without waiting for execution of the load operation to complete based on a confidence parameter stored in the entry compared to a dynamic confidence threshold.
    Type: Grant
    Filed: December 26, 2006
    Date of Patent: December 21, 2010
    Assignee: Oracle America, Inc.
    Inventors: Chris Nelson, Matthew Ashcraft, John Gregory Favor
  • Patent number: 7856546
    Abstract: A configurable processor module accelerator using a programmable logic device is described. According to one embodiment, the accelerator module includes a circuit board having coupled thereto a first programmable logic device, a controller, and a first memory. The first programmable logic device has access to a bitstream which is stored in the first memory. Access to the bitstream by the first programmable logic device is controlled by the controller. The bitstream is capable of being instantiated in the first programmable logic device using programmable logic thereof to provide at least a transport interface for communication between the first programmable logic device and one or more other devices associated with the motherboard using the microprocessor interface.
    Type: Grant
    Filed: July 27, 2007
    Date of Patent: December 21, 2010
    Assignee: DRC Computer Corporation
    Inventors: Steven Casselman, Stephen Sample
  • Patent number: 7856543
    Abstract: A data processing architecture comprising: an input device for receiving an incoming stream of data packets; and a plurality of processing elements which are operable to process data received thereby; wherein the input device is operable to distribute data packets in whole or in part to the processing elements in dependence upon the data processing bandwidth of the processing elements.
    Type: Grant
    Filed: February 14, 2002
    Date of Patent: December 21, 2010
    Assignee: Rambus Inc.
    Inventors: John Rhoades, Ken Cameron, Paul Winser, Ray McConnell, Gordon Faulds, Simon McIntosh-Smith, Anthony Spencer, Jeff Bond, Matthias Dejaegher, Danny Halamish, Gajinder Panesar
  • Patent number: 7853774
    Abstract: An integrated circuit including a plurality of tiles. Each tile comprises a processor; a switch including switching circuitry to forward data words over data paths from other tiles to the processor and to switches of other tiles; and memory coupled to the switch to buffer data transmitted among the tiles. The switches form a plurality of networks among the tiles. At least one of the networks is configured to transmit data among the tiles using an approach that reserves sufficient buffer space in the memories coupled to the switches to avoid deadlock conditions, and at least one of the networks is configured to transmit data among the tiles using an approach to detect and recover from deadlock conditions.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: December 14, 2010
    Assignee: Tilera Corporation
    Inventor: David Wentzlaff
  • Patent number: 7853777
    Abstract: An apparatus for reducing instruction re-fetching in a multithreading processor configured to concurrently execute a plurality of threads is disclosed. The apparatus includes a buffer for each thread that stores fetched instructions of the thread, having an indicator for indicating which of the fetched instructions in the buffer have already been dispatched for execution. An input for each thread indicates that one or more of the already-dispatched instructions in the buffer has been flushed from execution. Control logic for each thread updates the indicator to indicate the flushed instructions are no longer already-dispatched, in response to the input. This enables the processor to re-dispatch the flushed instructions from the buffer to avoid re-fetching the flushed instructions. In one embodiment, there are fewer buffers than threads, and they are dynamically allocatable by the threads. In one embodiment, a single integrated buffer is shared by all the threads.
    Type: Grant
    Filed: February 4, 2005
    Date of Patent: December 14, 2010
    Assignee: MIPS Technologies, Inc.
    Inventors: Darren M. Jones, Ryan C. Kinter, G. Michael Uhler, Sanjay Vishin
  • Patent number: 7853776
    Abstract: A bytecode accelerator which translates stack-based intermediate language (bytecodes) into register-based CPU instructions transfers plural pieces of internal information from a register file of a CPU to the bytecode accelerator by means of an internal transfer bus between the bytecode accelerator and the CPU and an input selection logic of the bytecode accelerator when the bytecode accelerator is started and transfers plural pieces of internal information in the bytecode accelerator to the register file of the CPU by means of the internal transfer bus, an output selector and an output selector selection logic of the bytecode accelerator when the bytecode accelerator ends its operation in transition between hardware processing and software processing by software virtual machine.
    Type: Grant
    Filed: October 28, 2005
    Date of Patent: December 14, 2010
    Assignee: Renesas Technology Corp.
    Inventors: Tetsuya Yamada, Naohiko Irie
  • Patent number: 7849298
    Abstract: A method and system are disclosed for saving soft state information, which is non-critical for executing a process in a processor, upon a receipt of a process interrupt by the processor. The soft state is transmitted to a memory associated with the processor via a memory interface. Preferably, the soft state is transmitted within the processor to the memory interface via a scan-chain pathway within the processor, which allows functional data pathways to remain unobstructed by the storage of the soft state. Thereafter, the stored soft state can be restored from memory when the process is again executed.
    Type: Grant
    Filed: January 12, 2009
    Date of Patent: December 7, 2010
    Assignee: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Robert Alan Cargnoni, Guy Lynn Guthrie, William John Starke
  • Patent number: 7849292
    Abstract: A method and apparatus for optimizing a sequence of operations adapted for execution by a processor is disclosed to include locating an operation, if any, that is next within the sequence of operations and setting a current operation to be that operation. The current operation is processed as follows: a) de-activating, if not already de-activated, a consumed indicator associated with the current operation; and b) when the current operation is of the producer type, then activating, if not already activated, a producer indicator associated with the current operation, and locating a first set of operations, if any, that i) are earlier in the sequence of operations than the current operation, ii) have their associated producer indicator activated, and iii) have their associated consumed indicator de-activated, and then de-activating the producer indicator associated with each operation in the first set.
    Type: Grant
    Filed: November 16, 2007
    Date of Patent: December 7, 2010
    Assignee: Oracle America, Inc.
    Inventors: Matthew William Ashcraft, John Gregory Favor, Christopher Patrick Nelson, Ivan Pavle Radivojevic, Joseph Byron Rowlands, Richard Win Thaik
  • Patent number: 7849296
    Abstract: There is provided a method of controlling a monitoring function of a processor, the processor being operable in at least two domains, comprising a first domain and a second domain, the first and second domains each comprising at least one mode, the method comprising the steps of: setting at least one control value, the at least one control value relating to a condition and being indicative of whether the monitoring function is allowable in the first domain; and only allowing initiation of the monitoring function in the first domain when the condition is present if its related control value indicates that the monitoring function is allowable. In some embodiments the first domain is a secure domain and the monitoring function is a debug or trace function.
    Type: Grant
    Filed: November 17, 2003
    Date of Patent: December 7, 2010
    Assignee: ARM Limited
    Inventors: Simon Charles Watt, Luc Orion
  • Patent number: 7849299
    Abstract: Provided is a means for accessing multiple entries from a branch history table (BHT) in a single clock cycle, in the context of pipelined instruction processing. In a first clock cycle, a plurality of conditional branch instructions is fetched. A value is accessed from a global history record (GHR) of conditional branch resolutions and predictions for a fetched conditional branch instruction. An associated instruction address is hashed with a left-shifted GHR value. The result is used to access a word in an indexed BHT stored in a single-port random access memory (RAM). The word comprises a branch direction count for the plurality of fetched conditional branch instructions. In a second clock cycle a conditional branch instruction is executed at an execute stage and the BHT is written with an updated branch direction count in response to a resolution of the executed conditional branch instruction.
    Type: Grant
    Filed: May 5, 2008
    Date of Patent: December 7, 2010
    Assignee: Applied Micro Circuits Corporation
    Inventors: Terrence Matthew Potter, Jon A. Loschke
  • Patent number: 7844805
    Abstract: A processor for a portable electronic device. The processor includes a RISC (reduced instruction set computing) core a CISC (complex instruction set computing) core, a video accelerator circuit and an audio accelerator circuit. Each of the video and audio accelerator circuits are coupled to both the RISC and CISC cores, with both cores and both accelerator circuit being incorporated into a single integrated circuit. In a first plurality of operational modes, the RISC core is active, while the CISC core is in one of a sleep state or a power off state. In a second plurality of modes, both the RISC and CISC cores are active.
    Type: Grant
    Filed: June 6, 2007
    Date of Patent: November 30, 2010
    Assignee: VIA Technologies, Inc.
    Inventor: Chi Chang
  • Patent number: 7844806
    Abstract: A system and method are provided for updating a global history prediction record in a microprocessor system using pipelined instruction processing. The method accepts a microprocessor instruction of consecutive operations, including a conditional branch operation with an associated branch address, at a first stage in a pipelined microprocessor execution process. A global history record (GHR) of conditional branch resolutions and predictions is accessed and hashed with the branch address, creating a first hash result. The first hash result is used to access an indexed branch history table (BHT) of branch direction counts and the BHT is used to make a branch prediction. If the branch prediction being “taken”, the current GHR value is left-shifted and hashed with the branch address, creating a second hash result which is used in creating an updated GHR.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: November 30, 2010
    Assignee: Applied Micro Circuits Corporation
    Inventors: Jon A. Loschke, Timothy A. Olson, Terrence Matthew Potter
  • Patent number: 7844804
    Abstract: One or more Shadow Register Files (SRF) are interposed between a Physical Register File (PRF) and a Backing Store (BS) in a shadow register file system. The SRFs comprise dual-port registers connected serially in a chain of arbitrary depth from the PRF. A Register Save Engine has random access to one port of the registers in the final SRF in the chain, and saves/restores data between the final SRF and the BS, e.g., RAM. As PRF registers are deallocated from calling procedures for use by called procedures, data are serially shifted from multi-port registers in the PRF through successive corresponding dual-port registers in SRFs, and are serially shifted back toward the multi-port registers as the PRF registers are reallocated to calling procedures. Since no procedure can access more than the number of registers in the PRF, the effective size of the PRF is increased, using less costly dual-port registers.
    Type: Grant
    Filed: November 10, 2005
    Date of Patent: November 30, 2010
    Assignee: QUALCOMM Incorporated
    Inventor: Bohuslav Rychlik