Patents Examined by Eddie Chan
  • Patent number: 8880854
    Abstract: An out-of-order execution microprocessor executes an architectural segment register-loading instruction that instructs the microprocessor to load a new value into an architectural segment register of the microprocessor. A comparator compares the new value specified by the architectural segment register-loading instruction with a current contents of the architectural segment register. A control unit causes to be re-executed using the new value all instructions in the microprocessor that used the current architectural segment register contents as a source operand and that are newer in program order than the architectural segment register-loading instruction whenever the comparator indicates the new value does not equal the current contents.
    Type: Grant
    Filed: February 11, 2009
    Date of Patent: November 4, 2014
    Assignee: VIA Technologies, Inc.
    Inventors: Rodney E. Hooker, Gerard M. Col, Terry Parks
  • Patent number: 8028290
    Abstract: Multiple instruction set architectures are supported in a system that provides a power-efficient and flexible platform for virtual machine environments requiring multiple support for multiple instruction set architectures (ISAs). A processor includes multiple cores having disparate native ISAs and that may be selectively enabled for operation, so that power is conserved when support for a particular ISA is not required of the processor. A hypervisor controls operation of the cores, locates a core and enables it if necessary when a request to instantiate a virtual machine having a specified ISA is received. The ISA may be specified by a particular operating system and/or application program requirements.
    Type: Grant
    Filed: August 30, 2006
    Date of Patent: September 27, 2011
    Assignee: International Business Machines Corporation
    Inventors: James Walter Rymarczyk, Michael Ignatowski, Thomas J. Heller, Jr.
  • Patent number: 8006070
    Abstract: An information handling system includes a processor that throttles an instruction fetcher whenever a group of instructions in a branch instruction queue together exhibits a confidence in the accuracy of branch predictions of branch instructions therein that is less than a first predetermined threshold confidence threshold. In one embodiment, the processor includes a fetch throttle controller that inhibits fetch throttling by the instruction fetcher when confidence in the accuracy of a branch prediction for a particular currently issued branch instruction exhibits less than a second predetermined threshold confidence threshold.
    Type: Grant
    Filed: December 5, 2007
    Date of Patent: August 23, 2011
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Robert Alan Philhower, Raymond Cheung Yeung
  • Patent number: 8006073
    Abstract: A system and method for management of resource allocation of threads for efficient execution of instructions. Prior to dispatching decoded instructions of a first thread from the instruction fetch unit to a buffer within a scheduler, logic within the instruction fetch unit may determine the buffer is already full of dispatched instructions. However, the logic may also determine that a buffer for a second thread within the core or micro core is available. The second buffer may receive and issue decoded instructions for the first thread until the buffer is becomes unavailable. While the second buffer receives and issues instructions for the first thread, the throughput of the system for the first thread may increase due to a reduction in wait cycles.
    Type: Grant
    Filed: September 28, 2007
    Date of Patent: August 23, 2011
    Assignee: Oracle America, Inc.
    Inventors: Abid Ali, Shailender Chaudhry
  • Patent number: 8006075
    Abstract: Systems and methods for storage of writes to memory corresponding to multiple threads. A processor comprises a store queue, wherein the queue dynamically allocates a current entry for a committed store instruction in which entries of the array may be allocated out of program order. For a given thread, the store queue conveys store data to a memory in program order. The queue is further configured to identify an entry of the plurality of entries that corresponds to an oldest committed store instruction for a given thread and determine a next entry of the array that corresponds to a next committed store instruction in program order following the oldest committed store instruction of the given thread, wherein said next entry includes data identifying the entry. The queue marks an entry as unfilled upon successful conveying of store data to the memory.
    Type: Grant
    Filed: May 21, 2009
    Date of Patent: August 23, 2011
    Assignee: Oracle America, Inc.
    Inventor: Mark A. Luttrell
  • Patent number: 8001363
    Abstract: A value representative of a processor's speculative branch prediction efficiency is determined and the speculative branch prediction depth is adjusted accordingly. The processor's speculative branch prediction efficiency may be represented by the average number of clocks per instruction (CPI), whereby an increase in the average CPI indicates that the processor is becoming less efficient due to incorrectly predicted speculative branch predictions and, conversely, a decrease indicates that the processor has a higher ratio of properly predicted speculative branch predictions.
    Type: Grant
    Filed: April 4, 2005
    Date of Patent: August 16, 2011
    Inventor: Elias Shihadeh
  • Patent number: 8001506
    Abstract: A disclosed image processing apparatus includes a SIMD microprocessor in which multiple processor elements are arranged in one dimension, each of the processor elements including multiple access registers arranged in stages for storing image data; and multiple data processing devices corresponding one-to-one with the stages of the access registers, arranged in one dimension in the same direction as the processor elements, and configured to read and write image data from/to the access registers. The access registers of each of the stages, each of which access registers is included in a different one of the processor elements, are connected with a common line. Wiring outlets, each of which connects the common line of a different one of the stages to a corresponding data processing device, are individually disposed within the SIMD microprocessor in such a manner that each wiring outlet has a shortest possible distance to the corresponding data processing device.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: August 16, 2011
    Assignee: Ricoh Company, Ltd.
    Inventor: Tomoaki Ozaki
  • Patent number: 7996662
    Abstract: In one embodiment, a processor comprises a plurality of storage locations, a decode circuit, and a status/control register (SCR). Each storage location is addressable as a speculative register and is configured to store result data generated during execution of an instruction operation and a value representing an update for the SCR. The value includes at least a first encoding that represents an update to a plurality of bits in the SCR, and a first number of bits in the plurality of bits is greater than a second number of bits in the first encoding. The decode circuit is coupled to receive the first encoding from a first storage location responsive to retirement of a first instruction operation assigned to use the first storage location as a destination, and is configured to decode the first encoding and generate the plurality of bits. The decode circuit is configured to update the SCR.
    Type: Grant
    Filed: November 17, 2005
    Date of Patent: August 9, 2011
    Assignee: Apple Inc.
    Inventors: Wei-Han Lien, Daniel C. Murray, Junji Sugisawa
  • Patent number: 7996655
    Abstract: One embodiment provides a method of forwarding data in a processor. The method generally includes providing at least one cascaded delayed execution pipeline unit having at least a first pipeline and a second pipeline for executing first and second instructions in a common issue group, wherein the second pipeline executes the second instruction in a delayed manner relative to the execution of the first instruction in the first pipeline, storing results generated by an execution unit of the first pipeline in a first-in first-out (FIFO) storage target delay queue, determining if the target delay queue contains source data for executing the second instruction, and if the target delay queue contains source data for the second instruction, forwarding the source data for the second instruction from the target delay queue to an execution unit of the second pipeline.
    Type: Grant
    Filed: April 22, 2008
    Date of Patent: August 9, 2011
    Assignee: International Business Machines Corporation
    Inventor: David A. Luick
  • Patent number: 7991984
    Abstract: A loop control system comprises at least one loop flag in an instruction word, at least one loop counter associated with the at least one loop flag operable to store and compute a number of times a program loop is to be executed, at least one start address register associated with the at least one loop flag operable to store a program loop starting address, and at least one end address register associated with the at least one loop flag operable to store a program loop ending address.
    Type: Grant
    Filed: December 23, 2005
    Date of Patent: August 2, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Eran Pisek
  • Patent number: 7991978
    Abstract: Data processing on a network on chip (‘NOC’) that includes integrated processor (‘IP’) blocks, each of a plurality of the IP blocks including at least one computer processor, each such computer processor implementing a plurality of hardware threads of execution; low latency, high bandwidth application messaging interconnects; memory communications controllers; network interface controllers; and routers; each of the IP blocks adapted to a router through a separate one of the low latency, high bandwidth application messaging interconnects, a separate one of the memory communications controllers, and a separate one of the network interface controllers; each application messaging interconnect abstracting into an architected state of each processor, for manipulation by computer programs executing on the processor, hardware inter-thread communications among the hardware threads of execution; each memory communications controller controlling communication between an IP block and memory; each network interface contro
    Type: Grant
    Filed: May 9, 2008
    Date of Patent: August 2, 2011
    Assignee: International Business Machines Corporation
    Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Eric O. Mejdrich, Paul E. Schardt
  • Patent number: 7984266
    Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: July 19, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
  • Patent number: 7971030
    Abstract: An apparatus, method, and system for synchronicity independent, resource delegating, power and instruction optimizing processor is provided where instructions are delegated between various processing resources of the processor. An Integer Processing Unit (IPU) of the processor delegates complicated mathematical instructions to a Mathematical Processing Unit (MPU) of the processor. Furthermore, the processor puts underutilized processing resources to sleep thereby increasing power usage efficiency. A cache of the processor is also capable of accepting delegated operations from the IPU. As such, the cache performs various logical operations on delegated requests allowing it to lock and share memory without requiring extra processing cycles by the entire processor.
    Type: Grant
    Filed: August 6, 2003
    Date of Patent: June 28, 2011
    Assignee: MMAGIX Technology Limited
    Inventor: Daniel Shane O'Sullivan
  • Patent number: 7966480
    Abstract: Trap flags and a pointer trap are associated with registers in a processor. Each trap flag indicates whether a corresponding register has been written with valid data. If not, the trap flag is set to indicate that the register corresponding to the trap flag contains invalid data. During instruction processing, the pointer trap receives control signals from instruction fetch/decode logic on the processor indicating an instruction being processed calls for a register to be used as a pointer. If the specified pointer register has its corresponding trap flag set, then the pointer trap indicates that a processing exception has occurred. The interrupt logic/exception processing logic then causes a trap interrupt service routine (ISR) to be executed in response to the exception. The ISR prevents errors from being introduced in the instruction processing due to invalid pointer values.
    Type: Grant
    Filed: December 20, 2004
    Date of Patent: June 21, 2011
    Assignee: Microchip Technology Incorporated
    Inventor: Michael I. Catherwood
  • Patent number: 7962716
    Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.
    Type: Grant
    Filed: November 17, 2004
    Date of Patent: June 14, 2011
    Assignee: QST Holdings, Inc.
    Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
  • Patent number: 7958342
    Abstract: A Nyquist sampling frequency is determined for performance counter events to be measured. Based on the Nyquist sampling frequencies, a schedule for measuring the performance counter events is determined. The performance counter event measurements are then conducted in accordance with the schedule, whereby the measurements yield a set of sample data for each performance counter event. A signal reconstruction algorithm is applied to the set of sample data for each performance counter event to reconstruct an essentially complete signal for each performance counter event. The essentially complete signal for each performance counter event is then used to improve either a design or a utilization of either a microprocessor or an application to be executed on the microprocessor.
    Type: Grant
    Filed: January 24, 2007
    Date of Patent: June 7, 2011
    Assignee: Oracle America, Inc.
    Inventors: Robert M. Lane, Kenneth Tracton, Zenon Fortuna
  • Patent number: 7937557
    Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
    Type: Grant
    Filed: March 16, 2004
    Date of Patent: May 3, 2011
    Assignee: VNS Portfolio LLC
    Inventor: Charles H. Moore
  • Patent number: 7584345
    Abstract: A method for dynamically programming Field Programmable Gate Arrays (FPGA in a coprocessor, the coprocessor coupled to a processor, includes: beginning an execution of an application by the processor; receiving an instruction from the processor to the coprocessor to perform a function for the application; determining that the FPGA in the coprocessor is not programmed with logic for the function; fetching a configuration bit stream for the function; and programming the FPGA with the configuration bit stream. In this manner, the FPGA are programmable “on the fly”, i.e., dynamically during the execution of an application. The hardware acceleration and resource sharing advantages provided by the FPGA can be utilized more often by the application. Logic flexibility and space savings on the chip comprising thecoprocessor and processor are provided as well.
    Type: Grant
    Filed: October 30, 2003
    Date of Patent: September 1, 2009
    Assignee: International Business Machines Corporation
    Inventors: Andreas C. Doering, Silvio Dragone, Andreas Herkersdorf, Richard G. Hofmann, Charles E. Kuhlmann
  • Patent number: 7516301
    Abstract: Heterogeneous processors can cooperate for distributed processing tasks in a multiprocessor computing system. Each processor is operable in a “compatible” mode, in which all processors within a family accept the same baseline command set and produce identical results upon executing any command in the baseline command set. The processors also have a “native” mode of operation in which the command set and/or results may differ in at least some respects from the baseline command set and results. Heterogeneous processors with a compatible mode defined by reference to the same baseline can be used cooperatively for distributed processing by configuring each processor to operate in the compatible mode.
    Type: Grant
    Filed: December 16, 2005
    Date of Patent: April 7, 2009
    Assignee: Nvidia Corporation
    Inventors: Henry Packard Moreton, Abraham B. de Waal
  • Patent number: 7506140
    Abstract: A return data selector is disclosed. A pipelined microprocessor includes N functional units that request to return data to the pipeline. In a given selection cycle, some of the functional units may not be requesting to return data. The return data selector includes a circuit for selecting one of functional units in a round-robin fashion. The circuit 1-bit left rotatively increments a first addend by a second addend to generate a sum that is ANDed with the inverse of the first addend to generate a 1-hot vector indicating which of the functional units is selected next. The first addend is an N-bit vector where each bit is false if the corresponding functional unit is requesting to return a result to the pipeline. The second addend is a 1-hot vector indicating the last selected functional unit.
    Type: Grant
    Filed: March 22, 2005
    Date of Patent: March 17, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Michael Gottlieb Jensen