Patents Examined by Eddie Chan
-
Patent number: 8880854
Abstract: An out-of-order execution microprocessor executes an architectural segment register-loading instruction that instructs the microprocessor to load a new value into an architectural segment register of the microprocessor. A comparator compares the new value specified by the architectural segment register-loading instruction with a current contents of the architectural segment register. A control unit causes to be re-executed using the new value all instructions in the microprocessor that used the current architectural segment register contents as a source operand and that are newer in program order than the architectural segment register-loading instruction whenever the comparator indicates the new value does not equal the current contents.
Type: Grant
Filed: February 11, 2009
Date of Patent: November 4, 2014
Assignee: VIA Technologies, Inc.
Inventors: Rodney E. Hooker, Gerard M. Col, Terry Parks
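The compare-and-replay behavior described in this abstract can be illustrated with a small software model. This is only a sketch: the names `Instr` and `load_segment_register` are hypothetical, and the patent describes hardware, not Python.

```python
from dataclasses import dataclass

@dataclass
class Instr:
    seq: int                 # program-order sequence number (larger = newer)
    used_segment_value: int  # segment register value this instruction consumed
    replayed: bool = False

def load_segment_register(current, new, load_seq, in_flight):
    """Return the new register contents, replaying dependents only on a change."""
    if new != current:  # comparator: new value differs from current contents
        for ins in in_flight:
            # Only newer instructions that consumed the stale value re-execute.
            if ins.seq > load_seq and ins.used_segment_value == current:
                ins.used_segment_value = new
                ins.replayed = True
    return new
```

When the loaded value equals the current contents, no instruction is marked for replay, which is the case the comparator exists to detect.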
-
Patent number: 8028290
Abstract: Multiple instruction set architectures are supported in a system that provides a power-efficient and flexible platform for virtual machine environments requiring support for multiple instruction set architectures (ISAs). A processor includes multiple cores having disparate native ISAs and that may be selectively enabled for operation, so that power is conserved when support for a particular ISA is not required of the processor. A hypervisor controls operation of the cores, locates a core and enables it if necessary when a request to instantiate a virtual machine having a specified ISA is received. The ISA may be specified by a particular operating system and/or application program requirements.
Type: Grant
Filed: August 30, 2006
Date of Patent: September 27, 2011
Assignee: International Business Machines Corporation
Inventors: James Walter Rymarczyk, Michael Ignatowski, Thomas J. Heller, Jr.
-
Patent number: 8006070
Abstract: An information handling system includes a processor that throttles an instruction fetcher whenever a group of instructions in a branch instruction queue together exhibits a confidence in the accuracy of branch predictions of branch instructions therein that is less than a first predetermined confidence threshold. In one embodiment, the processor includes a fetch throttle controller that inhibits fetch throttling by the instruction fetcher when confidence in the accuracy of a branch prediction for a particular currently issued branch instruction exhibits less than a second predetermined confidence threshold.
Type: Grant
Filed: December 5, 2007
Date of Patent: August 23, 2011
Assignee: International Business Machines Corporation
Inventors: Michael Karl Gschwind, Robert Alan Philhower, Raymond Cheung Yeung
-
Patent number: 8006075
Abstract: Systems and methods for storage of writes to memory corresponding to multiple threads. A processor comprises a store queue, wherein the queue dynamically allocates a current entry for a committed store instruction in which entries of the array may be allocated out of program order. For a given thread, the store queue conveys store data to a memory in program order. The queue is further configured to identify an entry of the plurality of entries that corresponds to an oldest committed store instruction for a given thread and determine a next entry of the array that corresponds to a next committed store instruction in program order following the oldest committed store instruction of the given thread, wherein said next entry includes data identifying the entry. The queue marks an entry as unfilled upon successful conveying of store data to the memory.
Type: Grant
Filed: May 21, 2009
Date of Patent: August 23, 2011
Assignee: Oracle America, Inc.
Inventor: Mark A. Luttrell
-
Patent number: 8006073
Abstract: A system and method for management of resource allocation of threads for efficient execution of instructions. Prior to dispatching decoded instructions of a first thread from the instruction fetch unit to a buffer within a scheduler, logic within the instruction fetch unit may determine the buffer is already full of dispatched instructions. However, the logic may also determine that a buffer for a second thread within the core or micro core is available. The second buffer may receive and issue decoded instructions for the first thread until the buffer becomes unavailable. While the second buffer receives and issues instructions for the first thread, the throughput of the system for the first thread may increase due to a reduction in wait cycles.
Type: Grant
Filed: September 28, 2007
Date of Patent: August 23, 2011
Assignee: Oracle America, Inc.
Inventors: Abid Ali, Shailender Chaudhry
-
Patent number: 8001363
Abstract: A value representative of a processor's speculative branch prediction efficiency is determined and the speculative branch prediction depth is adjusted accordingly. The processor's speculative branch prediction efficiency may be represented by the average number of clocks per instruction (CPI), whereby an increase in the average CPI indicates that the processor is becoming less efficient due to incorrectly predicted speculative branch predictions and, conversely, a decrease indicates that the processor has a higher ratio of properly predicted speculative branch predictions.
Type: Grant
Filed: April 4, 2005
Date of Patent: August 16, 2011
Inventor: Elias Shihadeh
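A CPI-driven depth adjustment of the kind this abstract outlines might look like the sketch below. The abstract only says the depth is "adjusted accordingly", so the direction of adjustment (shallower when CPI rises, deeper when it falls), the step size of one, and the bounds `min_depth` and `max_depth` are all assumptions made for illustration; `adjust_depth` is a hypothetical name.

```python
def adjust_depth(depth, prev_cpi, curr_cpi, min_depth=1, max_depth=8):
    """Return a new speculation depth from two successive average-CPI samples."""
    if curr_cpi > prev_cpi:
        # Rising CPI suggests more mispredicted speculation: speculate less (assumed policy).
        return max(min_depth, depth - 1)
    if curr_cpi < prev_cpi:
        # Falling CPI suggests predictions are paying off: speculate more (assumed policy).
        return min(max_depth, depth + 1)
    return depth  # unchanged CPI leaves the depth alone
```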
-
SIMD image forming apparatus for minimizing wiring distance between registers and processing devices
Patent number: 8001506
Abstract: A disclosed image processing apparatus includes a SIMD microprocessor in which multiple processor elements are arranged in one dimension, each of the processor elements including multiple access registers arranged in stages for storing image data; and multiple data processing devices corresponding one-to-one with the stages of the access registers, arranged in one dimension in the same direction as the processor elements, and configured to read and write image data from/to the access registers. The access registers of each of the stages, each of which access registers is included in a different one of the processor elements, are connected with a common line. Wiring outlets, each of which connects the common line of a different one of the stages to a corresponding data processing device, are individually disposed within the SIMD microprocessor in such a manner that each wiring outlet has a shortest possible distance to the corresponding data processing device.
Type: Grant
Filed: January 21, 2009
Date of Patent: August 16, 2011
Assignee: Ricoh Company, Ltd.
Inventor: Tomoaki Ozaki
-
Patent number: 7996655
Abstract: One embodiment provides a method of forwarding data in a processor. The method generally includes providing at least one cascaded delayed execution pipeline unit having at least a first pipeline and a second pipeline for executing first and second instructions in a common issue group, wherein the second pipeline executes the second instruction in a delayed manner relative to the execution of the first instruction in the first pipeline, storing results generated by an execution unit of the first pipeline in a first-in first-out (FIFO) storage target delay queue, determining if the target delay queue contains source data for executing the second instruction, and if the target delay queue contains source data for the second instruction, forwarding the source data for the second instruction from the target delay queue to an execution unit of the second pipeline.
Type: Grant
Filed: April 22, 2008
Date of Patent: August 9, 2011
Assignee: International Business Machines Corporation
Inventor: David A. Luick
-
Patent number: 7996662
Abstract: In one embodiment, a processor comprises a plurality of storage locations, a decode circuit, and a status/control register (SCR). Each storage location is addressable as a speculative register and is configured to store result data generated during execution of an instruction operation and a value representing an update for the SCR. The value includes at least a first encoding that represents an update to a plurality of bits in the SCR, and a first number of bits in the plurality of bits is greater than a second number of bits in the first encoding. The decode circuit is coupled to receive the first encoding from a first storage location responsive to retirement of a first instruction operation assigned to use the first storage location as a destination, and is configured to decode the first encoding and generate the plurality of bits. The decode circuit is configured to update the SCR.
Type: Grant
Filed: November 17, 2005
Date of Patent: August 9, 2011
Assignee: Apple Inc.
Inventors: Wei-Han Lien, Daniel C. Murray, Junji Sugisawa
-
Patent number: 7991978
Abstract: Data processing on a network on chip (‘NOC’) that includes integrated processor (‘IP’) blocks, each of a plurality of the IP blocks including at least one computer processor, each such computer processor implementing a plurality of hardware threads of execution; low latency, high bandwidth application messaging interconnects; memory communications controllers; network interface controllers; and routers; each of the IP blocks adapted to a router through a separate one of the low latency, high bandwidth application messaging interconnects, a separate one of the memory communications controllers, and a separate one of the network interface controllers; each application messaging interconnect abstracting into an architected state of each processor, for manipulation by computer programs executing on the processor, hardware inter-thread communications among the hardware threads of execution; each memory communications controller controlling communication between an IP block and memory; each network interface contro
Type: Grant
Filed: May 9, 2008
Date of Patent: August 2, 2011
Assignee: International Business Machines Corporation
Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Eric O. Mejdrich, Paul E. Schardt
-
Patent number: 7991984
Abstract: A loop control system comprises at least one loop flag in an instruction word, at least one loop counter associated with the at least one loop flag operable to store and compute a number of times a program loop is to be executed, at least one start address register associated with the at least one loop flag operable to store a program loop starting address, and at least one end address register associated with the at least one loop flag operable to store a program loop ending address.
Type: Grant
Filed: December 23, 2005
Date of Patent: August 2, 2011
Assignee: Samsung Electronics Co., Ltd.
Inventor: Eran Pisek
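How a loop counter, start address register, and end address register could cooperate is sketched below as a tiny program-counter simulation. It models a single loop (one register set) and leaves out the loop flag in the instruction word; the function `run_loop` and the exact branch-back rule are assumptions for illustration, not the patented design.

```python
def run_loop(start, end, count, program_len):
    """Return the sequence of program-counter values executed."""
    pc, remaining, trace = 0, count, []
    while pc < program_len:
        trace.append(pc)
        if pc == end and remaining > 1:
            # End address reached with iterations left: count down and
            # branch back to the start address without a branch instruction.
            remaining -= 1
            pc = start
        else:
            pc += 1  # ordinary sequential fetch, or loop exit on the last pass
    return trace
```

For example, `run_loop(1, 2, 3, 4)` executes addresses 1 and 2 three times between address 0 and address 3.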
-
Patent number: 7984266
Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
Type: Grant
Filed: June 5, 2007
Date of Patent: July 19, 2011
Assignee: VNS Portfolio LLC
Inventor: Charles H. Moore
-
Patent number: 7971030
Abstract: An apparatus, method, and system for synchronicity independent, resource delegating, power and instruction optimizing processor is provided where instructions are delegated between various processing resources of the processor. An Integer Processing Unit (IPU) of the processor delegates complicated mathematical instructions to a Mathematical Processing Unit (MPU) of the processor. Furthermore, the processor puts underutilized processing resources to sleep thereby increasing power usage efficiency. A cache of the processor is also capable of accepting delegated operations from the IPU. As such, the cache performs various logical operations on delegated requests allowing it to lock and share memory without requiring extra processing cycles by the entire processor.
Type: Grant
Filed: August 6, 2003
Date of Patent: June 28, 2011
Assignee: MMAGIX Technology Limited
Inventor: Daniel Shane O'Sullivan
-
Patent number: 7966480
Abstract: Trap flags and a pointer trap are associated with registers in a processor. Each trap flag indicates whether a corresponding register has been written with valid data. If not, the trap flag is set to indicate that the register corresponding to the trap flag contains invalid data. During instruction processing, the pointer trap receives control signals from instruction fetch/decode logic on the processor indicating an instruction being processed calls for a register to be used as a pointer. If the specified pointer register has its corresponding trap flag set, then the pointer trap indicates that a processing exception has occurred. The interrupt logic/exception processing logic then causes a trap interrupt service routine (ISR) to be executed in response to the exception. The ISR prevents errors from being introduced in the instruction processing due to invalid pointer values.
Type: Grant
Filed: December 20, 2004
Date of Patent: June 21, 2011
Assignee: Microchip Technology Incorporated
Inventor: Michael I. Catherwood
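The trap-flag check in this abstract can be modeled in a few lines. The sketch below is a software analogy only: `RegisterFile`, `PointerTrap`, and the use of a Python exception in place of the trap ISR are hypothetical choices, and it assumes every flag starts out set (register not yet written).

```python
class PointerTrap(Exception):
    """Stands in for the processing exception that would invoke the trap ISR."""

class RegisterFile:
    def __init__(self, n):
        self.values = [0] * n
        self.trap_flags = [True] * n  # set = register holds invalid data

    def write(self, reg, value):
        self.values[reg] = value
        self.trap_flags[reg] = False  # a write makes the contents valid

    def use_as_pointer(self, reg):
        # The pointer trap fires only when the register is used as a pointer.
        if self.trap_flags[reg]:
            raise PointerTrap(f"register {reg} used as pointer before being written")
        return self.values[reg]
```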
-
Patent number: 7962716
Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.
Type: Grant
Filed: November 17, 2004
Date of Patent: June 14, 2011
Assignee: QST Holdings, Inc.
Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
-
Patent number: 7958342
Abstract: A Nyquist sampling frequency is determined for performance counter events to be measured. Based on the Nyquist sampling frequencies, a schedule for measuring the performance counter events is determined. The performance counter event measurements are then conducted in accordance with the schedule, whereby the measurements yield a set of sample data for each performance counter event. A signal reconstruction algorithm is applied to the set of sample data for each performance counter event to reconstruct an essentially complete signal for each performance counter event. The essentially complete signal for each performance counter event is then used to improve either a design or a utilization of either a microprocessor or an application to be executed on the microprocessor.
Type: Grant
Filed: January 24, 2007
Date of Patent: June 7, 2011
Assignee: Oracle America, Inc.
Inventors: Robert M. Lane, Kenneth Tracton, Zenon Fortuna
-
Patent number: 7937557
Abstract: A computer array (10) has a plurality of computers (12) for accomplishing a larger task that is divided into smaller tasks, each of the smaller tasks being assigned to one or more of the computers (12). Each of the computers (12) may be configured for specific functions and individual input/output circuits (26) associated with exterior computers (12) are specifically adapted for particular input/output functions. An example of 25 computers (12) arranged in the computer array (10) has a centralized computational core (34) with the computers (12) nearer the edge of the die (14) being configured for input and/or output.
Type: Grant
Filed: March 16, 2004
Date of Patent: May 3, 2011
Assignee: VNS Portfolio LLC
Inventor: Charles H. Moore
-
Patent number: 7584345
Abstract: A method for dynamically programming Field Programmable Gate Arrays (FPGA) in a coprocessor, the coprocessor coupled to a processor, includes: beginning an execution of an application by the processor; receiving an instruction from the processor to the coprocessor to perform a function for the application; determining that the FPGA in the coprocessor is not programmed with logic for the function; fetching a configuration bit stream for the function; and programming the FPGA with the configuration bit stream. In this manner, the FPGA are programmable “on the fly”, i.e., dynamically during the execution of an application. The hardware acceleration and resource sharing advantages provided by the FPGA can be utilized more often by the application. Logic flexibility and space savings on the chip comprising the coprocessor and processor are provided as well.
Type: Grant
Filed: October 30, 2003
Date of Patent: September 1, 2009
Assignee: International Business Machines Corporation
Inventors: Andreas C. Doering, Silvio Dragone, Andreas Herkersdorf, Richard G. Hofmann, Charles E. Kuhlmann
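The method steps in this abstract (receive a request, check whether the FPGA already holds the logic, fetch a bit stream, program) map onto a simple control flow. The sketch below is a hedged software model: the `Coprocessor` class, its single-FPGA simplification, and the dictionary used as a bit-stream store are all assumptions for illustration.

```python
class Coprocessor:
    def __init__(self, bitstreams):
        self.bitstreams = bitstreams  # hypothetical store: function name -> bit stream
        self.programmed = None        # function currently configured in the FPGA
        self.reprogram_count = 0

    def perform(self, function):
        # Determine whether the FPGA is already programmed for this function.
        if self.programmed != function:
            bitstream = self.bitstreams[function]  # fetch the configuration bit stream
            self._program(function, bitstream)
        return f"ran {function}"

    def _program(self, function, bitstream):
        # A real device would load `bitstream` here; the model just records it.
        self.programmed = function
        self.reprogram_count += 1
```

Repeated requests for the same function reuse the existing configuration, so reprogramming happens only when the requested function changes.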
-
Patent number: 7516301
Abstract: Heterogeneous processors can cooperate for distributed processing tasks in a multiprocessor computing system. Each processor is operable in a “compatible” mode, in which all processors within a family accept the same baseline command set and produce identical results upon executing any command in the baseline command set. The processors also have a “native” mode of operation in which the command set and/or results may differ in at least some respects from the baseline command set and results. Heterogeneous processors with a compatible mode defined by reference to the same baseline can be used cooperatively for distributed processing by configuring each processor to operate in the compatible mode.
Type: Grant
Filed: December 16, 2005
Date of Patent: April 7, 2009
Assignee: Nvidia Corporation
Inventors: Henry Packard Moreton, Abraham B. de Waal
-
Patent number: 7506140
Abstract: A return data selector is disclosed. A pipelined microprocessor includes N functional units that request to return data to the pipeline. In a given selection cycle, some of the functional units may not be requesting to return data. The return data selector includes a circuit for selecting one of the functional units in a round-robin fashion. The circuit 1-bit left rotatively increments a first addend by a second addend to generate a sum that is ANDed with the inverse of the first addend to generate a 1-hot vector indicating which of the functional units is selected next. The first addend is an N-bit vector where each bit is false if the corresponding functional unit is requesting to return a result to the pipeline. The second addend is a 1-hot vector indicating the last selected functional unit.
Type: Grant
Filed: March 22, 2005
Date of Patent: March 17, 2009
Assignee: MIPS Technologies, Inc.
Inventor: Michael Gottlieb Jensen
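The add-and-mask selection in this abstract can be tried out on integers used as bit vectors. The sketch below rests on one reading of "1-bit left rotatively increments": the last-selected vector is rotated left by one bit and added to the first addend with an end-around carry. That reading, and the function name `next_selected`, are assumptions rather than details confirmed by the abstract.

```python
def next_selected(requests, last_selected, n):
    """Return a 1-hot vector for the next requesting unit after the last one.

    requests:      n-bit vector, bit i set if unit i is requesting
    last_selected: 1-hot n-bit vector of the previously selected unit
    """
    mask = (1 << n) - 1
    first_addend = ~requests & mask  # bit is false (0) where a unit is requesting
    # Rotate the last-selected bit left by one so the search starts at the next unit.
    rotated = ((last_selected << 1) | (last_selected >> (n - 1))) & mask
    total = first_addend + rotated
    total = (total & mask) + (total >> n)  # end-around carry wraps past unit n-1
    # The carry ripples through non-requesting units and stops at the first requester.
    return total & ~first_addend & mask
```

With `n = 4`, requests `0b0101`, and unit 0 selected last, the carry ripples past unit 1 and stops at unit 2, giving `0b0100`; when no unit is requesting the result is zero.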