Patents Examined by Eddie P. Chan
  • Patent number: 8166480
    Abstract: Illustrative embodiments provide a computer implemented method, a data processing system and a computer program product for lock contention reduction. In one illustrative embodiment, the computer implemented method provides a lock to an active thread, increments a lock counter, receives a request to de-schedule the active thread, and determines whether the lock is held by the active thread. The computer implemented method, responsive to a determination that the lock is held by the active thread, adds a first pre-determined amount to a time slice of the active thread.
    Type: Grant
    Filed: July 29, 2008
    Date of Patent: April 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Nathan D. Fontenot, Jacob Lorien Moilanen, Joel Howard Schopp, Michael Thomas Strosaker, Mark Wayne VanderWiele
  • Patent number: 8161271
    Abstract: Embodiments of the invention provide logic within the store data path between a processor and a memory array. The logic may be configured to misalign vector data as it is stored to memory. By misaligning vector data as it is stored to memory, memory bandwidth may be maximized while processing bandwidth required to store vector data misaligned is minimized. Furthermore, embodiments of the invention provide logic within the load data path which allows vector data which is stored misaligned to be aligned as it is loaded into a vector register. By aligning misaligned vector data as it is loaded into a vector register, memory bandwidth may be maximized while processing bandwidth required to align misaligned vector data may be minimized.
    Type: Grant
    Filed: July 11, 2007
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
  • Patent number: 8161270
    Abstract: A programmable processor configured to perform one or more packet modifications through execution of one or more commands. A pipelined processor core comprises a first stage configured to selectively shift and mask data in each of a plurality of categories in response to one or more decoded commands, and combine the selectively shifted and masked data in each of the categories. The pipelined processor core further comprises a second stage configured to selectively perform one or more operations on the combined data from the first stage and other data responsive to the one or more decoded commands. In one implementation, the processor is implemented as an application specific integrated circuit (ASIC).
    Type: Grant
    Filed: March 30, 2004
    Date of Patent: April 17, 2012
    Assignee: Extreme Networks, Inc.
    Inventors: David K. Parker, Erik R. Swenson, Christopher J. Young
  • Patent number: 8145887
    Abstract: A method, system, and computer program product are provided for enhancing the execution of independent loads in a processing unit. A processing unit detects if a long-latency miss associated with a load instruction has been encountered. Responsive to a long-latency miss, the processing unit enters a load lookahead mode. Responsive to entering the load lookahead mode, the processing unit dispatches each instruction from a first set of instructions from a first buffer with an associated vector. The processing unit determines if the first set of instructions from the first buffer have completed execution. Responsive to completed execution of the first set of instructions from the first buffer, the processing unit copies the set of vectors from a first vector array to a second vector array. Then the processing unit dispatches a second set of instructions from a second buffer with an associated vector from the second vector array.
    Type: Grant
    Filed: June 15, 2007
    Date of Patent: March 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen
  • Patent number: 8140826
    Abstract: Methods, apparatus, and computer program products are disclosed for executing a gather operation on a parallel computer according to embodiments of the present invention. Embodiments include configuring, by the logical root, a result buffer or the logical root, the result buffer having positions, each position corresponding to a ranked node in the operational group and for storing contribution data gathered from that ranked node.
    Type: Grant
    Filed: May 29, 2007
    Date of Patent: March 20, 2012
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Joseph D. Ratterman
  • Patent number: 8135941
    Abstract: One embodiment of the invention provides a processor. The processor generally includes a first and second processor core, each having a plurality of pipelined execution units for executing an issue group of multiple instructions and scheduling logic configured to issue a first issue group of instructions to the first processor core for execution and a second issue group of instructions to the second processor core for execution when the processor is in a first mode of operation and configured to issue one or more vector instructions for concurrent execution on the first and second processor cores when the processor is in a second mode of operation.
    Type: Grant
    Filed: September 19, 2008
    Date of Patent: March 13, 2012
    Assignee: International Business Machines Corporation
    Inventor: David A. Luick
  • Patent number: 8131976
    Abstract: Mechanisms, in a data processing system, are provided for tracking effective addresses through a processor pipeline of the data processing system. The mechanisms comprise logic for fetching an instruction from an instruction cache and associating, by an effective address table logic in the data processing system, an entry in an effective address table (EAT) data structure with the fetched instruction. The mechanisms further comprise logic for associating an effective address tag (eatag) with the fetched instruction, the eatag comprising a base eatag that points to the entry in the EAT and an eatag offset. Moreover, the mechanisms comprise logic for processing the instruction through the processor pipeline by processing the eatag.
    Type: Grant
    Filed: April 13, 2009
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Richard W. Doing, Susan E. Eisen, David S. Levitan, Kevin N. Magill, Brian R. Mestan, Balaram Sinharoy, Benjamin W. Stolt, Jeffrey R. Summers, Albert J. Van Norstrand, Jr.
  • Patent number: 8131980
    Abstract: A design structure for resolving the occurrence of livelock at the interface between the processor core and memory subsystem controller. Livelock is resolved by introducing a livelock detection mechanism (which includes livelock detection utility or logic) within the processor to detect a livelock condition and dynamically change the duration of the delay stage(s) in order to alter the “harmonic” fixed-cycle loop behavior. The livelock detection logic (LDL) counts the number of flushes a particular instruction takes or the number of times an instruction re-issues without completing. The LDL then compares that number to a preset threshold number. Based on the result of the comparison, the LDL triggers the implementation of one of two different livelock resolution processes.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ronald Hall, Michael L. Karm, Alvan W. Ng, Todd A. Venton
  • Patent number: 8127117
    Abstract: A method and system to combine corresponding half word units from multiple register units within a microprocessor, such as, for example, a digital signal processor, during execution of a single instruction are described. An instruction to combine predetermined disparate source register units from a register file structure is received within a processing unit. The instruction is then executed to combine corresponding half word units from the source register units and to input the half word units into respective portions of a resulting destination register unit. During the execution of the instruction, the predetermined source register units are identified and corresponding most significant half word units and associated data are retrieved from the identified register units. The retrieved half word units are further combined and input into a respective most significant portion of a resulting destination register unit.
    Type: Grant
    Filed: May 10, 2006
    Date of Patent: February 28, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Mao Zeng, Lucian Codrescu
  • Patent number: 8122231
    Abstract: Selective power control of one or more processing elements matches a degree of parallelism to requirements of a task performed in a highly parallel programmable data processor. For example, when program operations require less than the full width of the data path, a software instruction of the program sets a mode of operation requiring a subset of the parallel processing capacity. At least one parallel processing element, that is not needed, can be shut down to conserve power. At a later time, when the added capacity is needed, execution of another software instruction sets the mode of operation to that of the wider data path, typically the full width, and the mode change reactivates the previously shut-down processing element.
    Type: Grant
    Filed: February 17, 2010
    Date of Patent: February 21, 2012
    Assignee: QUALCOMM Incorporated
    Inventor: Kenneth Alan Dockser
  • Patent number: 8108654
    Abstract: The present invention provides system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to receiving an issue group of instructions, reordering the issue group of instructions using instruction type priority, and executing the reordered issue group of instructions in the cascaded delayed execution pipeline unit. The method, among others, can be broadly summarized by the following steps: receiving an issue group of instructions, reordering the issue group of instructions using instruction type priority, and executing the reordered issue group of instructions in the cascaded delayed execution pipeline unit.
    Type: Grant
    Filed: February 19, 2008
    Date of Patent: January 31, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey P. Bradford, David A. Luick
  • Patent number: 8095778
    Abstract: Sharing functional units within a multithreaded processor. In one embodiment, the multithreaded processor may include a multithreaded instruction source that may provide an instruction from each of a plurality of thread groups in a given cycle. A given thread group may include one or more instructions from one or more threads. The arbitration functionality may arbitrate between the plurality of thread groups for access to a functional unit such as a load store unit, for example, that may be shared between the thread groups.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: January 10, 2012
    Assignee: Open Computing Trust I & II
    Inventor: Robert T. Golla
  • Patent number: 8095779
    Abstract: The present invention provides system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if a plurality of load instructions are in the issue group, if so, schedule the plurality of load instructions in descending order of longest dependency chain depth to shortest dependency chain depth in a shortest to longest available execution pipelines; and (3) execute the issue group of instructions in the cascaded delayed execution pipeline unit.
    Type: Grant
    Filed: February 19, 2008
    Date of Patent: January 10, 2012
    Assignee: International Business Machines Corporation
    Inventor: David A. Luick
  • Patent number: 8095782
    Abstract: Graphics processing elements are capable of processing multiple contexts simultaneously, reducing the need to perform time consuming context switches compared with processing a single context at a time. Processing elements of a graphics processing pipeline may be configured to support all of the multiple contexts or only a portion of the multiple contexts. Each processing element may be allocated to process a particular context or a portion of the multiple contexts in order to simultaneously process more than one context. The allocation of processing elements to the multiple contexts may be determined dynamically in order to improve graphics processing throughput.
    Type: Grant
    Filed: June 14, 2007
    Date of Patent: January 10, 2012
    Assignee: NVIDIA Corporation
    Inventors: John M. Danskin, Lacky V. Shah
  • Patent number: 8095780
    Abstract: A multi-issue processor includes a register file and a plurality of issue slots, each one of the plurality of issue slots having a plurality of functional units and a plurality of holdable registers. The plurality of issue slots include a first set of issue slots and a second set of issue slots, and the register file is accessible by the plurality of issue slots. A location of at least a part of the plurality of holdable registers in the first set of issue slots is different from a location of at least a corresponding part of the plurality of holdable registers in the second set of issue slots.
    Type: Grant
    Filed: April 1, 2003
    Date of Patent: January 10, 2012
    Assignee: Nytell Software LLC
    Inventor: Jeroen Anton Johan Leijten
  • Patent number: 8095777
    Abstract: A design structure embodied in a machine readable medium used in a design process includes an apparatus for predictive decoding, the apparatus including register logic for fetching an instruction; predictor logic containing predictor information including prior instruction execution characteristics; logic for obtaining predictor information for the fetched instruction from the predictor; and decode logic for generating a selected one of a plurality of decode operation streams corresponding to the fetched instruction, wherein the decode operation stream is selected based on the predictor information.
    Type: Grant
    Filed: November 1, 2007
    Date of Patent: January 10, 2012
    Assignee: International Business Machines Corporation
    Inventors: Bartholomew Blaner, Michael K. Gschwind
  • Patent number: 8090933
    Abstract: The present invention relates to a method for the unification of PER branch and PER store operations within the same dataflow. The method comprises determining a PER range, the PER range comprising a storage area defined by a designated storage starting area and a designated storage ending area, wherein the storage starting area is designated by a value of the contents of a first control register and the storage ending area is designated by a value of the contents of a second control register. The method also comprises retrieving register field content values that are stored at a plurality of registers, wherein the retrieved content values comprises a length field content value, and setting the length field content value to zero for a PER branch instruction, thereby enabling a PER branch instruction to performed similarly to a PER storage instruction.
    Type: Grant
    Filed: February 12, 2008
    Date of Patent: January 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Bruce C. Giamei
  • Patent number: 8086824
    Abstract: A stream processing system includes a stream processing module coupled to a memory module and operable so as to fetch stream elements from the memory module, to process the stream elements fetched thereby, and to store processed stream elements in the memory module. The stream processing module includes a number (N) of stream processing units, and the memory module is configured with a number (N) of memory bank units each corresponding to a respective one of the stream processing units. The memory module is reconfigurable based on a desired inter-level configuration so that each of the memory bank units is configured to have a memory size sufficient to meet processing requirement of the respective one of the stream processing units.
    Type: Grant
    Filed: May 5, 2009
    Date of Patent: December 27, 2011
    Assignee: National Taiwan University
    Inventors: You-Ming Tsao, Liang-Gee Chen, Shao-Yi Chien
  • Patent number: 8086832
    Abstract: A design structure embodied in a machine readable, non-transitory storage medium used in a design process includes a system for dynamically varying the pipeline depth of a computing device. The system includes a state machine that determines an optimum length of a pipeline architecture based on a processing function to be performed. A pipeline sequence controller, responsive to the state machine, varies the depth of the pipeline based on the optimum length. A plurality of clock splitter elements, each associated with a corresponding plurality of latch stages in the pipeline architecture, are coupled to the pipeline sequence controller and adapted to operate in a functional mode, one or more clock gating modes, and a pass-through flush mode. For each of the clock splitter elements operating in the pass-through flush mode, data is passed through the associated latch stage without oscillation of clock signals associated therewith.
    Type: Grant
    Filed: October 9, 2007
    Date of Patent: December 27, 2011
    Assignee: International Business Machines Corporation
    Inventors: Susan K. Lichtensteiger, Pascal A. Nsame, Sebastian T. Ventrone
  • Patent number: 8082428
    Abstract: A method of resolving simultaneous branch predictions prior to validation of the predicted branch instruction is disclosed. The method includes processing two or more predicted branch instructions, with each predicted branch instruction having a predicted state and a corrected state. The method further includes selecting one of the corrected states. Should one of the predicted branch instructions be mispredicted, the selected corrected state is used to direct future instruction fetches.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: December 20, 2011
    Assignee: QUALCOMM Incorporated
    Inventors: Rodney Wayne Smith, Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius