Patents by Inventor Matthew Ray Tubbs

Matthew Ray Tubbs has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100100712
    Abstract: A processing unit includes multiple execution units and sequencer logic that is disposed downstream of instruction buffer logic, and that is responsive to a sequencer instruction present in an instruction stream. In response to such an instruction, the sequencer logic issues a plurality of instructions associated with a long latency operation to one execution unit, while blocking instructions from the instruction buffer logic from being issued to that execution unit. In addition, the blocking of instructions from being issued to the execution unit does not affect the issuance of instructions to any other execution unit, and as such, other instructions from the instruction buffer logic are still capable of being issued to and executed by other execution units even while the sequencer logic is issuing the plurality of instructions associated with the long latency operation.
    Type: Application
    Filed: October 16, 2008
    Publication date: April 22, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20100031009
    Abstract: A floating point execution unit calculates a one minus dot product value in a single pass. As such, the dependency that otherwise would be required to perform the calculations is eliminated, resulting in a substantially faster performance of such calculations. The floating point execution unit may be used, for example, to accelerate pixel shading algorithms such as Fresnel and electron microscope effects.
    Type: Application
    Filed: August 1, 2008
    Publication date: February 4, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090315908
    Abstract: A circuit arrangement and method utilize texture data prefetching to prefetch texture data used by an anisotropic filtering algorithm. In particular, stride-based prefetching may be used to prefetch texture data for use in anisotropic filtering, where the value of the stride, or difference between successive accesses, is based upon a distance in a memory address space between sample points taken along the line of anisotropy used in an anisotropic filtering algorithm.
    Type: Application
    Filed: April 25, 2008
    Publication date: December 24, 2009
    Inventors: Miguel Comparan, Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
  • Patent number: 7634527
    Abstract: In a first aspect, a first method of reciprocal estimate computation using floating point pipeline logic is provided. The first method includes the steps of (1) receiving an input value having an exponent and a mantissa when represented as a floating point number on which a reciprocal estimate computation is to be performed; (2) determining whether the exponent is one of a plurality of predetermined numbers; and (3) if the exponent is one of the plurality of predetermined numbers, adjusting at least one of a plurality of modified mantissa bits (e.g., mantissa bits internal to leading zero anticipator (LZA) logic) and the exponent so as to prevent an underflow result of the reciprocal estimate computation. Numerous other aspects are provided.
    Type: Grant
    Filed: November 17, 2005
    Date of Patent: December 15, 2009
    Assignee: International Business Machines Corporation
    Inventors: Sherman Matthew Dance, Andrew Patrick Freemyer, Matthew Ray Tubbs
  • Publication number: 20090300335
    Abstract: A circuit arrangement and method couple a hardware-based pseudorandom number generator (PRNG) to an execution unit in such a manner that pseudorandom numbers generated by the PRNG may be selectively output to the execution unit for use as an operand during the execution of instructions by the execution unit. A PRNG may be coupled to an input of an operand multiplexer that outputs to an operand input of an execution unit so that operands provided by instructions supplied to the execution unit are selectively overridden with pseudorandom numbers generated by the PRNG. Furthermore, overridden operands provided by instructions supplied to the execution unit may be used as seed values for the PRNG.
    Type: Application
    Filed: June 3, 2008
    Publication date: December 3, 2009
    Inventors: Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090292907
    Abstract: A pipelined execution unit incorporates one or more low power modes that reduce power consumption by dynamically merging pipeline stages in an execution pipeline together with one another. In particular, the execution logic in successive pipeline stages in an execution pipeline may be dynamically merged together by setting one or more latches that are intermediate to such pipeline stages to a transparent state such that the output of the pipeline stage preceding such latches is passed to the subsequent pipeline stage during the same clock cycle so that both such pipeline stages effectively perform steps for the same instruction during each clock cycle. Then, with the selected pipeline stages merged, the power consumption of the execution pipeline can be reduced (e.g., by reducing the clock frequency and/or operating voltage of the execution pipeline), often with minimal adverse impact on performance.
    Type: Application
    Filed: May 22, 2008
    Publication date: November 26, 2009
    Inventors: Stephen Joseph Schwinn, Matthew Ray Tubbs, Charles David Wait
  • Publication number: 20090293061
    Abstract: A circuit arrangement and method utilize a plurality of execution units having different power and performance characteristics and capabilities within a multithreaded processor core, and selectively route instructions having different performance requirements to different execution units based upon those performance requirements. As such, instructions that have high performance requirements, such as instructions associated with primary tasks or time sensitive tasks, can be routed to a higher performance execution unit to maximize performance when executing those instructions, while instructions that have low performance requirements, such as instructions associated with background tasks or non-time sensitive tasks, can be routed to a reduced power execution unit to reduce the power consumption (and associated heat generation) associated with executing those instructions.
    Type: Application
    Filed: May 22, 2008
    Publication date: November 26, 2009
    Inventors: Stephen Joseph Schwinn, Matthew Ray Tubbs, Charles David Wait
  • Publication number: 20090240920
    Abstract: An execution unit supports data dependent conditional write instructions that write data to a target only when a particular condition is met. In one implementation, a data dependent conditional write instruction identifies a condition as well as data to be tested against that condition. The data is tested against that condition, and the result of the test is used to selectively enable or disable a write to a target associated with the data dependent conditional write instruction. Then, a write is attempted while the write to the target is enabled or disabled such that the write will update the contents of the target only when the write is selectively enabled as a result of the test. By doing so, dependencies are typically avoided, as is use of an architected condition register that might otherwise introduce branch prediction mispredict penalties, enabling improved performance with z-buffer test and similar types of algorithms.
    Type: Application
    Filed: March 18, 2008
    Publication date: September 24, 2009
    Inventors: Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090228682
    Abstract: A software-accessible special purpose register is architected into a processing unit in order to implement persistent vector multiplexer control of a vector-based execution unit. A persistent swizzle instruction is defined in an instruction set for the vector-based execution unit and is used to cause state information to be stored in the special purpose register such that the operand vectors processed by subsequent vector instructions executed by the vector-based execution unit will be selectively shuffled using the persisted state information. As a result, when multiple vector instructions require a common custom word ordering for one or more operand vectors, a single persistent swizzle instruction may be used to select the desired custom word ordering for all of the vector instructions.
    Type: Application
    Filed: March 10, 2008
    Publication date: September 10, 2009
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Robert Allen Shearer, Matthew Ray Tubbs
  • Publication number: 20090228681
    Abstract: Persistent vector multiplexer control is used in a vector-based execution unit to control the shuffling of words in operand vectors processed by the execution unit. In addition, a persistent swizzle instruction is defined in an instruction set for the vector-based execution unit and is used to cause state information to be persisted such that the operand vectors processed by subsequent vector instructions executed by the vector-based execution unit will be selectively shuffled using the persisted state information. As a result, when multiple vector instructions require a common custom word ordering for one or more operand vectors, a single persistent swizzle instruction may be used to select the desired custom word ordering for all of the vector instructions.
    Type: Application
    Filed: March 10, 2008
    Publication date: September 10, 2009
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Robert Allen Shearer, Matthew Ray Tubbs
  • Publication number: 20090228690
    Abstract: An “early exit” of an iterative refinement algorithm is implemented by effectively disabling read after write dependency stalls of newer instructions, as well as disabling the register write enable of these instructions, for the remainder of the algorithm, in addition to disabling the register write enable of these instructions. By doing so, the latency of the algorithm is reduced and the performance is increased without the complexity and potential poor performance of compare and branch instructions that might otherwise be required.
    Type: Application
    Filed: March 10, 2008
    Publication date: September 10, 2009
    Inventors: Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090228689
    Abstract: A programmable “early exit” of an iterative refinement algorithm is implemented by effectively disabling read after write dependency stalls of newer instructions, as well as disabling the register write enable of these instructions, for the remainder of the algorithm, in addition to disabling the register write enable of these instructions. In addition, programmable logic is provided to enable a custom early exit condition to be specified for the iterative refinement algorithm so that the underlying hardware can be configured for optimal execution of particular iterative refinement algorithms. By doing so, the latency of the algorithm is reduced and the performance is increased without the complexity and potential poor performance of compare and branch instructions that might otherwise be required.
    Type: Application
    Filed: March 10, 2008
    Publication date: September 10, 2009
    Inventors: Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090182987
    Abstract: A multirate execution unit is capable of being operated in a plurality of modes, with the execution unit being capable of clocked at multiple different rates relative to a multithreaded issue unit such that, in applications where maximum performance is desired, the execution unit can be clocked at a rate that is faster than the clock rate for the multithreaded issue unit, and in applications where a lower power profile is desired, the execution unit can be throttled back to a slower rate to reduce the power consumption of the execution unit. When the execution unit is clocked at a faster rate than the multithreaded issue unit, the issue unit is permitted to issue more instructions per cycle than when the execution unit is throttled to the slower rate to increase overall instruction throughput.
    Type: Application
    Filed: January 11, 2008
    Publication date: July 16, 2009
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090182986
    Abstract: A circuit arrangement and method utilize an issue rate-based predictive thermal management technique in a microprocessor or other integrated circuit that tracks the rate in which instructions are issued to one or more execution units in the processing unit, and selectively delays the issuance of subsequent instructions to the execution unit(s) based upon the tracked issue rate to predictively control the thermal output of the integrated circuit.
    Type: Application
    Filed: January 16, 2008
    Publication date: July 16, 2009
    Inventors: Stephen Joseph Schwinn, Matthew Ray Tubbs, Charles David Wait
  • Publication number: 20090150647
    Abstract: A vectorizable execution unit is capable of being operated in a plurality of modes, with the processing lanes in the vectorizable execution unit grouped into different combinations of logical execution units in different modes. By doing so, processing lanes can be selectively grouped together to operate as different types of vector execution units and/or scalar execution units, and if desired, dynamically switched during runtime to process various types of instruction streams in a manner that is best suited for each type of instruction stream. As a consequence, a single vectorizable execution unit may be configurable, e.g., via software control, to operate either as a vector execution or a plurality of scalar execution units.
    Type: Application
    Filed: December 7, 2007
    Publication date: June 11, 2009
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
  • Patent number: 7542044
    Abstract: An approach to optimize specular highlight generation is presented. A single microprocessor instruction is used to generate an intensity value based upon a viewing angle value. An application stores a viewing angle value in an input register. When called, the “intensity instruction” retrieves the viewing angle value from the input register, and calculates an intensity value using three distinct steps. In turn, the intensity instruction stores the intensity value in an output register for the application to retrieve and further process. In one embodiment, the invention may be implemented using PowerPC™ assembly and VMX™ or Altivec™ instructions. In this embodiment, the intensity instruction may be represented as a “vspecefp” instruction, which stands for a “vector specular estimate floating point” instruction.
    Type: Grant
    Filed: March 15, 2008
    Date of Patent: June 2, 2009
    Assignee: International Business Machines Corporation
    Inventors: Gordon Clyde Fossum, Stephen Joseph Schwinn, Matthew Ray Tubbs
  • Publication number: 20090083357
    Abstract: A method, computer-readable medium, and an apparatus for implementing a floating point weighted average function. The method includes receiving an input containing 2N input values, 2N weights, and an opcode, where N is a positive integer number and each of the input values corresponds to one of the weights. Furthermore, the method also includes using existing dot product circuit function to generate 2N addends by multiplying each of the input values with the corresponding weight. In addition, the method includes generating a sum value by adding the 2N addends, where the sum value includes an exponent value, and generating the weighted average value based on the sum value by decreasing the exponent value by N. In this fashion, the same circuit area may be used to carry out both dot product and weighted average calculations, leading to greater circuit area savings and performance advantages.
    Type: Application
    Filed: September 26, 2007
    Publication date: March 26, 2009
    Inventors: Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090070398
    Abstract: A method, computer-readable medium, and an apparatus for generating a transcendental value. The method includes receiving an input containing an input value and an opcode and determining whether the opcode corresponds to a trigonometric operation or a power-of-two operation. The method also includes calculating a fractional value and an integer value from the input value, generating the transcendental value based on the fractional value by adding at least a portion of the fractional value with at least one of a shifted fractional value produced by shifting the portion of the fractional value and a constant value, and providing the transcendental value in response to the request. In this fashion, the same circuit area may be used to carry out both trigonometric and power-of-two calculations, leading to greater circuit area savings and performance advantages while not sacrificing significant accuracy.
    Type: Application
    Filed: September 7, 2007
    Publication date: March 12, 2009
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090063608
    Abstract: Embodiments of the invention are generally related to the field of image processing, and more specifically to vector units for supporting image processing. A vector unit may comprise a plurality of operand multiplexers associated with each vector processing lane of the vector unit. The operand multiplexers may select vector operands from one or more register files for performing a cross product operation. A first multiply operation may be performed in a first pipeline stage by multiplying a first set of operands in a multiplier. In a second pipeline stage, a second multiply operation may be performed by multiplying a second set of operands. The results of the first multiply operation and the second multiply operation may be transferred to an adder to complete the cross product instruction.
    Type: Application
    Filed: September 4, 2007
    Publication date: March 5, 2009
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090049113
    Abstract: Embodiments of the invention provide methods and apparatus for executing a multiple operand instruction. Executing the multiple operand instruction comprises computing an arithmetic result of a pair of operands in each processing lane of a vector unit. The arithmetic results generated in each processing lane of the vector unit may be transferred to a dot product unit. The dot product unit may compute an arithmetic result using the arithmetic result computed by each processing lane of the vector unit to generate an arithmetic result of more than two operands.
    Type: Application
    Filed: August 17, 2007
    Publication date: February 19, 2009
    Inventors: Adam James Muff, Matthew Ray Tubbs