Patents by Inventor Jeffrey A. Lohman

Jeffrey A. Lohman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11360780
    Abstract: Techniques are disclosed relating to context switching in a SIMD processor. In some embodiments, an apparatus includes pipeline circuitry configured to execute graphics instructions included in threads of a group of single-instruction multiple-data (SIMD) threads in a thread group. In some embodiments, context switch circuitry is configured to atomically: save, for the SIMD group, a program counter and information that indicates whether threads in the SIMD group are active using one or more context switch registers, set all threads to an active state for the SIMD group, and branch to handler code for the SIMD group. In some embodiments, the pipeline circuitry is configured to execute the handler code to save context information for the SIMD group and subsequently execute threads of another thread group. Disclosed techniques may allow instruction-level context switching even when some SIMD threads are non-active.
    Type: Grant
    Filed: January 22, 2020
    Date of Patent: June 14, 2022
    Assignee: Apple Inc.
    Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Jeffrey T. Brady, Brian K. Reynolds, Jeffrey A. Lohman
  • Patent number: 11204774
    Abstract: Techniques are disclosed relating to a thread-group-scoped gate instruction. In some embodiments, graphics processor circuitry is configured to execute, for multiple SIMD groups of a thread group, a graphics program that includes a gate instruction. During execution of the gate instruction for a first SIMD group, the processor accesses state information to determine that a threshold number of other SIMD groups in the thread group have not yet executed the gate instruction. Based on the determination, the processor executes a particular set of instructions of the graphics program for the first SIMD group (that is not executed by one or more other SIMD groups that reach the gate instruction after the first SIMD group). For example, the particular set of instructions may be a utility program that performs one or more operations for the entire thread group but is only executed by a subset of the SIMD groups.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: December 21, 2021
    Assignee: Apple Inc.
    Inventors: Benjiman L. Goodman, Anjana Rajendran, Jeffrey A. Lohman, Terence M. Potter
  • Publication number: 20210224072
    Abstract: Techniques are disclosed relating to context switching in a SIMD processor. In some embodiments, an apparatus includes pipeline circuitry configured to execute graphics instructions included in threads of a group of single-instruction multiple-data (SIMD) threads in a thread group. In some embodiments, context switch circuitry is configured to atomically: save, for the SIMD group, a program counter and information that indicates whether threads in the SIMD group are active using one or more context switch registers, set all threads to an active state for the SIMD group, and branch to handler code for the SIMD group. In some embodiments, the pipeline circuitry is configured to execute the handler code to save context information for the SIMD group and subsequently execute threads of another thread group. Disclosed techniques may allow instruction-level context switching even when some SIMD threads are non-active.
    Type: Application
    Filed: January 22, 2020
    Publication date: July 22, 2021
    Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Jeffrey T. Brady, Brian K. Reynolds, Jeffrey A. Lohman
  • Patent number: 7113969
    Abstract: A floating point unit (FPU) for processing denormal numbers in floating point notation, a method of processing such numbers in an FPU and a computer system employing the FPU or the method. In one embodiment, the FPU includes: (1) a load unit that receives a denormal number having an exponent portion of a standard length from a source without the FPU and transforms the denormal number into a normalized number having an exponent portion of an expanded length greater than the standard length, (2) a floating point execution core, coupled to the load unit, that processes the normalized number at least once to yield a processed normalized number, the expanded length of the exponent portion allowing the processed normalized number to remain normal during processing thereof and (3) a store unit, coupled to the floating point execution core, that receives the processed normalized number and transforms the processed normalized number back into a denormal number having an exponent portion of the standard length.
    Type: Grant
    Filed: October 4, 2004
    Date of Patent: September 26, 2006
    Assignee: National Semiconductor Corporation
    Inventors: Daniel W. Green, Atul Dhablania, Jeffrey A. Lohman
  • Patent number: 6907518
    Abstract: For use in a processor having a first number of decode units for decoding an ordered stream of floating point instructions, a floating point unit (FPU) for receiving decoded ones of the floating point instructions and a method of processing the decoded ones of the floating point instructions. In one embodiment, the FPU includes: (1) a second number of floating point pipelines that execute the floating point instructions, the second number being at least one and less than the first number, the floating point pipeline having a load unit, an execution core and a store unit, (2) a floating point checkpoint buffer, coupled to the decode units, that queues the decoded ones of the floating point instructions for allocation to the floating point pipelines and (3) a floating point register file, coupled to and cooperable with the floating point checkpoint buffer, that preserves states of the execution core to allow the floating point pipelines to execute the floating point instructions out of order.
    Type: Grant
    Filed: June 16, 2003
    Date of Patent: June 14, 2005
    Assignee: National Semiconductor Corporation
    Inventors: Jeffrey Lohman, Nicholas Samra, Ram Gummadi
  • Patent number: 6801924
    Abstract: A floating point unit (FPU) for processing denormal numbers in floating point notation, a method of processing such numbers in an FPU and a computer system employing the FPU or the method. In one embodiment, the FPU includes: (1) a load unit that receives a denormal number having an exponent portion of a standard length from a source without the FPU and transforms the denormal number into a normalized number having an exponent portion of an expanded length greater than the standard length, (2) a floating point execution core, coupled to the load unit, that processes the normalized number at least once to yield a processed normalized number, the expanded length of the exponent portion allowing the processed normalized number to remain normal during processing thereof and (3) a store unit, coupled to the floating point execution core, that receives the processed normalized number and transforms the processed normalized number back into a denormal number having an exponent portion of the standard length.
    Type: Grant
    Filed: August 19, 1999
    Date of Patent: October 5, 2004
    Assignee: National Semiconductor Corporation
    Inventors: Daniel W. Green, Atul Dhablania, Jeffrey A. Lohman
  • Patent number: 6757812
    Abstract: For use in a processor having a floating point unit (FPU) capable of managing denormalized numbers in floating point notation, logic circuitry for, and a method of adding or subtracting two floating point numbers. In one embodiment, the logic circuitry includes: (1) an adder that receives the two floating point numbers and, based on a received instruction, adds or subtracts the two floating point numbers to yield a denormal sum or difference thereof, (2) a leading bit predictor that receives the two floating point numbers and performs logic operations thereon to yield predictive shift data denoting an extent to which the denormal sum or difference is required to be shifted to normalize the denormal sum or difference, the predictive shift data subject to being erroneous and (3) predictor corrector logic that receives the two floating point numbers and performs logic operations thereon to yield shift compensation data denoting an extent to which the predictive shift is erroneous.
    Type: Grant
    Filed: June 10, 2002
    Date of Patent: June 29, 2004
    Assignee: National Semiconductor Corporation
    Inventors: Daniel W. Green, Atul Dhablania, Jeffrey A. Lohman, Bang Nguyen
  • Patent number: 6714957
    Abstract: There is disclosed a denormal handling circuit for use in a pipelined floating point unit containing an addition pipe and/or a multiplication pipe. The denormal result handling circuit comprises a denormal condition detection circuit associated with at the addition pipe and/or the multiplication pipe for examining a first operand and a second operand loaded into the addition pipe and/or the multiplication pipe and detecting a potential denormal condition. The denormal condition indicates that a calculated result generated from the first and second operands may be a denormal result. The denormal condition detection circuit, in response to detection of a potential denormal condition, prevents an additional operation from being loaded into the addition pipe and/or the multiplication pipe.
    Type: Grant
    Filed: January 4, 2000
    Date of Patent: March 30, 2004
    Assignee: National Semiconductor Corporation
    Inventor: Jeffrey A. Lohman
  • Patent number: 6629231
    Abstract: There is disclosed a pipelined floating point unit comprising: a) a first plurality of pipelined functional units for processing operands conforming to a single instruction-multiple data stream (SIMD) instruction set architecture (ISA); b) a second plurality of pipelined functional units for processing operands conforming to a scalar instruction set architecture (ISA); and c) a first format fault detection circuit associated with at least one of the first plurality of pipelined functional units for determining whether a first operand is a denormal number and, in response to the determination, generating a first fault signal. The first fault signal causes a number conversion circuit associated with the pipelined floating point unit to modify a significand and an exponent of at least one operand in a data register associated with the pipelined floating point unit to thereby convert the at least one operand to a denormal number.
    Type: Grant
    Filed: January 4, 2000
    Date of Patent: September 30, 2003
    Assignee: National Semiconductor Corporation
    Inventor: Jeffrey A. Lohman
  • Patent number: 6581155
    Abstract: For use in a processor having a first number of decode units for decoding an ordered stream of floating point instructions, a floating point unit (FPU) for receiving decoded ones of the floating point instructions and a method of processing the decoded ones of the floating point instructions. In one embodiment, the FPU includes: (1) a second number of floating point pipelines that execute the floating point instructions, the second number being at least one and less than the first number, the floating point pipeline having a load unit, an execution core and a store unit, (2) a floating point checkpoint buffer, coupled to the decode units, that queues the decoded ones of the floating point instructions for allocation to the floating point pipelines and (3) a floating point register file, coupled to and cooperable with the floating point checkpoint buffer, that preserves states of the execution core to allow the floating point pipelines to execute the floating point instructions out of order.
    Type: Grant
    Filed: August 25, 1999
    Date of Patent: June 17, 2003
    Assignee: National Semiconductor Corporation
    Inventors: Jeffrey Lohman, Nicholas Samra, Ram Gummadi
  • Patent number: 6523050
    Abstract: For use in a processor having a floating point execution core, logic circuitry for, and a method of, converting negative numbers from integer notation to floating point notation.
    Type: Grant
    Filed: August 19, 1999
    Date of Patent: February 18, 2003
    Assignee: National Semiconductor Corporation
    Inventors: Atul Dhablania, Jeffrey A. Lohman
  • Patent number: 6405232
    Abstract: For use in a processor having a floating point unit (FPU) capable of managing denormalized numbers in floating point notation, logic circuitry for, and a method of adding or subtracting two floating point numbers. In one embodiment, the logic circuitry includes: (1) an adder that receives the two floating point numbers and, based on a received instruction, adds or subtracts the two floating point numbers to yield a denormal sum or difference thereof, (2) a leading bit predictor that receives the two floating point numbers and performs logic operations thereon to yield predictive shift data denoting an extent to which the denormal sum or difference is required to be shifted to normalize the denormal sum or difference, the predictive shift data subject to being erroneous and (3) predictor corrector logic that receives the two floating point numbers and performs logic operations thereon to yield shift compensation data denoting an extent to which the predictive shift is erroneous.
    Type: Grant
    Filed: August 19, 1999
    Date of Patent: June 11, 2002
    Assignee: National Semiconductor Corporation
    Inventors: Daniel W. Green, Atul Dhablania, Jeffrey A. Lohman, Bang Nguyen
  • Patent number: 5745721
    Abstract: A scalar/vector processor capable of concurrent scaler and vector operations includes scalar resources to process scalar instructions, and vector resources adapted to be operated concurrently with the scalar resources and with one another to process vector instructions. The scalar resources include scalar registers, and the vector resources include vector registers. Decoding means decodes each of a number of address fields. Each field represents a register address to access alternatively one of the scalar registers or one of the vector registers depending on a value of the register address being above or below a selected moveable address value within a range of addresses encompassed by the address field.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: April 28, 1998
    Assignee: Cray Research, Inc.
    Inventors: Douglas R. Beard, Andrew E. Phelps, Michael A. Woodmansee, Richard G. Blewett, Jeffrey A. Lohman, Alexander A. Silbey, George A. Spix, Frederick J. Simmons, Don A. Van Dyke
  • Patent number: 5717881
    Abstract: An improved high performance hardwired supercomputer data processing apparatus includes instruction means adpated to issue one and two parcel instructions. Instruction fetch means provides an instruction stream of two parcel items in sequence. Instruction decode means is responsive to each two parcel item for determining in one clock cycle whether the two parcel item is a single two parcel instruction or two one parcel instructions, for issuing each two parcel instruction for execution during the one clock cycle, and for issuing one then the other of the two one parcel instructions for execution in sequence during the one clock cycle and the next succeeding clock cycle.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: February 10, 1998
    Assignee: Cray Research, Inc.
    Inventors: Douglas R. Beard, Andrew E. Phelps, Michael A. Woodmansee, Richard G. Blewett, Jeffrey A. Lohman, Alexander A. Silbey, George A. Spix, Frederick J. Simmons, Don A. Van Dyke
  • Patent number: 5706490
    Abstract: A delayed branch mechanism maintains the flow of an instruction pipeline in a scalar/vector processor having an instruction cache and including instruction fetch means, a program counter, and instruction decode/issue means coupled to the instruction cache by means of the instruction pipeline. Conditional branch instructions are rated as likely conditional branch instructions or unlikely conditional branch instructions based on a probability that their branch conditions will be met. A number of pipeline clock periods required for testing the branch conditions are determined. The likely conditional branch instructions are issued and executed including transferring a branch-to-address to the program counter during the number of pipeline clock periods irrespective of a successful meeting of the branch conditions. A number of useful instructions sufficient to issue within the number of pipeline clock periods are placed into the instruction stream following the likely conditional branch instructions.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: January 6, 1998
    Assignee: Cray Research, Inc.
    Inventors: Douglas R. Beard, Andrew E. Phelps, Michael A. Woodmansee, Richard G. Blewett, Jeffrey A. Lohman, Alexander A. Silbey, George A. Spix, Frederick J. Simmons, Don A. Van Dyke
  • Patent number: 5659706
    Abstract: The present invention is an improved high performance scalar/vector processor. In the preferred embodiment, the scalar/vector processor is used in a multiprocessor system. The scalar/vector processor is comprised of a scalar processor for operating on scalar and logical instructions, including a plurality of independent functional units operably connected to the scalar processor, a vector processor for operating on vector instructions, including a plurality of independent functional units operably connected to the vector processor, and an instruction control mechanism for fetching both the scalar and vector instructions from an instruction cache and controlling the operation of those instructions in both the scalar and vector processor. The instruction control mechanism is designed to enhance the performance of the scalar/vector processor by keeping a multiplicity of pipelines substantially filled with a minimum number of gaps.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: August 19, 1997
    Assignee: Cray Research, Inc.
    Inventors: Douglas R. Beard, Andrew E. Phelps, Michael A. Woodmansee, Richard G. Blewett, Jeffrey A. Lohman, Alexander A. Silbey, George A. Spix, Frederick J. Simmons, Don A. Van Dyke
  • Patent number: 5640524
    Abstract: A vector processing system includes a main memory, vector registers, vector resources for accessing memory to transfer vector data between main memory and the vector registers and to perform operations on the vector data. Data words stored in non-consecutive address locations of a segment of main memory are accessed for processing. Offset address values of a number of the data words are stored in consecutive elements of a first vector register. A vector gather instruction is executed which adds each offset address value to a base address value to calculate main memory addresses representing main memory storage locations of the data words, retrieves the data words from the main memory, and stores the data words in consecutive elements of a second vector register in an order corresponding to that in which the offset address values are stored in the first vector register.
    Type: Grant
    Filed: February 28, 1995
    Date of Patent: June 17, 1997
    Assignee: Cray Research, Inc.
    Inventors: Douglas R. Beard, Andrew E. Phelps, Michael A. Woodmansee, Richard G. Blewett, Jeffrey A. Lohman, Alexander A. Silbey, George A. Spix, Frederick J. Simmons, Don A. Van Dyke
  • Patent number: 5623650
    Abstract: A sequence of conditional vector IF statements is processed by employing a mask register and a condition register. Each conditional vector IF statement is typically performed on two vector registers, each having vector elements. A first conditional vector IF statement in the sequence is processed for those vector elements corresponding to set bits in the mask register. Bits are set in the condition register to reflect those vector elements which correspond to the set bits in the mask register for which the conditional vector IF statement is satisfied. The contents of the condition register are then moved into the mask register. A next conditional vector IF statement in the sequence is then processed for those vector elements corresponding to the new set bits in the mask register. Bits are then set in the condition register to reflect those vector elements which correspond to the new set bits in the mask register for which the conditional vector IF statement is satisfied.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: April 22, 1997
    Assignee: Cray Research, Inc.
    Inventors: Douglas R. Beard, Andrew E. Phelps, Michael A. Woodmansee, Richard G. Blewett, Jeffrey A. Lohman, Alexander A. Silbey, George A. Spix, Frederick J. Simmons, Don A. Van Dyke
  • Patent number: 5598547
    Abstract: A vector processor includes functional unit paths, each having an input and an output, and with at least one functional unit path including a plurality of pipelined functional elements coupled to the respective path input and output in parallel. The functional elements have different pipeline lengths to complete processing of operands applied to the path input. Program instruction initiation means responds to a first instruction to initiate processing of first operand data in a first of the functional elements, and responds to a second instruction to initiate the processing of second operand data in a second of the functional elements dependent upon completion of the first instruction but only if the second functional element has a pipeline length equal to or greater than the pipeline length of the first functional element.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: January 28, 1997
    Assignee: Cray Research, Inc.
    Inventors: Douglas R. Beard, Andrew E. Phelps, Michael A. Woodmansee, Richard G. Blewett, Jeffrey A. Lohman, Alexander A. Silbey, George A. Spix, Frederick J. Simmons, Don A. Van Dyke
  • Patent number: 5544337
    Abstract: The present invention is an improved high performance scalar/vector processor. In the preferred embodiment, the scalar/vector processor is used in a multiprocessor system. The scalar/vector processor is comprised of a scalar processor for operating on scalar and logical instructions, including a plurality of independent functional units operably connected to the scalar processor, a vector processor for operating on vector instructions, including a plurality of independent functional units operably connected to the vector processor, and an instruction control mechanism for fetching both the scalar and vector instructions from an instruction cache and controlling the operation of those instructions in both the scalar and vector processor. The instruction control mechanism is designed to enhance the performance of the scalar/vector processor by keeping a multiplicity of pipelines substantially filled with a minimum number of gaps.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: August 6, 1996
    Assignee: Cray Research, Inc.
    Inventors: Douglas R. Beard, Andrew E. Phelps, Michael A. Woodmansee, Richard G. Blewett, Jeffrey A. Lohman, Alexander A. Silbey, George A. Spix, Frederick J. Simmons, Don A. Van Dyke