Patents by Inventor Michael Alan Fetterman

Michael Alan Fetterman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10699427
    Abstract: Methods and apparatuses are disclosed for reporting texture footprint information. A texture footprint identifies the portion of a texture that will be utilized in rendering a pixel in a scene. The disclosed methods and apparatuses advantageously improve system efficiency in decoupled shading systems by first identifying which texels in a given texture map are needed for subsequently rendering a scene. Therefore, the number of texels that are generated and stored may be reduced to include the identified texels. Texels that are not identified need not be rendered and/or stored.
    Type: Grant
    Filed: August 12, 2019
    Date of Patent: June 30, 2020
    Assignee: NVIDIA Corporation
    Inventors: Yury Uralsky, Henry Packard Moreton, Eric Brian Lum, Jonathan J. Dunaisky, Steven James Heinrich, Stefano Pescador, Shirish Gadre, Michael Alan Fetterman
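The entry above (and its related filings below) describes reporting which texels a sample's filter footprint touches so that a decoupled shading pass only generates and stores the texels it needs. A minimal Python sketch of that idea, with an invented texture_footprint helper and a simple rectangular footprint model standing in for the real hardware query:

```python
# Illustrative sketch (not the patented hardware): report which texels a
# sample's filter footprint would touch, so only those texels need to be
# generated and stored before the final shading pass.
import math

def texture_footprint(u, v, du, dv, tex_w, tex_h):
    """Return the set of (x, y) texel coordinates covered by a footprint
    centered at (u, v) with half-extents (du, dv) in texel units.
    All names and parameters here are hypothetical."""
    x0 = max(0, math.floor(u - du))
    x1 = min(tex_w - 1, math.ceil(u + du))
    y0 = max(0, math.floor(v - dv))
    y1 = min(tex_h - 1, math.ceil(v + dv))
    return {(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}

# First pass: accumulate footprints for every visible sample, then generate
# and store only the texels that were actually requested.
needed = set()
for sample in [(12.3, 40.8, 1.0, 1.0), (12.9, 41.1, 1.0, 1.0)]:
    needed |= texture_footprint(*sample, tex_w=256, tex_h=256)
print(f"{len(needed)} texels need to be generated out of {256 * 256}")
```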
  • Publication number: 20200013174
    Abstract: Methods and apparatuses are disclosed for reporting texture footprint information. A texture footprint identifies the portion of a texture that will be utilized in rendering a pixel in a scene. The disclosed methods and apparatuses advantageously improve system efficiency in decoupled shading systems by first identifying which texels in a given texture map are needed for subsequently rendering a scene. Therefore, the number of texels that are generated and stored may be reduced to include the identified texels. Texels that are not identified need not be rendered and/or stored.
    Type: Application
    Filed: August 12, 2019
    Publication date: January 9, 2020
    Inventors: Yury Uralsky, Henry Packard Moreton, Eric Brian Lum, Jonathan J. Dunaisky, Steven James Heinrich, Stefano Pescador, Shirish Gadre, Michael Alan Fetterman
  • Patent number: 10503513
    Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well as a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include power efficiency with which the instruction can be executed and availability of execution units to support execution of the instruction.
    Type: Grant
    Filed: October 23, 2013
    Date of Patent: December 10, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: David Conrad Tannenbaum, Srinivasan (Vasu) Iyer, Stuart F. Oberman, Ming Y. Siu, Michael Alan Fetterman, John Matthew Burgess, Shirish Gadre
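A small Python sketch of the pipeline-selection idea described in the entry above; the opcode sets, relative power costs, and availability model are assumptions made up for illustration, not the actual instruction partitioning:

```python
# Minimal scheduling sketch, not NVIDIA's implementation: choose between a
# primary and a secondary execution pipeline per instruction. Opcode sets,
# costs, and the availability model are invented for illustration.
PRIMARY_OPS   = {"FFMA", "FADD", "FMUL", "IADD", "MOV"}       # frequent ops
SECONDARY_OPS = {"FFMA", "IADD", "MOV", "POPC", "SHF", "RRO"} # rarer ops
POWER_COST = {"primary": 1.0, "secondary": 0.8}               # relative cost

def pick_pipeline(opcode, primary_busy, secondary_busy):
    """Return 'primary' or 'secondary' (or None to stall this cycle)."""
    candidates = []
    if opcode in PRIMARY_OPS and not primary_busy:
        candidates.append("primary")
    if opcode in SECONDARY_OPS and not secondary_busy:
        candidates.append("secondary")
    if not candidates:
        return None  # no pipeline can accept the instruction right now
    # Among available pipelines that support the op, prefer lower power cost.
    return min(candidates, key=lambda p: POWER_COST[p])

print(pick_pipeline("FFMA", primary_busy=False, secondary_busy=False))  # secondary
print(pick_pipeline("FMUL", primary_busy=False, secondary_busy=True))   # primary
print(pick_pipeline("POPC", primary_busy=False, secondary_busy=True))   # None
```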
  • Patent number: 10489200
    Abstract: One embodiment of the present invention is a computer-implemented method for scheduling a thread group for execution on a processing engine that includes identifying a first thread group included in a first set of thread groups that can be issued for execution on the processing engine, where the first thread group includes one or more threads. The method also includes transferring the first thread group from the first set of thread groups to a second set of thread groups, allocating hardware resources to the first thread group, and selecting the first thread group from the second set of thread groups for execution on the processing engine. One advantage of the disclosed technique is that a scheduler only allocates limited hardware resources to thread groups that are, in fact, ready to be issued for execution, thereby conserving those resources in a manner that is generally more efficient than conventional techniques.
    Type: Grant
    Filed: October 23, 2013
    Date of Patent: November 26, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: Olivier Giroux, Jack Hilaire Choquette, Robert J. Stoll, Xiaogang Qiu, Michael Alan Fetterman
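A hedged Python model of the two-set scheduling idea above: thread groups wait in a candidate set without holding hardware resources and only receive a slot when they move to the eligible set. The resource counts and readiness test are invented:

```python
# Hedged sketch of the two-set thread-group scheduling idea: hardware
# resources are only allocated when a group moves from the "candidate" set
# to the "eligible" set. Slot counts and readiness rules are made up.
from collections import deque

class Scheduler:
    def __init__(self, free_slots=4):
        self.candidates = deque()   # first set: groups not yet resourced
        self.eligible = deque()     # second set: groups holding resources
        self.free_slots = free_slots

    def submit(self, group):
        self.candidates.append(group)

    def promote_ready(self, is_ready):
        """Move ready groups to the eligible set and allocate a slot each."""
        for _ in range(len(self.candidates)):
            group = self.candidates.popleft()
            if is_ready(group) and self.free_slots > 0:
                self.free_slots -= 1           # allocate hardware resources
                self.eligible.append(group)
            else:
                self.candidates.append(group)  # keep waiting, no resources used

    def issue(self):
        """Select an eligible group for execution, or None if none is ready."""
        return self.eligible.popleft() if self.eligible else None

sched = Scheduler()
for g in ["warp0", "warp1", "warp2"]:
    sched.submit(g)
sched.promote_ready(lambda g: g != "warp1")    # warp1 is stalled on a dependency
print(sched.issue(), sched.issue(), sched.issue())  # warp0 warp2 None
```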
  • Patent number: 10424074
    Abstract: Methods and apparatuses are disclosed for reporting texture footprint information. A texture footprint identifies the portion of a texture that will be utilized in rendering a pixel in a scene. The disclosed methods and apparatuses advantageously improve system efficiency in decoupled shading systems by first identifying which texels in a given texture map are needed for subsequently rendering a scene. Therefore, the number of texels that are generated and stored may be reduced to include the identified texels. Texels that are not identified need not be rendered and/or stored.
    Type: Grant
    Filed: July 3, 2018
    Date of Patent: September 24, 2019
    Assignee: NVIDIA Corporation
    Inventors: Yury Uralsky, Henry Packard Moreton, Eric Brian Lum, Jonathan J. Dunaisky, Steven James Heinrich, Stefano Pescador, Shirish Gadre, Michael Alan Fetterman
  • Patent number: 10067768
    Abstract: A method, system, and computer program product for executing divergent threads using a convergence barrier are disclosed. A first instruction in a program is executed by a plurality of threads, where the first instruction, when executed by a particular thread, indicates to a scheduler unit that the thread participates in a convergence barrier. A first path through the program is executed by a first divergent portion of the participating threads and a second path through the program is executed by a second divergent portion of the participating threads. The first divergent portion of the participating threads executes a second instruction in the program and transitions to a blocked state at the convergence barrier. The scheduler unit determines that all of the participating threads are synchronized at the convergence barrier and the convergence barrier is cleared.
    Type: Grant
    Filed: July 13, 2015
    Date of Patent: September 4, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Gregory Frederick Diamos, Richard Craig Johnson, Vinod Grover, Olivier Giroux, Jack H. Choquette, Michael Alan Fetterman, Ajay S. Tirumala, Peter Nelson, Ronny Meir Krashinsky
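A simplified Python model of the convergence-barrier behavior described above; the class and method names are invented, and real hardware tracks participation per warp rather than with Python sets:

```python
# Illustrative model of a convergence barrier for divergent threads; the
# data structures and method names are invented for this sketch.
class ConvergenceBarrier:
    def __init__(self):
        self.participants = set()
        self.arrived = set()

    def add(self, tid):            # thread opts in before divergence
        self.participants.add(tid)

    def wait(self, tid):           # thread blocks here after its path
        self.arrived.add(tid)
        return self.arrived == self.participants   # True: barrier clears

barrier = ConvergenceBarrier()
threads = range(4)
for tid in threads:
    barrier.add(tid)               # "first instruction": register participation

# Divergent execution: even threads take path A, odd threads take path B.
for tid in threads:
    path = "A" if tid % 2 == 0 else "B"
    cleared = barrier.wait(tid)    # transition to blocked state at the barrier
    print(f"thread {tid} finished path {path}; barrier cleared: {cleared}")
```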
  • Patent number: 9612836
    Abstract: A system, method, and computer program product are provided for implementing a software-based scoreboarding mechanism. The method includes the steps of receiving a dependency barrier instruction that includes an immediate value and an identifier corresponding to a first register and, based on a comparison of the immediate value to the value stored in the first register, dispatching a subsequent instruction to at least a first processing unit of two or more processing units.
    Type: Grant
    Filed: February 3, 2014
    Date of Patent: April 4, 2017
    Assignee: NVIDIA Corporation
    Inventors: Robert Ohannessian, Jr., Michael Alan Fetterman, Olivier Giroux, Jack H. Choquette, Xiaogang Qiu, Shirish Gadre, Meenaradchagan Vishnu
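A minimal Python sketch of the software-scoreboard idea above, assuming counter-style scoreboard registers that producers increment and a barrier check that compares the counter against the instruction's immediate value:

```python
# A minimal software-scoreboard sketch under assumed semantics: producers
# increment a counter register when they complete; a dependency-barrier
# instruction holds the next instruction until the counter reaches its
# immediate value.
class Scoreboard:
    def __init__(self):
        self.registers = {}                 # counter registers, e.g. "SB0"

    def complete(self, reg):                # a producer finished one result
        self.registers[reg] = self.registers.get(reg, 0) + 1

    def dependency_barrier(self, reg, immediate):
        """Return True when the dependent instruction may dispatch."""
        return self.registers.get(reg, 0) >= immediate

sb = Scoreboard()
print(sb.dependency_barrier("SB0", 2))   # False: no loads have completed yet
sb.complete("SB0")                        # first outstanding load returns
sb.complete("SB0")                        # second outstanding load returns
print(sb.dependency_barrier("SB0", 2))   # True: safe to dispatch the consumer
```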
  • Patent number: 9600235
    Abstract: One embodiment of the present invention includes a method for performing arithmetic operations on arbitrary width integers using fixed width elements. The method includes receiving a plurality of input operands, segmenting each input operand into multiple sectors, performing a plurality of multiply-add operations based on the multiple sectors to generate a plurality of multiply-add operation results, and combining the multiply-add operation results to generate a final result. One advantage of the disclosed embodiments is that, by using a common fused floating point multiply-add unit to perform arithmetic operations on integers of arbitrary width, the method avoids the area and power penalty of having additional dedicated integer units.
    Type: Grant
    Filed: September 13, 2013
    Date of Patent: March 21, 2017
    Assignee: NVIDIA Corporation
    Inventors: Srinivasan Iyer, Michael Alan Fetterman, David Conrad Tannenbaum
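The arithmetic in the abstract above can be illustrated directly: split wide operands into fixed-width sectors, multiply-add the sectors, and recombine the partial results. The sketch below uses 16-bit limbs in plain Python rather than an FFMA datapath:

```python
# Worked sketch of the wide-integer idea: split operands into fixed-width
# sectors (here 16-bit limbs), form per-sector multiply-adds, and recombine.
# Real hardware would reuse FFMA datapaths; this just mirrors the arithmetic.
LIMB_BITS = 16
LIMB_MASK = (1 << LIMB_BITS) - 1

def to_limbs(x, n):
    return [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(n)]

def wide_mul(a, b, n=4):
    """Multiply two n-limb unsigned integers using limb-sized multiply-adds."""
    a_l, b_l = to_limbs(a, n), to_limbs(b, n)
    acc = [0] * (2 * n)
    for i in range(n):
        for j in range(n):
            acc[i + j] += a_l[i] * b_l[j]      # fixed-width multiply-add
    result, carry = 0, 0
    for k, part in enumerate(acc):             # combine partial results
        part += carry
        result |= (part & LIMB_MASK) << (LIMB_BITS * k)
        carry = part >> LIMB_BITS
    return result

a, b = 0x1234_5678_9ABC_DEF0, 0x0FED_CBA9_8765_4321
assert wide_mul(a, b) == a * b                 # matches native multiplication
print(hex(wide_mul(a, b)))
```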
  • Patent number: 9477480
    Abstract: A system, method, and computer program product are provided for scheduling interruptible batches of instructions for execution by one or more functional units of a processor. The method includes the steps of receiving a batch of instructions that includes a plurality of instructions and dispatching at least one instruction from the batch of instructions to one or more functional units for execution. The method further includes the step of receiving an interrupt request that causes an interrupt routine to be dispatched to the one or more functional units prior to all instructions in the batch of instructions being dispatched to the one or more functional units. When the interrupt request is received, the method further includes the step of storing batch-level resources in a memory to resume execution of the batch of instructions once the interrupt routine has finished execution.
    Type: Grant
    Filed: January 30, 2014
    Date of Patent: October 25, 2016
    Assignee: NVIDIA Corporation
    Inventors: Olivier Giroux, Robert Ohannessian, Jr., Jack H. Choquette, Michael Alan Fetterman
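A rough Python sketch of the interruptible-batch idea above: dispatch stops when an interrupt request arrives, the remaining batch state is saved, and dispatch later resumes from that state. The state layout and function names are assumptions:

```python
# A hedged sketch of interruptible batch dispatch: if an interrupt arrives
# mid-batch, the remaining work and batch-level state are saved so the batch
# can resume after the interrupt routine. The state layout is invented.
def dispatch(instr):
    print(f"dispatch {instr}")

def run_batch(batch, saved_state=None, interrupt_at=None):
    """Dispatch a batch, optionally resuming from saved state or stopping
    early when an interrupt request arrives at index `interrupt_at`."""
    start = saved_state["next_index"] if saved_state else 0
    for i in range(start, len(batch)):
        if interrupt_at is not None and i == interrupt_at:
            # Store batch-level resources/state to memory and yield to the ISR.
            return {"next_index": i, "batch": batch}
        dispatch(batch[i])
    return None                                # batch ran to completion

batch = ["LD R0", "FFMA R1", "ST R1", "BRA next"]
state = run_batch(batch, interrupt_at=2)       # interrupt after two dispatches
print("interrupt routine runs here")
run_batch(batch, saved_state=state)            # resume the remaining instructions
```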
  • Patent number: 9477482
    Abstract: A system, method, and computer program product are provided for implementing a multi-cycle register file bypass mechanism. The method includes the steps of receiving a set of control bits, combining the set of control bits with a set of valid bits associated with previously issued instructions, and enabling a bypass path for each thread based on the set of control bits and the set of valid bits. Each valid bit in the set of valid bits indicates whether execution of an instruction of the previously issued instructions was enabled for a thread in a thread block.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: October 25, 2016
    Assignee: NVIDIA Corporation
    Inventors: Xiaogang Qiu, Ian Chi Yan Kwong, Ming Yiu Siu, Jack H. Choquette, Michael Alan Fetterman
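The per-thread bypass decision described above reduces to combining two bit masks. A small Python illustration, assuming a 32-thread group and one control bit and one valid bit per thread:

```python
# Sketch of the per-thread bypass decision under an assumed encoding: a
# bypass path is enabled for a thread only if the scheduler's control bit is
# set AND the producing instruction was actually executed (valid) for that
# thread. Bit widths and masks here are illustrative (32-thread group).
def bypass_mask(control_bits, valid_bits):
    """Each argument is a 32-bit mask, one bit per thread in the thread block;
    the result marks threads that may read the operand from the bypass path."""
    return control_bits & valid_bits

control = 0xFFFF_FFFF        # scheduler says: result may be forwarded
valid   = 0x0000_FF0F        # producer was predicated off for some threads
mask = bypass_mask(control, valid)
for tid in range(4):
    source = "bypass" if (mask >> tid) & 1 else "register file"
    print(f"thread {tid}: read operand from {source}")
```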
  • Patent number: 9471307
    Abstract: A system and apparatus are provided that include an implementation for decoupled pipelines. The apparatus includes a scheduler configured to issue instructions to one or more functional units and a functional unit coupled to a queue having a number of slots for storing instructions. The instructions issued to the functional unit are stored in the queue until the functional unit is available to process the instructions.
    Type: Grant
    Filed: January 3, 2014
    Date of Patent: October 18, 2016
    Assignee: NVIDIA Corporation
    Inventors: Olivier Giroux, Michael Alan Fetterman, Robert Ohannessian, Jr., Shirish Gadre, Jack H. Choquette, Xiaogang Qiu, Jeffrey Scott Tuckey, Robert James Stoll
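A short Python model of the decoupled-pipeline idea above: the scheduler issues into a fixed-depth queue in front of a functional unit, and the unit drains the queue when it is free. Queue depth and unit names are invented:

```python
# Decoupled-pipeline sketch (assumed model): the scheduler pushes issued
# instructions into a fixed-depth queue in front of each functional unit, and
# the unit drains its queue whenever it is not busy.
from collections import deque

class FunctionalUnit:
    def __init__(self, name, slots=4):
        self.name = name
        self.queue = deque(maxlen=slots)   # issued instructions wait here

    def can_accept(self):
        return len(self.queue) < self.queue.maxlen

    def issue(self, instr):                # scheduler side: enqueue and move on
        assert self.can_accept()
        self.queue.append(instr)

    def tick(self):                        # unit side: process one per cycle
        if self.queue:
            print(f"{self.name} executes {self.queue.popleft()}")

fu = FunctionalUnit("SFU")
for instr in ["RCP R0", "SQRT R1", "SIN R2"]:
    if fu.can_accept():
        fu.issue(instr)                    # scheduler is not stalled by the unit
for _ in range(3):
    fu.tick()
```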
  • Publication number: 20160019066
    Abstract: A method, system, and computer program product for executing divergent threads using a convergence barrier are disclosed. A first instruction in a program is executed by a plurality of threads, where the first instruction, when executed by a particular thread, indicates to a scheduler unit that the thread participates in a convergence barrier. A first path through the program is executed by a first divergent portion of the participating threads and a second path through the program is executed by a second divergent portion of the participating threads. The first divergent portion of the participating threads executes a second instruction in the program and transitions to a blocked state at the convergence barrier. The scheduler unit determines that all of the participating threads are synchronized at the convergence barrier and the convergence barrier is cleared.
    Type: Application
    Filed: July 13, 2015
    Publication date: January 21, 2016
    Inventors: Gregory Frederick Diamos, Richard Craig Johnson, Vinod Grover, Olivier Giroux, Jack H. Choquette, Michael Alan Fetterman, Ajay S. Tirumala, Peter Nelson, Ronny Meir Krashinsky
  • Publication number: 20150220341
    Abstract: A system, method, and computer program product are provided for implementing a software-based scoreboarding mechanism. The method includes the steps of receiving a dependency barrier instruction that includes an immediate value and an identifier corresponding to a first register and, based on a comparison of the immediate value to the value stored in the first register, dispatching a subsequent instruction to at least a first processing unit of two or more processing units.
    Type: Application
    Filed: February 3, 2014
    Publication date: August 6, 2015
    Applicant: NVIDIA Corporation
    Inventors: Robert Ohannessian, JR., Michael Alan Fetterman, Olivier Giroux, Jack H. Choquette, Xiaogang Qiu, Shirish Gadre, Meenaradchagan Vishnu
  • Publication number: 20150212819
    Abstract: A system, method, and computer program product are provided for scheduling interruptible batches of instructions for execution by one or more functional units of a processor. The method includes the steps of receiving a batch of instructions that includes a plurality of instructions and dispatching at least one instruction from the batch of instructions to one or more functional units for execution. The method further includes the step of receiving an interrupt request that causes an interrupt routine to be dispatched to the one or more functional units prior to all instructions in the batch of instructions being dispatched to the one or more functional units. When the interrupt request is received, the method further includes the step of storing batch-level resources in a memory to resume execution of the batch of instructions once the interrupt routine has finished execution.
    Type: Application
    Filed: January 30, 2014
    Publication date: July 30, 2015
    Applicant: NVIDIA Corporation
    Inventors: Olivier Giroux, Robert Ohannessian, JR., Jack H. Choquette, Michael Alan Fetterman
  • Publication number: 20150193272
    Abstract: A system and apparatus are provided that include an implementation for decoupled pipelines. The apparatus includes a scheduler configured to issue instructions to one or more functional units and a functional unit coupled to a queue having a number of slots for storing instructions. The instructions issued to the functional unit are stored in the queue until the functional unit is available to process the instructions.
    Type: Application
    Filed: January 3, 2014
    Publication date: July 9, 2015
    Applicant: NVIDIA Corporation
    Inventors: Olivier Giroux, Michael Alan Fetterman, Robert Ohannessian, JR., Shirish Gadre, Jack H. Choquette, Xiaogang Qiu, Jeffrey Scott Tuckey, Robert James Stoll
  • Publication number: 20150113538
    Abstract: One embodiment of the present invention is a computer-implemented method for scheduling a thread group for execution on a processing engine that includes identifying a first thread group included in a first set of thread groups that can be issued for execution on the processing engine, where the first thread group includes one or more threads. The method also includes transferring the first thread group from the first set of thread groups to a second set of thread groups, allocating hardware resources to the first thread group, and selecting the first thread group from the second set of thread groups for execution on the processing engine. One advantage of the disclosed technique is that a scheduler only allocates limited hardware resources to thread groups that are, in fact, ready to be issued for execution, thereby conserving those resources in a manner that is generally more efficient than conventional techniques.
    Type: Application
    Filed: October 23, 2013
    Publication date: April 23, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Olivier GIROUX, Jack Hilaire CHOQUETTE, Robert J. STOLL, Xiaogang QIU, Michael Alan FETTERMAN
  • Publication number: 20150113254
    Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well as a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include power efficiency with which the instruction can be executed and availability of execution units to support execution of the instruction.
    Type: Application
    Filed: October 23, 2013
    Publication date: April 23, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: David Conrad TANNENBAUM, Srinivasan (Vasu) IYER, Stuart F. OBERMAN, Ming Y. SIU, Michael Alan FETTERMAN, John Matthew BURGESS, Shirish GADRE
  • Publication number: 20150089202
    Abstract: A system, method, and computer program product are provided for implementing a multi-cycle register file bypass mechanism. The method includes the steps of receiving a set of control bits, combining the set of control bits with a set of valid bits associated with previously issued instructions, and enabling a bypass path for each thread based on the set of control bits and the set of valid bits. Each valid bit in the set of valid bits indicates whether execution of an instruction of the previously issued instructions was enabled for a thread in a thread block.
    Type: Application
    Filed: September 26, 2013
    Publication date: March 26, 2015
    Applicant: NVIDIA Corporation
    Inventors: Xiaogang Qiu, Ian Chi Yan Kwong, Ming Yiu Siu, Jack H. Choquette, Michael Alan Fetterman
  • Publication number: 20150081753
    Abstract: One embodiment of the present invention includes a method for performing arithmetic operations on arbitrary width integers using fixed width elements. The method includes receiving a plurality of input operands, segmenting each input operand into multiple sectors, performing a plurality of multiply-add operations based on the multiple sectors to generate a plurality of multiply-add operation results, and combining the multiply-add operation results to generate a final result. One advantage of the disclosed embodiments is that, by using a common fused floating point multiply-add unit to perform arithmetic operations on integers of arbitrary width, the method avoids the area and power penalty of having additional dedicated integer units.
    Type: Application
    Filed: September 13, 2013
    Publication date: March 19, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Srinivasan (Vasu) IYER, Michael Alan FETTERMAN, David Conrad TANNENBAUM
  • Patent number: 5944817
    Abstract: A Branch Target Buffer Circuit in a computer processor that predicts branch instructions within a stream of computer instructions is disclosed. The Branch Target Buffer Circuit uses a Branch Target Buffer Cache that stores branch information about previously executed branch instructions. The branch information stored in the Branch Target Buffer Cache is addressed by the last byte of each branch instruction. When an Instruction Fetch Unit in the computer processor fetches a block of instructions, it sends the Branch Target Buffer Circuit an instruction pointer. Based on the instruction pointer, the Branch Target Buffer Circuit looks in the Branch Target Buffer Cache to see if any of the instructions in the block being fetched is a branch instruction. When the Branch Target Buffer Circuit finds an upcoming branch instruction in the Branch Target Buffer Cache, the Branch Target Buffer Circuit informs the Instruction Fetch Unit about the upcoming branch instruction.
    Type: Grant
    Filed: October 7, 1998
    Date of Patent: August 31, 1999
    Assignee: Intel Corporation
    Inventors: Bradley D. Hoyt, Glenn I. Hinton, David B. Papworth, Ashwani Kumar Gupta, Michael Alan Fetterman, Subramanian Natarajan, Sunil Shenoy, Reynold V. D'Sa
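A software illustration of the branch-target-buffer lookup described above, not the Intel hardware: entries are keyed by the address of a branch's last byte, and a fetch-block query reports any predicted-taken branch inside the block. Block size and the prediction format are assumptions:

```python
# Illustrative branch-target-buffer lookup (a software model, not the Intel
# hardware): entries are indexed by the address of a branch's last byte, and
# a fetch-block query reports any predicted-taken branch inside the block.
class BranchTargetBuffer:
    def __init__(self):
        self.cache = {}   # last-byte address -> (predicted_taken, target)

    def update(self, branch_last_byte_addr, taken, target):
        self.cache[branch_last_byte_addr] = (taken, target)

    def lookup_block(self, fetch_addr, block_size=16):
        """Return (branch_addr, target) for the first predicted-taken branch
        whose last byte falls inside the fetched block, else None."""
        for addr in range(fetch_addr, fetch_addr + block_size):
            entry = self.cache.get(addr)
            if entry and entry[0]:
                return addr, entry[1]
        return None

btb = BranchTargetBuffer()
btb.update(0x1007, taken=True, target=0x2000)   # a previously executed branch
print(btb.lookup_block(0x1000))                  # (0x1007, 0x2000): redirect fetch
print(btb.lookup_block(0x1010))                  # None: fetch falls through
```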