Patents by Inventor Andrew Evan Gruber

Andrew Evan Gruber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10133572
    Abstract: A SIMD processor may be configured to determine one or more active threads from a plurality of threads, select one active thread from the one or more active threads, and perform a divergent operation on the selected active thread. The divergent operation may be a serial operation.
    Type: Grant
    Filed: May 2, 2014
    Date of Patent: November 20, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Andrew Evan Gruber, Lin Chen, Yun Du, Alexei Vladimirovich Bourd
  • Patent number: 10115175
    Abstract: A method for processing data in a graphics processing unit including receiving an indication that all threads of a warp in a graphics processing unit (GPU) are to execute a same branch in a first set of instructions, storing one or more predicate bits in a memory as a single set of predicate bits, wherein the single set of predicate bits applies to all of the threads in the warp, and executing a portion of the first set of instructions in accordance with the single set of predicate bits. Executing the first set of instructions may include executing the first set of instruction in accordance with the single set of predicate bits using a single instruction, multiple data (SIMD) processing core and/or executing the first set of instruction in accordance with the single set of predicate bits using a scalar processing unit.
    Type: Grant
    Filed: February 19, 2016
    Date of Patent: October 30, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Andrew Evan Gruber, Pramod Vasant Argade, Jing Wu
  • Patent number: 10089708
    Abstract: A texture unit of a graphics processing unit (GPU) may receive a texture data. The texture unit may receive the texture data from the memory. The texture unit may also multiply, by a multiplier circuit of the texture unit, the texture data by at least one constant, where the constant is not associated with a filtering operation, and where the texture data comprises at least one texel. The texture unit may also output, by the texture unit, a result of multiplying the texture data by the at least one constant.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: October 2, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Andrew Evan Gruber, Lin Chen, Liang Li, Chunhui Mei
  • Patent number: 10062139
    Abstract: This disclosure describes examples of using two vertex shaders each one during different graphics processing passes in a binning architecture for graphics processing. A first vertex shader processes subset of attributes of a vertex in a binning pass, where the subset of attributes include those that contribute to visibility determination and attributes that may benefit from being processed with a vertex shader that provides functional flexibility. A second, different vertex shader processes another subset of attributes of the vertex in the rendering pass.
    Type: Grant
    Filed: July 25, 2016
    Date of Patent: August 28, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Maxim Kazakov, Andrew Evan Gruber
  • Publication number: 20180232846
    Abstract: A GPU may be configured to detect and nullify unnecessary instructions. Nullifying unnecessary instructions include overwriting a detected unnecessary instruction with a no operation (NOP) instruction. In another example, nullifying unnecessary instructions may include writing a value to a 1-bit instruction memory. Each bit of the 1-bit instruction memory may be associated with a particular instruction of the draw call. If the 1-bit instruction memory has a true value (e.g., 1), the GPU is configured to not execute the particular instruction.
    Type: Application
    Filed: February 14, 2017
    Publication date: August 16, 2018
    Inventors: Andrew Evan Gruber, Lin Chen
  • Publication number: 20180165789
    Abstract: Techniques are described in which a device is configured to retrieve a metadata buffer for rendering a sub-frame of a set of sub-frames for a frame. A data block of a data buffer is configured to store image data for rendering the sub-frame. In response to determining, based on the metadata buffer for rendering the sub-frame, that the sub-frame includes a color pattern, fixed color value, or combination thereof, the device refrains from retrieving the image data from the data block of the data buffer and determines the image data for rendering the sub-frame based on the metadata buffer.
    Type: Application
    Filed: December 13, 2016
    Publication date: June 14, 2018
    Inventors: Andrew Evan Gruber, Serag GadelRab, Zhenbiao Ma, Meghal Varia, Tao Wang, Tom Longo, Mark Sternberg, Paul Chow
  • Publication number: 20180040095
    Abstract: This disclosure describes techniques for compressing a graphical state object. In one example, a central processing unit may be configured to receive, for output to the GPU, a set of instructions to render a scene. Responsive to receiving the set of instructions to render the scene, the central processing unit may be further configured to determine whether the set of instructions includes a state object that is registered as corresponding to an identifier. Responsive to determining that the set of instructions includes the state object that is registered as corresponding to the identifier, the central processing unit may be further configured to output, to the GPU, the identifier that is registered as corresponding to the state object.
    Type: Application
    Filed: August 2, 2016
    Publication date: February 8, 2018
    Inventors: Avinash Seetharamaiah, Christopher Paul Frascati, Jonnala Gadda Nagendra Kumar, Andrew Evan Gruber, Colin Christopher Sharp, Eric Demers
  • Publication number: 20180025463
    Abstract: This disclosure describes examples of using two vertex shaders each one during different graphics processing passes in a binning architecture for graphics processing. A first vertex shader processes subset of attributes of a vertex in a binning pass, where the subset of attributes include those that contribute to visibility determination and attributes that may benefit from being processed with a vertex shader that provides functional flexibility. A second, different vertex shader processes another subset of attributes of the vertex in the rendering pass.
    Type: Application
    Filed: July 25, 2016
    Publication date: January 25, 2018
    Inventors: Maxim Kazakov, Andrew Evan Gruber
  • Patent number: 9824458
    Abstract: A graphics processing unit (GPU) may determine a workload of a fragment shader program that executes on the GPU. The GPU may compare the workload of the fragment shader program to a threshold. In response to determining that the workload of the fragment shader program is lower than a specified threshold, the fragment shader program may process one or more fragments without the GPU performing early depth testing of the one or more fragments before the processing by the fragment shader program. The GPU may perform, after processing by the fragment shader program, late depth testing of the one or more fragments to result in one or more non-occluded fragments. The GPU may write pixel values for the one or more non-occluded fragments into a frame buffer.
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: November 21, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Shambhoo Khandelwal, Yang Xia, Xuefeng Tang, Jian Liang, Tao Wang, Andrew Evan Gruber, Eric Demers
  • Patent number: 9818170
    Abstract: This disclosure describes techniques for processing unaligned block transfer (BLT) commands. The techniques of this disclosure may involve converting an unaligned BLT command into multiple aligned BLT commands, where the multiple aligned BLT commands may collectively produce the same resulting memory state as that which would have been produced by the unaligned BLT command. The techniques of this disclosure may allow the benefits of relatively low-power GPU-accelerated BLT processing may be achieved for unaligned BLT commands without requiring a CPU to pre-process and/or post-process the underlying unaligned surfaces. In this way, the performance and/or power consumption associated with processing unaligned BLT commands in an alignment-constrained GPU-based system may be improved.
    Type: Grant
    Filed: December 10, 2014
    Date of Patent: November 14, 2017
    Assignee: QUALCOMM Incorporated
    Inventor: Andrew Evan Gruber
  • Publication number: 20170316540
    Abstract: A texture unit of a graphics processing unit (GPU) may receive a texture data. The texture unit may receive the texture data from the memory. The texture unit may also multiply, by a multiplier circuit of the texture unit, the texture data by at least one constant, where the constant is not associated with a filtering operation, and where the texture data comprises at least one texel. The texture unit may also output, by the texture unit, a result of multiplying the texture data by the at least one constant.
    Type: Application
    Filed: April 28, 2016
    Publication date: November 2, 2017
    Inventors: Andrew Evan Gruber, Lin Chen, Liang Li, Chunhui Mei
  • Patent number: 9799094
    Abstract: A method for processing data in a graphics processing unit (GPU) including receiving an instance identifier for an instance and a shader program comprising a preamble code block and a main shader code block, assigning, the instance identifier to a general purpose register at wave creation, allocating address space within the constant memory for instance uniforms, and determining the preamble code block has not been executed and the wave is a first wave of the instance to be executed, based on determining the preamble code block has not been executed and the wave is the first wave to be executed, executing the preamble code block to store the plurality of instance uniforms in the constant memory and based, at least in part, on executing the preamble code block, executing the wave of the plurality of waves using at least one of the plurality of instance constants stored inconstant memory.
    Type: Grant
    Filed: May 23, 2016
    Date of Patent: October 24, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Lin Chen, Richard Hammerstone, Jiaji Liu, Chihong Zhang, Andrew Evan Gruber, Yun Du
  • Patent number: 9799089
    Abstract: A method for processing data in a graphics processing unit including receiving a code block of instructions common to a plurality of groups of threads of a shader, executing the code block of instructions common to the plurality of groups of threads of the shader creating a result by a first group of threads of the plurality of groups of threads, storing the result of the code block of instructions common to the plurality of groups of threads of the shader in on-chip random access memory (RAM), the on-chip RAM accessible by each of the plurality of groups of threads, and upon a determination that storing the result of the code block of instructions common to the plurality of groups of threads of the shader has completed, returning the result of the code block of instructions common to the plurality of groups of threads of the shader from on-chip RAM.
    Type: Grant
    Filed: May 23, 2016
    Date of Patent: October 24, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Lin Chen, Yun Du, Andrew Evan Gruber, Guofang Jiao, Chun Yu, David Rigel Garcia Garcia
  • Publication number: 20170293995
    Abstract: A graphics processing unit (GPU) may rasterize a primitive into a plurality of samples, wherein vertices of the primitive are associated with VRS parameters. The GPU may determine a VRS quality group that comprises one or more sub regions of the plurality of samples based at least in part on the VRS parameters. The GPU may fragment shade a VRS tile that represents the VRS quality group, wherein the VRS tile comprises fewer samples than the VRS quality group. The GPU may amplify the stored VRS tile into shaded fragments that correspond to the VRS quality group.
    Type: Application
    Filed: February 16, 2017
    Publication date: October 12, 2017
    Inventors: Skyler Jonathon Saleh, Vineet Goel, Maurice Franklin Ribble, Andrew Evan Gruber
  • Patent number: 9747104
    Abstract: In one example, a method includes responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: August 29, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Lin Chen, Yun Du, Sumesh Udayakumaran, Chihong Zhang, Andrew Evan Gruber
  • Publication number: 20170243320
    Abstract: A method for processing data in a graphics processing unit including receiving an indication that all threads of a warp in a graphics processing unit (GPU) are to execute a same branch in a first set of instructions, storing one or more predicate bits in a memory as a single set of predicate bits, wherein the single set of predicate bits applies to all of the threads in the warp, and executing a portion of the first set of instructions in accordance with the single set of predicate bits. Executing the first set of instructions may include executing the first set of instruction in accordance with the single set of predicate bits using a single instruction, multiple data (SIMD) processing core and/or executing the first set of instruction in accordance with the single set of predicate bits using a scalar processing unit.
    Type: Application
    Filed: February 19, 2016
    Publication date: August 24, 2017
    Inventors: Andrew Evan Gruber, Pramod Vasant Argade, Jing Wu
  • Patent number: 9697580
    Abstract: This disclosure describes an apparatus configured to process graphics data. The apparatus may include a fixed hardware pipeline configured to execute one or more functions on a current set of graphics data. The fixed hardware pipeline may include a plurality of stages including a bypassable portion of the plurality of stages. The apparatus may further include a shortcut circuit configured to route the current set of graphics data around the bypassable portion of the plurality of stages, and a controller positioned before the bypassable portion of the plurality of stages, the controller configured to selectively route the current set of graphics data to one of the shortcut circuit or the bypassable portion of the plurality of stages.
    Type: Grant
    Filed: November 10, 2014
    Date of Patent: July 4, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Liang Li, Andrew Evan Gruber, Guofang Jiao, Zhenyu Qi, Gregory Steve Pitarys, Scott William Nolan
  • Patent number: 9665370
    Abstract: Techniques are described in which an indication is included to indicate a last use of an intermediate value generated as part of determining a final value is not be stored in a general purpose register (GPR). A processing unit avoids storing the intermediate value in the GPR based on the indication because the intermediate value is no longer needed for determining the final value.
    Type: Grant
    Filed: August 19, 2014
    Date of Patent: May 30, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Yun Du, Lin Chen, Andrew Evan Gruber, Chihong Zhang, Chun Yu
  • Patent number: 9645792
    Abstract: At least one processor may emulate a fused multiply-add operation for a first operand, a second operand, and a third operand. The at least one processor may determine an intermediate value based at least in part on multiplying the first operand with the second operand, determine at least one of an upper intermediate value or a lower intermediate value, wherein determining the upper intermediate value comprises rounding, towards zero, the intermediate value by a specified number of bits, and wherein determining the lower intermediate value comprises subtracting the intermediate value by the upper intermediate value, determine an upper value and a lower value based at least in part on adding or subtracting the third operand to one of the upper intermediate value or the lower intermediate value, and determine an emulated fused multiply-add result by adding the upper value and the lower value.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: May 9, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Pramod Vasant Argade, Andrew Evan Gruber, Chiente Ho, Stewart Griffin Hall, Lin Chen
  • Patent number: 9633411
    Abstract: Techniques are described for determining whether data of a variable for each of a plurality of graphics items is same. If determined that the data is the same, the techniques store the data in a storage location of a specialized shared general purpose register that is associated with the variable.
    Type: Grant
    Filed: June 26, 2014
    Date of Patent: April 25, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Yun Du, Andrew Evan Gruber, Lin Chen, Guofang Jiao, Chun Yu