Patents by Inventor Andrew Evan Gruber
Andrew Evan Gruber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20180165789
Abstract: Techniques are described in which a device is configured to retrieve a metadata buffer for rendering a sub-frame of a set of sub-frames for a frame. A data block of a data buffer is configured to store image data for rendering the sub-frame. In response to determining, based on the metadata buffer for rendering the sub-frame, that the sub-frame includes a color pattern, fixed color value, or combination thereof, the device refrains from retrieving the image data from the data block of the data buffer and determines the image data for rendering the sub-frame based on the metadata buffer.
Type: Application
Filed: December 13, 2016
Publication date: June 14, 2018
Inventors: Andrew Evan Gruber, Serag GadelRab, Zhenbiao Ma, Meghal Varia, Tao Wang, Tom Longo, Mark Sternberg, Paul Chow
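As an illustration only (not taken from the filing), the read-avoidance idea in this abstract can be sketched in Python. The names `SubFrameMeta`, `load_sub_frame`, and the `stats` counter are hypothetical, and the sketch assumes the metadata records only a single fixed color rather than a general color pattern.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SubFrameMeta:
    # Hypothetical metadata entry: set when the whole sub-frame is one color.
    fixed_color: Optional[Tuple[int, int, int]] = None


def load_sub_frame(meta, data_buffer, index, pixels_per_sub_frame, stats):
    """Return the pixels of one sub-frame, skipping the data block if possible."""
    if meta.fixed_color is not None:
        # The metadata alone determines the image data; no data-block read.
        return [meta.fixed_color] * pixels_per_sub_frame
    # Otherwise fall back to reading the data block from the data buffer.
    stats["data_reads"] += 1
    return data_buffer[index]
```

In this sketch the saving shows up as a `data_reads` count that does not increase for fixed-color sub-frames.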
-
Publication number: 20180040095
Abstract: This disclosure describes techniques for compressing a graphical state object. In one example, a central processing unit may be configured to receive, for output to the GPU, a set of instructions to render a scene. Responsive to receiving the set of instructions to render the scene, the central processing unit may be further configured to determine whether the set of instructions includes a state object that is registered as corresponding to an identifier. Responsive to determining that the set of instructions includes the state object that is registered as corresponding to the identifier, the central processing unit may be further configured to output, to the GPU, the identifier that is registered as corresponding to the state object.
Type: Application
Filed: August 2, 2016
Publication date: February 8, 2018
Inventors: Avinash Seetharamaiah, Christopher Paul Frascati, Jonnala Gadda Nagendra Kumar, Andrew Evan Gruber, Colin Christopher Sharp, Eric Demers
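A minimal sketch of the identifier substitution this abstract describes, assuming state objects can be represented as hashable tuples. `StateObjectRegistry`, `encode_commands`, and the `"STATE_ID"`/`"RAW"` tags are hypothetical names chosen for illustration, not terms from the filing.

```python
class StateObjectRegistry:
    """Maps registered state objects to small integer identifiers."""

    def __init__(self):
        self._ids = {}

    def register(self, state):
        # Re-registering the same state returns the identifier it already has.
        return self._ids.setdefault(state, len(self._ids))

    def lookup(self, state):
        return self._ids.get(state)


def encode_commands(commands, registry):
    """Replace registered state objects with their identifiers."""
    encoded = []
    for cmd in commands:
        ident = registry.lookup(cmd)
        if ident is not None:
            encoded.append(("STATE_ID", ident))  # send only the identifier
        else:
            encoded.append(("RAW", cmd))  # unregistered: send as-is
    return encoded
```

The output stream is smaller whenever a registered state object is larger than its identifier, which is the compression effect the abstract refers to.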
-
Publication number: 20180025463
Abstract: This disclosure describes examples of using two vertex shaders, each during a different graphics processing pass, in a binning architecture for graphics processing. A first vertex shader processes a subset of attributes of a vertex in a binning pass, where the subset of attributes includes those that contribute to visibility determination and attributes that may benefit from being processed with a vertex shader that provides functional flexibility. A second, different vertex shader processes another subset of attributes of the vertex in the rendering pass.
Type: Application
Filed: July 25, 2016
Publication date: January 25, 2018
Inventors: Maxim Kazakov, Andrew Evan Gruber
-
Patent number: 9824458
Abstract: A graphics processing unit (GPU) may determine a workload of a fragment shader program that executes on the GPU. The GPU may compare the workload of the fragment shader program to a threshold. In response to determining that the workload of the fragment shader program is lower than a specified threshold, the fragment shader program may process one or more fragments without the GPU performing early depth testing of the one or more fragments before the processing by the fragment shader program. The GPU may perform, after processing by the fragment shader program, late depth testing of the one or more fragments to result in one or more non-occluded fragments. The GPU may write pixel values for the one or more non-occluded fragments into a frame buffer.
Type: Grant
Filed: September 23, 2015
Date of Patent: November 21, 2017
Assignee: QUALCOMM Incorporated
Inventors: Shambhoo Khandelwal, Yang Xia, Xuefeng Tang, Jian Liang, Tao Wang, Andrew Evan Gruber, Eric Demers
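The threshold decision in this abstract can be illustrated with a small software model. This is a sketch under stated assumptions, not the patented implementation: `render_fragments`, the `workload` number, and the convention that a smaller depth is closer to the viewer are all assumptions made here.

```python
def render_fragments(fragments, depth_buffer, shader, workload, threshold):
    """fragments: iterable of (x, y, depth). Returns (frame, shader_invocations)."""
    # Cheap shaders (workload below threshold) skip the early depth test.
    use_early_z = not (workload < threshold)
    frame = {}
    invocations = 0
    inf = float("inf")
    for x, y, z in fragments:
        if use_early_z and z >= depth_buffer.get((x, y), inf):
            continue  # early test: occluded fragment is never shaded
        color = shader(x, y, z)
        invocations += 1
        # Late depth test and write; redundant when the early test already passed.
        if z < depth_buffer.get((x, y), inf):
            depth_buffer[(x, y)] = z
            frame[(x, y)] = color
    return frame, invocations
```

Both paths produce the same frame in this model; they differ only in how many fragments reach the shader, which is the trade-off the threshold is meant to balance.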
-
Patent number: 9818170
Abstract: This disclosure describes techniques for processing unaligned block transfer (BLT) commands. The techniques of this disclosure may involve converting an unaligned BLT command into multiple aligned BLT commands, where the multiple aligned BLT commands may collectively produce the same resulting memory state as that which would have been produced by the unaligned BLT command. The techniques of this disclosure may allow the benefits of relatively low-power GPU-accelerated BLT processing to be achieved for unaligned BLT commands without requiring a CPU to pre-process and/or post-process the underlying unaligned surfaces. In this way, the performance and/or power consumption associated with processing unaligned BLT commands in an alignment-constrained GPU-based system may be improved.
Type: Grant
Filed: December 10, 2014
Date of Patent: November 14, 2017
Assignee: QUALCOMM Incorporated
Inventor: Andrew Evan Gruber
-
Publication number: 20170316540
Abstract: A texture unit of a graphics processing unit (GPU) may receive texture data. The texture unit may receive the texture data from the memory. The texture unit may also multiply, by a multiplier circuit of the texture unit, the texture data by at least one constant, where the constant is not associated with a filtering operation, and where the texture data comprises at least one texel. The texture unit may also output, by the texture unit, a result of multiplying the texture data by the at least one constant.
Type: Application
Filed: April 28, 2016
Publication date: November 2, 2017
Inventors: Andrew Evan Gruber, Lin Chen, Liang Li, Chunhui Mei
-
Patent number: 9799094
Abstract: A method for processing data in a graphics processing unit (GPU) including receiving an instance identifier for an instance and a shader program comprising a preamble code block and a main shader code block, assigning the instance identifier to a general purpose register at wave creation, allocating address space within the constant memory for instance uniforms, determining the preamble code block has not been executed and the wave is a first wave of the instance to be executed, based on determining the preamble code block has not been executed and the wave is the first wave to be executed, executing the preamble code block to store the plurality of instance uniforms in the constant memory, and, based at least in part on executing the preamble code block, executing the wave of the plurality of waves using at least one of the plurality of instance constants stored in constant memory.
Type: Grant
Filed: May 23, 2016
Date of Patent: October 24, 2017
Assignee: QUALCOMM Incorporated
Inventors: Lin Chen, Richard Hammerstone, Jiaji Liu, Chihong Zhang, Andrew Evan Gruber, Yun Du
-
Patent number: 9799089
Abstract: A method for processing data in a graphics processing unit including receiving a code block of instructions common to a plurality of groups of threads of a shader, executing the code block of instructions common to the plurality of groups of threads of the shader creating a result by a first group of threads of the plurality of groups of threads, storing the result of the code block of instructions common to the plurality of groups of threads of the shader in on-chip random access memory (RAM), the on-chip RAM accessible by each of the plurality of groups of threads, and upon a determination that storing the result of the code block of instructions common to the plurality of groups of threads of the shader has completed, returning the result of the code block of instructions common to the plurality of groups of threads of the shader from on-chip RAM.
Type: Grant
Filed: May 23, 2016
Date of Patent: October 24, 2017
Assignee: QUALCOMM Incorporated
Inventors: Lin Chen, Yun Du, Andrew Evan Gruber, Guofang Jiao, Chun Yu, David Rigel Garcia Garcia
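A rough sequential model of the "execute once, share the result" idea in this abstract. It assumes thread groups run one after another, which real hardware does not guarantee; `SharedRam`, `run_group`, and the `counters` dictionary are hypothetical names used only for this sketch.

```python
class SharedRam:
    """Stand-in for on-chip RAM visible to every thread group."""

    def __init__(self):
        self.result = None
        self.ready = False  # set once the store of the result has completed


def run_group(group_id, shared, common_block, main, counters):
    if not shared.ready:
        # Only the first group executes the common code block.
        shared.result = common_block()
        counters["common_runs"] += 1
        shared.ready = True
    # Every group, including the first, reads the result back from shared RAM.
    return main(group_id, shared.result)
```

The `ready` flag plays the role of the "determination that storing the result has completed" before the result is returned.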
-
Publication number: 20170293995
Abstract: A graphics processing unit (GPU) may rasterize a primitive into a plurality of samples, wherein vertices of the primitive are associated with VRS parameters. The GPU may determine a VRS quality group that comprises one or more sub regions of the plurality of samples based at least in part on the VRS parameters. The GPU may fragment shade a VRS tile that represents the VRS quality group, wherein the VRS tile comprises fewer samples than the VRS quality group. The GPU may amplify the stored VRS tile into shaded fragments that correspond to the VRS quality group.
Type: Application
Filed: February 16, 2017
Publication date: October 12, 2017
Inventors: Skyler Jonathon Saleh, Vineet Goel, Maurice Franklin Ribble, Andrew Evan Gruber
-
Patent number: 9747104
Abstract: In one example, a method includes, responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic unit and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.
Type: Grant
Filed: May 12, 2014
Date of Patent: August 29, 2017
Assignee: QUALCOMM Incorporated
Inventors: Lin Chen, Yun Du, Sumesh Udayakumaran, Chihong Zhang, Andrew Evan Gruber
-
Publication number: 20170243320
Abstract: A method for processing data in a graphics processing unit including receiving an indication that all threads of a warp in a graphics processing unit (GPU) are to execute a same branch in a first set of instructions, storing one or more predicate bits in a memory as a single set of predicate bits, wherein the single set of predicate bits applies to all of the threads in the warp, and executing a portion of the first set of instructions in accordance with the single set of predicate bits. Executing the first set of instructions may include executing the first set of instructions in accordance with the single set of predicate bits using a single instruction, multiple data (SIMD) processing core and/or executing the first set of instructions in accordance with the single set of predicate bits using a scalar processing unit.
Type: Application
Filed: February 19, 2016
Publication date: August 24, 2017
Inventors: Andrew Evan Gruber, Pramod Vasant Argade, Jing Wu
-
Patent number: 9697580
Abstract: This disclosure describes an apparatus configured to process graphics data. The apparatus may include a fixed hardware pipeline configured to execute one or more functions on a current set of graphics data. The fixed hardware pipeline may include a plurality of stages including a bypassable portion of the plurality of stages. The apparatus may further include a shortcut circuit configured to route the current set of graphics data around the bypassable portion of the plurality of stages, and a controller positioned before the bypassable portion of the plurality of stages, the controller configured to selectively route the current set of graphics data to one of the shortcut circuit or the bypassable portion of the plurality of stages.
Type: Grant
Filed: November 10, 2014
Date of Patent: July 4, 2017
Assignee: QUALCOMM Incorporated
Inventors: Liang Li, Andrew Evan Gruber, Guofang Jiao, Zhenyu Qi, Gregory Steve Pitarys, Scott William Nolan
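The routing decision in this abstract can be modeled in software as a function that either runs or skips a contiguous slice of stages. This is only a sketch: `run_pipeline`, `bypass_range`, and the `can_bypass` predicate are hypothetical, and a real controller would be a hardware circuit rather than a Python callable.

```python
def run_pipeline(data, stages, bypass_range, can_bypass):
    """Run `stages` in order, optionally skipping stages[start:end].

    Returns (result, indices_of_executed_stages).
    """
    start, end = bypass_range
    executed = []
    # Stages ahead of the controller always run.
    for i in range(start):
        data = stages[i](data)
        executed.append(i)
    # The controller sits just before the bypassable portion and picks a route.
    if not can_bypass(data):
        for i in range(start, end):
            data = stages[i](data)
            executed.append(i)
    # The shortcut path rejoins the pipeline here.
    for i in range(end, len(stages)):
        data = stages[i](data)
        executed.append(i)
    return data, executed
```

The bypass is only safe when the skipped stages would not have changed the data, so `can_bypass` has to encode that condition for the stages in question.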
-
Patent number: 9665370
Abstract: Techniques are described in which an indication is included to indicate that a last use of an intermediate value, generated as part of determining a final value, is not to be stored in a general purpose register (GPR). A processing unit avoids storing the intermediate value in the GPR based on the indication because the intermediate value is no longer needed for determining the final value.
Type: Grant
Filed: August 19, 2014
Date of Patent: May 30, 2017
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Lin Chen, Andrew Evan Gruber, Chihong Zhang, Chun Yu
-
Patent number: 9645792
Abstract: At least one processor may emulate a fused multiply-add operation for a first operand, a second operand, and a third operand. The at least one processor may determine an intermediate value based at least in part on multiplying the first operand with the second operand, determine at least one of an upper intermediate value or a lower intermediate value, wherein determining the upper intermediate value comprises rounding, towards zero, the intermediate value by a specified number of bits, and wherein determining the lower intermediate value comprises subtracting the upper intermediate value from the intermediate value, determine an upper value and a lower value based at least in part on adding or subtracting the third operand to one of the upper intermediate value or the lower intermediate value, and determine an emulated fused multiply-add result by adding the upper value and the lower value.
Type: Grant
Filed: August 18, 2014
Date of Patent: May 9, 2017
Assignee: QUALCOMM Incorporated
Inventors: Pramod Vasant Argade, Andrew Evan Gruber, Chiente Ho, Stewart Griffin Hall, Lin Chen
-
Patent number: 9633411
Abstract: Techniques are described for determining whether data of a variable for each of a plurality of graphics items is the same. If it is determined that the data is the same, the techniques store the data in a storage location of a specialized shared general purpose register that is associated with the variable.
Type: Grant
Filed: June 26, 2014
Date of Patent: April 25, 2017
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Andrew Evan Gruber, Lin Chen, Guofang Jiao, Chun Yu
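A small illustration of the uniformity check in this abstract, assuming the per-item data of one variable is available as a Python list. `place_variable` and the `"shared"`/`"per_item"` labels are hypothetical; the sketch says nothing about how the hardware register is actually organized.

```python
def place_variable(values):
    """Decide where one variable's data goes for a group of graphics items.

    Returns ("shared", value) when every item holds the same data,
    otherwise ("per_item", copy_of_values).
    """
    if values and all(v == values[0] for v in values):
        # One slot in the shared register serves every item.
        return ("shared", values[0])
    # Data differs between items, so each item keeps its own copy.
    return ("per_item", list(values))
```

For example, `place_variable([1.0, 1.0, 1.0, 1.0])` stores a single value, while `place_variable([1.0, 2.0])` keeps per-item storage.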
-
Publication number: 20170084043
Abstract: A graphics processing unit (GPU) may determine a workload of a fragment shader program that executes on the GPU. The GPU may compare the workload of the fragment shader program to a threshold. In response to determining that the workload of the fragment shader program is lower than a specified threshold, the fragment shader program may process one or more fragments without the GPU performing early depth testing of the one or more fragments before the processing by the fragment shader program. The GPU may perform, after processing by the fragment shader program, late depth testing of the one or more fragments to result in one or more non-occluded fragments. The GPU may write pixel values for the one or more non-occluded fragments into a frame buffer.
Type: Application
Filed: September 23, 2015
Publication date: March 23, 2017
Inventors: Shambhoo Khandelwal, Yang Xia, Xuefeng Tang, Jian Liang, Tao Wang, Andrew Evan Gruber, Eric Demers
-
Publication number: 20170083997
Abstract: A computing device may allocate a plurality of blocks in the memory, wherein each of the plurality of blocks is of a uniform fixed size in the memory. The computing device may further store a plurality of bandwidth-compressed graphics data into the respective plurality of blocks in the memory, wherein one or more of the plurality of bandwidth-compressed graphics data each has a size that is smaller than the fixed size. The computing device may further store data associated with the plurality of bandwidth-compressed graphics data into unused space of one or more of the plurality of blocks that contains the respective one or more of the plurality of bandwidth-compressed graphics data.
Type: Application
Filed: September 17, 2015
Publication date: March 23, 2017
Inventors: Andrew Evan Gruber, Rexford Alan Hill, Shambhoo Khandelwal
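The block layout in this abstract can be sketched with byte strings. The 16-byte `BLOCK_SIZE`, the function name `pack_block`, and the `extra` payload are assumptions made for illustration; the filing does not specify them in the text quoted here.

```python
from typing import Tuple

BLOCK_SIZE = 16  # assumed fixed block size in bytes, for illustration only


def pack_block(compressed: bytes, extra: bytes) -> Tuple[bytes, int]:
    """Place compressed data in a fixed-size block and fill slack with `extra`.

    Returns (block, number_of_extra_bytes_stored).
    """
    if len(compressed) > BLOCK_SIZE:
        raise ValueError("compressed data does not fit in one block")
    free = BLOCK_SIZE - len(compressed)
    stored = extra[:free]  # only as much associated data as the slack allows
    padding = bytes(free - len(stored))  # zero-fill whatever is still unused
    return compressed + stored + padding, len(stored)
```

Because blocks keep a uniform size, the associated data costs no additional memory in this model; it only occupies space that compression left empty.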
-
Patent number: 9569811
Abstract: In an example, a method for rendering graphics data includes rendering pixels of a first bin of a plurality of bins, wherein the pixels of the first bin are associated with a first portion of an image, and rendering, to the first bin, one or more pixels that are located outside the first portion of the image and associated with a second, different bin of the plurality of bins. The method also includes rendering the one or more pixels associated with the second bin to the second bin, such that the one or more pixels are rendered to both the first bin and the second bin.
Type: Grant
Filed: June 26, 2014
Date of Patent: February 14, 2017
Assignee: QUALCOMM Incorporated
Inventors: Andrew Evan Gruber, Tao Wang, Chunhui Mei, Gang Zhong, Feng Ge
-
Patent number: 9489313
Abstract: The present disclosure provides for systems and methods to process a non-resident page that may include attempting to access the non-resident page, an address for the non-resident page pointing to a memory page containing default values, determining that the non-resident page should not cause a page fault based on an indicator indicating that a particular non-resident page should not generate a page fault, returning an indication that a memory read did not translate and returning the default value when the access of the non-resident page is a read and the non-resident page should not cause a page fault. Another example may discontinue a write when the access of the non-resident page is a write and the non-resident page should not cause a page fault.
Type: Grant
Filed: September 24, 2013
Date of Patent: November 8, 2016
Assignee: QUALCOMM Incorporated
Inventors: David A. Gotwalt, Thomas Edwin Frisinger, Andrew Evan Gruber, Eric Demers, Colin Christopher Sharp
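A toy model of the read and write behavior described in this abstract. `SparseMemory`, `PageFault`, and the `no_fault` set are hypothetical names; the sketch assumes a single default value rather than a full page of defaults.

```python
class PageFault(Exception):
    """Raised for a non-resident page that is not marked as fault-free."""


class SparseMemory:
    def __init__(self, default=0):
        self.pages = {}        # resident pages: page number -> list of values
        self.no_fault = set()  # non-resident pages flagged "do not fault"
        self.default = default

    def read(self, page, offset):
        """Return (value, translated)."""
        if page in self.pages:
            return self.pages[page][offset], True
        if page in self.no_fault:
            # Report that the read did not translate and supply the default.
            return self.default, False
        raise PageFault(page)

    def write(self, page, offset, value):
        """Return True if the write landed, False if it was discontinued."""
        if page in self.pages:
            self.pages[page][offset] = value
            return True
        if page in self.no_fault:
            return False  # write is dropped without faulting
        raise PageFault(page)
```

A caller can use the second element of the `read` result to tell a genuine default value apart from data that was never resident.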
-
Patent number: 9483861
Abstract: This disclosure describes techniques for using bounding regions to perform tile-based rendering with a graphics processing unit (GPU) that supports an on-chip, tessellation-enabled graphics rendering pipeline. Instead of generating binning data based on rasterized versions of the actual primitives to be rendered, the techniques of this disclosure may generate binning data based on a bounding region that encompasses one or more of the primitives to be rendered. Moreover, the binning data may be generated based on data that is generated by at least one tessellation processing stage of an on-chip, tessellation-enabled graphics rendering pipeline that is implemented by the GPU. The techniques of this disclosure may, in some examples, be used to improve the performance of an on-chip, tessellation-enabled GPU when performing tile-based rendering without sacrificing the quality of the resulting rendered image.
Type: Grant
Filed: March 15, 2013
Date of Patent: November 1, 2016
Assignee: QUALCOMM Incorporated
Inventors: Christopher Paul Frascati, Avinash Seetharamaiah, Andrew Evan Gruber
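Bounding-region binning can be illustrated with an axis-aligned bounding box over screen-space vertices. This is a sketch under assumptions: the function name `bins_for_bounding_box`, square tiles, and the use of an axis-aligned box (rather than whatever bounding region the patent actually claims) are choices made here for illustration.

```python
def bins_for_bounding_box(primitives, tile_size, grid_w, grid_h):
    """Return the set of (tile_x, tile_y) bins touched by one bounding box.

    primitives: list of primitives, each a list of (x, y) screen-space vertices.
    """
    points = [p for prim in primitives for p in prim]
    if not points:
        return set()
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Clamp the tile range of the box to the tile grid.
    tx0 = max(0, int(min(xs) // tile_size))
    tx1 = min(grid_w - 1, int(max(xs) // tile_size))
    ty0 = max(0, int(min(ys) // tile_size))
    ty1 = min(grid_h - 1, int(max(ys) // tile_size))
    return {(tx, ty) for tx in range(tx0, tx1 + 1) for ty in range(ty0, ty1 + 1)}
```

The result is conservative: for the triangle `[(10, 10), (50, 10), (10, 50)]` with 32-pixel tiles, the box marks tile `(1, 1)` even though the triangle itself never reaches it. That extra bin costs some rendering work but cannot change the final image, which matches the abstract's point about not sacrificing quality.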