Patents by Inventor Chihong Zhang
Chihong Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12067666Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a set of draw call instructions corresponding to a graphics workload, where the set of draw call instructions is associated with at least one run-time parameter. The apparatus may also obtain a first shader program associated with storing data in a system memory and at least one second shader program associated with storing data in a constant memory. Further, the apparatus may execute the first shader program or the at least one second shader program based on whether the at least one run-time parameter is less than or equal to a size of the constant memory. The apparatus may also update or maintain a configuration of a shader processor or a streaming processor based on executing the first shader program or the at least one second shader program.Type: GrantFiled: May 18, 2022Date of Patent: August 20, 2024Assignee: QUALCOMM IncorporatedInventors: Yun Du, Eric Demers, Andrew Evan Gruber, Chun Yu, Chihong Zhang, Baoguang Yang, Yuehai Du, Gang Zhong, Avinash Seetharamaiah, Jonnala Gadda Nagendra Kumar
-
Patent number: 12056790Abstract: The present disclosure relates to methods and apparatus for graphics processing. For example, disclosed techniques facilitate improving bindless state processing at a graphics processor. Aspects of the present disclosure can receive, at a graphics processor, a shader program including a preamble section and a main instructions section. Aspects of the present disclosure can also execute, with a scalar processor dedicated to processing preamble sections, instructions of the preamble section to implement a bindless mechanism for loading constant data associated with the shader program. Additionally, aspects of the present disclosure can distribute the main instructions section and the constant data to a streaming processor for executing the shader program.Type: GrantFiled: January 31, 2020Date of Patent: August 6, 2024Assignee: QUALCOMM IncorporatedInventors: Yun Du, Andrew Evan Gruber, Chun Yu, Chihong Zhang, Thomas Edwin Frisinger, Richard Hammerstone, Zilin Ying, Heng Qi, Quanquan Xu, Sheng Gu
-
Publication number: 20240046543Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for runtime optimization of the shader execution flow. A graphics processor may obtain instruction execution data associated with a graphics workload, the instruction execution data including graphics data for a set of shader operations. The graphics processor may configure, at a first iteration, at least one predication value based on the instruction execution data including the graphics data for the set of shader operations. The graphics processor may adjust, at a second iteration, an execution flow of the graphics workload based on the configured at least one predication value, the execution flow of the graphics workload including the set of shader operations. The graphics processor may execute or refrain from executing, at the second iteration, each of the set of shader operations based on the adjusted execution flow of the graphics workload.Type: ApplicationFiled: August 5, 2022Publication date: February 8, 2024Inventors: Yun DU, Eric DEMERS, Andrew Evan GRUBER, Chun YU, Baoguang YANG, Chihong ZHANG, Yuehai DU, Avinash SEETHARAMAIAH, Jonnala Gadda NAGENDRA KUMAR, Gang ZHONG, Zilin YING, Fei WEI
-
Publication number: 20230377240Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a set of draw call instructions corresponding to a graphics workload, where the set of draw call instructions is associated with at least one run-time parameter. The apparatus may also obtain a first shader program associated with storing data in a system memory and at least one second shader program associated with storing data in a constant memory. Further, the apparatus may execute the first shader program or the at least one second shader program based on whether the at least one run-time parameter is less than or equal to a size of the constant memory. The apparatus may also update or maintain a configuration of a shader processor or a streaming processor based on executing the first shader program or the at least one second shader program.Type: ApplicationFiled: May 18, 2022Publication date: November 23, 2023Inventors: Yun DU, Eric DEMERS, Andrew Evan GRUBER, Chun YU, Chihong ZHANG, Baoguang YANG, Yuehai DU, Gang ZHONG, Avinash SEETHARAMAIAH, Jonnala Gadda NAGENDRA KUMAR
-
Patent number: 11657471Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may generate a table including a plurality of entries to store data associated with at least one of a constant value or an immediate value. The apparatus may also process, upon generating the table, first data including at least one of a constant value or an immediate value. Further, the apparatus may store, in the generated table, at least one of the constant value or the immediate value of the first data. The apparatus may also transmit, upon storing at least one of the constant value or the immediate value in the table, the table including the stored at least one of the constant value or the immediate value of the first data.Type: GrantFiled: June 23, 2021Date of Patent: May 23, 2023Assignee: QUALCOMM IncorporatedInventors: Yun Du, Andrew Evan Gruber, Chihong Zhang, Jian Jiang, Gang Zhong, Baoguang Yang, Yang Xia, Chun Yu, Eric Demers
-
Publication number: 20230019763Abstract: The present disclosure relates to methods and apparatus for graphics processing. For example, disclosed techniques facilitate improving bindless state processing at a graphics processor. Aspects of the present disclosure can receive, at a graphics processor, a shader program including a preamble section and a main instructions section. Aspects of the present disclosure can also execute, with a scalar processor dedicated to processing preamble sections, instructions of the preamble section to implement a bindless mechanism for loading constant data associated with the shader program. Additionally, aspects of the present disclosure can distribute the main instructions section and the constant data to a streaming processor for executing the shader program.Type: ApplicationFiled: January 31, 2020Publication date: January 19, 2023Inventors: Yun DU, Andrew Evan GRUBER, Chun YU, Chihong ZHANG, Thomas Edwin FRISINGER, Richard HAMMERSTONE, Zilin YING, Heng QI, Quanquan XU, Sheng GU
-
Publication number: 20220414814Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may generate a table including a plurality of entries to store data associated with at least one of a constant value or an immediate value. The apparatus may also process, upon generating the table, first data including at least one of a constant value or an immediate value. Further, the apparatus may store, in the generated table, at least one of the constant value or the immediate value of the first data. The apparatus may also transmit, upon storing at least one of the constant value or the immediate value in the table, the table including the stored at least one of the constant value or the immediate value of the first data.Type: ApplicationFiled: June 23, 2021Publication date: December 29, 2022Inventors: Yun DU, Andrew Evan GRUBER, Chihong ZHANG, Jian JIANG, Gang ZHONG, Baoguang YANG, Yang XIA, Chun YU, Eric DEMERS
-
Patent number: 11204765Abstract: A graphics processing unit (GPU) utilizes block general purpose registers (bGPRs) to load multiple waves of samples for an instruction group into a processing pipeline and receive processed samples from the pipeline. The GPU acquires a credit for the bGPR for execution of the instruction group for a first wave using a persistent GPR and the bGPR. The GPU refunds the credit upon loading the first wave into the pipeline. The GPU executes a subsequent wave for the instruction group to load samples to the pipeline when at least one credit is available and the pipeline is processing the first wave. The GPU stores an indication of each wave that has been loaded into the pipeline in a queue. The GPU returns samples for a next wave in the queue from the pipeline to the bGPR for further processing when the physical slot of the bGPR is available.Type: GrantFiled: August 26, 2020Date of Patent: December 21, 2021Assignee: QUALCOMM IncorporatedInventors: Yun Du, Fei Wei, Gang Zhong, Minjie Huang, Jian Jiang, Zilin Ying, Baoguang Yang, Yang Xia, Jing Han, Liangxiao Hu, Chihong Zhang, Chun Yu, Andrew Evan Gruber, Eric Demers
-
Patent number: 11132760Abstract: Methods, systems, and devices for graphic processing are described. The methods, systems, and devices may include or be associated with identifying a graphics instruction, determining that the graphics instruction is alias enabled for the device, partitioning an alias lookup table into one or more slots, allocating a slot of the alias lookup table based on the partitioning and determining that the graphics instruction is alias enabled, generating an alias instruction based on allocating the slot of the alias lookup table and determining that the graphics instruction is alias enabled, and processing the alias instruction.Type: GrantFiled: December 13, 2019Date of Patent: September 28, 2021Assignee: QUALCOMM IncorporatedInventors: Yun Du, Andrew Evan Gruber, Chihong Zhang, Gang Zhong, Jian Jiang, Fei Wei, Minjie Huang, Zilin Ying, Yang Xia, Jing Han, Chun Yu, Eric Demers
-
Patent number: 11094103Abstract: Example techniques are described for generating graphics content by obtaining texture operation instructions corresponding to a texture operation, in response to determining at least one of insufficient general purpose register space is available for the texture operation or insufficient wave slots are available for the texture operation, generating an indication that the texture operation corresponds to a deferred wave, executing the texture operation, sending, to a texture processor, initial texture sample instructions corresponding to the texture operation that was executed, and receiving texture mapped data corresponding to the initial texture sample instructions.Type: GrantFiled: March 26, 2019Date of Patent: August 17, 2021Assignee: QUALCOMM IncorporatedInventors: Yun Du, Andrew Evan Gruber, Chun Yu, Chihong Zhang, Hongjiang Shang, Zilin Ying, Fei Wei
-
Publication number: 20210183005Abstract: Methods, systems, and devices for graphic processing are described. The methods, systems, and devices may include or be associated with identifying a graphics instruction, determining that the graphics instruction is alias enabled for the device, partitioning an alias lookup table into one or more slots, allocating a slot of the alias lookup table based on the partitioning and determining that the graphics instruction is alias enabled, generating an alias instruction based on allocating the slot of the alias lookup table and determining that the graphics instruction is alias enabled, and processing the alias instruction.Type: ApplicationFiled: December 13, 2019Publication date: June 17, 2021Inventors: Yun Du, Andrew Evan Gruber, Chihong Zhang, Gang Zhong, Jian Jiang, Fei Wei, Minjie Huang, Zilin Ying, Yang Xia, Jing Han, Chun Yu, Eric Demers
-
Publication number: 20200312006Abstract: Example techniques are described for generating graphics content by obtaining texture operation instructions corresponding to a texture operation, in response to determining at least one of insufficient general purpose register space is available for the texture operation or insufficient wave slots are available for the texture operation, generating an indication that the texture operation corresponds to a deferred wave, executing the texture operation, sending, to a texture processor, initial texture sample instructions corresponding to the texture operation that was executed, and receiving texture mapped data corresponding to the initial texture sample instructions.Type: ApplicationFiled: March 26, 2019Publication date: October 1, 2020Inventors: Yun DU, Andrew Evan GRUBER, Chun YU, Chihong ZHANG, Hongjiang SHANG, Zilin YING, Fei WEI
-
Patent number: 10558460Abstract: Systems and techniques are disclosed for general purpose register dynamic allocation based on latency associated with of instructions in processor threads. A streaming processor can include a general purpose registers configured to stored data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocated based on execution latencies of instructions included in the threads.Type: GrantFiled: December 14, 2016Date of Patent: February 11, 2020Assignee: QUALCOMM IncorporatedInventors: Yun Du, Liang Han, Lin Chen, Chihong Zhang, Hongjiang Shang, Jing Wu, Zilin Ying, Chun Yu, Guofang Jiao, Andrew Gruber, Eric Demers
-
Publication number: 20180165092Abstract: Systems and techniques are disclosed for general purpose register dynamic allocation based on latency associated with of instructions in processor threads. A streaming processor can include a general purpose registers configured to stored data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocated based on execution latencies of instructions included in the threads.Type: ApplicationFiled: December 14, 2016Publication date: June 14, 2018Inventors: Yun Du, Liang Han, Lin Chen, Chihong Zhang, Hongjiang Shang, Jing Wu, Zilin Ying, Chun Yu, Guofang Jiao, Andrew Gruber, Eric Demers
-
Patent number: 9799094Abstract: A method for processing data in a graphics processing unit (GPU) including receiving an instance identifier for an instance and a shader program comprising a preamble code block and a main shader code block, assigning, the instance identifier to a general purpose register at wave creation, allocating address space within the constant memory for instance uniforms, and determining the preamble code block has not been executed and the wave is a first wave of the instance to be executed, based on determining the preamble code block has not been executed and the wave is the first wave to be executed, executing the preamble code block to store the plurality of instance uniforms in the constant memory and based, at least in part, on executing the preamble code block, executing the wave of the plurality of waves using at least one of the plurality of instance constants stored inconstant memory.Type: GrantFiled: May 23, 2016Date of Patent: October 24, 2017Assignee: QUALCOMM IncorporatedInventors: Lin Chen, Richard Hammerstone, Jiaji Liu, Chihong Zhang, Andrew Evan Gruber, Yun Du
-
Patent number: 9747104Abstract: In one example, a method includes responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.Type: GrantFiled: May 12, 2014Date of Patent: August 29, 2017Assignee: QUALCOMM IncorporatedInventors: Lin Chen, Yun Du, Sumesh Udayakumaran, Chihong Zhang, Andrew Evan Gruber
-
Patent number: 9665370Abstract: Techniques are described in which an indication is included to indicate a last use of an intermediate value generated as part of determining a final value is not be stored in a general purpose register (GPR). A processing unit avoids storing the intermediate value in the GPR based on the indication because the intermediate value is no longer needed for determining the final value.Type: GrantFiled: August 19, 2014Date of Patent: May 30, 2017Assignee: QUALCOMM IncorporatedInventors: Yun Du, Lin Chen, Andrew Evan Gruber, Chihong Zhang, Chun Yu
-
Patent number: 9645866Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.Type: GrantFiled: September 16, 2011Date of Patent: May 9, 2017Assignee: QUALCOMM IncorporatedInventors: Alexei V. Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
-
Patent number: 9626234Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.Type: GrantFiled: December 15, 2014Date of Patent: April 18, 2017Assignee: QUALCOMM IncorporatedInventors: Alexei Vladimirovich Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
-
Publication number: 20160054998Abstract: Techniques are described in which an indication is included to indicate a last use of an intermediate value generated as part of determining a final value is not be stored in a general purpose register (GPR). A processing unit avoids storing the intermediate value in the GPR based on the indication because the intermediate value is no longer needed for determining the final value.Type: ApplicationFiled: August 19, 2014Publication date: February 25, 2016Inventors: Yun Du, Lin Chen, Andrew Evan Gruber, Chihong Zhang, Chun Yu