Patents by Inventor Chihong Zhang
Chihong Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20160054998Abstract: Techniques are described in which an indication is included to indicate a last use of an intermediate value generated as part of determining a final value is not be stored in a general purpose register (GPR). A processing unit avoids storing the intermediate value in the GPR based on the indication because the intermediate value is no longer needed for determining the final value.Type: ApplicationFiled: August 19, 2014Publication date: February 25, 2016Inventors: Yun Du, Lin Chen, Andrew Evan Gruber, Chihong Zhang, Chun Yu
-
Patent number: 9256408Abstract: Aspects of this disclosure relate to a method of compiling high-level software instructions to generate low-level software instructions. In an example, the method includes identifying, with a computing device, a set of high-level (HL) control flow (CF) instructions having one or more associated texture load instructions, wherein the set of HL CF instructions comprises one or more branches. The method also includes converting, with the computing device, the identified set of HL CF instructions to low-level (LL) instructions having a predicate structure. The method also includes outputting the converted (LL) instructions having the predicate structure.Type: GrantFiled: March 7, 2012Date of Patent: February 9, 2016Assignee: QUALCOMM IncorporatedInventors: Weifeng Zhang, Chihong Zhang
-
Publication number: 20150324196Abstract: In one example, a method includes responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.Type: ApplicationFiled: May 12, 2014Publication date: November 12, 2015Applicant: QUALCOMM IncorporatedInventors: Lin Chen, Yun Du, Sumesh Udayakumaran, Chihong Zhang, Andrew Evan Gruber
-
Publication number: 20150097849Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.Type: ApplicationFiled: December 15, 2014Publication date: April 9, 2015Inventors: Alexei Vladimirovich Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
-
Patent number: 8937622Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.Type: GrantFiled: September 16, 2011Date of Patent: January 20, 2015Assignee: QUALCOMM IncorporatedInventors: Alexei V. Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
-
Patent number: 8732679Abstract: A new computer-compiler architecture includes code analysis processes in which loops present in an intermediate instruction set are transformed into more efficient loops prior to fully executing the intermediate instruction set. The compiler architecture starts by generating the equivalent intermediate instructions for the original high level source code. For each loop in the intermediate instructions, a total cycle cost is calculated using a cycle cost table associated with the compiler. The compiler then generates intermediate code for replacement loops in which all conversion instructions are removed. The cycle costs for these new transformed loops are then compared against the total cycle cost for the original loops. If the total cycle costs exceed the new cycle costs, the compiler will replace the original loops in the intermediate instructions with the new transformed loops prior to generation of final code using the instruction set of the processor.Type: GrantFiled: March 16, 2010Date of Patent: May 20, 2014Assignee: QUALCOMM IncorporatedInventors: Sumesh Udayakumaran, Chihong Zhang
-
Publication number: 20130191816Abstract: Aspects of this disclosure relate to a method of compiling high-level software instructions to generate low-level software instructions. In an example, the method includes identifying, with a computing device, a set of high-level (HL) control flow (CF) instructions having one or more associated texture load instructions, wherein the set of HL CF instructions comprises one or more branches. The method also includes converting, with the computing device, the identified set of HL CF instructions to low-level (LL) instructions having a predicate structure. The method also includes outputting the converted (LL) instructions having the predicate structure.Type: ApplicationFiled: March 7, 2012Publication date: July 25, 2013Applicant: QUALCOMM INCORPORATEDInventors: Weifeng Zhang, Chihong Zhang
-
Patent number: 8495602Abstract: The present disclosure includes a shader compiler system and method. In an embodiment, a shader compiler includes a decoder to translate an instruction having a vector representation to a unified instruction representation. The shader compiler also includes an encoder to translate an instruction having a unified instruction representation to a processor executable instruction.Type: GrantFiled: September 28, 2007Date of Patent: July 23, 2013Assignee: QUALCOMM IncorporatedInventors: Lin Chen, Guofang Jiao, Chihong Zhang, Junhong Sun
-
Patent number: 8379032Abstract: The present disclosure includes system and method of mapping shader variables into physical registers. In an embodiment, a graphics processing unit (GPU) and a memory coupled to the GPU are disclosed. The memory includes a processor readable data file that has a register file portion. The register file portion has a rectangular structure including a plurality of data items. At least two of the plurality of data items corresponding to data elements of a shader program. The data elements have different data storage types.Type: GrantFiled: September 28, 2007Date of Patent: February 19, 2013Assignee: QUALCOMM IncorporatedInventors: Lin Chen, Junhong Sun, Guofang Jiao, Chihong Zhang, Lingjun Chen
-
Publication number: 20120284701Abstract: In general techniques are described for efficient conditional flow control (CFC) compilation. An apparatus comprising a processor executing a compiler that includes at least one translation module may perform these techniques. The translation module translates a first set of high-level (HL) CFC software to a functionally equivalent but different second set of HL CFC software instructions. The compiler then compiles the first and second sets of high-level CFC software instructions to respective first and second sets of low-level (LL) CFC software instructions. An evaluation module of the compiler evaluates the first and second sets of LL CFC software instructions to determine which of the first and second sets of the low-level CFC software instructions is more efficient as measured in terms of at least one execution metric and outputs the one of the first and second low-level CFC software instructions determined to be most efficient.Type: ApplicationFiled: May 6, 2011Publication date: November 8, 2012Applicant: QUALCOMM IncorporatedInventors: Ming-Chang Tsai, Chihong Zhang
-
Publication number: 20120069035Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.Type: ApplicationFiled: September 16, 2011Publication date: March 22, 2012Applicant: QUALCOMM IncorporatedInventors: Alexei V. Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
-
Publication number: 20120069029Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.Type: ApplicationFiled: September 16, 2011Publication date: March 22, 2012Applicant: QUALCOMM IncorporatedInventors: Alexei V. Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
-
Publication number: 20110231830Abstract: A new computer-compiler architecture includes code analysis processes in which loops present in an intermediate instruction set are transformed into more efficient loops prior to fully executing the intermediate instruction set. The compiler architecture starts by generating the equivalent intermediate instructions for the original high level source code. For each loop in the intermediate instructions, a total cycle cost is calculated using a cycle cost table associated with the compiler. The compiler then generates intermediate code for replacement loops in which all conversion instructions are removed. The cycle costs for these new transformed loops are then compared against the total cycle cost for the original loops. If the total cycle costs exceed the new cycle costs, the compiler will replace the original loops in the intermediate instructions with the new transformed loops prior to generation of final code using the instruction set of the processor.Type: ApplicationFiled: March 16, 2010Publication date: September 22, 2011Applicant: QUALCOMM INCORPORATEDInventors: Sumesh Udayakumaran, Chihong Zhang
-
Publication number: 20090085919Abstract: The present disclosure includes system and method of mapping shader variables into physical registers. In an embodiment, a graphics processing unit (GPU) and a memory coupled to the GPU are disclosed. The memory includes a processor readable data file that has a register file portion. The register file portion has a rectangular structure including a plurality of data items. At least two of the plurality of data items corresponding to data elements of a shader program. The data elements have different data storage types.Type: ApplicationFiled: September 28, 2007Publication date: April 2, 2009Applicant: QUALCOMM INCORPORATEDInventors: Lin Chen, Junhong Sun, Guofang Jiao, Chihong Zhang, Lingjun Chen
-
Publication number: 20090089763Abstract: The present disclosure includes a shader compiler system and method. In an embodiment, a shader compiler includes a decoder to translate an instruction having a vector representation to a unified instruction representation. The shader compiler also includes an encoder to translate an instruction having a unified instruction representation to a processor executable instruction.Type: ApplicationFiled: September 28, 2007Publication date: April 2, 2009Applicant: QUALCOMM IncorporatedInventors: Lin Chen, Guofang Jiao, Chihong Zhang, Junhong Sun