Patents by Inventor Guansong Zhang

Guansong Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11935175
    Abstract: There is described a method of shading a group of pixels in a fragment shader in a raster graphics pipeline. At least one first pilot pixel of the group of pixels is shaded under a first precision. At least one second pilot pixel of the group of pixels is shaded under a second precision. An error value representing a difference between the first and second pilot pixels is calculated. At least one other pixel of the group of pixels is shaded under the first precision if the error value is greater than an error threshold. The at least one other pixel is shaded under the second precision if the error value is smaller than the error threshold.
    Type: Grant
    Filed: April 7, 2022
    Date of Patent: March 19, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Andrew Siu Doug Lee, Tyler Bryce Nowicki, Guansong Zhang, Yan Luo
  • Publication number: 20240053986
    Abstract: The present disclosure provides methods, systems and apparatus for handling control flow structures in data-parallel architectures. According to a first aspect, a method is provided. The method includes receiving, by a processing unit (PU), a program for execution. The method further includes applying, by the PU, a branching solution to the program to obtain data on control flow structures of the program. The method further includes determining, by the PU and based at least in part on the obtained data, one or more control flow structures of the program to predicate. The method further includes applying, by the PU, predication to the one or more control flow structures of the program.
    Type: Application
    Filed: August 12, 2022
    Publication date: February 15, 2024
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Kevin Lin, Guansong Zhang
  • Publication number: 20230326117
    Abstract: There is described a method of shading a group of pixels in a fragment shader in a raster graphics pipeline. At least one first pilot pixel of the group of pixels is shaded under a first precision. At least one second pilot pixel of the group of pixels is shaded under a second precision. An error value representing a difference between the first and second pilot pixels is calculated. At least one other pixel of the group of pixels is shaded under the first precision if the error value is greater than an error threshold. The at least one other pixel is shaded under the second precision if the error value is smaller than the error threshold.
    Type: Application
    Filed: April 7, 2022
    Publication date: October 12, 2023
    Inventors: Andrew Siu Doug Lee, Tyler Bryce Nowicki, Guansong Zhang, Yan Luo
  • Patent number: 11556319
    Abstract: Systems and methods are described for extending a live range for a virtual scalar register during compiling of a program, comprising: receiving an intermediate representation (IR) of a source code configured for implementing single-instruction-multiple-thread (SIMT) execution, the IR representing the source code as a control flow graph including a plurality of basic blocks (BB); and when a virtual scalar register defined in a first BB of the IR is last used in a second BB of the IR that is a divergent BB, modifying the IR to extend the live range of the virtual scalar register.
    Type: Grant
    Filed: September 1, 2020
    Date of Patent: January 17, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Abraham Davidson Fai Chung Chan, Tyler Bryce Nowicki, Guansong Zhang, Ahmed Mohammed ElShafiey Mohammed Eltantawy
  • Publication number: 20220374236
    Abstract: The disclosed systems, structures, and methods are directed to optimizing address calculations in a computer. This is achieved in a compiler that identifies an address calculation in code that is being compiled and transforms the code by splitting the address calculation into a first portion in which an offset is determined and a second portion, in which the offset is combined with a base pointer to generate an address. The address and the base pointer have a first bit-length, and the offset has a second bit-length shorter than the first bit-length. The offset is determined using an operation performed at the second bit-length. In some implementations the first bit-length is 64 bits and the second bit-length is 32 bits.
    Type: Application
    Filed: May 20, 2021
    Publication date: November 24, 2022
    Inventors: Weiwei Li, Guansong Zhang
  • Patent number: 11483241
    Abstract: Systems and methods for network traffic metering credit distribution and packet processing in a network device having multiple processing units are provided. According to an embodiment, management of multiple meters is distributed among multiple processing units of a network device. Each meter is implemented in a form of a master entry and a slave entry. Responsive to receipt by one of the processing units of a packet subject to rate-limiting by a meter, an action to be taken on the packet is made with reference to a slave entry managed by the processing unit based on available credit of the slave entry. When the action indicates the packet is to be passed: (i) credits associated with passing the packet are deducted from the available credit; and (ii) the packet is passed to a subsequent stage of packet processing; otherwise, the packet is dropped.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: October 25, 2022
    Assignee: Fortinet, Inc.
    Inventors: Mengchen Yu, Guansong Zhang
  • Patent number: 11474798
    Abstract: The disclosed systems, structures, and methods are directed to optimizing memory access to constants in heterogeneous parallel computers, including systems that support OpenCL. This is achieved in an optimizing compiler that transforms program scope constants and constants at the outermost scope of kernels into implicit constant pointer arguments. The optimizing compiler also attempts to determine access patterns for constants at compile-time and places the constants in a variety of memory types available in a compute device architecture based on these access patterns.
    Type: Grant
    Filed: August 24, 2020
    Date of Patent: October 18, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Guansong Zhang, Weiwei Li
  • Publication number: 20220103474
    Abstract: Systems and methods for network traffic metering credit distribution and packet processing in a network device having multiple processing units are provided. According to an embodiment, management of multiple meters is distributed among multiple processing units of a network device. Each meter is implemented in a form of a master entry and a slave entry. Responsive to receipt by one of the processing units of a packet subject to rate-limiting by a meter, an action to be taken on the packet is made with reference to a slave entry managed by the processing unit based on available credit of the slave entry. When the action indicates the packet is to be passed: (i) credits associated with passing the packet are deducted from the available credit; and (ii) the packet is passed to a subsequent stage of packet processing; otherwise, the packet is dropped.
    Type: Application
    Filed: September 30, 2020
    Publication date: March 31, 2022
    Applicant: Fortinet, Inc.
    Inventors: Mengchen Yu, Guansong Zhang
  • Publication number: 20220066783
    Abstract: Systems and methods are described for extending a live range for a virtual scalar register during compiling of a program, comprising: receiving an intermediate representation (IR) of a source code configured for implementing single-instruction-multiple-thread (SIMT) execution, the IR representing the source code as a control flow graph including a plurality of basic blocks (BB); and when a virtual scalar register defined in a first BB of the IR is last used in a second BB of the IR that is a divergent BB, modifying the IR to extend the live range of the virtual scalar register.
    Type: Application
    Filed: September 1, 2020
    Publication date: March 3, 2022
    Inventors: Abraham Davidson Fai Chung Chan, Tyler Bryce Nowicki, Guansong Zhang, Ahmed Mohammed ElShafiey Mohammed Eltantawy
  • Publication number: 20220058008
    Abstract: The disclosed systems, structures, and methods are directed to optimizing memory access to constants in heterogeneous parallel computers, including systems that support OpenCL. This is achieved in an optimizing compiler that transforms program scope constants and constants at the outermost scope of kernels into implicit constant pointer arguments. The optimizing compiler also attempts to determine access patterns for constants at compile-time and places the constants in a variety of memory types available in a compute device architecture based on these access patterns.
    Type: Application
    Filed: August 24, 2020
    Publication date: February 24, 2022
    Inventors: Guansong Zhang, Weiwei Li
  • Patent number: 10009295
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, presence of outbound payload data, distributed across a first and second payload buffer, within a user memory space of a network device that has been generated by a user process is determined by a bus/memory interface or a network interface unit. The payload data is fetched by performing direct virtual memory addressing of the user memory space including mapping virtual addresses of the payload buffers to corresponding physical addresses, including: (i) when the payload buffers are noncontiguous, then retrieving the outbound payload data with reference to multiple buffer descriptors having starting virtual addresses of the payload buffers and (ii) when they are contiguous, then retrieving the outbound payload data with reference to a single buffer descriptor. The outbound payload data is then segmented across one or more TCP packets.
    Type: Grant
    Filed: November 18, 2017
    Date of Patent: June 26, 2018
    Assignee: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Publication number: 20180077087
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, presence of outbound payload data, distributed across a first and second payload buffer, within a user memory space of a network device that has been generated by a user process is determined by a bus/memory interface or a network interface unit. The payload data is fetched by performing direct virtual memory addressing of the user memory space including mapping virtual addresses of the payload buffers to corresponding physical addresses, including: (i) when the payload buffers are noncontiguous, then retrieving the outbound payload data with reference to multiple buffer descriptors having starting virtual addresses of the payload buffers and (ii) when they are contiguous, then retrieving the outbound payload data with reference to a single buffer descriptor. The outbound payload data is then segmented across one or more TCP packets.
    Type: Application
    Filed: November 18, 2017
    Publication date: March 15, 2018
    Applicant: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 9825885
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, presence of outbound payload data, distributed across a first and second payload buffer, within a user memory space of a network device that has been generated by a user process is determined by a bus/memory interface or a network interface unit. The payload data is fetched by performing direct virtual memory addressing of the user memory space including mapping virtual addresses of the payload buffers to corresponding physical addresses, including: (i) when the payload buffers are noncontiguous, then retrieving the outbound payload data with reference to multiple buffer descriptors having starting virtual addresses of the payload buffers and (ii) when they are contiguous, then retrieving the outbound payload data with reference to a single buffer descriptor. The outbound payload data is then segmented across one or more TCP packets.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: November 21, 2017
    Assignee: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Publication number: 20160352652
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, presence of outbound payload data, distributed across a first and second payload buffer, within a user memory space of a network device that has been generated by a user process is determined by a bus/memory interface or a network interface unit. The payload data is fetched by performing direct virtual memory addressing of the user memory space including mapping virtual addresses of the payload buffers to corresponding physical addresses, including: (i) when the payload buffers are noncontiguous, then retrieving the outbound payload data with reference to multiple buffer descriptors having starting virtual addresses of the payload buffers and (ii) when they are contiguous, then retrieving the outbound payload data with reference to a single buffer descriptor. The outbound payload data is then segmented across one or more TCP packets.
    Type: Application
    Filed: June 30, 2016
    Publication date: December 1, 2016
    Applicant: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Publication number: 20160234352
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, payload data originated by a user process running on a host processor of a network device is fetched by an interface of the network device by performing direct virtual memory addressing of a user memory space of a system memory of the network device on behalf of a network interface unit of the network device. The direct virtual memory addressing maps physical addresses of various portions of the payload data to corresponding virtual addresses. The payload data is segmented by the network interface unit across one or more packets.
    Type: Application
    Filed: April 18, 2016
    Publication date: August 11, 2016
    Applicant: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 9401976
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, payload data originated by a user process running on a host processor of a network device is fetched by an interface of the network device by performing direct virtual memory addressing of a user memory space of a system memory of the network device on behalf of a network interface unit of the network device. The direct virtual memory addressing maps physical addresses of various portions of the payload data to corresponding virtual addresses. The payload data is segmented by the network interface unit across one or more packets.
    Type: Grant
    Filed: April 18, 2016
    Date of Patent: July 26, 2016
    Assignee: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 9361078
    Abstract: A compiler method for exploiting data value locality for computation reuse. When a code region having single entry and exit points and in which a potential computation reuse opportunity exists is identified during runtime, a helper thread is created separate from the master thread. One of the helper thread and master thread performs a computation specified in the code region, and the other of the helper thread and master thread looks up a value of the computation previously executed and stored in a lookup table. If the value of the computation previously executed is located in the lookup table, the other thread retrieves the value from the table, and ignores the computation performed by the thread. If the value of the computation is not located, the other thread obtains a result of the computation performed by the thread and stores the result in the lookup table for future computation reuse.
    Type: Grant
    Filed: March 19, 2007
    Date of Patent: June 7, 2016
    Assignee: International Business Machines Corporation
    Inventors: Yaoqing Gao, Liangxiao Hu, Guansong Zhang, Peng Zhao
  • Publication number: 20160134724
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, payload data originated by a user process running on a host processor of a network device is fetched by an interface of the network device by performing direct virtual memory addressing of a user memory space of a system memory of the network device on behalf of a network interface unit of the network device. The direct virtual memory addressing maps physical addresses of various portions of the payload data to corresponding virtual addresses. The payload data is segmented by the network interface unit across one or more packets.
    Type: Application
    Filed: January 14, 2016
    Publication date: May 12, 2016
    Applicant: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 9319490
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, payload data originated by a user process running on a host processor of the computer system is fetched by an interface of the computer system by performing direct virtual memory addressing of a user memory space of a system memory of the computer system on behalf of a network processor of the computer system. The direct virtual memory addressing maps a physical address of the payload data to a virtual address. The payload data is segmented by the network processor across one or more packets.
    Type: Grant
    Filed: December 12, 2014
    Date of Patent: April 19, 2016
    Assignee: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
  • Patent number: 9319491
    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, payload data originated by a user process running on a host processor of a network device is fetched by an interface of the network device by performing direct virtual memory addressing of a user memory space of a system memory of the network device on behalf of a network interface unit of the network device. The direct virtual memory addressing maps physical addresses of various portions of the payload data to corresponding virtual addresses. The payload data is segmented by the network interface unit across one or more packets.
    Type: Grant
    Filed: January 14, 2016
    Date of Patent: April 19, 2016
    Assignee: Fortinet, Inc.
    Inventors: Xu Zhou, David Chen, Lin Huang, Guansong Zhang
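
Illustrative sketch (not taken from any of the patents listed): the pilot-pixel idea in the abstract for patent 11935175 (publication 20230326117) can be approximated in a few lines of Python. Everything named here is an assumption made for illustration: the `shade` function is a hypothetical sine-stripe shader, each "pixel" is reduced to a single coordinate `u`, the second precision is emulated by rounding to IEEE 754 half precision, the same pilot pixel is shaded at both precisions, and an error exactly equal to the threshold falls back to the first precision (the abstract does not specify that case).

```python
import math
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (binary16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def shade(u: float, freq: float, low_precision: bool) -> float:
    """Hypothetical fragment shader: a sine stripe pattern along u."""
    if not low_precision:
        return math.sin(u * freq)
    # Emulate reduced precision by rounding inputs and intermediates to binary16.
    phase = to_half(to_half(u) * to_half(freq))
    return to_half(math.sin(phase))


def shade_group(pixels, freq, threshold):
    """Shade a group of pixels, choosing the precision from one pilot pixel.

    Returns (used_low_precision, shaded_values).
    """
    pilot, rest = pixels[0], pixels[1:]
    full = shade(pilot, freq, low_precision=False)  # first precision
    half = shade(pilot, freq, low_precision=True)   # second precision
    error = abs(full - half)
    # Assumption: a tie (error == threshold) keeps the first precision.
    use_low = error < threshold
    pilot_value = half if use_low else full
    others = [shade(u, freq, use_low) for u in rest]
    return use_low, [pilot_value] + others


group = [0.3011, 0.3021, 0.3031, 0.3041]
for freq in (1.0, 1000.0):
    used_low, values = shade_group(group, freq, threshold=0.01)
    print(f"freq={freq}: low precision used = {used_low}")
```

With a low stripe frequency the two pilot results nearly agree, so the rest of the group can be shaded at the cheaper precision; with a high frequency the half-precision phase is too coarse, the pilot error exceeds the threshold, and the group stays at full precision.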