Patents by Inventor Guofang Jiao
Guofang Jiao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11809221
Abstract: An artificial intelligence chip and a data operation method are provided. The artificial intelligence chip receives a command carrying first data and address information and includes a chip memory, a computing processor, a base address register, and an extended address processor. The base address register is configured to access an extended address space in the chip memory. The extended address processor receives the command. The extended address processor determines an operation mode of the first data according to the address information. When the address information points to a first section of the extended address space, the extended address processor performs a first operation on the first data. When the address information points to a section other than the first section of the extended address space, the extended address processor notifies the computing processor of the operation mode and the computing processor performs a second operation on the first data.
Type: Grant
Filed: September 8, 2021
Date of Patent: November 7, 2023
Assignee: Shanghai Biren Technology Co., Ltd
Inventors: Zhou Hong, Qin Zheng, ChengPing Luo, GuoFang Jiao, Song Zhao, XiangLiang Yu
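The dispatch scheme this abstract describes can be sketched in a few lines. This is an illustrative model only, not the patented implementation: the section boundaries, address ranges, and operation names below are hypothetical.

```python
# Hypothetical section layout of the extended address space.
FIRST_SECTION = range(0x0000, 0x1000)
EXTENDED_SPACE = range(0x0000, 0x4000)

def dispatch(address: int, data: bytes) -> str:
    """Decide which unit handles the data, mimicking the abstract's flow."""
    if address not in EXTENDED_SPACE:
        raise ValueError("address outside the extended address space")
    if address in FIRST_SECTION:
        # The extended address processor performs the first operation itself.
        return "extended_address_processor: first operation"
    # Otherwise it notifies the computing processor, which performs
    # a second operation on the first data.
    return "computing_processor: second operation"
```

The key point the sketch captures is that the address information alone selects the operation mode, so the computing processor is only involved for addresses outside the first section.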
-
Patent number: 11748077
Abstract: The invention relates to a method for compiling code adapted for secondary offloads in a graphics processing unit (GPU). The method, performed by a processing unit, includes: reconstructing execution codes in a first kernel into a second kernel. The second kernel includes an operation table including entries, and computation codes. The computation codes include a portion of the execution codes, and synchronization hooks, and each synchronization hook includes information indicating one entry of the operation table. An order of the portion of the execution codes and the synchronization hooks in the computation codes matches an order of the execution codes in the first kernel, thereby enabling a compute unit (CU) in the GPU to execute the computation codes, and an engine in the GPU to instruct a component inside or outside of the GPU to complete a designated operation in accordance with content of each entry in the operation table.
Type: Grant
Filed: July 2, 2021
Date of Patent: September 5, 2023
Assignee: Shanghai Biren Technology Co., Ltd
Inventors: HaiChuan Wang, Song Zhao, GuoFang Jiao, ChengPing Luo, Zhou Hong
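The reconstruction step can be illustrated with a toy compiler pass. All names and the `cu_executable` flag are assumptions for illustration; the sketch only shows how execution codes that the CU cannot run become operation-table entries, with a synchronization hook left in their place so the original ordering is preserved.

```python
def reconstruct(first_kernel):
    """Split a kernel into (computation_codes, operation_table)."""
    computation_codes, operation_table = [], []
    for code in first_kernel:
        if code.get("cu_executable", True):
            # CU-executable codes stay in the computation codes as-is.
            computation_codes.append(code)
        else:
            # Non-CU codes become operation-table entries; a hook carrying
            # the entry index keeps the original ordering intact.
            operation_table.append(code)
            computation_codes.append(
                {"sync_hook": True, "entry": len(operation_table) - 1})
    return computation_codes, operation_table
```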
-
Publication number: 20230267011
Abstract: The invention relates to an apparatus for secondary offloads in a graphics processing unit (GPU). The apparatus includes an engine and a compute unit (CU). The CU is arranged operably to: fetch execution codes; when an execution code is suitable to be executed by the CU, execute the execution code; and when an execution code is not suitable to be executed by the CU, generate a corresponding entry, and send a request with the corresponding entry to the engine for instructing the engine to allow a component inside or outside of the GPU to complete an operation in accordance with the corresponding entry.
Type: Application
Filed: April 21, 2023
Publication date: August 24, 2023
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: HaiChuan Wang, Song Zhao, GuoFang Jiao, ChengPing Luo, Zhou Hong
-
Patent number: 11663044
Abstract: The invention relates to an apparatus for secondary offloads in a graphics processing unit (GPU). The apparatus includes an engine and a compute unit (CU). The engine is arranged operably to store an operation table including entries. The CU is arranged operably to fetch computation codes including execution codes and synchronization requests; execute each execution code; and send requests to the engine in accordance with the synchronization requests for instructing the engine to allow components inside or outside of the GPU to complete operations in accordance with the entries of the operation table.
Type: Grant
Filed: July 2, 2021
Date of Patent: May 30, 2023
Assignee: Shanghai Biren Technology Co., Ltd
Inventors: HaiChuan Wang, Song Zhao, GuoFang Jiao, ChengPing Luo, Zhou Hong
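The runtime side of this engine/CU split can be modeled with a small sketch (names and data shapes hypothetical): the engine holds the operation table, while the CU executes plain codes itself and forwards each synchronization request to the engine, which completes the designated operation per the referenced table entry.

```python
class Engine:
    def __init__(self, operation_table):
        self.operation_table = operation_table
        self.completed = []

    def request(self, entry_index):
        # Stand-in for instructing a component inside or outside the GPU
        # to complete the operation described by this table entry.
        self.completed.append(self.operation_table[entry_index])

def run(computation_codes, engine):
    """CU loop: execute plain codes, hand sync requests to the engine."""
    executed = []
    for code in computation_codes:
        if code.get("sync_hook"):
            engine.request(code["entry"])
        else:
            executed.append(code["op"])
    return executed
```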
-
Publication number: 20220398102
Abstract: An artificial intelligence chip and a data operation method are provided. The artificial intelligence chip receives a command carrying first data and address information and includes a chip memory, a computing processor, a base address register, and an extended address processor. The base address register is configured to access an extended address space in the chip memory. The extended address processor receives the command. The extended address processor determines an operation mode of the first data according to the address information. When the address information points to a first section of the extended address space, the extended address processor performs a first operation on the first data. When the address information points to a section other than the first section of the extended address space, the extended address processor notifies the computing processor of the operation mode and the computing processor performs a second operation on the first data.
Type: Application
Filed: September 8, 2021
Publication date: December 15, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: Zhou Hong, Qin Zheng, ChengPing Luo, GuoFang Jiao, Song Zhao, XiangLiang Yu
-
Publication number: 20220164232
Abstract: A method for managing resources, a computing device, and a computer-readable storage medium are provided. The method includes obtaining device information of multiple physical devices included in a computing node to confirm physical devices supporting a predetermined hardware resource management method; initializing at least one physical device among the physical devices supporting the predetermined hardware resource management method as a unified device view device; allocating a virtual storage address of the unified device view device, where the virtual storage address is mapped to a physical storage address of the physical device participating in the unified device view; transmitting data to the virtual storage address of the unified device view device; and issuing a computing task to the unified device view device via a task queue for using the physical device participating in the unified device view to execute the computing task.
Type: Application
Filed: November 4, 2021
Publication date: May 26, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: Long Chen, HaiChuan Wang, GuoFang Jiao
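The flow above reads naturally as a small host-side API. The sketch below is a toy model under stated assumptions (the `supports_udv` capability flag, the even address split, and all names are invented for illustration): eligible devices are pooled behind one virtual view, and tasks submitted to the view's queue run on the participating devices.

```python
class UnifiedDeviceView:
    def __init__(self, devices):
        # Keep only devices supporting the (hypothetical) predetermined
        # hardware resource management method.
        self.devices = [d for d in devices if d["supports_udv"]]
        self.task_queue = []

    def allocate(self, size):
        """Map one virtual allocation onto per-device physical ranges."""
        per_dev = size // len(self.devices)
        return [(d["name"], per_dev) for d in self.devices]

    def submit(self, task):
        # Tasks go through the view's task queue; the participating
        # physical devices would execute them.
        self.task_queue.append(task)
```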
-
Publication number: 20220129255
Abstract: The invention relates to a method for compiling code adapted for secondary offloads in a graphics processing unit (GPU). The method, performed by a processing unit, includes: reconstructing execution codes in a first kernel into a second kernel. The second kernel includes an operation table including entries, and computation codes. The computation codes include a portion of the execution codes, and synchronization hooks, and each synchronization hook includes information indicating one entry of the operation table. An order of the portion of the execution codes and the synchronization hooks in the computation codes matches an order of the execution codes in the first kernel, thereby enabling a compute unit (CU) in the GPU to execute the computation codes, and an engine in the GPU to instruct a component inside or outside of the GPU to complete a designated operation in accordance with content of each entry in the operation table.
Type: Application
Filed: July 2, 2021
Publication date: April 28, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: HaiChuan Wang, Song Zhao, GuoFang Jiao, ChengPing Luo, Zhou Hong
-
Publication number: 20220129272
Abstract: The invention relates to an apparatus for secondary offloads in a graphics processing unit (GPU). The apparatus includes an engine and a compute unit (CU). The engine is arranged operably to store an operation table including entries. The CU is arranged operably to fetch computation codes including execution codes and synchronization requests; execute each execution code; and send requests to the engine in accordance with the synchronization requests for instructing the engine to allow components inside or outside of the GPU to complete operations in accordance with the entries of the operation table.
Type: Application
Filed: July 2, 2021
Publication date: April 28, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: HaiChuan Wang, Song Zhao, GuoFang Jiao, ChengPing Luo, Zhou Hong
-
Publication number: 20220043688
Abstract: Embodiments of this disclosure provide techniques for splitting a DAG computation model and constructing sub-DAG computation models for inter-node parallel processing. In particular, a method is provided where a plurality of processors split the DAG computation into a plurality of non-interdependent sub-nodes within each respective node of the DAG computation model. The plurality of processors includes at least two different processing unit types. The plurality of processors construct a plurality of sub-DAG computations, each sub-DAG computation including at least a non-interdependent sub-node from different nodes of the DAG computation. The plurality of processors process each of the plurality of sub-DAG computations in parallel.
Type: Application
Filed: April 28, 2019
Publication date: February 10, 2022
Inventors: Shouwen Lai, Guofang Jiao
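The splitting idea can be sketched as follows. This is a hedged illustration, not the claimed method: the round-robin split below is an assumed way to produce non-interdependent sub-nodes, and grouping the i-th sub-node of every node into sub-DAG i yields sub-DAGs that can run in parallel.

```python
def build_sub_dags(dag_nodes, num_splits):
    """dag_nodes: list of DAG nodes, each a list of independent work items.

    Returns num_splits sub-DAGs; sub-DAG i takes the i-th slice of every
    node, so each sub-DAG spans different nodes of the original DAG.
    """
    sub_dags = [[] for _ in range(num_splits)]
    for node in dag_nodes:
        for i in range(num_splits):
            # Round-robin the node's independent sub-nodes across sub-DAGs.
            sub_dags[i].append(node[i::num_splits])
    return sub_dags
```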
-
Patent number: 10658335
Abstract: An integrated circuit package and a system including the integrated circuit package as well as a process for assembling the integrated circuit package are provided. The integrated circuit package includes a first die manufactured on a first wafer utilizing a first node size, a second die manufactured on a second wafer utilizing a second node size, and a substrate coupled to the second die at a plurality of bump sites on a bottom surface of the second die. The first die may be mounted on a top surface of the second die utilizing a hybrid wafer bonding technique, micro bumps, or electrode-less plating.
Type: Grant
Filed: January 25, 2018
Date of Patent: May 19, 2020
Assignee: Futurewei Technologies, Inc.
Inventors: Shiqun Gu, Yu Lin, Jinghua Zhu, Guofang Jiao
-
Patent number: 10558460
Abstract: Systems and techniques are disclosed for dynamic allocation of general purpose registers based on the latency of instructions in processor threads. A streaming processor can include general purpose registers configured to store data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing which general purpose registers are to be assigned as persistent general purpose registers (pGPRs) and which as volatile general purpose registers (vGPRs). The general purpose registers can be allocated according to the received information, with the allocation based on the execution latencies of instructions included in the threads.
Type: Grant
Filed: December 14, 2016
Date of Patent: February 11, 2020
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Liang Han, Lin Chen, Chihong Zhang, Hongjiang Shang, Jing Wu, Zilin Ying, Chun Yu, Guofang Jiao, Andrew Gruber, Eric Demers
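A minimal sketch of such a latency-driven split, with a hypothetical cycle threshold (the patent does not specify one): registers whose producing instructions have long latencies become persistent GPRs, while short-latency ones become volatile GPRs that can be reclaimed sooner.

```python
LATENCY_THRESHOLD = 20  # cycles; an assumed cutoff for illustration

def allocate(instr_latencies):
    """Map register id -> 'pGPR' or 'vGPR' from producing-instruction latency."""
    return {reg: ("pGPR" if lat >= LATENCY_THRESHOLD else "vGPR")
            for reg, lat in instr_latencies.items()}
```

For example, a register written by a long-latency texture fetch would be held persistently, while one written by a short ALU op would be volatile.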
-
Patent number: 10388060
Abstract: According to one aspect of the present disclosure, there is provided a method that includes: determining a block size according to capabilities of a processor; dividing a first view into a plurality of first pixel blocks having the block size and a second view into a plurality of second pixel blocks having the block size; rasterizing a primitive object to produce a subset of the first pixel blocks for the first view and a subset of the second pixel blocks for the second view; and rendering the subsets of the first and second pixel blocks produced for the primitive object to produce a first image for the first view and a second image for the second view, where the rendering is interleaved between the subsets of the first and second pixel blocks occupied by the primitive object in the first and second views.
Type: Grant
Filed: August 28, 2017
Date of Patent: August 20, 2019
Assignee: Futurewei Technologies, Inc.
Inventor: Guofang Jiao
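The interleaving order can be illustrated with a toy scheduler (the view names and block representation are assumptions): blocks the primitive covers in the two views are rendered alternately, rather than one full view at a time.

```python
def interleave_render(first_blocks, second_blocks):
    """Yield (view, block) pairs, alternating between the two views."""
    order = []
    for a, b in zip(first_blocks, second_blocks):
        order.append(("view1", a))
        order.append(("view2", b))
    # Any leftover blocks from the longer subset follow at the end.
    longer = first_blocks if len(first_blocks) > len(second_blocks) else second_blocks
    tag = "view1" if longer is first_blocks else "view2"
    n = min(len(first_blocks), len(second_blocks))
    order.extend((tag, blk) for blk in longer[n:])
    return order
```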
-
Publication number: 20190179635
Abstract: Aspects of the disclosure provide a circuit that includes a processing circuit, a memory directly coupled to the processing circuit via a dedicated data bus, and a control circuit. The processing circuit includes a dot product engine. The dot product engine is configured to perform, in response to an instruction, an operation that includes dot product calculations on a weight input and a pixel sample input, and to store a result of the operation into the memory. The control circuit is configured to control the dot product engine to perform arithmetic operations that include the dot product calculations, and control the dot product engine to perform an accumulation of outputs of the dot product calculations and data received from the memory via the dedicated data bus to generate the result of the operation.
Type: Application
Filed: December 11, 2017
Publication date: June 13, 2019
Applicant: Futurewei Technologies, Inc.
Inventors: Guofang Jiao, Zhou Hong, Chengkun Sun
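The accumulate-from-memory step can be sketched numerically (the dictionary-as-memory and address scheme are illustrative stand-ins): the engine computes a dot product of weights and pixel samples, then adds the partial result previously stored in the directly coupled memory before writing the new result back.

```python
def dot_accumulate(weights, samples, memory, addr):
    """One pass of the dot-product engine with accumulation from memory."""
    dot = sum(w * s for w, s in zip(weights, samples))
    result = dot + memory.get(addr, 0)  # accumulate with stored partial sum
    memory[addr] = result               # store result back for the next pass
    return result
```

Repeated calls at the same address model multi-pass accumulation, e.g. summing partial products of a large convolution.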
-
Patent number: 10241799
Abstract: Techniques are described for reordering commands to improve the speed at which at least one command stream may execute. Prior to distributing commands in the at least one command stream to multiple pipelines, a multimedia processor analyzes any inter-pipeline dependencies and determines the current execution state of the pipelines. The processor may, based on this information, reorder the at least one command stream by prioritizing commands that lack any current dependencies and therefore may be executed immediately by the appropriate pipeline. Such out-of-order execution of commands in the at least one command stream may increase the throughput of the multimedia processor by increasing the rate at which the command stream is executed.
Type: Grant
Filed: July 16, 2010
Date of Patent: March 26, 2019
Assignee: QUALCOMM Incorporated
Inventors: Alexei V. Bourd, Guofang Jiao
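The prioritization rule can be sketched with a toy dependency model (commands and their dependency tracking below are assumptions, not the patent's representation): commands whose dependencies have already completed are moved ahead so a pipeline never sits idle behind a blocked command.

```python
def reorder(commands, completed):
    """commands: list of (name, deps). Ready commands come first, in order."""
    ready = [c for c in commands if set(c[1]) <= completed]
    blocked = [c for c in commands if not set(c[1]) <= completed]
    # Relative order within each group is preserved (stable reordering).
    return ready + blocked
```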
-
Publication number: 20190066360
Abstract: According to one aspect of the present disclosure, there is provided a method that includes: determining a block size according to capabilities of a processor; dividing a first view into a plurality of first pixel blocks having the block size and a second view into a plurality of second pixel blocks having the block size; rasterizing a primitive object to produce a subset of the first pixel blocks for the first view and a subset of the second pixel blocks for the second view; and rendering the subsets of the first and second pixel blocks produced for the primitive object to produce a first image for the first view and a second image for the second view, where the rendering is interleaved between the subsets of the first and second pixel blocks occupied by the primitive object in the first and second views.
Type: Application
Filed: August 28, 2017
Publication date: February 28, 2019
Inventor: Guofang Jiao
-
Publication number: 20180366442
Abstract: An integrated circuit package and a system including the integrated circuit package as well as a process for assembling the integrated circuit package are provided. The integrated circuit package includes a first die manufactured on a first wafer utilizing a first node size, a second die manufactured on a second wafer utilizing a second node size, and a substrate coupled to the second die at a plurality of bump sites on a bottom surface of the second die. The first die may be mounted on a top surface of the second die utilizing a hybrid wafer bonding technique, micro bumps, or electrode-less plating.
Type: Application
Filed: January 25, 2018
Publication date: December 20, 2018
Inventors: Shiqun Gu, Yu Lin, Jinghua Zhu, Guofang Jiao
-
Publication number: 20180165092
Abstract: Systems and techniques are disclosed for dynamic allocation of general purpose registers based on the latency of instructions in processor threads. A streaming processor can include general purpose registers configured to store data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing which general purpose registers are to be assigned as persistent general purpose registers (pGPRs) and which as volatile general purpose registers (vGPRs). The general purpose registers can be allocated according to the received information, with the allocation based on the execution latencies of instructions included in the threads.
Type: Application
Filed: December 14, 2016
Publication date: June 14, 2018
Inventors: Yun Du, Liang Han, Lin Chen, Chihong Zhang, Hongjiang Shang, Jing Wu, Zilin Ying, Chun Yu, Guofang Jiao, Andrew Gruber, Eric Demers
-
Patent number: 9852536
Abstract: This disclosure describes techniques for performing high order filtering in a graphics processing unit (GPU). In examples of the disclosure, high order filtering may be implemented on a modified texture engine of a GPU using a single shader instruction. The modified texture engine may be configured to fetch all source pixels needed for the high order filtering and blend them together with pre-loaded filtering weights.
Type: Grant
Filed: August 5, 2014
Date of Patent: December 26, 2017
Assignee: QUALCOMM Incorporated
Inventors: Liang Li, Guofang Jiao, Yunshan Kong, Javier Ignacio Girado
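The fetch-and-blend step can be illustrated in one dimension (tap count, clamping policy, and weights below are assumptions for the sketch): all source pixels in the filter footprint are gathered and blended with pre-loaded weights in a single step, mirroring the single-instruction flow the abstract describes.

```python
def filter_pixel(src, x, weights):
    """1-D example: blend len(weights) neighbors of src[x] with the weights."""
    half = len(weights) // 2
    # Fetch every source pixel in the footprint, clamping at the edges.
    taps = [src[min(max(x + i - half, 0), len(src) - 1)]
            for i in range(len(weights))]
    return sum(w * t for w, t in zip(weights, taps))
```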
-
Patent number: 9799089
Abstract: A method for processing data in a graphics processing unit including receiving a code block of instructions common to a plurality of groups of threads of a shader, executing the code block of instructions common to the plurality of groups of threads of the shader creating a result by a first group of threads of the plurality of groups of threads, storing the result of the code block of instructions common to the plurality of groups of threads of the shader in on-chip random access memory (RAM), the on-chip RAM accessible by each of the plurality of groups of threads, and upon a determination that storing the result of the code block of instructions common to the plurality of groups of threads of the shader has completed, returning the result of the code block of instructions common to the plurality of groups of threads of the shader from on-chip RAM.
Type: Grant
Filed: May 23, 2016
Date of Patent: October 24, 2017
Assignee: QUALCOMM Incorporated
Inventors: Lin Chen, Yun Du, Andrew Evan Gruber, Guofang Jiao, Chun Yu, David Rigel Garcia Garcia
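The sharing scheme can be sketched as follows. This is a toy model (the done flag and RAM layout are illustrative): only the first thread group executes the common code block, the result goes to on-chip RAM, and the other groups read it back once a completion flag indicates the store finished.

```python
on_chip_ram = {"result": None, "done": False}  # stand-in for on-chip RAM

def run_group(group_id, common_block):
    if group_id == 0 and not on_chip_ram["done"]:
        on_chip_ram["result"] = common_block()  # execute the common block once
        on_chip_ram["done"] = True              # signal the store completed
    # Every group returns the shared result once it is available.
    assert on_chip_ram["done"], "result not yet available"
    return on_chip_ram["result"]
```

Note the later groups never re-execute the block; they only read the stored result, which is the redundancy the method removes.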
-
Patent number: 9697580
Abstract: This disclosure describes an apparatus configured to process graphics data. The apparatus may include a fixed hardware pipeline configured to execute one or more functions on a current set of graphics data. The fixed hardware pipeline may include a plurality of stages including a bypassable portion of the plurality of stages. The apparatus may further include a shortcut circuit configured to route the current set of graphics data around the bypassable portion of the plurality of stages, and a controller positioned before the bypassable portion of the plurality of stages, the controller configured to selectively route the current set of graphics data to one of the shortcut circuit or the bypassable portion of the plurality of stages.
Type: Grant
Filed: November 10, 2014
Date of Patent: July 4, 2017
Assignee: QUALCOMM Incorporated
Inventors: Liang Li, Andrew Evan Gruber, Guofang Jiao, Zhenyu Qi, Gregory Steve Pitarys, Scott William Nolan
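The shortcut routing can be modeled with a small sketch (stage functions and the boolean routing decision are placeholders for the hardware controller): data always passes through the fixed stages, and the controller either sends it through the bypassable stages or around them via the shortcut.

```python
def run_pipeline(data, fixed_stages, bypassable_stages, bypass):
    """Run data through the pipeline; skip bypassable stages when bypass is set."""
    for stage in fixed_stages:
        data = stage(data)
    if not bypass:                 # controller routes into the bypassable stages
        for stage in bypassable_stages:
            data = stage(data)
    return data                    # with bypass set, the shortcut skips them
```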