Patents by Inventor ChengPing LUO

ChengPing LUO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

METHOD OF REDUCING CACHE THRASHING IN A PROCESSING SYSTEM AND RELATED PROCESSING SYSTEM

Publication number: 20250252057

Abstract: A method of reducing cache thrashing in a processing system is provided. M threads are issued to process a workload, and a memory access request associated with the M threads is transmitted to a first-level cache of the processing system. The memory access request is then transmitted to a second-level cache of the processing system in response to the first cache miss at the first-level cache. The memory access request is transmitted to a main memory of the processing system in response to the second cache miss at the second-level cache. The value of M is decreased when the relationship between the hit rates of the second-level cache and the first-level cache satisfies a predetermined criterion. A storage capacity and an access latency of the second-level cache are higher than those of the first-level cache.
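A minimal sketch of the throttling step described in this abstract. The abstract does not state the predetermined criterion or by how much M is decreased, so both are assumptions here: the hypothetical rule below halves M when the second-level hit rate is at least `ratio_threshold` times the first-level hit rate, which is read as a sign that the working set no longer fits in the first-level cache. The function and parameter names are illustrative only.

```python
def adjust_thread_count(m, l1_hit_rate, l2_hit_rate,
                        ratio_threshold=2.0, min_threads=1):
    """Return the next number of threads to issue.

    Assumed criterion (not from the abstract): throttle when the L2 hit
    rate is at least ratio_threshold times the L1 hit rate.
    Assumed policy (not from the abstract): halve M, never below min_threads.
    """
    if l2_hit_rate >= ratio_threshold * l1_hit_rate:
        return max(min_threads, m // 2)
    # Criterion not met: keep issuing the same number of threads.
    return m
```

Called once per sampling interval with freshly measured hit rates, this lowers M step by step while the assumed criterion keeps holding.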

Type: Application

Filed: January 24, 2025

Publication date: August 7, 2025

Applicant: MEDIATEK INC.

Inventors: Yu-Sheng Lin, Yoav Harel, Chengping Luo, You-Ming Tsao, Yu Bai

Computing system, computing processor and data processing method

Patent number: 12289227

Abstract: The present disclosure provides a computing system, a computing processor and a data processing method for the computing processor. The computing system includes: multiple computing clusters, each computing cluster includes multiple computing nodes, and each computing node includes multiple computing processors. At least some computing clusters among the computing clusters, at least some computing nodes in each computing cluster and at least some computing processors of each computing node are connected through direct links. Each computing processor of at least some computing processors of the computing node is configured with a local routing table, which is configured for the computing processor to determine, based on the local routing table, the next direct link through which a data packet performs routing from a data source to a data destination, and the computing processor forwards the data packet through the next direct link.
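The per-processor routing described in this abstract can be sketched as a table lookup repeated at every hop. This is an illustration under stated assumptions, not the patented design: each direct link is identified here by the name of the neighboring processor it leads to, and the table contents, the `route` function, and the `max_hops` guard are all hypothetical.

```python
# Hypothetical local routing tables: processor -> {destination: next direct link}.
# A link is named after the neighbor it leads to (an assumption for this sketch).
EXAMPLE_TABLES = {
    "A": {"B": "B", "C": "B"},
    "B": {"A": "A", "C": "C"},
    "C": {"A": "B", "B": "B"},
}


def route(tables, source, destination, max_hops=16):
    """Return the list of processors a packet visits from source to destination.

    Each processor consults only its own local table to pick the next link.
    """
    path = [source]
    current = source
    while current != destination:
        if len(path) > max_hops:
            raise RuntimeError("hop limit exceeded; routing tables may contain a loop")
        current = tables[current][destination]
        path.append(current)
    return path
```

For example, `route(EXAMPLE_TABLES, "A", "C")` walks A, then B, then C, because A's table sends traffic for C over its link to B.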

Type: Grant

Filed: May 11, 2022

Date of Patent: April 29, 2025

Assignee: Shanghai Biren Technology Co., Ltd

Inventors: Zhou Hong, Qin Zheng, ChengPing Luo

Method for processing data using computing array and computing system

Patent number: 12277450

Abstract: A method for processing data using a computing array is provided. In the method, source data is allocated to each of multiple computing nodes in a computing array. The source data includes multiple blocks. At a computing node among the computing nodes, in at least one iteration process, multiple blocks are respectively received from multiple other computing nodes other than the computing node among the computing nodes using multiple first type computing devices among a set of computing devices included in the computing node. A processing operation is executed on the received blocks using the first type computing devices respectively to generate multiple intermediate results. The processing operation is executed on the intermediate results to obtain a first part of a final result of executing the processing operation on the source data. A corresponding computer system is also provided.

Type: Grant

Filed: May 18, 2022

Date of Patent: April 15, 2025

Assignee: Shanghai Biren Technology Co., Ltd

Inventors: Zhou Hong, ChengPing Luo, Qin Zheng

Apparatus and method for secondary offloads in graphics processing unit

Patent number: 12236277

Abstract: The invention relates to an apparatus for second offloads in a graphics processing unit (GPU). The apparatus includes an engine; and a compute unit (CU). The CU is arranged operably to: fetch execution codes; when each execution code is suitable to be executed by the CU, execute the execution code; and when each execution code is not suitable to be executed by the CU, generate a corresponding entry, and send a request with the corresponding entry to the engine for instructing the engine to allow a component inside or outside of the GPU to complete an operation in accordance with the corresponding entry.

Type: Grant

Filed: April 21, 2023

Date of Patent: February 25, 2025

Assignee: Shanghai Biren Technology Co., Ltd

Inventors: HaiChuan Wang, Song Zhao, GuoFang Jiao, ChengPing Luo, Zhou Hong

METHOD AND SYSTEM OF PROCESSING GRAPHICS DATA WITH TILE-BASED RENDERING PIPELINE

Publication number: 20240346741

Abstract: In aspects of the disclosure, a method, a system, and a computer-readable medium, are provided. The method for processing graphics data with a graphics rendering pipeline comprising a mesh shader and a tiler, comprising outputting, by the mesh shader in response to an input of the graphics data, legacy mesh shader output parameters including vertices and primitives, and additional data with a meshlet bounding-box, or axis-aligned bounding box (AABB) structure; sending the AABB to the tiler as an input, and generating, by the tiler, a visibility stream according to the AABB, wherein each entity of the visibility stream indicates that the AABB is fully visible, partially visible, or invisible in the view frustum; and sending the visibility stream back to the tiler as a further input along with the legacy mesh shader output parameters for coming rasterization in a fragment pass.
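The three visibility states named in this abstract (fully visible, partially visible, invisible) can be illustrated with a standard AABB-versus-frustum plane test. This is a generic sketch, not the tiler's actual method, which the abstract does not describe. It assumes the frustum is given as planes `(nx, ny, nz, d)` whose inside half-space satisfies `n . p + d >= 0`. The example frustum is simply the cube [-1, 1]^3, and all names are hypothetical.

```python
# Example "frustum": the cube [-1, 1]^3 expressed as six inward-facing planes.
UNIT_CUBE_PLANES = [
    (1, 0, 0, 1), (-1, 0, 0, 1),
    (0, 1, 0, 1), (0, -1, 0, 1),
    (0, 0, 1, 1), (0, 0, -1, 1),
]


def classify_aabb(box_min, box_max, planes):
    """Classify an axis-aligned box as fully, partially, or not visible.

    The test is conservative: a box lying just outside a frustum corner can
    be reported as partially visible even though it is not.
    """
    result = "fully_visible"
    for nx, ny, nz, d in planes:
        normal = (nx, ny, nz)
        # Corner farthest along the plane normal, and the opposite corner.
        far = [box_max[i] if normal[i] >= 0 else box_min[i] for i in range(3)]
        near = [box_min[i] if normal[i] >= 0 else box_max[i] for i in range(3)]
        if sum(normal[i] * far[i] for i in range(3)) + d < 0:
            return "invisible"  # whole box is behind this plane
        if sum(normal[i] * near[i] for i in range(3)) + d < 0:
            result = "partially_visible"  # box straddles this plane
    return result
```

One such result per meshlet bounding box would correspond to one entry of the visibility stream described above.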

Type: Application

Filed: April 4, 2024

Publication date: October 17, 2024

Inventors: Chengping Luo, You-Ming Tsao, Bozhan Chen, Sheng-Wen Huang

Application Programming Interface to Indicate Bounding Volume Output

Publication number: 20240338881

Abstract: An application programming interface includes a mesh shader, a rasterizer, and a fragment shader. The mesh shader is used to process 3-dimensional objects and output vertices, primitives, and a plurality of bounding volumes of the 3-dimensional objects. The rasterizer is linked to the mesh shader, and used to project the vertices, the primitives, and the plurality of bounding volumes into 2-dimensional fragments. The fragment shader is linked to the rasterizer, and used to output a 2-dimensional image according to the 2-dimensional fragments.

Type: Application

Filed: April 2, 2024

Publication date: October 10, 2024

Applicant: MEDIATEK INC.

Inventors: Chengping Luo, You-Ming Tsao, Bozhan Chen, Sheng-Wen Huang

Interconnection device

Patent number: 12095654

Abstract: An information processing method, an interconnection device, and a computer-readable storage medium are provided. The interconnection device includes a request processing module configured for: receiving a data access request from at least one processor, wherein the data access request comprises a merge bit, a multicast group identifier (MGID), and a multicast transaction identifier (MTID); determining whether the data access request is a multicast request; determining whether the interconnection device receives other multicast requests if it is determined that the data access request is a multicast request based on the MGID, the MTID, and a static routing policy of a multicast group; and obtaining the other multicast requests if it is determined that the interconnection device receives the other multicast requests, merging the multicast request with the other multicast requests into a merged request, and forwarding the merged request to a next-hop device of the interconnection device.
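The merge step in this abstract can be sketched as a buffer keyed by (MGID, MTID). The abstract does not say how the static routing policy is encoded, so this sketch stands in for it with a hypothetical map from MGID to the number of requests expected at this device. It also assumes that the merge bit marks a request as a mergeable multicast request. The request fields other than the merge bit, MGID, and MTID (for instance `source`) are illustrative.

```python
from collections import defaultdict


class MergingInterconnect:
    """Toy model of a device that merges multicast requests before forwarding."""

    def __init__(self, expected_per_group):
        # Hypothetical stand-in for the static routing policy:
        # MGID -> number of member requests that pass through this device.
        self.expected = expected_per_group
        self.pending = defaultdict(list)

    def receive(self, request):
        """Return the request to forward to the next hop, or None while waiting."""
        if not request["merge"]:
            return request  # not a mergeable multicast request: forward unchanged
        key = (request["mgid"], request["mtid"])
        self.pending[key].append(request)
        if len(self.pending[key]) < self.expected[request["mgid"]]:
            return None  # other requests of this transaction are still outstanding
        merged = self.pending.pop(key)
        return {
            "merge": True,
            "mgid": key[0],
            "mtid": key[1],
            "sources": [r["source"] for r in merged],
        }
```

With `MergingInterconnect({7: 2})`, the first request for group 7 is held and the second one releases a single merged request that lists both sources.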

Type: Grant

Filed: October 15, 2023

Date of Patent: September 17, 2024

Assignee: Shanghai Biren Technology Co., Ltd

Inventors: Qin Zheng, Zhou Hong, YuFei Zhang, Lin Chen, ChengKun Sun, Tong Sun, ChengPing Luo, HaiChuan Wang

INTERCONNECTION DEVICE

Publication number: 20240048475

Abstract: An information processing method, an interconnection device, and a computer-readable storage medium are provided. The interconnection device includes a request processing module configured for: receiving a data access request from at least one processor, wherein the data access request comprises a merge bit, a multicast group identifier (MGID), and a multicast transaction identifier (MTID); determining whether the data access request is a multicast request; determining whether the interconnection device receives other multicast requests if it is determined that the data access request is a multicast request based on the MGID, the MTID, and a static routing policy of a multicast group; and obtaining the other multicast requests if it is determined that the interconnection device receives the other multicast requests, merging the multicast request with the other multicast requests into a merged request, and forwarding the merged request to a next-hop device of the interconnection device.

Type: Application

Filed: October 15, 2023

Publication date: February 8, 2024

Applicant: Shanghai Biren Technology Co., Ltd

Inventors: Qin ZHENG, Zhou HONG, YuFei ZHANG, Lin CHEN, ChengKun SUN, Tong SUN, ChengPing LUO, HaiChuan WANG

Information processing method, interconnection device and computer-readable storage medium

Patent number: 11855878

Abstract: An information processing method, an interconnection device, and a computer-readable storage medium are provided. The interconnection device includes a request processing module configured for: receiving a data access request from at least one processor, wherein the data access request comprises a merge bit, a multicast group identifier (MGID), and a multicast transaction identifier (MTID); determining whether the data access request is a multicast request; determining whether the interconnection device receives other multicast requests if it is determined that the data access request is a multicast request based on the MGID, the MTID, and a static routing policy of a multicast group; and obtaining the other multicast requests if it is determined that the interconnection device receives the other multicast requests, merging the multicast request with the other multicast requests into a merged request, and forwarding the merged request to a next-hop device of the interconnection device.

Type: Grant

Filed: November 11, 2021

Date of Patent: December 26, 2023

Assignee: Shanghai Biren Technology Co., Ltd

Inventors: Qin Zheng, Zhou Hong, YuFei Zhang, Lin Chen, ChengKun Sun, Tong Sun, ChengPing Luo, HaiChuan Wang

Artificial intelligence chip and data operation method

Patent number: 11809221

Abstract: An artificial intelligence chip and a data operation method are provided. The artificial intelligence chip receives a command carrying first data and address information and includes a chip memory, a computing processor, a base address register, and an extended address processor. The base address register is configured to access an extended address space in the chip memory. The extended address processor receives the command. The extended address processor determines an operation mode of the first data according to the address information. When the address information points to a first section of the extended address space, the extended address processor performs a first operation on the first data. When the address information points to a section other than the first section of the extended address space, the extended address processor notifies the computing processor of the operation mode and the computing processor performs a second operation on the first data.

Type: Grant

Filed: September 8, 2021

Date of Patent: November 7, 2023

Assignee: Shanghai Biren Technology Co., Ltd

Inventors: Zhou Hong, Qin Zheng, ChengPing Luo, GuoFang Jiao, Song Zhao, XiangLiang Yu

Apparatus and method and computer program product for compiling code adapted for secondary offloads in graphics processing unit

Patent number: 11748077

Abstract: The invention relates to a method for compiling code adapted for secondary offloads in a graphics processing unit (GPU). The method, performed by a processing unit, includes: reconstructing execution codes in a first kernel into a second kernel. The second kernel includes an operation table including entries, and computation codes. The computation codes include a portion of the execution codes, and synchronization hooks, and each synchronization hook includes information indicating one entry of the operation table. An order of the portion of the execution codes and the synchronization hooks in the computation codes matches an order of the execution codes in the first kernel, thereby enabling a compute unit (CU) in the GPU to execute the computation codes, and an engine in the GPU to instruct a component inside or outside of the GPU to complete a designated operation in accordance with content of each entry in the operation table.

Type: Grant

Filed: July 2, 2021

Date of Patent: September 5, 2023

Assignee: SHANGHAI BIREN TECHNOLOGY CO., LTD

Inventors: HaiChuan Wang, Song Zhao, GuoFang Jiao, ChengPing Luo, Zhou Hong

APPARATUS AND METHOD FOR SECONDARY OFFLOADS IN GRAPHICS PROCESSING UNIT

Publication number: 20230267011

Abstract: The invention relates to an apparatus for second offloads in a graphics processing unit (GPU). The apparatus includes an engine; and a compute unit (CU). The CU is arranged operably to: fetch execution codes; when each execution code is suitable to be executed by the CU, execute the execution code; and when each execution code is not suitable to be executed by the CU, generate a corresponding entry, and send a request with the corresponding entry to the engine for instructing the engine to allow a component inside or outside of the GPU to complete an operation in accordance with the corresponding entry.

Type: Application

Filed: April 21, 2023

Publication date: August 24, 2023

Applicant: Shanghai Biren Technology Co., Ltd

Inventors: HaiChuan WANG, Song ZHAO, GuoFang JIAO, ChengPing LUO, Zhou HONG

Apparatus and method for secondary offloads in graphics processing unit

Patent number: 11663044

Abstract: The invention relates to an apparatus for second offloads in a graphics processing unit (GPU). The apparatus includes an engine; and a compute unit (CU). The engine is arranged operably to store an operation table including entries. The CU is arranged operably to fetch computation codes including execution codes, and synchronization requests; execute each execution code; and send requests to the engine in accordance with the synchronization requests for instructing the engine to allow components inside or outside of the GPU to complete operations in accordance with the entries of the operation table.

Type: Grant

Filed: July 2, 2021

Date of Patent: May 30, 2023

Assignee: Shanghai Biren Technology Co., Ltd

Inventors: HaiChuan Wang, Song Zhao, GuoFang Jiao, ChengPing Luo, Zhou Hong

ARTIFICIAL INTELLIGENCE CHIP AND DATA OPERATION METHOD

Publication number: 20220398102

Abstract: An artificial intelligence chip and a data operation method are provided. The artificial intelligence chip receives a command carrying first data and address information and includes a chip memory, a computing processor, a base address register, and an extended address processor. The base address register is configured to access an extended address space in the chip memory. The extended address processor receives the command. The extended address processor determines an operation mode of the first data according to the address information. When the address information points to a first section of the extended address space, the extended address processor performs a first operation on the first data. When the address information points to a section other than the first section of the extended address space, the extended address processor notifies the computing processor of the operation mode and the computing processor performs a second operation on the first data.

Type: Application

Filed: September 8, 2021

Publication date: December 15, 2022

Applicant: Shanghai Biren Technology Co., Ltd

Inventors: Zhou HONG, Qin ZHENG, ChengPing LUO, GuoFang JIAO, Song ZHAO, XiangLiang YU

METHOD FOR PROCESSING DATA USING COMPUTING ARRAY AND COMPUTING SYSTEM

Publication number: 20220374280

Abstract: A method for processing data using a computing array is provided. In the method, source data is allocated to each of multiple computing nodes in a computing array. The source data includes multiple blocks. At a computing node among the computing nodes, in at least one iteration process, multiple blocks are respectively received from multiple other computing nodes other than the computing node among the computing nodes using multiple first type computing devices among a set of computing devices included in the computing node. A processing operation is executed on the received blocks using the first type computing devices respectively to generate multiple intermediate results. The processing operation is executed on the intermediate results to obtain a first part of a final result of executing the processing operation on the source data. A corresponding computer system is also provided.

Type: Application

Filed: May 18, 2022

Publication date: November 24, 2022

Applicant: Shanghai Biren Technology Co., Ltd

Inventors: Zhou HONG, ChengPing LUO, Qin ZHENG

COMPUTING SYSTEM, COMPUTING PROCESSOR AND DATA PROCESSING METHOD

Publication number: 20220368619

Abstract: The present disclosure provides a computing system, a computing processor and a data processing method for the computing processor. The computing system includes: multiple computing clusters, each computing cluster includes multiple computing nodes, and each computing node includes multiple computing processors. At least some computing clusters among the computing clusters, at least some computing nodes in each computing cluster and at least some computing processors of each computing node are connected through direct links. Each computing processor of at least some computing processors of the computing node is configured with a local routing table, which is configured for the computing processor to determine, based on the local routing table, the next direct link through which a data packet performs routing from a data source to a data destination, and the computing processor forwards the data packet through the next direct link.

Type: Application

Filed: May 11, 2022

Publication date: November 17, 2022

Applicant: Shanghai Biren Technology Co., Ltd

Inventors: Zhou HONG, Qin ZHENG, ChengPing LUO

INFORMATION PROCESSING METHOD, INTERCONNECTION DEVICE AND COMPUTER-READABLE STORAGE MEDIUM

Publication number: 20220158929

Abstract: An information processing method, an interconnection device, and a computer-readable storage medium are provided. The interconnection device includes a request processing module configured for: receiving a data access request from at least one processor, wherein the data access request comprises a merge bit, a multicast group identifier (MGID), and a multicast transaction identifier (MTID); determining whether the data access request is a multicast request; determining whether the interconnection device receives other multicast requests if it is determined that the data access request is a multicast request based on the MGID, the MTID, and a static routing policy of a multicast group; and obtaining the other multicast requests if it is determined that the interconnection device receives the other multicast requests, merging the multicast request with the other multicast requests into a merged request, and forwarding the merged request to a next-hop device of the interconnection device.

Type: Application

Filed: November 11, 2021

Publication date: May 19, 2022

Applicant: Shanghai Biren Technology Co., Ltd

Inventors: Qin ZHENG, Zhou HONG, YuFei ZHANG, Lin CHEN, ChengKun SUN, Tong SUN, ChengPing LUO, HaiChuan WANG

APPARATUS AND METHOD AND COMPUTER PROGRAM PRODUCT FOR COMPILING CODE ADAPTED FOR SECONDARY OFFLOADS IN GRAPHICS PROCESSING UNIT

Publication number: 20220129255

Abstract: The invention relates to a method for compiling code adapted for secondary offloads in a graphics processing unit (GPU). The method, performed by a processing unit, includes: reconstructing execution codes in a first kernel into a second kernel. The second kernel includes an operation table including entries, and computation codes. The computation codes include a portion of the execution codes, and synchronization hooks, and each synchronization hook includes information indicating one entry of the operation table. An order of the portion of the execution codes and the synchronization hooks in the computation codes matches an order of the execution codes in the first kernel, thereby enabling a compute unit (CU) in the GPU to execute the computation codes, and an engine in the GPU to instruct a component inside or outside of the GPU to complete a designated operation in accordance with content of each entry in the operation table.

Type: Application

Filed: July 2, 2021

Publication date: April 28, 2022

Applicant: Shanghai Biren Technology Co., Ltd

Inventors: HaiChuan WANG, Song ZHAO, GuoFang JIAO, ChengPing LUO, Zhou HONG

APPARATUS AND METHOD FOR SECONDARY OFFLOADS IN GRAPHICS PROCESSING UNIT

Publication number: 20220129272

Abstract: The invention relates to an apparatus for second offloads in a graphics processing unit (GPU). The apparatus includes an engine; and a compute unit (CU). The engine is arranged operably to store an operation table including entries. The CU is arranged operably to fetch computation codes including execution codes, and synchronization requests; execute each execution code; and send requests to the engine in accordance with the synchronization requests for instructing the engine to allow components inside or outside of the GPU to complete operations in accordance with the entries of the operation table.

Type: Application

Filed: July 2, 2021

Publication date: April 28, 2022

Applicant: Shanghai Biren Technology Co., Ltd

Inventors: HaiChuan WANG, Song ZHAO, GuoFang JIAO, ChengPing LUO, Zhou HONG