Patents by Inventor Quanlu Zhang

Quanlu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240403618
    Abstract: Embodiments of the present disclosure include techniques for processing dynamically sparse neural networks as dense computations. A permutation is performed to translate an input tensor from a sparse format into a dense format. Once in the dense format, dense computation can be performed to generate output data that is also in the dense format. A reverse permutation may then be performed to translate the output data back into the sparse format. Prior to runtime, the tensor expression associated with the operator is analyzed to determine which of its dimensions are permutation invariant; the permutation may permute the input tensor across those permutation-invariant dimensions.
    Type: Application
    Filed: May 30, 2023
    Publication date: December 5, 2024
    Inventors: Ningxin ZHENG, Huiqiang JIANG, Quanlu ZHANG, Yuqing YANG, Lingxiao MA, Zhenhua HAN, Lili QIU, Fan YANG, Mao YANG, Lidong ZHOU
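    The idea in this abstract can be illustrated with a minimal sketch (not the claimed implementation). Assuming row sparsity and a matrix multiply, whose row dimension is permutation invariant, nonzero rows are permuted together, computed densely as a compact block, and the result is reverse-permuted; the function name and layout are illustrative only:

    ```python
    import numpy as np

    def permute_sparse_matmul(x, w):
        """Sketch: permute nonzero rows of a row-sparse x to the front,
        run dense compute on the compact block, then reverse-permute."""
        nonzero = np.any(x != 0, axis=1)
        perm = np.argsort(~nonzero, kind="stable")  # nonzero rows first
        inv = np.argsort(perm)                      # reverse permutation
        k = int(nonzero.sum())                      # size of the dense block
        y = np.zeros((x.shape[0], w.shape[1]))
        y[:k] = x[perm][:k] @ w                     # dense computation
        return y[inv]                               # back to sparse layout
    ```

    Because matmul output rows depend only on the matching input rows, permuting along that dimension does not change the result, which is why the analysis restricts permutation to permutation-invariant dimensions.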
  • Publication number: 20240403598
    Abstract: Embodiments of the present disclosure include techniques for designing and generating a parallelization plan for a neural network so that workloads in the neural network may be split amongst multiple devices. Operators and tensors in the neural network are transformed into a set of functionally equivalent operators and tensors. These functionally equivalent operators and tensors are then scheduled to separate devices for execution.
    Type: Application
    Filed: June 1, 2023
    Publication date: December 5, 2024
    Inventors: Youshan MIAO, Fan YANG, Quanlu ZHANG, Saeed MALEKI, Xu CAO, Yi ZHU, Mao YANG, Lidong ZHOU, Zhiqi LIN
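    A toy sketch of the transformation this abstract describes, under the assumption that the operator is a matrix multiply partitioned column-wise; the sub-operators are functionally equivalent to the original when their outputs are concatenated, and each shard could be scheduled to a separate device (the function name and two-device split are illustrative only):

    ```python
    import numpy as np

    def split_matmul_plan(x, w, num_devices=2):
        # Transform one matmul into functionally equivalent sub-operators
        # by partitioning w column-wise, one shard per device.
        shards = np.array_split(w, num_devices, axis=1)
        outputs = [x @ s for s in shards]        # each shard runs independently
        return np.concatenate(outputs, axis=1)   # recombine partial outputs
    ```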
  • Publication number: 20230419116
    Abstract: Embodiments of the present disclosure include systems and methods for providing sparsity for neural network models based on sparsity attributes. A first neural network model definition is received. The first neural network model definition specifies a neural network model comprising a set of tensors and a set of sparsity attribute values for elements of a tensor in the set of tensors. The set of sparsity attribute values for the tensor is propagated to elements of a subset of the set of tensors to form a second neural network model definition. The neural network model is generated based on the second neural network model definition.
    Type: Application
    Filed: June 27, 2022
    Publication date: December 28, 2023
    Inventors: Ningxin ZHENG, Quanlu ZHANG, Yuqing YANG, Lingxiao MA, Fan YANG, Yang WANG, Mao YANG, Lidong ZHOU
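    The propagation step can be sketched as a graph traversal: starting from the tensor that carries sparsity attribute values, the values are copied to connected tensors to produce a second model definition. The dict-based model representation and function name below are assumptions for illustration, not the patented format:

    ```python
    from collections import deque

    def propagate_sparsity(model_def, seed_tensor, attrs):
        # model_def: {"tensors": {name: attr_dict}, "edges": [(src, dst), ...]}
        # Copy sparsity attribute values from seed_tensor to all tensors
        # reachable from it, yielding a second model definition.
        tensors = {t: dict(v) for t, v in model_def["tensors"].items()}
        succ = {}
        for s, d in model_def["edges"]:
            succ.setdefault(s, []).append(d)
        queue, seen = deque([seed_tensor]), {seed_tensor}
        while queue:
            t = queue.popleft()
            tensors[t].update(attrs)             # apply the attribute values
            for n in succ.get(t, []):
                if n not in seen:
                    seen.add(n)
                    queue.append(n)
        return {"tensors": tensors, "edges": model_def["edges"]}
    ```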
  • Publication number: 20220229701
    Abstract: According to implementations of the subject matter, a solution for dynamic management of computing resources is provided. In the solution, a first request for using a target number of computing resources in a set of computing resources is received, wherein at least one free computing resource of the set of computing resources is organized into at least one free resource group. When it is determined that a matching free resource group is absent for the first request and a free redundant resource group is present in the at least one free resource group, the target number of computing resources is allocated for the first request by splitting the free redundant resource group, wherein the number of resources in the free redundant resource group is greater than the target number. Dynamic allocation of computing resources is thereby enabled.
    Type: Application
    Filed: May 4, 2020
    Publication date: July 21, 2022
    Inventors: Quanlu Zhang, Lidong Zhou, Mao Yang, Fan Yang, Hanyu Zhao, Zhenhua Han
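    The allocation rule in this abstract can be sketched as follows, modeling free resource groups simply as a list of group sizes (a hypothetical simplification; the choice to split the smallest redundant group is an illustrative policy, not stated in the abstract):

    ```python
    def allocate(free_groups, target):
        # Serve a request for `target` resources: prefer an exact-size free
        # group; otherwise split a larger (redundant) group, keeping the
        # remainder free.
        if target in free_groups:
            free_groups.remove(target)           # matching group present
            return target
        bigger = [g for g in free_groups if g > target]
        if not bigger:
            return None                          # no redundant group to split
        g = min(bigger)                          # split the smallest one
        free_groups.remove(g)
        free_groups.append(g - target)           # remainder stays free
        return target
    ```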