Patents by Inventor Quanlu Zhang

Quanlu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240403618
    Abstract: Embodiments of the present disclosure include techniques for processing dynamically sparse neural networks as dense computations. A permutation is performed to translate an input tensor from a sparse format into a dense format. Once in the dense format, dense computation can be performed to generate output data that is also in the dense format. A reverse permutation may then be performed to translate the output data back into the sparse format. Prior to runtime, the tensor expression associated with the operator is analyzed to determine which of its dimensions are permutation invariant; the permutation may permute the input tensor across those permutation-invariant dimensions.
    Type: Application
    Filed: May 30, 2023
    Publication date: December 5, 2024
    Inventors: Ningxin ZHENG, Huiqiang JIANG, Quanlu ZHANG, Yuqing YANG, Lingxiao MA, Zhenhua HAN, Lili QIU, Fan YANG, Mao YANG, Lidong ZHOU
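    The idea in this abstract can be illustrated with a minimal sketch (not the claimed implementation). Assuming row sparsity and a matrix multiply, whose row dimension is permutation invariant, nonzero rows are permuted together, computed densely as a compact block, and the result is reverse-permuted; the function name and layout are illustrative only:

    ```python
    import numpy as np

    def permute_sparse_matmul(x, w):
        """Sketch: permute nonzero rows of a row-sparse x to the front,
        run dense compute on the compact block, then reverse-permute."""
        nonzero = np.any(x != 0, axis=1)
        perm = np.argsort(~nonzero, kind="stable")  # nonzero rows first
        inv = np.argsort(perm)                      # reverse permutation
        k = int(nonzero.sum())                      # size of the dense block
        y = np.zeros((x.shape[0], w.shape[1]))
        y[:k] = x[perm][:k] @ w                     # dense computation
        return y[inv]                               # back to sparse layout
    ```

    Because matmul output rows depend only on the matching input rows, permuting along that dimension does not change the result, which is why the analysis restricts permutation to permutation-invariant dimensions.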
  • Publication number: 20240403598
    Abstract: Embodiments of the present disclosure include techniques for designing and generating a parallelization plan for a neural network so that workloads in the neural network may be split amongst multiple devices. Operators and tensors in the neural network are transformed into a set of functionally equivalent operators and tensors. These functionally equivalent operators and tensors are then scheduled to separate devices for execution.
    Type: Application
    Filed: June 1, 2023
    Publication date: December 5, 2024
    Inventors: Youshan MIAO, Fan YANG, Quanlu ZHANG, Saeed MALEKI, Xu CAO, Yi ZHU, Mao YANG, Lidong ZHOU, Zhiqi LIN
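    A toy sketch of the transformation this abstract describes, under the assumption that the operator is a matrix multiply partitioned column-wise; the sub-operators are functionally equivalent to the original when their outputs are concatenated, and each shard could be scheduled to a separate device (the function name and two-device split are illustrative only):

    ```python
    import numpy as np

    def split_matmul_plan(x, w, num_devices=2):
        # Transform one matmul into functionally equivalent sub-operators
        # by partitioning w column-wise, one shard per device.
        shards = np.array_split(w, num_devices, axis=1)
        outputs = [x @ s for s in shards]        # each shard runs independently
        return np.concatenate(outputs, axis=1)   # recombine partial outputs
    ```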
  • Publication number: 20230419116
    Abstract: Embodiments of the present disclosure include systems and methods for providing sparsity for neural network models based on sparsity attributes. A first neural network model definition is received. The first neural network model definition specifies a neural network model comprising a set of tensors and a set of sparsity attribute values for elements of a tensor in the set of tensors. The set of sparsity attribute values for the tensor is propagated to elements of a subset of the set of tensors to form a second neural network model definition. The neural network model is generated based on the second neural network model definition.
    Type: Application
    Filed: June 27, 2022
    Publication date: December 28, 2023
    Inventors: Ningxin ZHENG, Quanlu ZHANG, Yuqing YANG, Lingxiao MA, Fan YANG, Yang WANG, Mao YANG, Lidong ZHOU
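    The propagation step can be sketched as a graph traversal: starting from the tensor that carries sparsity attribute values, the values are copied to connected tensors to produce a second model definition. The dict-based model representation and function name below are assumptions for illustration, not the patented format:

    ```python
    from collections import deque

    def propagate_sparsity(model_def, seed_tensor, attrs):
        # model_def: {"tensors": {name: attr_dict}, "edges": [(src, dst), ...]}
        # Copy sparsity attribute values from seed_tensor to all tensors
        # reachable from it, yielding a second model definition.
        tensors = {t: dict(v) for t, v in model_def["tensors"].items()}
        succ = {}
        for s, d in model_def["edges"]:
            succ.setdefault(s, []).append(d)
        queue, seen = deque([seed_tensor]), {seed_tensor}
        while queue:
            t = queue.popleft()
            tensors[t].update(attrs)             # apply the attribute values
            for n in succ.get(t, []):
                if n not in seen:
                    seen.add(n)
                    queue.append(n)
        return {"tensors": tensors, "edges": model_def["edges"]}
    ```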
  • Publication number: 20220229701
    Abstract: According to implementations of the subject matter, a solution for dynamic management of computing resources is provided. In the solution, a first request for using a target number of computing resources in a set of computing resources is received, wherein at least one free computing resource of the set of computing resources is organized into at least one free resource group. When it is determined that a matching free resource group is absent for the first request and a free redundant resource group is present in the at least one free resource group, the target number of computing resources is allocated for the first request by splitting the free redundant resource group, wherein the number of resources in the free redundant resource group is greater than the target number. Dynamic allocation of computing resources is thereby enabled.
    Type: Application
    Filed: May 4, 2020
    Publication date: July 21, 2022
    Inventors: Quanlu Zhang, Lidong Zhou, Mao Yang, Fan Yang, Hanyu Zhao, Zhenhua Han
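    The allocation rule in this abstract can be sketched as follows, modeling free resource groups simply as a list of group sizes (a hypothetical simplification; the choice to split the smallest redundant group is an illustrative policy, not stated in the abstract):

    ```python
    def allocate(free_groups, target):
        # Serve a request for `target` resources: prefer an exact-size free
        # group; otherwise split a larger (redundant) group, keeping the
        # remainder free.
        if target in free_groups:
            free_groups.remove(target)           # matching group present
            return target
        bigger = [g for g in free_groups if g > target]
        if not bigger:
            return None                          # no redundant group to split
        g = min(bigger)                          # split the smallest one
        free_groups.remove(g)
        free_groups.append(g - target)           # remainder stays free
        return target
    ```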