Patents by Inventor Lingjie Xu

Lingjie Xu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240037179
    Abstract: A data processing method and a data processing apparatus are provided. The data processing method includes: acquiring multiple input tensors as input parameters for a calculation process; for each input tensor, using M input sub-tensors that are combined to represent that input tensor; and replacing each input tensor with its M input sub-tensors and performing the calculation process to obtain a calculation result. The data processing method broadens the scenarios to which the calculation process can be applied, effectively utilizes the powerful computing capability of the natively provided low-accuracy floating-point formats, and greatly improves overall calculation efficiency.
    Type: Application
    Filed: November 10, 2022
    Publication date: February 1, 2024
    Applicant: Shanghai Biren Technology Co., Ltd.
    Inventors: Shuangshuang WU, Yunpeng WANG, Jun PENG, Liucheng DUAN, Hang YANG, Xiaoyang LI, Lingjie XU, HaiChuan WANG, Shu CHEN
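The multiply-via-sub-tensors idea in the abstract above can be sketched in NumPy. This is an illustrative reconstruction, not code from the patent: the function names, the use of float16 as the low-accuracy format, and M = 2 are all assumptions.

```python
import numpy as np

def split_tensor(x, m=2):
    """Represent a float32 tensor as a sum of m float16 sub-tensors."""
    parts, residual = [], x.astype(np.float32)
    for _ in range(m):
        p = residual.astype(np.float16)             # low-accuracy part
        parts.append(p)
        residual = residual - p.astype(np.float32)  # what that part missed
    return parts

def matmul_via_subtensors(a_parts, b_parts):
    """Multiply two tensors given only their low-accuracy sub-tensors,
    accumulating all m*m partial products at higher precision."""
    acc = np.zeros((a_parts[0].shape[0], b_parts[0].shape[1]), dtype=np.float32)
    for pa in a_parts:
        for pb in b_parts:
            acc += pa.astype(np.float32) @ pb.astype(np.float32)
    return acc
```

On hardware whose fast paths support only low-accuracy floats, each partial product would run on those units; the sketch emulates that with float32 matrix multiplies.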
  • Publication number: 20230125700
    Abstract: The embodiments of the disclosure relate to a data processing method and a computing system. For each die: a first reduction engine is determined from among the multiple reduction engines corresponding to the multiple computing cores included in the current die; each computing core sends data to be reduced and a synchronization indicator to the first reduction engines in the multiple dies; in response to receiving the data to be reduced and the synchronization indicators from the computing cores in the multiple dies, the first reduction engine in the current die performs a reduction operation on the data to be reduced to generate a reduction computing result and sends synchronization acknowledgments to the computing cores in the current die; and in response to receiving the synchronization acknowledgment, each computing core in the current die reads the reduction computing result from the first reduction engine in the current die.
    Type: Application
    Filed: October 19, 2022
    Publication date: April 27, 2023
    Applicant: Shanghai Biren Technology Co., Ltd.
    Inventors: Zhou HONG, Lingjie XU, Chengkun SUN, Hao SHU, Lin CHEN, Wei LIANG, Chao MENG
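A single-threaded toy model of the flow described above (determine a first reduction engine per die, broadcast data to every die's engine, reduce, acknowledge) might look like this. Class and field names are invented for illustration, and sum is assumed as the reduction operation.

```python
class ReductionEngine:
    """Stands in for the 'first reduction engine' chosen in each die."""
    def __init__(self, expected):
        self.expected = expected   # number of cores across all dies
        self.pending = []          # data to be reduced, as it arrives
        self.result = None

    def receive(self, data):
        self.pending.append(data)
        if len(self.pending) == self.expected:  # all indicators seen
            self.result = sum(self.pending)     # reduction operation: sum
            return True                         # send acknowledgment
        return False

# Two dies, two cores per die: every core sends its value to the first
# reduction engine of every die, so each die ends up with the full result.
engines = [ReductionEngine(expected=4) for _ in range(2)]
core_values = [1, 2, 3, 4]
for value in core_values:
    for engine in engines:
        engine.receive(value)

# After acknowledgment, each core reads from its local die's engine.
local_results = [engine.result for engine in engines]
```

The point of the scheme is that every core reads the reduced result locally, from its own die's engine, rather than fetching it across dies.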
  • Patent number: 11609792
    Abstract: The present disclosure relates to a method for allocating resources of an accelerator to two or more neural networks for execution. The two or more neural networks may include a first neural network and a second neural network. The method comprises analyzing workloads of the first neural network and the second neural network, wherein the first neural network and second neural network each includes multiple computational layers, evaluating computational resources of the accelerator for executing each computational layer of the first and second neural networks, and scheduling computational resources of the accelerator to execute one computational layer of the multiple computational layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
    Type: Grant
    Filed: March 19, 2019
    Date of Patent: March 21, 2023
    Assignee: Alibaba Group Holding Limited
    Inventors: Lingjie Xu, Wei Wei
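One way to picture the scheduling step is a greedy packer that runs one layer of the first network per step and fills the leftover accelerator capacity with layers of the second network. This is a hypothetical sketch; the patent does not specify the policy, and the per-layer cost units are invented.

```python
def schedule(layers_a, layers_b, capacity):
    """Greedy co-scheduling sketch: each step runs one layer of network A
    and packs as many layers of network B as remaining capacity allows.
    layers_a / layers_b are lists of per-layer resource demands."""
    steps, j = [], 0
    for cost_a in layers_a:
        batch = [("A", cost_a)]            # one layer of the first network
        budget = capacity - cost_a
        while j < len(layers_b) and layers_b[j] <= budget:
            budget -= layers_b[j]          # pack layers of the second network
            batch.append(("B", layers_b[j]))
            j += 1
        steps.append(batch)
    return steps
```

With `schedule([4, 4], [1, 2, 3], capacity=8)`, the first step co-runs A's layer with B's first two layers, and the second step co-runs A's other layer with B's third.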
  • Patent number: 11579680
    Abstract: A method for power management based on synthetic machine learning benchmarks, including generating a record of synthetic machine learning benchmarks for synthetic machine learning models that are obtained by changing machine learning network topology parameters, receiving hardware information from a client device executing a machine learning program or preparing to execute a machine learning program, selecting a synthetic machine learning benchmark based on the correlation of the hardware information with the synthetic machine learning models, and determining work schedules based on the selected synthetic machine learning benchmark.
    Type: Grant
    Filed: February 1, 2019
    Date of Patent: February 14, 2023
    Assignee: Alibaba Group Holding Limited
    Inventors: Wei Wei, Lingjie Xu, Lingling Jin, Wei Zhang
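A toy version of the selection-and-scheduling flow in the abstract above might look like the following; the benchmark record, the matching rule, and the duty-cycle policy are all illustrative assumptions, not the patented method.

```python
# Hypothetical benchmark record: topology parameters -> measured figures.
benchmarks = {
    ("conv", 16): {"throughput": 900, "watts": 45},
    ("conv", 64): {"throughput": 400, "watts": 70},
    ("mlp", 16):  {"throughput": 1500, "watts": 30},
}

def select_benchmark(hw_info, benchmarks):
    """Pick the synthetic benchmark whose topology parameters best match
    the client's reported workload (a toy nearest-match on layer width)."""
    kind, width = hw_info["model_kind"], hw_info["layer_width"]
    candidates = [k for k in benchmarks if k[0] == kind]
    return min(candidates, key=lambda k: abs(k[1] - width))

def work_schedule(hw_info, benchmarks, power_budget):
    """Derive a work schedule from the selected benchmark: duty-cycle the
    device so its average power stays within the budget."""
    key = select_benchmark(hw_info, benchmarks)
    b = benchmarks[key]
    duty = min(1.0, power_budget / b["watts"])
    return {"benchmark": key, "duty_cycle": duty,
            "effective_throughput": b["throughput"] * duty}
```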
  • Patent number: 11410016
    Abstract: Selective performance of deterministic computations for neural networks is disclosed, including: obtaining a statistical model for a selection layer of the neural network, the statistical model indicating probabilities that corresponding values are selected by the selection layer, the statistical model being generated using historical data; selectively performing a subset of a plurality of deterministic computations on new input data to the neural network, the plurality of deterministic computations being associated with the deterministic computation layer, the selective performance of the deterministic computations being based at least in part on the statistical model and generating a computation result; and outputting the computation result to another layer in the neural network.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: August 9, 2022
    Inventors: Lingjie Xu, Wei Wei
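The skip-by-probability idea can be sketched for a max-style selection layer: positions the historical model says are almost never selected simply are not computed. The names and the 5% threshold are invented for illustration.

```python
import numpy as np

def selective_forward(x, compute, select_probs, threshold=0.05):
    """Skip the deterministic computation at positions the historical
    statistical model says are rarely selected (hypothetical scheme)."""
    out = np.zeros_like(x, dtype=float)
    mask = select_probs >= threshold    # only positions worth computing
    out[mask] = compute(x[mask])        # the deterministic computation
    return out.max()                    # the selection layer (max here)
```

The saving comes from `compute` running on the masked subset only; the risk, which the statistical model is meant to bound, is that a skipped position would have won the selection.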
  • Patent number: 11223838
    Abstract: A video processing apparatus includes a programmable hardware encoder configured to execute an encoding process on a plurality of input video frames. The video processing apparatus further includes a controller coupled with the programmable hardware encoder. The controller is configured to execute a set of instructions to cause the video processing apparatus to: determine first information of the plurality of input video frames, and adjust the encoding process based on the first information.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: January 11, 2022
    Assignee: Alibaba Group Holding Limited
    Inventors: Yen-kuang Chen, Lingjie Xu, Minghai Qin, Ping Chen, Xinyang Yu, Qinggang Zhou
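The controller's determine-then-adjust loop can be caricatured in a few lines; the crude motion metric and the quantization-parameter policy are invented stand-ins for the unspecified "first information" and adjustment.

```python
def adjust_encoding(frames, base_qp=30):
    """Toy controller loop: derive per-frame information (here, a crude
    motion estimate) and adjust the encoder's quantization parameter."""
    settings, prev = [], None
    for f in frames:
        motion = 0 if prev is None else sum(abs(a - b) for a, b in zip(f, prev)) / len(f)
        # High-motion frames tolerate coarser quantization (hypothetical policy).
        qp = base_qp + (4 if motion > 10 else 0)
        settings.append(qp)
        prev = f
    return settings
```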
  • Publication number: 20210264220
    Abstract: The present disclosure relates to a method for updating a machine learning model. The method includes selecting a first column to be removed from a first embedding table to obtain a first reduced number of columns for the first embedding table; obtaining a first accuracy result determined by applying a plurality of vectors into the machine learning model, the plurality of vectors including a first vector having a number of numeric values that are converted using the first embedding table with the first reduced number of columns; and determining whether to remove the first column from the first embedding table in accordance with an evaluation of the first accuracy result against a first predetermined criterion.
    Type: Application
    Filed: February 21, 2020
    Publication date: August 26, 2021
    Inventors: Wei WEI, Wei ZHANG, Lingjie XU, Lingling JIN
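The accept-or-reject step for removing an embedding column can be sketched as follows; `evaluate` stands in for re-running the model over the plurality of vectors, and all names here are hypothetical.

```python
import numpy as np

def try_remove_column(table, col, evaluate, min_accuracy):
    """Remove one embedding column only if model accuracy stays above the
    predetermined criterion; otherwise keep the table unchanged."""
    reduced = np.delete(table, col, axis=1)
    if evaluate(reduced) >= min_accuracy:
        return reduced, True     # removal accepted
    return table, False          # removal rejected
```

Repeating the step column by column shrinks the embedding table until accuracy would drop below the criterion.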
  • Publication number: 20200374534
    Abstract: AI-assisted programmable hardware video codec is disclosed. According to certain embodiments, a video processing apparatus includes a programmable hardware encoder configured to execute an encoding process on a plurality of input video frames. The video processing apparatus further includes a controller coupled with the programmable hardware encoder. The controller is configured to execute a set of instructions to cause the video processing apparatus to: determine first information of the plurality of input video frames, and adjust the encoding process based on the first information.
    Type: Application
    Filed: May 6, 2020
    Publication date: November 26, 2020
    Inventors: Yen-kuang CHEN, Lingjie XU, Minghai QIN, Ping CHEN, Xinyang YU, Qinggang ZHOU
  • Publication number: 20200342287
    Abstract: Selective performance of deterministic computations for neural networks is disclosed, including: obtaining a statistical model for a selection layer of the neural network, the statistical model indicating probabilities that corresponding values are selected by the selection layer, the statistical model being generated using historical data; selectively performing a subset of a plurality of deterministic computations on new input data to the neural network, the plurality of deterministic computations being associated with the deterministic computation layer, the selective performance of the deterministic computations being based at least in part on the statistical model and generating a computation result; and outputting the computation result to another layer in the neural network.
    Type: Application
    Filed: April 26, 2019
    Publication date: October 29, 2020
    Inventors: Lingjie Xu, Wei Wei
  • Publication number: 20200301739
    Abstract: The present disclosure relates to a method for allocating resources of an accelerator to two or more neural networks for execution. The two or more neural networks may include a first neural network and a second neural network. The method comprises analyzing workloads of the first neural network and the second neural network, wherein the first neural network and second neural network each includes multiple computational layers, evaluating computational resources of the accelerator for executing each computational layer of the first and second neural networks, and scheduling computational resources of the accelerator to execute one computational layer of the multiple computational layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
    Type: Application
    Filed: March 19, 2019
    Publication date: September 24, 2020
    Inventors: Lingjie XU, Wei WEI
  • Publication number: 20200272896
    Abstract: The present disclosure provides systems and methods for deep learning training using edge devices. The methods can include identifying one or more edge devices, determining characteristics of the identified edge devices, evaluating a deep learning workload to determine an amount of resources for processing, assigning the deep learning workload to one or more identified edge devices based on the characteristics of the one or more identified edge devices, and facilitating communication between the one or more identified edge devices for completing the deep learning workload.
    Type: Application
    Filed: April 30, 2019
    Publication date: August 27, 2020
    Inventors: Wei WEI, Lingjie XU, Lingling JIN, Wei ZHANG
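The assign-by-characteristics step might be modeled as a greedy bin-packing over device capacities; the demands, capacities, and largest-task-first order below are illustrative assumptions.

```python
def assign_workload(tasks, devices):
    """Greedy placement sketch: give each piece of the deep learning
    workload to the identified edge device with the most free capacity.
    tasks is {name: demand}; devices is {name: capacity}."""
    free = dict(devices)
    placement = {}
    for task, demand in sorted(tasks.items(), key=lambda kv: -kv[1]):
        dev = max(free, key=free.get)          # most free capacity
        if free[dev] < demand:
            raise RuntimeError(f"no device can host {task}")
        placement[task] = dev
        free[dev] -= demand
    return placement
```

A real system would also weigh the communication between devices that the abstract mentions; this sketch considers capacity only.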
  • Publication number: 20200249740
    Abstract: A method for power management based on synthetic machine learning benchmarks, including generating a record of synthetic machine learning benchmarks for synthetic machine learning models that are obtained by changing machine learning network topology parameters, receiving hardware information from a client device executing a machine learning program or preparing to execute a machine learning program, selecting a synthetic machine learning benchmark based on the correlation of the hardware information with the synthetic machine learning models, and determining work schedules based on the selected synthetic machine learning benchmark.
    Type: Application
    Filed: February 1, 2019
    Publication date: August 6, 2020
    Inventors: Wei WEI, Lingjie XU, Lingling JIN, Wei ZHANG
  • Publication number: 20200218985
    Abstract: Embodiments described herein provide a system for facilitating efficient benchmarking of a piece of hardware configured to process artificial intelligence (AI) related operations. During operation, the system determines the workloads of a set of AI models based on layer information associated with a respective layer of a respective AI model. The set of AI models are representative of applications that run on the piece of hardware. The system forms a set of workload clusters from the workloads and determines a representative workload for a workload cluster. The system then determines, using a meta-heuristic, an input size that corresponds to the representative workload. The system determines, based on the set of workload clusters, a synthetic AI model configured to generate a workload that represents statistical properties of the workloads on the piece of hardware. The input size can generate the representative workload at a computational layer of the synthetic AI model.
    Type: Application
    Filed: January 3, 2019
    Publication date: July 9, 2020
    Applicant: Alibaba Group Holding Limited
    Inventors: Wei Wei, Lingjie Xu, Lingling Jin
  • Publication number: 20200042419
    Abstract: Embodiments described herein provide a system for facilitating efficient benchmarking of a piece of hardware for artificial intelligence (AI) models. During operation, the system determines a set of AI models that are representative of applications that run on the piece of hardware. The piece of hardware can be configured to process AI-related operations. The system can determine workloads of the set of AI models based on layer information associated with a respective layer of a respective AI model in the set of AI models and form a set of workload clusters from the determined workloads. The system then determines, based on the set of workload clusters, a synthetic AI model configured to generate a workload that represents statistical properties of the determined workload.
    Type: Application
    Filed: July 31, 2018
    Publication date: February 6, 2020
    Applicant: Alibaba Group Holding Limited
    Inventors: Wei Wei, Lingjie Xu, Lingling Jin
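The cluster-then-represent step common to the two benchmarking publications above can be illustrated with a tiny 1-D 2-means over per-layer workloads; the actual clustering method and the synthetic-model construction are not specified at this level, so this is only a sketch.

```python
def cluster_workloads(workloads, iters=10):
    """1-D 2-means sketch: cluster per-layer workloads (e.g. FLOP counts)
    into two clusters and return a representative (mean) workload for each,
    from which a synthetic AI model's layers could be sized."""
    centers = [min(workloads), max(workloads)]      # simple initialization
    for _ in range(iters):
        groups = [[], []]
        for w in workloads:
            nearest = 0 if abs(w - centers[0]) <= abs(w - centers[1]) else 1
            groups[nearest].append(w)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers
```

For workloads [1, 2, 3, 100, 110] the representatives come out as 2.0 and 105.0, i.e. one light-layer cluster and one heavy-layer cluster.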
  • Patent number: 10061591
    Abstract: A method for reducing execution of redundant threads in a processing environment. The method includes detecting threads that include redundant work among many different threads. Multiple threads from the detected threads are grouped into one or more thread clusters based on determining same thread computation results. Execution of all but a particular one thread in each of the one or more thread clusters is suppressed. The particular one thread in each of the one or more thread clusters is executed. Results determined from execution of the particular one thread in each of the one or more thread clusters are broadcasted to other threads in each of the one or more thread clusters.
    Type: Grant
    Filed: February 26, 2015
    Date of Patent: August 28, 2018
    Assignee: Samsung Electronics Company, Ltd.
    Inventors: Boris Beylin, John Brothers, Santosh Abraham, Lingjie Xu, Maxim Lukyanov, Alex Grosul
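A software analogue of the suppress-and-broadcast scheme: threads with identical inputs form a cluster, one representative executes, and its result is broadcast to the rest. Detecting redundancy by exact input equality is an illustrative simplification of "determining same thread computation results".

```python
def deduplicate_threads(thread_inputs, work_fn):
    """Group threads whose inputs match into clusters, run work_fn once
    per cluster, and broadcast the result to the suppressed threads."""
    clusters = {}                            # input value -> thread ids
    for tid, inp in enumerate(thread_inputs):
        clusters.setdefault(inp, []).append(tid)
    results = [None] * len(thread_inputs)
    executed = 0
    for inp, tids in clusters.items():
        r = work_fn(inp)                     # representative thread runs
        executed += 1
        for t in tids:                       # broadcast to its cluster
            results[t] = r
    return results, executed
```

With four threads holding inputs [3, 3, 5, 3], only two executions of `work_fn` occur instead of four.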
  • Publication number: 20150378733
    Abstract: A method for reducing execution of redundant threads in a processing environment. The method includes detecting threads that include redundant work among many different threads. Multiple threads from the detected threads are grouped into one or more thread clusters based on determining same thread computation results. Execution of all but a particular one thread in each of the one or more thread clusters is suppressed. The particular one thread in each of the one or more thread clusters is executed. Results determined from execution of the particular one thread in each of the one or more thread clusters are broadcasted to other threads in each of the one or more thread clusters.
    Type: Application
    Filed: February 26, 2015
    Publication date: December 31, 2015
    Inventors: Boris Beylin, John Brothers, Santosh Abraham, Lingjie Xu, Maxim Lukyanov, Alex Grosul