Patents Assigned to MOFFETT TECHNOLOGIES CO. LIMITED
  • Patent number: 11379724
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for domain-specific pruning of neural networks are described. An exemplary method includes obtaining a first neural network trained based on a first training dataset; obtaining one or more second training datasets respectively from one or more domains; and training, based on the first neural network and the one or more second training datasets, a second neural network comprising the first neural network and one or more branches extended from the first neural network, wherein the second neural network is applicable for inferencing in the one or more domains, and the training comprises: training the one or more branches based respectively on the one or more second training datasets and an output of the first neural network.
    Type: Grant
    Filed: July 12, 2021
    Date of Patent: July 5, 2022
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Jiachao Liu, Enxu Yan
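The abstract above describes freezing a trained base network and training per-domain branches on the base network's output. A minimal Python sketch of that idea, with toy one-weight "networks" (all names such as `BaseNet`, `DomainBranch`, and `train_branch` are illustrative, not from the patent):

```python
# Hedged sketch: a frozen base network's output feeds one small trainable
# branch per domain; only branch weights are updated.

class BaseNet:
    """Stands in for the first neural network, already trained and frozen."""
    def __init__(self, w):
        self.w = w  # frozen weight

    def forward(self, x):
        return self.w * x

class DomainBranch:
    """One branch extended from the base network for a single domain."""
    def __init__(self):
        self.v = 0.0  # trainable weight

    def forward(self, base_out):
        return self.v * base_out

def train_branch(base, branch, dataset, lr=0.01, epochs=200):
    """Fit the branch on (x, y) pairs; base weights are never updated."""
    for _ in range(epochs):
        for x, y in dataset:
            h = base.forward(x)          # output of the first network
            pred = branch.forward(h)     # branch consumes that output
            grad = 2 * (pred - y) * h    # d(MSE)/dv
            branch.v -= lr * grad
    return branch

base = BaseNet(w=2.0)
# Two toy "domains" whose targets are different scalings of the base output.
domains = [
    [(x, 2.0 * x * 3.0) for x in [0.1, 0.2, 0.5, 1.0]],   # target v = 3
    [(x, 2.0 * x * -1.0) for x in [0.1, 0.2, 0.5, 1.0]],  # target v = -1
]
branches = [train_branch(base, DomainBranch(), d) for d in domains]
print([round(b.v, 2) for b in branches])  # → [3.0, -1.0]
```

Each branch converges to its own domain's mapping while the shared base stays untouched, which is the structural point of the claim.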
  • Patent number: 11200497
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for knowledge-preserving sparse pruning on neural networks are described. An exemplary method includes obtaining a pre-trained machine learning model trained based on a plurality of general-purpose training data; training a task-specific machine learning model by tuning the pre-trained machine learning model based on a plurality of task-specific training data corresponding to a task; constructing a student network based on the task-specific machine learning model; simultaneously performing (1) knowledge distillation from the trained task-specific machine learning model as a teacher network to the student network and (2) network pruning on the student network; and obtaining the trained student network for serving the task.
    Type: Grant
    Filed: March 16, 2021
    Date of Patent: December 14, 2021
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Enxu Yan, Dongkuan Xu, Zhibin Xiao
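The abstract above combines knowledge distillation from a teacher with network pruning on the student. A minimal Python sketch of the interplay, assuming a linear teacher and magnitude pruning; for brevity the prune is applied once mid-training rather than fully concurrently, the task-specific loss is folded into the distillation error, and every name (`teacher_forward`, `distill_and_prune`) is illustrative:

```python
# Hedged sketch: the student regresses toward the teacher's outputs while
# its smallest-magnitude weights are zeroed and frozen by a mask.

def teacher_forward(x):
    # Teacher "network": fixed linear map; two weights matter, two are noise.
    w = [3.0, -2.0, 0.0, 0.0]
    return sum(wi * xi for wi, xi in zip(w, x))

def distill_and_prune(data, n_weights=4, keep=2, lr=0.05, steps=300):
    student = [0.1] * n_weights
    mask = [1] * n_weights
    for step in range(steps):
        for x in data:
            t = teacher_forward(x)                      # teacher signal
            s = sum(m * w * xi for m, w, xi in zip(mask, student, x))
            err = s - t                                 # distillation error
            for i in range(n_weights):
                if mask[i]:
                    student[i] -= lr * 2 * err * x[i]   # gradient step
        if step == steps // 2:
            # Prune: keep only the `keep` largest-magnitude weights.
            order = sorted(range(n_weights), key=lambda i: -abs(student[i]))
            mask = [1 if i in order[:keep] else 0 for i in range(n_weights)]
            student = [w * m for w, m in zip(student, mask)]
    return student, mask

data = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 0], [0, 1, 0, 1]]
student, mask = distill_and_prune(data)
print(mask, [round(w, 2) for w in student])
```

Because the student keeps learning from the teacher after the mask is applied, the surviving weights recover the teacher's useful knowledge, which is the "knowledge-preserving" aspect the abstract names.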
  • Patent number: 11144823
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for hierarchical weight-sparse convolution processing are described. An exemplary method comprises: obtaining an input tensor and a filter at a convolution layer of a neural network; segmenting the filter into a plurality of sub-filters; generating a hierarchical bit representation of the filter representing a plurality of non-zero weights in the filter, wherein the hierarchical bit representation comprises a first layer, the first layer comprising a plurality of bits respectively corresponding to the plurality of sub-filters in the filter, each of the plurality of bits indicating whether the corresponding sub-filter includes at least one non-zero weight; and performing multiply-and-accumulate (MAC) operations based on the hierarchical bit representation of the filter and the input tensor.
    Type: Grant
    Filed: April 5, 2021
    Date of Patent: October 12, 2021
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
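The abstract above describes the first layer of the hierarchical bit representation: one bit per sub-filter indicating whether that sub-filter holds any non-zero weight, so MAC work can skip all-zero sub-filters wholesale. A minimal Python sketch on a flat toy filter (sub-filter size and all function names are illustrative):

```python
# Hedged sketch: segment a filter, flag non-empty sub-filters with one bit
# each, and skip flagged-empty sub-filters during multiply-and-accumulate.

def segment(filt, sub_size):
    """Split a flat filter into fixed-size sub-filters."""
    return [filt[i:i + sub_size] for i in range(0, len(filt), sub_size)]

def first_layer_bits(sub_filters):
    """Bit i is 1 iff sub-filter i contains at least one non-zero weight."""
    return [1 if any(w != 0 for w in sf) else 0 for sf in sub_filters]

def sparse_mac(filt, inputs, sub_size):
    """Multiply-and-accumulate, skipping sub-filters flagged all-zero."""
    subs = segment(filt, sub_size)
    bits = first_layer_bits(subs)
    acc, mults = 0.0, 0
    for i, (bit, sf) in enumerate(zip(bits, subs)):
        if not bit:
            continue  # entire sub-filter is zero: no work at all
        for j, w in enumerate(sf):
            if w != 0:
                acc += w * inputs[i * sub_size + j]
                mults += 1
    return acc, bits, mults

filt = [0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0]   # sparse 12-weight filter
inputs = list(range(1, 13))                    # toy input values 1..12
acc, bits, mults = sparse_mac(filt, inputs, sub_size=4)
print(acc, bits, mults)  # → 18.0 [0, 1, 0] 2
```

Only 2 of 12 multiplies execute here; the per-sub-filter bits let hardware discard whole zero regions before ever reading individual weights.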
  • Patent number: 11113601
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for balanced-weight sparse convolution processing.
    Type: Grant
    Filed: June 30, 2020
    Date of Patent: September 7, 2021
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
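The abstract above gives no method detail, so the following is a hedged sketch of one common reading of "balanced-weight" sparsity: every fixed-size bank of weights retains the same number of non-zeros (here the top-k by magnitude), which keeps parallel compute lanes evenly loaded. The bank layout and function name are assumptions, not taken from the patent:

```python
# Hedged sketch: enforce an equal non-zero count per weight bank.

def balance_prune(weights, bank_size, k):
    """Keep the k largest-magnitude weights in each bank, zero the rest."""
    out = []
    for i in range(0, len(weights), bank_size):
        bank = weights[i:i + bank_size]
        keep = sorted(range(len(bank)), key=lambda j: -abs(bank[j]))[:k]
        out.extend(w if j in keep else 0.0 for j, w in enumerate(bank))
    return out

w = [0.9, -0.1, 0.3, 0.05, -0.7, 0.2, 0.01, 0.6]
pruned = balance_prune(w, bank_size=4, k=2)
print(pruned)  # each bank of 4 keeps exactly 2 non-zeros
```

Unstructured pruning can leave one lane with many non-zeros and another with none; the balanced constraint trades a little accuracy flexibility for predictable, uniform work per lane.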
  • Patent number: 11068786
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for domain-specific pruning of neural networks are described. An exemplary method includes obtaining a first neural network trained based on a first training dataset; obtaining one or more second training datasets respectively from one or more domains; training, based on the first neural network and the one or more second training datasets, a second neural network comprising the first neural network and one or more branches extended from the first neural network. The one or more branches respectively correspond to the one or more domains, and each comprises one or more layers trained based on one of the one or more second training datasets. The method may further include: pruning the second neural network by reducing a number of active neurons; and applying the pruned second neural network for inferencing in the one or more domains.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: July 20, 2021
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Jiachao Liu, Enxu Yan
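The abstract above ends with pruning the second network "by reducing a number of active neurons." A minimal Python sketch of one such step, using an L1-norm ranking to decide which hidden neurons stay active; the norm-based criterion and all names are illustrative choices, not taken from the patent:

```python
# Hedged sketch: deactivate the hidden neurons with the smallest weight
# magnitudes, then run a toy forward pass over only the active neurons.

def prune_neurons(hidden_weights, keep):
    """hidden_weights: one weight vector per neuron.
    Returns a mask keeping the `keep` neurons with the largest L1 norms."""
    norms = [sum(abs(w) for w in ws) for ws in hidden_weights]
    ranked = sorted(range(len(norms)), key=lambda i: -norms[i])
    keep_set = set(ranked[:keep])
    return [1 if i in keep_set else 0 for i in range(len(norms))]

def forward(x, hidden_weights, mask):
    """Sum of active neurons' linear responses to a scalar input x."""
    return sum(m * sum(ws) * x for m, ws in zip(mask, hidden_weights))

neurons = [[0.01, -0.02], [1.5, 0.5], [0.0, 0.03], [-0.8, 1.2]]
mask = prune_neurons(neurons, keep=2)
print(mask, forward(1.0, neurons, mask))
```

Halving the active neurons here changes the output only marginally, since the dropped neurons carried near-zero weight; that is the premise of applying the pruned network for inference across the domains.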
  • Patent number: 10970619
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for hierarchical weight-sparse convolution processing are described. An exemplary method comprises: obtaining an input tensor and a plurality of filters at a convolution layer of a neural network; segmenting the input tensor into a plurality of sub-tensors and assigning the plurality of sub-tensors to a plurality of processors; generating, for each of the plurality of filters, a hierarchical bit representation of a plurality of non-zero weights in the filter, wherein the hierarchical bit representation comprises a plurality of bits indicating whether a sub-filter has at least one non-zero weight, and a plurality of key-value pairs corresponding to the plurality of non-zero weights in the filter; identifying, based on the hierarchical bit representation, one or more of the plurality of non-zero weights and corresponding input values from the assigned sub-tensor to perform multiply-and-accumulate (MAC) operations.
    Type: Grant
    Filed: August 21, 2020
    Date of Patent: April 6, 2021
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
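The abstract above adds the key-value half of the hierarchical representation: besides the per-sub-filter bits, each non-zero weight is stored as an (offset, value) pair so a processor can fetch exactly the input values it needs for MAC. A minimal Python sketch under an assumed flat layout (the encoding details and names are illustrative):

```python
# Hedged sketch: encode a filter as per-sub-filter bits plus key-value
# pairs of its non-zero weights, then MAC using only the encoded entries.

def encode(filt, sub_size):
    """Return (bits, kv): bits flag non-empty sub-filters; kv maps
    sub-filter index -> list of (offset-within-sub-filter, weight)."""
    bits, kv = [], {}
    for s, i in enumerate(range(0, len(filt), sub_size)):
        pairs = [(j, w) for j, w in enumerate(filt[i:i + sub_size]) if w != 0]
        bits.append(1 if pairs else 0)
        if pairs:
            kv[s] = pairs
    return bits, kv

def mac(bits, kv, inputs, sub_size):
    """Accumulate only over the encoded non-zero weights."""
    acc = 0.0
    for s, bit in enumerate(bits):
        if bit:
            for off, w in kv[s]:
                acc += w * inputs[s * sub_size + off]
    return acc

filt = [0, 5, 0, 0, 0, 0, 0, 0, -1, 0, 0, 2]
bits, kv = encode(filt, sub_size=4)
inputs = [float(i) for i in range(12)]  # toy sub-tensor values 0.0 .. 11.0
print(bits, kv, mac(bits, kv, inputs, 4))  # → [1, 0, 1] {0: [(1, 5)], 2: [(0, -1), (3, 2)]} 19.0
```

In the claimed parallel setting, each processor holds one sub-tensor and walks only its filters' key-value lists, so work scales with the non-zero count rather than the filter size.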
  • Patent number: 10832139
    Abstract: Systems, methods and computer-readable medium for (i) accelerating the inference speed of a deep neural network (DNN), and (ii) compressing the vector representations produced by the DNN out of a variety of input data, such as image, audio, video and text. A method embodiment takes as inputs a neural network architecture and a task-dependent loss function, measuring how well a neural network performs on a training data set, and outputs a deep neural network with sparse neuron activations. The invented procedure augments an existing training objective function of a DNN with regularization terms that encourage sparse activation of neurons, and compresses the DNN by solving the optimization problem with a variety of algorithms. The present disclosure also shows how to utilize the sparsity of activations during the inference of DNNs so the number of arithmetic operations can be reduced proportionately, and how to use the sparse representations produced by the DNNs to build an efficient search engine.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: November 10, 2020
    Assignee: MOFFETT TECHNOLOGIES CO. LIMITED
    Inventors: Enxu Yan, Wei Wang
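The abstract above augments the training objective with regularization terms that encourage sparse neuron activations, then exploits the zeros at inference to cut arithmetic proportionately. A minimal Python sketch using an L1-style soft-threshold on ReLU activations, which is one standard way to induce activation sparsity; the patent's exact algorithms may differ, and all names here are illustrative:

```python
# Hedged sketch: shrink activations toward zero (the effect of an L1
# regularizer), then skip arithmetic wherever the activation is zero.

def relu(x):
    return max(0.0, x)

def soft_threshold(a, lam):
    """Proximal step for an L1 penalty on a non-negative activation."""
    return max(0.0, a - lam)

def sparse_layer_output(x, weights, lam):
    """Activations after ReLU plus L1-style shrinkage."""
    return [soft_threshold(relu(sum(w * xi for w, xi in zip(ws, x))), lam)
            for ws in weights]

def sparse_inference(acts, out_weights):
    """Next layer: perform arithmetic only where the activation is non-zero."""
    total, ops = 0.0, 0
    for a, w in zip(acts, out_weights):
        if a != 0.0:           # sparsity payoff: zero activations cost nothing
            total += a * w
            ops += 1
    return total, ops

x = [1.0, -1.0]
weights = [[0.9, 0.1], [0.2, 0.1], [-0.5, 0.4], [1.2, -0.3]]
acts = sparse_layer_output(x, weights, lam=0.2)
total, ops = sparse_inference(acts, [1.0, 1.0, 1.0, 1.0])
print(acts, ops)
```

Only 2 of 4 output multiplies run because two activations were driven exactly to zero; the same sparse vectors, used as compressed representations, are what the abstract proposes indexing in an efficient search engine.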