Patents Assigned to MOFFETT TECHNOLOGIES CO. LIMITED
-
Patent number: 11379724
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for domain-specific pruning of neural networks are described. An exemplary method includes obtaining a first neural network trained based on a first training dataset; obtaining one or more second training datasets respectively from one or more domains; and training, based on the first neural network and the one or more second training datasets, a second neural network comprising the first neural network and one or more branches extended from the first neural network, wherein the second neural network is applicable for inferencing in the one or more domains, and the training comprises: training the one or more branches based respectively on the one or more second training datasets and an output of the first neural network.
Type: Grant
Filed: July 12, 2021
Date of Patent: July 5, 2022
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Jiachao Liu, Enxu Yan
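The branch-extension idea in this abstract can be sketched in a few lines of NumPy: a frozen base network produces features, and one small branch per domain is fit on that domain's data using the base network's output. All shapes, the linear branch form, and the closed-form fit are illustrative assumptions, not the patented method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "first" network: a fixed linear-plus-ReLU feature extractor,
# a stand-in for a pre-trained backbone (hypothetical shapes).
W_base = rng.normal(size=(8, 4))
def base_forward(x):
    return np.maximum(x @ W_base, 0.0)

# One small linear branch per domain, trained only on that domain's
# dataset and on the *output* of the base network, as the abstract says.
def train_branch(X_dom, y_dom):
    F = base_forward(X_dom)                        # base-network output
    W, *_ = np.linalg.lstsq(F, y_dom, rcond=None)  # closed-form fit
    return W

domains = []
for _ in range(2):                                 # two hypothetical domains
    X = rng.normal(size=(64, 8))
    y = base_forward(X) @ rng.normal(size=(4, 1))  # synthetic targets
    domains.append((X, y))

branches = [train_branch(X, y) for X, y in domains]

# Inference in domain d routes through the base network, then branch d.
def predict(x, d):
    return base_forward(x) @ branches[d]
```

The base network is never retrained; only the per-domain branches learn, which is what makes the second network "comprise" the first.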
-
Patent number: 11200497
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for knowledge-preserving sparse pruning on neural networks are described. An exemplary method includes obtaining a pre-trained machine learning model trained based on a plurality of general-purpose training data; training a task-specific machine learning model by tuning the pre-trained machine learning model based on a plurality of task-specific training data corresponding to a task; constructing a student network based on the task-specific machine learning model; simultaneously performing (1) knowledge distillation from the trained task-specific machine learning model as a teacher network to the student network and (2) network pruning on the student network; and obtaining the trained student network for serving the task.
Type: Grant
Filed: March 16, 2021
Date of Patent: December 14, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Enxu Yan, Dongkuan Xu, Zhibin Xiao
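The "simultaneous distillation and pruning" step can be illustrated with a toy linear student that is trained to match a teacher's outputs while its smallest-magnitude weights are periodically zeroed. The linear models, magnitude criterion, and schedule are generic assumptions for illustration, not the patent's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Teacher: the task-specific model (a fixed linear map for brevity).
W_teacher = rng.normal(size=(6, 3))
X = rng.normal(size=(128, 6))
soft_targets = X @ W_teacher           # teacher outputs ("knowledge")

# Student starts from the teacher's weights; it is trained to match the
# teacher's outputs (distillation) while being pruned during training.
W_student = W_teacher + 0.1 * rng.normal(size=W_teacher.shape)
mask = np.ones_like(W_student)

lr, sparsity = 0.01, 0.5
for step in range(200):
    # (1) distillation: gradient of mean squared error to teacher outputs
    grad = X.T @ (X @ W_student - soft_targets) / len(X)
    W_student -= lr * grad
    # (2) pruning: every 20 steps, zero the smallest-magnitude weights
    if step % 20 == 0:
        thresh = np.quantile(np.abs(W_student), sparsity)
        mask = (np.abs(W_student) >= thresh).astype(float)
    W_student *= mask                  # keep pruned weights at zero

achieved_sparsity = 1.0 - mask.mean()
```

Interleaving the two steps lets surviving weights keep absorbing the teacher's behavior after each pruning round, rather than pruning once after training ends.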
-
Patent number: 11144823
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for hierarchical weight-sparse convolution processing are described. An exemplary method comprises: obtaining an input tensor and a filter at a convolution layer of a neural network; segmenting the filter into a plurality of sub-filters; generating a hierarchical bit representation of the filter representing a plurality of non-zero weights in the filter, wherein the hierarchical bit representation comprises a first layer, the first layer comprising a plurality of bits respectively corresponding to the plurality of sub-filters in the filter, each of the plurality of bits indicating whether the corresponding sub-filter includes at least one non-zero weight; and performing multiply-and-accumulate (MAC) operations based on the hierarchical bit representation of the filter and the input tensor.
Type: Grant
Filed: April 5, 2021
Date of Patent: October 12, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
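The first layer of the hierarchical bit representation can be sketched directly: one bit per sub-filter marks whether it holds any non-zero weight, and the MAC loop skips sub-filters whose bit is clear. A 1-D filter and sub-filter size of 4 are illustrative choices; the patent covers full convolution tensors.

```python
import numpy as np

rng = np.random.default_rng(2)

# A sparse 1-D "filter" segmented into fixed-size sub-filters.
SUB = 4
weights = np.zeros(16)
weights[[1, 2, 9]] = [0.5, -1.0, 2.0]   # hypothetical non-zero weights

sub_filters = weights.reshape(-1, SUB)
# First layer of the hierarchical bit representation: one occupancy bit
# per sub-filter (True if the sub-filter has any non-zero weight).
first_layer_bits = np.abs(sub_filters).sum(axis=1) > 0

def sparse_mac(x):
    """Multiply-and-accumulate that skips all-zero sub-filters."""
    acc = 0.0
    for i, occupied in enumerate(first_layer_bits):
        if not occupied:
            continue                     # whole sub-filter skipped
        acc += x[i * SUB:(i + 1) * SUB] @ sub_filters[i]
    return acc

x = rng.normal(size=16)
```

Because entire sub-filters are skipped with a single bit test, the cost of checking for zeros is amortized over blocks rather than paid per weight.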
-
Patent number: 11113601
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for balanced-weight sparse convolution processing.
Type: Grant
Filed: June 30, 2020
Date of Patent: September 7, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
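This abstract gives only the title, but "balanced" weight sparsity generally means pruning so that every bank of weights keeps the same number of non-zeros, which keeps parallel MAC units equally loaded. The following is a generic sketch of that idea (rows as banks, top-k by magnitude), not the patented scheme itself.

```python
import numpy as np

rng = np.random.default_rng(3)

# Balanced pruning: every bank (here, every row) keeps exactly the same
# number of surviving weights, chosen by magnitude, so hardware lanes
# processing different banks finish at the same time.
def balanced_prune(W, keep_per_row):
    pruned = np.zeros_like(W)
    for r in range(W.shape[0]):
        top = np.argsort(np.abs(W[r]))[-keep_per_row:]  # largest magnitudes
        pruned[r, top] = W[r, top]
    return pruned

W = rng.normal(size=(4, 8))
Wp = balanced_prune(W, keep_per_row=2)
nonzeros_per_row = (Wp != 0).sum(axis=1)
```

Unstructured pruning can leave one bank with many non-zeros and another with none; the balanced constraint trades a little accuracy for predictable, uniform work per lane.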
-
Patent number: 11068786
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for domain-specific pruning of neural networks are described. An exemplary method includes obtaining a first neural network trained based on a first training dataset; obtaining one or more second training datasets respectively from one or more domains; training, based on the first neural network and the one or more second training datasets, a second neural network comprising the first neural network and one or more branches extended from the first neural network. The one or more branches respectively correspond to the one or more domains, and each comprises one or more layers trained based on one of the one or more second training datasets. The method may further include: pruning the second neural network by reducing a number of active neurons; and applying the pruned second neural network for inferencing in the one or more domains.
Type: Grant
Filed: December 17, 2020
Date of Patent: July 20, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Jiachao Liu, Enxu Yan
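The pruning step here reduces the number of *active neurons*, i.e. whole hidden units are removed rather than individual weights. A minimal sketch, assuming a two-layer network and a simple weight-norm importance score (both illustrative, not the patent's criterion):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical two-layer network: 16 hidden neurons between input and output.
W1 = rng.normal(size=(8, 16))   # input -> hidden
W2 = rng.normal(size=(16, 3))   # hidden -> output

def prune_neurons(W1, W2, keep):
    # Score each neuron by the combined magnitude of its in/out weights,
    # then drop the lowest-scoring neurons from BOTH weight matrices.
    importance = np.abs(W1).sum(axis=0) * np.abs(W2).sum(axis=1)
    kept = np.sort(np.argsort(importance)[-keep:])
    return W1[:, kept], W2[kept, :]

W1_p, W2_p = prune_neurons(W1, W2, keep=6)

def forward(x, A, B):
    return np.maximum(x @ A, 0.0) @ B

x = rng.normal(size=8)
y = forward(x, W1_p, W2_p)      # inference on the smaller network
```

Removing neurons shrinks the matrices themselves, so inference gets faster on dense hardware too, unlike element-wise sparsity which needs special kernels to pay off.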
-
Patent number: 10970619
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for hierarchical weight-sparse convolution processing are described. An exemplary method comprises: obtaining an input tensor and a plurality of filters at a convolution layer of a neural network; segmenting the input tensor into a plurality of sub-tensors and assigning the plurality of sub-tensors to a plurality of processors; generating, for each of the plurality of filters, a hierarchical bit representation of a plurality of non-zero weights in the filter, wherein the hierarchical bit representation comprises a plurality of bits indicating whether a sub-filter has at least one non-zero weight, and a plurality of key-value pairs corresponding to the plurality of non-zero weights in the filter; identifying, based on the hierarchical bit representation, one or more of the plurality of non-zero weights and corresponding input values from the assigned sub-tensor to perform multiply-and-accumulate (MAC) operations.
Type: Grant
Filed: August 21, 2020
Date of Patent: April 6, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
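This variant adds a second level to the representation: alongside the per-sub-filter occupancy bits, the non-zero weights themselves are stored as key-value pairs, so only the needed input values are gathered. A 1-D sketch with illustrative sizes (the patent operates on tensors split across processors):

```python
import numpy as np

rng = np.random.default_rng(5)

SUB = 4
weights = np.zeros(12)
weights[[0, 5, 7]] = [1.5, -2.0, 0.25]   # hypothetical non-zero weights

sub = weights.reshape(-1, SUB)
# Level 1: occupancy bit per sub-filter.
bits = np.abs(sub).sum(axis=1) > 0
# Level 2: per occupied sub-filter, (offset-within-sub-filter, weight) pairs.
kv = {i: [(j, sub[i, j]) for j in np.flatnonzero(sub[i])]
      for i in np.flatnonzero(bits)}

def sparse_mac(x):
    """MAC that touches only the stored non-zero weights."""
    acc = 0.0
    for i, pairs in kv.items():
        for j, w in pairs:
            acc += w * x[i * SUB + j]    # gather only the needed inputs
    return acc

x = rng.normal(size=12)
```

The bits decide *which blocks* a processor visits; the key-value pairs decide *which elements* inside a visited block, so the arithmetic count tracks the number of non-zeros rather than the filter size.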
-
Patent number: 10832139
Abstract: Systems, methods and computer-readable medium for (i) accelerating the inference speed of a deep neural network (DNN), and (ii) compressing the vector representations produced by the DNN out of a variety of input data, such as image, audio, video and text. A method embodiment takes as inputs a neural network architecture and a task-dependent loss function, measuring how well a neural network performs on a training data set, and outputs a deep neural network with sparse neuron activations. The invented procedure augments an existing training objective function of a DNN with regularization terms that encourage sparse activation of neurons, and compresses the DNN by solving the optimization problem with a variety of algorithms. The present disclosure also shows how to utilize the sparsity of activations during the inference of DNNs so the number of arithmetic operations can be reduced proportionately, and how to use the sparse representations produced by the DNNs to build an efficient search engine.
Type: Grant
Filed: June 20, 2019
Date of Patent: November 10, 2020
Assignee: MOFFETT TECHNOLOGIES CO. LIMITED
Inventors: Enxu Yan, Wei Wang
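The augmented objective described here (task loss plus a term encouraging sparse activations) can be sketched with a one-hidden-layer network and an L1 penalty on ReLU activations. The L1 choice, network size, and learning rate are illustrative assumptions; the patent covers a family of regularizers and solvers.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic regression task (hypothetical data).
X = rng.normal(size=(256, 10))
y = X @ rng.normal(size=10)

W = rng.normal(size=(10, 32)) * 0.1   # input -> hidden
v = rng.normal(size=32) * 0.1         # hidden -> output
lam, lr = 0.05, 0.01                  # regularization weight, step size

for _ in range(300):
    H = np.maximum(X @ W, 0.0)        # neuron activations
    err = H @ v - y
    # Objective: 0.5 * mean(err^2) + lam * mean_i sum_j |H_ij|
    # (since H >= 0, d|H|/dH is 1 wherever the neuron is active).
    dH = (np.outer(err, v) / len(X) + lam / len(X)) * (H > 0)
    W -= lr * X.T @ dH
    v -= lr * H.T @ err / len(X)

H = np.maximum(X @ W, 0.0)
activation_sparsity = (H == 0).mean() # fraction of inactive neurons
```

At inference, every zero activation lets the next layer skip a column of multiplies, which is how the abstract's "proportionate" reduction in arithmetic operations arises.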