Patents Assigned to MOFFETT TECHNOLOGIES CO. LIMITED
-
Patent number: 11379724
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for domain-specific pruning of neural networks are described. An exemplary method includes obtaining a first neural network trained based on a first training dataset; obtaining one or more second training datasets respectively from one or more domains; and training, based on the first neural network and the one or more second training datasets, a second neural network comprising the first neural network and one or more branches extended from the first neural network, wherein the second neural network is applicable for inferencing in the one or more domains, and the training comprises: training the one or more branches based respectively on the one or more second training datasets and an output of the first neural network.
Type: Grant
Filed: July 12, 2021
Date of Patent: July 5, 2022
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Jiachao Liu, Enxu Yan
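The branch-extension idea in this abstract can be sketched in a few lines of NumPy: a frozen base network produces features, and one small branch per domain is fit on that domain's data using the base network's output. All shapes, the linear branch form, and the closed-form fit are illustrative assumptions, not the patented method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "first" network: a fixed linear-plus-ReLU feature extractor,
# a stand-in for a pre-trained backbone (hypothetical shapes).
W_base = rng.normal(size=(8, 4))
def base_forward(x):
    return np.maximum(x @ W_base, 0.0)

# One small linear branch per domain, trained only on that domain's
# dataset and on the *output* of the base network, as the abstract says.
def train_branch(X_dom, y_dom):
    F = base_forward(X_dom)                        # base-network output
    W, *_ = np.linalg.lstsq(F, y_dom, rcond=None)  # closed-form fit
    return W

domains = []
for _ in range(2):                                 # two hypothetical domains
    X = rng.normal(size=(64, 8))
    y = base_forward(X) @ rng.normal(size=(4, 1))  # synthetic targets
    domains.append((X, y))

branches = [train_branch(X, y) for X, y in domains]

# Inference in domain d routes through the base network, then branch d.
def predict(x, d):
    return base_forward(x) @ branches[d]
```

The base network is never retrained; only the per-domain branches learn, which is what makes the second network "comprise" the first.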
-
Patent number: 11200497
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for knowledge-preserving sparse pruning on neural networks are described. An exemplary method includes obtaining a pre-trained machine learning model trained based on a plurality of general-purpose training data; training a task-specific machine learning model by tuning the pre-trained machine learning model based on a plurality of task-specific training data corresponding to a task; constructing a student network based on the task-specific machine learning model; simultaneously performing (1) knowledge distillation from the trained task-specific machine learning model as a teacher network to the student network and (2) network pruning on the student network; and obtaining the trained student network for serving the task.
Type: Grant
Filed: March 16, 2021
Date of Patent: December 14, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Enxu Yan, Dongkuan Xu, Zhibin Xiao
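The "simultaneous distillation and pruning" step can be illustrated with a toy linear student that is trained to match a teacher's outputs while its smallest-magnitude weights are periodically zeroed. The linear models, magnitude criterion, and schedule are generic assumptions for illustration, not the patent's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Teacher: the task-specific model (a fixed linear map for brevity).
W_teacher = rng.normal(size=(6, 3))
X = rng.normal(size=(128, 6))
soft_targets = X @ W_teacher           # teacher outputs ("knowledge")

# Student starts from the teacher's weights; it is trained to match the
# teacher's outputs (distillation) while being pruned during training.
W_student = W_teacher + 0.1 * rng.normal(size=W_teacher.shape)
mask = np.ones_like(W_student)

lr, sparsity = 0.01, 0.5
for step in range(200):
    # (1) distillation: gradient of mean squared error to teacher outputs
    grad = X.T @ (X @ W_student - soft_targets) / len(X)
    W_student -= lr * grad
    # (2) pruning: every 20 steps, zero the smallest-magnitude weights
    if step % 20 == 0:
        thresh = np.quantile(np.abs(W_student), sparsity)
        mask = (np.abs(W_student) >= thresh).astype(float)
    W_student *= mask                  # keep pruned weights at zero

achieved_sparsity = 1.0 - mask.mean()
```

Interleaving the two steps lets surviving weights keep absorbing the teacher's behavior after each pruning round, rather than pruning once after training ends.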
-
Patent number: 11144823
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for hierarchical weight-sparse convolution processing are described. An exemplary method comprises: obtaining an input tensor and a filter at a convolution layer of a neural network; segmenting the filter into a plurality of sub-filters; generating a hierarchical bit representation of the filter representing a plurality of non-zero weights in the filter, wherein the hierarchical bit representation comprises a first layer, the first layer comprising a plurality of bits respectively corresponding to the plurality of sub-filters in the filter, each of the plurality of bits indicating whether the corresponding sub-filter includes at least one non-zero weight; and performing multiply-and-accumulate (MAC) operations based on the hierarchical bit representation of the filter and the input tensor.
Type: Grant
Filed: April 5, 2021
Date of Patent: October 12, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
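The first layer of the hierarchical bit representation can be sketched directly: one bit per sub-filter marks whether it holds any non-zero weight, and the MAC loop skips sub-filters whose bit is clear. A 1-D filter and sub-filter size of 4 are illustrative choices; the patent covers full convolution tensors.

```python
import numpy as np

rng = np.random.default_rng(2)

# A sparse 1-D "filter" segmented into fixed-size sub-filters.
SUB = 4
weights = np.zeros(16)
weights[[1, 2, 9]] = [0.5, -1.0, 2.0]   # hypothetical non-zero weights

sub_filters = weights.reshape(-1, SUB)
# First layer of the hierarchical bit representation: one occupancy bit
# per sub-filter (True if the sub-filter has any non-zero weight).
first_layer_bits = np.abs(sub_filters).sum(axis=1) > 0

def sparse_mac(x):
    """Multiply-and-accumulate that skips all-zero sub-filters."""
    acc = 0.0
    for i, occupied in enumerate(first_layer_bits):
        if not occupied:
            continue                     # whole sub-filter skipped
        acc += x[i * SUB:(i + 1) * SUB] @ sub_filters[i]
    return acc

x = rng.normal(size=16)
```

Because entire sub-filters are skipped with a single bit test, the cost of checking for zeros is amortized over blocks rather than paid per weight.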
-
Patent number: 11113601
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for balanced-weight sparse convolution processing.
Type: Grant
Filed: June 30, 2020
Date of Patent: September 7, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
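This abstract gives only the title, but "balanced" weight sparsity generally means pruning so that every bank of weights keeps the same number of non-zeros, which keeps parallel MAC units equally loaded. The following is a generic sketch of that idea (rows as banks, top-k by magnitude), not the patented scheme itself.

```python
import numpy as np

rng = np.random.default_rng(3)

# Balanced pruning: every bank (here, every row) keeps exactly the same
# number of surviving weights, chosen by magnitude, so hardware lanes
# processing different banks finish at the same time.
def balanced_prune(W, keep_per_row):
    pruned = np.zeros_like(W)
    for r in range(W.shape[0]):
        top = np.argsort(np.abs(W[r]))[-keep_per_row:]  # largest magnitudes
        pruned[r, top] = W[r, top]
    return pruned

W = rng.normal(size=(4, 8))
Wp = balanced_prune(W, keep_per_row=2)
nonzeros_per_row = (Wp != 0).sum(axis=1)
```

Unstructured pruning can leave one bank with many non-zeros and another with none; the balanced constraint trades a little accuracy for predictable, uniform work per lane.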
-
Patent number: 11068786
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for domain-specific pruning of neural networks are described. An exemplary method includes obtaining a first neural network trained based on a first training dataset; obtaining one or more second training datasets respectively from one or more domains; training, based on the first neural network and the one or more second training datasets, a second neural network comprising the first neural network and one or more branches extended from the first neural network. The one or more branches respectively correspond to the one or more domains, and each comprises one or more layers trained based on one of the one or more second training datasets. The method may further include: pruning the second neural network by reducing a number of active neurons; and applying the pruned second neural network for inferencing in the one or more domains.
Type: Grant
Filed: December 17, 2020
Date of Patent: July 20, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Jiachao Liu, Enxu Yan
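The pruning step here reduces the number of *active neurons*, i.e. whole hidden units are removed rather than individual weights. A minimal sketch, assuming a two-layer network and a simple weight-norm importance score (both illustrative, not the patent's criterion):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical two-layer network: 16 hidden neurons between input and output.
W1 = rng.normal(size=(8, 16))   # input -> hidden
W2 = rng.normal(size=(16, 3))   # hidden -> output

def prune_neurons(W1, W2, keep):
    # Score each neuron by the combined magnitude of its in/out weights,
    # then drop the lowest-scoring neurons from BOTH weight matrices.
    importance = np.abs(W1).sum(axis=0) * np.abs(W2).sum(axis=1)
    kept = np.sort(np.argsort(importance)[-keep:])
    return W1[:, kept], W2[kept, :]

W1_p, W2_p = prune_neurons(W1, W2, keep=6)

def forward(x, A, B):
    return np.maximum(x @ A, 0.0) @ B

x = rng.normal(size=8)
y = forward(x, W1_p, W2_p)      # inference on the smaller network
```

Removing neurons shrinks the matrices themselves, so inference gets faster on dense hardware too, unlike element-wise sparsity which needs special kernels to pay off.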
-
Patent number: 10970619
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for hierarchical weight-sparse convolution processing are described. An exemplary method comprises: obtaining an input tensor and a plurality of filters at a convolution layer of a neural network; segmenting the input tensor into a plurality of sub-tensors and assigning the plurality of sub-tensors to a plurality of processors; generating, for each of the plurality of filters, a hierarchical bit representation of a plurality of non-zero weights in the filter, wherein the hierarchical bit representation comprises a plurality of bits indicating whether a sub-filter has at least one non-zero weight, and a plurality of key-value pairs corresponding to the plurality of non-zero weights in the filter; identifying, based on the hierarchical bit representation, one or more of the plurality of non-zero weights and corresponding input values from the assigned sub-tensor to perform multiply-and-accumulate (MAC) operations.
Type: Grant
Filed: August 21, 2020
Date of Patent: April 6, 2021
Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
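This variant adds a second level to the representation: alongside the per-sub-filter occupancy bits, the non-zero weights themselves are stored as key-value pairs, so only the needed input values are gathered. A 1-D sketch with illustrative sizes (the patent operates on tensors split across processors):

```python
import numpy as np

rng = np.random.default_rng(5)

SUB = 4
weights = np.zeros(12)
weights[[0, 5, 7]] = [1.5, -2.0, 0.25]   # hypothetical non-zero weights

sub = weights.reshape(-1, SUB)
# Level 1: occupancy bit per sub-filter.
bits = np.abs(sub).sum(axis=1) > 0
# Level 2: per occupied sub-filter, (offset-within-sub-filter, weight) pairs.
kv = {i: [(j, sub[i, j]) for j in np.flatnonzero(sub[i])]
      for i in np.flatnonzero(bits)}

def sparse_mac(x):
    """MAC that touches only the stored non-zero weights."""
    acc = 0.0
    for i, pairs in kv.items():
        for j, w in pairs:
            acc += w * x[i * SUB + j]    # gather only the needed inputs
    return acc

x = rng.normal(size=12)
```

The bits decide *which blocks* a processor visits; the key-value pairs decide *which elements* inside a visited block, so the arithmetic count tracks the number of non-zeros rather than the filter size.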
-
Patent number: 10832139
Abstract: Systems, methods and computer-readable medium for (i) accelerating the inference speed of a deep neural network (DNN), and (ii) compressing the vector representations produced by the DNN out of a variety of input data, such as image, audio, video and text. A method embodiment takes as inputs a neural network architecture and a task-dependent loss function, measuring how well a neural network performs on a training data set, and outputs a deep neural network with sparse neuron activations. The invented procedure augments an existing training objective function of a DNN with regularization terms that encourage sparse activation of neurons, and compresses the DNN by solving the optimization problem with a variety of algorithms. The present disclosure also shows how to utilize the sparsity of activations during the inference of DNNs so the number of arithmetic operations can be reduced proportionately, and how to use the sparse representations produced by the DNNs to build an efficient search engine.
Type: Grant
Filed: June 20, 2019
Date of Patent: November 10, 2020
Assignee: MOFFETT TECHNOLOGIES CO. LIMITED
Inventors: Enxu Yan, Wei Wang
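The augmented objective described here (task loss plus a term encouraging sparse activations) can be sketched with a one-hidden-layer network and an L1 penalty on ReLU activations. The L1 choice, network size, and learning rate are illustrative assumptions; the patent covers a family of regularizers and solvers.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic regression task (hypothetical data).
X = rng.normal(size=(256, 10))
y = X @ rng.normal(size=10)

W = rng.normal(size=(10, 32)) * 0.1   # input -> hidden
v = rng.normal(size=32) * 0.1         # hidden -> output
lam, lr = 0.05, 0.01                  # regularization weight, step size

for _ in range(300):
    H = np.maximum(X @ W, 0.0)        # neuron activations
    err = H @ v - y
    # Objective: 0.5 * mean(err^2) + lam * mean_i sum_j |H_ij|
    # (since H >= 0, d|H|/dH is 1 wherever the neuron is active).
    dH = (np.outer(err, v) / len(X) + lam / len(X)) * (H > 0)
    W -= lr * X.T @ dH
    v -= lr * H.T @ err / len(X)

H = np.maximum(X @ W, 0.0)
activation_sparsity = (H == 0).mean() # fraction of inactive neurons
```

At inference, every zero activation lets the next layer skip a column of multiplies, which is how the abstract's "proportionate" reduction in arithmetic operations arises.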