Patents by Inventor Jiaqing FU

Jiaqing FU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11526774
    Abstract: Disclosed are a method for automatically compressing a multi-task-oriented pre-trained language model, and a platform therefor. In the method, a meta-network serving as a structure generator is designed; a knowledge-distillation coding vector is constructed based on a knowledge-distillation method of Transformer-layer sampling; and the structure generator produces the distillation-structure model corresponding to the currently input coding vector. A Bernoulli-distribution sampling method is further provided for training the structure generator: in each iteration, whether each encoder unit is transferred is decided by Bernoulli sampling, which forms the corresponding coding vector. By varying the coding vector input to the structure generator together with a small batch of training data, the structure generator and the corresponding distillation structure are trained jointly, yielding a structure generator capable of generating weights for different distillation structures (a minimal sketch of this loop appears after the listing).
    Type: Grant
    Filed: December 28, 2021
    Date of Patent: December 13, 2022
    Assignee: ZHEJIANG LAB
    Inventors: Hongsheng WANG, Haijun SHAN, Jiaqing FU
  • Publication number: 20220188658
    Abstract: Identical to the abstract of patent number 11526774 above; this publication corresponds to the same application, filed on the same date by the same inventors.
    Type: Application
    Filed: December 28, 2021
    Publication date: June 16, 2022
    Inventors: Hongsheng WANG, Haijun SHAN, Jiaqing FU
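
The training loop outlined in the abstract of patent 11526774 can be pictured as follows. This is a minimal, self-contained PyTorch sketch under assumed names and toy sizes (StructureGenerator, sample_coding_vector, a 12-layer stand-in for the teacher); it illustrates the general pattern of Bernoulli layer sampling plus a structure-generator meta-network, not the patented implementation.

```python
# Hypothetical sketch of the loop the abstract describes; all module names,
# sizes, and the toy task are illustrative assumptions, not the patent's code.
import torch
import torch.nn as nn

NUM_LAYERS = 12   # assume a BERT-base-style teacher: 12 Transformer encoder layers
HIDDEN = 64       # toy hidden size so the example runs quickly

class StructureGenerator(nn.Module):
    """Meta-network: maps a binary coding vector (which encoder layers were
    sampled) to weights for the corresponding distillation structure."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_LAYERS, 128), nn.ReLU(),
            nn.Linear(128, NUM_LAYERS * HIDDEN),
        )

    def forward(self, coding_vector):
        # One generated weight vector per (potential) student encoder layer.
        return self.net(coding_vector).view(NUM_LAYERS, HIDDEN)

def sample_coding_vector(p=0.5):
    """Bernoulli layer sampling: entry i is 1 if encoder layer i is transferred."""
    return torch.bernoulli(torch.full((NUM_LAYERS,), p))

def student_forward(x, coding_vector, layer_weights):
    """Toy distilled student: transferred layers apply a generated transform;
    skipped layers pass the activation through unchanged."""
    for i in range(NUM_LAYERS):
        if coding_vector[i] > 0:
            x = torch.tanh(x * layer_weights[i])  # stand-in for an encoder layer
    return x

# Frozen stand-in for the pre-trained teacher model.
teacher = nn.Sequential(nn.Linear(HIDDEN, HIDDEN), nn.Tanh())
for p in teacher.parameters():
    p.requires_grad_(False)

generator = StructureGenerator()
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(32, HIDDEN)               # small batch of training data
    coding_vector = sample_coding_vector()    # new sampled structure each iteration
    layer_weights = generator(coding_vector)  # generator -> structure weights
    loss = nn.functional.mse_loss(            # distillation loss: match teacher
        student_forward(x, coding_vector, layer_weights),
        teacher(x),
    )
    optimizer.zero_grad()
    loss.backward()                           # trains the generator jointly with
    optimizer.step()                          # whichever structure was sampled
```

The design point the abstract emphasizes is that the trainable parameters live in the generator, not in any single student: because the coding vector is resampled every iteration, one meta-network learns to emit weights for many candidate distillation structures at once.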