Patents by Inventor Jiaqing FU

Jiaqing FU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11526774
    Abstract: Disclosed are a method for automatically compressing a multi-task-oriented pre-trained language model, and a platform therefor. In the method, a meta-network serving as a structure generator is designed; a knowledge-distillation coding vector is constructed based on a knowledge-distillation method of Transformer-layer sampling; and the structure generator produces the distillation-structure model corresponding to the currently input coding vector. A Bernoulli-distribution sampling method is further provided for training the structure generator: in each iteration, whether each encoder unit is transferred is decided by Bernoulli sampling, which forms the corresponding coding vector. By varying the coding vector input to the structure generator together with a small batch of training data, the structure generator and the corresponding distillation structure are trained jointly, yielding a structure generator capable of generating weights for different distillation structures (a minimal sketch of this loop appears after the listing).
    Type: Grant
    Filed: December 28, 2021
    Date of Patent: December 13, 2022
    Assignee: ZHEJIANG LAB
    Inventors: Hongsheng WANG, Haijun SHAN, Jiaqing FU
  • Publication number: 20220188658
    Abstract: Identical to the abstract of patent number 11526774 above; this publication corresponds to the same application, filed on the same date by the same inventors.
    Type: Application
    Filed: December 28, 2021
    Publication date: June 16, 2022
    Inventors: Hongsheng WANG, Haijun SHAN, Jiaqing FU
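
The training loop outlined in the abstract of patent 11526774 can be pictured as follows. This is a minimal, self-contained PyTorch sketch under assumed names and toy sizes (StructureGenerator, sample_coding_vector, a 12-layer stand-in for the teacher); it illustrates the general pattern of Bernoulli layer sampling plus a structure-generator meta-network, not the patented implementation.

```python
# Hypothetical sketch of the loop the abstract describes; all module names,
# sizes, and the toy task are illustrative assumptions, not the patent's code.
import torch
import torch.nn as nn

NUM_LAYERS = 12   # assume a BERT-base-style teacher: 12 Transformer encoder layers
HIDDEN = 64       # toy hidden size so the example runs quickly

class StructureGenerator(nn.Module):
    """Meta-network: maps a binary coding vector (which encoder layers were
    sampled) to weights for the corresponding distillation structure."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_LAYERS, 128), nn.ReLU(),
            nn.Linear(128, NUM_LAYERS * HIDDEN),
        )

    def forward(self, coding_vector):
        # One generated weight vector per (potential) student encoder layer.
        return self.net(coding_vector).view(NUM_LAYERS, HIDDEN)

def sample_coding_vector(p=0.5):
    """Bernoulli layer sampling: entry i is 1 if encoder layer i is transferred."""
    return torch.bernoulli(torch.full((NUM_LAYERS,), p))

def student_forward(x, coding_vector, layer_weights):
    """Toy distilled student: transferred layers apply a generated transform;
    skipped layers pass the activation through unchanged."""
    for i in range(NUM_LAYERS):
        if coding_vector[i] > 0:
            x = torch.tanh(x * layer_weights[i])  # stand-in for an encoder layer
    return x

# Frozen stand-in for the pre-trained teacher model.
teacher = nn.Sequential(nn.Linear(HIDDEN, HIDDEN), nn.Tanh())
for p in teacher.parameters():
    p.requires_grad_(False)

generator = StructureGenerator()
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(32, HIDDEN)               # small batch of training data
    coding_vector = sample_coding_vector()    # new sampled structure each iteration
    layer_weights = generator(coding_vector)  # generator -> structure weights
    loss = nn.functional.mse_loss(            # distillation loss: match teacher
        student_forward(x, coding_vector, layer_weights),
        teacher(x),
    )
    optimizer.zero_grad()
    loss.backward()                           # trains the generator jointly with
    optimizer.step()                          # whichever structure was sampled
```

The design point the abstract emphasizes is that the trainable parameters live in the generator, not in any single student: because the coding vector is resampled every iteration, one meta-network learns to emit weights for many candidate distillation structures at once.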