Patents by Inventor Bodong Zhang

Bodong Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250086952
    Abstract: A method for running an edge-cloud fusion-aware visual prompt large language model includes training a large language model feature encoder and a small feature extraction model, inputting knowledge-based text prompts to the large language model feature encoder in an edge device to generate a plurality of knowledge-based text embeddings, and building a large language model database in the edge device from those embeddings. A text prompt is then input to the large language model feature encoder in the edge device to generate a text query embedding, and the text query embedding is compared with the large language model database to generate a first similarity score. If the first similarity score is larger than a first threshold, the text query embedding is input to the small feature extraction model to generate a first answer (a retrieval sketch follows this listing).
    Type: Application
    Filed: January 11, 2024
    Publication date: March 13, 2025
    Applicant: Kneron (Taiwan) Co., Ltd.
    Inventors: Jie Wu, Bodong Zhang, Gen Sun, Junjie Su, Hsiang-Tsun Li, Chun-Chen Liu
  • Patent number: 11488019
    Abstract: A method of pruning a batch normalization layer from a pre-trained deep neural network model is proposed. The pre-trained deep neural network model is input as a candidate model. The candidate model is pruned by removing at least one batch normalization layer to form a pruned candidate model, but only when that batch normalization layer is connected and adjacent to a corresponding linear operation layer. The corresponding linear operation layer may be a convolution layer, a dense layer, a depthwise convolution layer, or a group convolution layer. Weights of the corresponding linear operation layer are adjusted to compensate for the removal of the batch normalization layer (a folding sketch follows this listing). The pruned candidate model is then output and utilized for inference.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: November 1, 2022
    Assignee: Kneron (Taiwan) Co., Ltd.
    Inventors: Bike Xie, Junjie Su, Bodong Zhang, Chun-Chen Liu
  • Patent number: 11403528
    Abstract: A method of compressing a pre-trained deep neural network model includes inputting the pre-trained deep neural network model as a candidate model. The candidate model is compressed by increasing sparsity of the candidate model, removing at least one batch normalization layer present in the candidate model, and quantizing all remaining weights into fixed-point representation to form a compressed model. Accuracy of the compressed model is then determined using an end-user training and validation data set. Compression of the candidate model is repeated when the accuracy improves; when the accuracy declines, the hyperparameters for compressing the candidate model are adjusted and compression is repeated. The compressed model is output for inference utilization when the accuracy meets or exceeds the end-user performance metric and target (a sketch of this loop follows this listing).
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: August 2, 2022
    Assignee: Kneron (Taiwan) Co., Ltd.
    Inventors: Bike Xie, Junjie Su, Jie Wu, Bodong Zhang, Chun-Chen Liu
  • Publication number: 20190370656
    Abstract: A method of pruning a batch normalization layer from a pre-trained deep neural network model is proposed. The pre-trained deep neural network model is input as a candidate model. The candidate model is pruned by removing at least one batch normalization layer to form a pruned candidate model, but only when that batch normalization layer is connected and adjacent to a corresponding linear operation layer. The corresponding linear operation layer may be a convolution layer, a dense layer, a depthwise convolution layer, or a group convolution layer. Weights of the corresponding linear operation layer are adjusted to compensate for the removal of the batch normalization layer. The pruned candidate model is then output and utilized for inference.
    Type: Application
    Filed: January 24, 2019
    Publication date: December 5, 2019
    Inventors: Bike Xie, Junjie Su, Bodong Zhang, Chun-Chen Liu
  • Publication number: 20190370658
    Abstract: A method of compressing a pre-trained deep neural network model includes inputting the pre-trained deep neural network model as a candidate model. The candidate model is compressed by increasing sparsity of the candidate model, removing at least one batch normalization layer present in the candidate model, and quantizing all remaining weights into fixed-point representation to form a compressed model. Accuracy of the compressed model is then determined using an end-user training and validation data set. Compression of the candidate model is repeated when the accuracy improves; when the accuracy declines, the hyperparameters for compressing the candidate model are adjusted and compression is repeated. The compressed model is output for inference utilization when the accuracy meets or exceeds the end-user performance metric and target.
    Type: Application
    Filed: April 18, 2019
    Publication date: December 5, 2019
    Inventors: Bike Xie, Junjie Su, Jie Wu, Bodong Zhang, Chun-Chen Liu
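
The retrieval step in publication 20250086952 can be pictured with a short sketch. This is a minimal illustration under stated assumptions, not the patented implementation: encode_text and small_model are hypothetical stand-ins for the large language model feature encoder and the small feature extraction model, and the 0.8 threshold value is an assumption.

```python
import hashlib
import numpy as np

def encode_text(text: str, dim: int = 512) -> np.ndarray:
    """Hypothetical stand-in for the LLM feature encoder: text -> unit-norm embedding."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def build_database(knowledge_prompts: list[str]) -> np.ndarray:
    """Embed the knowledge-based text prompts to form the on-device database."""
    return np.stack([encode_text(p) for p in knowledge_prompts])

def small_model(query_embedding: np.ndarray) -> str:
    """Hypothetical stand-in for the small feature extraction model."""
    return "answer produced on the edge device"

def answer_on_edge(query: str, database: np.ndarray, threshold: float = 0.8):
    q = encode_text(query)          # text query embedding
    scores = database @ q           # cosine similarities (embeddings are unit norm)
    if scores.max() > threshold:    # first similarity score exceeds first threshold
        return small_model(q)       # answer locally with the small model
    return None                     # otherwise defer (e.g., escalate to the cloud)
```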
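
Patent 11488019 (and its publication 20190370656) compensates for a removed batch normalization layer by adjusting the weights of the adjacent linear operation layer. The sketch below shows the standard folding arithmetic consistent with that description; the function name and NumPy layout are illustrative, not taken from the patent.

```python
import numpy as np

def fold_batch_norm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold a batch normalization layer that follows a linear operation layer
    (convolution, dense, depthwise or group convolution) into that layer's
    weights and bias, so the pruned layer alone reproduces linear -> BN.
    """
    scale = gamma / np.sqrt(var + eps)          # per-output-channel scale
    # Broadcast the scale over each output channel of the weight tensor,
    # e.g. (out, in, kh, kw) for a convolution or (out, in) for a dense layer.
    new_weight = weight * scale.reshape(-1, *([1] * (weight.ndim - 1)))
    new_bias = (bias - mean) * scale + beta     # absorb the BN shift into the bias
    return new_weight, new_bias
```

Because batch normalization reduces to an affine per-channel transform at inference time, it can always be absorbed this way when it is connected and adjacent to a linear operation layer, which is exactly the condition the abstract imposes.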
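
Patent 11403528 (and its publication 20190370658) describes an iterative compress-and-evaluate loop. A minimal sketch of that control flow follows, assuming caller-supplied hooks: sparsify, remove_bn, quantize, evaluate, and adjust_hparams are hypothetical, and the round cap is an added assumption so the loop terminates.

```python
from typing import Any, Callable

def compress_model(
    model: Any,
    sparsify: Callable,        # increases sparsity of the candidate model
    remove_bn: Callable,       # removes batch normalization layers
    quantize: Callable,        # quantizes remaining weights to fixed point
    evaluate: Callable,        # accuracy on the end-user training/validation set
    adjust_hparams: Callable,  # retunes the compression hyperparameters
    hparams: dict,
    target: float,
    max_rounds: int = 20,      # assumption: cap added so the loop terminates
) -> Any:
    candidate, best = model, evaluate(model)
    for _ in range(max_rounds):
        compressed = quantize(remove_bn(sparsify(candidate, hparams)))
        accuracy = evaluate(compressed)
        if accuracy >= target:
            return compressed      # meets or exceeds the end-user target: output it
        if accuracy > best:        # accuracy improved: repeat compression
            candidate, best = compressed, accuracy
        else:                      # accuracy declined: adjust hyperparameters, retry
            hparams = adjust_hparams(hparams)
    return candidate
```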