Patents by Inventor Bolei HE

Bolei HE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

QUESTION ANSWERING METHOD BASED ON LARGE MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Publication number: 20260087055

Abstract: A question answering method based on a large model is performed by an electronic device. The method includes: determining a first result corresponding to a query statement based on a first large model, in which the first result includes a first response corresponding to the query statement, a first response logic corresponding to the first response, and first context knowledge corresponding to the first response; determining a retrieval result corresponding to the query statement by retrieving in a database based on the query statement; determining a type of a knowledge conflict between the first result and the retrieval result; and determining a target response corresponding to the query statement based on the first result, the retrieval result, and the type of the knowledge conflict.

Type: Application

Filed: December 12, 2024

Publication date: March 26, 2026

Applicant: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventor: Bolei He
METHOD FOR TRAINING TEXT QUESTION AND ANSWER MODEL, AND ELECTRONIC DEVICE

Publication number: 20250390683

Abstract: A method for training a text question and answer (Q&A) model is performed by an electronic device. The method includes: determining a sample question text set and a sample answer text corresponding to a sample question text in the sample question text set; inputting the sample question text into a text Q&A model to be trained, and obtaining a predicted answer text output by the text Q&A model and at least one prediction probability of at least one reference character on each character position in the predicted answer text; determining an uncertainty degree of the predicted answer text; and obtaining a trained text Q&A model by adjusting a parameter of the text Q&A model based on the sample answer text, the predicted answer text and the uncertainty degree of the predicted answer text.

Type: Application

Filed: December 12, 2024

Publication date: December 25, 2025

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Xinran He, Bolei He
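The abstract above does not say how the uncertainty degree is computed from the per-position prediction probabilities. As a minimal sketch, assuming (this is not stated in the abstract) that uncertainty is the average Shannon entropy over character positions, the idea could look like the following. The function names `position_entropy` and `uncertainty_degree` are hypothetical.

```python
import math

def position_entropy(probs):
    """Shannon entropy (in nats) of one position's candidate-character probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def uncertainty_degree(per_position_probs):
    """Assumed definition: mean entropy over all character positions of a predicted answer."""
    if not per_position_probs:
        return 0.0
    total = sum(position_entropy(p) for p in per_position_probs)
    return total / len(per_position_probs)

# Illustrative inputs: each inner list holds the probabilities of the
# reference characters at one position of the predicted answer text.
confident = [[1.0, 0.0], [1.0, 0.0]]
unsure = [[0.5, 0.5], [0.5, 0.5]]
```

Under this assumed definition, a prediction whose probability mass sits on a single character at every position scores zero, and a prediction split evenly between two characters scores ln 2 per position. A training loop could then weight the loss by this value, though the abstract leaves that choice open.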
METHOD FOR MODEL TRAINING BASED ON LARGE MODEL, QUESTION ANSWERING METHOD, AND ELECTRONIC DEVICE

Publication number: 20250117668

Abstract: A method for model training based on a large model includes: determining a first large model as a teacher model of a language model, and performing distillation learning on the language model based on the first large model; inputting a first prompt text into the language model, and obtaining a plurality of first response texts for the first prompt text output by the language model; determining a reference response text for the first prompt text from the plurality of first response texts; and training the language model based on the reference response text for the first prompt text.

Type: Application

Filed: December 19, 2024

Publication date: April 10, 2025

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Xinran He, Xianwei Xue, Bolei He, Kunbin Chen, Jinchang Luo, Ruigao Li
METHOD AND APPARATUS FOR TARGET BUSINESS MODEL GENERATION AND DATA PROCESSING BASED ON LARGE MODEL

Publication number: 20250117734

Abstract: A method and apparatus for target business model generation and data processing based on a large language model are disclosed, relating to the field of artificial intelligence technology, specifically the areas of intelligent office, big data, and large models. A method for generating a target business model based on a large language model includes: performing knowledge distillation on at least two pre-trained large models to obtain a base model of a target scenario, wherein each pre-trained large model corresponds to one of at least two business types included in the target scenario; and performing knowledge distillation on the base model to obtain a target business model of a target business type among the at least two business types, wherein the target business model is used for processing data of the target business type.

Type: Application

Filed: December 17, 2024

Publication date: April 10, 2025

Applicant: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventor: Bolei HE
METHOD FOR GENERATING TEXT TRAINING SAMPLE BASED ON LARGE MODEL, AND ELECTRONIC DEVICE

Publication number: 20250117714

Abstract: A method for generating a text training sample based on a large model includes: obtaining at least two query clusters by clustering at least two queries; obtaining a first query from each query cluster; generating at least two second queries under a set theme through a first large model by taking the first query as an example; and generating a first text training sample for fine-tuning a second large model based on the second query.

Type: Application

Filed: December 19, 2024

Publication date: April 10, 2025

Applicant: BAIDU INTERNATIONAL TECHNOLOGY (SHENZHEN) CO., LTD.

Inventors: Jinchang Luo, Bolei He, Kunbin Chen, Wei He
METHOD FOR INFORMATION PROCESSING BASED ON LARGE LANGUAGE MODEL

Publication number: 20250013676

Abstract: A computer-implemented method for information processing based on a large language model is provided. The method includes obtaining query information provided by a user. The method further includes determining memory information related to the query information. The method further includes determining, based on the query information and the memory information, a tool for processing the query information. The method further includes invoking the tool to obtain auxiliary information. The method further includes generating, based on the query information and the auxiliary information, a result of processing the query information.

Type: Application

Filed: September 19, 2024

Publication date: January 9, 2025

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Jinchang LUO, Bolei HE, Kunbin CHEN, Wei HE
METHOD AND APPARATUS FOR TRAINING A LARGE LANGUAGE MODEL, AND MEDIUM

Publication number: 20250013876

Abstract: An apparatus for training a large language model includes: at least one sample text instruction is input into a target large language model to obtain at least one standard response text, and the at least one sample text instruction is input into a large language model to be trained to obtain at least one predicted response text. A first sample response text is determined from the at least one standard response text according to the score difference between a first quality score of a standard response text and a second quality score of a predicted response text. A first target training sample is generated according to the first sample response text and a sample text instruction corresponding to the first sample response text, and a training dataset is constructed according to the first target training sample.

Type: Application

Filed: September 19, 2024

Publication date: January 9, 2025

Inventors: Xianwei XUE, Qiutong PAN, Jinchang LUO, Bolei HE, Wei HE
Method for denoising click data, electronic device and storage medium

Patent number: 12174824

Abstract: A method for denoising click data includes: acquiring a set of click data including pieces of first click data and a real label corresponding to each piece of first click data; extracting feature vectors of each piece of first click data with a graph model; dividing the feature vectors into sets of feature vectors; obtaining trained binary classification models by training binary classification models with the sets of feature vectors; for each of the feature vectors, obtaining prediction values corresponding to the feature vector by predicting the feature vector with the trained binary classification models, and calculating a prediction label of the feature vector based on the prediction values of the feature vector; and removing noise data in the pieces of first click data, based on the pieces of first click data, the real label and the prediction label of each piece of first click data.

Type: Grant

Filed: December 29, 2022

Date of Patent: December 24, 2024

Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Wei Xu, Xiaoling Xia, Junxiang Jiang, Chengtai Cao, Bolei He, Kunbin Chen, Wei He
Method and apparatus for constructing quality evaluation model, device and storage medium

Patent number: 11797607

Abstract: Embodiments of the present disclosure disclose a method and apparatus for constructing a quality evaluation model, an electronic device and a computer-readable storage medium. A specific implementation mode of the method comprises: acquiring samples of knowledge contents; extracting statistical features, semantic features, and image features respectively from the samples of knowledge contents; and constructing a quality evaluation model for knowledge according to the statistical features, the semantic features, and the image features. On the basis of the prior art, this implementation mode additionally uses semantic features and image features of knowledge contents to construct a more accurate quality evaluation model based on multi-dimensional features that characterize the actual quality of a knowledge, which may well discover some brief but very useful summary knowledge in an enterprise and may recommend high-quality knowledge more accurately for employees in the enterprise.

Type: Grant

Filed: March 24, 2021

Date of Patent: October 24, 2023

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Huan Liu, Mingquan Cheng, Kunbin Chen, Zhun Liu, Bolei He, Wei He
METHOD FOR DENOISING CLICK DATA, ELECTRONIC DEVICE AND STORAGE MEDIUM

Publication number: 20230132618

Abstract: A method for denoising click data includes: acquiring a set of click data including pieces of first click data and a real label corresponding to each piece of first click data; extracting feature vectors of each piece of first click data with a graph model; dividing the feature vectors into sets of feature vectors; obtaining trained binary classification models by training binary classification models with the sets of feature vectors; for each of the feature vectors, obtaining prediction values corresponding to the feature vector by predicting the feature vector with the trained binary classification models, and calculating a prediction label of the feature vector based on the prediction values of the feature vector; and removing noise data in the pieces of first click data, based on the pieces of first click data, the real label and the prediction label of each piece of first click data.

Type: Application

Filed: December 29, 2022

Publication date: May 4, 2023

Inventors: Wei XU, Xiaoling XIA, Junxiang JIANG, Chengtai CAO, Bolei HE, Kunbin CHEN, Wei HE
Pre-training method for sentiment analysis model, and electronic device

Patent number: 11537792

Abstract: The present disclosure provides a pre-training method for a sentiment analysis model and an electronic device, which relate to the field of artificial intelligence technologies. The method includes: based on a given seed sentiment dictionary, performing sentiment knowledge detection on a training corpus in a training corpus set, and determining a detection sentiment word and a detection word pair of the training corpus; according to preset mask processing rules, performing mask processing on the training corpus to generate a masked corpus; performing encoding and decoding on the masked corpus by using a preset encoder and decoder to determine a prediction sentiment word and a prediction word pair of the training corpus; and updating the preset encoder and decoder according to a difference between the prediction sentiment word and the detection sentiment word, and a difference between the prediction word pair and the detection word pair.

Type: Grant

Filed: July 21, 2020

Date of Patent: December 27, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Can Gao, Hao Liu, Bolei He, Xinyan Xiao, Hao Tian
Comment information processing method and apparatus, and medium

Patent number: 11507751

Abstract: The present disclosure discloses a comment information processing method and apparatus, and a medium. The specific implementation solution is: in response to a user operation, determining an opinion category corresponding to each opinion phrase in a comment opinion dictionary; obtaining a target corpus matching each opinion phrase from a plurality of comment corpora; for each opinion phrase, using a corresponding opinion category to label the target corpus matching each opinion phrase to obtain a first training sample; and training a classification model with the first training sample to identify the opinion category of a comment by using a trained classification model.

Type: Grant

Filed: July 24, 2020

Date of Patent: November 22, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Hao Liu, Bolei He, Xinyan Xiao
Method for generating tag of video, electronic device, and storage medium

Patent number: 11508153

Abstract: A method for generating a tag of a video, an electronic device, and a storage medium are related to a field of natural language processing and deep learning technologies. The detailed implementing solution includes: obtaining multiple candidate tags and video information of the video; determining first correlation information between the video information and each of the multiple candidate tags; sorting the multiple candidate tags based on the first correlation information to obtain a sort result; and generating the tag of the video based on the sort result.

Type: Grant

Filed: December 8, 2020

Date of Patent: November 22, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Chengxiang Liu, Hao Liu, Bolei He
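The steps in the abstract above (score each candidate tag against the video information, sort, then pick tags from the sorted result) can be sketched as follows. This is an illustration only: the abstract does not specify the correlation measure, so the word-overlap score in `correlation` is a stand-in assumption, and `generate_tags` and `top_k` are hypothetical names.

```python
def correlation(video_info, tag):
    """Hypothetical relevance score: share of the tag's words found in the video text."""
    video_words = set(video_info.lower().split())
    tag_words = tag.lower().split()
    if not tag_words:
        return 0.0
    return sum(word in video_words for word in tag_words) / len(tag_words)

def generate_tags(video_info, candidate_tags, top_k=2):
    """Sort candidates by their correlation score and keep the top_k as the video's tags."""
    ranked = sorted(
        candidate_tags,
        key=lambda tag: correlation(video_info, tag),
        reverse=True,  # highest correlation first; ties keep their input order
    )
    return ranked[:top_k]

tags = generate_tags(
    "street food tour in Chengdu",
    ["food tour", "Chengdu", "car review"],
    top_k=2,
)
```

In a real system the score would more plausibly come from a learned semantic matching model; the sorting and selection steps would stay the same.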
METHOD FOR SEARCHING INSTANT MESSAGING OBJECT, ELECTRONIC DEVICE AND STORAGE MEDIUM

Publication number: 20220365941

Abstract: The disclosure provides a method for searching an instant messaging object, an electronic device and a storage medium. The method includes: receiving a search request of a first object, and determining a type of the search request; obtaining at least one recall set of the first object based on a client-side search engine in an instant messaging system in response to the type of the search request being a first type; obtaining at least one candidate object corresponding to a search keyword in the search request based on the search keyword and the at least one recall set; obtaining feature information of each candidate object; and responding to the search request by sorting the at least one candidate object based on the feature information.

Type: Application

Filed: July 11, 2022

Publication date: November 17, 2022

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventors: Qiutong Pan, Ruigao Li, Yanan Li, Bolei He
METHOD AND APPARATUS FOR GENERATING ACCOUNT INTIMACY

Publication number: 20220286416

Abstract: A method for generating an account intimacy includes: obtaining a set of accounts in an instant messaging (IM) group; obtaining a communication frequency between two accounts in the set of accounts within a preset time period; generating a communication network graph based on the communication frequency; obtaining an embedding vector of each account output by a graph model, in which the graph model is trained based on the communication network graph; and generating an intimacy between two accounts based on the embedding vectors of the two accounts.

Type: Application

Filed: May 25, 2022

Publication date: September 8, 2022

Inventors: Shijie CAO, Yanan LI, Bolei HE, Kunbin CHEN, Wei HE, Feng HE
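The final step in the abstract above derives an intimacy value from the embedding vectors of two accounts. A minimal sketch, assuming cosine similarity as the comparison function (the abstract does not name one) and using hand-written vectors in place of the graph model's output:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical embeddings; in the described method these would be output
# by a graph model trained on the communication network graph.
embeddings = {
    "alice": [1.0, 0.0],
    "bob": [1.0, 0.0],
    "carol": [0.0, 1.0],
}

def intimacy(account_a, account_b):
    """Assumed intimacy measure: cosine similarity of the two accounts' embeddings."""
    return cosine_similarity(embeddings[account_a], embeddings[account_b])
```

With these illustrative vectors, accounts with identical embeddings get the maximum score and accounts with orthogonal embeddings get zero.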
Cross-modality processing method and apparatus, and computer storage medium

Patent number: 11341366

Abstract: A cross-modality processing method is related to a field of natural language processing technologies. The method includes: obtaining a sample set, wherein the sample set includes a plurality of corpus and a plurality of images; generating a plurality of training samples according to the sample set, in which each of the plurality of the training samples is a combination of at least one of the plurality of the corpus and at least one of the plurality of the images corresponding to the at least one of the plurality of the corpus; adopting the plurality of the training samples to train a semantic model, so that the semantic model learns semantic vectors containing combinations of the corpus and the images.

Type: Grant

Filed: August 10, 2020

Date of Patent: May 24, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Guocheng Niu, Bolei He, Xinyan Xiao
METHOD FOR RECOMMENDING DOCUMENT, ELECTRONIC DEVICE AND STORAGE MEDIUM

Publication number: 20220121668

Abstract: The present disclosure provides a method of recommending a document, an electronic device, and a storage medium, relating to fields of intelligent recommendation, deep learning etc. The method of recommending a document includes: acquiring a document operated by a user, as a reference document; determining, from a plurality of initial documents, at least one candidate document for the reference document, wherein a document content of each candidate document is associated with a document content of the reference document, based on preset knowledge system data; and recommending a target document in the at least one candidate document to the user, the target document including a document that the user is currently interested in and a document that the user is interested in after a preset time period.

Type: Application

Filed: December 29, 2021

Publication date: April 21, 2022

Inventors: Wei XU, Xiaoling XIA, Bolei HE, Kunbin CHEN, Zhun LIU, Wei HE
Document recommendation method and device based on semantic tag

Patent number: 11216504

Abstract: A document recommendation method based on a semantic tag and a document recommendation device are provided. The method includes: for each document, acquiring a first candidate tag set corresponding to the document, and processing each first candidate tag in the first candidate tag set corresponding to the document to obtain a second candidate tag set corresponding to the document; performing normalization processing on each second candidate tag in the second candidate tag set corresponding to the document to obtain a third candidate tag set corresponding to the document; performing expansion processing on each third candidate tag in the third candidate tag set corresponding to the document, and acquiring a fourth candidate tag set corresponding to the document, to form a document library having semantic tags; and recommending a target document obtained from the document library having semantic tags to the user, according to a historical semantic tag.

Type: Grant

Filed: December 6, 2019

Date of Patent: January 4, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Guocheng Niu, Bolei He, Chengxiang Liu, Xinyan Xiao, Yajuan Lyu
METHOD FOR GENERATING TAG OF VIDEO, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Publication number: 20210383121

Abstract: A method for generating a tag of a video, an electronic device, and a storage medium are related to a field of natural language processing and deep learning technologies. The detailed implementing solution includes: obtaining multiple candidate tags and video information of the video; determining first correlation information between the video information and each of the multiple candidate tags; sorting the multiple candidate tags based on the first correlation information to obtain a sort result; and generating the tag of the video based on the sort result.

Type: Application

Filed: December 8, 2020

Publication date: December 9, 2021

Inventors: Chengxiang LIU, Hao LIU, Bolei HE
INFORMATION PROCESSING METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM

Publication number: 20210374195

Abstract: The present disclosure provides an information processing method, an electronic device and a computer storage medium, and relates to a field of information processing. The method includes: obtaining a first content based on a first search keyword indicating a first event and a second search keyword indicating an object related to the first event; obtaining information associated with an attribute of the object from the first content; obtaining a second content based on the first search keyword and a third search keyword indicating a result at least caused by the first event; and generating statistical data associated with the first event based on the information and the second content.

Type: Application

Filed: November 18, 2020

Publication date: December 2, 2021

Inventors: Lei CHEN, Bolei HE, Kai LIU, Lei HAN, Ke SUN
