Patents by Inventor Xinyan Xiao

Xinyan Xiao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11928434
    Abstract: A method for text generation relates to the field of natural language processing and includes: obtaining corpus data; labeling the corpus data to obtain a first constraint element; obtaining a first generation target; and generating a first text that matches the first generation target by inputting the corpus data and the first constraint element into a generation model.
    Type: Grant
    Filed: August 9, 2021
    Date of Patent: March 12, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang
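    Sketch: The abstract above outlines a constraint-driven generation pipeline. The minimal Python sketch below (standard library only) illustrates the general shape of such a flow; the keyword-based labelling step and the prompt-style generate function are hypothetical stand-ins for the labelled constraint elements and the trained generation model, not the claimed implementation.

        # Toy constrained-generation pipeline: label constraint elements from the
        # corpus, then hand corpus + constraints + generation target to a "model".
        import re
        from collections import Counter

        STOPWORDS = {"the", "a", "an", "of", "and", "to"}

        def label_constraints(corpus: str, top_k: int = 3) -> list[str]:
            """Toy labelling step: treat frequent content words as constraint elements."""
            words = [w for w in re.findall(r"[a-z]+", corpus.lower()) if w not in STOPWORDS]
            return [w for w, _ in Counter(words).most_common(top_k)]

        def generate(corpus: str, constraints: list[str], target: str) -> str:
            """Placeholder for a learned generation model: just formats its inputs."""
            return f"[target={target}] [constraints={', '.join(constraints)}] {corpus[:60]}..."

        if __name__ == "__main__":
            corpus = "Baidu released a new model. The model improves text generation quality."
            constraints = label_constraints(corpus)                   # first constraint element(s)
            print(generate(corpus, constraints, target="headline"))   # first generation target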
  • Publication number: 20230214423
    Abstract: A video generation method is provided. The video generation method includes: obtaining global semantic information and local semantic information of a text, where the local semantic information corresponds to a text fragment in the text; searching, based on the global semantic information, a database to obtain at least one first data corresponding to the global semantic information; searching, based on the local semantic information, the database to obtain at least one second data corresponding to the local semantic information; obtaining, based on the at least one first data and the at least one second data, a candidate data set; matching, based on a relevancy between each of at least one text fragment and corresponding candidate data in the candidate data set, target data for the at least one text fragment; and generating, based on the target data matched with each of the at least one text fragment, a video.
    Type: Application
    Filed: February 24, 2023
    Publication date: July 6, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Haifeng WANG, Hao TIAN, Xinyan XIAO, Xing LI, Tian WU
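    Sketch: The abstract above describes retrieving candidates for both the whole text and each fragment, then matching each fragment with its most relevant candidate. The Python sketch below (standard library only) shows that retrieve-then-match shape with a toy word-overlap relevancy; the database entries, the relevance function and all names are illustrative assumptions, not the claimed method.

        # Toy retrieve-then-match: global and local queries both populate a candidate
        # set, and each text fragment is paired with its most relevant candidate.

        def relevance(a: str, b: str) -> float:
            """Word-overlap relevancy between two strings (stand-in for a learned model)."""
            wa, wb = set(a.lower().split()), set(b.lower().split())
            return len(wa & wb) / max(len(wa | wb), 1)

        def search(database: list[str], query: str, top_k: int = 2) -> list[str]:
            return sorted(database, key=lambda d: relevance(d, query), reverse=True)[:top_k]

        def match_fragments(text: str, fragments: list[str], database: list[str]) -> dict[str, str]:
            candidates = set(search(database, text))                 # first data: global semantics
            for frag in fragments:                                   # second data: local semantics
                candidates |= set(search(database, frag))
            return {frag: max(candidates, key=lambda c: relevance(c, frag)) for frag in fragments}

        if __name__ == "__main__":
            db = ["clip of a city skyline at night",
                  "clip of a robot arm assembling parts",
                  "clip of people walking in a park"]
            fragments = ["a robot arm assembles parts", "the city skyline glows at night"]
            print(match_fragments(" ".join(fragments), fragments, db))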
  • Patent number: 11687715
    Abstract: Embodiments of the application disclose a summary generation method and apparatus. A specific embodiment of the method comprises: acquiring a target article including a headline and a body of the article; determining whether a question is included in the headline; determining, in the body of the article, an information-satisfied paragraph including an answer to the question, in response to determining that the question is included in the headline; and generating a summary of the target article based on the determined information-satisfied paragraph. The above embodiment may generate a summary that directly satisfies a user's demand for information.
    Type: Grant
    Filed: June 2, 2020
    Date of Patent: June 27, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Moye Chen, Wei Xu, Jiachen Liu, Xinyan Xiao, Qiaoqiao She
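    Sketch: The abstract above hinges on two steps: detecting a question in the headline and locating the body paragraph that answers it. The Python sketch below (standard library only) mimics that flow with a question-mark and word-overlap heuristic standing in for the trained components; it is an illustration of the idea, not the patented method.

        # Toy question-headline summarizer: if the headline asks a question, return
        # the body paragraph most likely to answer it as the summary.

        def is_question(headline: str) -> bool:
            return headline.strip().endswith("?") or headline.lower().startswith(("how", "why", "what"))

        def answer_paragraph(headline: str, paragraphs: list[str]) -> str:
            """Score paragraphs by word overlap with the headline (stand-in for a QA model)."""
            head_words = set(headline.lower().split())
            return max(paragraphs, key=lambda p: len(head_words & set(p.lower().split())))

        def summarize(headline: str, body: str) -> str:
            paragraphs = [p for p in body.split("\n\n") if p.strip()]
            if is_question(headline) and paragraphs:
                return answer_paragraph(headline, paragraphs)   # information-satisfied paragraph
            return paragraphs[0] if paragraphs else ""

        if __name__ == "__main__":
            headline = "Why does the battery drain so fast?"
            body = ("The phone ships with a 5000 mAh battery.\n\n"
                    "The battery drains fast because background sync runs every minute.")
            print(summarize(headline, body))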
  • Patent number: 11675983
    Abstract: A method for implementing text generation, a device and a medium are provided. The method includes: determining a target task type of a target text generation task from multiple task types supported by a pre-trained general text generation model; determining, based on a requirement of the target text generation task for a target output text, a first target output text attribute for the target text generation task from multiple output text attributes supported by the general text generation model; and fine-tuning the general text generation model based on a target training data set associated with the target text generation task to obtain a task-specific text generation model, by taking task indication information for the target task type and first attribute indication information for the first target output text attribute as at least part of an input of the general text generation model.
    Type: Grant
    Filed: May 26, 2021
    Date of Patent: June 13, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Jiachen Liu, Zhe Hu, Xinyan Xiao, Hua Wu
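    Sketch: The abstract above turns the task type and the output-text attribute into indication information that is fed to the model alongside each training example. The Python sketch below (standard library only) shows one common way such indicators can be encoded, as control tokens prepended to the input; the token names and example data are assumptions for illustration, not the claimed scheme.

        # Toy control-prefix construction: task and attribute indication tokens are
        # prepended to each source text before fine-tuning a general model.

        TASK_TOKENS = {"summarization": "<task:sum>", "question_generation": "<task:qg>"}
        ATTR_TOKENS = {"short": "<len:short>", "formal": "<style:formal>"}

        def build_example(task: str, attribute: str, source: str, target: str) -> dict[str, str]:
            """Attach task and attribute indication tokens to the model input."""
            prefix = f"{TASK_TOKENS[task]} {ATTR_TOKENS[attribute]}"
            return {"input": f"{prefix} {source}", "output": target}

        if __name__ == "__main__":
            ex = build_example("summarization", "short",
                               source="The report covers Q3 revenue, costs and guidance.",
                               target="Q3 results summary.")
            print(ex["input"])
            # A fine-tuning loop would feed ex["input"] / ex["output"] pairs to the model.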
  • Publication number: 20230114673
    Abstract: A method for recognizing a token is performed by an electronic device. The method includes: obtaining first modal data and second modal data; determining a first token of the first modal data and a second token of the second modal data; determining an associated token between the first token and the second token; and recognizing a target shared token between the first modal data and the second modal data based on the first token, the second token and the associated token.
    Type: Application
    Filed: November 29, 2022
    Publication date: April 13, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Wei Li, Xinyan Xiao, Jiachen Liu
  • Patent number: 11615242
    Abstract: A method and an apparatus for structuring data relate to information processing technologies in the field of natural language processing. An output sequence is obtained by acquiring an unstructured text and inputting the unstructured text into an encoder-decoder model. The encoder-decoder model is trained using a training text marked with the attribute value of each attribute. A structured representation is generated based on the attributes corresponding to the attribute elements included in the output sequence and the attribute values contained in the attribute elements.
    Type: Grant
    Filed: July 28, 2020
    Date of Patent: March 28, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Wei Jia, Dai Dai, Xinyan Xiao
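    Sketch: The abstract above ends with a structuring step in which the decoder's output sequence of attribute elements is turned into a structured representation. The Python sketch below (standard library only) illustrates just that parsing step, assuming a hypothetical "attribute = value; ..." output format; the real model and output conventions are not specified in the abstract.

        # Toy structuring step: parse a decoder output sequence of attribute elements
        # into a dictionary (attribute -> attribute value).

        def to_structured(output_sequence: str) -> dict[str, str]:
            """Parse elements of the form 'attribute = value' separated by ';'."""
            record = {}
            for element in output_sequence.split(";"):
                if "=" in element:
                    attribute, value = element.split("=", 1)
                    record[attribute.strip()] = value.strip()
            return record

        if __name__ == "__main__":
            # Stand-in for what an encoder-decoder model might emit for a product page.
            decoded = "brand = Acme; screen size = 6.1 inch; price = 499 USD"
            print(to_structured(decoded))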
  • Publication number: 20230084438
    Abstract: A method of generating a text, a method of training a text generation model, an electronic device, and a storage medium relate to the field of computer technology, in particular to deep learning and natural language processing technologies. A specific implementation solution includes: determining a reference feature representation of target semantic information; determining, based on the reference feature representation and at least one predetermined logical character, at least one sentence latent representation respectively corresponding to the at least one predetermined logical character; and generating a target text content based on the at least one sentence latent representation.
    Type: Application
    Filed: November 22, 2022
    Publication date: March 16, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Zhe Hu, Jiachen Liu, Xinyan Xiao
  • Patent number: 11574133
    Abstract: The disclosure may provide a method for obtaining a document layout, an electronic device, and a storage medium. The method may include: obtaining a plurality of pieces of first sample data; extracting structured information from each of the plurality of pieces of first sample data as target structured information corresponding to each of the plurality of pieces of first sample data; inputting the plurality of pieces of first sample data into an initial text generation model to generate predicted structured information corresponding to each of the plurality of pieces of first sample data; generating a first loss value based on a difference between the predicted structured information corresponding to each of the plurality of pieces of first sample data and the corresponding target structured information; and training a phrase generation ability of the initial text generation model based on the first loss value to generate the text generation model.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: February 7, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Wei Li, Xinyan Xiao, Hua Wu, Haifeng Wang
  • Publication number: 20230004589
    Abstract: The present disclosure provides a summary generation model training method and apparatus, a device and a storage medium, and relates to the field of computer technologies, and in particular, to the field of artificial intelligence such as natural language processing and deep learning. The summary generation model training method includes: acquiring a document representation corresponding to a document sample; constructing, based on the document representation, a summary representation corresponding to the document representation, the summary representation including a positive summary representation and a negative summary representation; and constructing a total contrastive loss function based on the document representation, the positive summary representation and the negative summary representation, and training a summary generation model based on the total contrastive loss function. The present disclosure may improve accuracy of the summary generation model.
    Type: Application
    Filed: January 18, 2022
    Publication date: January 5, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Wenhao WU, Wei LI, Xinyan XIAO, Jiachen LIU
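    Sketch: The abstract above builds positive and negative summary representations for each document and trains with a contrastive loss. The Python sketch below (standard library only) shows one standard contrastive formulation, an InfoNCE-style loss over cosine similarities, as a concrete reference point; the actual loss, encoders and representations in the disclosure may differ.

        # Toy document-summary contrastive loss: pull the document representation
        # toward the positive summary and away from negative summaries.
        import math

        def cosine(u: list[float], v: list[float]) -> float:
            dot = sum(a * b for a, b in zip(u, v))
            norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
            return dot / norm if norm else 0.0

        def contrastive_loss(doc: list[float], positive: list[float],
                             negatives: list[list[float]], temperature: float = 0.1) -> float:
            scores = [cosine(doc, positive)] + [cosine(doc, n) for n in negatives]
            exp = [math.exp(s / temperature) for s in scores]
            return -math.log(exp[0] / sum(exp))   # positive vs. all candidates

        if __name__ == "__main__":
            doc = [0.9, 0.1, 0.0]                        # document representation
            pos = [0.8, 0.2, 0.1]                        # faithful (positive) summary
            negs = [[0.0, 1.0, 0.0], [0.1, 0.0, 1.0]]    # corrupted (negative) summaries
            print(round(contrastive_loss(doc, pos, negs), 4))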
  • Publication number: 20230004721
    Abstract: Disclosed are a method for training a semantic representation model, a device and a storage medium, which relate to the field of computer technologies, and particularly to the field of artificial intelligence, such as natural language processing technology, deep learning technology, or the like.
    Type: Application
    Filed: March 21, 2022
    Publication date: January 5, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Shuai Zhang, Lijie Wang, Xinyan Xiao, Yue Chang
  • Publication number: 20230004717
    Abstract: The present disclosure provides a method and apparatus for acquiring a pre-trained model, an electronic device and a storage medium, and relates to the field of artificial intelligence, such as the natural language processing field, the deep learning field, or the like. The method may include: adding, in a process of training a pre-trained model using training sentences, a learning objective corresponding to syntactic information for a self-attention module in the pre-trained model; and training the pre-trained model according to the defined learning objective. The solution of the present disclosure may improve the performance of the pre-trained model, and reduce consumption of computing resources, or the like.
    Type: Application
    Filed: January 10, 2022
    Publication date: January 5, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Lijie WANG, Shuai ZHANG, Xinyan XIAO, Yue CHANG, Tingting LI
  • Patent number: 11537792
    Abstract: The present disclosure provides a pre-training method for a sentiment analysis model and an electronic device, which relates to the field of artificial intelligence technologies. The method includes: based on a given seed sentiment dictionary, performing sentiment knowledge detection on a training corpus in a training corpus set, and determining a detection sentiment word and a detection word pair of the training corpus; according to preset mask processing rules, performing mask processing on the training corpus to generate a masked corpus; performing encoding and decoding on the masked corpus by using a preset encoder and decoder to determine a prediction sentiment word and a prediction word pair of the training corpus; and updating the preset encoder and decoder according to a difference between the prediction sentiment word and the detection sentiment word, and a difference between the prediction word pair and the detection word pair.
    Type: Grant
    Filed: July 21, 2020
    Date of Patent: December 27, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Can Gao, Hao Liu, Bolei He, Xinyan Xiao, Hao Tian
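    Sketch: The abstract above starts from a seed sentiment dictionary, detects sentiment words and word pairs in a corpus, and masks them so an encoder and decoder can be trained to recover them. The Python sketch below (standard library only) shows a naive version of that detection-and-masking step; the tiny lexicon, the "following word" pairing rule and the [MASK] token are illustrative assumptions.

        # Toy sentiment knowledge detection and masking: record sentiment words and
        # word pairs found via a seed lexicon, then mask the sentiment words.

        SEED_LEXICON = {"great", "terrible", "love", "awful", "excellent"}

        def detect_and_mask(sentence: str) -> tuple[str, list[str], list[tuple[str, str]]]:
            tokens = sentence.lower().split()
            sentiment_words = [t for t in tokens if t in SEED_LEXICON]
            # Naive detection word pair: the sentiment word plus the token after it.
            word_pairs = [(t, tokens[i + 1]) for i, t in enumerate(tokens[:-1]) if t in SEED_LEXICON]
            masked = " ".join("[MASK]" if t in SEED_LEXICON else t for t in tokens)
            return masked, sentiment_words, word_pairs

        if __name__ == "__main__":
            masked, words, pairs = detect_and_mask("the excellent camera takes great photos")
            print(masked)          # the [MASK] camera takes [MASK] photos
            print(words, pairs)    # ['excellent', 'great'] [('excellent', 'camera'), ('great', 'photos')]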
  • Patent number: 11507748
    Abstract: Embodiments of the present disclosure provide methods and apparatus for outputting information. The method may include: obtaining a sentence to be identified; performing word segmentation on the sentence to be identified to obtain a word sequence; and inputting the word sequence into a pre-trained multi-task element recognition model based on sequence labeling and entity word prediction, and outputting the identified entity words, entity categories and entity word positions, where the multi-task element recognition model includes a sequence labeling network for performing a sequence labeling task and an entity word prediction network for performing an entity word prediction task, and the sequence labeling network is fused with the entity word prediction network through a fusion module.
    Type: Grant
    Filed: June 9, 2020
    Date of Patent: November 22, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Yuan Gao, Dai Dai, Xinyan Xiao
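    Sketch: The abstract above has the model output entity words, entity categories and entity word positions over a segmented word sequence. The Python sketch below (standard library only) illustrates only the final decoding of such outputs, assuming conventional BIO sequence labels; the multi-task model, the fusion module and the tag scheme are not specified by the abstract and are assumed here.

        # Toy BIO decoding: turn a word sequence plus BIO tags into
        # (entity word, entity category, entity word position) triples.

        def decode_bio(words: list[str], tags: list[str]) -> list[tuple[str, str, int]]:
            entities, current, category, start = [], [], None, -1
            for i, (word, tag) in enumerate(zip(words, tags)):
                if tag.startswith("B-"):
                    if current:
                        entities.append((" ".join(current), category, start))
                    current, category, start = [word], tag[2:], i
                elif tag.startswith("I-") and current:
                    current.append(word)
                else:
                    if current:
                        entities.append((" ".join(current), category, start))
                    current, category, start = [], None, -1
            if current:
                entities.append((" ".join(current), category, start))
            return entities

        if __name__ == "__main__":
            words = ["Baidu", "is", "based", "in", "Beijing"]
            tags = ["B-ORG", "O", "O", "O", "B-LOC"]
            print(decode_bio(words, tags))   # [('Baidu', 'ORG', 0), ('Beijing', 'LOC', 4)]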
  • Patent number: 11507751
    Abstract: The present disclosure provides a comment information processing method and apparatus, and a medium. The specific implementation solution is: in response to a user operation, determining an opinion category corresponding to each opinion phrase in a comment opinion dictionary; obtaining a target corpus matching each opinion phrase from a plurality of comment corpora; for each opinion phrase, labeling the matching target corpus with the corresponding opinion category to obtain a first training sample; and training a classification model with the first training sample, so that the opinion category of a comment can be identified by using the trained classification model.
    Type: Grant
    Filed: July 24, 2020
    Date of Patent: November 22, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Hao Liu, Bolei He, Xinyan Xiao
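    Sketch: The abstract above builds training samples by labeling each comment corpus that matches an opinion phrase with that phrase's opinion category. The Python sketch below (standard library only) shows this weak-labelling step with a two-entry dictionary and substring matching; the dictionary contents, category names and matching rule are assumptions for illustration.

        # Toy weak labelling: comments containing an opinion phrase inherit that
        # phrase's opinion category, producing training samples for a classifier.

        OPINION_DICT = {"battery lasts long": "battery:positive",
                        "screen too dim": "display:negative"}

        def build_training_samples(corpora: list[str]) -> list[tuple[str, str]]:
            samples = []
            for comment in corpora:
                for phrase, category in OPINION_DICT.items():
                    if phrase in comment.lower():
                        samples.append((comment, category))   # first training sample
            return samples

        if __name__ == "__main__":
            comments = ["The battery lasts long even with heavy use.",
                        "Nice phone but the screen too dim outdoors."]
            for text, label in build_training_samples(comments):
                print(label, "<-", text)
            # These (text, label) pairs would then be used to train the classification model.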
  • Patent number: 11475055
    Abstract: The present disclosure discloses an artificial intelligence-based method and apparatus for determining regional information. A specific embodiment of the method comprises: acquiring to-be-determined information, and extracting a keyword set of the to-be-determined information; inputting the keyword set of the to-be-determined information into a pre-trained subject classification model for classifying, to obtain a subject category of the to-be-determined information, wherein the subject classification model is used for representing a corresponding relation between the keyword set of the information and the subject category of the information; selecting, from a pre-stored place name set, a place name corresponding to the subject category of the to-be-determined information as a target place name set; matching, in the to-be-determined information, the target place name set; and determining, based on a matching result, whether the to-be-determined information belongs to the regional information.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: October 18, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Liangyu Chen, Xinyan Xiao, Yajuan Lv
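    Sketch: The abstract above decides whether a piece of information is regional by classifying its keywords into a subject category, looking up the place names stored for that category, and checking whether any of them appear in the text. The Python sketch below (standard library only) mirrors that decision logic with a keyword-overlap classifier and two toy categories; all dictionaries and names are illustrative assumptions.

        # Toy regional-information check: keywords -> subject category -> candidate
        # place names -> match against the text.

        SUBJECT_KEYWORDS = {"traffic": {"road", "subway", "congestion"},
                            "weather": {"rain", "storm", "temperature"}}
        PLACE_NAMES = {"traffic": ["Beijing", "Shanghai"],
                       "weather": ["Guangzhou", "Shenzhen"]}

        def classify_subject(keywords: set[str]) -> str:
            """Stand-in for the pre-trained subject classification model."""
            return max(SUBJECT_KEYWORDS, key=lambda s: len(keywords & SUBJECT_KEYWORDS[s]))

        def is_regional(text: str) -> bool:
            keywords = set(text.lower().split())
            subject = classify_subject(keywords)
            return any(place.lower() in text.lower() for place in PLACE_NAMES[subject])

        if __name__ == "__main__":
            print(is_regional("Heavy congestion reported on a Beijing road this morning"))  # True
            print(is_regional("A storm may bring heavy rain tomorrow"))                     # False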
  • Publication number: 20220327809
    Abstract: A method for training a model based on multi-modal data joint learning includes: obtaining multi-modal data, in which the multi-modal data include at least one type of single-modal data and at least one type of Pair multi-modal data; inputting the single-modal data and the Pair multi-modal data into a decoupling attention Transformer network model to generate Token semantic representation features and cross-modal semantic representation features, respectively; and training the decoupling attention Transformer network model based on the Token semantic representation features and the cross-modal semantic representation features.
    Type: Application
    Filed: June 27, 2022
    Publication date: October 13, 2022
    Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Wei Li, Can Gao, Guocheng Niu, Xinyan Xiao, Hao Liu, Jiachen Liu, Hua Wu, Haifeng Wang
  • Publication number: 20220318275
    Abstract: The disclosure provides a search method, an electronic device and a storage medium. The method includes: obtaining a query statement; determining a correlation between the query statement and a candidate result by matching the query statement with a first structured data set corresponding to the candidate result in a search database, in which the first structured data set is generated by performing information extraction on the candidate result with a trained structured information extraction model; and determining, based on the correlation, a target search result corresponding to the query statement.
    Type: Application
    Filed: June 23, 2022
    Publication date: October 6, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Wei Jia, Dai Dai, Xinyan Xiao
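    Sketch: The abstract above scores each candidate result by matching the query statement against the candidate's structured data set. The Python sketch below (standard library only) illustrates that matching step with a crude "how many structured field values does the query mention" correlation; the field names, scoring rule and data are assumptions, not the claimed ranking model.

        # Toy structured matching: correlation = fraction of a candidate's structured
        # field values that appear in the query statement.

        def correlation(query: str, structured: dict[str, str]) -> float:
            q = query.lower()
            hits = sum(1 for value in structured.values() if value.lower() in q)
            return hits / max(len(structured), 1)

        def search(query: str, candidates: dict[str, dict[str, str]]) -> str:
            return max(candidates, key=lambda cid: correlation(query, candidates[cid]))

        if __name__ == "__main__":
            candidates = {
                "doc1": {"brand": "acme", "product": "phone", "color": "black"},
                "doc2": {"brand": "acme", "product": "laptop", "color": "silver"},
            }
            print(search("black acme phone with good battery", candidates))   # doc1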
  • Publication number: 20220292269
    Abstract: The present disclosure discloses a method and apparatus for acquiring a pre-trained model, and relates to natural language processing and deep learning technologies in the field of artificial intelligence technologies. An implementation includes: acquiring training data, the training data including a single-modal language material and a multi-modal language material, and the multi-modal language material including a language material pair formed by a first-modal language material and a second-modal language material; and performing a multi-task training operation on a pre-trained model using the training data, the multi-task training including at least one cross-modal contrastive learning task and at least one single-modal learning task. The pre-trained language model obtained in the present disclosure may thus learn from different forms of language materials.
    Type: Application
    Filed: October 15, 2021
    Publication date: September 15, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Guocheng NIU, Wei LI, Can GAO, Xinyan XIAO, Hua WU
  • Publication number: 20220179889
    Abstract: The disclosure provides a method for generating a query statement. The method includes: determining a first vector representation based on known nodes in a first syntax tree corresponding to a query statement to be generated; determining a target generation strategy corresponding to a target node to be generated based on the first vector representation and a preset copy reference matrix; generating the target node based on the first vector representation or a second vector representation by performing the target generation strategy, in which the second vector representation is a vector representation corresponding to an adjacent query statement prior to the query statement to be generated; and generating the query statement based on the known nodes and a terminator in response to the target node being the terminator.
    Type: Application
    Filed: February 24, 2022
    Publication date: June 9, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Ao ZHANG, Lijie WANG, Xinyan XIAO, Tingting LI
  • Patent number: 11341366
    Abstract: A cross-modality processing method relates to the field of natural language processing technologies. The method includes: obtaining a sample set, wherein the sample set includes a plurality of corpora and a plurality of images; generating a plurality of training samples according to the sample set, in which each of the plurality of training samples is a combination of at least one of the plurality of corpora and at least one of the plurality of images corresponding to the at least one of the plurality of corpora; and adopting the plurality of training samples to train a semantic model, so that the semantic model learns semantic vectors containing combinations of the corpora and the images.
    Type: Grant
    Filed: August 10, 2020
    Date of Patent: May 24, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Guocheng Niu, Bolei He, Xinyan Xiao
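    Sketch: The abstract above generates training samples by combining each corpus with the images that correspond to it before training the semantic model. The Python sketch below (standard library only) shows only that pairing step; the keys, file names and data layout are assumptions for illustration, and the semantic model itself is out of scope.

        # Toy sample pairing: combine each corpus with its corresponding image(s) to
        # produce the multi-modal training samples.

        def build_samples(corpora: dict[str, str], images: dict[str, list[str]]) -> list[tuple[str, str]]:
            """Pair every corpus with each image that corresponds to it."""
            return [(text, image)
                    for key, text in corpora.items()
                    for image in images.get(key, [])]

        if __name__ == "__main__":
            corpora = {"c1": "a dog running on the beach", "c2": "a bowl of noodle soup"}
            images = {"c1": ["dog_beach_001.jpg"], "c2": ["noodles_01.jpg", "noodles_02.jpg"]}
            for text, image in build_samples(corpora, images):
                print(image, "<->", text)
            # The semantic model would then be trained on these (text, image) combinations.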