Patents by Inventor Chuanqiang ZHANG

Chuanqiang ZHANG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11704498
    Abstract: A method and apparatus for training models in machine translation, an electronic device and a storage medium are disclosed, which relates to the field of natural language processing technologies and the field of deep learning technologies. An implementation includes mining similar target sentences of a group of samples based on a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample of each sample in the group of samples based on the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set.
    Type: Grant
    Filed: March 12, 2021
    Date of Patent: July 18, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu
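The negative-sample mining step this abstract describes can be sketched as picking, for each sample, the candidate target most similar to the reference without being the reference. This is an illustrative toy, not the patented method: a word-overlap Jaccard score stands in for the trained semantic similarity model, and all names and data are hypothetical:

```python
def jaccard(a, b):
    """Toy word-overlap similarity; a stand-in for the trained model."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def mine_negatives(samples, candidate_targets, similarity=jaccard):
    """For each (source, target) pair, pick the candidate closest to the
    reference target without being identical to it -- a hard negative for
    training the semantic similarity model."""
    mined = []
    for src, tgt in samples:
        best, best_score = None, -1.0
        for cand in candidate_targets:
            if cand == tgt:
                continue  # the reference itself cannot be a negative
            score = similarity(tgt, cand)
            if score > best_score:
                best, best_score = cand, score
        mined.append((src, tgt, best))
    return mined

samples = [("ich mag katzen", "i like cats")]
candidates = ["i like cats", "i like dogs", "the weather is nice"]
print(mine_negatives(samples, candidates))
# → [('ich mag katzen', 'i like cats', 'i like dogs')]
```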
  • Publication number: 20230015313
    Abstract: Disclosed are a translation method, a classification model training method, a device and a storage medium, which relate to the field of computer technologies, particularly to the field of artificial intelligence such as natural language processing and deep learning. The translation method includes: obtaining a current processing unit of a source language text based on a segmented word in the source language text; determining a classification result of the current processing unit with a classification model; and in response to determining that the classification result is that the current processing unit is translatable separately, translating the current processing unit to obtain a translation result in a target language corresponding to the current processing unit.
    Type: Application
    Filed: March 23, 2022
    Publication date: January 19, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Chuanqiang Zhang, Ruiqing Zhang, Zhongjun He, Zhi Li, Hua Wu
  • Publication number: 20220391594
    Abstract: A display method, a method of training a semantic unit detection model, an electronic device, and a storage medium, which relate to the field of artificial intelligence technology, in particular to the fields of natural language processing and machine translation. The display method includes: acquiring a language sequence to be displayed; dividing the language sequence to be displayed into a plurality of semantic units with semantics; and converting the plurality of semantic units into subtitles for display one by one.
    Type: Application
    Filed: August 18, 2022
    Publication date: December 8, 2022
    Inventors: Haifeng Wang, Zhongjun He, Hua Wu, Zhanyi Liu, Zhi Li, Xing Wan, Jingxuan Zhao, Ruiqing Zhang, Chuanqiang Zhang, Fengtao Huang, Shuangshuang Cui, Yongzheng Xin
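The final step of this abstract, converting semantic units into subtitles one by one, can be sketched as a packing loop that never splits a unit across subtitle lines. A minimal illustration; the `max_chars` display limit and all names are assumptions, not part of the patent:

```python
def units_to_subtitles(units, max_chars=30):
    """Greedily pack consecutive semantic units into subtitle lines without
    splitting a unit across lines. A unit longer than max_chars still gets
    its own line, preserving its semantics."""
    lines, current = [], ""
    for unit in units:
        if current and len(current) + 1 + len(unit) > max_chars:
            lines.append(current)
            current = unit
        else:
            current = (current + " " + unit).strip()
    if current:
        lines.append(current)
    return lines

print(units_to_subtitles(["hello world", "this is", "a test"], max_chars=15))
# → ['hello world', 'this is a test']
```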
  • Publication number: 20220391602
    Abstract: A display method, an electronic device, and a storage medium, which relate to the fields of natural language processing and display. The display method includes: acquiring a content to be displayed; extracting a target term from the content using a term extraction rule; acquiring annotation information for at least one target term in response to the extraction of the at least one target term; and displaying the annotation information for the at least one target term together with the content.
    Type: Application
    Filed: August 18, 2022
    Publication date: December 8, 2022
    Inventors: Haifeng WANG, Zhanyi LIU, Zhongjun HE, Hua WU, Zhi LI, Xing WAN, Jingxuan ZHAO, Ruiqing ZHANG, Chuanqiang ZHANG, Fengtao HUANG, Hanbing SONG, Wei DI, Shuangshuang CUI, Yongzheng XIN
  • Patent number: 11423222
    Abstract: A method for text error correction includes: obtaining a text to be corrected; obtaining a pinyin sequence of the text to be corrected; and inputting the text to be corrected and the pinyin sequence to a text error correction model, to obtain a corrected text.
    Type: Grant
    Filed: April 28, 2021
    Date of Patent: August 23, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu
  • Patent number: 11409968
    Abstract: Embodiments of the present disclosure provide a language conversion method and apparatus based on syntactic linearity and a non-transitory computer-readable storage medium. The method includes: encoding a source sentence to be converted by using a preset encoder to determine a first vector and a second vector corresponding to the source sentence; determining a current mask vector according to a preset rule, in which the mask vector is configured to modify vectors output by the preset encoder; determining a third vector according to target language characters corresponding to source characters located before a first source character; and decoding the first vector, the second vector, the mask vector, and the third vector by using a preset decoder to generate a target character corresponding to the first source character.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: August 9, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Ruiqing Zhang, Chuanqiang Zhang, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang
  • Publication number: 20220180058
    Abstract: The present disclosure provides a text error correction method, apparatus, electronic device and storage medium, and relates to the technical field of artificial intelligence such as natural language processing and deep learning. A specific implementation solution is: obtaining a current sentence and a historical sentence of the current sentence in an article to which the current sentence belongs; performing text error correction processing on the current sentence based on the current sentence and the historical sentence. According to the technical solutions of the present disclosure, text error correction can be performed on the current sentence based on its historical sentence in the article, namely the preceding contextual information, so that the error correction information is richer and the error correction result is more accurate.
    Type: Application
    Filed: July 23, 2021
    Publication date: June 9, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Ruiqing ZHANG, Chuanqiang ZHANG, Zhongjun HE, Zhi LI, Hua WU
  • Patent number: 11275904
    Abstract: Embodiments of the present disclosure provide a method and an apparatus for translating a polysemy, and a medium. The method includes: obtaining a source language text; identifying and obtaining the polysemy from the source language text; inquiring related words corresponding to each interpretation of the polysemy; determining a target interpretation corresponding to the related words contained in the source language text; and translating the polysemy into the target interpretation.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: March 15, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Ruiqing Zhang, Chuanqiang Zhang, Hao Xiong, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang
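The related-word lookup this abstract describes can be illustrated with a toy dictionary: each interpretation of the polysemous word carries a set of related words, and the interpretation whose related words overlap the source text most wins. Everything here (the sense inventory, the overlap heuristic, the names) is a hypothetical sketch, not the patented implementation:

```python
# Hypothetical sense inventory; a real system would use a curated lexicon.
RELATED_WORDS = {
    "bank": {
        "financial institution": {"money", "loan", "deposit"},
        "river edge": {"river", "water", "shore"},
    },
}

def disambiguate(polysemy, source_words):
    """Pick the interpretation whose related words appear most often
    in the source language text."""
    best, best_overlap = None, 0
    for sense, related in RELATED_WORDS.get(polysemy, {}).items():
        overlap = len(related & set(source_words))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

words = "he walked along the river to the bank of the water".split()
print(disambiguate("bank", words))  # → river edge
```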
  • Publication number: 20210390266
    Abstract: A method and apparatus for training models in machine translation, an electronic device and a storage medium are disclosed, which relates to the field of natural language processing technologies and the field of deep learning technologies. An implementation includes mining similar target sentences of a group of samples based on a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample of each sample in the group of samples based on the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set.
    Type: Application
    Filed: March 12, 2021
    Publication date: December 16, 2021
    Inventors: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu
  • Publication number: 20210326538
    Abstract: A method for text translation includes obtaining a text to be translated; and inputting the text to be translated into a text translation model. The trained text translation model divides the text to be translated into a plurality of semantic units, determines N semantic units before a current semantic unit among the plurality of semantic units as local context semantic units, determines M semantic units before the local context semantic units as global context semantic units, and generates a translation result of the current semantic unit based on the local context semantic units and the global context semantic units. N is an integer, and M is an integer.
    Type: Application
    Filed: June 29, 2021
    Publication date: October 21, 2021
    Inventors: Chuanqiang ZHANG, Ruiqing ZHANG, Zhi LI, Zhongjun HE, Hua WU
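The windowing scheme in this abstract (N local-context units immediately before the current unit, and M global-context units before those) can be sketched directly; the function name is an illustrative assumption:

```python
def context_windows(units, i, n, m):
    """Return (global_ctx, local_ctx) for the semantic unit at index i:
    the n units immediately before it form the local context, and the m
    units before those form the global context (truncated at the start)."""
    local_start = max(0, i - n)
    global_start = max(0, local_start - m)
    return units[global_start:local_start], units[local_start:i]

units = ["u0", "u1", "u2", "u3", "u4", "u5"]
print(context_windows(units, 5, n=2, m=2))  # → (['u1', 'u2'], ['u3', 'u4'])
```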
  • Patent number: 11132518
    Abstract: A method and apparatus for translating speech are provided. The method may include: recognizing received to-be-recognized speech of a source language to obtain a recognized text; concatenating the obtained recognized text after a to-be-translated text, to form a concatenated to-be-translated text; inputting the concatenated to-be-translated text into a pre-trained discriminant model to obtain a discrimination result for characterizing whether the concatenated to-be-translated text is to be translated, where the discriminant model is used to characterize a corresponding relationship between a text and a discrimination result corresponding to the text; in response to the positive discrimination result being obtained, translating the concatenated to-be-translated text to obtain a translation result of a target language, and outputting the translation result.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: September 28, 2021
    Inventors: Chuanqiang Zhang, Tianchi Bi, Hao Xiong, Zhi Li, Zhongjun He, Haifeng Wang
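The recognize-concatenate-discriminate-translate loop in this abstract can be sketched with stubs standing in for the trained discriminant model and the translation system. Both stubs and all names are illustrative assumptions, not the patented components:

```python
def stream_translate(asr_segments, is_translatable, translate):
    """Concatenate each recognized segment onto a buffer; whenever the
    discriminant judges the buffer translatable, translate and flush it.
    Returns the translated outputs and any untranslated remainder."""
    buffer, outputs = "", []
    for seg in asr_segments:
        buffer = (buffer + " " + seg).strip()
        if is_translatable(buffer):
            outputs.append(translate(buffer))
            buffer = ""
    return outputs, buffer

def ends_clause(text):
    # Stub discriminant: treat clause-final punctuation as "translatable".
    return text.endswith((".", "?", "!"))

segments = ["the meeting", "starts at nine.", "please be", "on time."]
print(stream_translate(segments, ends_clause, str.upper))
# → (['THE MEETING STARTS AT NINE.', 'PLEASE BE ON TIME.'], '')
```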
  • Patent number: 11126800
    Abstract: Presented herein are embodiments of a prefix-to-prefix framework for simultaneous translation that implicitly learns to anticipate in a single translation model. Within these frameworks are effective “wait-k” policy model embodiments that may be trained to generate a target sentence concurrently with a source sentence but lag behind by a predefined number of words. Embodiments of the prefix-to-prefix framework achieve low latency and better quality when compared to full-sentence translation in four directions: Chinese↔English and German↔English. Also presented herein is a novel latency metric that addresses deficiencies of previous latency metrics.
    Type: Grant
    Filed: May 10, 2019
    Date of Patent: September 21, 2021
    Assignee: Baidu USA LLC
    Inventors: Mingbo Ma, Liang Huang, Hao Xiong, Kaibo Liu, Chuanqiang Zhang, Renjie Zheng, Zhongjun He, Hairong Liu, Xing Li, Hua Wu, Haifeng Wang, Baigong Zheng
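The "wait-k" policy described in this abstract follows a simple schedule: the decoder emits target word t only after reading min(k + t - 1, |source|) source words, so it lags the source by k words until the source is exhausted. A sketch of that schedule (the function name is an assumption):

```python
def wait_k_visible(source_len, target_len, k):
    """For each target position t (1-based), the number of source words
    visible to the decoder under a wait-k policy."""
    return [min(k + t - 1, source_len) for t in range(1, target_len + 1)]

# With k=3 the decoder waits for 3 source words, then proceeds one-for-one.
print(wait_k_visible(source_len=6, target_len=6, k=3))  # → [3, 4, 5, 6, 6, 6]
```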
  • Publication number: 20210248309
    Abstract: A method for text error correction includes: obtaining a text to be corrected; obtaining a pinyin sequence of the text to be corrected; and inputting the text to be corrected and the pinyin sequence to a text error correction model, to obtain a corrected text.
    Type: Application
    Filed: April 28, 2021
    Publication date: August 12, 2021
    Inventors: Ruiqing ZHANG, Chuanqiang ZHANG, Zhongjun HE, Zhi LI, Hua WU
  • Publication number: 20210200963
    Abstract: The present disclosure provides a machine translation model training method, apparatus, electronic device and storage medium, which relates to the technical field of natural language processing. A specific implementation solution is as follows: selecting, from parallel corpuses, a set of samples whose translation quality satisfies a preset requirement and which have universal-field features and/or target-field features, to constitute a first training sample set; selecting, from the parallel corpuses, a set of samples whose translation quality satisfies a preset requirement and which do not have universal-field features and target-field features, to constitute a second training sample set; training an encoder in the machine translation model in the target field, a discriminator configured in encoding layers of the encoder, and the encoder and a decoder in the machine translation model in the target field in turn with the first training sample set and second training sample set, respectively.
    Type: Application
    Filed: March 12, 2021
    Publication date: July 1, 2021
    Inventors: Ruiqing Zhang, Chuanqiang Zhang, Jiqiang Liu, Zhongjun He, Zhi Li, Hua Wu
  • Publication number: 20210192150
    Abstract: Embodiments of the present disclosure provide a language conversion method and apparatus based on syntactic linearity and a non-transitory computer-readable storage medium. The method includes: encoding a source sentence to be converted by using a preset encoder to determine a first vector and a second vector corresponding to the source sentence; determining a current mask vector according to a preset rule, in which the mask vector is configured to modify vectors output by the preset encoder; determining a third vector according to target language characters corresponding to source characters located before a first source character; and decoding the first vector, the second vector, the mask vector, and the third vector by using a preset decoder to generate a target character corresponding to the first source character.
    Type: Application
    Filed: July 10, 2020
    Publication date: June 24, 2021
    Inventors: Ruiqing ZHANG, Chuanqiang ZHANG, Hao XIONG, Zhongjun HE, Hua WU, Haifeng WANG
  • Publication number: 20210192147
    Abstract: Embodiments of the present disclosure provide a method and an apparatus for translating a polysemy, and a medium. The method includes: obtaining a source language text; identifying and obtaining the polysemy from the source language text; inquiring related words corresponding to each interpretation of the polysemy; determining a target interpretation corresponding to the related words contained in the source language text; and translating the polysemy into the target interpretation.
    Type: Application
    Filed: May 6, 2020
    Publication date: June 24, 2021
    Inventors: Ruiqing ZHANG, Chuanqiang ZHANG, Hao XIONG, Zhongjun HE, Hua WU, Zhi LI, Haifeng WANG
  • Publication number: 20200192986
    Abstract: A method and apparatus for translating speech are provided. The method may include: recognizing received to-be-recognized speech of a source language to obtain a recognized text; concatenating the obtained recognized text after a to-be-translated text, to form a concatenated to-be-translated text; inputting the concatenated to-be-translated text into a pre-trained discriminant model to obtain a discrimination result for characterizing whether the concatenated to-be-translated text is to be translated, where the discriminant model is used to characterize a corresponding relationship between a text and a discrimination result corresponding to the text; in response to the positive discrimination result being obtained, translating the concatenated to-be-translated text to obtain a translation result of a target language, and outputting the translation result.
    Type: Application
    Filed: November 21, 2019
    Publication date: June 18, 2020
    Inventors: Chuanqiang Zhang, Tianchi Bi, Hao Xiong, Zhi Li, Zhongjun He, Haifeng Wang
  • Publication number: 20200104371
    Abstract: Presented herein are embodiments of a prefix-to-prefix framework for simultaneous translation that implicitly learns to anticipate in a single translation model. Within these frameworks are effective “wait-k” policy model embodiments that may be trained to generate a target sentence concurrently with a source sentence but lag behind by a predefined number of words. Embodiments of the prefix-to-prefix framework achieve low latency and better quality when compared to full-sentence translation in four directions: Chinese↔English and German↔English. Also presented herein is a novel latency metric that addresses deficiencies of previous latency metrics.
    Type: Application
    Filed: May 10, 2019
    Publication date: April 2, 2020
    Applicant: Baidu USA LLC
    Inventors: Mingbo MA, Liang HUANG, Hao XIONG, Kaibo LIU, Chuanqiang ZHANG, Renjie ZHENG, Zhongjun HE, Hairong LIU, Xing LI, Hua Wu, Haifeng WANG, Baigong ZHENG