Patents Assigned to iFLYTEK Co., Ltd.

Speech recognition method, apparatus and device, and storage medium

Patent number: 12626687

Abstract: Provided in the present application are a speech recognition method, apparatus and device, and a storage medium. The method comprises: acquiring a speech feature of target mixed speech and a speaker feature of a specified speaker; taking the direction of tending to a target speech feature as an extraction direction, and according to the speech feature of the target mixed speech and a speaker feature of a target speaker, extracting a speech feature of the target speaker from the speech feature of the target mixed speech, so as to obtain an extracted speech feature of the target speaker; and acquiring a speech recognition result of the specified speaker according to an extracted speech feature of the specified speaker.

Type: Grant

Filed: November 10, 2021

Date of Patent: May 12, 2026

Assignees: UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA, IFLYTEK CO., LTD.

Inventors: Xin Fang, Junhua Liu
SPEECH RECOGNITION METHOD, SPEECH RECOGNITION MODEL TRAINING METHOD, AND ELECTRONIC DEVICE

Publication number: 20250356847

Abstract: A speech recognition method is provided. The method includes: obtaining a to-be-recognized speech and a speech recognition model, including an encoding network and a decoding network, after training; during each stage of encoding the to-be-recognized speech using the encoding network, classifying the to-be-recognized speech under a target speech attribute to obtain a predicted attribute category, and performing encoding to obtain a first encoding feature according to the predicted attribute category under the target speech attribute; decoding the first encoding feature according to the decoding network to obtain a recognition text of the to-be-recognized speech, the speech recognition model being adjusted according to at least a first loss, which represents a difference between a preset attribute category annotated in a speech sample and a sample attribute category recognized and obtained by the speech recognition model under the target speech attribute.

Type: Application

Filed: July 24, 2025

Publication date: November 20, 2025

Applicant: IFLYTEK CO., LTD.

Inventors: Wenhui ZHANG, Genshun WAN, Dingshu TIAN, Jianqing GAO, Jia PAN, Cong LIU, Guoping HU
Voice recognition method and related product

Patent number: 12451128

Abstract: A speech recognition method and related products are provided. The method includes acquiring text contents and text-associated time information transmitted by a plurality of terminals in a preset scenario and determining a shared text for the preset scenario based on the text contents and the text-associated time information, obtaining a customized language model for the preset scenario based on the shared text, and performing speech recognition for the preset scenario with the customized language model. The method provides improved speech recognition for the preset scenario due to the correlation between the customized language model and the preset scenario.

Type: Grant

Filed: December 14, 2020

Date of Patent: October 21, 2025

Assignee: IFLYTEK CO., LTD.

Inventors: Genshun Wan, Jianqing Gao, Zhiguo Wang
SUMMARY DETERMINATION METHOD AND RELATED DEVICE THEREOF

Publication number: 20250061280

Abstract: A minutes determining method is provided. The method includes acquiring a to-be-used user record and a to-be-used record text; performing sentence segmentation processing on the to-be-used record text, to obtain at least one to-be-used sentence; performing semantic matching processing on the to-be-used user record and the at least one to-be-used sentence, to obtain a to-be-used semantic matching result; and determining to-be-used minutes content according to the to-be-used user record and the to-be-used semantic matching result.

Type: Application

Filed: November 21, 2022

Publication date: February 20, 2025

Applicant: IFLYTEK CO., LTD.

Inventors: Li YAN, Ting QI, Jianqing GAO, Jingting SUN
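
The steps in the abstract above (sentence segmentation, semantic matching against the user record, then selecting minutes content) can be sketched as follows. This is only an illustration: the abstract does not say how the matching is done, so token-overlap (Jaccard) similarity is used as a stand-in, and the names `split_sentences`, `jaccard`, `determine_minutes` and the `threshold` parameter are hypothetical.

```python
import re


def split_sentences(text):
    """Split record text into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokens(text):
    """Lower-cased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity; an assumed stand-in for semantic matching."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def determine_minutes(user_record, record_text, threshold=0.2):
    """Keep the sentences of the record text that match the user record."""
    sentences = split_sentences(record_text)
    matched = [s for s in sentences if jaccard(user_record, s) >= threshold]
    return {"user_record": user_record, "matched_sentences": matched}


if __name__ == "__main__":
    note = "budget approved for Q3"
    transcript = ("The team met on Monday. The budget for Q3 was approved. "
                  "Lunch was served.")
    print(determine_minutes(note, transcript))
```

A real system would presumably replace `jaccard` with a learned semantic matcher; the control flow is the part this sketch is meant to show.
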
Decoding network construction method, voice recognition method, device and apparatus, and storage medium

Patent number: 12223947

Abstract: A method for constructing a decoding network, a speech recognition method, a device, an apparatus, and a storage medium are provided. The method for constructing a decoding network includes: acquiring a general language model, a domain language model, and a general decoding network generated based on the general language model; generating a domain decoding network based on the domain language model and the general language model; and integrating the domain decoding network with the general decoding network to obtain a target decoding network. The speech recognition method includes: decoding to-be-recognized speech data by using a target decoding network to obtain a decoding path for the to-be-recognized speech data; and determining a speech recognition result for the to-be-recognized speech data based on the decoding path for the to-be-recognized speech data.

Type: Grant

Filed: December 12, 2019

Date of Patent: February 11, 2025

Assignee: IFLYTEK CO., LTD.

Inventors: Jianqing Gao, Zhiguo Wang, Guoping Hu
SPEECH RECOGNITION METHOD, APPARATUS AND DEVICE, AND STORAGE MEDIUM

Publication number: 20240395242

Abstract: Provided in the present application are a speech recognition method, apparatus and device, and a storage medium. The method comprises: acquiring a speech feature of target mixed speech and a speaker feature of a specified speaker; taking the direction of tending to a target speech feature as an extraction direction, and according to the speech feature of the target mixed speech and a speaker feature of a target speaker, extracting a speech feature of the target speaker from the speech feature of the target mixed speech, so as to obtain an extracted speech feature of the target speaker; and acquiring a speech recognition result of the specified speaker according to an extracted speech feature of the specified speaker.

Type: Application

Filed: November 10, 2021

Publication date: November 28, 2024

Applicants: UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA, IFLYTEK CO., LTD.

Inventors: Xin Fang, Junhua Liu
Chapter-level text translation method and device

Patent number: 11694041

Abstract: A discourse-level text translation method and device, the method comprising: acquiring a text to be translated, the text to be translated being a unit text in a discourse-level text to be translated (S101); acquiring an associated text of the text to be translated, the associated text including at least one of a preceding source text, a following source text, and a preceding target text (S102); and translating, according to the associated text, the text to be translated (S103).

Type: Grant

Filed: April 10, 2019

Date of Patent: July 4, 2023

Assignee: IFLYTEK CO., LTD.

Inventors: Zhiqiang Ma, Junhua Liu, Si Wei, Guoping Hu
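
Step S102 of the abstract above (collecting the associated text for one unit of a discourse-level text) can be illustrated with a small sketch. The function name `associated_text`, the `window` parameter and the list-of-strings representation are assumptions made for this example, not details from the patent.

```python
def associated_text(source_units, target_units, index, window=1):
    """Gather context for the unit at `index`.

    source_units: all source-language units of the discourse, in order.
    target_units: translations produced so far (units before `index`).
    window: how many neighbouring units to include on each side (assumed).
    """
    start = max(0, index - window)
    return {
        "preceding_source": source_units[start:index],
        "following_source": source_units[index + 1:index + 1 + window],
        "preceding_target": target_units[start:index],
    }


if __name__ == "__main__":
    source = ["s0", "s1", "s2", "s3"]
    translated = ["t0", "t1"]  # units 0 and 1 are already translated
    print(associated_text(source, translated, index=2))
```

The returned dictionary would then be passed, together with the unit itself, to whatever translation model performs step S103.
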
SPEECH RECOGNITION METHOD, APPARATUS AND DEVICE, AND STORAGE MEDIUM

Publication number: 20230186912

Abstract: A speech recognition method and related products are provided. The method includes acquiring a to-be-recognized speech and a configured hot word library; determining, based on the to-be-recognized speech and the hot word library, an audio-related feature used at a current decoding time instant; determining, based on the audio-related feature, a hot word-related feature used at the current decoding time instant from the hot word library; and determining, based on the audio-related feature and the hot word-related feature, a recognition result of the to-be-recognized speech at the current decoding time instant.

Type: Application

Filed: December 2, 2020

Publication date: June 15, 2023

Applicant: IFLYTEK CO., LTD.

Inventors: Shifu XIONG, Cong LIU, Si WEI, Qingfeng LIU, Jianqing GAO, Jia PAN
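
One way to picture the hot word step in the abstract above is as an attention lookup: the audio-related feature at the current decoding instant acts as a query over the hot word library, and the result is a weighted mix of hot word embeddings. The abstract does not state that attention is used, so this is an assumed mechanism, and `softmax`, `hot_word_feature` and the toy embeddings are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hot_word_feature(audio_feature, hot_word_embeddings):
    """Dot-product attention over the hot word library (assumed mechanism).

    Returns the attention weights and the weighted sum of embeddings.
    """
    scores = [sum(a * h for a, h in zip(audio_feature, emb))
              for emb in hot_word_embeddings]
    weights = softmax(scores)
    dim = len(audio_feature)
    context = [sum(w * emb[i] for w, emb in zip(weights, hot_word_embeddings))
               for i in range(dim)]
    return weights, context


if __name__ == "__main__":
    audio = [1.0, 0.0]                   # toy audio-related feature
    library = [[5.0, 0.0], [0.0, 5.0]]   # toy hot word embeddings
    weights, context = hot_word_feature(audio, library)
    print(weights, context)
```

In a full recognizer the `context` vector would be combined with the audio-related feature before predicting the output at that decoding instant.
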
End-to-end modelling method and system

Patent number: 11651578

Abstract: A method and a system for end-to-end modeling are provided. The method includes: determining a topological structure of a target-based end-to-end model, where the topological structure includes an input layer, an encoding layer, a code enhancement layer, a filtering layer, a decoding layer and an output layer, the code enhancement layer adds information of a target unit to a feature sequence outputted by the encoding layer, and the filtering layer filters the feature sequence to which the information of the target unit has been added; collecting multiple pieces of training data; and training parameters of the target-based end-to-end model by using the multiple pieces of the training data.

Type: Grant

Filed: January 11, 2017

Date of Patent: May 16, 2023

Assignee: IFLYTEK CO., LTD.

Inventors: Jia Pan, Shiliang Zhang, Shifu Xiong, Si Wei, Guoping Hu
VOICE RECOGNITION METHOD AND RELATED PRODUCT

Publication number: 20230035947

Abstract: A speech recognition method and related products are provided. The method includes acquiring text contents and text-associated time information transmitted by a plurality of terminals in a preset scenario and determining a shared text for the preset scenario based on the text contents and the text-associated time information, obtaining a customized language model for the preset scenario based on the shared text, and performing speech recognition for the preset scenario with the customized language model. The method provides improved speech recognition for the preset scenario due to the correlation between the customized language model and the preset scenario.

Type: Application

Filed: December 14, 2020

Publication date: February 2, 2023

Applicant: IFLYTEK CO., LTD.

Inventors: Genshun WAN, Jianqing GAO, Zhiguo WANG
SPEECH RECOGNITION ERROR CORRECTION METHOD, RELATED DEVICES, AND READABLE STORAGE MEDIUM

Publication number: 20220383853

Abstract: A speech recognition error correction method and device, and a readable storage medium are provided. The method includes: acquiring to-be-recognized speech data and a first recognition result of the speech data, re-recognizing the speech data with reference to context information in the first recognition result to obtain a second recognition result, and determining a final recognition result based on the second recognition result. In the method, the speech data is re-recognized with reference to context information in the first recognition result, which fully considers context information in the recognition result and the application scenario of the speech data. If any error occurs in the first recognition result, the first recognition result is corrected based on the second recognition result. Therefore, the accuracy of speech recognition can be improved.

Type: Application

Filed: November 17, 2020

Publication date: December 1, 2022

Applicant: IFLYTEK CO., LTD.

Inventors: Li XU, Jia PAN, Zhiguo WANG, Guoping HU
DECODING NETWORK CONSTRUCTION METHOD, VOICE RECOGNITION METHOD, DEVICE AND APPARATUS, AND STORAGE MEDIUM

Publication number: 20220375459

Abstract: A method for constructing a decoding network, a speech recognition method, a device, an apparatus, and a storage medium are provided. The method for constructing a decoding network includes: acquiring a general language model, a domain language model, and a general decoding network generated based on the general language model; generating a domain decoding network based on the domain language model and the general language model; and integrating the domain decoding network with the general decoding network to obtain a target decoding network. The speech recognition method includes: decoding to-be-recognized speech data by using a target decoding network to obtain a decoding path for the to-be-recognized speech data; and determining a speech recognition result for the to-be-recognized speech data based on the decoding path for the to-be-recognized speech data.

Type: Application

Filed: December 12, 2019

Publication date: November 24, 2022

Applicant: IFLYTEK CO., LTD.

Inventors: Jianqing GAO, Zhiguo WANG, Guoping HU
Whispering voice recovery method, apparatus and device, and readable storage medium

Patent number: 11508366

Abstract: A method, an apparatus and a device for converting whispered speech, and a readable storage medium are provided. The method is implemented based on a whispered speech converting model. The whispered speech converting model is trained in advance by using recognition results and whispered speech training acoustic features of whispered speech training data as samples and using normal speech acoustic features of normal speech data parallel to the whispered speech training data as sample labels. A whispered speech acoustic feature and a preliminary recognition result of whispered speech data are acquired, and then the whispered speech acoustic feature and the preliminary recognition result are inputted into the preset whispered speech converting model to acquire a normal speech acoustic feature outputted by the model. In this way, the whispered speech can be converted to normal speech.

Type: Grant

Filed: June 15, 2018

Date of Patent: November 22, 2022

Assignee: IFLYTEK CO., LTD.

Inventors: Jia Pan, Cong Liu, Haikun Wang, Zhiguo Wang, Guoping Hu
Target voice detection method and apparatus

Patent number: 11308974

Abstract: A target voice detection method and a target voice detection apparatus are provided. The method includes: receiving sound signals collected by a microphone array; performing a beamforming process on the sound signals to obtain beams in different directions; extracting a detection feature of each frame based on the sound signals and the beams in different directions; inputting an extracted detection feature of a current frame into a pre-constructed target voice detection model to obtain a model output result; and obtaining a target voice detection result of the current frame based on the model output result.

Type: Grant

Filed: July 16, 2018

Date of Patent: April 19, 2022

Assignee: IFLYTEK CO., LTD.

Inventors: Feng Ma, Haikun Wang, Zhiguo Wang, Guoping Hu
Microphone array-based target voice acquisition method and device

Patent number: 11081123

Abstract: A microphone array-based target voice acquisition method and device, said method comprising: receiving voice signals acquired on the basis of a microphone array (101); determining a pre-selected target voice signal and a direction thereof (102); performing strong directional gain and weak directional gain on the pre-selected target voice signal, so as to obtain a strong gain signal and a weak gain signal (103); performing an endpoint detection on the basis of the strong gain signal, so as to obtain an endpoint detection result (104); and performing endpoint processing on the weak gain signal according to the endpoint detection result, so as to obtain a final target voice signal (105). The present invention can obtain an accurate and reliable target voice signal, thereby avoiding an adverse effect of the target voice quality on subsequent target voice processing.

Type: Grant

Filed: July 16, 2018

Date of Patent: August 3, 2021

Assignee: IFLYTEK CO., LTD.

Inventors: Dongyang Xu, Haikun Wang, Zhiguo Wang, Guoping Hu
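
Steps 104 and 105 of the abstract above (detect endpoints on the strong gain signal, then apply them to the weak gain signal) can be sketched as below. The amplitude-threshold detector and the names `detect_endpoints` and `acquire_target` are assumptions for illustration; the patent abstract does not specify the detection method.

```python
def detect_endpoints(strong_gain_signal, threshold):
    """Return (start, end) sample indices of the active region, or None.

    Uses a plain amplitude threshold on the strong gain signal, which is
    an assumed stand-in for the endpoint detector.
    """
    active = [i for i, x in enumerate(strong_gain_signal) if abs(x) > threshold]
    if not active:
        return None
    return active[0], active[-1] + 1


def acquire_target(weak_gain_signal, endpoints):
    """Cut the weak gain signal at the endpoints found on the strong one."""
    if endpoints is None:
        return []
    start, end = endpoints
    return weak_gain_signal[start:end]


if __name__ == "__main__":
    strong = [0.0, 0.01, 0.9, 0.8, 0.02, 0.0]
    weak = [0.1, 0.1, 0.5, 0.4, 0.1, 0.1]
    endpoints = detect_endpoints(strong, threshold=0.1)
    print(endpoints, acquire_target(weak, endpoints))
```

The split mirrors the abstract: the strongly directional signal suppresses interference and so gives cleaner endpoints, while the weakly directional signal is the one kept as output.
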
Voice denoising method and apparatus, server and storage medium

Patent number: 11064296

Abstract: Provided are a voice denoising method and apparatus, a server and a storage medium. The voice denoising method comprises: acquiring voice signals synchronously collected by an acoustic microphone and a non-acoustic microphone (S100); carrying out voice activity detection according to the voice signal collected by the non-acoustic microphone to obtain a voice activity detection result (S110); and according to the voice activity detection result, denoising the voice signal collected by the acoustic microphone to obtain a denoised voice signal (S120). The effect of denoising can be enhanced, and the quality of voice signals can be improved.

Type: Grant

Filed: June 15, 2018

Date of Patent: July 13, 2021

Assignee: IFLYTEK CO., LTD.

Inventors: Haikun Wang, Feng Ma, Zhiguo Wang
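
The flow in the abstract above (S100 to S120) can be illustrated with a toy example: voice activity is detected on the non-acoustic microphone signal, and the result drives denoising of the acoustic microphone signal. The frame-energy detector, the attenuation of non-speech frames, and the names `frame_energy`, `detect_voice` and `denoise` are all assumptions for this sketch, not the patented method.

```python
def frame_energy(signal, frame_len):
    """Mean squared amplitude of each full, non-overlapping frame."""
    return [sum(x * x for x in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def detect_voice(non_acoustic, frame_len, threshold):
    """Per-frame voice activity flags from the non-acoustic microphone."""
    return [e > threshold for e in frame_energy(non_acoustic, frame_len)]


def denoise(acoustic, vad, frame_len, attenuation=0.1):
    """Attenuate acoustic-microphone frames flagged as non-speech.

    Scaling non-speech frames is an assumed, deliberately simple form of
    denoising; frames flagged as speech are left unchanged.
    """
    out = list(acoustic)
    for k, active in enumerate(vad):
        if not active:
            stop = min((k + 1) * frame_len, len(out))
            for i in range(k * frame_len, stop):
                out[i] *= attenuation
    return out


if __name__ == "__main__":
    non_acoustic = [0.0] * 4 + [1.0] * 4   # silence, then voicing
    acoustic = [0.5] * 8                   # synchronously recorded signal
    vad = detect_voice(non_acoustic, frame_len=4, threshold=0.1)
    print(vad, denoise(acoustic, vad, frame_len=4))
```
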
CHAPTER-LEVEL TEXT TRANSLATION METHOD AND DEVICE

Publication number: 20210150154

Abstract: A discourse-level text translation method and device, the method comprising: acquiring a text to be translated, the text to be translated being a unit text in a discourse-level text to be translated (S101); acquiring an associated text of the text to be translated, the associated text including at least one of a preceding source text, a following source text, and a preceding target text (S102); and translating, according to the associated text, the text to be translated (S103).

Type: Application

Filed: April 10, 2019

Publication date: May 20, 2021

Applicant: IFLYTEK CO., LTD.

Inventors: Zhiqiang MA, Junhua LIU, Si WEI, Guoping HU
Method, device, and storage medium for evaluating speech quality

Patent number: 10964337

Abstract: A method, a device and a storage medium for evaluating speech quality are provided. The method includes: receiving speech data to be evaluated; extracting evaluation features of the speech data to be evaluated; and performing quality evaluation on the speech data to be evaluated according to the evaluation features of the speech data to be evaluated and a predetermined speech quality evaluation model, in which the speech quality evaluation model is an indication of a relationship between evaluation features of single-ended speech data and quality information of the single-ended speech data.

Type: Grant

Filed: February 20, 2019

Date of Patent: March 30, 2021

Assignee: Iflytek Co., Ltd.

Inventors: Bing Yin, Si Wei, Guoping Hu, Su Cheng
Voice recorder

Patent number: D1107675

Type: Grant

Filed: October 25, 2024

Date of Patent: December 30, 2025

Assignee: IFLYTEK CO., LTD.

Inventors: Binbin Fei, Chen Zhu
Language translation device

Patent number: D1117130

Type: Grant

Filed: January 6, 2025

Date of Patent: March 10, 2026

Assignee: IFLYTEK CO., LTD.

Inventors: Yali Fan, Yuling Song, Binbin Fei, Chen Zhu, Chi Zhang
