In The Frequency Domain (epo) Patents (Class 704/E15.01)

Speech recognition method and apparatus, and computer-readable storage medium

Patent number: 12217739

Abstract: A speech recognition method, including acquiring first linear frequency spectrums corresponding to audios to be trained with different sampling rates; determining the maximum sampling rate and other sampling rates; determining the maximum frequency domain sequence numbers of the first linear frequency spectrums corresponding to the other sampling rates and to the maximum sampling rate as a first frequency domain sequence number and a second frequency domain sequence number, respectively; in the first linear frequency spectrums corresponding to the other sampling rates, configuring amplitude values corresponding to each frequency domain sequence number that is greater than the first frequency domain sequence number and less than or equal to the second frequency domain sequence number to be zero to obtain second linear frequency spectrums; determining first speech features and second speech features; and using the first speech features and the second speech features to train a machine learning model.
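
The zero-filling step described in this abstract can be sketched in a few lines of Python. This is an illustration only, not the patented implementation. It assumes each spectrum is a list of frames, each frame is a list of amplitude values indexed by frequency domain sequence number, and bins with the same index cover the same frequency at every sampling rate. The names `pad_frames` and `align_spectra` are hypothetical.

```python
def pad_frames(frames, second_index):
    """Zero-fill every frame so its highest bin index equals second_index.

    The frame's own highest index plays the role of the "first frequency
    domain sequence number"; bins above it, up to second_index, become 0.0.
    """
    padded = []
    for frame in frames:
        missing = second_index + 1 - len(frame)
        if missing < 0:
            raise ValueError("frame has more bins than the target spectrum")
        padded.append(list(frame) + [0.0] * missing)
    return padded


def align_spectra(spectra_by_rate):
    """Pad the spectra of all lower sampling rates to the bin count of the
    spectrum recorded at the maximum sampling rate.

    spectra_by_rate maps a sampling rate in Hz to a list of frames.
    """
    max_rate = max(spectra_by_rate)
    # Highest bin index at the maximum sampling rate
    # (the "second frequency domain sequence number").
    second_index = len(spectra_by_rate[max_rate][0]) - 1
    return {
        rate: pad_frames(frames, second_index)
        for rate, frames in spectra_by_rate.items()
    }
```

For example, given one 8 kHz frame with 5 bins and one 16 kHz frame with 9 bins, `align_spectra` leaves the 16 kHz frame unchanged and appends four zeros to the 8 kHz frame, so both have the same shape before feature extraction.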

Type: Grant

Filed: April 30, 2020

Date of Patent: February 4, 2025

Assignee: JINGDONG TECHNOLOGY HOLDING CO., LTD.

Inventor: Li Fu

Speaker-specific voice amplification

Patent number: 12148443

Abstract: A method, system and computer program product for amplifying a single voice during an audio conversation. One embodiment of the method may comprise receiving, by a computing device, an audio sample of speech from a user, and generating, by the computing device, a user-specific acoustic model for enhancement of speech by the user based upon the audio sample. The method may further comprise receiving a live audiovisual stream, the live audiovisual stream including live speech by the user during an audio conversation, wherein the live audiovisual stream includes background noise, and using, by the computing device, the user-specific acoustic model to selectively amplify the live speech during the live audiovisual stream without amplifying the background noise.

Type: Grant

Filed: December 18, 2020

Date of Patent: November 19, 2024

Assignee: International Business Machines Corporation

Inventors: Rachel Ostrand, Sundar Saranathan, Fang Lu, Carla Paola Agurto Rios

Speech personalization and federated training using real world noise

Patent number: 11741944

Abstract: A method of training a speech model includes receiving, at a voice-enabled device, a fixed set of training utterances where each training utterance in the fixed set of training utterances includes a transcription paired with a speech representation of the corresponding training utterance. The method also includes sampling noisy audio data from an environment of the voice-enabled device. For each training utterance in the fixed set of training utterances, the method further includes augmenting, using the noisy audio data sampled from the environment of the voice-enabled device, the speech representation of the corresponding training utterance to generate noisy audio samples and pairing each of the noisy audio samples with the corresponding transcription of the corresponding training utterance. The method additionally includes training a speech model on the noisy audio samples generated for each speech representation in the fixed set of training utterances.
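
The augmentation and pairing steps in this abstract can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details from the patent: the speech representation is taken to be a list of waveform samples, noise is mixed in additively at a chosen signal-to-noise ratio, and the names `noise_segment`, `mix_at_snr`, and `augment` are hypothetical. Model training itself is left out.

```python
import math
import random


def noise_segment(noise, length, rng):
    """Pick a random stretch of the sampled noise, wrapping around if needed."""
    start = rng.randrange(len(noise))
    return [noise[(start + i) % len(noise)] for i in range(length)]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise power ratio is snr_db."""
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        return list(speech)  # silent noise segment: nothing to mix in
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


def augment(utterances, noise, snrs_db, seed=0):
    """Return (noisy_audio, transcription) pairs for a fixed utterance set.

    utterances is a list of (transcription, samples) tuples; noise is audio
    sampled from the device's environment; one noisy copy is generated per
    utterance and per SNR value (an assumed augmentation policy).
    """
    rng = random.Random(seed)
    pairs = []
    for transcription, speech in utterances:
        for snr_db in snrs_db:
            segment = noise_segment(noise, len(speech), rng)
            pairs.append((mix_at_snr(speech, segment, snr_db), transcription))
    return pairs
```

Each noisy sample keeps the transcription of the utterance it was derived from, so the resulting pairs could be passed to whatever training routine the device uses.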

Type: Grant

Filed: November 24, 2020

Date of Patent: August 29, 2023

Assignee: Google LLC

Inventors: Matthew Sharifi, Victor Carbune

Loudspeaker system provided with dynamic speech equalization

Patent number: 11600285

Abstract: A method for speech equalization, comprising the steps of receiving an input audio signal, processing said input audio signal in dependence on frequency, and providing an equalized electric audio signal according to an equalization function, wherein said equalization function comprises at least an actuator part configured to dynamically apply a compensation filter to the received input signal and dynamically apply a transparent filter to the received input signal, and further transmitting an output signal perceivable by a user as sound representative of said electric acoustic input signal or a processed version thereof.
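
One way to picture switching dynamically between a compensation filter and a transparent filter is a per-band gain blend, sketched below in Python. This is a guess at the general idea, not the method claimed in the patent: the abstract does not say how the actuator chooses between the two filters, so the control `weight`, the linear crossfade, and the names `blend_gains` and `equalize` are all assumptions made for illustration.

```python
def blend_gains(compensation_gains, weight):
    """Interpolate per-band gains between transparent and compensating.

    weight = 0.0 gives the transparent filter (gain 1.0 in every band);
    weight = 1.0 gives the full compensation filter.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [1.0 + weight * (gain - 1.0) for gain in compensation_gains]


def equalize(band_levels, compensation_gains, weight):
    """Apply the blended gains to one frame of per-band signal levels."""
    gains = blend_gains(compensation_gains, weight)
    return [level * gain for level, gain in zip(band_levels, gains)]
```

In this sketch a caller would update `weight` frame by frame, for instance raising it while speech is detected (a hypothetical control signal) and lowering it otherwise so the signal passes through unchanged.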

Type: Grant

Filed: July 20, 2021

Date of Patent: March 7, 2023

Assignee: EPOS Group A/S

Inventors: Anders Røser Hansen, Poul Peder Hestbek, Svend Feldt, Casper Fynsk