Patents by Inventor Vahid MONTAZERI

Vahid MONTAZERI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Noise suppression using tandem networks

Patent number: 11805360

Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including a first audio frame corresponding to a first output of a first microphone and a second audio frame corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a first noise-suppression network and a second noise-suppression network. The first noise-suppression network is configured to generate a first noise-suppressed audio frame and the second noise-suppression network is configured to generate a second noise-suppressed audio frame. The one or more processors are further configured to execute the instructions to provide the noise-suppressed audio frames to an attention-pooling network. The attention-pooling network is configured to generate an output noise-suppressed audio frame.

Type: Grant

Filed: July 21, 2021

Date of Patent: October 31, 2023

Assignee: QUALCOMM Incorporated

Inventors: Vahid Montazeri, Van Nguyen, Hannes Pessentheiner, Lae-Hoon Kim, Erik Visser, Rogerio Guedes Alves
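
The data flow described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the two "networks" below are hypothetical stand-ins (simple signal operations rather than trained models), and the function names, the `query` vector, and the dot-product scoring are assumptions introduced only for illustration. The sketch shows both microphone frames going to two suppressors, and the two candidate frames being merged by softmax attention weights.

```python
import math
from typing import List

Frame = List[float]


def first_suppressor(mic1: Frame, mic2: Frame) -> Frame:
    """Hypothetical stand-in for the first network: average the two microphones."""
    return [0.5 * (a + b) for a, b in zip(mic1, mic2)]


def second_suppressor(mic1: Frame, mic2: Frame, gate: float = 0.1) -> Frame:
    """Hypothetical stand-in for the second network: gate low-level samples of mic 1.

    mic2 is accepted only to mirror the abstract (both frames go to both networks).
    """
    return [a if abs(a) > gate else 0.0 for a in mic1]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(candidates: List[Frame], query: Frame) -> Frame:
    """Score each candidate frame against an assumed query vector, then
    return the softmax-weighted sum of the candidates."""
    scores = [sum(q * x for q, x in zip(query, frame)) for frame in candidates]
    weights = softmax(scores)
    length = len(candidates[0])
    return [sum(w * frame[i] for w, frame in zip(weights, candidates))
            for i in range(length)]


def tandem_suppress(mic1: Frame, mic2: Frame, query: Frame) -> Frame:
    """Run both suppressors on the same input and pool their outputs."""
    candidates = [first_suppressor(mic1, mic2), second_suppressor(mic1, mic2)]
    return attention_pool(candidates, query)
```

With an all-zero `query`, both candidates receive equal weight and the output is their plain average; a learned query (or a small scoring network) would instead favor whichever candidate frame scores higher.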

CONTEXT-BASED SPEECH ENHANCEMENT

Publication number: 20230326477

Abstract: A device to perform speech enhancement includes one or more processors configured to process image data to detect at least one of an emotion, a speaker characteristic, or a noise type. The one or more processors are also configured to generate context data based at least in part on the at least one of the emotion, the speaker characteristic, or the noise type. The one or more processors are further configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and the context data to generate output spectral data that represents a speech enhanced version of the input signal.

Type: Application

Filed: June 14, 2023

Publication date: October 12, 2023

Inventors: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI

Context-based speech enhancement

Patent number: 11715480

Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.

Type: Grant

Filed: March 23, 2021

Date of Patent: August 1, 2023

Assignee: QUALCOMM Incorporated

Inventors: Kyungguen Byun, Shuhua Zhang, Lae-Hoon Kim, Erik Visser, Sunkuk Moon, Vahid Montazeri

Mixed adaptive and fixed coefficient neural networks for speech enhancement

Patent number: 11705147

Abstract: Systems, methods and computer-readable media are provided for speech enhancement using a hybrid neural network. An example process can include receiving, by a first neural network portion of the hybrid neural network, audio data and reference data, the audio data including speech data, noise data, and echo data; filtering, by the first neural network portion, a portion of the audio data based on adapted coefficients of the first neural network portion, the portion of the audio data including the noise data and/or echo data; based on the filtering, generating, by the first neural network portion, filtered audio data including the speech data and an unfiltered portion of the noise data and/or echo data; and based on the filtered audio data and the reference data, extracting, by a second neural network portion of the hybrid neural network, the speech data from the filtered audio data.

Type: Grant

Filed: April 28, 2021

Date of Patent: July 18, 2023

Assignee: QUALCOMM Incorporated

Inventors: Erik Visser, Vahid Montazeri, Shuhua Zhang, Lae-Hoon Kim
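
The two-stage structure in this abstract (an adaptive first stage that uses reference data, followed by a second stage that works on the filtered result) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the adaptive stage is replaced by a classical normalized least-mean-squares (NLMS) echo canceller, the second stage by a fixed three-tap smoothing filter, and all names and parameter values (`taps`, `mu`, `eps`, the fixed coefficients) are hypothetical.

```python
from typing import List, Sequence


def adaptive_stage(mic: Sequence[float], ref: Sequence[float],
                   taps: int = 4, mu: float = 0.5, eps: float = 1e-8) -> List[float]:
    """NLMS stand-in for the adaptive portion: estimate the echo of `ref`
    contained in `mic` and subtract it, adapting the coefficients per sample."""
    weights = [0.0] * taps
    residual = []
    for n in range(len(mic)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        echo_estimate = sum(w * xi for w, xi in zip(weights, x))
        error = mic[n] - echo_estimate
        norm = sum(xi * xi for xi in x) + eps
        # Normalized gradient step on the filter coefficients.
        weights = [w + mu * error * xi / norm for w, xi in zip(weights, x)]
        residual.append(error)
    return residual


def fixed_stage(signal: Sequence[float],
                coeffs: Sequence[float] = (0.25, 0.5, 0.25)) -> List[float]:
    """Fixed-coefficient stand-in for the second portion: a causal FIR filter
    whose coefficients never change at run time."""
    out = []
    for n in range(len(signal)):
        out.append(sum(c * signal[n - k]
                       for k, c in enumerate(coeffs) if n - k >= 0))
    return out


def hybrid_enhance(mic: Sequence[float], ref: Sequence[float]) -> List[float]:
    """Chain the adaptive stage and the fixed stage."""
    return fixed_stage(adaptive_stage(mic, ref))
```

The split mirrors the abstract's idea that the first portion removes part of the echo or noise while a second portion finishes the extraction; in the patent both portions are neural networks, which this sketch does not attempt to reproduce.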

Synthesized speech generation

Patent number: 11676571

Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.

Type: Grant

Filed: January 21, 2021

Date of Patent: June 13, 2023

Assignee: QUALCOMM Incorporated

Inventors: Kyungguen Byun, Sunkuk Moon, Shuhua Zhang, Vahid Montazeri, Lae-Hoon Kim, Erik Visser

NOISE SUPPRESSION USING TANDEM NETWORKS

Publication number: 20230026735

Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including a first audio frame corresponding to a first output of a first microphone and a second audio frame corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a first noise-suppression network and a second noise-suppression network. The first noise-suppression network is configured to generate a first noise-suppressed audio frame and the second noise-suppression network is configured to generate a second noise-suppressed audio frame. The one or more processors are further configured to execute the instructions to provide the noise-suppressed audio frames to an attention-pooling network. The attention-pooling network is configured to generate an output noise-suppressed audio frame.

Type: Application

Filed: July 21, 2021

Publication date: January 26, 2023

Inventors: Vahid MONTAZERI, Van NGUYEN, Hannes PESSENTHEINER, Lae-Hoon KIM, Erik VISSER, Rogerio Guedes ALVES

CONTEXT-BASED SPEECH ENHANCEMENT

Publication number: 20220310108

Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.

Type: Application

Filed: March 23, 2021

Publication date: September 29, 2022

Inventors: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI

SYNTHESIZED SPEECH GENERATION

Publication number: 20220230623

Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.

Type: Application

Filed: January 21, 2021

Publication date: July 21, 2022

Applicant: QUALCOMM Incorporated

Inventors: Kyungguen BYUN, Sunkuk MOON, Shuhua ZHANG, Vahid MONTAZERI, Lae-Hoon KIM, Erik VISSER

MIXED ADAPTIVE AND FIXED COEFFICIENT NEURAL NETWORKS FOR SPEECH ENHANCEMENT

Publication number: 20210343306

Abstract: Systems, methods and computer-readable media are provided for speech enhancement using a hybrid neural network. An example process can include receiving, by a first neural network portion of the hybrid neural network, audio data and reference data, the audio data including speech data, noise data, and echo data; filtering, by the first neural network portion, a portion of the audio data based on adapted coefficients of the first neural network portion, the portion of the audio data including the noise data and/or echo data; based on the filtering, generating, by the first neural network portion, filtered audio data including the speech data and an unfiltered portion of the noise data and/or echo data; and based on the filtered audio data and the reference data, extracting, by a second neural network portion of the hybrid neural network, the speech data from the filtered audio data.

Type: Application

Filed: April 28, 2021

Publication date: November 4, 2021

Inventors: Erik VISSER, Vahid MONTAZERI, Shuhua ZHANG, Lae-Hoon KIM