Patents by Inventor Erik Visser

Erik Visser has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230026735
    Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including a first audio frame corresponding to a first output of a first microphone and a second audio frame corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a first noise-suppression network and a second noise-suppression network. The first noise-suppression network is configured to generate a first noise-suppressed audio frame and the second noise-suppression network is configured to generate a second noise-suppressed audio frame. The one or more processors are further configured to execute the instructions to provide the noise-suppressed audio frames to an attention-pooling network. The attention-pooling network is configured to generate an output noise-suppressed audio frame.
    Type: Application
    Filed: July 21, 2021
    Publication date: January 26, 2023
    Inventors: Vahid MONTAZERI, Van NGUYEN, Hannes PESSENTHEINER, Lae-Hoon KIM, Erik VISSER, Rogerio Guedes ALVES
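    The attention-pooling step described in the entry above (publication 20230026735) can be pictured with a minimal sketch: each per-microphone noise-suppressed frame gets a scalar score, the scores are softmax-normalized, and the frames are combined as a weighted sum. The frame shapes and the fixed scoring vector below are illustrative assumptions standing in for the learned attention-pooling network.

      import numpy as np

      def attention_pool(frames: np.ndarray, scoring: np.ndarray) -> np.ndarray:
          """Combine per-microphone noise-suppressed frames into one output frame.

          frames:  (num_mics, num_bins) noise-suppressed spectral frames
          scoring: (num_bins,) fixed scoring vector, a stand-in for the learned
                   attention-pooling network
          """
          scores = frames @ scoring                 # one scalar score per frame
          weights = np.exp(scores - scores.max())
          weights /= weights.sum()                  # softmax over the microphone paths
          return weights @ frames                   # attention-weighted sum

      rng = np.random.default_rng(0)
      frames = rng.standard_normal((2, 257))        # two noise-suppressed frames
      scoring = rng.standard_normal(257)
      print(attention_pool(frames, scoring).shape)  # (257,)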
  • Patent number: 11533561
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Grant
    Filed: February 9, 2022
    Date of Patent: December 20, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
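    One way to picture the processing described for patent 11533561: keep only the low-frequency band of the bone-conduction signal (where the air-conduction microphones are least reliable), apply a gain, and mix it back with the microphone signal. The cutoff frequency, filter order, gain, and additive mix below are illustrative assumptions, not the patented filter or equalizer design.

      import numpy as np
      from scipy.signal import butter, lfilter

      def mix_bone_conduction(mic: np.ndarray, bone: np.ndarray, fs: int = 16000,
                              cutoff_hz: float = 800.0, gain: float = 2.0) -> np.ndarray:
          """Low-pass the bone-conduction signal, apply a gain, and mix it with
          the microphone signal."""
          b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
          bone_lp = lfilter(b, a, bone)
          return mic + gain * bone_lp

      fs = 16000
      t = np.arange(fs) / fs
      mic = np.sin(2 * np.pi * 300 * t) + 0.1 * np.random.default_rng(1).standard_normal(fs)
      bone = np.sin(2 * np.pi * 300 * t)            # cleaner low-frequency stand-in
      print(mix_bone_conduction(mic, bone, fs).shape)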
  • Publication number: 20220360891
    Abstract: A device includes one or more processors configured to execute instructions to determine a first phase based on a first audio signal of first audio signals and to determine a second phase based on a second audio signal of second audio signals. The one or more processors are also configured to execute the instructions to apply spatial filtering to selected audio signals of the first audio signals and the second audio signals to generate an enhanced audio signal. The one or more processors are further configured to execute the instructions to generate a first output signal by combining a magnitude of the enhanced audio signal with the first phase and to generate a second output signal by combining the magnitude of the enhanced audio signal with the second phase. The first output signal and the second output signal correspond to an audio zoomed signal.
    Type: Application
    Filed: May 10, 2021
    Publication date: November 10, 2022
    Inventors: Lae-Hoon KIM, Fatemeh SAKI, Yoon Mo YANG, Erik VISSER
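    A compact sketch of the recombination described in publication 20220360891: the spatially filtered (enhanced) signal supplies the magnitude spectrum, while each output channel keeps the phase of its own input channel, which preserves the spatial character of the audio-zoomed output. The single-frame FFT and signal lengths are illustrative assumptions.

      import numpy as np

      def audio_zoom_outputs(enhanced, left, right):
          """Give each output channel the magnitude of the enhanced signal and
          the phase of its own input channel (single FFT frame for brevity)."""
          mag = np.abs(np.fft.rfft(enhanced))
          outputs = []
          for channel in (left, right):
              phase = np.angle(np.fft.rfft(channel))
              outputs.append(np.fft.irfft(mag * np.exp(1j * phase), n=len(channel)))
          return outputs                            # [left_zoomed, right_zoomed]

      rng = np.random.default_rng(2)
      enhanced, left, right = rng.standard_normal((3, 1024))
      left_out, right_out = audio_zoom_outputs(enhanced, left, right)
      print(left_out.shape, right_out.shape)        # (1024,) (1024,)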
  • Publication number: 20220310108
    Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.
    Type: Application
    Filed: March 23, 2021
    Publication date: September 29, 2022
    Inventors: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI
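    The multi-encoder arrangement in publication 20220310108 can be sketched as two encoders, one for the input spectral data and one for the context data, whose outputs are fused and projected back to spectral data. The layer sizes, fusion by concatenation, and use of stock PyTorch transformer layers are illustrative assumptions, not the published architecture.

      import torch
      import torch.nn as nn

      class MultiEncoderEnhancer(nn.Module):
          """One encoder for input spectral frames, one for context data; the two
          outputs are concatenated and projected to enhanced spectral frames."""
          def __init__(self, n_bins=257, ctx_dim=32, d_model=128):
              super().__init__()
              self.spec_in = nn.Linear(n_bins, d_model)
              self.ctx_in = nn.Linear(ctx_dim, d_model)
              def make_layer():
                  return nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
              self.spec_enc = nn.TransformerEncoder(make_layer(), num_layers=2)
              self.ctx_enc = nn.TransformerEncoder(make_layer(), num_layers=2)
              self.project = nn.Linear(2 * d_model, n_bins)

          def forward(self, spec, ctx):
              s = self.spec_enc(self.spec_in(spec))     # (batch, frames, d_model)
              c = self.ctx_enc(self.ctx_in(ctx))        # (batch, frames, d_model)
              return self.project(torch.cat([s, c], dim=-1))

      model = MultiEncoderEnhancer()
      spec = torch.randn(1, 100, 257)                   # input spectral data
      ctx = torch.randn(1, 100, 32)                     # context data
      print(model(spec, ctx).shape)                     # torch.Size([1, 100, 257])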
  • Publication number: 20220277744
    Abstract: A vehicle includes an interface device, an in-vehicle control unit, a functional unit, and processing circuitry. The interface device receives a spoken command to identify an in-cabin vehicle zone of two or more in-cabin vehicle zones of the vehicle, and receives background audio data concurrently with a portion of the spoken command. The in-vehicle control unit separates the background audio data from the spoken command, and selects which in-cabin vehicle zone of the two or more in-cabin vehicle zones is identified by the spoken command. The functional unit controls a function within the vehicle. The processing circuitry stores, to a command buffer, data processed from the received spoken command, and controls, based on the data processed from the received spoken command, the functional unit using audio input received from the selected in-cabin vehicle zone.
    Type: Application
    Filed: May 18, 2022
    Publication date: September 1, 2022
    Inventors: Asif Iqbal Mohammad, Sreekanth Narayanaswamy, Rishabh Tyagi, Erik Visser
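    The zone-selection step in publication 20220277744 can be illustrated very simply: after the background audio is separated and the command is transcribed, the control unit picks the in-cabin zone that the command names. The keyword matching below is an illustrative stand-in for that selection logic.

      def select_zone(command: str,
                      zones=("front-left", "front-right", "rear-left", "rear-right")):
          """Return the in-cabin zone named in the (already transcribed) spoken
          command, or None if no zone is mentioned."""
          text = command.lower()
          for zone in zones:
              if zone.replace("-", " ") in text:
                  return zone
          return None

      print(select_zone("Turn up the heat in the rear left seat"))  # rear-left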
  • Publication number: 20220272451
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Application
    Filed: February 9, 2022
    Publication date: August 25, 2022
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
  • Patent number: 11425497
    Abstract: In an aspect, a lens is zoomed in to create a zoomed lens. Lens data associated with the lens includes a direction of the lens relative to an object in a field-of-view of the zoomed lens and a magnification of the object resulting from the zoomed lens. An array of microphones captures audio signals including audio produced by the object and interference produced by other objects. The audio signals are processed to identify a directional component associated with the audio produced by the object and three orthogonal components associated with the interference produced by the other objects. Stereo beamforming is used to increase a magnitude of the directional component (relative to the interference) while retaining a binaural nature of the audio signals. The increase in magnitude of the directional component is based on an amount of the magnification provided by the zoomed lens to the object.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: August 23, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: S M Akramus Salehin, Lae-Hoon Kim, Vasudev Nayak, Shankar Thagadur Shivappa, Isaac Garcia Munoz, Sanghyun Chi, Erik Visser
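    A minimal sketch of the zoom-dependent gain described for patent 11425497: the directional component attributed to the framed object is boosted relative to the interference components, with the boost tied to the lens magnification. The logarithmic gain mapping and the simple remix below are illustrative assumptions.

      import numpy as np

      def zoom_audio(directional: np.ndarray, interference: np.ndarray,
                     magnification: float) -> np.ndarray:
          """Boost the directional component relative to the interference in
          proportion to the lens magnification (+6 dB per doubling, assumed)."""
          gain_db = 6.0 * np.log2(max(magnification, 1.0))
          gain = 10.0 ** (gain_db / 20.0)
          return gain * directional + interference

      rng = np.random.default_rng(3)
      directional = rng.standard_normal(16000)          # audio from the framed object
      interference = 0.5 * rng.standard_normal(16000)   # residual interference components
      print(zoom_audio(directional, interference, magnification=4.0).shape)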
  • Patent number: 11410677
    Abstract: A device includes one or more processors configured to provide audio data samples to a sound event classification model. The one or more processors are also configured to determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class associated with the audio data samples was recognized by the sound event classification model. The one or more processors are further configured to, based on a determination that the sound class was not recognized, determine whether the sound event classification model corresponds to an audio scene associated with the audio data samples. The one or more processors are also configured to, based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, store model update data based on the audio data samples.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: August 9, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Fatemeh Saki, Yinyi Guo, Erik Visser
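    The gating logic described for patent 11410677 reduces to two checks: store model-update data only when no sound class was recognized but the model does correspond to the current audio scene. The confidence threshold and list-based buffer below are illustrative assumptions.

      def maybe_store_update(class_scores, scene_match: bool, samples,
                             update_buffer: list, threshold: float = 0.5) -> bool:
          """Store model-update data only if no class score clears the threshold
          (sound class not recognized) and the model matches the audio scene."""
          recognized = max(class_scores) >= threshold
          if not recognized and scene_match:
              update_buffer.append(samples)
              return True
          return False

      buffer = []
      print(maybe_store_update([0.2, 0.3, 0.1], scene_match=True,
                               samples="frame-0042", update_buffer=buffer))  # True
      print(len(buffer))                                                     # 1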
  • Publication number: 20220230623
    Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.
    Type: Application
    Filed: January 21, 2021
    Publication date: July 21, 2022
    Applicant: QUALCOMM Incorporated
    Inventors: Kyungguen BYUN, Sunkuk MOON, Shuhua ZHANG, Vahid MONTAZERI, Lae-Hoon KIM, Erik VISSER
  • Publication number: 20220199100
    Abstract: A device includes one or more processors configured to obtain audio signals representing sound captured by at least three microphones and determine spatial audio data based on the audio signals. The one or more processors are further configured to determine a metric indicative of wind noise in the audio signals. The metric is based on a comparison of a first value and a second value. The first value corresponds to an aggregate signal based on the spatial audio data, and the second value corresponds to a differential signal based on the spatial audio data.
    Type: Application
    Filed: December 21, 2020
    Publication date: June 23, 2022
    Inventors: S M Akramus SALEHIN, Lae-Hoon KIM, Hannes PESSENTHEINER, Shuhua ZHANG, Sanghyun CHI, Erik VISSER, Shankar THAGADUR SHIVAPPA
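    The wind-noise metric in publication 20220199100 compares an aggregate signal against a differential signal derived from the spatial audio data. Because wind noise is largely uncorrelated across microphones while acoustic sound is coherent, the energy ratio below rises toward 1 under wind and stays near 0 otherwise; the exact comparison in the publication may differ.

      import numpy as np

      def wind_noise_metric(ch_a: np.ndarray, ch_b: np.ndarray) -> float:
          """Energy of the differential signal divided by energy of the
          aggregate signal for two spatial audio channels."""
          aggregate = ch_a + ch_b
          differential = ch_a - ch_b
          return float(np.sum(differential ** 2) / (np.sum(aggregate ** 2) + 1e-12))

      rng = np.random.default_rng(4)
      coherent = rng.standard_normal(16000)             # same sound at both channels
      print(round(wind_noise_metric(coherent, coherent), 3))             # ~0.0: little wind
      print(round(wind_noise_metric(rng.standard_normal(16000),
                                    rng.standard_normal(16000)), 3))     # ~1.0: wind-like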
  • Publication number: 20220201395
    Abstract: In an aspect, a lens is zoomed in to create a zoomed lens. Lens data associated with the lens includes a direction of the lens relative to an object in a field-of-view of the zoomed lens and a magnification of the object resulting from the zoomed lens. An array of microphones captures audio signals including audio produced by the object and interference produced by other objects. The audio signals are processed to identify a directional component associated with the audio produced by the object and three orthogonal components associated with the interference produced by the other objects. Stereo beamforming is used to increase a magnitude of the directional component (relative to the interference) while retaining a binaural nature of the audio signals. The increase in magnitude of the directional component is based on an amount of the magnification provided by the zoomed lens to the object.
    Type: Application
    Filed: December 18, 2020
    Publication date: June 23, 2022
    Inventors: S M Akramus SALEHIN, Lae-Hoon KIM, Vasudev NAYAK, Shankar THAGADUR SHIVAPPA, Isaac Garcia MUNOZ, Sanghyun CHI, Erik VISSER
  • Publication number: 20220180859
    Abstract: A device includes processors configured to determine, in a first power mode, whether an audio stream corresponds to speech of at least two talkers. The processors are configured to, based on determining that the audio stream corresponds to speech of at least two talkers, analyze, in a second power mode, audio feature data of the audio stream to generate a segmentation result. The processors are configured to perform a comparison of a plurality of user speech profiles to an audio feature data set of a plurality of audio feature data sets of a talker-homogeneous audio segment to determine whether the audio feature data set matches any of the user speech profiles. The processors are configured to, based on determining that the audio feature data set does not match any of the plurality of user speech profiles, generate a user speech profile based on the plurality of audio feature data sets.
    Type: Application
    Filed: December 8, 2020
    Publication date: June 9, 2022
    Inventors: Soo Jin PARK, Sunkuk MOON, Lae-Hoon KIM, Erik VISSER
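    The profile-matching step in publication 20220180859 can be sketched as a cosine-similarity comparison between a talker-homogeneous segment's averaged feature embedding and each stored user speech profile, with a new profile enrolled when nothing matches. The embedding averaging, similarity threshold, and naming scheme below are illustrative assumptions.

      import numpy as np

      def match_or_enroll(segment_features: list, profiles: dict,
                          threshold: float = 0.75) -> str:
          """Match a talker-homogeneous segment against stored speech profiles by
          cosine similarity; enroll a new profile if nothing matches."""
          query = np.mean(segment_features, axis=0)
          query = query / np.linalg.norm(query)
          for name, profile in profiles.items():
              if float(query @ (profile / np.linalg.norm(profile))) >= threshold:
                  return name
          new_name = f"user_{len(profiles) + 1}"
          profiles[new_name] = query
          return new_name

      rng = np.random.default_rng(5)
      profiles = {"user_1": rng.standard_normal(64)}
      segment = [rng.standard_normal(64) for _ in range(5)]
      print(match_or_enroll(segment, profiles))         # enrolls "user_2" unless it matches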
  • Patent number: 11348581
    Abstract: A device for multi-modal user input includes a processor configured to process first data received from a first input device. The first data indicates a first input from a user based on a first input mode. The first input corresponds to a command. The processor is configured to send a feedback message to an output device based on processing the first data. The feedback message instructs the user to provide, based on a second input mode that is different from the first input mode, a second input that identifies a command associated with the first input. The processor is configured to receive second data from a second input device, the second data indicating the second input, and to update a mapping to associate the first input to the command identified by the second input.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: May 31, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Ravi Choudhary, Lae-Hoon Kim, Sunkuk Moon, Yinyi Guo, Fatemeh Saki, Erik Visser
  • Publication number: 20220165285
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate a network output. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the network output to be provided as a common input to each of the multiple speech application modules. A first speech application module corresponds to a speaker verifier, and a second speech application module corresponds to a speech recognition network.
    Type: Application
    Filed: February 10, 2022
    Publication date: May 26, 2022
    Inventors: Lae-Hoon KIM, Sunkuk MOON, Erik VISSER, Prajakt KULKARNI
  • Publication number: 20220165292
    Abstract: A device includes one or more processors configured to provide audio data samples to a sound event classification model. The one or more processors are also configured to determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class associated with the audio data samples was recognized by the sound event classification model. The one or more processors are further configured to, based on a determination that the sound class was not recognized, determine whether the sound event classification model corresponds to an audio scene associated with the audio data samples. The one or more processors are also configured to, based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, store model update data based on the audio data samples.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
  • Publication number: 20220164667
    Abstract: A method includes initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes and linking an output of the first neural network and an output of the second neural network to one or more coupling networks. The method also includes, after training the second neural network and the one or more coupling networks, determining whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
  • Publication number: 20220164662
    Abstract: A device includes one or more processors configured to receive sensor data from one or more sensor devices. The one or more processors are also configured to determine a context of the device based on the sensor data. The one or more processors are further configured to select a model based on the context. The one or more processors are also configured to process an input signal using the model to generate a context-specific output.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
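    The context-driven selection in publication 20220164662 amounts to picking a model from a registry keyed by the context inferred from sensor data, with a generic fallback. The string-keyed dictionary and the model names below are illustrative assumptions.

      def select_model(models: dict, context: str):
          """Pick the context-specific model for the context inferred from the
          sensor data, falling back to a generic model."""
          return models.get(context, models["default"])

      models = {"in-car": "car_noise_model", "outdoor": "wind_robust_model",
                "default": "general_model"}
      print(select_model(models, "in-car"))             # car_noise_model
      print(select_model(models, "office"))             # general_model (fallback)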
  • Publication number: 20220115007
    Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including first audio data corresponding to a first output of a first microphone and second audio data corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a dynamic classifier. The dynamic classifier is configured to generate a classification output corresponding to the audio data. The one or more processors are further configured to execute the instructions to determine, at least partially based on the classification output, whether the audio data corresponds to user voice activity.
    Type: Application
    Filed: May 5, 2021
    Publication date: April 14, 2022
    Inventors: Taher SHAHBAZI MIRZAHASANLOO, Rogerio Guedes ALVES, Erik VISSER, Lae-Hoon KIM
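    The classification step in publication 20220115007 can be illustrated by feeding simple two-microphone features to a classifier and thresholding its output to decide whether a pair of frames contains user voice activity. The hand-set features and the fixed logistic model below stand in for the dynamic classifier described in the application.

      import numpy as np

      def is_user_voice(frame_a, frame_b, weights=(4.0, 4.0, 8.0), bias=-1.0) -> bool:
          """Threshold a fixed logistic model over per-microphone energies and the
          inter-microphone correlation to flag user voice activity."""
          feats = np.array([np.mean(frame_a ** 2),
                            np.mean(frame_b ** 2),
                            float(np.mean(frame_a * frame_b))])
          score = 1.0 / (1.0 + np.exp(-(feats @ np.asarray(weights) + bias)))
          return bool(score > 0.5)

      rng = np.random.default_rng(6)
      voiced = rng.standard_normal(256)
      print(is_user_voice(voiced, voiced))                      # True: strong, correlated
      print(is_user_voice(0.05 * rng.standard_normal(256),
                          0.05 * rng.standard_normal(256)))     # False: quiet, uncorrelated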
  • Publication number: 20220109930
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Application
    Filed: November 18, 2021
    Publication date: April 7, 2022
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
  • Patent number: 11290518
    Abstract: Various embodiments provide systems and methods for a command device that can be used to establish a wireless connection, through one or more wireless channels, between the command device and a remote device. An intention code may be generated, prior to, or after, the establishment of the wireless connection, and the remote device may be selected based on the intention code. The command device may initiate a wireless transfer, through one or more wireless channels of the established wireless connection, of the intention code, and receive acknowledgement that the intention code was successfully transferred to the remote device. The command device may then control the remote device, based on the intention code sent to the remote device, through the one or more wireless channels of the established wireless connection between the command device and the remote device.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: March 29, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Erik Visser, Yinyi Guo