Patents by Inventor Erik Visser

Erik Visser has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230026735
    Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including a first audio frame corresponding to a first output of a first microphone and a second audio frame corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a first noise-suppression network and a second noise-suppression network. The first noise-suppression network is configured to generate a first noise-suppressed audio frame and the second noise-suppression network is configured to generate a second noise-suppressed audio frame. The one or more processors are further configured to execute the instructions to provide the noise-suppressed audio frames to an attention-pooling network. The attention-pooling network is configured to generate an output noise-suppressed audio frame.
    Type: Application
    Filed: July 21, 2021
    Publication date: January 26, 2023
    Inventors: Vahid MONTAZERI, Van NGUYEN, Hannes PESSENTHEINER, Lae-Hoon KIM, Erik VISSER, Rogerio Guedes ALVES
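    The attention-pooling step described in the entry above (publication 20230026735) can be pictured with a minimal sketch: each per-microphone noise-suppressed frame gets a scalar score, the scores are softmax-normalized, and the frames are combined as a weighted sum. The frame shapes and the fixed scoring vector below are illustrative assumptions standing in for the learned attention-pooling network.

      import numpy as np

      def attention_pool(frames: np.ndarray, scoring: np.ndarray) -> np.ndarray:
          """Combine per-microphone noise-suppressed frames into one output frame.

          frames:  (num_mics, num_bins) noise-suppressed spectral frames
          scoring: (num_bins,) fixed scoring vector, a stand-in for the learned
                   attention-pooling network
          """
          scores = frames @ scoring                 # one scalar score per frame
          weights = np.exp(scores - scores.max())
          weights /= weights.sum()                  # softmax over the microphone paths
          return weights @ frames                   # attention-weighted sum

      rng = np.random.default_rng(0)
      frames = rng.standard_normal((2, 257))        # two noise-suppressed frames
      scoring = rng.standard_normal(257)
      print(attention_pool(frames, scoring).shape)  # (257,)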
  • Patent number: 11533561
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Grant
    Filed: February 9, 2022
    Date of Patent: December 20, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
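    One way to picture the processing described for patent 11533561: keep only the low-frequency band of the bone-conduction signal (where the air-conduction microphones are least reliable), apply a gain, and mix it back with the microphone signal. The cutoff frequency, filter order, gain, and additive mix below are illustrative assumptions, not the patented filter or equalizer design.

      import numpy as np
      from scipy.signal import butter, lfilter

      def mix_bone_conduction(mic: np.ndarray, bone: np.ndarray, fs: int = 16000,
                              cutoff_hz: float = 800.0, gain: float = 2.0) -> np.ndarray:
          """Low-pass the bone-conduction signal, apply a gain, and mix it with
          the microphone signal."""
          b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
          bone_lp = lfilter(b, a, bone)
          return mic + gain * bone_lp

      fs = 16000
      t = np.arange(fs) / fs
      mic = np.sin(2 * np.pi * 300 * t) + 0.1 * np.random.default_rng(1).standard_normal(fs)
      bone = np.sin(2 * np.pi * 300 * t)            # cleaner low-frequency stand-in
      print(mix_bone_conduction(mic, bone, fs).shape)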
  • Publication number: 20220360891
    Abstract: A device includes one or more processors configured to execute instructions to determine a first phase based on a first audio signal of first audio signals and to determine a second phase based on a second audio signal of second audio signals. The one or more processors are also configured to execute the instructions to apply spatial filtering to selected audio signals of the first audio signals and the second audio signals to generate an enhanced audio signal. The one or more processors are further configured to execute the instructions to generate a first output signal by combining a magnitude of the enhanced audio signal with the first phase and to generate a second output signal by combining the magnitude of the enhanced audio signal with the second phase. The first output signal and the second output signal correspond to an audio zoomed signal.
    Type: Application
    Filed: May 10, 2021
    Publication date: November 10, 2022
    Inventors: Lae-Hoon KIM, Fatemeh SAKI, Yoon Mo YANG, Erik VISSER
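    A compact sketch of the recombination described in publication 20220360891: the spatially filtered (enhanced) signal supplies the magnitude spectrum, while each output channel keeps the phase of its own input channel, which preserves the spatial character of the audio-zoomed output. The single-frame FFT and signal lengths are illustrative assumptions.

      import numpy as np

      def audio_zoom_outputs(enhanced, left, right):
          """Give each output channel the magnitude of the enhanced signal and
          the phase of its own input channel (single FFT frame for brevity)."""
          mag = np.abs(np.fft.rfft(enhanced))
          outputs = []
          for channel in (left, right):
              phase = np.angle(np.fft.rfft(channel))
              outputs.append(np.fft.irfft(mag * np.exp(1j * phase), n=len(channel)))
          return outputs                            # [left_zoomed, right_zoomed]

      rng = np.random.default_rng(2)
      enhanced, left, right = rng.standard_normal((3, 1024))
      left_out, right_out = audio_zoom_outputs(enhanced, left, right)
      print(left_out.shape, right_out.shape)        # (1024,) (1024,)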
  • Publication number: 20220310108
    Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.
    Type: Application
    Filed: March 23, 2021
    Publication date: September 29, 2022
    Inventors: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI
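    The multi-encoder arrangement in publication 20220310108 can be sketched as two encoders, one for the input spectral data and one for the context data, whose outputs are fused and projected back to spectral data. The layer sizes, fusion by concatenation, and use of stock PyTorch transformer layers are illustrative assumptions, not the published architecture.

      import torch
      import torch.nn as nn

      class MultiEncoderEnhancer(nn.Module):
          """One encoder for input spectral frames, one for context data; the two
          outputs are concatenated and projected to enhanced spectral frames."""
          def __init__(self, n_bins=257, ctx_dim=32, d_model=128):
              super().__init__()
              self.spec_in = nn.Linear(n_bins, d_model)
              self.ctx_in = nn.Linear(ctx_dim, d_model)
              def make_layer():
                  return nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
              self.spec_enc = nn.TransformerEncoder(make_layer(), num_layers=2)
              self.ctx_enc = nn.TransformerEncoder(make_layer(), num_layers=2)
              self.project = nn.Linear(2 * d_model, n_bins)

          def forward(self, spec, ctx):
              s = self.spec_enc(self.spec_in(spec))     # (batch, frames, d_model)
              c = self.ctx_enc(self.ctx_in(ctx))        # (batch, frames, d_model)
              return self.project(torch.cat([s, c], dim=-1))

      model = MultiEncoderEnhancer()
      spec = torch.randn(1, 100, 257)                   # input spectral data
      ctx = torch.randn(1, 100, 32)                     # context data
      print(model(spec, ctx).shape)                     # torch.Size([1, 100, 257])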
  • Publication number: 20220277744
    Abstract: A vehicle includes an interface device, an in-vehicle control unit, a functional unit, and processing circuitry. The interface device receives a spoken command to identify an in-cabin vehicle zone of two or more in-cabin vehicle zones of the vehicle, and receives background audio data concurrently with a portion of the spoken command. The in-vehicle control unit separates the background audio data from the spoken command, and selects which in-cabin vehicle zone of the two or more in-cabin vehicle zones is identified by the spoken command. The functional unit controls a function within the vehicle. The processing circuitry stores, to a command buffer, data processed from the received spoken command, and controls, based on the data processed from the received spoken command, the functional unit using audio input received from the selected in-cabin vehicle zone.
    Type: Application
    Filed: May 18, 2022
    Publication date: September 1, 2022
    Inventors: Asif Iqbal Mohammad, Sreekanth Narayanaswamy, Rishabh Tyagi, Erik Visser
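    The zone-selection step in publication 20220277744 can be illustrated very simply: after the background audio is separated and the command is transcribed, the control unit picks the in-cabin zone that the command names. The keyword matching below is an illustrative stand-in for that selection logic.

      def select_zone(command: str,
                      zones=("front-left", "front-right", "rear-left", "rear-right")):
          """Return the in-cabin zone named in the (already transcribed) spoken
          command, or None if no zone is mentioned."""
          text = command.lower()
          for zone in zones:
              if zone.replace("-", " ") in text:
                  return zone
          return None

      print(select_zone("Turn up the heat in the rear left seat"))  # rear-left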
  • Publication number: 20220272451
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Application
    Filed: February 9, 2022
    Publication date: August 25, 2022
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
  • Patent number: 11425497
    Abstract: In an aspect, a lens is zoomed in to create a zoomed lens. Lens data associated with the lens includes a direction of the lens relative to an object in a field-of-view of the zoomed lens and a magnification of the object resulting from the zoomed lens. An array of microphones captures audio signals including audio produced by the object and interference produced by other objects. The audio signals are processed to identify a directional component associated with the audio produced by the object and three orthogonal components associated with the interference produced by the other objects. Stereo beamforming is used to increase a magnitude of the directional component (relative to the interference) while retaining a binaural nature of the audio signals. The increase in magnitude of the directional component is based on an amount of the magnification provided by the zoomed lens to the object.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: August 23, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: S M Akramus Salehin, Lae-Hoon Kim, Vasudev Nayak, Shankar Thagadur Shivappa, Isaac Garcia Munoz, Sanghyun Chi, Erik Visser
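    A minimal sketch of the zoom-dependent gain described for patent 11425497: the directional component attributed to the framed object is boosted relative to the interference components, with the boost tied to the lens magnification. The logarithmic gain mapping and the simple remix below are illustrative assumptions.

      import numpy as np

      def zoom_audio(directional: np.ndarray, interference: np.ndarray,
                     magnification: float) -> np.ndarray:
          """Boost the directional component relative to the interference in
          proportion to the lens magnification (+6 dB per doubling, assumed)."""
          gain_db = 6.0 * np.log2(max(magnification, 1.0))
          gain = 10.0 ** (gain_db / 20.0)
          return gain * directional + interference

      rng = np.random.default_rng(3)
      directional = rng.standard_normal(16000)          # audio from the framed object
      interference = 0.5 * rng.standard_normal(16000)   # residual interference components
      print(zoom_audio(directional, interference, magnification=4.0).shape)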
  • Patent number: 11410677
    Abstract: A device includes one or more processors configured to provide audio data samples to a sound event classification model. The one or more processors are also configured to determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class associated with the audio data samples was recognized by the sound event classification model. The one or more processors are further configured to, based on a determination that the sound class was not recognized, determine whether the sound event classification model corresponds to an audio scene associated with the audio data samples. The one or more processors are also configured to, based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, store model update data based on the audio data samples.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: August 9, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Fatemeh Saki, Yinyi Guo, Erik Visser
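    The gating logic described for patent 11410677 reduces to two checks: store model-update data only when no sound class was recognized but the model does correspond to the current audio scene. The confidence threshold and list-based buffer below are illustrative assumptions.

      def maybe_store_update(class_scores, scene_match: bool, samples,
                             update_buffer: list, threshold: float = 0.5) -> bool:
          """Store model-update data only if no class score clears the threshold
          (sound class not recognized) and the model matches the audio scene."""
          recognized = max(class_scores) >= threshold
          if not recognized and scene_match:
              update_buffer.append(samples)
              return True
          return False

      buffer = []
      print(maybe_store_update([0.2, 0.3, 0.1], scene_match=True,
                               samples="frame-0042", update_buffer=buffer))  # True
      print(len(buffer))                                                     # 1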
  • Publication number: 20220230623
    Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.
    Type: Application
    Filed: January 21, 2021
    Publication date: July 21, 2022
    Applicant: QUALCOMM Incorporated
    Inventors: Kyungguen BYUN, Sunkuk MOON, Shuhua ZHANG, Vahid MONTAZERI, Lae-Hoon KIM, Erik VISSER
  • Publication number: 20220199100
    Abstract: A device includes one or more processors configured to obtain audio signals representing sound captured by at least three microphones and determine spatial audio data based on the audio signals. The one or more processors are further configured to determine a metric indicative of wind noise in the audio signals. The metric is based on a comparison of a first value and a second value. The first value corresponds to an aggregate signal based on the spatial audio data, and the second value corresponds to a differential signal based on the spatial audio data.
    Type: Application
    Filed: December 21, 2020
    Publication date: June 23, 2022
    Inventors: S M Akramus SALEHIN, Lae-Hoon KIM, Hannes PESSENTHEINER, Shuhua ZHANG, Sanghyun CHI, Erik VISSER, Shankar THAGADUR SHIVAPPA
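    The wind-noise metric in publication 20220199100 compares an aggregate signal against a differential signal derived from the spatial audio data. Because wind noise is largely uncorrelated across microphones while acoustic sound is coherent, the energy ratio below rises toward 1 under wind and stays near 0 otherwise; the exact comparison in the publication may differ.

      import numpy as np

      def wind_noise_metric(ch_a: np.ndarray, ch_b: np.ndarray) -> float:
          """Energy of the differential signal divided by energy of the
          aggregate signal for two spatial audio channels."""
          aggregate = ch_a + ch_b
          differential = ch_a - ch_b
          return float(np.sum(differential ** 2) / (np.sum(aggregate ** 2) + 1e-12))

      rng = np.random.default_rng(4)
      coherent = rng.standard_normal(16000)             # same sound at both channels
      print(round(wind_noise_metric(coherent, coherent), 3))             # ~0.0: little wind
      print(round(wind_noise_metric(rng.standard_normal(16000),
                                    rng.standard_normal(16000)), 3))     # ~1.0: wind-like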
  • Publication number: 20220201395
    Abstract: In an aspect, a lens is zoomed in to create a zoomed lens. Lens data associated with the lens includes a direction of the lens relative to an object in a field-of-view of the zoomed lens and a magnification of the object resulting from the zoomed lens. An array of microphones captures audio signals including audio produced by the object and interference produced by other objects. The audio signals are processed to identify a directional component associated with the audio produced by the object and three orthogonal components associated with the interference produced by the other objects. Stereo beamforming is used to increase a magnitude of the directional component (relative to the interference) while retaining a binaural nature of the audio signals. The increase in magnitude of the directional component is based on an amount of the magnification provided by the zoomed lens to the object.
    Type: Application
    Filed: December 18, 2020
    Publication date: June 23, 2022
    Inventors: S M Akramus SALEHIN, Lae-Hoon KIM, Vasudev NAYAK, Shankar THAGADUR SHIVAPPA, Isaac Garcia MUNOZ, Sanghyun CHI, Erik VISSER
  • Publication number: 20220180859
    Abstract: A device includes processors configured to determine, in a first power mode, whether an audio stream corresponds to speech of at least two talkers. The processors are configured to, based on determining that the audio stream corresponds to speech of at least two talkers, analyze, in a second power mode, audio feature data of the audio stream to generate a segmentation result. The processors are configured to perform a comparison of a plurality of user speech profiles to an audio feature data set of a plurality of audio feature data sets of a talker-homogeneous audio segment to determine whether the audio feature data set matches any of the user speech profiles. The processors are configured to, based on determining that the audio feature data set does not match any of the plurality of user speech profiles, generate a user speech profile based on the plurality of audio feature data sets.
    Type: Application
    Filed: December 8, 2020
    Publication date: June 9, 2022
    Inventors: Soo Jin PARK, Sunkuk MOON, Lae-Hoon KIM, Erik VISSER
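    The profile-matching step in publication 20220180859 can be sketched as a cosine-similarity comparison between a talker-homogeneous segment's averaged feature embedding and each stored user speech profile, with a new profile enrolled when nothing matches. The embedding averaging, similarity threshold, and naming scheme below are illustrative assumptions.

      import numpy as np

      def match_or_enroll(segment_features: list, profiles: dict,
                          threshold: float = 0.75) -> str:
          """Match a talker-homogeneous segment against stored speech profiles by
          cosine similarity; enroll a new profile if nothing matches."""
          query = np.mean(segment_features, axis=0)
          query = query / np.linalg.norm(query)
          for name, profile in profiles.items():
              if float(query @ (profile / np.linalg.norm(profile))) >= threshold:
                  return name
          new_name = f"user_{len(profiles) + 1}"
          profiles[new_name] = query
          return new_name

      rng = np.random.default_rng(5)
      profiles = {"user_1": rng.standard_normal(64)}
      segment = [rng.standard_normal(64) for _ in range(5)]
      print(match_or_enroll(segment, profiles))         # enrolls "user_2" unless it matches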
  • Patent number: 11348581
    Abstract: A device for multi-modal user input includes a processor configured to process first data received from a first input device. The first data indicates a first input from a user based on a first input mode. The first input corresponds to a command. The processor is configured to send a feedback message to an output device based on processing the first data. The feedback message instructs the user to provide, based on a second input mode that is different from the first input mode, a second input that identifies a command associated with the first input. The processor is configured to receive second data from a second input device, the second data indicating the second input, and to update a mapping to associate the first input to the command identified by the second input.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: May 31, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Ravi Choudhary, Lae-Hoon Kim, Sunkuk Moon, Yinyi Guo, Fatemeh Saki, Erik Visser
  • Publication number: 20220165285
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate a network output. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the network output to be provided as a common input to each of the multiple speech application modules. A first speech application module corresponds to a speaker verifier, and a second speech application module corresponds to a speech recognition network.
    Type: Application
    Filed: February 10, 2022
    Publication date: May 26, 2022
    Inventors: Lae-Hoon KIM, Sunkuk MOON, Erik VISSER, Prajakt KULKARNI
  • Publication number: 20220165292
    Abstract: A device includes one or more processors configured to provide audio data samples to a sound event classification model. The one or more processors are also configured to determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class associated with the audio data samples was recognized by the sound event classification model. The one or more processors are further configured to, based on a determination that the sound class was not recognized, determine whether the sound event classification model corresponds to an audio scene associated with the audio data samples. The one or more processors are also configured to, based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, store model update data based on the audio data samples.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
  • Publication number: 20220164667
    Abstract: A method includes initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes and linking an output of the first neural network and an output of the second neural network to one or more coupling networks. The method also includes, after training the second neural network and the one or more coupling networks, determining whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
  • Publication number: 20220164662
    Abstract: A device includes one or more processors configured to receive sensor data from one or more sensor devices. The one or more processors are also configured to determine a context of the device based on the sensor data. The one or more processors are further configured to select a model based on the context. The one or more processors are also configured to process an input signal using the model to generate a context-specific output.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
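    The context-driven selection in publication 20220164662 amounts to picking a model from a registry keyed by the context inferred from sensor data, with a generic fallback. The string-keyed dictionary and the model names below are illustrative assumptions.

      def select_model(models: dict, context: str):
          """Pick the context-specific model for the context inferred from the
          sensor data, falling back to a generic model."""
          return models.get(context, models["default"])

      models = {"in-car": "car_noise_model", "outdoor": "wind_robust_model",
                "default": "general_model"}
      print(select_model(models, "in-car"))             # car_noise_model
      print(select_model(models, "office"))             # general_model (fallback)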
  • Publication number: 20220115007
    Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including first audio data corresponding to a first output of a first microphone and second audio data corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a dynamic classifier. The dynamic classifier is configured to generate a classification output corresponding to the audio data. The one or more processors are further configured to execute the instructions to determine, at least partially based on the classification output, whether the audio data corresponds to user voice activity.
    Type: Application
    Filed: May 5, 2021
    Publication date: April 14, 2022
    Inventors: Taher SHAHBAZI MIRZAHASANLOO, Rogerio Guedes ALVES, Erik VISSER, Lae-Hoon KIM
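    The classification step in publication 20220115007 can be illustrated by feeding simple two-microphone features to a classifier and thresholding its output to decide whether a pair of frames contains user voice activity. The hand-set features and the fixed logistic model below stand in for the dynamic classifier described in the application.

      import numpy as np

      def is_user_voice(frame_a, frame_b, weights=(4.0, 4.0, 8.0), bias=-1.0) -> bool:
          """Threshold a fixed logistic model over per-microphone energies and the
          inter-microphone correlation to flag user voice activity."""
          feats = np.array([np.mean(frame_a ** 2),
                            np.mean(frame_b ** 2),
                            float(np.mean(frame_a * frame_b))])
          score = 1.0 / (1.0 + np.exp(-(feats @ np.asarray(weights) + bias)))
          return bool(score > 0.5)

      rng = np.random.default_rng(6)
      voiced = rng.standard_normal(256)
      print(is_user_voice(voiced, voiced))                      # True: strong, correlated
      print(is_user_voice(0.05 * rng.standard_normal(256),
                          0.05 * rng.standard_normal(256)))     # False: quiet, uncorrelated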
  • Publication number: 20220109930
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Application
    Filed: November 18, 2021
    Publication date: April 7, 2022
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
  • Patent number: 11290518
    Abstract: Various embodiments provide systems and methods for a command device that can be used to establish a wireless connection, through one or more wireless channels, between the command device and a remote device. An intention code may be generated, prior to, or after, the establishment of the wireless connection, and the remote device may be selected based on the intention code. The command device may initiate a wireless transfer, through one or more wireless channels of the established wireless connection, of the intention code, and receive acknowledgement that the intention code was successfully transferred to the remote device. The command device may then control the remote device, based on the intention code sent to the remote device, through the one or more wireless channels of the established wireless connection between the command device and the remote device.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: March 29, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Erik Visser, Yinyi Guo