Patents by Inventor Erik Visser

Erik Visser has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11705147
    Abstract: Systems, methods and computer-readable media are provided for speech enhancement using a hybrid neural network. An example process can include receiving, by a first neural network portion of the hybrid neural network, audio data and reference data, the audio data including speech data, noise data, and echo data; filtering, by the first neural network portion, a portion of the audio data based on adapted coefficients of the first neural network portion, the portion of the audio data including the noise data and/or echo data; based on the filtering, generating, by the first neural network portion, filtered audio data including the speech data and an unfiltered portion of the noise data and/or echo data; and based on the filtered audio data and the reference data, extracting, by a second neural network portion of the hybrid neural network, the speech data from the filtered audio data.
    Type: Grant
    Filed: April 28, 2021
    Date of Patent: July 18, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Erik Visser, Vahid Montazeri, Shuhua Zhang, Lae-Hoon Kim
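
As a rough illustration of the hybrid structure described in the abstract above, the following sketch pairs a filtering front end driven by a reference signal with a neural extraction stage. It is a minimal sketch assuming a PyTorch environment; the learnable FIR filter, the GRU mask estimator, and all shapes and sizes are illustrative assumptions, not the patented design.

```python
# Hypothetical sketch: filtering portion (reference-driven) + extraction portion (mask-based).
import torch
import torch.nn as nn

class FilterPortion(nn.Module):
    """First network portion: estimates the echo/noise component from the
    reference and removes it, leaving speech plus residual interference."""
    def __init__(self, taps: int = 64):
        super().__init__()
        # A learnable FIR filter standing in for the adapted coefficients.
        self.fir = nn.Conv1d(1, 1, kernel_size=taps, padding=taps - 1, bias=False)

    def forward(self, audio: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        echo_estimate = self.fir(reference)[..., : audio.shape[-1]]
        return audio - echo_estimate  # filtered audio: speech + unfiltered noise/echo

class ExtractPortion(nn.Module):
    """Second network portion: predicts a mask that extracts the speech
    from the filtered audio, conditioned on the reference."""
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.rnn = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.mask = nn.Linear(hidden, 1)

    def forward(self, filtered: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        x = torch.stack([filtered.squeeze(1), reference.squeeze(1)], dim=-1)  # (B, T, 2)
        h, _ = self.rnn(x)
        mask = torch.sigmoid(self.mask(h)).squeeze(-1)  # per-sample speech mask
        return filtered.squeeze(1) * mask               # extracted speech

audio = torch.randn(1, 1, 16000)      # mixture: speech + noise + echo
reference = torch.randn(1, 1, 16000)  # e.g., far-end playback reference
filtered = FilterPortion()(audio, reference)
speech = ExtractPortion()(filtered, reference)
```
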
  • Patent number: 11700484
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate a network output. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the network output to be provided as a common input to each of the multiple speech application modules. A first speech application module corresponds to a speaker verifier, and a second speech application module corresponds to a speech recognition network.
    Type: Grant
    Filed: February 10, 2022
    Date of Patent: July 11, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Sunkuk Moon, Erik Visser, Prajakt Kulkarni
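
The "common input" arrangement in the abstract above can be pictured with the short sketch below: one shared speech-processing network produces features that are passed unchanged to both a speaker-verification head and a speech-recognition head. Module names, layer sizes, and the pooling of the speaker embedding are illustrative assumptions.

```python
# Hypothetical sketch: one shared front end feeding two speech application modules.
import torch
import torch.nn as nn

shared_frontend = nn.Sequential(          # speech processing network
    nn.Conv1d(40, 128, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv1d(128, 128, kernel_size=3, padding=1),
    nn.ReLU(),
)
speaker_verifier = nn.Linear(128, 256)    # placeholder: speaker-embedding head
speech_recognizer = nn.Linear(128, 40)    # placeholder: token/posterior head

features = torch.randn(1, 40, 100)        # e.g., 100 frames of 40-dim filterbanks
network_output = shared_frontend(features).transpose(1, 2)  # common input, (1, 100, 128)

speaker_embedding = speaker_verifier(network_output).mean(dim=1)  # one vector per utterance
token_posteriors = speech_recognizer(network_output)              # one vector per frame
```
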
  • Patent number: 11676571
    Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: June 13, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Kyungguen Byun, Sunkuk Moon, Shuhua Zhang, Vahid Montazeri, Lae-Hoon Kim, Erik Visser
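
A minimal sketch of the multi-encoder idea in the abstract above, assuming a PyTorch environment: one encoder reads the input representation of speech, a second encodes the control parameters that indicate the target speech characteristics, and a decoder consumes both. All dimensions and the meaning of the control values are assumptions.

```python
# Hypothetical sketch: content encoder + control-parameter encoder + decoder.
import torch
import torch.nn as nn

class MultiEncoder(nn.Module):
    def __init__(self, feat_dim: int = 80, ctrl_dim: int = 4, hidden: int = 128):
        super().__init__()
        self.content_encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.control_encoder = nn.Linear(ctrl_dim, hidden)
        self.decoder = nn.GRU(2 * hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, feat_dim)

    def forward(self, speech_rep: torch.Tensor, controls: torch.Tensor) -> torch.Tensor:
        content, _ = self.content_encoder(speech_rep)        # (B, T, H)
        style = self.control_encoder(controls).unsqueeze(1)  # (B, 1, H)
        style = style.expand(-1, content.shape[1], -1)       # broadcast over time
        decoded, _ = self.decoder(torch.cat([content, style], dim=-1))
        return self.out(decoded)                             # encoded data for the target speech

speech_rep = torch.randn(1, 200, 80)              # input representation of speech
controls = torch.tensor([[0.2, 0.8, 0.5, 0.1]])   # assumed pitch/rate/emotion-style knobs
encoded = MultiEncoder()(speech_rep, controls)
```
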
  • Patent number: 11671752
    Abstract: A device includes one or more processors configured to execute instructions to determine a first phase based on a first audio signal of first audio signals and to determine a second phase based on a second audio signal of second audio signals. The one or more processors are also configured to execute the instructions to apply spatial filtering to selected audio signals of the first audio signals and the second audio signals to generate an enhanced audio signal. The one or more processors are further configured to execute the instructions to generate a first output signal including combining a magnitude of the enhanced audio signal with the first phase and to generate a second output signal including combining the magnitude of the enhanced audio signal with the second phase. The first output signal and the second output signal correspond to an audio zoomed signal.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: June 6, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Fatemeh Saki, Yoon Mo Yang, Erik Visser
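
The magnitude/phase recombination in the abstract above can be sketched in a few lines of Python with SciPy's STFT: the magnitude of a spatially enhanced signal is paired with the phase of each original channel to form the two audio-zoomed outputs. The delay-and-sum average below stands in for whatever spatial filter is actually used and is purely an assumption.

```python
# Hypothetical sketch: enhanced magnitude combined with each channel's original phase.
import numpy as np
from scipy.signal import stft, istft

fs = 16000
mic_left = np.random.randn(fs)           # first audio signal
mic_right = np.random.randn(fs)          # second audio signal
enhanced = 0.5 * (mic_left + mic_right)  # placeholder for the spatially filtered signal

_, _, Z_left = stft(mic_left, fs, nperseg=512)
_, _, Z_right = stft(mic_right, fs, nperseg=512)
_, _, Z_enh = stft(enhanced, fs, nperseg=512)

magnitude = np.abs(Z_enh)                                # shared enhanced magnitude
out_left = magnitude * np.exp(1j * np.angle(Z_left))     # enhanced magnitude + first phase
out_right = magnitude * np.exp(1j * np.angle(Z_right))   # enhanced magnitude + second phase

_, zoomed_left = istft(out_left, fs, nperseg=512)        # first output signal
_, zoomed_right = istft(out_right, fs, nperseg=512)      # second output signal
```
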
  • Patent number: 11664044
    Abstract: A device includes a processor configured to receive audio data samples and provide the audio data samples to a first neural network to generate a first output corresponding to a first set of sound classes. The processor is further configured to provide the audio data samples to a second neural network to generate a second output corresponding to a second set of sound classes. A second count of classes of the second set of sound classes is greater than a first count of classes of the first set of sound classes. The processor is also configured to provide the first output to a neural adapter to generate a third output corresponding to the second set of sound classes. The processor is further configured to provide the second output and the third output to a merger adapter to generate sound event identification data based on the audio data samples.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: May 30, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Fatemeh Saki, Yinyi Guo, Erik Visser, Eunjeong Koh
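
A hypothetical sketch of the arrangement in the abstract above: a small classifier over few sound classes, a larger classifier over more classes, a neural adapter that lifts the small output into the larger class space, and a merger that combines the two. The layer sizes and the simple averaging merger are assumptions.

```python
# Hypothetical sketch: two classifiers of different class counts + neural adapter + merger.
import torch
import torch.nn as nn

FEW, MANY = 5, 20                            # first and second sets of sound classes
first_network = nn.Linear(64, FEW)           # coarse sound-class scores
second_network = nn.Linear(64, MANY)         # fine-grained sound-class scores
neural_adapter = nn.Linear(FEW, MANY)        # maps coarse scores into the larger class space

audio_features = torch.randn(1, 64)          # features from the audio data samples
first_out = first_network(audio_features)    # (1, FEW)
second_out = second_network(audio_features)  # (1, MANY)
third_out = neural_adapter(first_out)        # (1, MANY)

# Merger adapter: here simply an average of the two MANY-class score vectors.
sound_event_scores = 0.5 * (second_out + third_out)
sound_event_id = sound_event_scores.argmax(dim=-1)
```
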
  • Patent number: 11636866
    Abstract: A device includes a memory configured to store untransformed ambisonic coefficients at different time segments. The device also includes one or more processors configured to obtain the untransformed ambisonic coefficients at the different time segments, where the untransformed ambisonic coefficients at the different time segments represent a soundfield at the different time segments. The one or more processors are also configured to apply one adaptive network, based on a constraint, to the untransformed ambisonic coefficients at the different time segments to generate transformed ambisonic coefficients at the different time segments, wherein the transformed ambisonic coefficients at the different time segments represent a modified soundfield at the different time segments that was modified based on the constraint.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: April 25, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Shankar Thagadur Shivappa, S M Akramus Salehin, Shuhua Zhang, Erik Visser
  • Patent number: 11626104
    Abstract: A device includes processors configured to determine, in a first power mode, whether an audio stream corresponds to speech of at least two talkers. The processors are configured to, based on determining that the audio stream corresponds to speech of at least two talkers, analyze, in a second power mode, audio feature data of the audio stream to generate a segmentation result. The processors are configured to perform a comparison of a plurality of user speech profiles to an audio feature data set of a plurality of audio feature data sets of a talker-homogenous audio segment to determine whether the audio feature data set matches any of the user speech profiles. The processors are configured to, based on determining that the audio feature data set does not match any of the plurality of user speech profiles, generate a user speech profile based on the plurality of audio feature data sets.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: April 11, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Soo Jin Park, Sunkuk Moon, Lae-Hoon Kim, Erik Visser
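
The profile-matching step in the abstract above can be sketched as a cosine-similarity comparison between a segment-level embedding and the stored user speech profiles, with enrollment of a new profile when nothing matches. The embedding dimension and decision threshold below are illustrative assumptions.

```python
# Hypothetical sketch: match a talker-homogenous segment against enrolled speech profiles.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

profiles = [np.random.randn(128) for _ in range(3)]  # stored user speech profiles
segment_embedding = np.random.randn(128)             # from a talker-homogenous audio segment
MATCH_THRESHOLD = 0.7                                 # assumed decision threshold

scores = [cosine(segment_embedding, p) for p in profiles]
if scores and max(scores) >= MATCH_THRESHOLD:
    matched_user = int(np.argmax(scores))  # segment attributed to an enrolled user
else:
    profiles.append(segment_embedding)     # generate a new user speech profile
```
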
  • Publication number: 20230105655
    Abstract: A wearable device may include a processor configured to perform active noise cancelation (ANC) applied to an input audio signal received by at least one microphone, and detect a self-voice signal, based on one or more transducers. The processor may also be configured to apply a first filter to an external audio signal, detected by at least one external microphone on the wearable device, during a listen-through operation based on an activation of the audio zoom feature to generate a first listen-through signal that includes the external audio signal. The processor may also be configured to, after the activation of the audio zoom feature, terminate a second filter that provides low frequency compensation. The processor may be configured to produce an output audio signal that is based on at least the first listen-through signal that includes the external signal, and is based on the detected self-voice signal.
    Type: Application
    Filed: December 8, 2022
    Publication date: April 6, 2023
    Inventors: Lae-Hoon KIM, Dongmei WANG, Fatemeh SAKI, Taher SHAHBAZI MIRZAHASANLOO, Erik VISSER, Rogerio Guedes ALVES
  • Patent number: 11606643
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: March 14, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
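
A rough sketch of the bone-conduction path described in the abstract above: low-pass filter the bone conduction signal to keep its low-frequency portion, apply a gain, and mix the result with the outer-microphone signal before playback. The cutoff frequency, gain value, and simple additive mixing are assumptions, not the patented filter design.

```python
# Hypothetical sketch: low-pass filter + gain on the bone conduction signal, then mix.
import numpy as np
from scipy.signal import butter, lfilter

fs = 16000
outer_mic = np.random.randn(fs)        # input audio signal from an outer microphone
bone_conduction = np.random.randn(fs)  # signal from the bone conduction sensor

b, a = butter(4, 1000 / (fs / 2), btype="low")  # keep the low-frequency portion
bone_filtered = lfilter(b, a, bone_conduction)

gain = 2.0                                       # assumed equalization gain
output = outer_mic + gain * bone_filtered        # audio signal sent to the speaker
```
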
  • Publication number: 20230060774
    Abstract: A device includes one or more processors configured to determine, based on data descriptive of two or more audio environments, a geometry of a mutual audio environment. The one or more processors are also configured to process audio data, based on the geometry of the mutual audio environment, for output at an audio device disposed in a first audio environment of the two or more audio environments.
    Type: Application
    Filed: August 31, 2021
    Publication date: March 2, 2023
    Inventors: S M Akramus SALEHIN, Lae-Hoon KIM, Xiaoxin ZHANG, Erik VISSER
  • Patent number: 11589153
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal (e.g., including both an external signal and a self-voice signal). The wearable device may detect the self-voice signal in the input audio signal based on a self-voice activity detection (SVAD) procedure, and may implement the described techniques based thereon. The wearable device may perform beamforming operations or other separation procedures to isolate the external signal and the self-voice signal from the input audio signal. The wearable device may apply a first filter to the external signal, and a second filter to the self-voice signal. The wearable device may then mix the filtered signals, and generate an output signal that sounds natural to the user.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: February 21, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Dongmei Wang, Fatemeh Saki, Taher Shahbazi Mirzahasanloo, Erik Visser, Rogerio Guedes Alves
  • Publication number: 20230035531
    Abstract: A second device includes a memory configured to store instructions and one or more processors configured to receive, from a first device, an indication of an audio class corresponding to an audio event.
    Type: Application
    Filed: July 25, 2022
    Publication date: February 2, 2023
    Inventors: Erik Visser, Fatemeh Saki, Yinyi Guo, Lae-Hoon Kim, Rogerio Guedes Alves, Hannes Pessentheiner
  • Publication number: 20230036986
    Abstract: A first device includes a memory configured to store instructions and one or more processors configured to receive audio signals from multiple microphones. The one or more processors are configured to process the audio signals to generate direction-of-arrival information corresponding to one or more sources of sound represented in one or more of the audio signals. The one or more processors are also configured to send, to a second device, data based on the direction-of-arrival information and a class or embedding associated with the direction-of-arrival information.
    Type: Application
    Filed: July 25, 2022
    Publication date: February 2, 2023
    Inventors: Erik VISSER, Fatemeh SAKI, Yinyi GUO, Lae-Hoon KIM, Rogerio Guedes ALVES, Hannes PESSENTHEINER
  • Publication number: 20230034450
    Abstract: A device includes a memory configured to store instructions. The device also includes one or more processors configured to execute the instructions to provide context and one or more items of interest corresponding to the context to a dependency network encoder to generate a semantic-based representation of the context. The one or more processors are also configured to provide the context to a data dependent encoder to generate a context-based representation. The one or more processors are further configured to combine the semantic-based representation and the context-based representation to generate a semantically-augmented representation of the context.
    Type: Application
    Filed: July 22, 2021
    Publication date: February 2, 2023
    Inventors: Arvind Krishna SRIDHAR, Ravi CHOUDHARY, Lae-Hoon KIM, Erik VISSER
  • Publication number: 20230026735
    Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including a first audio frame corresponding to a first output of a first microphone and a second audio frame corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a first noise-suppression network and a second noise-suppression network. The first noise-suppression network is configured to generate a first noise-suppressed audio frame and the second noise-suppression network is configured to generate a second noise-suppressed audio frame. The one or more processors are further configured to execute the instructions to provide the noise-suppressed audio frames to an attention-pooling network. The attention-pooling network is configured to generate an output noise-suppressed audio frame.
    Type: Application
    Filed: July 21, 2021
    Publication date: January 26, 2023
    Inventors: Vahid MONTAZERI, Van NGUYEN, Hannes PESSENTHEINER, Lae-Hoon KIM, Erik VISSER, Rogerio Guedes ALVES
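
The attention-pooling step in the abstract above can be sketched as follows: two noise-suppression branches each produce a noise-suppressed frame, a small scoring layer assigns a weight to each, and the output frame is their softmax-weighted combination. The linear stand-ins for the suppression networks and the scorer are assumptions.

```python
# Hypothetical sketch: two noise-suppression branches merged by attention pooling.
import torch
import torch.nn as nn

FRAME = 256
suppressor_1 = nn.Linear(FRAME, FRAME)  # stands in for the first noise-suppression network
suppressor_2 = nn.Linear(FRAME, FRAME)  # stands in for the second noise-suppression network
scorer = nn.Linear(FRAME, 1)            # attention score per branch output

frame_mic1 = torch.randn(1, FRAME)      # first audio frame (first microphone)
frame_mic2 = torch.randn(1, FRAME)      # second audio frame (second microphone)

branch_1 = suppressor_1(frame_mic1)     # first noise-suppressed audio frame
branch_2 = suppressor_2(frame_mic2)     # second noise-suppressed audio frame

stacked = torch.stack([branch_1, branch_2], dim=1)  # (1, 2, FRAME)
weights = torch.softmax(scorer(stacked), dim=1)     # attention weights over the branches
output_frame = (weights * stacked).sum(dim=1)       # output noise-suppressed audio frame
```
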
  • Patent number: 11533561
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Grant
    Filed: February 9, 2022
    Date of Patent: December 20, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
  • Publication number: 20220360891
    Abstract: A device includes one or more processors configured to execute instructions to determine a first phase based on a first audio signal of first audio signals and to determine a second phase based on a second audio signal of second audio signals. The one or more processors are also configured to execute the instructions to apply spatial filtering to selected audio signals of the first audio signals and the second audio signals to generate an enhanced audio signal. The one or more processors are further configured to execute the instructions to generate a first output signal including combining a magnitude of the enhanced audio signal with the first phase and to generate a second output signal including combining the magnitude of the enhanced audio signal with the second phase. The first output signal and the second output signal correspond to an audio zoomed signal.
    Type: Application
    Filed: May 10, 2021
    Publication date: November 10, 2022
    Inventors: Lae-Hoon KIM, Fatemeh SAKI, Yoon Mo YANG, Erik VISSER
  • Publication number: 20220310108
    Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.
    Type: Application
    Filed: March 23, 2021
    Publication date: September 29, 2022
    Inventors: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI
  • Publication number: 20220277744
    Abstract: A vehicle includes an interface device, an in-vehicle control unit, a functional unit, and processing circuitry. The interface device receives a spoken command to identify an in-cabin vehicle zone of two or more in-cabin vehicle zones of the vehicle, and receives background audio data concurrently with a portion of the spoken command. The in-vehicle control unit separates the background audio data from the spoken command, and selects which in-cabin vehicle zone of the two or more in-cabin vehicle zones is identified by the spoken command. The functional unit controls a function within the vehicle. The processing circuitry stores, to a command buffer, data processed from the received spoken command, and controls, based on the data processed from the received spoken command, the functional unit using audio input received from the selected in-cabin vehicle zone.
    Type: Application
    Filed: May 18, 2022
    Publication date: September 1, 2022
    Inventors: Asif Iqbal Mohammad, Sreekanth Narayanaswamy, Rishabh Tyagi, Erik Visser
  • Publication number: 20220272451
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Application
    Filed: February 9, 2022
    Publication date: August 25, 2022
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser