Patents by Inventor Erik Visser

Erik Visser has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11276415
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate an output representation of the audio data. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the output representation to be provided as a common input to each of the multiple speech application modules.
    Type: Grant
    Filed: April 9, 2020
    Date of Patent: March 15, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Sunkuk Moon, Erik Visser, Prajakt Kulkarni
  • Patent number: 11259119
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal from one or more outer microphones, an input audio signal from one or more inner microphones, and a bone conduction signal from a bone conduction sensor based on the input audio signals. The wearable device may filter the bone conduction signal based on a set of frequencies of the input audio signals, such as a low frequency portion of the input audio signals. For example, the wearable device may apply a filter to the bone conduction signal that accounts for an error in the input audio signals. The wearable device may add a gain to the filtered bone conduction signal and may equalize the filtered bone conduction signal based on the gain. The wearable device may output an audio signal to a speaker.
    Type: Grant
    Filed: October 6, 2020
    Date of Patent: February 22, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Rogerio Guedes Alves, Jacob Jon Bean, Erik Visser
  • Patent number: 11240058
    Abstract: A device to provide information to a visual interface that is mountable to a vehicle dashboard includes a memory configured to store device information indicative of controllable devices of a building and occupant data indicative of one or more occupants of the building. The device includes a processor configured to receive, in real-time, status information associated with the one or more occupants of the building. The status information includes at least one of dynamic location information or dynamic activity information. The processor is configured to generate an output to provide, at the visual interface device, a visual representation of at least a portion of the building and the status information associated with the one or more occupants. The processor is also configured to generate an instruction to adjust an operation of one or more devices of the controllable devices based on user input.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: February 1, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Ravi Choudhary, Yinyi Guo, Fatemeh Saki, Erik Visser
  • Patent number: 11212637
    Abstract: An apparatus includes a processor configured to receive one or more media signals associated with a scene. The processor is also configured to identify a spatial location in the scene for each source of the one or more media signals. The processor is further configured to identify audio content for each media signal of the one or more media signals. The processor is also configured to determine one or more candidate spatial locations in the scene based on the identified spatial locations. The processor is further configured to generate audio to playback as virtual sounds that originate from the one or more candidate spatial locations.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: December 28, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Yinyi Guo, Lae-Hoon Kim, Dongmei Wang, Erik Visser
  • Publication number: 20210343306
    Abstract: Systems, methods and computer-readable media are provided for speech enhancement using a hybrid neural network. An example process can include receiving, by a first neural network portion of the hybrid neural network, audio data and reference data, the audio data including speech data, noise data, and echo data; filtering, by the first neural network portion, a portion of the audio data based on adapted coefficients of the first neural network portion, the portion of the audio data including the noise data and/or echo data; based on the filtering, generating, by the first neural network portion, filtered audio data including the speech data and an unfiltered portion of the noise data and/or echo data; and based on the filtered audio data and the reference data, extracting, by a second neural network portion of the hybrid neural network, the speech data from the filtered audio data.
    Type: Application
    Filed: April 28, 2021
    Publication date: November 4, 2021
    Inventors: Erik VISSER, Vahid MONTAZERI, Shuhua ZHANG, Lae-Hoon KIM
  • Publication number: 20210319801
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate an output representation of the audio data. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the output representation to be provided as a common input to each of the multiple speech application modules.
    Type: Application
    Filed: April 9, 2020
    Publication date: October 14, 2021
    Inventors: Lae-Hoon KIM, Sunkuk MOON, Erik VISSER, Prajakt KULKARNI
  • Publication number: 20210312943
    Abstract: A device to perform target sound detection includes one or more processors. The one or more processors include a buffer configured to store audio data and a target sound detector. The target sound detector includes a first stage and a second stage. The first stage includes a binary target sound classifier configured to process the audio data. The first stage is configured to activate the second stage in response to detection of a target sound. The second stage is configured to receive the audio data from the buffer in response to the detection of the target sound.
    Type: Application
    Filed: April 1, 2020
    Publication date: October 7, 2021
    Inventors: Prajakt KULKARNI, Yinyi GUO, Erik VISSER
  • Publication number: 20210304777
    Abstract: A device includes a memory configured to store untransformed ambisonic coefficients at different time segments. The device also includes one or more processors configured to obtain the untransformed ambisonic coefficients at the different time segments, where the untransformed ambisonic coefficients at the different time segments represent a soundfield at the different time segments. The one or more processors are also configured to apply one adaptive network, based on a constraint, to the untransformed ambisonic coefficients at the different time segments to generate transformed ambisonic coefficients at the different time segments, wherein the transformed ambisonic coefficients at the different time segments represent a modified soundfield at the different time segments, that was modified based on the constraint.
    Type: Application
    Filed: March 23, 2021
    Publication date: September 30, 2021
    Inventors: Lae-Hoon KIM, Shankar THAGADUR SHIVAPPA, S M Akramus SALEHIN, Shuhua ZHANG, Erik VISSER
  • Patent number: 11094316
    Abstract: A device includes a memory configured to store category labels associated with categories of a natural language processing library. A processor is configured to analyze input audio data to generate a text string and to perform natural language processing on at least the text string to generate an output text string including an action associated with a first device, a speaker, a location, or a combination thereof. The processor is configured to compare the input audio data to audio data of the categories to determine whether the input audio data matches any of the categories and, in response to determining that the input audio data does not match any of the categories: create a new category label, associate the new category label with at least a portion of the output text string, update the categories with the new category label, and generate a notification indicating the new category label.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: August 17, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Erik Visser, Fatemeh Saki, Yinyi Guo, Sunkuk Moon, Lae-Hoon Kim, Ravi Choudhary
  • Publication number: 20210204053
    Abstract: Methods, systems, and devices for signal processing are described. Generally, as provided for by the described techniques, a wearable device may receive an input audio signal (e.g., including both an external signal and a self-voice signal). The wearable device may detect the self-voice signal in the input audio signal based on a self-voice activity detection (SVAD) procedure, and may implement the described techniques based thereon. The wearable device may perform beamforming operations or other separation procedures to isolate the external signal and the self-voice signal from the input audio signal. The wearable device may apply a first filter to the external signal, and a second filter to the self-voice signal. The wearable device may then mix the filtered signals, and generate an output signal that sounds natural to the user.
    Type: Application
    Filed: March 15, 2021
    Publication date: July 1, 2021
    Inventors: Lae-Hoon KIM, Dongmei WANG, Fatemeh SAKI, Taher SHAHBAZI MIRZAHASANLOO, Erik VISSER, Rogerio Guedes ALVES
  • Publication number: 20210158837
    Abstract: A device includes a processor configured to receive audio data samples and provide the audio data samples to a first neural network to generate a first output corresponding to a first set of sound classes. The processor is further configured to provide the audio data samples to a second neural network to generate a second output corresponding to a second set of sound classes. A second count of classes of the second set of sound classes is greater than a first count of classes of the first set of sound classes. The processor is also configured to provide the first output to a neural adapter to generate a third output corresponding to the second set of sound classes. The processor is further configured to provide the second output and the third output to a merger adapter to generate sound event identification data based on the audio data samples.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 27, 2021
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER, Eunjeong KOH
  • Patent number: 11017783
    Abstract: A device includes a processor configured to determine a feature vector based on an utterance and to determine a first embedding vector by processing the feature vector using a trained embedding network. The processor is configured to determine a first distance metric based on distances between the first embedding vector and each embedding vector of a speaker template. The processor is configured to determine, based on the first distance metric, that the utterance is verified to be from a particular user. The processor is configured to, based on a comparison of a first particular distance metric associated with the first embedding vector to a second distance metric associated with a first test embedding vector of the speaker template, generate an updated speaker template by adding the first embedding vector as a second test embedding vector and removing the first test embedding vector from test embedding vectors of the speaker template.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: May 25, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Sunkuk Moon, Bicheng Jiang, Erik Visser
  • Publication number: 20210151064
    Abstract: A device includes one or more processors configured to perform signal processing including a linear transformation and a non-linear transformation of an input signal to generate a reference target signal. The reference target signal has a linear component associated with the linear transformation and a non-linear component associated with the non-linear transformation. The one or more processors are also configured to perform linear filtering of the input signal by controlling adaptation of the linear filtering to generate an output signal that substantially matches the linear component of the reference target signal.
    Type: Application
    Filed: November 15, 2019
    Publication date: May 20, 2021
    Inventors: Lae-Hoon KIM, Dongmei WANG, Cheng-Yu HUNG, Erik VISSER
  • Patent number: 10964335
    Abstract: Methods, systems, and devices for auditory enhancement are described. A device may receive a respective auditory signal at each of a set of microphones, where each auditory signal includes a respective representation of a target auditory component and one or more noise artifacts. The device may identify a directionality associated with a source of the target auditory component (e.g., based on an arrangement of the multiple microphones). The device may determine a distribution function for the target auditory component based at least in part on the directionality associated with the source and on the received plurality of auditory signals. The device may generate an estimate of the target auditory component based at least in part on the distribution function and output the estimate of the target auditory component.
    Type: Grant
    Filed: April 9, 2018
    Date of Patent: March 30, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Shuhua Zhang, Erik Visser
  • Patent number: 10957334
    Abstract: Methods, systems, computer-readable media, and apparatuses for signal enhancement are presented. One example of such an apparatus includes a receiver configured to produce a remote speech signal from information carried by a wireless signal; a signal canceller configured to perform a signal cancellation operation on a local speech signal to generate a room response; and a filter configured to filter the remote speech signal according to the room response to produce a filtered speech signal. In this example, the signal cancellation operation is based on the remote speech signal as a reference signal.
    Type: Grant
    Filed: December 18, 2018
    Date of Patent: March 23, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Sharon Kaziunas, Anne Katrin Konertz, Erik Visser, Cheng-Yu Hung, Shuhua Zhang, Fatemeh Saki, Dongmei Wang
  • Patent number: 10951975
    Abstract: Methods, systems, and devices for signal processing are described. Generally, in one example as provided for by the described techniques, a wearable device includes a processor configured to retrieve a plurality of external microphone signals that includes audio sound from outside of the device from a memory; to separate, based on at least information from an internal microphone signal, a self-voice component from a background component; to perform a first listen-through operation on the separated self-voice component to produce a first listen-through signal; and to produce an output audio signal that is based on at least the first listen-through signal, wherein the output audio signal includes an audio zoom signal that includes audio sound of the plurality of external microphone signals.
    Type: Grant
    Filed: June 8, 2020
    Date of Patent: March 16, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Dongmei Wang, Fatemeh Saki, Taher Shahbazi Mirzahasanloo, Erik Visser, Rogerio Guedes Alves
  • Patent number: 10909988
    Abstract: An electronic device includes a display, wherein the display is configured to present a user interface, wherein the user interface comprises a coordinate system. The coordinate system corresponds to physical coordinates. The display is configured to present a sector selection feature that allows selection of at least one sector of the coordinate system. The at least one sector corresponds to captured audio from multiple microphones. The sector selection may also include an audio signal indicator. The electronic device includes operation circuitry coupled to the display. The operation circuitry is configured to perform an audio operation on the captured audio corresponding to the audio signal indicator based on the sector selection.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: February 2, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Erik Visser, Phuong Lam Ton, Jeremy Patrick Toman, Jeffrey Clinton Shaw
  • Publication number: 20210012770
    Abstract: A device for multi-modal user input includes a processor configured to process first data received from a first input device. The first data indicates a first input from a user based on a first input mode. The first input corresponds to a command. The processor is configured to send a feedback message to an output device based on processing the first data. The feedback message instructs the user to provide, based on a second input mode that is different from the first input mode, a second input that identifies a command associated with the first input. The processor is configured to receive second data from a second input device, the second data indicating the second input, and to update a mapping to associate the first input to the command identified by the second input.
    Type: Application
    Filed: November 15, 2019
    Publication date: January 14, 2021
    Inventors: Ravi Choudhary, Lae-Hoon Kim, Sunkuk Moon, Yinyi Guo, Fatemeh Saki, Erik Visser
  • Publication number: 20210011887
    Abstract: A device for activity tracking includes a memory and one or more processors. The memory is configured to store an activity log. The one or more processors are configured to update the activity log based on activity data. The activity data is received from a second device. The one or more processors are also configured to, responsive to receiving a natural language query, generate a query response based on the activity log.
    Type: Application
    Filed: September 27, 2019
    Publication date: January 14, 2021
    Inventors: Erik VISSER, Rehana MAHFUZ, Ravi CHOUDHARY, Lae-Hoon KIM, Sunkuk MOON, Yinyi GUO, Fatemeh SAKI
  • Patent number: 10878831
    Abstract: An apparatus includes a speech processing engine configured to receive data corresponding to speech and to determine whether a first characteristic associated with the speech differs from a reference characteristic by at least a threshold amount. The apparatus further includes a selection circuit responsive to the speech processing engine. The selection circuit is configured to select a particular speech codebook from among a plurality of speech codebooks based on the first characteristic differing from the reference characteristic by at least the threshold amount. The particular speech codebook is associated with the first characteristic.
    Type: Grant
    Filed: January 12, 2017
    Date of Patent: December 29, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Yinyi Guo, Erik Visser