Patents by Inventor Sunkuk MOON

Sunkuk MOON has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240087597
    Abstract: A device includes one or more processors configured to process an input audio spectrum of input speech to detect a first characteristic associated with the input speech. The one or more processors are also configured to select, based at least in part on the first characteristic, one or more reference embeddings from among multiple reference embeddings. The one or more processors are further configured to process a representation of source speech, using the one or more reference embeddings, to generate an output audio spectrum of output speech.
    Type: Application
    Filed: September 13, 2022
    Publication date: March 14, 2024
    Inventors: Kyungguen BYUN, Sunkuk MOON, Erik VISSER
  • Patent number: 11869478
    Abstract: A device includes one or more processors configured to receive an input audio signal. The one or more processors are also configured to process the input audio signal based on a combined representation of multiple sound sources to generate an output audio signal. The combined representation is used to selectively retain or remove sounds of the multiple sound sources from the input audio signal. The one or more processors are further configured to provide the output audio signal to a second device.
    Type: Grant
    Filed: March 18, 2022
    Date of Patent: January 9, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Siddhartha Goutham Swaminathan, Sunkuk Moon, Shuhua Zhang, Erik Visser
  • Publication number: 20230326477
    Abstract: A device to perform speech enhancement includes one or more processors configured to process image data to detect at least one of an emotion, a speaker characteristic, or a noise type. The one or more processors are also configured to generate context data based at least in part on the at least one of the emotion, the speaker characteristic, or the noise type. The one or more processors are further configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and the context data to generate output spectral data that represents a speech enhanced version of the input signal.
    Type: Application
    Filed: June 14, 2023
    Publication date: October 12, 2023
    Inventors: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI
  • Publication number: 20230298561
    Abstract: A device includes one or more processors configured to receive an input audio signal. The one or more processors are also configured to process the input audio signal based on a combined representation of multiple sound sources to generate an output audio signal. The combined representation is used to selectively retain or remove sounds of the multiple sound sources from the input audio signal. The one or more processors are further configured to provide the output audio signal to a second device.
    Type: Application
    Filed: March 18, 2022
    Publication date: September 21, 2023
    Inventors: Siddhartha Goutham SWAMINATHAN, Sunkuk MOON, Shuhua ZHANG, Erik VISSER
  • Publication number: 20230300527
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data. The speech processing network also includes one or more network layers configured to process the audio data to generate a network output. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the network output to be provided as a common input to each of the multiple speech application modules.
    Type: Application
    Filed: May 26, 2023
    Publication date: September 21, 2023
    Inventors: Lae-Hoon KIM, Sunkuk MOON, Erik VISSER, Prajakt KULKARNI
  • Patent number: 11715480
    Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: August 1, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Kyungguen Byun, Shuhua Zhang, Lae-Hoon Kim, Erik Visser, Sunkuk Moon, Vahid Montazeri
  • Patent number: 11700484
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate a network output. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the network output to be provided as a common input to each of the multiple speech application modules. A first speech application module corresponds to a speaker verifier, and a second speech application module corresponds to a speech recognition network.
    Type: Grant
    Filed: February 10, 2022
    Date of Patent: July 11, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Sunkuk Moon, Erik Visser, Prajakt Kulkarni
  • Patent number: 11676571
    Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: June 13, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Kyungguen Byun, Sunkuk Moon, Shuhua Zhang, Vahid Montazeri, Lae-Hoon Kim, Erik Visser
  • Patent number: 11626104
    Abstract: A device includes processors configured to determine, in a first power mode, whether an audio stream corresponds to speech of at least two talkers. The processors are configured to, based on determining that the audio stream corresponds to speech of at least two talkers, analyze, in a second power mode, audio feature data of the audio stream to generate a segmentation result. The processors are configured to perform a comparison of a plurality of user speech profiles to an audio feature data set of a plurality of audio feature data sets of a talker-homogenous audio segment to determine whether the audio feature data set matches any of the user speech profiles. The processors are configured to, based on determining that the audio feature data set does not match any of the plurality of user speech profiles, generate a user speech profile based on the plurality of audio feature data sets.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: April 11, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Soo Jin Park, Sunkuk Moon, Lae-Hoon Kim, Erik Visser
  • Publication number: 20220310108
    Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.
    Type: Application
    Filed: March 23, 2021
    Publication date: September 29, 2022
    Inventors: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI
  • Publication number: 20220230623
    Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.
    Type: Application
    Filed: January 21, 2021
    Publication date: July 21, 2022
    Applicant: QUALCOMM Incorporated
    Inventors: Kyungguen BYUN, Sunkuk MOON, Shuhua ZHANG, Vahid MONTAZERI, Lae-Hoon KIM, Erik VISSER
  • Publication number: 20220180859
    Abstract: A device includes processors configured to determine, in a first power mode, whether an audio stream corresponds to speech of at least two talkers. The processors are configured to, based on determining that the audio stream corresponds to speech of at least two talkers, analyze, in a second power mode, audio feature data of the audio stream to generate a segmentation result. The processors are configured to perform a comparison of a plurality of user speech profiles to an audio feature data set of a plurality of audio feature data sets of a talker-homogenous audio segment to determine whether the audio feature data set matches any of the user speech profiles. The processors are configured to, based on determining that the audio feature data set does not match any of the plurality of user speech profiles, generate a user speech profile based on the plurality of audio feature data sets.
    Type: Application
    Filed: December 8, 2020
    Publication date: June 9, 2022
    Inventors: Soo Jin PARK, Sunkuk MOON, Lae-Hoon KIM, Erik VISSER
  • Patent number: 11348581
    Abstract: A device for multi-modal user input includes a processor configured to process first data received from a first input device. The first data indicates a first input from a user based on a first input mode. The first input corresponds to a command. The processor is configured to send a feedback message to an output device based on processing the first data. The feedback message instructs the user to provide, based on a second input mode that is different from the first input mode, a second input that identifies a command associated with the first input. The processor is configured to receive second data from a second input device, the second data indicating the second input, and to update a mapping to associate the first input to the command identified by the second input.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: May 31, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Ravi Choudhary, Lae-Hoon Kim, Sunkuk Moon, Yinyi Guo, Fatemeh Saki, Erik Visser
  • Publication number: 20220165285
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate a network output. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the network output to be provided as a common input to each of the multiple speech application modules. A first speech application module corresponds to a speaker verifier, and a second speech application module corresponds to a speech recognition network.
    Type: Application
    Filed: February 10, 2022
    Publication date: May 26, 2022
    Inventors: Lae-Hoon KIM, Sunkuk MOON, Erik VISSER, Prajakt KULKARNI
  • Patent number: 11276415
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate an output representation of the audio data. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the output representation to be provided as a common input to each of the multiple speech application modules.
    Type: Grant
    Filed: April 9, 2020
    Date of Patent: March 15, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Sunkuk Moon, Erik Visser, Prajakt Kulkarni
  • Publication number: 20210319801
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate an output representation of the audio data. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the output representation to be provided as a common input to each of the multiple speech application modules.
    Type: Application
    Filed: April 9, 2020
    Publication date: October 14, 2021
    Inventors: Lae-Hoon KIM, Sunkuk MOON, Erik VISSER, Prajakt KULKARNI
  • Patent number: 11094316
    Abstract: A device includes a memory configured to store category labels associated with categories of a natural language processing library. A processor is configured to analyze input audio data to generate a text string and to perform natural language processing on at least the text string to generate an output text string including an action associated with a first device, a speaker, a location, or a combination thereof. The processor is configured to compare the input audio data to audio data of the categories to determine whether the input audio data matches any of the categories and, in response to determining that the input audio data does not match any of the categories: create a new category label, associate the new category label with at least a portion of the output text string, update the categories with the new category label, and generate a notification indicating the new category label.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: August 17, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Erik Visser, Fatemeh Saki, Yinyi Guo, Sunkuk Moon, Lae-Hoon Kim, Ravi Choudhary
  • Patent number: 11017783
    Abstract: A device includes a processor configured to determine a feature vector based on an utterance and to determine a first embedding vector by processing the feature vector using a trained embedding network. The processor is configured to determine a first distance metric based on distances between the first embedding vector and each embedding vector of a speaker template. The processor is configured to determine, based on the first distance metric, that the utterance is verified to be from a particular user. The processor is configured to, based on a comparison of a first particular distance metric associated with the first embedding vector to a second distance metric associated with a first test embedding vector of the speaker template, generate an updated speaker template by adding the first embedding vector as a second test embedding vector and removing the first test embedding vector from test embedding vectors of the speaker template.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: May 25, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Sunkuk Moon, Bicheng Jiang, Erik Visser
  • Publication number: 20210012770
    Abstract: A device for multi-modal user input includes a processor configured to process first data received from a first input device. The first data indicates a first input from a user based on a first input mode. The first input corresponds to a command. The processor is configured to send a feedback message to an output device based on processing the first data. The feedback message instructs the user to provide, based on a second input mode that is different from the first input mode, a second input that identifies a command associated with the first input. The processor is configured to receive second data from a second input device, the second data indicating the second input, and to update a mapping to associate the first input to the command identified by the second input.
    Type: Application
    Filed: November 15, 2019
    Publication date: January 14, 2021
    Inventors: Ravi CHOUDHARY, Lae-Hoon KIM, Sunkuk MOON, Yinyi GUO, Fatemeh SAKI, Erik VISSER
  • Publication number: 20210011887
    Abstract: A device for activity tracking includes a memory and one or more processors. The memory is configured to store an activity log. The one or more processors are configured to update the activity log based on activity data. The activity data is received from a second device. The one or more processors are also configured to, responsive to receiving a natural language query, generate a query response based on the activity log.
    Type: Application
    Filed: September 27, 2019
    Publication date: January 14, 2021
    Inventors: Erik VISSER, Rehana MAHFUZ, Ravi CHOUDHARY, Lae-Hoon KIM, Sunkuk MOON, Yinyi GUO, Fatemeh SAKI