Patents Assigned to SoundHound, Inc.
  • Publication number: 20240135927
    Abstract: A system detects a period of non-voice activity and compares its duration to a cutoff period. The system adapts the cutoff period by parsing a user's previously recognized speech, which is stored on the user's device or on the system that detects the voice activity, to determine, according to a model such as a machine-learned model, the probability that the speech recognized so far is a prefix of a longer complete utterance. The cutoff period is longer when a parse of the previously recognized speech, which is based on the user profile, has a high probability of being a prefix of a longer utterance.
    Type: Application
    Filed: January 2, 2024
    Publication date: April 25, 2024
    Applicant: SoundHound, Inc.
    Inventors: Patricia Pozon AGUAYO, Jennifer Hee Young ZHANG, Jonah PROBELL
  • Publication number: 20240135922
    Abstract: A method includes recognizing words comprised by a first utterance; interpreting the recognized words according to a grammar comprised by a domain; from the interpreting of the recognized words, determining a timeout period for the first utterance based on the domain of the first utterance; detecting end of voice activity in the first utterance; and executing an instruction once the amount of time elapsed after detecting end of voice activity of the first utterance exceeds the timeout period, the executed instruction based at least in part on interpreting the recognized words.
    Type: Application
    Filed: October 18, 2022
    Publication date: April 25, 2024
    Applicant: SoundHound, Inc.
    Inventor: Victor LEITMAN
  • Publication number: 20240127803
    Abstract: A voice morphing model can transform diverse voices to one or a small number of target voices. An acoustic model can be trained for high accuracy on the target voices. Speech recognition on diverse voices can be performed by morphing the audio to a target voice and then performing recognition on audio with the target voice. The morphing model and an acoustic model for speech recognition can be trained separately or jointly. A source of requests for speech recognition can pass audio and a voiceprint with requests. Speech recognition can run with improved accuracy by biasing an acoustic model for the voice in the audio using the voiceprint. The audio can be used to calculate a new voiceprint, which can be used to update the voiceprint included with the audio. The updated voiceprint can be sent back to the source and then used with future speech recognition requests.
    Type: Application
    Filed: October 12, 2022
    Publication date: April 18, 2024
    Applicant: SoundHound, Inc.
    Inventor: Keyvan MOHAJER
  • Patent number: 11935029
    Abstract: A virtual assistant processes natural language expressions according to grammar rules created by domain providers. The virtual assistant uniquely identifies each of a multiplicity of users and stores values of grammar slots filled by natural language expressions from each user. The virtual assistant stores histories of slot values and computes statistics from the history. The virtual assistant provider, or a classification client, provides values of attributes of users as labels for a machine learning classification algorithm. The algorithm processes the grammar slot values and labels to compute probability distributions for unknown attribute values of users. A network effect of users and domain grammars makes the virtual assistant useful and provides increasing amounts of data that improve classification accuracy and usefulness.
    Type: Grant
    Filed: September 5, 2018
    Date of Patent: March 19, 2024
    Assignee: SoundHound, Inc.
    Inventors: Joe Aung, Jonah Probell
  • Publication number: 20240054195
    Abstract: Actions are authorized by computing a confidence score that exceeds a threshold. The confidence score is based on a match between metadata about requests and fields in corresponding database records. The confidence score weights matches by the dependability of the metadata for authentication. The confidence score is further based on the closeness of a sample of speech audio to a stored voiceprint. Additional identification may be required for authorization. The confidence score requirement may be relaxed based on identification in a buffer of recent action requests.
    Type: Application
    Filed: August 9, 2022
    Publication date: February 15, 2024
    Applicant: SoundHound, Inc.
    Inventors: Ahmadul HASSAN, James HOM
  • Publication number: 20240046923
    Abstract: In an interaction system, a server can obtain a setting expression including a query and a condition for functioning as a virtual assistant, store the query and the condition in a memory, and deliver an inquiry expression including the query in response to occurrence of a situation specified by the condition. The setting expression can be by voice or natural language. Processes can be different for different users and can be based on domain. The inquiry expression includes a question asking the user for an affirmative response before performing the inquiry. Implementations can be adopted in or near a vehicle.
    Type: Application
    Filed: July 28, 2023
    Publication date: February 8, 2024
    Applicant: SoundHound, Inc.
    Inventor: Masaki NAITO
  • Publication number: 20240021189
    Abstract: A discriminator trained on labeled samples of speech can compute probabilities of voice properties. A speech synthesis generative neural network that takes in text and continuous scale values of voice properties is trained to synthesize speech audio that the discriminator will infer as matching the values of the input voice properties. Voice parameters can include speaker voice parameters, accents, and attitudes, among others. Training can be done by transfer learning from an existing neural speech synthesis model or such a model can be trained with a loss function that considers speech and parameter values. A graphical user interface can allow voice designers for products to synthesize speech with a desired voice or generate a speech synthesis engine with frozen voice parameters. A vector of parameters can be used for comparison to previously registered voices in databases such as ones for trademark registration.
    Type: Application
    Filed: July 14, 2023
    Publication date: January 18, 2024
    Applicant: SoundHound, Inc.
    Inventor: Andrew RICHARDS
  • Patent number: 11862162
    Abstract: A processing system detects a period of non-voice activity and compares its duration to a cutoff period. The system adapts the cutoff period based on parsing previously-recognized speech to determine, according to a model, such as a machine-learned model, the probability that the speech recognized so far is a prefix to a longer complete utterance. The cutoff period is longer when a parse of previously recognized speech has a high probability of being a prefix of a longer utterance.
    Type: Grant
    Filed: March 18, 2022
    Date of Patent: January 2, 2024
    Assignee: SoundHound, Inc.
    Inventors: Patricia Pozon Aguayo, Jennifer Hee Young Zhang, Jonah Probell
  • Patent number: 11836453
    Abstract: Aspects include methods, systems, and computer-program products providing virtual assistant domain functionality. A natural language query including one or more words is received. A collection of natural language modules is accessed. The natural language modules in the collection are configured to process sets of natural language queries. A natural language module, from the collection of natural language modules, is identified to interpret the natural language query. An interpretation of the natural language query is computed using the identified natural language module. A response to the natural language query is returned using the computed interpretation.
    Type: Grant
    Filed: July 22, 2021
    Date of Patent: December 5, 2023
    Assignee: SoundHound, Inc.
    Inventors: Kamyar Mohajer, Keyvan Mohajer, Bernard Mont-Reynaud, Pranav Singh
  • Publication number: 20230386458
    Abstract: Methods and systems for pre-wakeword speech processing are disclosed. Speech audio, comprising command speech spoken before a wakeword, may be stored in a buffer in oldest to newest order. Upon detection of the wakeword, reverse acoustic models and language models, such as reverse automatic speech recognition (R-ASR), can be applied to the buffered audio, in newest to oldest order, starting from before the wakeword. The speech is converted into a sequence of words. Natural language grammar models, such as natural language understanding (NLU), can be applied to match the sequence of words to a complete command, the complete command being associated with invoking a computer operation.
    Type: Application
    Filed: May 27, 2022
    Publication date: November 30, 2023
    Applicant: SoundHound, Inc.
    Inventors: Karl STAHL, Bernard MONT-REYNAUD
  • Publication number: 20230386459
    Abstract: The application provides an apparatus, platform, method and medium for intention importance inference. The apparatus includes an interface configured to receive user-related information; and a processor coupled to the interface and configured to: extract data related to different aspects of a user from the user-related information; generate a plurality of intention probes based on the data related to different aspects of the user, each intention probe comprising an intention and associated data items; infer an importance of each intention probe by calculating a score of each associated data item of the intention probe based on the data related to different aspects of the user; and provide information associated with an intention probe with a highest importance.
    Type: Application
    Filed: August 18, 2022
    Publication date: November 30, 2023
    Applicant: SoundHound, Inc.
    Inventor: Chong Wang
  • Publication number: 20230352000
    Abstract: A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition is provided. The method includes using an encoder model to encode a sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.
    Type: Application
    Filed: July 6, 2023
    Publication date: November 2, 2023
    Applicant: SoundHound, Inc.
    Inventors: Zizu GOWAYYED, Keyvan MOHAJER
  • Publication number: 20230353826
    Abstract: Various approaches relate to user-defined content filtering in media playing devices of undesirable content represented in stored and real-time content from content providers. For example, video, image, and/or audio data can be analyzed to identify and classify content included in the data using various classification models and object and text recognition approaches. Thereafter, the identification and classification can be used to control presentation and/or access to the content and/or portions of the content. For example, based on the classification, portions of the content can be modified (e.g., replaced, removed, degraded, etc.) using one or more techniques (e.g., media replacement, media removal, media degradation, etc.) and then presented.
    Type: Application
    Filed: July 6, 2023
    Publication date: November 2, 2023
    Applicant: SoundHound, Inc.
    Inventors: Thor S. KHOV, Terry KONG
  • Publication number: 20230325358
    Abstract: Systems and methods for searching databases by sound data input are provided herein. A service provider may have a need to make their database(s) searchable through search technology. However, the service provider may not have the resources to implement such search technology. The search technology may allow for search queries using sound data input. The technology described herein addresses the service provider’s need by providing a search technology that returns search results quickly and accurately. In further embodiments, systems and methods to monetize those search results are also described herein.
    Type: Application
    Filed: June 6, 2023
    Publication date: October 12, 2023
    Applicant: SoundHound, Inc.
    Inventor: Keyvan Mohajer
  • Patent number: 11776533
    Abstract: A method of building a natural language understanding application is provided. The method includes receiving at least one electronic record containing programming code and creating executable code from the programming code. Further, the executable code, when executed by a processor, causes the processor to create a parse and an interpretation of a sequence of input tokens, the programming code includes an interpret-block and the interpret-block includes an interpret-statement. Additionally, the interpret-statement includes a pattern expression and the interpret-statement includes an action statement.
    Type: Grant
    Filed: April 8, 2021
    Date of Patent: October 3, 2023
    Assignee: SoundHound, Inc.
    Inventors: Bernard Mont-Reynaud, Seyed M. Emami, Chris Wilson, Keyvan Mohajer
  • Publication number: 20230298607
    Abstract: A system and a method are disclosed for a machine learned audio morpher that is trained such that the voice characteristics of a user spoken phrase are replaced with those of a target speaker, which removes and/or reduces the user identifiable information for the spoken phrase. Training can be performed by a user and a target speaker speaking the same or similar phrases and training the audio morpher to minimize the differences between the target speaker phrase and a morphed user phrase.
    Type: Application
    Filed: March 15, 2022
    Publication date: September 21, 2023
    Applicant: SoundHound, Inc.
    Inventors: Ziming YIN, Zili LI
  • Publication number: 20230289530
    Abstract: A computer system ingests a catalog of a plurality of items. The catalog is specific to a particular domain and includes names for individual items of the plurality of items. One or more attributes are respectively associated with the individual items of the plurality of items. A specialist grammar specific to the particular domain of the catalog is obtained, and programming language code to interpret natural language input related to the catalog is generated using the specialist grammar, the names for the individual items of the plurality of items, and their associated one or more attributes.
    Type: Application
    Filed: April 8, 2022
    Publication date: September 14, 2023
    Applicant: SoundHound, Inc.
    Inventors: Joe Kyaw Soe AUNG, Vincent GARCIA, Junru REN
  • Patent number: 11741941
    Abstract: A discriminator trained on labeled samples of speech can compute probabilities of voice properties. A speech synthesis generative neural network that takes in text and continuous scale values of voice properties is trained to synthesize speech audio that the discriminator will infer as matching the values of the input voice properties. Voice parameters can include speaker voice parameters, accents, and attitudes, among others. Training can be done by transfer learning from an existing neural speech synthesis model or such a model can be trained with a loss function that considers speech and parameter values. A graphical user interface can allow voice designers for products to synthesize speech with a desired voice or generate a speech synthesis engine with frozen voice parameters. A vector of parameters can be used for comparison to previously registered voices in databases such as ones for trademark registration.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: August 29, 2023
    Assignee: SoundHound, Inc.
    Inventor: Andrew Richards
  • Patent number: 11741943
    Abstract: A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition is provided. The method includes using an encoder model to encode a sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.
    Type: Grant
    Filed: April 7, 2021
    Date of Patent: August 29, 2023
    Assignee: SoundHound, Inc.
    Inventors: Zizu Gowayyed, Keyvan Mohajer
  • Patent number: 11736769
    Abstract: Various approaches relate to user-defined content filtering in media playing devices of undesirable content represented in stored and real-time content from content providers. For example, video, image, and/or audio data can be analyzed to identify and classify content included in the data using various classification models and object and text recognition approaches. Thereafter, the identification and classification can be used to control presentation and/or access to the content and/or portions of the content. For example, based on the classification, portions of the content can be modified (e.g., replaced, removed, degraded, etc.) using one or more techniques (e.g., media replacement, media removal, media degradation, etc.) and then presented.
    Type: Grant
    Filed: April 12, 2021
    Date of Patent: August 22, 2023
    Assignee: SoundHound, Inc.
    Inventors: Thor S. Khov, Terry Kong
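
The adaptive cutoff described in the abstracts for Patent number 11862162 and Publication number 20240135927 can be illustrated with a minimal sketch. This is not the patented implementation: the word list standing in for the "model, such as a machine-learned model", the function names, and the numeric limits are all assumptions made for illustration, since the abstracts give no concrete values.

```python
# Hypothetical tuning constants; the abstracts specify no numeric values.
MIN_CUTOFF_S = 0.4   # cutoff when the speech so far looks complete
MAX_CUTOFF_S = 2.0   # cutoff when the speech so far looks like a prefix

# Toy stand-in for the prefix-probability model (an assumption, not the
# model from the patent): words that rarely end a complete utterance.
INCOMPLETE_ENDINGS = {"to", "the", "a", "and", "for", "in", "of", "with"}


def prefix_probability(words):
    """Estimate the probability that `words` is a prefix of a longer utterance."""
    if not words:
        return 1.0  # nothing recognized yet, so keep waiting
    return 0.9 if words[-1].lower() in INCOMPLETE_ENDINGS else 0.1


def cutoff_period(words):
    """Scale the cutoff linearly with the prefix probability."""
    p = prefix_probability(words)
    return MIN_CUTOFF_S + p * (MAX_CUTOFF_S - MIN_CUTOFF_S)


def utterance_ended(words, silence_s):
    """Compare the duration of non-voice activity to the adapted cutoff."""
    return silence_s >= cutoff_period(words)


if __name__ == "__main__":
    for text in ("navigate to", "what time is it"):
        words = text.split()
        print(text, "->", round(cutoff_period(words), 2), "s cutoff,",
              "ended" if utterance_ended(words, 1.0) else "still listening",
              "after 1.0 s of silence")
```

In this sketch, one second of silence ends "what time is it" but not "navigate to", because the latter is scored as a likely prefix and therefore gets a longer cutoff.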