Patents Assigned to SoundHound, Inc.
  • Patent number: 11205051
    Abstract: A method of predicting a person's interests is provided. The method includes receiving geolocation information about a user location, reading, from a database of interpretations, at least one interpretation of an expression made in close proximity to the location, reading, from a database of ad bids, a plurality of ad bids comprising interpretations, comparing the interpretation from the database to the interpretations of the ad bids to select a most valuable ad bid having an interpretation that matches the interpretation of an expression made in close proximity to the location, and presenting an ad associated with the most valuable ad bid, wherein the interpretation is from a natural language expression.
    Type: Grant
    Filed: January 2, 2019
    Date of Patent: December 21, 2021
    Assignee: SoundHound, Inc.
    Inventors: Kheng Khov, Pranav Singh, Bernard Mont-Reynaud, Jonah Probell
  • Publication number: 20210390944
    Abstract: A discriminator trained on labeled samples of speech can compute probabilities of voice properties. A speech synthesis generative neural network that takes in text and continuous scale values of voice properties is trained to synthesize speech audio that the discriminator will infer as matching the values of the input voice properties. Voice parameters can include speaker voice parameters, accents, and attitudes, among others. Training can be done by transfer learning from an existing neural speech synthesis model or such a model can be trained with a loss function that considers speech and parameter values. A graphical user interface can allow voice designers for products to synthesize speech with a desired voice or generate a speech synthesis engine with frozen voice parameters. A vector of parameters can be used for comparison to previously registered voices in databases such as ones for trademark registration.
    Type: Application
    Filed: June 7, 2021
    Publication date: December 16, 2021
    Applicant: SoundHound, Inc.
    Inventor: Andrew RICHARDS
  • Publication number: 20210357594
    Abstract: The present invention extends to methods, systems, and computer program products for interpreting queries according to preferences. Multi-domain natural language understanding systems can support a variety of different types of clients. Queries can be received and interpreted across one or more domains. Preferred query interpretations can be identified and query responses provided based on any of: domain preferences, preferences indicated by an identifier, or (e.g., weighted) scores exceeding a threshold.
    Type: Application
    Filed: July 30, 2021
    Publication date: November 18, 2021
    Applicant: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Bernard Mont-Reynaud, Christopher S. Wilson
  • Publication number: 20210350784
    Abstract: A personalized name pronunciation is generated by receiving a request from a client device associated with a person ID. A lexical representation of a name is obtained and pronunciation information for the name is created based on an input to the client device. The pronunciation information is stored with the lexical representation associated with the person ID in a database. A message request to provide a message that includes the name associated with the person ID may be received and a script obtained. The database is accessed using the person ID to obtain the pronunciation information for the name. Speech representing lexical text of the script is synthesized and an audio representation of the name is generated based on the pronunciation information. The speech and the audio representation of the name are delivered to at least one individual as audio.
    Type: Application
    Filed: May 7, 2021
    Publication date: November 11, 2021
    Applicant: SoundHound, Inc.
    Inventor: Mara SELVAGGI
  • Publication number: 20210350087
    Abstract: Aspects include methods, systems, and computer-program products providing virtual assistant domain functionality. A natural language query including one or more words is received. A collection of natural language modules is accessed. The collection of natural language modules is configured to process sets of natural language queries. A natural language module, from the collection of natural language modules, is identified to interpret the natural language query. An interpretation of the natural language query is computed using the identified natural language module. A response to the natural language query is returned using the computed interpretation.
    Type: Application
    Filed: July 22, 2021
    Publication date: November 11, 2021
    Applicant: SoundHound, Inc.
    Inventors: Kamyar Mohajer, Keyvan Mohajer, Bernard Mont-Reynaud, Pranav Singh
  • Publication number: 20210335340
    Abstract: A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition is provided. The method includes using an encoder model to encode sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.
    Type: Application
    Filed: April 7, 2021
    Publication date: October 28, 2021
    Applicant: SoundHound, Inc.
    Inventors: Zizu GOWAYYED, Keyvan MOHAJER
  • Publication number: 20210335351
    Abstract: An utterance is analyzed to determine a characteristic of the utterance and a transcription hypothesis is generated for the utterance. Grammar rules are then used to parse the transcription hypothesis to produce a plurality of interpretation hypotheses, each having a likelihood score. A set of authorized domains is determined based on the characteristic and the plurality of interpretation hypotheses are filtered according to the set of authorized domains. Of the remaining interpretation hypotheses, one is selected according to their likelihood scores. The characteristic may include one or more characteristics such as mood, prosody, or whether the utterance has a rising intonation.
    Type: Application
    Filed: July 2, 2021
    Publication date: October 28, 2021
    Applicant: SoundHound, Inc.
    Inventor: Karl Stahl
  • Publication number: 20210329338
    Abstract: Various approaches relate to user defined content filtering in media playing devices of undesirable content represented in stored and real-time content from content providers. For example, video, image, and/or audio data can be analyzed to identify and classify content included in the data using various classification models and object and text recognition approaches. Thereafter, the identification and classification can be used to control presentation and/or access to the content and/or portions of the content. For example, based on the classification, portions of the content can be modified (e.g., replaced, removed, degraded, etc.) using one or more techniques (e.g., media replacement, media removal, media degradation, etc.) and then presented.
    Type: Application
    Filed: April 12, 2021
    Publication date: October 21, 2021
    Applicant: SoundHound, Inc.
    Inventors: Thor S. KHOV, Terry KONG
  • Patent number: 11151329
    Abstract: A natural language understanding server includes grammars specified in a modified extended Backus-Naur form (MEBNF) that includes an agglutination metasymbol not supported by conventional EBNF grammar parsers, as well as an agglutination preprocessor. The agglutination preprocessor applies one or more sets of agglutination rewrite rules to the MEBNF grammars, transforming them to EBNF grammars that can be processed by conventional EBNF grammar parsers. Permitting grammars to be specified in MEBNF form greatly simplifies the authoring and maintenance of grammars supporting inflected forms of words in the languages described by the grammars.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: October 19, 2021
    Assignee: SoundHound, Inc.
    Inventors: Bernard Mont-Reynaud, Seth Taron
  • Patent number: 11144731
    Abstract: A platform provides for developers of applications, such as devices, with natural language interfaces to configure the availability of vertical domain modules in applications. Modules can include grammars for parsing natural language expressions and interfaces to data sources. Third party developers can create modules with pricing models for their usage or access to their data. Device developers can browse or search available modules and test their performance for specific queries. The platform provides for device users to access the chosen modules as configured by device developers and for charging and payment between users, application developers, and module developers.
    Type: Grant
    Filed: September 11, 2018
    Date of Patent: October 12, 2021
    Assignee: SoundHound, Inc.
    Inventors: Pranav Singh, Keyvan Mohajer, Kamyar Mohajer, Bernard Mont-Reynaud
  • Publication number: 20210312901
    Abstract: Systems for automatic speech recognition and/or natural language understanding automatically learn new words by finding subsequences of phonemes that, if they were a new word, would enable a successful tokenization of a phoneme sequence. Systems can learn alternate pronunciations of words by finding phoneme sequences with a small edit distance to existing pronunciations. Systems can learn the part of speech of words by finding part-of-speech variations that would enable parses by syntactic grammars. Systems can learn what types of entities a word describes by finding sentences that could be parsed by a semantic grammar but for the words not being on an entity list.
    Type: Application
    Filed: January 11, 2021
    Publication date: October 7, 2021
    Applicant: SoundHound, Inc.
    Inventor: Anton V. RELIN
  • Patent number: 11138205
    Abstract: A query-processing server provides natural language services to applications. More specifically, the query-processing server receives and stores domain knowledge information from application developers, the domain knowledge information comprising a linguistic description of the natural language user queries that application developers wish their applications to support. A first portion of the domain knowledge information is applied to transform a natural language query received from an application to an ordered sequence of question elements. A second portion of the domain knowledge information is applied to group the ordered sequence of question elements into a plurality of distinct structured questions posed by the natural language query. The distinct structured questions may then be provided to the application, which may then execute them and obtain the corresponding data referenced by the questions.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: October 5, 2021
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Bernard Mont-Reynaud, Philipp Hubert
  • Patent number: 11132504
    Abstract: A domain-independent framework parses and interprets compound natural language queries in the context of a conversation between a human and an agent. Generic grammar rules and corresponding semantics support the understanding of compound queries in the conversation context. The sub-queries themselves are from one or more domains, and they are parsed and interpreted by a pre-existing grammar, covering one or more pre-existing domains. The pre-existing grammar, extended by the generic rules, recognizes all compound queries based on any queries recognized by the pre-existing grammar. Use of the disclosed framework requires little or no change in the domain-specific NLU handling code. The framework defines a generic approach to propagating context data between sub-queries of a compound query. The framework can be further extended to propagate intra-query context data in, out and across query components.
    Type: Grant
    Filed: March 25, 2019
    Date of Patent: September 28, 2021
    Assignee: SoundHound, Inc.
    Inventors: Bernard Mont-Reynaud, Christopher S. Wilson, Keyvan Mohajer
  • Patent number: 11113473
    Abstract: The present invention extends to methods, systems, and computer program products for interpreting expressions having potentially ambiguous meanings in different domains. Multi-domain natural language understanding systems can support a variety of different types of clients. Expressions can be interpreted across multiple domains. Weights can be assigned to domains. Weights can be client specific or expression specific so that a chosen interpretation is more likely correct for the type of client or for its context. Stored weight sets can be chosen according to identifying information carried as metadata with expressions or weight sets carried directly as metadata. Domains can additionally or alternatively be ranked in ordered lists or comparative domain pairs to favor some domains over others as appropriate for client type or client context.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: September 7, 2021
    Assignee: SoundHound, Inc.
    Inventors: Christopher S. Wilson, Keyvan Mohajer, Bernard Mont-Reynaud
  • Publication number: 20210272552
    Abstract: A computer-implemented method is provided. The method includes receiving speech audio of dictation associated with a user ID, deriving acoustic features from the speech audio, storing the derived acoustic features in a user profile associated with the user ID, receiving a request for acoustic features through an application programming interface (API), the request including the user ID, and sending the derived acoustic features through the API.
    Type: Application
    Filed: May 19, 2021
    Publication date: September 2, 2021
    Applicant: SoundHound, Inc.
    Inventors: Kiran Garaga LOKESWARAPPA, Joel GEDALIUS, Bernard MONT-REYNAUD, Jun HUANG
  • Patent number: 11100291
    Abstract: A query-processing server that interprets natural language expressions supports the extension of a first semantic grammar (for a particular type of expression), which is declared extensible, by a second semantic grammar (for another type of expression). When an extension is requested, the query-processing server checks that the two semantic grammars have compatible semantic types. The developers need not have any knowledge of each other, or about their respective grammars. Performing an extension may be done by yet another party, such as the query-processing server, or another server, independently of all previous parties. The use of semantic grammar extensions provides a way to expand the coverage and functionality of natural language interpretation in a simple and flexible manner, so that new forms of expression may be supported, and seamlessly combined with pre-existing interpretations. Finally, in some implementations, this is done without loss of efficiency.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: August 24, 2021
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Christopher S. Wilson, Bernard Mont-Reynaud
  • Patent number: 11100288
    Abstract: A factored neural network estimates a conditional distribution of token probabilities using two smaller models, a class model and an index model. Every token has a unique class, and a unique index in the class. The two smaller models are trained independently but cooperate at inference time. Factoring with more than two models is possible. Networks can be recurrent. Factored neural networks for statistical language modelling treat words as tokens. In that context, classes capture linguistic regularities. Partitioning of words into classes keeps the number of classes and the maximum size of a class both low. Optimization of partitioning is by iteratively splitting and assembling classes.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: August 24, 2021
    Assignee: SoundHound, Inc.
    Inventors: Zizu Gowayyed, Bernard Mont-Reynaud
  • Patent number: 11100940
    Abstract: Systems and methods for training a voice morphing apparatus are described. The voice morphing apparatus is trained to morph input audio data to mask an identity of a speaker. Training is performed by evaluating an objective function that is a function of the input audio data and an output of the voice morphing apparatus. The objective function may have a first term that is based on speaker identification and a second term that is based on audio fidelity. By optimizing the objective function, parameters of the voice morphing apparatus may be adjusted so as to reduce a confidence of speaker identification and maintain an audio fidelity of the morphed audio data. The voice morphing apparatus, once trained, may be used as part of an automatic speech recognition system.
    Type: Grant
    Filed: January 10, 2020
    Date of Patent: August 24, 2021
    Assignee: SoundHound, Inc.
    Inventor: Steve Pearson
  • Publication number: 20210256386
    Abstract: An audio processing system is described. The audio processing system uses a convolutional neural network architecture to process audio data, a recurrent neural network architecture to process at least data derived from an output of the convolutional neural network architecture, and a feed-forward neural network architecture to process at least data derived from an output of the recurrent neural network architecture. The feed-forward neural network architecture is configured to output classification scores for a plurality of sound units associated with speech. The classification scores indicate a presence of one or more sound units in the audio data. The convolutional neural network architecture has a plurality of convolutional groups arranged in series, where a convolutional group includes a combination of two data mappings arranged in parallel.
    Type: Application
    Filed: February 13, 2020
    Publication date: August 19, 2021
    Applicant: SoundHound, Inc.
    Inventors: Maisy Wieman, Andrew Carl Spencer, Zìlì Li, Cristina Vasconcelos
  • Publication number: 20210241759
    Abstract: A system and method are disclosed for ignoring a wakeword received at a speech-enabled listening device when it is determined the wakeword is reproduced audio from an audio-playing device. Determination can be by detecting audio distortions, by an ignore flag sent locally between an audio-playing device and speech-enabled device, by an ignore flag sent from a server, by comparison of received audio played audio to a wakeword within an audio-playing device or a speech-enabled device, and other means.
    Type: Application
    Filed: February 4, 2020
    Publication date: August 5, 2021
    Applicant: SoundHound, Inc.
    Inventors: Hsuan Yang, Qìndí Zhäng, Warren S. Heit