Patents Assigned to SoundHound, Inc.
  • Publication number: 20220122607
    Abstract: A method of controlling an engagement state of an agent during a human-machine dialog is provided. The method can include receiving a spoken request that is a conditional locking request, wherein the conditional locking request uses a natural language expression to explicitly specify a locking condition, which is a predicate; storing the predicate in a format that can be evaluated when needed by the agent; entering a conditionally locked state in response to the conditional locking request; in the conditionally locked state, receiving a multiplicity of requests without a need for a wakeup indicator; and, for a request from the multiplicity of requests, evaluating the predicate upon receiving the request and processing the request if the predicate is true.
    Type: Application
    Filed: December 27, 2021
    Publication date: April 21, 2022
    Applicant: SoundHound, Inc.
    Inventors: Scott Halstvedt, Keyvan Mohajer, Bernard Mont-Reynaud
  • Patent number: 11308960
    Abstract: A processing system detects a period of non-voice activity and compares its duration to a cutoff period. The system adapts the cutoff period based on parsing previously-recognized speech to determine, according to a model, such as a machine-learned model, the probability that the speech recognized so far is a prefix to a longer complete utterance. The cutoff period is longer when a parse of previously recognized speech has a high probability of being a prefix of a longer utterance.
    Type: Grant
    Filed: March 19, 2020
    Date of Patent: April 19, 2022
    Assignee: SoundHound, Inc.
    Inventors: Patricia Pozon Aguayo, Jennifer Hee Young Zhang, Jonah Probell
  • Patent number: 11308938
    Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: April 19, 2022
    Assignee: SoundHound, Inc.
    Inventors: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy
  • Publication number: 20220115019
    Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
    Type: Application
    Filed: October 11, 2021
    Publication date: April 14, 2022
    Applicant: SoundHound, Inc.
    Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
  • Publication number: 20220115020
    Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
    Type: Application
    Filed: October 11, 2021
    Publication date: April 14, 2022
    Applicant: SoundHound, Inc.
    Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
  • Patent number: 11295730
    Abstract: A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: April 5, 2022
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Christopher Wilson, Bernard Mont-Reynaud
  • Patent number: 11295732
    Abstract: In order to improve the accuracy of ASR, an utterance is transcribed using a plurality of language models, such as, for example, an N-gram language model and a neural language model. The language models are trained separately. They each output a probability score or other figure of merit for a partial transcription hypothesis. Model scores are interpolated to determine a hybrid score. While recognizing an utterance, interpolation weights are chosen or updated dynamically, in the specific context of processing. The weights are based on dynamic variables associated with the utterance, the partial transcription hypothesis, or other aspects of context.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: April 5, 2022
    Assignee: SoundHound, Inc.
    Inventors: Steffen Holm, Terry Kong, Kiran Garaga Lokeswarappa
  • Patent number: 11295741
    Abstract: A system and method are disclosed capable of parsing a spoken utterance into a natural language request and a speech audio segment, where the natural language request directs the system to use the speech audio segment as a new wakeword. In response to this wakeword assignment directive, the system and method are further capable of immediately building a new wakeword spotter to activate the device upon matching the new wakeword in the input audio. Different approaches to promptly building a new wakeword spotter are described. Variations of wakeword assignment directives can make the new wakeword public or private. They can also add the new wakeword to earlier wakewords, or replace earlier wakewords.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: April 5, 2022
    Assignee: SoundHound, Inc.
    Inventor: Bernard Mont-Reynaud
  • Publication number: 20220092273
    Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift. Labeling the morphed speech comprises at least one of transcribing the morphed speech, identifying a gender of the speaker, identifying an accent of the speaker, and identifying a noise type of the morphed speech.
    Type: Application
    Filed: November 30, 2021
    Publication date: March 24, 2022
    Applicant: SoundHound, Inc.
    Inventor: Dylan H. Ross
  • Patent number: 11276398
    Abstract: A system that includes a stand-alone device or a server-connected client device in communication with a server provides recommendations. The device includes an input component, a storage component, a processor, and an output component. The server-connected client device includes an input component that receives the user's request, a communication component that communicates the request to the server and receives the recommendation from the server, and an output component that provides the recommendation to the user.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: March 15, 2022
    Assignee: SoundHound, Inc.
    Inventors: Robert MacRAE, Kamyar Mohajer
  • Publication number: 20220076678
    Abstract: A computer-implemented method is provided. The method includes receiving commands to store memos, identifying subjects related to the memos, storing, in a database, the memos, their related subjects, and associated time information, receiving a natural language request to retrieve a memo, the request having query information, identifying a subject related to the request, responsive to the request, querying the database for memos related to the subject, identifying multiple memos in response to the database query, identifying a memo, from the multiple identified memos, that has the most recent associated time information and providing a response in dependence on the identified memo.
    Type: Application
    Filed: November 19, 2021
    Publication date: March 10, 2022
    Applicant: SoundHound, Inc.
    Inventors: Irina A. SPIRIDONOVA, Karl STAHL, Mara SELVAGGI
  • Publication number: 20220075956
    Abstract: A method of providing relevant messages to an automotive virtual assistant is provided. The method includes receiving a spoken utterance and corresponding first geolocation information detected by a subsystem of a first automobile, parsing the spoken utterance to determine concepts and storing the concepts in a concept database indexed by the corresponding first geolocation information. The method further includes receiving second geolocation information detected by a subsystem of a second automobile, searching the concept database for an index based on the second geolocation information to find a stored concept of the stored concepts, searching a natural language expression database using the stored concept as an index to find an assistive natural language expression, wherein the assistive natural language expression includes a constituent part, and sending the assistive natural language expression to the second automobile with the stored concept in place of the constituent part.
    Type: Application
    Filed: November 15, 2021
    Publication date: March 10, 2022
    Applicant: SoundHound, Inc.
    Inventors: Bernard MONT-REYNAUD, Jonah PROBELL, Pranav SINGH, Kheng KHOV
  • Patent number: 11263198
    Abstract: Systems and methods are provided for systematically finding and fixing automatic speech recognition (ASR) mistranscriptions and natural language understanding (NLU) misinterpretations and labeling data for machine learning. High similarity of non-identical consecutive queries indicates ASR mistranscriptions. Consecutive queries with close vectors in a semantic embedding space indicate NLU misinterpretations. Key phrases and barge-in also indicate errors. Only queries within a short amount of time are considered.
    Type: Grant
    Filed: September 5, 2019
    Date of Patent: March 1, 2022
    Assignee: SoundHound, Inc.
    Inventors: Olivia Bettaglio, Pranav Singh
  • Patent number: 11257493
    Abstract: Systems and methods for processing speech are described. In certain examples, image data is used to generate visual feature tensors and audio data is used to generate audio feature tensors. The visual feature tensors and the audio feature tensors are used by a linguistic model to determine linguistic features that are usable to parse an utterance of a user. The generation of the feature tensors may be jointly configured with the linguistic model. Systems may be provided in a client-server architecture.
    Type: Grant
    Filed: July 11, 2019
    Date of Patent: February 22, 2022
    Assignee: SoundHound, Inc.
    Inventors: Cristina Vasconcelos, Zili Li
  • Patent number: 11250844
    Abstract: Agents engage and disengage with users intelligently. Users can tell agents to remain engaged without requiring a wakeword. Engaged states can support modal dialogs and barge-in. Users can cause disengagement explicitly. Disengagement can be conditional based on timeout, change of user, or environmental conditions. Engagement can be one-time or recurrent. Recurrent states can be attentive or locked. Locked states can be unconditional or conditional, including being reserved to support user continuity. User continuity can be tested by matching parameters or tracking user by many modalities including microphone arrays, cameras, and other sensors.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: February 15, 2022
    Assignee: SoundHound, Inc.
    Inventors: Bernard Mont-Reynaud, Scott Halstvedt, Keyvan Mohajer
  • Patent number: 11250217
    Abstract: A client device receives a user request (e.g., in natural language form) to execute a command of an application. The client device delegates interpretation of the request to a response-processing server. Using domain knowledge previously provided by a developer of the application, the response-processing server determines the various possible responses that client devices could make in response to the request based on circumstances such as the capabilities of the client devices and the state of the application data. The response-processing server accordingly generates a response package that describes a number of different conditional responses that client devices could have to the request and provides the response package to the client device. The client device selects the appropriate response from the response package based on the circumstances as determined by the client device, executes the command (if possible), and provides the user with some representation of the response.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: February 15, 2022
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Christopher S. Wilson, Kheng Khov, Ian Graves
  • Patent number: 11238101
    Abstract: A command-processing server receives a natural language command from a user. The command-processing server has a set of domain command interpreters corresponding to different domains in which commands can be expressed, such as the domain of entertainment, or the domain of travel. Some or all of the domain command interpreters recognize user commands having a verbal prefix, an optional pre-filter, an object, and an optional post-filter; the pre- and post-filters may be compounded expressions involving multiple atomic filters. Different developers may independently specify the domain command interpreters and the sub-structure interpreters on which they are based.
    Type: Grant
    Filed: October 27, 2020
    Date of Patent: February 1, 2022
    Assignee: SoundHound, Inc.
    Inventor: Keyvan Mohajer
  • Patent number: 11211064
    Abstract: The technology disclosed relates to retrieving a personal memo from a database. The method includes receiving, by a virtual assistant, a natural language utterance that expresses a request, interpreting the natural language utterance according to a natural language grammar rule for retrieving memo data from the natural language utterance, the natural language grammar rule recognizing query information, responsive to interpreting the natural language utterance, using the query information to query the database for a memo related to the query information, and providing, to a user, a response generated in dependence upon the memo related to the query information.
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: December 28, 2021
    Assignee: SoundHound, Inc.
    Inventors: Mara Selvaggi, Irina A Spiridonova, Karl Stahl
  • Publication number: 20210397610
    Abstract: A machine learning system for a digital assistant is described, together with a method of training such a system. The machine learning system is based on an encoder-decoder sequence-to-sequence neural network architecture trained to map input sequence data to output sequence data, where the input sequence data relates to an initial query and the output sequence data represents a canonical data representation for the query. The method of training involves generating a training dataset for the machine learning system. The method involves clustering vector representations of the query data samples to generate canonical-query original-query pairs in training the machine learning system.
    Type: Application
    Filed: June 17, 2021
    Publication date: December 23, 2021
    Applicant: SoundHound, Inc.
    Inventors: Pranav SINGH, Yilun ZHANG, Keyvan MOHAJER, Mohammadreza FAZELI
  • Patent number: 11205056
    Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift.
    Type: Grant
    Filed: September 22, 2019
    Date of Patent: December 21, 2021
    Assignee: SoundHound, Inc.
    Inventor: Dylan H Ross
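
The memo retrieval flow described in the abstract of publication 20220076678 above (store memos with their subjects and time information, query by subject, return the most recent match) can be illustrated with a minimal sketch. This is not the patented implementation: the `Memo` and `MemoStore` names, the in-memory list standing in for the database, and the case-insensitive subject match are all assumptions made for illustration. The step of identifying a subject from a natural language request is left out, so the subject is passed in directly.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Memo:
    """One stored memo (hypothetical record layout)."""
    text: str
    subject: str
    stored_at: datetime


class MemoStore:
    """Hypothetical in-memory stand-in for the database in the abstract."""

    def __init__(self) -> None:
        self._memos: List[Memo] = []

    def store(self, text: str, subject: str, stored_at: datetime) -> None:
        # Subjects are normalized to lowercase; this is an assumption,
        # not something the abstract specifies.
        self._memos.append(Memo(text, subject.lower(), stored_at))

    def latest_for(self, subject: str) -> Optional[Memo]:
        # Collect every memo on the subject, then keep the one with the
        # most recent associated time.
        matches = [m for m in self._memos if m.subject == subject.lower()]
        if not matches:
            return None
        return max(matches, key=lambda m: m.stored_at)


store = MemoStore()
store.store("Parked on level 2", "car", datetime(2021, 11, 1, 9, 0))
store.store("Parked on level 4", "car", datetime(2021, 11, 2, 9, 0))
store.store("Passport is in the desk drawer", "passport", datetime(2021, 11, 1, 12, 0))

latest = store.latest_for("car")
print(latest.text if latest else "No memo found")  # prints "Parked on level 4"
```

Two memos share the subject "car", so the lookup returns the one with the later timestamp. A real system would also need the subject-identification and response-generation steps that the abstract mentions.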