Patents Assigned to SoundHound AI IP, LLC

PERFORMING SPEECH RECOGNITION USING A SET OF WORDS WITH DESCRIPTIONS IN TERMS OF COMPONENTS SMALLER THAN THE WORDS

Publication number: 20250149043

Abstract: A system and method is presented for performing dual mode speech recognition, employing a local recognition module on a mobile device and a remote recognition engine on a server device. The system accepts a spoken query from a user, and both the local recognition module and the remote recognition engine perform speech recognition operations on the query, returning a transcription and confidence score, subject to a latency cutoff time. If both sources successfully transcribe the query, then the system accepts the result having the higher confidence score. If only one source succeeds, then that result is accepted. In either case, if the remote recognition engine does succeed in transcribing the query, then a client vocabulary is updated if the remote system result includes information not present in the client vocabulary.

Type: Application

Filed: January 8, 2025

Publication date: May 8, 2025

Applicant: SoundHound AI IP, LLC

Inventors: Keyvan MOHAJER, Timothy STONEHOCKER, Bernard MONT-REYNAUD
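
The selection rule in the abstract above can be illustrated with a minimal sketch. It is not taken from the patent: the names `Result`, `choose_result`, and `update_vocabulary` are hypothetical, a failed or timed-out recognizer is assumed to be represented by `None`, ties are assumed to go to the local result, and the client vocabulary is assumed to be a simple set of lowercase words.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Result:
    """A transcription with its confidence score (hypothetical structure)."""
    text: str
    confidence: float


def choose_result(local: Optional[Result], remote: Optional[Result]) -> Optional[Result]:
    """Pick the accepted result; None means that source failed or missed the latency cutoff."""
    if local is not None and remote is not None:
        # Both succeeded: accept the higher confidence (ties keep local, an assumption).
        return remote if remote.confidence > local.confidence else local
    # Only one (or neither) succeeded: accept whichever exists.
    return local if local is not None else remote


def update_vocabulary(vocabulary: Set[str], remote: Optional[Result]) -> Set[str]:
    """Add words from a successful remote result that the client vocabulary lacks."""
    if remote is None:
        return set()
    new_words = set(remote.text.lower().split()) - vocabulary
    vocabulary |= new_words  # updates the caller's set in place
    return new_words
```

Note that the vocabulary update runs whenever the remote result exists, whichever result was accepted, which mirrors the "in either case" wording of the abstract.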

DERIVING ACOUSTIC FEATURES AND LINGUISTIC FEATURES FROM RECEIVED SPEECH AUDIO

Publication number: 20250069589

Abstract: A computer-implemented method is provided. The method including receiving speech audio of dictation associated with a user ID, deriving acoustic features from the speech audio, storing the derived acoustic features in a user profile associated with the user ID, receiving a request for acoustic features through an application programming interface (API), the request including the user ID, and sending the derived acoustic features through the API.

Type: Application

Filed: November 12, 2024

Publication date: February 27, 2025

Applicant: SoundHound AI IP, LLC

Inventors: Kiran Garaga LOKESWARAPPA, Joel GEDALIUS, Bernard MONT-REYNAUD, Jun HUANG

METHOD AND SYSTEM FOR ACOUSTIC MODEL CONDITIONING ON NON-PHONEME INFORMATION FEATURES

Publication number: 20250054490

Abstract: A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition is provided. The method includes using an encoder model to encode sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.

Type: Application

Filed: October 28, 2024

Publication date: February 13, 2025

Applicant: SoundHound AI IP, LLC.

Inventors: Zizu GOWAYYED, Keyvan MOHAJER

ARTIFICIAL INTELLIGENCE SMART ANSWERING ARCHITECTURE

Publication number: 20250029114

Abstract: An automated answering system and method are disclosed for use in providing automated customer service. The automated answering system uses generative artificial intelligence to aid in forming a knowledgebase of information regarding a merchant's business that is used in answering the customer queries. The automated answering system of the present technology also uses generative artificial intelligence to aid in formulating a response to queries using the formed knowledgebase.

Type: Application

Filed: July 21, 2023

Publication date: January 23, 2025

Applicant: SoundHound AI IP, LLC

Inventors: Timothy P. Stonehocker, Kamyar Mohajer

System and method for correction of a query using a replacement phrase

Patent number: 12197417

Abstract: Systems and methods are provided for natural language processing using neural network models and natural language virtual assistants. The system and method include receiving a natural language phrase including a word sequence, computing corresponding error probabilities that the words are errors, and for a word with a corresponding error probability above a threshold, then computing a replacement phrase with a low error probability to provide a response from the virtual assistant depending on the replacement phrase.

Type: Grant

Filed: January 21, 2022

Date of Patent: January 14, 2025

Assignee: SoundHound AI IP, LLC

Inventors: Pranav Singh, Olivia Bettaglio

METHOD AND SYSTEM FOR CONVERSATION TRANSCRIPTION WITH METADATA

Publication number: 20250014582

Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.

Type: Application

Filed: September 18, 2024

Publication date: January 9, 2025

Applicant: SoundHound AI IP, LLC.

Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN

CONTROLLING AN ENGAGEMENT STATE OF AN AGENT DURING A HUMAN-MACHINE DIALOG

Publication number: 20250006193

Abstract: A method of controlling an engagement state of an agent during a human-machine dialog is provided. The method can include receiving a spoken request that is a conditional locking request, wherein the conditional locking request uses a natural language expression to explicitly specify a locking condition, which is a predicate, storing the predicate in a format that can be evaluated when needed by the agent, entering a conditionally locked state in response to the conditional locking request, in the conditionally locked state, receiving a request without a need for a wakeup indicator, and for a request evaluating the predicate upon receiving the request, and processing the request if the predicate is true.

Type: Application

Filed: September 9, 2024

Publication date: January 2, 2025

Applicant: SoundHound AI IP, LLC

Inventors: Scott Halstvedt, Keyvan Mohajer, Bernard Mont-Reynaud
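
The conditional locking flow described in the abstract above can be sketched as follows. This is an illustration only, not the patented implementation: `Request` and `Agent` are hypothetical names, the natural-language parsing of the locking condition is out of scope and the predicate is assumed to arrive as a Python callable, and the sketch assumes that the stored predicate alone decides whether a request is processed while the agent is locked.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Request:
    """A spoken request after recognition (hypothetical structure)."""
    speaker: str
    text: str
    has_wake_word: bool = False


class Agent:
    def __init__(self) -> None:
        # None means the agent is not in the conditionally locked state.
        self.predicate: Optional[Callable[[Request], bool]] = None

    def lock(self, predicate: Callable[[Request], bool]) -> None:
        """Store the locking condition in a form that can be evaluated later."""
        self.predicate = predicate

    def unlock(self) -> None:
        self.predicate = None

    def handle(self, request: Request) -> bool:
        """Return True if the request is processed."""
        if self.predicate is not None:
            # Conditionally locked: no wakeup indicator is needed,
            # the predicate is evaluated for each incoming request.
            return self.predicate(request)
        # Not locked: fall back to requiring a wakeup indicator.
        return request.has_wake_word
```

For example, a spoken request such as "listen only to me" would, under these assumptions, be stored as `agent.lock(lambda r: r.speaker == "alice")`.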

CONTENT FILTERING IN MEDIA PLAYING DEVICES

Publication number: 20240430526

Abstract: Various approaches relate to user defined content filtering in media playing devices of undesirable content represented in stored and real-time content from content providers. For example, video, image, and/or audio data can be analyzed to identify and classify content included in the data using various classification models and object and text recognition approaches. Thereafter, the identification and classification can be used to control presentation and/or access to the content and/or portions of the content. For example, based on the classification, portions of the content can be modified (e.g., replaced, removed, degraded, etc.) using one or more techniques (e.g., media replacement, media removal, media degradation, etc.) and then presented.

Type: Application

Filed: September 3, 2024

Publication date: December 26, 2024

Applicant: SoundHound AI IP, LLC.

Inventors: Thor S. KHOV, Terry KONG

QUERY-SPECIFIC TARGETED AD DELIVERY

Publication number: 20240412256

Abstract: An audio recognition system provides for delivery of promotional content to its user. A user interface device, locally or with the assistance of a network-connected server, performs recognition of audio in response to queries. Recognition can be through a method such as processing features extracted from the audio. Audio can comprise recorded music, singing or humming, instrumental music, vocal music, spoken voice, or other recognizable types of audio. Campaign managers provide promotional content for delivery in response to audio recognized in queries.

Type: Application

Filed: August 21, 2024

Publication date: December 12, 2024

Applicant: SoundHound AI IP, LLC

Inventors: Aaron MASTER, Keyvan MOHAJER

Method and system for acoustic model conditioning on non-phoneme information features

Patent number: 12154546

Abstract: A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition is provided. The method includes using an encoder model to encode sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.

Type: Grant

Filed: July 6, 2023

Date of Patent: November 26, 2024

Assignee: SoundHound AI IP, LLC.

Inventors: Zizu Gowayyed, Keyvan Mohajer

MACHINE LEARNING SYSTEM FOR DIGITAL ASSISTANTS

Publication number: 20240378193

Abstract: A machine learning system for a digital assistant is described, together with a method of training such a system. The machine learning system is based on an encoder-decoder sequence-to-sequence neural network architecture trained to map input sequence data to output sequence data, where the input sequence data relates to an initial query and the output sequence data represents canonical data representation for the query. The method of training involves generating a training dataset for the machine learning system. The method involves clustering vector representations of the query data samples to generate canonical-query original-query pairs in training the machine learning system.

Type: Application

Filed: July 23, 2024

Publication date: November 14, 2024

Applicant: SoundHound AI IP, LLC.

Inventors: Pranav SINGH, Yilun ZHANG, Keyvan MOHAJER, Mohammadreza FAZELI

AUTOMATIC LEARNING OF ENTITIES, WORDS, PRONUNCIATIONS, AND PARTS OF SPEECH

Publication number: 20240379092

Abstract: Systems for automatic speech recognition and/or natural language understanding automatically learn new words by finding subsequences of phonemes that, if they were a new word, would enable a successful tokenization of a phoneme sequence. Systems can learn alternate pronunciations of words by finding phoneme sequences with a small edit distance to existing pronunciations. Systems can learn the part of speech of words by finding part-of-speech variations that would enable parses by syntactic grammars. Systems can learn what types of entities a word describes by finding sentences that could be parsed by a semantic grammar but for the words not being on an entity list.

Type: Application

Filed: July 25, 2024

Publication date: November 14, 2024

Applicant: SoundHound AI IP, LLC.

Inventor: Anton V. RELIN
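
The abstract above mentions finding phoneme sequences with a small edit distance to existing pronunciations. A minimal sketch of that one idea follows, assuming a plain Levenshtein distance over phoneme symbols and a lexicon stored as a dictionary from word to a list of pronunciations. The function names, the lexicon layout, and the `max_distance` threshold are all hypothetical and do not come from the patent.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def near_pronunciations(observed: Sequence[str],
                        lexicon: Dict[str, List[List[str]]],
                        max_distance: int = 1) -> List[str]:
    """Words with a known pronunciation close to, but not identical to, the observed phonemes."""
    return sorted(
        word for word, prons in lexicon.items()
        if any(0 < edit_distance(observed, p) <= max_distance for p in prons)
    )
```

A word returned by `near_pronunciations` is only a candidate for an alternate pronunciation; how a real system confirms such a candidate is not stated in the abstract.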

SYSTEM AND METHOD FOR VOICE MORPHING IN A DATA ANNOTATOR TOOL

Publication number: 20240370667

Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift. Labeling the morphed speech comprises at least one or more of transcribing the morphed speech, identifying a gender of the speaker, identifying an accent of the speaker, and identifying a noise type of the morphed speech.

Type: Application

Filed: July 19, 2024

Publication date: November 7, 2024

Applicant: SoundHound AI IP, LLC.

Inventor: Dylan H. Ross

SERVER SUPPORTED RECOGNITION OF WAKE PHRASES

Publication number: 20240363101

Abstract: A server supports multiple virtual assistants. It receives requests that include wake phrase audio and an identification of the source of the request, such as a virtual assistant device. Based on the identification, the server searches a database for a wake phrase detector appropriate for the identified source. The server then applies the wake phrase detector to the received wake phrase audio. If the wake phrase audio triggers the wake phrase detector, the server provides an appropriate response to the source.

Type: Application

Filed: July 12, 2024

Publication date: October 31, 2024

Applicant: SoundHound AI IP, LLC.

Inventors: Newton Jain, Sameer Syed Zaheer

Method and system for conversation transcription with metadata

Patent number: 12125487

Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.

Type: Grant

Filed: October 11, 2021

Date of Patent: October 22, 2024

Assignee: SoundHound AI IP, LLC.

Inventors: Kiersten L. Bradley, Ethan Coeytaux, Ziming Yin

Content filtering in media playing devices

Patent number: 12126868

Abstract: Various approaches relate to user defined content filtering in media playing devices of undesirable content represented in stored and real-time content from content providers. For example, video, image, and/or audio data can be analyzed to identify and classify content included in the data using various classification models and object and text recognition approaches. Thereafter, the identification and classification can be used to control presentation and/or access to the content and/or portions of the content. For example, based on the classification, portions of the content can be modified (e.g., replaced, removed, degraded, etc.) using one or more techniques (e.g., media replacement, media removal, media degradation, etc.) and then presented.

Type: Grant

Filed: July 6, 2023

Date of Patent: October 22, 2024

Assignee: SoundHound AI IP, LLC.

Inventors: Thor S. Khov, Terry Kong

Controlling an engagement state of an agent during a human-machine dialog

Patent number: 12125484

Abstract: A method of controlling an engagement state of an agent during a human-machine dialog is provided. The method can include receiving a spoken request that is a conditional locking request, wherein the conditional locking request uses a natural language expression to explicitly specify a locking condition, which is a predicate, storing the predicate in a format that can be evaluated when needed by the agent, entering a conditionally locked state in response to the conditional locking request, in the conditionally locked state, receiving a multiplicity of requests without a need for a wakeup indicator, and for a request from the multiplicity of requests evaluating the predicate upon receiving the request, and processing the request if the predicate is true.

Type: Grant

Filed: December 27, 2021

Date of Patent: October 22, 2024

Assignee: SoundHound AI IP, LLC

Inventors: Scott Halstvedt, Keyvan Mohajer, Bernard Mont-Reynaud

SPONSORED SEARCH RANKING SIMULATION FOR PATTERNS TRIGGERED BY NATURAL LANGUAGE QUERIES

Publication number: 20240346031

Abstract: The technology disclosed relates to natural language understanding-based search engines, ranking sponsored search results and simulated ranking of sponsored search results. Tools and methods describe how to simulate the ranking of sponsored search results. The tools further identify instances of user queries within the scope of trigger patterns, optionally providing examples both of user queries for which a sponsored search result is likely to be displayed and examples for which the sponsored search result will not rank highly enough to be displayed, at least on the first page of search results.

Type: Application

Filed: May 15, 2024

Publication date: October 17, 2024

Applicant: SoundHound AI IP, LLC

Inventors: Bernard MONT-REYNAUD, Keyvan MOHAJER, Kamyar MOHAJER, Chris WILSON

AUTOMATIC SYNCHRONIZATION FOR AN OFFLINE VIRTUAL ASSISTANT

Publication number: 20240347055

Abstract: [Object] Technology is provided to enable a mobile terminal to function as a digital assistant even when the mobile terminal is in a state where it cannot communicate with a server apparatus. [Solution] When a user terminal 200 receives a query A from a user, user terminal 200 sends query A to a server 100. Server 100 interprets the meaning of query A using a grammar A. Server 100 obtains a response to query A based on the meaning of query A and sends the response to user terminal 200. Server 100 further sends grammar A to user terminal 200. That is, server 100 sends to user terminal 200 a grammar used to interpret the query received from user terminal 200.

Type: Application

Filed: June 24, 2024

Publication date: October 17, 2024

Applicant: SoundHound AI IP, LLC

Inventor: Karl Stahl

METHOD AND SYSTEM FOR CONVERSATION TRANSCRIPTION WITH METADATA

Publication number: 20240331702

Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.

Type: Application

Filed: June 14, 2024

Publication date: October 3, 2024

Applicant: SoundHound AI IP, LLC.

Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
