Patents Assigned to SoundHound AI IP, LLC

METHOD AND SYSTEM FOR CONVERSATION TRANSCRIPTION WITH METADATA

Publication number: 20240331702

Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.

Type: Application

Filed: June 14, 2024

Publication date: October 3, 2024

Applicant: SoundHound AI IP, LLC.

Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
System and method for voice morphing in a data annotator tool

Patent number: 12086564

Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift. Labeling the morphed speech comprises at least one or more of transcribing the morphed speech, identifying a gender of the speaker, identifying an accent of the speaker, and identifying a noise type of the morphed speech.

Type: Grant

Filed: November 30, 2021

Date of Patent: September 10, 2024

Assignee: SoundHound AI IP, LLC.

Inventor: Dylan H. Ross
METHOD FOR PROVIDING INFORMATION, METHOD FOR GENERATING DATABASE, AND PROGRAM

Publication number: 20240296197

Abstract: As audio (1) is input to an extension of a browser, the extension transmits the audio (1) to a language processing server. A speech recognition unit obtains a text (1) corresponding to the audio (1), and transmits the text (1) to a natural language understanding unit. In the natural language understanding unit, an information processing unit identifies a URL (1) corresponding to the text (1), and transmits the URL (1) to the browser. The extension passes the URL (1) to a browsing function. The browsing function uses the URL (1) to access a web server. The web server transmits a web page (1) corresponding to the URL (1) to the browser. The browsing function shows a screen corresponding to the web page (1) on a display.

Type: Application

Filed: May 13, 2024

Publication date: September 5, 2024

Applicant: SoundHound AI IP, LLC.

Inventors: Masaki NAITO, Keisuke TSUCHIDA, Jun YONEYAMA, Kaku SAWADA
Automatic learning of entities, words, pronunciations, and parts of speech

Patent number: 12080275

Abstract: Systems for automatic speech recognition and/or natural language understanding automatically learn new words by finding subsequences of phonemes that, if they were a new word, would enable a successful tokenization of a phoneme sequence. Systems can learn alternate pronunciations of words by finding phoneme sequences with a small edit distance to existing pronunciations. Systems can learn the part of speech of words by finding part-of-speech variations that would enable parses by syntactic grammars. Systems can learn what types of entities a word describes by finding sentences that could be parsed by a semantic grammar but for the words not being on an entity list.

Type: Grant

Filed: January 11, 2021

Date of Patent: September 3, 2024

Assignee: SoundHound AI IP, LLC.

Inventor: Anton V. Relin
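
The alternate-pronunciation idea in the abstract for patent 12080275 (finding phoneme sequences with a small edit distance to existing pronunciations) can be illustrated with a minimal sketch. The lexicon format, the function names, and the `max_distance` cutoff are assumptions made for illustration and are not taken from the patent.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def candidate_words(observed: Sequence[str],
                    lexicon: Dict[str, List[List[str]]],
                    max_distance: int = 1) -> List[Tuple[str, int]]:
    """Return words whose known pronunciations are close to, but not
    identical to, the observed phoneme sequence.

    Assumes every word in the (hypothetical) lexicon has at least one
    pronunciation.
    """
    matches = []
    for word, pronunciations in lexicon.items():
        best = min(edit_distance(observed, p) for p in pronunciations)
        if 0 < best <= max_distance:
            matches.append((word, best))
    return sorted(matches, key=lambda m: (m[1], m[0]))


# Hypothetical two-word lexicon with one pronunciation each.
lexicon = {
    "tomato": [["T", "AH", "M", "EY", "T", "OW"]],
    "potato": [["P", "AH", "T", "EY", "T", "OW"]],
}
observed = ["T", "AH", "M", "AA", "T", "OW"]
result = candidate_words(observed, lexicon)
```

Here the observed sequence differs from the stored pronunciation of "tomato" by a single substitution, so it would be proposed as an alternate pronunciation of that word only.
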
Machine learning system for digital assistants

Patent number: 12067006

Abstract: A machine learning system for a digital assistant is described, together with a method of training such a system. The machine learning system is based on an encoder-decoder sequence-to-sequence neural network architecture trained to map input sequence data to output sequence data, where the input sequence data relates to an initial query and the output sequence data represents canonical data representation for the query. The method of training involves generating a training dataset for the machine learning system. The method involves clustering vector representations of the query data samples to generate canonical-query original-query pairs in training the machine learning system.

Type: Grant

Filed: June 17, 2021

Date of Patent: August 20, 2024

Assignee: SoundHound AI IP, LLC.

Inventors: Pranav Singh, Yilun Zhang, Keyvan Mohajer, Mohammadreza Fazeli
Server supported recognition of wake phrases

Patent number: 12051403

Abstract: A server supports multiple virtual assistants. It receives requests that include wake phrase audio and an identification of the source of the request, such as a virtual assistant device. Based on the identification, the server searches a database for a wake phrase detector appropriate for the identified source. The server then applies the wake phrase detector to the received wake phrase audio. If the wake phrase audio triggers the wake phrase detector, the server provides an appropriate response to the source.

Type: Grant

Filed: January 26, 2022

Date of Patent: July 30, 2024

Assignee: SoundHound AI IP, LLC.

Inventors: Newton Jain, Sameer Syed Zaheer
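
The request flow in the abstract for patent 12051403 (identify the source, look up its wake phrase detector, apply the detector to the audio, respond on a match) might be sketched as follows. The class name, the in-memory dictionary standing in for the database, and the byte-prefix "detectors" are all hypothetical placeholders; a real detector would be an acoustic model.

```python
from typing import Callable, Dict, Optional

# A detector takes raw audio bytes and reports whether the wake phrase fired.
Detector = Callable[[bytes], bool]


class WakePhraseServer:
    """Toy server that keeps one wake phrase detector per request source."""

    def __init__(self) -> None:
        # Stand-in for the detector database mentioned in the abstract.
        self._detectors: Dict[str, Detector] = {}

    def register(self, source_id: str, detector: Detector) -> None:
        self._detectors[source_id] = detector

    def handle_request(self, source_id: str, audio: bytes) -> Optional[str]:
        """Return a response if the source's detector is triggered, else None."""
        detector = self._detectors.get(source_id)
        if detector is None:
            return None  # unknown source: no detector to apply
        if detector(audio):
            return f"wake-confirmed:{source_id}"
        return None


server = WakePhraseServer()
# Placeholder detectors that match on a byte prefix instead of real audio.
server.register("car-assistant", lambda audio: audio.startswith(b"hey-car"))
server.register("tv-assistant", lambda audio: audio.startswith(b"ok-tv"))
```

Keeping the detectors on the server means one service can support several differently branded assistants, each with its own wake phrase.
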
MULTI-PARTICIPANT VOICE ORDERING

Publication number: 20240212678

Abstract: A voice interface recognizes spoken utterances from multiple users. It responds to the utterances in ways such as modifying the attributes of instances of items. The voice interface computes a voice vector for each utterance and associates it with the item instance that is modified. For following utterances with a closely matching voice vector, the voice interface modifies the same instance. For following utterances with a voice vector that is not a close match to one stored for any item instance, the voice interface modifies a different item instance.

Type: Application

Filed: December 21, 2023

Publication date: June 27, 2024

Applicant: SoundHound AI IP, LLC

Inventors: Robert Macrae, Jon Grossman, Scott Halstvedt
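
The voice-vector matching described in the abstract for publication 20240212678 can be sketched roughly as below. Cosine similarity, the 0.9 threshold, and the choice to create a new item instance when no stored vector matches are assumptions for illustration; the abstract says only that a different item instance is modified.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class ItemInstance:
    voice_vector: List[float]
    attributes: Dict[str, str] = field(default_factory=dict)


class Order:
    """Toy multi-speaker order: one item instance per distinct voice."""

    def __init__(self, match_threshold: float = 0.9) -> None:
        self.match_threshold = match_threshold  # hypothetical cutoff
        self.items: List[ItemInstance] = []

    def apply_utterance(self, voice_vector: Sequence[float],
                        changes: Dict[str, str]) -> int:
        """Apply attribute changes to the item whose stored voice vector
        best matches the speaker; return the index of the modified item."""
        best_index, best_score = -1, self.match_threshold
        for index, item in enumerate(self.items):
            score = cosine_similarity(voice_vector, item.voice_vector)
            if score >= best_score:
                best_index, best_score = index, score
        if best_index < 0:
            # No close match: start a separate item for this speaker.
            self.items.append(ItemInstance(list(voice_vector)))
            best_index = len(self.items) - 1
        self.items[best_index].attributes.update(changes)
        return best_index


order = Order()
first = order.apply_utterance([1.0, 0.0], {"item": "burger"})
second = order.apply_utterance([0.0, 1.0], {"item": "salad"})
third = order.apply_utterance([0.99, 0.05], {"cheese": "none"})
```

The third utterance comes from a voice close to the first speaker's, so it changes the burger rather than the salad.
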
Automatic synchronization for an offline virtual assistant

Patent number: 12020696

Abstract: [Object] Technology is provided to enable a mobile terminal to function as a digital assistant even when the mobile terminal is in a state where it cannot communicate with a server apparatus. [Solution] When a user terminal 200 receives a query A from a user, user terminal 200 sends query A to a server 100. Server 100 interprets the meaning of query A using a grammar A. Server 100 obtains a response to query A based on the meaning of query A and sends the response to user terminal 200. Server 100 further sends grammar A to user terminal 200. That is, server 100 sends to user terminal 200 a grammar used to interpret the query received from user terminal 200.

Type: Grant

Filed: October 21, 2019

Date of Patent: June 25, 2024

Assignee: SoundHound AI IP, LLC

Inventor: Karl Stahl
Method and system for conversation transcription with metadata

Patent number: 12020708

Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.

Type: Grant

Filed: October 11, 2021

Date of Patent: June 25, 2024

Assignee: SoundHound AI IP, LLC.

Inventors: Kiersten L. Bradley, Ethan Coeytaux, Ziming Yin
Sponsored search ranking simulation for patterns triggered by natural language queries

Patent number: 12013862

Abstract: The technology disclosed relates to natural language understanding-based search engines, ranking sponsored search results and simulated ranking of sponsored search results. Tools and methods describe how to simulate the ranking of sponsored search results. The tools further identify instances of user queries within the scope of trigger patterns, optionally providing examples both of user queries for which a sponsored search result is likely to be displayed and examples for which the sponsored search result will not rank highly enough to be displayed, at least on the first page of search results.

Type: Grant

Filed: December 27, 2019

Date of Patent: June 18, 2024

Assignee: SoundHound AI IP, LLC

Inventors: Bernard Mont-Reynaud, Keyvan Mohajer, Kamyar Mohajer, Chris Wilson
Enabling natural language interactions with user interfaces for users of a software application

Patent number: 12008991

Abstract: A user specifies a natural language command to a device. Software on the device generates contextual metadata about the user interface of the device, such as data about all visible elements of the user interface, and sends the contextual metadata along with the natural language command to a natural language understanding engine. The natural language understanding engine parses the natural language query using a stored grammar (e.g., a grammar provided by a maker of the device) and as a result of the parsing identifies information about the command (e.g., the user interface elements referenced by the command) and provides that information to the device. The device uses that provided information to respond to the command.

Type: Grant

Filed: May 27, 2021

Date of Patent: June 11, 2024

Assignee: SoundHound AI IP, LLC

Inventors: Utku Yabas, Philipp Hubert, Karl Stahl
SYSTEM AND METHOD FOR ADAPTED INTERACTIVE EXPERIENCES

Publication number: 20240185853

Abstract: Natural language grammars interpret expressions at the conversational human-machine interfaces of devices. Under conditions favoring engagement, as specified in a unit of conversational code, the device initiates a discussion using one or more of TTS, images, video, audio, and animation depending on the device capabilities of screen and audio output. Conversational code units specify conditions based on conversation state, mood, and privacy. Grammars provide intents that cause calls to system functions. Units can provide scripts for guiding the conversation. The device, or supporting server system, can provide feedback to creators of the conversational code units for analysis and machine learning.

Type: Application

Filed: February 13, 2024

Publication date: June 6, 2024

Applicant: SoundHound AI IP, LLC

Inventors: Joel McKENZIE, Qindi ZHANG
Multi-modal audio processing for voice-controlled devices

Patent number: 11997448

Abstract: A voice-controlled device includes a microphone to receive a set of sound waves that includes speech uttered by a user and other sound, and to output a first audio signal that includes a contribution from the speech uttered by the user and a contribution from the other sound. The device also includes a receiver to receive an electromagnetic signal and to output a second audio signal obtained from the electromagnetic signal. An audio pre-processor of the device processes the first audio signal using the second audio signal to reduce the contribution from the other sound in a processed audio signal. The voice-controlled device then provides the processed audio signal to a speech recognition module to determine a voice command issued by the user.

Type: Grant

Filed: April 3, 2023

Date of Patent: May 28, 2024

Assignee: SOUNDHOUND AI IP, LLC

Inventor: Karl Stahl
Multiple service levels for automatic speech recognition

Patent number: 11978454

Abstract: A system for performing automated speech recognition (ASR) on audio data includes a queue manager to receive a request to perform ASR on audio data, add the request to a queue of incoming requests, and determine a queue depth representing a number of requests in the queue at a given time. The system also includes a load supervisor to receive the request and the queue depth from the queue manager and assign a service level for the request based on the queue depth. In addition, the system includes a speech-to-text converter to receive the assigned service level for the request from the load supervisor, select an ASR model for the request based on the received service level, receive the audio data associated with the request, and perform ASR on the audio data using the selected ASR model.

Type: Grant

Filed: September 16, 2021

Date of Patent: May 7, 2024

Assignee: SOUNDHOUND AI IP, LLC

Inventors: Timothy P. Stonehocker, Zizu Gowayyed, Matthias Eichstaedt, Seyed Majid Emami, Evelyn Jiang, Ryan Berryhill, Mathieu Ramona, Neil Veira
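
The three-stage flow in the abstract for patent 11978454 (queue manager reports queue depth, load supervisor assigns a service level, speech-to-text converter picks an ASR model) can be sketched as below. The depth thresholds, level names, and model names are hypothetical; the abstract does not specify them.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Tuple


@dataclass
class Request:
    request_id: str
    audio: bytes


class QueueManager:
    """Holds incoming ASR requests and reports the current queue depth."""

    def __init__(self) -> None:
        self._queue: Deque[Request] = deque()

    def add(self, request: Request) -> int:
        self._queue.append(request)
        return len(self._queue)  # queue depth after adding the request


# (maximum queue depth, service level) pairs; values are illustrative only.
SERVICE_LEVELS: List[Tuple[int, str]] = [(10, "high"), (50, "medium")]

# Hypothetical model names keyed by service level.
ASR_MODELS: Dict[str, str] = {
    "high": "large-accurate-model",
    "medium": "mid-size-model",
    "low": "small-fast-model",
}


def assign_service_level(queue_depth: int) -> str:
    """Load supervisor: deeper queues get a cheaper service level."""
    for max_depth, level in SERVICE_LEVELS:
        if queue_depth <= max_depth:
            return level
    return "low"


def select_model(service_level: str) -> str:
    """Speech-to-text converter: choose the ASR model for a service level."""
    return ASR_MODELS[service_level]


manager = QueueManager()
depth = manager.add(Request("req-1", b"\x00\x01"))
model = select_model(assign_service_level(depth))
```

Under light load every request gets the most accurate model; as the queue grows, requests fall back to faster models so latency stays bounded.
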
Wakeword selection

Patent number: 11948571

Abstract: A system and method are disclosed capable of parsing a spoken utterance into a natural language request and a speech audio segment, where the natural language request directs the system to use the speech audio segment as a new wakeword. In response to this wakeword assignment directive, the system and method are further capable of immediately building a new wakeword spotter to activate the device upon matching the new wakeword in the input audio. Different approaches to promptly building a new wakeword spotter are described. Variations of wakeword assignment directives can make the new wakeword public or private. They can also add the new wakeword to earlier wakewords, or replace earlier wakewords.

Type: Grant

Filed: March 30, 2022

Date of Patent: April 2, 2024

Assignee: SoundHound AI IP, LLC

Inventor: Bernard Mont-Reynaud
Wake suppression for audio playing and listening devices

Patent number: 11922939

Abstract: A system and method are disclosed for ignoring a wakeword received at a speech-enabled listening device when it is determined the wakeword is reproduced audio from an audio-playing device. Determination can be by detecting audio distortions, by an ignore flag sent locally between an audio-playing device and speech-enabled device, by an ignore flag sent from a server, by comparison of received audio played audio to a wakeword within an audio-playing device or a speech-enabled device, and other means.

Type: Grant

Filed: May 4, 2022

Date of Patent: March 5, 2024

Assignee: SoundHound AI IP, LLC

Inventors: Hsuan Yang, Qindi Zhang, Warren S. Heit
MESSAGE PROCESSING METHOD, INFORMATION PROCESSING APPARATUS, AND PROGRAM

Publication number: 20240073161

Abstract: [Object] To provide a technique for more accurate interpretation of a message inputted by a user. [Solving Means] An information processing server 300 obtains a first message from a user in a thread 001, has a context of the first message stored in a context database 500 in association with the thread 001, obtains a second message from the user in the thread 001, and provides the second message to a conversation server 400 together with the context of the first message.

Type: Application

Filed: August 25, 2023

Publication date: February 29, 2024

Applicant: SoundHound AI IP, LLC.

Inventors: Yuki Matsuda, Keisuke Tsuchida
VIRTUAL ASSISTANT DOMAIN FUNCTIONALITY

Publication number: 20240054297

Abstract: Aspects include methods, systems, and computer-program products providing virtual assistant domain functionality. A natural language query including one or more words is received. A collection of natural language modules is accessed. The collection of natural language modules is configured to process sets of natural language queries. A natural language module, from the collection of natural language modules, is identified to interpret the natural language query. An interpretation of the natural language query is computed using the identified natural language module. A response to the natural language query is returned using the computed interpretation.

Type: Application

Filed: October 24, 2023

Publication date: February 15, 2024

Applicant: SoundHound AI IP, LLC

Inventors: Kamyar Mohajer, Keyvan Mohajer, Bernard Mont-Reynaud, Pranav Singh
System and method for adapted interactive experiences

Patent number: 11900928

Abstract: Natural language grammars interpret expressions at the conversational human-machine interfaces of devices. Under conditions favoring engagement, as specified in a unit of conversational code, the device initiates a discussion using one or more of TTS, images, video, audio, and animation depending on the device capabilities of screen and audio output. Conversational code units specify conditions based on conversation state, mood, and privacy. Grammars provide intents that cause calls to system functions. Units can provide scripts for guiding the conversation. The device, or supporting server system, can provide feedback to creators of the conversational code units for analysis and machine learning.

Type: Grant

Filed: December 23, 2017

Date of Patent: February 13, 2024

Assignee: SoundHound AI IP, LLC

Inventors: Joel McKenzie, Qindi Zhang
MEANING INFERENCE FROM SPEECH AUDIO

Publication number: 20240046918

Abstract: A system and method invoke a virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.

Type: Application

Filed: September 26, 2023

Publication date: February 8, 2024

Applicant: SoundHound AI IP, LLC

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
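
The threshold logic in the abstract for publication 20240046918 can be illustrated with a small sketch. It assumes per-frame probabilities are already available as `(time, intent_probability, value_probability)` tuples, that the variable value must cross its threshold at or after the intent does, and that the thresholds and `max_gap` window shown are arbitrary; none of these details come from the abstract.

```python
from typing import Iterable, Optional, Tuple

Frame = Tuple[float, float, float]  # (time in seconds, intent prob, value prob)


def first_trigger(frames: Iterable[Frame],
                  intent_threshold: float = 0.8,
                  value_threshold: float = 0.7,
                  max_gap: float = 1.0) -> Optional[float]:
    """Return the time at which the action would be invoked, or None.

    The action fires when the variable-value probability exceeds its
    threshold no later than max_gap seconds after the intent probability
    first exceeded its own threshold.
    """
    intent_time: Optional[float] = None
    for t, intent_p, value_p in frames:
        if intent_time is None and intent_p > intent_threshold:
            intent_time = t  # remember when the intent became confident
        if intent_time is not None and value_p > value_threshold:
            return t if t - intent_time <= max_gap else None
    return None


# Hypothetical probability traces for two utterances.
quick = [(0.0, 0.2, 0.1), (0.5, 0.85, 0.3), (1.0, 0.9, 0.75)]
slow = [(0.0, 0.9, 0.1), (2.0, 0.9, 0.8)]
quick_time = first_trigger(quick)
slow_time = first_trigger(slow)
```

In the first trace the value becomes confident half a second after the intent, so the action is invoked; in the second the gap is too long and nothing fires.
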
