Patents by Inventor Maisy Wieman

Maisy Wieman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Meaning inference from speech audio

Patent number: 12300219

Abstract: A system and method invoke virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.

Type: Grant

Filed: September 26, 2023

Date of Patent: May 13, 2025

Assignee: SoundHound AI IP, LLC

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
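
The invocation logic described in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the `Frame` fields, the threshold values, and the `window` parameter are all assumptions made for illustration, and the probabilities are taken as already inferred from audio by some model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Frame:
    """One time step of (hypothetical) model output."""
    t: float                              # seconds since the start of the audio
    intent_p: float                       # inferred probability of the intent
    variable_p: float                     # inferred probability of the best variable value
    variable_value: Optional[str] = None  # that value, e.g. a place name


def first_invocation(
    frames: Iterable[Frame],
    intent_threshold: float = 0.8,    # assumed value
    variable_threshold: float = 0.7,  # assumed value
    window: float = 1.5,              # assumed maximum gap in seconds
) -> Optional[Tuple[float, Optional[str]]]:
    """Return (time, argument) at which the action would be invoked, or None.

    The action fires once the intent and the variable value have each
    exceeded their threshold within `window` seconds of one another.
    """
    intent_time = None
    variable_time = None
    value = None
    for f in frames:
        if f.intent_p > intent_threshold:
            intent_time = f.t        # most recent time the intent exceeded its threshold
        if f.variable_p > variable_threshold:
            variable_time = f.t      # most recent time the variable exceeded its threshold
            value = f.variable_value
        if (
            intent_time is not None
            and variable_time is not None
            and abs(intent_time - variable_time) <= window
        ):
            return (f.t, value)
    return None
```

For example, if the intent exceeds its threshold at 0.5 s and the variable value "Paris" exceeds its threshold at 1.0 s, the sketch returns `(1.0, "Paris")`. If the variable value only exceeds its threshold at 3.0 s, nothing is invoked.
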

MEANING INFERENCE FROM SPEECH AUDIO

Publication number: 20240046918

Abstract: A system and method invoke virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.

Type: Application

Filed: September 26, 2023

Publication date: February 8, 2024

Applicant: SoundHound AI IP, LLC

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell

NEURAL SPEECH-TO-MEANING

Publication number: 20230419970

Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.

Type: Application

Filed: September 5, 2023

Publication date: December 28, 2023

Applicant: SoundHound AI IP, LLC

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
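
As a rough illustration of how recognized variable values might serve as arguments to an API request, the sketch below picks the most probable intent and enumerated place name from score dictionaries and builds a request URL. The endpoint `https://api.example.com/...`, the `place` parameter, the threshold, and both function names are hypothetical. The scores are assumed to come from some trained model that is not shown.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# Hypothetical mapping from intent names to API endpoints.
API_ENDPOINTS = {
    "weather": "https://api.example.com/weather",
    "directions": "https://api.example.com/directions",
}


def pick(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the highest-scoring label if its score exceeds the threshold."""
    label, p = max(scores.items(), key=lambda kv: kv[1])
    return label if p > threshold else None


def request_for(
    intent_scores: Dict[str, float],
    place_scores: Dict[str, float],
    threshold: float = 0.5,  # assumed value
) -> Optional[str]:
    """Build an API request URL from a recognized intent and place name."""
    intent = pick(intent_scores, threshold)
    if intent is None:
        return None
    place = pick(place_scores, threshold)
    if place is None:
        return None
    # The recognized variable value becomes the argument of the request.
    return API_ENDPOINTS[intent] + "?" + urlencode({"place": place})
```

Calling `request_for({"weather": 0.9, "directions": 0.05}, {"Paris": 0.8, "Perth": 0.1})` under these assumptions yields `https://api.example.com/weather?place=Paris`.
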

Meaning inference from speech audio

Patent number: 11769488

Abstract: A system and method invoke virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.

Type: Grant

Filed: March 3, 2022

Date of Patent: September 26, 2023

Assignee: SoundHound AI IP, LLC

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell

Neural speech-to-meaning

Patent number: 11749281

Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.

Type: Grant

Filed: December 4, 2019

Date of Patent: September 5, 2023

Assignee: SoundHound AI IP, LLC

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell

Neural acoustic model

Patent number: 11392833

Abstract: An audio processing system is described. The audio processing system uses a convolutional neural network architecture to process audio data, a recurrent neural network architecture to process at least data derived from an output of the convolutional neural network architecture, and a feed-forward neural network architecture to process at least data derived from an output of the recurrent neural network architecture. The feed-forward neural network architecture is configured to output classification scores for a plurality of sound units associated with speech. The classification scores indicate a presence of one or more sound units in the audio data. The convolutional neural network architecture has a plurality of convolutional groups arranged in series, where a convolutional group includes a combination of two data mappings arranged in parallel.

Type: Grant

Filed: February 13, 2020

Date of Patent: July 19, 2022

Assignee: SoundHound, Inc.

Inventors: Maisy Wieman, Andrew Carl Spencer, Zìlì Lǐ, Cristina Vasconcelos
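
The data flow described in the abstract above (convolutional groups in series, then a recurrent stage, then a feed-forward classifier over sound units) can be shown with a toy, pure-Python sketch. Everything specific here is an assumption for illustration: one scalar feature per frame, random untrained weights, a three-tap kernel, and a pointwise scaling as the second parallel mapping. It shows only the shape of the pipeline, not the patented architecture.

```python
import math
import random


def conv1d(xs, kernel):
    """Same-length sliding-window filter over time with zero padding (no kernel flip)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(xs) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(xs))]


def conv_group(xs, kernel, skip_weight):
    """Two data mappings in parallel, combined by summation and a ReLU."""
    a = conv1d(xs, kernel)             # mapping 1: convolution over time
    b = [skip_weight * x for x in xs]  # mapping 2: pointwise scaling (assumed)
    return [max(0.0, p + q) for p, q in zip(a, b)]


def recurrent(xs, w_in, w_rec):
    """Minimal recurrent stage with a scalar hidden state."""
    h, out = 0.0, []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        out.append(h)
    return out


def softmax(zs):
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    total = sum(es)
    return [e / total for e in es]


def acoustic_model(frames, n_units=4, n_groups=2, seed=0):
    """Return, per frame, classification scores over n_units sound units."""
    rng = random.Random(seed)  # untrained, random weights for illustration only
    x = list(frames)
    for _ in range(n_groups):  # convolutional groups arranged in series
        kernel = [rng.uniform(-1, 1) for _ in range(3)]
        x = conv_group(x, kernel, rng.uniform(-1, 1))
    h = recurrent(x, rng.uniform(-1, 1), rng.uniform(-1, 1))
    weights = [rng.uniform(-1, 1) for _ in range(n_units)]
    biases = [rng.uniform(-1, 1) for _ in range(n_units)]
    # Feed-forward stage: one score vector per frame.
    return [softmax([w * v + b for w, b in zip(weights, biases)]) for v in h]
```

Given five input frames, `acoustic_model` returns five score vectors of length four, each summing to one.
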

Meaning Inference from Speech Audio

Publication number: 20220189464

Abstract: A system and method invoke virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.

Type: Application

Filed: March 3, 2022

Publication date: June 16, 2022

Applicant: SoundHound, Inc.

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell

Synthesizing speech recognition training data

Patent number: 11308938

Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering) generates a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.

Type: Grant

Filed: December 5, 2019

Date of Patent: April 19, 2022

Assignee: SoundHound, Inc.

Inventors: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy
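
One way to picture a cost function that favors an even spread is the greedy sketch below. It is an assumed stand-in, not the method claimed in the patent: `embed` replaces the real synthesize-then-analyze step with a simple hypothetical mapping of `(pitch, rate)` parameters, the cost is a sum of inverse squared distances, and the candidates are assumed to lie already within the range observed for natural speech.

```python
import math


def embed(params):
    """Hypothetical stand-in for synthesizing speech and embedding it."""
    pitch, rate = params
    return (pitch / 100.0, rate)


def added_cost(vec, chosen, eps=1e-9):
    """Clustering penalty: large when vec lies close to already chosen vectors."""
    return sum(1.0 / (math.dist(vec, c) ** 2 + eps) for c in chosen)


def select_spread(candidates, k):
    """Greedily pick up to k parameter sets whose embeddings are evenly spread."""
    chosen_params = [candidates[0]]
    chosen_vecs = [embed(candidates[0])]
    remaining = list(candidates[1:])
    while len(chosen_params) < k and remaining:
        # Take the candidate that adds the least clustering cost.
        best = min(remaining, key=lambda p: added_cost(embed(p), chosen_vecs))
        remaining.remove(best)
        chosen_params.append(best)
        chosen_vecs.append(embed(best))
    return chosen_params
```

With candidates `[(100, 1.0), (101, 1.0), (102, 1.0), (200, 1.0), (300, 1.0)]` and `k=3`, the sketch skips the two near-duplicates of the first candidate and selects `(100, 1.0)`, `(300, 1.0)`, and `(200, 1.0)`.
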

NEURAL ACOUSTIC MODEL

Publication number: 20210256386

Abstract: An audio processing system is described. The audio processing system uses a convolutional neural network architecture to process audio data, a recurrent neural network architecture to process at least data derived from an output of the convolutional neural network architecture, and a feed-forward neural network architecture to process at least data derived from an output of the recurrent neural network architecture. The feed-forward neural network architecture is configured to output classification scores for a plurality of sound units associated with speech. The classification scores indicate a presence of one or more sound units in the audio data. The convolutional neural network architecture has a plurality of convolutional groups arranged in series, where a convolutional group includes a combination of two data mappings arranged in parallel.

Type: Application

Filed: February 13, 2020

Publication date: August 19, 2021

Applicant: SoundHound, Inc.

Inventors: Maisy Wieman, Andrew Carl Spencer, Zìlì Li, Cristina Vasconcelos

Synthesizing Speech Recognition Training Data

Publication number: 20210174783

Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering) generates a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.

Type: Application

Filed: December 5, 2019

Publication date: June 10, 2021

Applicant: SoundHound, Inc.

Inventors: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy

Neural Speech-to-Meaning

Publication number: 20210174806

Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.

Type: Application

Filed: December 4, 2019

Publication date: June 10, 2021

Applicant: SoundHound, Inc.

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell