Patents Assigned to SoundHound, Inc.
  • Publication number: 20210241759
    Abstract: A system and method are disclosed for ignoring a wakeword received at a speech-enabled listening device when it is determined the wakeword is reproduced audio from an audio-playing device. Determination can be by detecting audio distortions, by an ignore flag sent locally between an audio-playing device and speech-enabled device, by an ignore flag sent from a server, by comparison of received or played audio to a wakeword within an audio-playing device or a speech-enabled device, or by other means.
    Type: Application
    Filed: February 4, 2020
    Publication date: August 5, 2021
    Applicant: SoundHound, Inc.
    Inventors: Hsuan Yang, Qindi Zhang, Warren S. Heit
  • Publication number: 20210241769
    Abstract: A method of providing a platform for configuring device-specific speech recognition is provided. The method includes providing a user interface for developers to select a set of at least two acoustic models appropriate for a specific type of a device, receiving, from a developer, a selection of the set of the at least two acoustic models, and configuring a speech recognition system to perform device-specific speech recognition by using one acoustic model selected from the at least two acoustic models of the set.
    Type: Application
    Filed: April 21, 2021
    Publication date: August 5, 2021
    Applicant: SOUNDHOUND, INC.
    Inventors: Keyvan MOHAJER, Mehul PATEL
  • Publication number: 20210224043
    Abstract: A method of building a natural language understanding application is provided. The method includes receiving at least one electronic record containing programming code and creating executable code from the programming code. Further, the executable code, when executed by a processor, causes the processor to create a parse and an interpretation of a sequence of input tokens. The programming code includes an interpret-block, and the interpret-block includes an interpret-statement. Additionally, the interpret-statement includes a pattern expression and an action statement.
    Type: Application
    Filed: April 8, 2021
    Publication date: July 22, 2021
    Applicant: SoundHound, Inc.
    Inventors: Bernard Mont-Reynaud, Seyed M. Emami, Chris Wilson, Keyvan Mohajer
  • Publication number: 20210217431
    Abstract: A voice morphing apparatus having adjustable parameters is described. The disclosed system and method include a voice morphing apparatus that morphs input audio to mask a speaker's identity. Parameter adjustment uses evaluation of an objective function that is based on the input audio and output of the voice morphing apparatus. The voice morphing apparatus includes objectives that are based adversarially on speaker identification and positively on audio fidelity. Thus, the voice morphing apparatus is adjusted to reduce identifiability of speakers while maintaining fidelity of the morphed audio. The voice morphing apparatus may be used as part of an automatic speech recognition system.
    Type: Application
    Filed: January 11, 2020
    Publication date: July 15, 2021
    Applicant: SoundHound, Inc.
    Inventor: Steve PEARSON
  • Publication number: 20210210099
    Abstract: A method and system for responding to multiple voice requests sent from a group of devices in substantive response to a single spoken utterance of a user. In one embodiment, if the devices have a same group ID, a server determines if any of the group of received voice requests are duplicate. In one embodiment, voice requests received within a predetermined time window are examined to determine if they are duplicate. If so, the server deems one of the received voice requests as non-duplicate and the others as duplicate and sends a substantive response for the non-duplicate voice request. In some embodiments, a no-op is sent to the devices that do not receive the substantive response.
    Type: Application
    Filed: January 6, 2020
    Publication date: July 8, 2021
    Applicant: SoundHound, Inc.
    Inventors: Arvinderpal S. Wander, Evelyn Jiang, Matthias Eichstaedt, Timothy Calhoun
  • Publication number: 20210193159
    Abstract: Systems and methods for training a voice morphing apparatus are described. The voice morphing apparatus is trained to morph input audio data to mask an identity of a speaker. Training is performed by evaluating an objective function that is a function of the input audio data and an output of the voice morphing apparatus. The objective function may have a first term that is based on speaker identification and a second term that is based on audio fidelity. By optimizing the objective function, parameters of the voice morphing apparatus may be adjusted so as to reduce a confidence of speaker identification and maintain an audio fidelity of the morphed audio data. The voice morphing apparatus, once trained, may be used as part of an automatic speech recognition system.
    Type: Application
    Filed: January 10, 2020
    Publication date: June 24, 2021
    Applicant: SoundHound, Inc.
    Inventor: Steve PEARSON
  • Patent number: 11043213
    Abstract: A system and method are disclosed for capturing a segment of speech audio, performing phoneme recognition on the segment of speech audio to produce a segmented phoneme sequence, comparing the segmented phoneme sequence to stored phoneme sequences that represent incorrect pronunciations of words to determine if there is a match, and identifying an incorrect pronunciation for a word in the segment of speech audio. The system builds a library based on the data collected for the incorrect pronunciations.
    Type: Grant
    Filed: December 7, 2018
    Date of Patent: June 22, 2021
    Assignee: SoundHound, Inc.
    Inventors: Katayoun Norouzi, Karl Stahl
  • Publication number: 20210182660
    Abstract: Systems and methods for distributed training of a neural network model are described. Various embodiments include a master device and a slave device. The master device has a first version of the neural network model. The slave device is communicatively coupled to a first data source and the master device, and the first data source is inaccessible by the master device, in accordance with one embodiment. The slave device is remote from the master device. The master device is configured to output first configuration data for the neural network model based on the first version of the neural network model. The slave device is configured to use the first configuration data to instantiate a second version of the neural network model. The slave device is configured to train the second version of the neural network model using data from the first data source and to output second configuration data for the neural network model.
    Type: Application
    Filed: December 16, 2019
    Publication date: June 17, 2021
    Applicant: SoundHound, Inc.
    Inventors: Asif Amirguliyev, Zili Li, Jonah Probell
  • Publication number: 20210182661
    Abstract: Training and enhancement of neural network models, such as from private data, are described. A slave device receives a version of a neural network model from a master. The slave accesses a local and/or private data source and uses the data to perform optimization of the neural network model. This can be done, for example, by computing gradients or performing knowledge distillation to locally train an enhanced second version of the model. The slave sends the gradients or enhanced neural network model to the master. The master may use the gradients or second version of the model to improve a master model.
    Type: Application
    Filed: December 17, 2019
    Publication date: June 17, 2021
    Applicant: SoundHound, Inc.
    Inventors: Zili LI, Asif AMIRGULIYEV, Jonah PROBELL
  • Publication number: 20210174794
    Abstract: A system and method are disclosed capable of parsing a spoken utterance into a natural language request and a speech audio segment, where the natural language request directs the system to use the speech audio segment as a new wakeword. In response to this wakeword assignment directive, the system and method are further capable of immediately building a new wakeword spotter to activate the device upon matching the new wakeword in the input audio. Different approaches to promptly building a new wakeword spotter are described. Variations of wakeword assignment directives can make the new wakeword public or private. They can also add the new wakeword to earlier wakewords, or replace earlier wakewords.
    Type: Application
    Filed: December 5, 2019
    Publication date: June 10, 2021
    Applicant: SoundHound, Inc.
    Inventor: Bernard Mont-Reynaud
  • Publication number: 20210174806
    Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.
    Type: Application
    Filed: December 4, 2019
    Publication date: June 10, 2021
    Applicant: SoundHound, Inc.
    Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
  • Publication number: 20210174783
    Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.
    Type: Application
    Filed: December 5, 2019
    Publication date: June 10, 2021
    Applicant: SoundHound, Inc.
    Inventors: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy
  • Patent number: 11030993
    Abstract: A method is provided for advertisement selection. The method includes recognizing words from user speech over a large number of interactions, computing a number of unique words uttered during the interactions, classifying the user by the number of unique words uttered during the interactions, and selecting an advertisement targeted to the classified users.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: June 8, 2021
    Assignee: SoundHound, Inc.
    Inventors: Jun Huang, Kiran Garaga Lokeswarappa, Joel Gedalius, Bernard Mont-Reynaud
  • Patent number: 11023509
    Abstract: A method for processing a natural language query. The method includes receiving a text query, the query referring to a plurality of objects, attributes, qualifiers and other arguments, and parsing the query to produce an argument tree representing the substance and structure of the query. The method also includes the capability to define qualifiers as being possibly projectable onto other arguments and indicate their direction of projectability, and the capability to denote nodes of the argument tree as foldable, as splittable, or as containing sequences of qualifier arguments. The method additionally includes defining validity rules for a domain of knowledge, used to determine whether a list of arguments forms a valid granular query component, and processing the argument tree in view of the above in order to derive a corresponding plurality of granular query components that collectively request the plurality of pieces of information representing the intent of the query.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: June 1, 2021
    Assignee: SOUNDHOUND, INC.
    Inventors: Jason Weinstein, Keyvan Mohajer
  • Patent number: 11011162
    Abstract: The technology disclosed relates to performing speech recognition for a plurality of different devices or devices in a plurality of conditions. This includes storing a plurality of acoustic models associated with different devices or device conditions, receiving speech audio including natural language utterances, receiving metadata indicative of a device type or device condition, selecting an acoustic model from the plurality in dependence upon the received metadata, and employing the selected acoustic model to recognize speech from the natural language utterances included in the received speech audio. Both speech recognition and the storage of acoustic models can be performed locally by devices or on a network-connected server. Also provided are a platform and an interface, used by device developers to select, configure, and/or train acoustic models for particular devices and/or conditions.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: May 18, 2021
    Assignee: SOUNDHOUND, INC.
    Inventors: Mehul Patel, Keyvan Mohajer
  • Patent number: 11003426
    Abstract: A command-processing server provides natural language processing services to applications. The command-processing server stores a set of code blocks, each code block being able to interpret a set of corresponding natural language expressions. The command-processing server accepts natural language expressions and identifies the code blocks that are capable of interpreting those expressions by attempting to parse the natural language expressions using the code blocks. The command-processing server then provides a list of the identified code blocks to the developers, who can then incorporate the code blocks into their applications.
    Type: Grant
    Filed: February 10, 2020
    Date of Patent: May 11, 2021
    Assignee: SOUNDHOUND, INC.
    Inventors: Christopher S. Wilson, Keyvan Mohajer
  • Patent number: 10996931
    Abstract: The technology disclosed relates to authoring of vertical applications of natural language understanding (NLU), which analyze text or utterances and construct their meaning. In particular, it relates to new programming constructs and tools and data structures implementing those new applications.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: May 4, 2021
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Seyed M. Emami, Chris Wilson, Bernard Mont-Reynaud
  • Publication number: 20210118435
    Abstract: [Object] Technology is provided to enable a mobile terminal to function as a digital assistant even when the mobile terminal is in a state where it cannot communicate with a server apparatus. [Solution] When a user terminal 200 receives a query A from a user, user terminal 200 sends query A to a server 100. Server 100 interprets the meaning of query A using a grammar A. Server 100 obtains a response to query A based on the meaning of query A and sends the response to user terminal 200. Server 100 further sends grammar A to user terminal 200. That is, server 100 sends to user terminal 200 a grammar used to interpret the query received from user terminal 200.
    Type: Application
    Filed: October 21, 2019
    Publication date: April 22, 2021
    Applicant: SoundHound, Inc.
    Inventor: Karl Stahl
  • Publication number: 20210089626
    Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift.
    Type: Application
    Filed: September 22, 2019
    Publication date: March 25, 2021
    Applicant: SoundHound, Inc.
    Inventor: Dylan H. Ross
  • Patent number: 10957310
    Abstract: The technology disclosed relates to authoring of vertical applications of natural language understanding (NLU), which analyze text or utterances and construct their meaning. In particular, it relates to new programming constructs and tools and data structures implementing those new applications.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: March 23, 2021
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Seyed Majid Emami, Chris Wilson, Bernard Mont-Reynaud