Patents Assigned to SoundHound, Inc.
-
Publication number: 20220405797
Abstract: Ads are generated based on product info and consumer profiles. A discriminator evaluates probabilities of ads being effective at causing consumer engagement. A decoder extracts product info from generated ads. Based on the probabilities of ads being effective and similarity of extracted and source product info, generated ads are labeled as examples. The examples are used in training an improved ad generator. Ads may be visual and/or audio containing speech. Ads may even contain humor, as recognized by mismatches between source and decoded product info.
Type: Application
Filed: August 18, 2022
Publication date: December 22, 2022
Applicant: SoundHound, Inc.
Inventor: Jonah PROBELL
-
Publication number: 20220408059
Abstract: A system and a method are disclosed that enable sidebar conversations between two or more attendees that are participating in a primary or main meeting. The sidebar conversation occurs in conjunction or concurrently with the primary meeting. A first attendee provides commands to indicate a desire to initiate a sidebar conversation and information about a targeted attendee. The commands are analyzed to determine if a trigger phrase is included. The commands are analyzed to determine if there is an identification of a second (targeted) attendee, who is currently participating in the main meeting. If the second attendee is available, then the sidebar conversation is initiated. Additional attendees can be added to the sidebar conversation.
Type: Application
Filed: June 21, 2021
Publication date: December 22, 2022
Applicant: SoundHound, Inc.
Inventor: Timothy P STONEHOCKER
-
Patent number: 11531819
Abstract: Machine learned models take in vectors representing desired behaviors and generate voice vectors that provide the parameters for text-to-speech (TTS) synthesis. Models may be trained on behavior vectors that include user profile attributes, situational attributes, or semantic attributes. Situational attributes may include age of people present, music that is playing, location, noise, and mood. Semantic attributes may include presence of proper nouns, number of modifiers, emotional charge, and domain of discourse. TTS voice parameters may apply per utterance and per word so as to enable contrastive emphasis.
Type: Grant
Filed: January 14, 2020
Date of Patent: December 20, 2022
Assignee: SoundHound, Inc.
Inventors: Bernard Mont-Reynaud, Monika Almudafar-Depeyrot
-
Publication number: 20220383869
Abstract: A user specifies a natural language command to a device. Software on the device generates contextual metadata about the user interface of the device, such as data about all visible elements of the user interface, and sends the contextual metadata along with the natural language command to a natural language understanding engine. The natural language understanding engine parses the natural language query using a stored grammar (e.g., a grammar provided by a maker of the device) and as a result of the parsing identifies information about the command (e.g., the user interface elements referenced by the command) and provides that information to the device. The device uses that provided information to respond to the command.
Type: Application
Filed: May 27, 2021
Publication date: December 1, 2022
Applicant: SoundHound, Inc.
Inventors: Utku YABAS, Philipp HUBERT, Karl STAHL
-
Publication number: 20220382823
Abstract: As audio (1) is input to an extension of a browser, the extension transmits the audio (1) to a language processing server. A speech recognition unit obtains a text (1) corresponding to the audio (1), and transmits the text (1) to a natural language understanding unit. In the natural language understanding unit, an information processing unit identifies a URL (1) corresponding to the text (1), and transmits the URL (1) to the browser. The extension passes the URL (1) to a browsing function. The browsing function uses the URL (1) to access a web server. The web server transmits a web page (1) corresponding to the URL (1) to the browser. The browsing function shows a screen corresponding to the web page (1) on a display.
Type: Application
Filed: January 26, 2022
Publication date: December 1, 2022
Applicant: SoundHound, Inc.
Inventors: Masaki NAITO, Keisuke TSUCHIDA, Jun YONEYAMA, Kaku SAWADA
-
Publication number: 20220343014
Abstract: A system and method are disclosed for fulfilling GDPR and other privacy requests in a client device system as well as a downstream service provider with which the client device system partners. In examples, the downstream service provider may be a voice assistant service provider providing voice recognition and language understanding capabilities to an upstream client device system.
Type: Application
Filed: April 22, 2021
Publication date: October 27, 2022
Applicant: SoundHound, Inc.
Inventors: Kevin QIU, Evelyn JIANG, Matthias EICHSTAEDT, Warren S. HEIT
-
Patent number: 11461812
Abstract: Original concepts obtained from a query may be augmented with additional concepts connected to the original concepts in a concept graph in response to determining that the original concepts did not match a sufficient number of bid functions. The augmented set of concepts may then be evaluated with respect to the bid functions to identify matching ad functions. This process may be repeated until a sufficient number of matching ad functions are found. A bid amount of the matching bid functions may be calculated, such as based on semantic information obtained as a result of the query. The bid amounts may further be based on environmental information. A bid function is selected based on the bid amounts and the content associated with the bid function is provided to the source of the query. The content may be selected based on the semantic information.
Type: Grant
Filed: September 16, 2019
Date of Patent: October 4, 2022
Assignee: SoundHound, Inc.
Inventors: Keyvan Mohajer, Scott Halstvedt
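The augment-and-retry loop described in the abstract of patent 11461812 can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the function name `match_bids`, the dictionary-based concept graph, the predicate-style bid functions, and the `min_matches` and `max_rounds` parameters are all assumptions introduced here.

```python
def match_bids(concepts, graph, bid_functions, min_matches=2, max_rounds=3):
    """Expand a concept set through a graph until enough bid functions match.

    concepts: iterable of concept names obtained from a query.
    graph: dict mapping a concept to an iterable of connected concepts (assumed shape).
    bid_functions: dict mapping a bid name to a predicate taking a set of concepts.
    Returns (list of matching bid names, final concept set).
    """
    current = set(concepts)
    matches = []
    for _ in range(max_rounds + 1):
        # Evaluate every bid function against the current concept set.
        matches = [name for name, accepts in bid_functions.items() if accepts(current)]
        if len(matches) >= min_matches:
            break
        # Not enough matches: add concepts connected to the ones we already have.
        expanded = current | {n for c in current for n in graph.get(c, ())}
        if expanded == current:
            break  # the graph offers nothing new, so stop retrying
        current = expanded
    return matches, current


# Hypothetical data for illustration.
graph = {"pizza": ["food"], "food": ["restaurant"]}
bids = {
    "food_ad": lambda c: "food" in c,
    "dining_ad": lambda c: "restaurant" in c,
}
matched, final_concepts = match_bids(["pizza"], graph, bids)
```

Computing bid amounts and selecting a winning bid, which the abstract also mentions, would follow this step and is left out of the sketch.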
-
Publication number: 20220262362
Abstract: A system and method are disclosed for ignoring a wakeword received at a speech-enabled listening device when it is determined the wakeword is reproduced audio from an audio-playing device. Determination can be by detecting audio distortions, by an ignore flag sent locally between an audio-playing device and speech-enabled device, by an ignore flag sent from a server, by comparison of received audio played audio to a wakeword within an audio-playing device or a speech-enabled device, and other means.
Type: Application
Filed: May 4, 2022
Publication date: August 18, 2022
Applicant: SoundHound, Inc.
Inventors: Hsuan Yang, Qìndí Zhang, Warren S. Heit
-
Patent number: 11410642
Abstract: A system and method for creating an embedded phoneme map from a corpus of speech in accordance with a multiplicity of acoustic features of the speech. The embedded phoneme map is used to determine how to pronounce borrowed words from a lending language in the borrowing language, using the phonemes of the borrowing language that are closest to the phonemes of the lending language. The embedded phoneme map is also used to help linguists visualize the phonemes being pronounced by a speaker in real-time and to help non-native speakers practice pronunciation by displaying the differences between proper pronunciation and actual pronunciation for open-ended speech by the speaker.
Type: Grant
Filed: August 16, 2019
Date of Patent: August 9, 2022
Assignee: SOUNDHOUND, INC.
Inventors: Serena Caterina Scuderi, Gioia Zoli, Sarah Beth Hotung
-
Patent number: 11393463
Abstract: A system and method are disclosed for setting up a communication link between a device or application and a system with a controller. The controller can collect and send information to the application. A user interfaces with the controller to access the functionality of the application through providing commands to the controller. The system allows the user to interface with multiple applications.
Type: Grant
Filed: April 19, 2019
Date of Patent: July 19, 2022
Assignee: SoundHound, Inc.
Inventors: Timothy P. Stonehocker, Kathleen Worthington McMahon
-
Patent number: 11392833
Abstract: An audio processing system is described. The audio processing system uses a convolutional neural network architecture to process audio data, a recurrent neural network architecture to process at least data derived from an output of the convolutional neural network architecture, and a feed-forward neural network architecture to process at least data derived from an output of the recurrent neural network architecture. The feed-forward neural network architecture is configured to output classification scores for a plurality of sound units associated with speech. The classification scores indicate a presence of one or more sound units in the audio data. The convolutional neural network architecture has a plurality of convolutional groups arranged in series, where a convolutional group includes a combination of two data mappings arranged in parallel.
Type: Grant
Filed: February 13, 2020
Date of Patent: July 19, 2022
Assignee: SoundHound, Inc.
Inventors: Maisy Wieman, Andrew Carl Spencer, Zìlì Lǐ, Cristina Vasconcelos
-
Publication number: 20220223155
Abstract: A system and method are disclosed capable of parsing a spoken utterance into a natural language request and a speech audio segment, where the natural language request directs the system to use the speech audio segment as a new wakeword. In response to this wakeword assignment directive, the system and method are further capable of immediately building a new wakeword spotter to activate the device upon matching the new wakeword in the input audio. Different approaches to promptly building a new wakeword spotter are described. Variations of wakeword assignment directives can make the new wakeword public or private. They can also add the new wakeword to earlier wakewords, or replace earlier wakewords.
Type: Application
Filed: March 30, 2022
Publication date: July 14, 2022
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud
-
Publication number: 20220208192
Abstract: A processing system detects a period of non-voice activity and compares its duration to a cutoff period. The system adapts the cutoff period based on parsing previously-recognized speech to determine, according to a model, such as a machine-learned model, the probability that the speech recognized so far is a prefix to a longer complete utterance. The cutoff period is longer when a parse of previously recognized speech has a high probability of being a prefix of a longer utterance.
Type: Application
Filed: March 18, 2022
Publication date: June 30, 2022
Applicant: SoundHound, Inc.
Inventors: Patricia Pozon AGUAYO, Jennifer Hee Young ZHANG, Jonah PROBELL
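As a rough illustration of the adaptive cutoff in publication 20220208192, the sketch below lengthens the silence cutoff as the prefix probability rises. The linear interpolation, the function names, and the default bounds of 0.5 and 2.0 seconds are assumptions made for this example; the abstract does not say how the cutoff is derived from the probability.

```python
def adaptive_cutoff(prefix_probability, min_cutoff=0.5, max_cutoff=2.0):
    """Return a silence cutoff in seconds for a given prefix probability.

    prefix_probability: model estimate (0.0 to 1.0) that the speech recognized
    so far is only the beginning of a longer utterance.
    The linear mapping and the default bounds are illustrative assumptions.
    """
    if not 0.0 <= prefix_probability <= 1.0:
        raise ValueError("prefix_probability must be between 0 and 1")
    return min_cutoff + prefix_probability * (max_cutoff - min_cutoff)


def is_end_of_utterance(silence_duration, prefix_probability):
    """True when the observed silence is at least as long as the adapted cutoff."""
    return silence_duration >= adaptive_cutoff(prefix_probability)


# One second of silence ends a likely-complete utterance,
# but not one that is probably still a prefix.
ended_when_complete = is_end_of_utterance(1.0, prefix_probability=0.0)
ended_when_prefix = is_end_of_utterance(1.0, prefix_probability=1.0)
```

In a real system the prefix probability would come from the parsing model the abstract describes; here it is simply passed in as a number.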
-
Patent number: 11367448
Abstract: A method of providing a platform for configuring device-specific speech recognition is provided. The method includes providing a user interface for developers to select a set of at least two acoustic models appropriate for a specific type of a device, receiving, from a developer, a selection of the set of the at least two acoustic models, and configuring a speech recognition system to perform device-specific speech recognition by using one acoustic model selected from the at least two acoustic models of the set.
Type: Grant
Filed: April 21, 2021
Date of Patent: June 21, 2022
Assignee: SOUNDHOUND, INC.
Inventors: Keyvan Mohajer, Mehul Patel
-
Publication number: 20220188580
Abstract: A system and a method are disclosed that calculate the center of a geographic region. A set of topological/geographical points is received. A set of clusters is determined. A weight for each cluster is computed. The highest weighted cluster is selected. The geographic region center is calculated using the selected cluster. The geographical points can include a key for each point and be filtered by an indicated key before calculating the center of a geographic location.
Type: Application
Filed: December 13, 2021
Publication date: June 16, 2022
Applicant: SoundHound, Inc.
Inventor: Christophe PIERRET
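The steps listed in the abstract of publication 20220188580 (filter by key, cluster, weight, select, compute center) can be sketched as below. The abstract does not name a clustering method or a weighting scheme, so this example assumes fixed-size grid cells as clusters, point count as the weight, and the arithmetic mean as the center; the names `region_center`, `wanted_key`, and `cell_size` are hypothetical.

```python
import math
from collections import defaultdict


def region_center(points, wanted_key=None, cell_size=1.0):
    """Estimate a region center from (lat, lon, key) tuples.

    Assumptions for illustration: clusters are grid cells of `cell_size`
    degrees, a cluster's weight is its point count, and the center is the
    mean of the heaviest cluster. Longitude wraparound is not handled.
    """
    # Optionally keep only the points tagged with the indicated key.
    selected = [(lat, lon) for lat, lon, k in points
                if wanted_key is None or k == wanted_key]
    if not selected:
        raise ValueError("no points left after filtering")

    # Assign each point to a grid cell, which serves as its cluster.
    clusters = defaultdict(list)
    for lat, lon in selected:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        clusters[cell].append((lat, lon))

    # Select the highest-weighted cluster and average its points.
    heaviest = max(clusters.values(), key=len)
    n = len(heaviest)
    return (sum(p[0] for p in heaviest) / n, sum(p[1] for p in heaviest) / n)


# Hypothetical points for illustration: one outlier far from the main group.
points = [(10.1, 20.1, "a"), (10.3, 20.3, "a"), (50.0, 60.0, "a"), (10.2, 20.2, "b")]
center_all = region_center(points)
center_a = region_center(points, wanted_key="a")
```

Using the heaviest cluster rather than the mean of all points keeps a distant outlier, such as the third point above, from pulling the center away from where most points lie.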
-
Publication number: 20220189464
Abstract: A system and method invoke a virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.
Type: Application
Filed: March 3, 2022
Publication date: June 16, 2022
Applicant: SoundHound, Inc.
Inventors: Sudharsan KRISHNASWAMY, Maisy WIEMAN, Jonah PROBELL
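One way to picture the threshold conditions in publication 20220189464 is the sketch below, which invokes an action only when the intent probability and a variable value probability each cross their thresholds within a time window of one another. The frame format, the threshold values, the window length, and the name `should_invoke` are assumptions for illustration and do not come from the application.

```python
def should_invoke(frames, intent_threshold=0.8, value_threshold=0.7, window=2.0):
    """Decide whether to invoke an action from a stream of probability frames.

    frames: iterable of (timestamp_seconds, intent_prob, value_prob) tuples in
    time order (an assumed format). Returns True when both probabilities first
    exceed their thresholds no more than `window` seconds apart.
    """
    intent_time = None
    value_time = None
    for timestamp, intent_prob, value_prob in frames:
        # Record the first moment each probability exceeds its threshold.
        if intent_time is None and intent_prob > intent_threshold:
            intent_time = timestamp
        if value_time is None and value_prob > value_threshold:
            value_time = timestamp
        if intent_time is not None and value_time is not None:
            return abs(intent_time - value_time) <= window
    return False


# Hypothetical frames: the value probability crosses 0.5 s after the intent.
close_together = should_invoke([(0.0, 0.5, 0.1), (0.5, 0.9, 0.2), (1.0, 0.9, 0.8)])
too_far_apart = should_invoke([(0.0, 0.9, 0.1), (3.0, 0.9, 0.8)])
```

The other conditions the abstract lists, such as the domain probability, end-of-utterance detection, and elapsed time, could be added as further checks in the same loop.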
-
Publication number: 20220172729
Abstract: A system and method are disclosed for achieving interoperability and access to a personal extension knowledge/preference database (PEKD) through interconnected voice verification systems. Devices from various different companies and systems can link to a voice verification system (VVS). Users can also enroll with the VVS so that the VVS can provide authentication of users by personal wake phrases. Thereafter users can access their PEKD from un-owned devices by speaking their wake phrase.
Type: Application
Filed: December 1, 2020
Publication date: June 2, 2022
Applicant: SoundHound, Inc.
Inventors: Keyvan Mohajer, Warren S. Heit
-
Publication number: 20220165257
Abstract: Methods and systems for automatically generating sample phrases or sentences that a user can say to invoke a set of defined actions performed by a virtual assistant are disclosed. By enabling finetuned general-purpose natural language models, the system can generate potential and accurate utterance sentences based on extracted keywords or the input utterance sentence. Furthermore, domain-specific datasets can be used to train the pre-trained, general-purpose natural language models via unsupervised learning. These generated sentences can improve the efficiency of configuring a virtual assistant. The system can further optimize the effectiveness of a virtual assistant in understanding the user, which can enhance the user experience of communicating with it.
Type: Application
Filed: November 19, 2021
Publication date: May 26, 2022
Applicant: SoundHound, Inc.
Inventors: Pranav SINGH, Keyvan MOHAJER, Yilun ZHANG
-
Publication number: 20220165272
Abstract: A computer-implemented method is provided to support a food ordering system for food items from a menu of a restaurant using natural language. Expressions made for ordering are used to recommend a food item that a user has a high probability of wanting to include in an order. The recommendation engine is trained using machine learning. Expressions are collected and parsed to identify words that might indicate food items offered by the restaurant. The words are provided to a restaurant owner to identify food items on a menu, with which the words are associated.
Type: Application
Filed: February 8, 2022
Publication date: May 26, 2022
Applicant: SoundHound, Inc.
Inventors: Kamyar MOHAJER, Robert MACRAE
-
Publication number: 20220147510
Abstract: Systems and methods are provided for natural language processing using neural network models and natural language virtual assistants. The system and method include receiving a natural language phrase including a word sequence, computing corresponding error probabilities that the words are errors, and, for a word with a corresponding error probability above a threshold, computing a replacement phrase with a low error probability to provide a response from the virtual assistant depending on the replacement phrase.
Type: Application
Filed: January 21, 2022
Publication date: May 12, 2022
Applicant: SoundHound, Inc.
Inventors: Pranav Singh, Olivia Bettaglio