Patents Examined by Darioush Agahi
-
Patent number: 12380890
Abstract: Provided is a method, system, and non-transitory computer-readable record medium to record a conversation, between multiple users, in connection with a video communication service. An audio record management method includes creating an audio record by converting an audio of a video communication channel to a text; and providing the audio record to a user that participates in the video communication channel.
Type: Grant
Filed: October 22, 2021
Date of Patent: August 5, 2025
Assignees: NAVER CORPORATION, LINE WORKS CORP.
Inventors: SuMee Lee, Jieun Shin, Ye Lim Jeong, Gilhwan Hwang
-
Patent number: 12380281
Abstract: The present disclosure generally relates to updating a language model based on user feedback. Based on a user text input, a language model predicts a set of tokens and an action that will be taken by the user in response to the predicted set of tokens. If the predicted action does not match a detected actual user action, the language model is updated to reflect the user feedback by modifying an output token probability distribution based on the actual user action and updating the language model to converge with a target language model using the modified output token probability distribution.
Type: Grant
Filed: September 2, 2022
Date of Patent: August 5, 2025
Assignee: Apple Inc.
Inventor: Jerome R. Bellegarda
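The feedback loop described in this abstract can be illustrated with a minimal sketch. This is not the patented implementation; the function names, the boost factor, and the interpolation step are hypothetical stand-ins for the "modified output token probability distribution" and the convergence step the abstract mentions.

```python
# Hypothetical sketch: adjust an output-token distribution using observed user
# feedback, then interpolate the deployed LM's distribution toward a target LM.
import numpy as np

def modify_distribution(probs, suggested_token, accepted, boost=1.5):
    """Re-weight a token probability distribution based on the user's action.

    If the user accepted a suggestion, boost that token; if they rejected it
    (e.g., deleted or retyped it), down-weight it. Then renormalize.
    """
    adjusted = probs.copy()
    adjusted[suggested_token] *= boost if accepted else 1.0 / boost
    return adjusted / adjusted.sum()

def converge_toward_target(deployed_probs, target_probs, step=0.1):
    """One interpolation step moving the deployed distribution toward the
    target distribution derived from user feedback."""
    mixed = (1.0 - step) * deployed_probs + step * target_probs
    return mixed / mixed.sum()

# Toy vocabulary of four tokens; the model suggested token 2 but the user
# rejected it, so its mass is reduced and the model drifts toward the feedback.
vocab_probs = np.array([0.1, 0.2, 0.6, 0.1])
feedback_probs = modify_distribution(vocab_probs, suggested_token=2, accepted=False)
print(converge_toward_target(vocab_probs, feedback_probs))
```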
-
Patent number: 12379894
Abstract: Systems and processes for operating a digital assistant are provided. An example method includes, at an electronic device having one or more processors and memory, detecting invocation of a digital assistant; determining, using a reference resolution service, a set of possible entities; receiving a user utterance including an ambiguous reference; determining, based on the user utterance and the set of possible entities, a candidate interpretation including a preliminary set of entities corresponding to the ambiguous reference; determining, with the reference resolution service and based on the candidate interpretation including the preliminary set of entities corresponding to the ambiguous reference, an entity corresponding to the ambiguous reference; and performing, based on the candidate interpretation and the entity corresponding to the ambiguous reference, a task associated with the user utterance.
Type: Grant
Filed: February 22, 2022
Date of Patent: August 5, 2025
Assignee: Apple Inc.
Inventors: Antoine R. Raux, Yi Ma, Kevin D. Pitolin, Felicia Edwards, Chengchao Yang
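A small sketch of the resolution step may help make the abstract concrete. It is only an illustration under assumed data structures: the possible-entity set, the preliminary entity list, and the salience scores are hypothetical, not the claimed mechanism.

```python
# Hypothetical sketch: a reference-resolution service keeps a set of possible
# entities (e.g., items on screen, recent results), and the candidate
# interpretation supplies a preliminary entity set for an ambiguous reference
# ("it", "that one"). Resolution here is a simple overlap-and-score pass.
def resolve_reference(possible_entities, preliminary_entities, salience):
    """Return the preliminary entity that is both currently possible and most
    salient; salience is a hypothetical per-entity score."""
    candidates = [e for e in preliminary_entities if e in possible_entities]
    if not candidates:
        return None  # e.g., fall back to asking a clarifying question
    return max(candidates, key=lambda e: salience.get(e, 0.0))

possible = {"song:shake_it_off", "album:1989", "podcast:serial"}
preliminary = ["album:1989", "song:shake_it_off"]
salience = {"song:shake_it_off": 0.9, "album:1989": 0.4}
print(resolve_reference(possible, preliminary, salience))  # song:shake_it_off
```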
-
Patent number: 12380887
Abstract: A method of improving output content through iterative generation is provided. The method includes receiving a natural language input, obtaining user intention information based on the natural language input by using a natural language understanding (NLU) model, setting a target area in base content based on a first user input, determining input content based on the user intention information or a second user input, generating output content related to the base content based on the input content, the target area, and the user intention information by using a neural network (NN) model, generating a caption for the output content by using an image captioning model, calculating similarity between text of the natural language input and the generated output content, and iterating generation of the output content based on the similarity.
Type: Grant
Filed: April 24, 2023
Date of Patent: August 5, 2025
Assignee: Samsung Electronics Co., Ltd.
Inventors: Seohyun Back, Yonghyun Ryu, Wonho Ryu, Haejun Lee, Cheolseung Jung, Sai Chetan, Jiyeon Hong
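The generate-caption-compare loop can be sketched as follows. The generation and captioning models are passed in as opaque callables, and the token-overlap similarity is a deliberately simple stand-in for whatever similarity measure the method actually uses.

```python
# Hypothetical sketch of the outer loop: generate, caption, compare the caption
# to the original natural-language input, and regenerate until the similarity is
# high enough. generate_content() and caption_image() stand in for the NN model
# and the image-captioning model, which are not specified here.
def jaccard_similarity(a, b):
    """Token-overlap similarity used as a stand-in for the text-similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def iterative_generation(nl_input, target_area, base_content,
                         generate_content, caption_image,
                         threshold=0.6, max_iters=5):
    output = None
    for _ in range(max_iters):
        output = generate_content(base_content, target_area, nl_input)
        caption = caption_image(output)
        if jaccard_similarity(nl_input, caption) >= threshold:
            break  # the output already reflects the user's intention
    return output
```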
-
Patent number: 12367882
Abstract: Participants may use one or more devices for engaging in a meeting, such as phones, conferencing devices, and/or computers. The devices include microphones that capture speech for determining the presence of distinct participants. Speech signals originating from different participants, or microphones, may be determined and associated with the participants. For example, microphones may be directional and more sensitive to sound coming from one or more specific directions than sound coming from other directions. By associating an individual with a microphone, or set of microphones, overlapping voices may be disambiguated to provide clear voice streams that aid in producing a clear transcript indicating the speech of the participants, respectively. An identity of the participants may be determined using voiceprint and/or voice recognition techniques.
Type: Grant
Filed: November 21, 2023
Date of Patent: July 22, 2025
Assignee: Amazon Technologies, Inc.
Inventor: Jonathan Alan Leblang
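One simple way to associate speech with a particular directional microphone, and hence a participant, is a per-frame energy comparison. The sketch below is an assumption-laden illustration of that idea, not the patented approach; frame length and the energy criterion are arbitrary.

```python
# Hypothetical sketch: attribute each short audio frame to the directional
# microphone (and hence participant) with the highest energy, so overlapping
# voices can be split into per-participant streams.
import numpy as np

def attribute_frames(mic_signals, frame_len=400):
    """mic_signals: dict mapping mic_id -> 1-D numpy array (same sample rate).
    Returns a list of (frame_index, mic_id) assignments."""
    mic_ids = list(mic_signals)
    n = min(len(s) for s in mic_signals.values())
    assignments = []
    for start in range(0, n - frame_len + 1, frame_len):
        energies = {m: float(np.sum(mic_signals[m][start:start + frame_len] ** 2))
                    for m in mic_ids}
        assignments.append((start // frame_len, max(energies, key=energies.get)))
    return assignments
```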
-
Patent number: 12315514
Abstract: Systems and methods are described for assigning a voice assistant service (VAS) from multiple VASs, based on a voice input. The system generally comprises a processor that is configured to process at least one voice input, e.g., with a wake word, and assign at least one VAS to output a response based on the voice input. Some embodiments support the skill or skills of a secondary VAS when the primary VAS is unavailable or when the primary VAS does not possess the required skill. The system may evaluate the skills required to process a user request based on the context and/or intent. The system may distribute data related to voice input and context and/or intent among various VASs to complete a task. Furthermore, the system may classify voice input as generic voice input or target VAS-specific voice input, e.g., by utilizing a trained model.
Type: Grant
Filed: July 15, 2022
Date of Patent: May 27, 2025
Assignee: Adeia Guides Inc.
Inventors: Gyanveer Singh, Reda Harb
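The routing decision the abstract describes, including the fallback to a secondary VAS, could be sketched as below. The registry structure, wake-word map, and skill names are hypothetical placeholders, not the claimed system.

```python
# Hypothetical sketch of the routing decision: classify the voice input as
# generic or VAS-specific, then pick a primary VAS that is available and has
# the required skill, falling back to another VAS otherwise.
def assign_vas(voice_input, required_skill, vas_registry, wake_words):
    """vas_registry: dict vas_name -> {"skills": set, "available": bool};
    wake_words: dict wake_word -> vas_name."""
    # Target-specific input: a known wake word names the VAS explicitly.
    for wake, vas in wake_words.items():
        if voice_input.lower().startswith(wake):
            target = vas_registry.get(vas)
            if target and target["available"] and required_skill in target["skills"]:
                return vas
    # Generic input (or unavailable primary): any VAS with the skill may respond.
    for vas, info in vas_registry.items():
        if info["available"] and required_skill in info["skills"]:
            return vas
    return None

registry = {"vas_a": {"skills": {"music"}, "available": False},
            "vas_b": {"skills": {"music", "weather"}, "available": True}}
print(assign_vas("hey a, play jazz", "music", registry, {"hey a": "vas_a"}))  # vas_b
```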
-
Patent number: 12315508
Abstract: Implementations can reduce the time required to obtain responses from an automated assistant by, for example, obviating the need to provide an explicit invocation to the automated assistant, such as by saying a hot-word/phrase or performing a specific user input, prior to speaking a command or query. In addition, the automated assistant can optionally receive, understand, and/or respond to the command or query without communicating with a server, thereby further reducing the time in which a response can be provided. Implementations only selectively initiate on-device speech recognition responsive to determining one or more condition(s) are satisfied. Further, in some implementations, on-device NLU, on-device fulfillment, and/or resulting execution occur only responsive to determining, based on recognized text from the on-device speech recognition, that such further processing should occur.
Type: Grant
Filed: October 21, 2022
Date of Patent: May 27, 2025
Assignee: GOOGLE LLC
Inventors: Michael Golikov, Zaheed Sabur, Denis Burakov, Behshad Behzadi, Sergey Nazarov, Daniel Cotting, Mario Bertschler, Lucas Mirelmann, Steve Cheng, Bohdan Vlasyuk, Jonathan Lee, Lucia Terrenghi, Adrian Zumbrunnen
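The two-stage gating described here, conditions first, then a check on the recognized text, can be summarized in a short sketch. The callables and condition set are assumptions for illustration only.

```python
# Hypothetical sketch of the gating logic: on-device speech recognition starts
# only when at least one triggering condition holds, and on-device NLU /
# fulfillment run only if the recognized text looks like an assistant request.
def maybe_handle_locally(conditions, recognize, understand, fulfill):
    """conditions: iterable of zero-argument callables returning bool;
    recognize/understand/fulfill stand in for on-device ASR, NLU, execution."""
    if not any(cond() for cond in conditions):
        return None                      # keep ASR off; nothing to do
    text = recognize()                   # on-device speech recognition
    intent = understand(text)            # on-device NLU
    if intent is None:
        return None                      # text was not directed at the assistant
    return fulfill(intent)               # on-device fulfillment, no server round trip
```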
-
Patent number: 12300221
Abstract: A method for executing a session of a video game is provided, including the following operations: recording speech of a player engaged in gameplay of the session of the video game; analyzing a game state generated by the execution of the session of the video game, wherein analyzing the game state identifies a context of the gameplay; analyzing the recorded speech using the identified context of the gameplay and a speech recognition model, to identify textual content of the recorded speech; applying the identified textual content as a gameplay input for the session of the video game.
Type: Grant
Filed: May 28, 2022
Date of Patent: May 13, 2025
Assignee: SONY INTERACTIVE ENTERTAINMENT INC.
Inventor: Mahdi Azmandian
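One common way to use gameplay context with a speech recognition model is to bias or rescore hypotheses toward in-game vocabulary. The sketch below shows that general idea only; the bonus weighting and the vocabulary source are assumptions, not the claimed method.

```python
# Hypothetical sketch: rescore ASR hypotheses using a vocabulary derived from the
# current game state, so in-game terms ("reload", item names) win over
# acoustically similar but out-of-context words.
def rescore_with_context(hypotheses, game_state_terms, bonus=0.3):
    """hypotheses: list of (text, acoustic_score); returns the best text after
    adding a bonus per word that appears in the game-state vocabulary."""
    def contextual_score(item):
        text, score = item
        hits = sum(1 for w in text.lower().split() if w in game_state_terms)
        return score + bonus * hits
    return max(hypotheses, key=contextual_score)[0]

state_terms = {"reload", "grenade", "medkit"}
hyps = [("real old weapon", 0.52), ("reload weapon", 0.50)]
print(rescore_with_context(hyps, state_terms))  # reload weapon
```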
-
Patent number: 12293300
Abstract: The disclosure provides a method for training a semantic retrieval network, an electronic device and a storage medium. The method includes: obtaining a training sample including a search term and n candidate documents corresponding to the search term, where n is an integer greater than 1; inputting the training sample into the ranking model, to obtain n first correlation degrees output by the ranking model, in which each first correlation degree represents a correlation between a candidate document and the search term; inputting the training sample into the semantic retrieval model, to obtain n second correlation degrees output by the semantic retrieval model, in which each second correlation degree represents a correlation between a candidate document and the search term; and training the semantic retrieval model and the ranking model jointly based on the n first correlation degrees and the n second correlation degrees.
Type: Grant
Filed: September 7, 2022
Date of Patent: May 6, 2025
Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Yingqi Qu, Yuchen Ding, Jing Liu, Hua Wu, Haifeng Wang
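A joint-training signal over the two sets of correlation degrees is often expressed as a divergence between the two score distributions over the same n candidates. The sketch below shows one such listwise comparison; the softmax normalization and KL choice are assumptions for illustration, not necessarily the objective used in the patent.

```python
# Hypothetical sketch of a joint objective: compare the ranking model's and the
# retrieval model's softmax-normalized scores over the same n candidates.
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()

def listwise_kl(first_degrees, second_degrees):
    """KL divergence between the ranking model's and the semantic retrieval
    model's normalized correlation degrees for one query's n candidates."""
    p = softmax(np.asarray(first_degrees, dtype=float))   # ranking model
    q = softmax(np.asarray(second_degrees, dtype=float))  # retrieval model
    return float(np.sum(p * np.log(p / q)))

# One training sample: a search term with n = 4 candidate documents.
ranking_scores = [2.1, 0.3, -0.5, 1.0]
retrieval_scores = [1.5, 0.9, -0.2, 0.7]
print(listwise_kl(ranking_scores, retrieval_scores))  # joint-training signal
```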
-
Patent number: 12271694
Abstract: Quality of a narrative is characterized by receiving data that includes a narrative text. This narrative text is then tokenized and events are extracted from the tokenized words. The extraction can use, in parallel, two or more different extraction techniques. The extracted events are then aggregated so that a waveform can be generated, based on the aggregated extracted events, that characterizes a plurality of emotional arcs within the narrative text. Subsequently, a plurality of waveform elements are extracted from the waveform. The narrative quality (or other quality) of the narrative text is then scored based on the extracted plurality of waveform elements and using a machine learning model trained to correlate emotional arc waveforms with narrative quality scores. Related apparatus, systems, techniques and articles are also described.
Type: Grant
Filed: April 23, 2021
Date of Patent: April 8, 2025
Assignee: Educational Testing Service
Inventors: Swapna Somasundaran, Xianyang Chen, Michael Flor
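The tokenize, aggregate, waveform, features pipeline can be illustrated with a toy valence lexicon. Everything below, including the lexicon, the cumulative-sum arc, and the chosen waveform elements, is an illustrative placeholder rather than the patented feature set.

```python
# Hypothetical sketch of the pipeline: score sentences with a toy valence lexicon,
# aggregate the scores into a waveform over the narrative, then extract simple
# waveform elements that a trained scoring model could consume.
import numpy as np

VALENCE = {"joy": 1.0, "love": 1.0, "win": 0.8, "loss": -0.8, "fear": -1.0, "death": -1.0}

def emotional_waveform(sentences):
    scores = []
    for s in sentences:
        tokens = s.lower().split()
        scores.append(sum(VALENCE.get(t, 0.0) for t in tokens))
    return np.cumsum(scores)  # running emotional arc over the narrative

def waveform_elements(wave):
    peaks = sum(1 for i in range(1, len(wave) - 1) if wave[i - 1] < wave[i] > wave[i + 1])
    return {"mean": float(wave.mean()),
            "range": float(wave.max() - wave.min()),
            "peaks": peaks}

story = ["the win filled her with joy", "then came fear and loss", "love returned at last"]
print(waveform_elements(emotional_waveform(story)))
```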
-
Patent number: 12272356
Abstract: Techniques are described herein for enabling the use of "dynamic" or "context-specific" hot words for an automated assistant. In various implementations, an automated assistant may be operated at least in part on a computing device. Audio data captured by a microphone may be monitored for default hot word(s). Detection of one or more of the default hot words may trigger transition of the automated assistant from a limited hot word listening state into a speech recognition state. Transition of the computing device into a given state may be detected, and in response, the audio data captured by the microphone may be monitored for context-specific hot word(s), in addition to or instead of the default hot word(s). Detection of the context-specific hot word(s) may trigger the automated assistant to perform a responsive action associated with the given state, without requiring detection of default hot word(s).
Type: Grant
Filed: June 16, 2022
Date of Patent: April 8, 2025
Assignee: GOOGLE LLC
Inventors: Diego Melendo Casado, Jaclyn Konzelmann
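The state-dependent listening behavior reads naturally as a small dispatch table. The hot words, states, and action names below are invented for illustration; only the control flow mirrors what the abstract describes.

```python
# Hypothetical sketch of the listening logic: in the limited listening state only
# default hot words matter; once the device enters a given state (say, a timer is
# ringing), state-specific hot words such as "stop" trigger the mapped action
# directly, without a default hot word.
DEFAULT_HOT_WORDS = {"hey assistant"}
CONTEXT_HOT_WORDS = {"timer_ringing": {"stop": "cancel_timer", "snooze": "snooze_timer"}}

def handle_audio(transcribed_audio, device_state):
    text = transcribed_audio.lower().strip()
    for phrase, action in CONTEXT_HOT_WORDS.get(device_state, {}).items():
        if phrase in text:
            return ("perform_action", action)        # no default hot word needed
    if any(hw in text for hw in DEFAULT_HOT_WORDS):
        return ("enter_speech_recognition_state", None)
    return ("stay_in_limited_listening", None)

print(handle_audio("stop", "timer_ringing"))         # ('perform_action', 'cancel_timer')
print(handle_audio("stop", "idle"))                  # ('stay_in_limited_listening', None)
```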
-
Patent number: 12266352
Abstract: A natural language processing device according to an embodiment of the present disclosure may comprise: a memory for storing a first channel named entity dictionary including basic channel names and a synonym of each of the basic channel names; a communication interface for receiving, from a display device, voice data corresponding to a voice instruction uttered by a user; and a processor which: acquires multiple channel names included in electronic program guide information; extracts channel names matching the acquired multiple channel names from the first channel named entity dictionary so as to acquire a second channel named entity dictionary; acquires the intention of a speech of the voice instruction on the basis of text data of the voice data and the second channel named entity dictionary; and transmits the acquired intention of the speech to the display device through the communication interface.
Type: Grant
Filed: March 6, 2020
Date of Patent: April 1, 2025
Assignee: LG ELECTRONICS INC.
Inventors: Taejoon Kim, Jaekyung Lee
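The dictionary-narrowing step lends itself to a short sketch: filter the first dictionary against the channels actually present in the electronic program guide, then match the utterance against the smaller dictionary. The data shapes and channel names below are assumptions for illustration.

```python
# Hypothetical sketch: build the second channel named-entity dictionary by keeping
# only first-dictionary entries (basic names plus synonyms) whose basic channel
# name appears in the electronic program guide, then match the utterance against it.
def build_second_dictionary(first_dictionary, epg_channels):
    """first_dictionary: dict basic_name -> list of synonyms."""
    epg = {c.lower() for c in epg_channels}
    return {name: syns for name, syns in first_dictionary.items() if name.lower() in epg}

def resolve_channel(utterance_text, second_dictionary):
    text = utterance_text.lower()
    for basic_name, synonyms in second_dictionary.items():
        if basic_name.lower() in text or any(s.lower() in text for s in synonyms):
            return basic_name
    return None

first = {"Global Sports Network": ["GSN", "the sports channel"], "News 24": ["N24"]}
second = build_second_dictionary(first, epg_channels=["Global Sports Network"])
print(resolve_channel("switch to the sports channel", second))  # Global Sports Network
```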
-
Patent number: 12236946
Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods access an LM that includes a plurality of n-grams, each of the plurality of n-grams comprising a respective sequence of words and corresponding LM score, and receive a list of words associated with a group classification, each word in the list of words being associated with a respective weight. The systems and methods compute, based on the LM scores of the plurality of n-grams, a probability that a given word in the list of words associated with the group classification appears in an n-gram in the LM comprising an individual sequence of words, and add one or more new n-grams to the LM comprising one or more words in the list of words in combination with the individual sequence of words and associated with a particular LM score based on the computed probability.
Type: Grant
Filed: August 22, 2022
Date of Patent: February 25, 2025
Inventors: Jacob Assa, Alan Bekker, Zach Moshe
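One way to picture the described LM augmentation is to estimate how much probability mass the LM already gives class members after a word sequence, then assign that mass, split by the per-word weights, to new n-grams. The storage format, flooring, and scoring rule below are assumptions, not the claimed computation.

```python
# Hypothetical sketch: given an n-gram LM stored as log10 scores, estimate how
# likely words from a weighted class list are to follow an existing word sequence,
# and add new n-grams for the class members with scores derived from that estimate.
import math

def add_class_ngrams(lm_scores, context, class_words):
    """lm_scores: dict mapping n-gram tuples to log10 probabilities.
    context: tuple of words, e.g. ("call",).
    class_words: dict word -> weight (weights sum to 1 within the class)."""
    # Probability mass the LM already assigns to class members after this context.
    class_mass = sum(10 ** lm_scores[context + (w,)]
                     for w in class_words if context + (w,) in lm_scores)
    class_mass = max(class_mass, 1e-6)  # floor so unseen classes still get mass
    for word, weight in class_words.items():
        ngram = context + (word,)
        if ngram not in lm_scores:
            lm_scores[ngram] = math.log10(class_mass * weight)
    return lm_scores

lm = {("call",): -1.0, ("call", "alice"): -2.0}
names = {"alice": 0.5, "bob": 0.3, "carol": 0.2}
print(add_class_ngrams(lm, ("call",), names))
```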
-
Patent number: 12230243
Abstract: This disclosure describes a system that analyzes a corpus of text (e.g., a financial article, an audio book, etc.) so that the context surrounding the text is fully understood. For instance, the context may be an environment described by the text, or an environment in which the text occurs. Based on the analysis, the system can determine sentiment, part of speech, entities, and/or human characters at the token level of the text, and automatically generate Speech Synthesis Markup Language (SSML) tags based on this information. The SSML tags can be used by applications, services, and/or features that implement text-to-speech (TTS) conversion to improve the audio experience for end-users. Consequently, via the techniques described herein, more realistic and human-like speech synthesis can be efficiently implemented at larger scale (e.g., for audio books, for all the articles published to a news site, etc.).
Type: Grant
Filed: December 30, 2021
Date of Patent: February 18, 2025
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Mikayel Mirzoyan, André Aing, Aysar Khalid, Chad Joseph Lynch, Graham Michael Reeve, Sadek Baroudi, Vidush Vishwanath
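Token-level analysis feeding SSML generation can be shown in miniature. The toy sentiment lookup and the particular prosody attributes chosen below are illustrative assumptions; a real system would draw on much richer sentiment, entity, and character signals.

```python
# Hypothetical sketch: turn token-level analysis (here a toy sentiment lookup)
# into SSML, wrapping negative tokens in a softer, slower prosody element.
from xml.sax.saxutils import escape

TOKEN_SENTIMENT = {"plunged": -0.8, "soared": 0.7}

def to_ssml(sentence):
    parts = []
    for token in sentence.split():
        if TOKEN_SENTIMENT.get(token.lower(), 0.0) < 0:
            parts.append('<prosody rate="slow" volume="soft">%s</prosody>' % escape(token))
        else:
            parts.append(escape(token))
    return "<speak>%s</speak>" % " ".join(parts)

print(to_ssml("Shares plunged after the report while bonds soared"))
```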
-
Patent number: 12223977
Abstract: Disclosed is an audio signal processing method implemented by an audio system including at least an internal sensor, an external sensor and a speaker unit. The audio signal processing method includes: producing internal and external audio signals by measuring simultaneously acoustic signals reaching the internal sensor and acoustic signals reaching the external sensor, respectively; converting the internal audio signal to frequency domain, thereby producing an internal audio spectrum; converting the external audio signal to frequency domain, thereby producing an external audio spectrum; estimating, based on the internal audio spectrum, an echo audio spectrum of an echo audio signal caused by the speaker unit in the external audio signal, wherein the echo audio spectrum is estimated for frequencies above a minimum frequency, wherein the minimum frequency ≥ 600 hertz; and correcting the external audio spectrum for frequencies above the minimum frequency, based on the echo audio spectrum.
Type: Grant
Filed: August 8, 2022
Date of Patent: February 11, 2025
Assignee: Analog Devices International Unlimited Company
Inventors: Charles Fox, Jean-Marc Luneau
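The frequency-domain correction step can be sketched with a simple per-bin model. The fixed transfer gain, the window, and the subtraction rule below are assumptions made for illustration; the abstract does not specify how the echo spectrum is estimated.

```python
# Hypothetical sketch of the frequency-domain step: estimate the echo spectrum from
# the internal-sensor spectrum via a per-bin transfer estimate, then subtract it
# from the external spectrum only for bins above the minimum frequency (600 Hz here).
import numpy as np

def correct_external_spectrum(internal, external, sample_rate, transfer, min_freq=600.0):
    """internal, external: time-domain frames (same length); transfer: per-bin
    complex gain mapping the internal spectrum to the echo seen at the external mic."""
    int_spec = np.fft.rfft(internal * np.hanning(len(internal)))
    ext_spec = np.fft.rfft(external * np.hanning(len(external)))
    freqs = np.fft.rfftfreq(len(internal), d=1.0 / sample_rate)
    echo_spec = transfer * int_spec                  # estimated echo spectrum
    mask = freqs >= min_freq                         # only correct above the minimum frequency
    ext_spec[mask] -= echo_spec[mask]
    return ext_spec

fs, n = 16000, 512
t = np.arange(n) / fs
internal = np.sin(2 * np.pi * 1000 * t)              # 1 kHz tone leaking as echo
external = 0.3 * internal + np.random.randn(n) * 0.01
corrected = correct_external_spectrum(internal, external, fs,
                                      transfer=np.full(n // 2 + 1, 0.3))
```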
-
Patent number: 12205573
Abstract: A user assistance device aids validation of transactions on payment terminals. A payment terminal that accepts a payment card at a point of sale can be detected. A transaction amount for validation can be determined from the payment terminal. Subsequently, audio of the transaction amount can be generated. Playing of the audio on a user assistance device can then be triggered, allowing a user to hear the transaction amount before accepting or rejecting the transaction. Further, transaction information can be analyzed and used to suggest whether a customer should accept or reject the transaction.
Type: Grant
Filed: June 23, 2022
Date of Patent: January 21, 2025
Assignee: CAPITAL ONE SERVICES, LLC
Inventors: Joshua Edwards, Abdelkader M'Hamed Benkreira, Noel Lyles, Michael Mossoba
-
Patent number: 12205597
Abstract: A voice processing system includes: a first acquisition processor that acquires voice data collected by a microphone installed in a microphone-speaker device; a second acquisition processor that acquires authentication information of a wearer who wears the microphone-speaker device, the authentication information being acquired by an authentication information acquirer installed in the microphone-speaker device; and a control processor that executes predetermined processing related to the voice data, which is acquired by the first acquisition processor, on the basis of the authentication information acquired by the second acquisition processor.
Type: Grant
Filed: May 9, 2022
Date of Patent: January 21, 2025
Assignee: SHARP KABUSHIKI KAISHA
Inventors: Tatsuya Nishio, Fumiaki Sugimori
-
Patent number: 12190018
Abstract: A system and method for generating, triggering and playing a sequence of audio files with cues for delivering a presentation for a presenter using a personal audio device coupled to a computing device. The system comprises a computing device that is coupled to a presentation data analysis server through a network. The method includes (i) generating a sequence of audio files with cues for delivering a presentation, (ii) triggering playing an audio file from the sequence of audio files, and (iii) playing the sequence of audio files one by one, on the computing device, using the personal audio device coupled to the computing device to enable the presenter to recall and speak the content based on the sequence of the audio files.
Type: Grant
Filed: January 4, 2021
Date of Patent: January 7, 2025
Inventor: Arjun Karthik Bala
-
Patent number: 12175998
Abstract: Speech analysis devices and methods for identifying migraine attacks are provided. Migraine sufferers can experience changes in speech patterns both during a migraine attack and in a pre-attack phase (e.g., a time period before the migraine attack can be recognized by the migraine sufferer). Embodiments identify or predict migraine attacks during the pre-attack phase and/or the attack phase (such as early stages of a migraine attack) by comparing speech features from one or more speech samples provided by a user against baseline data. The speech features are indicative and/or predictive of migraine onset, and can be personalized to a user and/or based on normative data.
Type: Grant
Filed: November 8, 2019
Date of Patent: December 24, 2024
Assignees: Arizona Board of Regents on behalf of Arizona State University, Mayo Foundation for Medical Education and Research
Inventors: Visar Berisha, Jacob Peplinski, Todd Schwedt
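Comparing speech features against personalized baseline data is commonly done with per-feature deviation scores. The sketch below uses z-scores and an arbitrary hit threshold purely as an illustration; the actual features and decision rule are not specified in the abstract.

```python
# Hypothetical sketch: compare a few speech features from a new sample against the
# user's personal baseline using z-scores, and flag a possible pre-attack phase
# when enough features drift. Features and thresholds are illustrative placeholders.
def flag_pre_attack(sample_features, baseline_mean, baseline_std,
                    z_threshold=2.0, min_hits=2):
    """All arguments are dicts keyed by feature name (e.g., speaking_rate,
    pause_ratio, articulation_precision)."""
    hits = 0
    for name, value in sample_features.items():
        z = abs(value - baseline_mean[name]) / max(baseline_std[name], 1e-9)
        if z >= z_threshold:
            hits += 1
    return hits >= min_hits

baseline_mean = {"speaking_rate": 4.2, "pause_ratio": 0.18}
baseline_std = {"speaking_rate": 0.3, "pause_ratio": 0.04}
sample = {"speaking_rate": 3.4, "pause_ratio": 0.30}
print(flag_pre_attack(sample, baseline_mean, baseline_std))  # True
```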
-
Patent number: 12165641
Abstract: A method includes receiving follow-on audio data captured by an assistant-enabled device, the follow-on audio data corresponding to a follow-on query spoken by a user of the assistant-enabled device to a digital assistant subsequent to the user submitting a previous query to the digital assistant. The method also includes processing, using a speech recognizer, the follow-on audio data to generate multiple candidate hypotheses, each candidate hypothesis corresponding to a candidate transcription for the follow-on query and represented by a respective sequence of hypothesized terms. For each corresponding candidate hypothesis among the multiple candidate hypotheses, the method also includes determining a corresponding similarity metric between the previous query and the corresponding candidate hypothesis and determining a transcription of the follow-on query spoken by the user based on the similarity metrics determined for the multiple candidate hypotheses.
Type: Grant
Filed: July 11, 2022
Date of Patent: December 10, 2024
Assignee: Google LLC
Inventors: Patrick Siegler, Aurélien Boffy, Ágoston Weisz
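The similarity-based selection among candidate hypotheses can be sketched with a simple term-overlap metric. The metric and the linear combination with the recognizer score are assumptions for illustration; only the overall idea, favoring transcriptions consistent with the previous query, follows the abstract.

```python
# Hypothetical sketch: compute a similarity metric between the previous query and
# each candidate hypothesis for the follow-on query, then combine it with the
# recognizer's score so transcriptions consistent with the conversation win.
def term_overlap(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def pick_transcription(previous_query, hypotheses, weight=0.5):
    """hypotheses: list of (candidate_text, asr_score)."""
    def combined(item):
        text, score = item
        return score + weight * term_overlap(previous_query, text)
    return max(hypotheses, key=combined)[0]

previous = "how tall is the eiffel tower"
hyps = [("when was the eiffel tower built", 0.48),
        ("when was the awful tower built", 0.50)]
print(pick_transcription(previous, hyps))  # when was the eiffel tower built
```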