Patents Examined by Darioush Agahi
-
Patent number: 12380890
Abstract: Provided is a method, system, and non-transitory computer-readable record medium to record a conversation, between multiple users, in connection with a video communication service. An audio record management method includes creating an audio record by converting an audio of a video communication channel to a text; and providing the audio record to a user that participates in the video communication channel.
Type: Grant
Filed: October 22, 2021
Date of Patent: August 5, 2025
Assignees: NAVER CORPORATION, LINE WORKS CORP.
Inventors: SuMee Lee, Jieun Shin, Ye Lim Jeong, Gilhwan Hwang
-
Patent number: 12380281
Abstract: The present disclosure generally relates to updating a language model based on user feedback. Based on a user text input, a language model predicts a set of tokens and an action that will be taken by the user in response to the predicted set of tokens. If the predicted action does not match a detected actual user action, the language model is updated to reflect the user feedback by modifying an output token probability distribution based on the actual user action and updating the language model to converge with a target language model using the modified output token probability distribution.
Type: Grant
Filed: September 2, 2022
Date of Patent: August 5, 2025
Assignee: Apple Inc.
Inventor: Jerome R. Bellegarda
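The feedback loop described in this abstract can be illustrated with a minimal sketch. This is not the patented implementation; the function names, the boost factor, and the interpolation step are hypothetical stand-ins for the "modified output token probability distribution" and the convergence step the abstract mentions.

```python
# Hypothetical sketch: adjust an output-token distribution using observed user
# feedback, then interpolate the deployed LM's distribution toward a target LM.
import numpy as np

def modify_distribution(probs, suggested_token, accepted, boost=1.5):
    """Re-weight a token probability distribution based on the user's action.

    If the user accepted a suggestion, boost that token; if they rejected it
    (e.g., deleted or retyped it), down-weight it. Then renormalize.
    """
    adjusted = probs.copy()
    adjusted[suggested_token] *= boost if accepted else 1.0 / boost
    return adjusted / adjusted.sum()

def converge_toward_target(deployed_probs, target_probs, step=0.1):
    """One interpolation step moving the deployed distribution toward the
    target distribution derived from user feedback."""
    mixed = (1.0 - step) * deployed_probs + step * target_probs
    return mixed / mixed.sum()

# Toy vocabulary of four tokens; the model suggested token 2 but the user
# rejected it, so its mass is reduced and the model drifts toward the feedback.
vocab_probs = np.array([0.1, 0.2, 0.6, 0.1])
feedback_probs = modify_distribution(vocab_probs, suggested_token=2, accepted=False)
print(converge_toward_target(vocab_probs, feedback_probs))
```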
-
Patent number: 12379894
Abstract: Systems and processes for operating a digital assistant are provided. An example method includes, at an electronic device having one or more processors and memory, detecting invocation of a digital assistant; determining, using a reference resolution service, a set of possible entities; receiving a user utterance including an ambiguous reference; determining, based on the user utterance and the set of possible entities, a candidate interpretation including a preliminary set of entities corresponding to the ambiguous reference; determining, with the reference resolution service and based on the candidate interpretation including the preliminary set of entities corresponding to the ambiguous reference, an entity corresponding to the ambiguous reference; and performing, based on the candidate interpretation and the entity corresponding to the ambiguous reference, a task associated with the user utterance.
Type: Grant
Filed: February 22, 2022
Date of Patent: August 5, 2025
Assignee: Apple Inc.
Inventors: Antoine R. Raux, Yi Ma, Kevin D. Pitolin, Felicia Edwards, Chengchao Yang
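A small sketch of the resolution step may help make the abstract concrete. It is only an illustration under assumed data structures: the possible-entity set, the preliminary entity list, and the salience scores are hypothetical, not the claimed mechanism.

```python
# Hypothetical sketch: a reference-resolution service keeps a set of possible
# entities (e.g., items on screen, recent results), and the candidate
# interpretation supplies a preliminary entity set for an ambiguous reference
# ("it", "that one"). Resolution here is a simple overlap-and-score pass.
def resolve_reference(possible_entities, preliminary_entities, salience):
    """Return the preliminary entity that is both currently possible and most
    salient; salience is a hypothetical per-entity score."""
    candidates = [e for e in preliminary_entities if e in possible_entities]
    if not candidates:
        return None  # e.g., fall back to asking a clarifying question
    return max(candidates, key=lambda e: salience.get(e, 0.0))

possible = {"song:shake_it_off", "album:1989", "podcast:serial"}
preliminary = ["album:1989", "song:shake_it_off"]
salience = {"song:shake_it_off": 0.9, "album:1989": 0.4}
print(resolve_reference(possible, preliminary, salience))  # song:shake_it_off
```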
-
Patent number: 12380887
Abstract: A method of improving output content through iterative generation is provided. The method includes receiving a natural language input, obtaining user intention information based on the natural language input by using a natural language understanding (NLU) model, setting a target area in base content based on a first user input, determining input content based on the user intention information or a second user input, generating output content related to the base content based on the input content, the target area, and the user intention information by using a neural network (NN) model, generating a caption for the output content by using an image captioning model, calculating similarity between text of the natural language input and the generated output content, and iterating generation of the output content based on the similarity.
Type: Grant
Filed: April 24, 2023
Date of Patent: August 5, 2025
Assignee: Samsung Electronics Co., Ltd.
Inventors: Seohyun Back, Yonghyun Ryu, Wonho Ryu, Haejun Lee, Cheolseung Jung, Sai Chetan, Jiyeon Hong
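The generate-caption-compare loop can be sketched as follows. The generation and captioning models are passed in as opaque callables, and the token-overlap similarity is a deliberately simple stand-in for whatever similarity measure the method actually uses.

```python
# Hypothetical sketch of the outer loop: generate, caption, compare the caption
# to the original natural-language input, and regenerate until the similarity is
# high enough. generate_content() and caption_image() stand in for the NN model
# and the image-captioning model, which are not specified here.
def jaccard_similarity(a, b):
    """Token-overlap similarity used as a stand-in for the text-similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def iterative_generation(nl_input, target_area, base_content,
                         generate_content, caption_image,
                         threshold=0.6, max_iters=5):
    output = None
    for _ in range(max_iters):
        output = generate_content(base_content, target_area, nl_input)
        caption = caption_image(output)
        if jaccard_similarity(nl_input, caption) >= threshold:
            break  # the output already reflects the user's intention
    return output
```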
-
Patent number: 12367882
Abstract: Participants may use one or more devices for engaging in a meeting, such as phones, conferencing devices, and/or computers. The devices include microphones that capture speech for determining the presence of distinct participants. Speech signals originating from different participants, or microphones, may be determined and associated with the participants. For example, microphones may be directional and more sensitive to sound coming from one or more specific directions than sound coming from other directions. By associating an individual with a microphone, or set of microphones, overlapping voices may be disambiguated to provide clear voice streams that aid in producing a clear transcript indicating the speech of the participants, respectively. An identity of the participants may be determined using voiceprint and/or voice recognition techniques.
Type: Grant
Filed: November 21, 2023
Date of Patent: July 22, 2025
Assignee: Amazon Technologies, Inc.
Inventor: Jonathan Alan Leblang
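One simple way to associate speech with a particular directional microphone, and hence a participant, is a per-frame energy comparison. The sketch below is an assumption-laden illustration of that idea, not the patented approach; frame length and the energy criterion are arbitrary.

```python
# Hypothetical sketch: attribute each short audio frame to the directional
# microphone (and hence participant) with the highest energy, so overlapping
# voices can be split into per-participant streams.
import numpy as np

def attribute_frames(mic_signals, frame_len=400):
    """mic_signals: dict mapping mic_id -> 1-D numpy array (same sample rate).
    Returns a list of (frame_index, mic_id) assignments."""
    mic_ids = list(mic_signals)
    n = min(len(s) for s in mic_signals.values())
    assignments = []
    for start in range(0, n - frame_len + 1, frame_len):
        energies = {m: float(np.sum(mic_signals[m][start:start + frame_len] ** 2))
                    for m in mic_ids}
        assignments.append((start // frame_len, max(energies, key=energies.get)))
    return assignments
```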
-
Patent number: 12315514
Abstract: Systems and methods are described for assigning a voice assistant service (VAS) from multiple VASs, based on a voice input. The system generally comprises a processor that is configured to process at least one voice input, e.g., with a wake word, and assign at least one VAS to output a response based on the voice input. Some embodiments support the skill or skills of a secondary VAS when the primary VAS is unavailable or when the primary VAS does not possess the required skill. The system may evaluate the skills required to process a user request based on the context and/or intent. The system may distribute data related to voice input and context and/or intent among various VASs to complete a task. Furthermore, the system may classify voice input as generic voice input or target VAS-specific voice input, e.g., by utilizing a trained model.
Type: Grant
Filed: July 15, 2022
Date of Patent: May 27, 2025
Assignee: Adeia Guides Inc.
Inventors: Gyanveer Singh, Reda Harb
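The routing decision the abstract describes, including the fallback to a secondary VAS, could be sketched as below. The registry structure, wake-word map, and skill names are hypothetical placeholders, not the claimed system.

```python
# Hypothetical sketch of the routing decision: classify the voice input as
# generic or VAS-specific, then pick a primary VAS that is available and has
# the required skill, falling back to another VAS otherwise.
def assign_vas(voice_input, required_skill, vas_registry, wake_words):
    """vas_registry: dict vas_name -> {"skills": set, "available": bool};
    wake_words: dict wake_word -> vas_name."""
    # Target-specific input: a known wake word names the VAS explicitly.
    for wake, vas in wake_words.items():
        if voice_input.lower().startswith(wake):
            target = vas_registry.get(vas)
            if target and target["available"] and required_skill in target["skills"]:
                return vas
    # Generic input (or unavailable primary): any VAS with the skill may respond.
    for vas, info in vas_registry.items():
        if info["available"] and required_skill in info["skills"]:
            return vas
    return None

registry = {"vas_a": {"skills": {"music"}, "available": False},
            "vas_b": {"skills": {"music", "weather"}, "available": True}}
print(assign_vas("hey a, play jazz", "music", registry, {"hey a": "vas_a"}))  # vas_b
```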
-
Patent number: 12315508
Abstract: Implementations can reduce the time required to obtain responses from an automated assistant by, for example, obviating the need to provide an explicit invocation to the automated assistant, such as by saying a hot-word/phrase or performing a specific user input, prior to speaking a command or query. In addition, the automated assistant can optionally receive, understand, and/or respond to the command or query without communicating with a server, thereby further reducing the time in which a response can be provided. Implementations only selectively initiate on-device speech recognition responsive to determining one or more condition(s) are satisfied. Further, in some implementations, on-device NLU, on-device fulfillment, and/or resulting execution occur only responsive to determining, based on recognized text from the on-device speech recognition, that such further processing should occur.
Type: Grant
Filed: October 21, 2022
Date of Patent: May 27, 2025
Assignee: GOOGLE LLC
Inventors: Michael Golikov, Zaheed Sabur, Denis Burakov, Behshad Behzadi, Sergey Nazarov, Daniel Cotting, Mario Bertschler, Lucas Mirelmann, Steve Cheng, Bohdan Vlasyuk, Jonathan Lee, Lucia Terrenghi, Adrian Zumbrunnen
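The two-stage gating described here, conditions first, then a check on the recognized text, can be summarized in a short sketch. The callables and condition set are assumptions for illustration only.

```python
# Hypothetical sketch of the gating logic: on-device speech recognition starts
# only when at least one triggering condition holds, and on-device NLU /
# fulfillment run only if the recognized text looks like an assistant request.
def maybe_handle_locally(conditions, recognize, understand, fulfill):
    """conditions: iterable of zero-argument callables returning bool;
    recognize/understand/fulfill stand in for on-device ASR, NLU, execution."""
    if not any(cond() for cond in conditions):
        return None                      # keep ASR off; nothing to do
    text = recognize()                   # on-device speech recognition
    intent = understand(text)            # on-device NLU
    if intent is None:
        return None                      # text was not directed at the assistant
    return fulfill(intent)               # on-device fulfillment, no server round trip
```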
-
Patent number: 12300221
Abstract: A method for executing a session of a video game is provided, including the following operations: recording speech of a player engaged in gameplay of the session of the video game; analyzing a game state generated by the execution of the session of the video game, wherein analyzing the game state identifies a context of the gameplay; analyzing the recorded speech using the identified context of the gameplay and a speech recognition model, to identify textual content of the recorded speech; applying the identified textual content as a gameplay input for the session of the video game.
Type: Grant
Filed: May 28, 2022
Date of Patent: May 13, 2025
Assignee: SONY INTERACTIVE ENTERTAINMENT INC.
Inventor: Mahdi Azmandian
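One common way to use gameplay context with a speech recognition model is to bias or rescore hypotheses toward in-game vocabulary. The sketch below shows that general idea only; the bonus weighting and the vocabulary source are assumptions, not the claimed method.

```python
# Hypothetical sketch: rescore ASR hypotheses using a vocabulary derived from the
# current game state, so in-game terms ("reload", item names) win over
# acoustically similar but out-of-context words.
def rescore_with_context(hypotheses, game_state_terms, bonus=0.3):
    """hypotheses: list of (text, acoustic_score); returns the best text after
    adding a bonus per word that appears in the game-state vocabulary."""
    def contextual_score(item):
        text, score = item
        hits = sum(1 for w in text.lower().split() if w in game_state_terms)
        return score + bonus * hits
    return max(hypotheses, key=contextual_score)[0]

state_terms = {"reload", "grenade", "medkit"}
hyps = [("real old weapon", 0.52), ("reload weapon", 0.50)]
print(rescore_with_context(hyps, state_terms))  # reload weapon
```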
-
Patent number: 12293300
Abstract: The disclosure provides a method for training a semantic retrieval network, an electronic device and a storage medium. The method includes: obtaining a training sample including a search term and n candidate documents corresponding to the search term, where n is an integer greater than 1; inputting the training sample into the ranking model, to obtain n first correlation degrees output by the ranking model, in which each first correlation degree represents a correlation between a candidate document and the search term; inputting the training sample into the semantic retrieval model, to obtain n second correlation degrees output by the semantic retrieval model, in which each second correlation degree represents a correlation between a candidate document and the search term; and training the semantic retrieval model and the ranking model jointly based on the n first correlation degrees and the n second correlation degrees.
Type: Grant
Filed: September 7, 2022
Date of Patent: May 6, 2025
Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Yingqi Qu, Yuchen Ding, Jing Liu, Hua Wu, Haifeng Wang
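A joint-training signal over the two sets of correlation degrees is often expressed as a divergence between the two score distributions over the same n candidates. The sketch below shows one such listwise comparison; the softmax normalization and KL choice are assumptions for illustration, not necessarily the objective used in the patent.

```python
# Hypothetical sketch of a joint objective: compare the ranking model's and the
# retrieval model's softmax-normalized scores over the same n candidates.
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()

def listwise_kl(first_degrees, second_degrees):
    """KL divergence between the ranking model's and the semantic retrieval
    model's normalized correlation degrees for one query's n candidates."""
    p = softmax(np.asarray(first_degrees, dtype=float))   # ranking model
    q = softmax(np.asarray(second_degrees, dtype=float))  # retrieval model
    return float(np.sum(p * np.log(p / q)))

# One training sample: a search term with n = 4 candidate documents.
ranking_scores = [2.1, 0.3, -0.5, 1.0]
retrieval_scores = [1.5, 0.9, -0.2, 0.7]
print(listwise_kl(ranking_scores, retrieval_scores))  # joint-training signal
```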
-
Patent number: 12271694
Abstract: Quality of a narrative is characterized by receiving data that includes a narrative text. This narrative text is then tokenized and events are extracted from the tokenized words. The extraction can use, in parallel, two or more different extraction techniques. The extracted events are then aggregated so that a waveform can be generated, based on the aggregated extracted events, that characterizes a plurality of emotional arcs within the narrative text. Subsequently, a plurality of waveform elements are extracted from the waveform. The narrative quality (or other quality) of the narrative text is then scored based on the extracted plurality of waveform elements and using a machine learning model trained to correlate emotional arc waveforms with narrative quality scores. Related apparatus, systems, techniques and articles are also described.
Type: Grant
Filed: April 23, 2021
Date of Patent: April 8, 2025
Assignee: Educational Testing Service
Inventors: Swapna Somasundaran, Xianyang Chen, Michael Flor
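The tokenize, aggregate, waveform, features pipeline can be illustrated with a toy valence lexicon. Everything below, including the lexicon, the cumulative-sum arc, and the chosen waveform elements, is an illustrative placeholder rather than the patented feature set.

```python
# Hypothetical sketch of the pipeline: score sentences with a toy valence lexicon,
# aggregate the scores into a waveform over the narrative, then extract simple
# waveform elements that a trained scoring model could consume.
import numpy as np

VALENCE = {"joy": 1.0, "love": 1.0, "win": 0.8, "loss": -0.8, "fear": -1.0, "death": -1.0}

def emotional_waveform(sentences):
    scores = []
    for s in sentences:
        tokens = s.lower().split()
        scores.append(sum(VALENCE.get(t, 0.0) for t in tokens))
    return np.cumsum(scores)  # running emotional arc over the narrative

def waveform_elements(wave):
    peaks = sum(1 for i in range(1, len(wave) - 1) if wave[i - 1] < wave[i] > wave[i + 1])
    return {"mean": float(wave.mean()),
            "range": float(wave.max() - wave.min()),
            "peaks": peaks}

story = ["the win filled her with joy", "then came fear and loss", "love returned at last"]
print(waveform_elements(emotional_waveform(story)))
```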
-
Patent number: 12272356
Abstract: Techniques are described herein for enabling the use of "dynamic" or "context-specific" hot words for an automated assistant. In various implementations, an automated assistant may be operated at least in part on a computing device. Audio data captured by a microphone may be monitored for default hot word(s). Detection of one or more of the default hot words may trigger transition of the automated assistant from a limited hot word listening state into a speech recognition state. Transition of the computing device into a given state may be detected, and in response, the audio data captured by the microphone may be monitored for context-specific hot word(s), in addition to or instead of the default hot word(s). Detection of the context-specific hot word(s) may trigger the automated assistant to perform a responsive action associated with the given state, without requiring detection of default hot word(s).
Type: Grant
Filed: June 16, 2022
Date of Patent: April 8, 2025
Assignee: GOOGLE LLC
Inventors: Diego Melendo Casado, Jaclyn Konzelmann
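The state-dependent listening behavior reads naturally as a small dispatch table. The hot words, states, and action names below are invented for illustration; only the control flow mirrors what the abstract describes.

```python
# Hypothetical sketch of the listening logic: in the limited listening state only
# default hot words matter; once the device enters a given state (say, a timer is
# ringing), state-specific hot words such as "stop" trigger the mapped action
# directly, without a default hot word.
DEFAULT_HOT_WORDS = {"hey assistant"}
CONTEXT_HOT_WORDS = {"timer_ringing": {"stop": "cancel_timer", "snooze": "snooze_timer"}}

def handle_audio(transcribed_audio, device_state):
    text = transcribed_audio.lower().strip()
    for phrase, action in CONTEXT_HOT_WORDS.get(device_state, {}).items():
        if phrase in text:
            return ("perform_action", action)        # no default hot word needed
    if any(hw in text for hw in DEFAULT_HOT_WORDS):
        return ("enter_speech_recognition_state", None)
    return ("stay_in_limited_listening", None)

print(handle_audio("stop", "timer_ringing"))         # ('perform_action', 'cancel_timer')
print(handle_audio("stop", "idle"))                  # ('stay_in_limited_listening', None)
```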
-
Patent number: 12266352
Abstract: A natural language processing device according to an embodiment of the present disclosure may comprise: a memory for storing a first channel named entity dictionary including basic channel names and a synonym of each of the basic channel names; a communication interface for receiving, from a display device, voice data corresponding to a voice instruction uttered by a user; and a processor which: acquires multiple channel names included in electronic program guide information; extracts channel names matching the acquired multiple channel names from the first channel named entity dictionary so as to acquire a second channel named entity dictionary; acquires the intention of a speech of the voice instruction on the basis of text data of the voice data and the second channel named entity dictionary; and transmits the acquired intention of the speech to the display device through the communication interface.
Type: Grant
Filed: March 6, 2020
Date of Patent: April 1, 2025
Assignee: LG ELECTRONICS INC.
Inventors: Taejoon Kim, Jaekyung Lee
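The dictionary-narrowing step lends itself to a short sketch: filter the first dictionary against the channels actually present in the electronic program guide, then match the utterance against the smaller dictionary. The data shapes and channel names below are assumptions for illustration.

```python
# Hypothetical sketch: build the second channel named-entity dictionary by keeping
# only first-dictionary entries (basic names plus synonyms) whose basic channel
# name appears in the electronic program guide, then match the utterance against it.
def build_second_dictionary(first_dictionary, epg_channels):
    """first_dictionary: dict basic_name -> list of synonyms."""
    epg = {c.lower() for c in epg_channels}
    return {name: syns for name, syns in first_dictionary.items() if name.lower() in epg}

def resolve_channel(utterance_text, second_dictionary):
    text = utterance_text.lower()
    for basic_name, synonyms in second_dictionary.items():
        if basic_name.lower() in text or any(s.lower() in text for s in synonyms):
            return basic_name
    return None

first = {"Global Sports Network": ["GSN", "the sports channel"], "News 24": ["N24"]}
second = build_second_dictionary(first, epg_channels=["Global Sports Network"])
print(resolve_channel("switch to the sports channel", second))  # Global Sports Network
```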
-
Patent number: 12236946
Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods access an LM that includes a plurality of n-grams, each of the plurality of n-grams comprising a respective sequence of words and corresponding LM score, and receive a list of words associated with a group classification, each word in the list of words being associated with a respective weight. The systems and methods compute, based on the LM scores of the plurality of n-grams, a probability that a given word in the list of words associated with the group classification appears in an n-gram in the LM comprising an individual sequence of words, and add one or more new n-grams to the LM comprising one or more words in the list of words in combination with the individual sequence of words and associated with a particular LM score based on the computed probability.
Type: Grant
Filed: August 22, 2022
Date of Patent: February 25, 2025
Inventors: Jacob Assa, Alan Bekker, Zach Moshe
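One way to picture the described LM augmentation is to estimate how much probability mass the LM already gives class members after a word sequence, then assign that mass, split by the per-word weights, to new n-grams. The storage format, flooring, and scoring rule below are assumptions, not the claimed computation.

```python
# Hypothetical sketch: given an n-gram LM stored as log10 scores, estimate how
# likely words from a weighted class list are to follow an existing word sequence,
# and add new n-grams for the class members with scores derived from that estimate.
import math

def add_class_ngrams(lm_scores, context, class_words):
    """lm_scores: dict mapping n-gram tuples to log10 probabilities.
    context: tuple of words, e.g. ("call",).
    class_words: dict word -> weight (weights sum to 1 within the class)."""
    # Probability mass the LM already assigns to class members after this context.
    class_mass = sum(10 ** lm_scores[context + (w,)]
                     for w in class_words if context + (w,) in lm_scores)
    class_mass = max(class_mass, 1e-6)  # floor so unseen classes still get mass
    for word, weight in class_words.items():
        ngram = context + (word,)
        if ngram not in lm_scores:
            lm_scores[ngram] = math.log10(class_mass * weight)
    return lm_scores

lm = {("call",): -1.0, ("call", "alice"): -2.0}
names = {"alice": 0.5, "bob": 0.3, "carol": 0.2}
print(add_class_ngrams(lm, ("call",), names))
```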
-
Patent number: 12230243
Abstract: This disclosure describes a system that analyzes a corpus of text (e.g., a financial article, an audio book, etc.) so that the context surrounding the text is fully understood. For instance, the context may be an environment described by the text, or an environment in which the text occurs. Based on the analysis, the system can determine sentiment, part of speech, entities, and/or human characters at the token level of the text, and automatically generate Speech Synthesis Markup Language (SSML) tags based on this information. The SSML tags can be used by applications, services, and/or features that implement text-to-speech (TTS) conversion to improve the audio experience for end-users. Consequently, via the techniques described herein, more realistic and human-like speech synthesis can be efficiently implemented at larger scale (e.g., for audio books, for all the articles published to a news site, etc.).
Type: Grant
Filed: December 30, 2021
Date of Patent: February 18, 2025
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Mikayel Mirzoyan, André Aing, Aysar Khalid, Chad Joseph Lynch, Graham Michael Reeve, Sadek Baroudi, Vidush Vishwanath
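Token-level analysis feeding SSML generation can be shown in miniature. The toy sentiment lookup and the particular prosody attributes chosen below are illustrative assumptions; a real system would draw on much richer sentiment, entity, and character signals.

```python
# Hypothetical sketch: turn token-level analysis (here a toy sentiment lookup)
# into SSML, wrapping negative tokens in a softer, slower prosody element.
from xml.sax.saxutils import escape

TOKEN_SENTIMENT = {"plunged": -0.8, "soared": 0.7}

def to_ssml(sentence):
    parts = []
    for token in sentence.split():
        if TOKEN_SENTIMENT.get(token.lower(), 0.0) < 0:
            parts.append('<prosody rate="slow" volume="soft">%s</prosody>' % escape(token))
        else:
            parts.append(escape(token))
    return "<speak>%s</speak>" % " ".join(parts)

print(to_ssml("Shares plunged after the report while bonds soared"))
```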
-
Patent number: 12223977
Abstract: Disclosed is an audio signal processing method implemented by an audio system including at least an internal sensor, an external sensor and a speaker unit. The audio signal processing method includes: producing internal and external audio signals by measuring simultaneously acoustic signals reaching the internal sensor and acoustic signals reaching the external sensor, respectively; converting the internal audio signal to frequency domain, thereby producing an internal audio spectrum; converting the external audio signal to frequency domain, thereby producing an external audio spectrum; estimating, based on the internal audio spectrum, an echo audio spectrum of an echo audio signal caused by the speaker unit in the external audio signal, wherein the echo audio spectrum is estimated for frequencies above a minimum frequency, wherein the minimum frequency ≥ 600 hertz; and correcting the external audio spectrum for frequencies above the minimum frequency, based on the echo audio spectrum.
Type: Grant
Filed: August 8, 2022
Date of Patent: February 11, 2025
Assignee: Analog Devices International Unlimited Company
Inventors: Charles Fox, Jean-Marc Luneau
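The frequency-domain correction step can be sketched with a simple per-bin model. The fixed transfer gain, the window, and the subtraction rule below are assumptions made for illustration; the abstract does not specify how the echo spectrum is estimated.

```python
# Hypothetical sketch of the frequency-domain step: estimate the echo spectrum from
# the internal-sensor spectrum via a per-bin transfer estimate, then subtract it
# from the external spectrum only for bins above the minimum frequency (600 Hz here).
import numpy as np

def correct_external_spectrum(internal, external, sample_rate, transfer, min_freq=600.0):
    """internal, external: time-domain frames (same length); transfer: per-bin
    complex gain mapping the internal spectrum to the echo seen at the external mic."""
    int_spec = np.fft.rfft(internal * np.hanning(len(internal)))
    ext_spec = np.fft.rfft(external * np.hanning(len(external)))
    freqs = np.fft.rfftfreq(len(internal), d=1.0 / sample_rate)
    echo_spec = transfer * int_spec                  # estimated echo spectrum
    mask = freqs >= min_freq                         # only correct above the minimum frequency
    ext_spec[mask] -= echo_spec[mask]
    return ext_spec

fs, n = 16000, 512
t = np.arange(n) / fs
internal = np.sin(2 * np.pi * 1000 * t)              # 1 kHz tone leaking as echo
external = 0.3 * internal + np.random.randn(n) * 0.01
corrected = correct_external_spectrum(internal, external, fs,
                                      transfer=np.full(n // 2 + 1, 0.3))
```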
-
Patent number: 12205573
Abstract: A user assistance device aids validation of transactions on payment terminals. A payment terminal that accepts a payment card at a point of sale can be detected. A transaction amount for validation can be determined from the payment terminal. Subsequently, audio of the transaction amount can be generated. Playing of the audio on a user assistance device can then be triggered, allowing a user to hear the transaction amount before accepting or rejecting the transaction. Further, transaction information can be analyzed and used to suggest whether a customer should accept or reject the transaction.
Type: Grant
Filed: June 23, 2022
Date of Patent: January 21, 2025
Assignee: CAPITAL ONE SERVICES, LLC
Inventors: Joshua Edwards, Abdelkader M'Hamed Benkreira, Noel Lyles, Michael Mossoba
-
Patent number: 12205597
Abstract: A voice processing system includes: a first acquisition processor that acquires voice data collected by a microphone installed in a microphone-speaker device; a second acquisition processor that acquires authentication information of a wearer who wears the microphone-speaker device, the authentication information being acquired by an authentication information acquirer installed in the microphone-speaker device; and a control processor that executes predetermined processing related to the voice data, which is acquired by the first acquisition processor, on the basis of the authentication information acquired by the second acquisition processor.
Type: Grant
Filed: May 9, 2022
Date of Patent: January 21, 2025
Assignee: SHARP KABUSHIKI KAISHA
Inventors: Tatsuya Nishio, Fumiaki Sugimori
-
Patent number: 12190018
Abstract: A system and method for generating, triggering and playing a sequence of audio files with cues for delivering a presentation for a presenter using a personal audio device coupled to a computing device. The system comprises a computing device that is coupled to a presentation data analysis server through a network. The method includes (i) generating a sequence of audio files with cues for delivering a presentation, (ii) triggering playing an audio file from the sequence of audio files, and (iii) playing the sequence of audio files one by one, on the computing device, using the personal audio device coupled to the computing device to enable the presenter to recall and speak the content based on the sequence of the audio files.
Type: Grant
Filed: January 4, 2021
Date of Patent: January 7, 2025
Inventor: Arjun Karthik Bala
-
Patent number: 12175998
Abstract: Speech analysis devices and methods for identifying migraine attacks are provided. Migraine sufferers can experience changes in speech patterns both during a migraine attack and in a pre-attack phase (e.g., a time period before the migraine attack can be recognized by the migraine sufferer). Embodiments identify or predict migraine attacks during the pre-attack phase and/or the attack phase (such as early stages of a migraine attack) by comparing speech features from one or more speech samples provided by a user against baseline data. The speech features are indicative and/or predictive of migraine onset, and can be personalized to a user and/or based on normative data.
Type: Grant
Filed: November 8, 2019
Date of Patent: December 24, 2024
Assignees: Arizona Board of Regents on behalf of Arizona State University, Mayo Foundation for Medical Education and Research
Inventors: Visar Berisha, Jacob Peplinski, Todd Schwedt
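Comparing speech features against personalized baseline data is commonly done with per-feature deviation scores. The sketch below uses z-scores and an arbitrary hit threshold purely as an illustration; the actual features and decision rule are not specified in the abstract.

```python
# Hypothetical sketch: compare a few speech features from a new sample against the
# user's personal baseline using z-scores, and flag a possible pre-attack phase
# when enough features drift. Features and thresholds are illustrative placeholders.
def flag_pre_attack(sample_features, baseline_mean, baseline_std,
                    z_threshold=2.0, min_hits=2):
    """All arguments are dicts keyed by feature name (e.g., speaking_rate,
    pause_ratio, articulation_precision)."""
    hits = 0
    for name, value in sample_features.items():
        z = abs(value - baseline_mean[name]) / max(baseline_std[name], 1e-9)
        if z >= z_threshold:
            hits += 1
    return hits >= min_hits

baseline_mean = {"speaking_rate": 4.2, "pause_ratio": 0.18}
baseline_std = {"speaking_rate": 0.3, "pause_ratio": 0.04}
sample = {"speaking_rate": 3.4, "pause_ratio": 0.30}
print(flag_pre_attack(sample, baseline_mean, baseline_std))  # True
```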
-
Patent number: 12165641
Abstract: A method includes receiving follow-on audio data captured by an assistant-enabled device, the follow-on audio data corresponding to a follow-on query spoken by a user of the assistant-enabled device to a digital assistant subsequent to the user submitting a previous query to the digital assistant. The method also includes processing, using a speech recognizer, the follow-on audio data to generate multiple candidate hypotheses, each candidate hypothesis corresponding to a candidate transcription for the follow-on query and represented by a respective sequence of hypothesized terms. For each corresponding candidate hypothesis among the multiple candidate hypotheses, the method also includes determining a corresponding similarity metric between the previous query and the corresponding candidate hypothesis and determining a transcription of the follow-on query spoken by the user based on the similarity metrics determined for the multiple candidate hypotheses.
Type: Grant
Filed: July 11, 2022
Date of Patent: December 10, 2024
Assignee: Google LLC
Inventors: Patrick Siegler, Aurélien Boffy, Ágoston Weisz
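The similarity-based selection among candidate hypotheses can be sketched with a simple term-overlap metric. The metric and the linear combination with the recognizer score are assumptions for illustration; only the overall idea, favoring transcriptions consistent with the previous query, follows the abstract.

```python
# Hypothetical sketch: compute a similarity metric between the previous query and
# each candidate hypothesis for the follow-on query, then combine it with the
# recognizer's score so transcriptions consistent with the conversation win.
def term_overlap(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def pick_transcription(previous_query, hypotheses, weight=0.5):
    """hypotheses: list of (candidate_text, asr_score)."""
    def combined(item):
        text, score = item
        return score + weight * term_overlap(previous_query, text)
    return max(hypotheses, key=combined)[0]

previous = "how tall is the eiffel tower"
hyps = [("when was the eiffel tower built", 0.48),
        ("when was the awful tower built", 0.50)]
print(pick_transcription(previous, hyps))  # when was the eiffel tower built
```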