Patents Examined by Richemond Dorvil
  • Patent number: 12198713
    Abstract: A lip sync image generation device based on machine learning according to a disclosed embodiment includes two artificial neural network models: an image synthesis model, which takes a person background image and an utterance audio signal as input and generates a lip sync image, and a lip sync discrimination model, which discriminates the degree of match between the lip sync image generated by the image synthesis model and the utterance audio signal input to the image synthesis model.
    Type: Grant
    Filed: June 17, 2021
    Date of Patent: January 14, 2025
    Assignee: DEEPBRAIN AI INC.
    Inventor: Gyeong Su Chae
  • Patent number: 12190905
    Abstract: Embodiments described herein provide a machine-learning architecture for modeling quality measures for enrollment signals. Modeling these enrollment signals enables the machine-learning architecture to identify deviations from an expected or ideal enrollment signal in future test-phase calls. These deviations can be used to generate quality measures for the various audio descriptors or characteristics of audio signals. The quality measures can then be fused at the score level with the speaker recognition model's embedding comparisons for verifying the speaker. Fusing the quality measures with the similarity scoring essentially calibrates the speaker recognition model's outputs based on what is expected for the enrolled caller versus what was actually observed for the current inbound caller.
    Type: Grant
    Filed: August 20, 2021
    Date of Patent: January 7, 2025
    Assignee: Pindrop Security, Inc.
    Inventors: Hrishikesh Rao, Kedar Phatak, Elie Khoury
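The score-level fusion in the abstract above can be sketched in a few lines. The deviation-based quality measures, the descriptor vectors, and the linear weighting below are all illustrative assumptions; the patent's actual quality models are learned.

```python
import numpy as np

def quality_measures(enrolled_descriptors, test_descriptors):
    """Quality measures: deviations of the test call's audio descriptors
    (e.g. SNR, duration) from what was observed at enrollment."""
    e = np.asarray(enrolled_descriptors, dtype=float)
    t = np.asarray(test_descriptors, dtype=float)
    return np.abs(t - e)

def fused_score(embedding_similarity, enrolled_descriptors, test_descriptors, weights):
    """Score-level fusion: calibrate the recognizer's similarity score by
    penalizing deviation from the enrolled caller's expected audio quality."""
    penalty = float(np.dot(weights, quality_measures(enrolled_descriptors,
                                                     test_descriptors)))
    return embedding_similarity - penalty
```

A test call whose descriptors match enrollment keeps its raw similarity; a degraded call is scored down before the verification threshold is applied.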
  • Patent number: 12183321
    Abstract: Processor(s) of a client device can: receive audio data that captures a spoken utterance of a user of the client device; process, using an on-device speech recognition model, the audio data to generate a predicted textual segment that is a prediction of the spoken utterance; cause at least part of the predicted textual segment to be rendered (e.g., visually and/or audibly); receive further user interface input that is a correction of the predicted textual segment to an alternate textual segment; and generate a gradient based on comparing at least part of the predicted output to ground truth output that corresponds to the alternate textual segment. The gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model and/or is transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
    Type: Grant
    Filed: October 5, 2023
    Date of Patent: December 31, 2024
    Assignee: GOOGLE LLC
    Inventors: Françoise Beaufays, Johan Schalkwyk, Giovanni Motta
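The correction-to-gradient step above can be illustrated with a toy linear token classifier: the user's corrected text supplies the ground-truth label, and the cross-entropy gradient either updates local weights or is shipped to a server. This stands in for backpropagation through a real on-device recognizer; all names and shapes are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def correction_gradient(features, W, target_id):
    """Gradient of the cross-entropy loss w.r.t. W, using the corrected
    (alternate) token as ground truth."""
    probs = softmax(W @ features)
    grad_logits = probs.copy()
    grad_logits[target_id] -= 1.0           # dL/dlogits for cross-entropy
    return np.outer(grad_logits, features)  # dL/dW

def apply_update(W, grad, lr=0.1):
    # Local on-device step; alternatively, `grad` is transmitted for
    # federated averaging into the global model.
    return W - lr * grad
```

After one update, the model assigns higher probability to the token the user actually meant.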
  • Patent number: 12170092
    Abstract: The present technology relates to a signal processing device, a method, and a program that can obtain a signal with higher sound quality. The signal processing device includes: a calculation unit that calculates a parameter for generating a difference signal corresponding to an input compressed sound source signal on the basis of a prediction coefficient and the input compressed sound source signal, the prediction coefficient being obtained by learning using, as training data, a difference signal between an original sound signal and a learning compressed sound source signal obtained by compressing and coding the original sound signal; a difference signal generation unit that generates the difference signal on the basis of the parameter and the input compressed sound source signal; and a synthesis unit that synthesizes the generated difference signal and the input compressed sound source signal. The present technology can be applied to a signal processing device.
    Type: Grant
    Filed: February 20, 2020
    Date of Patent: December 17, 2024
    Assignee: Sony Group Corporation
    Inventor: Takao Fukui
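The pipeline in the abstract above — learn a mapping from the compressed signal to the (original − compressed) difference signal, then add the predicted difference back at playback — can be sketched with a linear predictor. The least-squares fit over lagged samples is a stand-in for the patent's learned prediction coefficients.

```python
import numpy as np

def learn_coefficients(compressed, original, order=4):
    """Fit coefficients mapping lagged samples of the compressed signal
    to the difference signal (original - compressed)."""
    diff = original - compressed
    X = np.stack([np.roll(compressed, k) for k in range(order)], axis=1)
    coeffs, *_ = np.linalg.lstsq(X, diff, rcond=None)
    return coeffs

def enhance(compressed, coeffs):
    """Generate the difference signal from the parameters and synthesize
    it with the compressed input."""
    X = np.stack([np.roll(compressed, k) for k in range(len(coeffs))], axis=1)
    return compressed + X @ coeffs
```

With a toy coding loss that a linear map can undo, the enhanced output recovers the original signal.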
  • Patent number: 12165663
    Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
    Type: Grant
    Filed: November 14, 2022
    Date of Patent: December 10, 2024
    Assignee: GOOGLE LLC
    Inventors: Beat Gfeller, Dominik Roblek, Félix de Chaumont Quitry, Marco Tagliasacchi
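The self-supervision in the abstract above works because the "label" comes free from sampling: when two slices are drawn from an unlabeled signal, their temporal distance is known exactly. A minimal sketch of the pair sampler and the distance loss (function names are illustrative):

```python
import numpy as np

def sample_slice_pair(audio, slice_len, rng):
    """Sample two slices from an unlabeled signal; the ground-truth target
    (their temporal distance) comes for free from the sampling itself."""
    max_start = len(audio) - slice_len
    i, j = rng.integers(0, max_start + 1, size=2)
    distance = abs(int(i) - int(j))
    return audio[i:i + slice_len], audio[j:j + slice_len], distance

def distance_loss(predicted, ground_truth):
    """Squared error between the model's distance estimate and the true
    offset; this is what gets backpropagated end to end."""
    return (predicted - ground_truth) ** 2
```

A model trained this way never needs human annotation, only raw audio.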
  • Patent number: 12165179
    Abstract: An analytics system receives, from a client computing device, a request to generate a presentation. The analytics system accesses one or more feedback datasets of feedback data. The feedback data comprises unstructured data available from multiple data stores. The analytics system generates, for each feedback dataset, a respective feedback text and a respective sentiment score indicating a degree of negativity associated with the respective feedback text. For a combination of a plurality of generated feedback texts, the analytics system selects a set of themes based at least on a plurality of generated sentiment scores. Each sentiment score of the plurality of generated sentiment scores is associated with one of the plurality of generated feedback texts. The analytics system generates a presentation file that indicates the set of themes. The analytics system causes the presentation file to be transmitted to the client computing device.
    Type: Grant
    Filed: March 11, 2022
    Date of Patent: December 10, 2024
    Assignee: TREDENCE INC.
    Inventors: Ankush Chopra, Aravind Chandramouli, Ashutosh Rajesh Kothiwala, Shiven Purohit, Siddharth Shukla, Shubham Pandey, Soumendra Mohanty
  • Patent number: 12154541
    Abstract: A method, computer program product, and computing system for receiving feature-based voice data associated with a first acoustic domain. One or more reverberation-based augmentations may be performed on at least a portion of the feature-based voice data, thus defining reverberation-augmented feature-based voice data.
    Type: Grant
    Filed: March 10, 2021
    Date of Patent: November 26, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dushyant Sharma, Patrick A. Naylor, James W. Fosburgh, Do Yeong Kim
  • Patent number: 12154574
    Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
    Type: Grant
    Filed: November 9, 2023
    Date of Patent: November 26, 2024
    Assignee: Google LLC
    Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
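The scoring scheme above reduces to counting how many of each model's verification results carry the performance metric and comparing the two counts. The patent leaves the metric abstract; the sketch below represents it as a hypothetical tag on each result, and assumes fewer flagged results means better capability.

```python
def model_score(results, metric):
    """Fraction of a model's verification results that include the given
    performance metric."""
    if not results:
        return 0.0
    flagged = sum(1 for r in results if metric in r.get("metrics", ()))
    return flagged / len(results)

def alternative_is_better(primary_results, alternative_results, metric="mismatch"):
    """Compare verification capability of the alternative model against the
    primary model via their metric scores (assumed: lower is better)."""
    return model_score(alternative_results, metric) < model_score(primary_results, metric)
```

This lets an alternative model be evaluated on live traffic before it replaces the primary one.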
  • Patent number: 12154544
    Abstract: A speech-processing system receives input data representing text. An encoder processes segments of the text to determine embedding data, and a decoder processes the embedding data to determine one or more categories associated with each segment. Output data is determined by selecting words based on the segments and categories.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: November 26, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Michal Czuczman, You Wang, Masaki Noguchi, Viacheslav Klimkov
  • Patent number: 12147767
    Abstract: Implementations are described herein for recommending actions based on entity or entity type. In various implementations, a partial free-form natural language input may be received from a user at an input component of a computing device. The partial free-form natural language input may identify an entity without identifying a responsive action and may be directed by the user to an automated assistant that operates at least in part on the computing device. The partial free-form natural language input may be analyzed to identify the entity. Based on the identified entity, a plurality or superset of candidate responsive actions may be identified, filtered, and/or ranked based on one or more signals. The automated assistant may then provide output that recommends one or more of the candidate responsive actions based on the ranking and/or filtering.
    Type: Grant
    Filed: May 8, 2023
    Date of Patent: November 19, 2024
    Assignee: GOOGLE LLC
    Inventors: Keun Soo Yim, Kyung Yul Lim, Umesh Patil
  • Patent number: 12142289
    Abstract: An adaptive echo cancellation system introduces an acoustic reference signal to audio content being transmitted to the speaker for playback. The acoustic reference signal is an out-of-band signal, such as an ultrasonic signal, which is typically not audible to humans. The microphone of the mobile device receives the audio content played back by the speaker as well as audio content introduced by the user (e.g., the speech of the user). The adaptive echo cancellation system detects the acoustic reference signal and determines a time delay between when the acoustic reference signal was introduced to the audio content and when the audio content including the acoustic reference signal was received by the mobile device. Echo is cancelled from the received audio content based on this determined time delay.
    Type: Grant
    Filed: February 3, 2022
    Date of Patent: November 12, 2024
    Assignee: Motorola Mobility LLC
    Inventors: Seungho Kim, Joseph C. Dwyer, Giles T. Davis
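The core of the scheme above is measuring the lag between injecting the out-of-band reference and hearing it back at the microphone; cross-correlation is the standard way to find that lag. The noise burst in the test stands in for a real ultrasonic reference, and the one-tap canceller is a deliberate simplification.

```python
import numpy as np

def estimate_delay(reference, captured, fs):
    """Estimate how long after injection the reference reappears in the
    microphone capture: the cross-correlation peak gives the lag."""
    corr = np.correlate(captured, reference, mode="full")
    lag = int(np.argmax(corr)) - (len(reference) - 1)
    return max(lag, 0) / fs

def cancel_echo(captured, far_end, delay_samples, gain=1.0):
    """Toy echo canceller: subtract the delayed far-end audio."""
    echo = np.zeros_like(captured)
    echo[delay_samples:delay_samples + len(far_end)] = \
        gain * far_end[:len(captured) - delay_samples]
    return captured - echo
```

In practice the gain and delay drift, so the estimate is refreshed continuously from the reference tone.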
  • Patent number: 12142259
    Abstract: A method of detecting live speech comprises: receiving a signal containing speech; obtaining a first component of the received signal in a first frequency band, wherein the first frequency band includes audio frequencies; and obtaining a second component of the received signal in a second frequency band higher than the first frequency band. Then, modulation of the first component of the received signal is detected; modulation of the second component of the received signal is detected; and the modulation of the first component and the modulation of the second component are compared. It may then be determined that the speech is not live speech if the modulation of the first component of the received signal differs from the modulation of the second component of the received signal.
    Type: Grant
    Filed: May 16, 2023
    Date of Patent: November 12, 2024
    Assignee: Cirrus Logic Inc.
    Inventors: John Paul Lesso, Toru Ido
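The comparison above can be sketched directly: split the signal into an audio band and a higher band, extract each band's amplitude modulation (envelope), and correlate the two. The specific band edges, the FFT-mask bandpass, and the moving-average envelope are all simplifying assumptions.

```python
import numpy as np

def band_component(x, fs, lo, hi):
    """Isolate a frequency band via FFT masking (simple zero-phase bandpass)."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    X[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(X, n=len(x))

def envelope(x, win=256):
    """Amplitude modulation estimate: rectify, then smooth."""
    kernel = np.ones(win) / win
    return np.convolve(np.abs(x), kernel, mode="same")

def modulation_similarity(signal, fs, audio_band=(100, 4000), high_band=(8000, 20000)):
    """Correlate the modulation of the audio-band and high-band components;
    live speech modulates both bands together."""
    low = envelope(band_component(signal, fs, *audio_band))
    high = envelope(band_component(signal, fs, *high_band))
    low, high = low - low.mean(), high - high.mean()
    denom = np.linalg.norm(low) * np.linalg.norm(high)
    return float(low @ high / denom) if denom > 0 else 0.0
```

A loudspeaker replaying recorded speech tends to distort or drop the high band, so mismatched modulation flags a possible spoof.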
  • Patent number: 12142292
    Abstract: An audio emitter configured to emit a sound creates a high-frequency copy of the sound to be emitted. The high-frequency copy of the sound is superimposed over the sound, resulting in a composite signal. The composite signal is emitted by the emitter. The high-frequency copy is at a frequency inaudible to humans, enabling a receiver to identify the emitter and/or the sound.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: November 12, 2024
    Assignee: International Business Machines Corporation
    Inventors: Samuel B. Hawker, Alexander John Naylor-Teece, Bhavnit Patel, Grace Jansen
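One way to sketch the scheme above: each emitter superimposes its copy on an emitter-specific near-ultrasonic carrier, and the receiver identifies the emitter by which registered carrier band carries energy. Amplitude modulation and the specific carrier frequencies are assumptions; the patent does not fix the embedding scheme.

```python
import numpy as np

def embed_copy(sound, fs, carrier_hz, gain=0.05):
    """Superimpose a high-frequency copy of `sound`: the audible sound
    amplitude-modulates an (ideally inaudible) carrier."""
    t = np.arange(len(sound)) / fs
    return sound + gain * sound * np.cos(2 * np.pi * carrier_hz * t)

def identify_emitter(signal, fs, emitter_carriers, bw=1000.0):
    """Identify the emitter by which registered carrier band holds the most
    energy above the audible range."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)

    def band_energy(c):
        return spectrum[(freqs > c - bw) & (freqs < c + bw)].sum()

    return max(emitter_carriers, key=lambda name: band_energy(emitter_carriers[name]))
```

Because the copy sits above human hearing, listeners perceive only the original sound while receivers can still attribute it.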
  • Patent number: 12136414
    Abstract: Audio signals representing a current utterance in a conversation and a dialog history including at least information associated with past utterances corresponding to the current utterance in the conversation can be received. The dialog history can be encoded into an embedding. A spoken language understanding neural network model can be trained to perform a spoken language understanding task based on input features including at least speech features associated with the received audio signals and the embedding. An encoder can also be trained to encode a given dialog history into an embedding. The spoken language understanding task can include predicting a dialog action of an utterance. The spoken language understanding task can include predicting a dialog intent or overall topic of the conversation.
    Type: Grant
    Filed: August 18, 2021
    Date of Patent: November 5, 2024
    Assignee: International Business Machines Corporation
    Inventors: Samuel Thomas, Jatin Ganhotra, Hong-Kwang Kuo, Sachindra Joshi, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury
  • Patent number: 12124798
    Abstract: A method is disclosed for calculating similarity rates between electronic documents. The similarity rate is calculated based on a count of matching phrases between the electronic documents and distances between subsequent matching phrases in each of the electronic documents. A system is also disclosed for comparing the electronic documents to obtain their similarity rates. A computing device determines at least one first proximity parameter based on the number of matched words in a matching phrase and at least one second proximity parameter based on distances between the subsequent matching phrases in each of the electronic documents. The similarity rate is determined based on the first and second proximity parameters.
    Type: Grant
    Filed: August 30, 2021
    Date of Patent: October 22, 2024
    Assignee: KYOCERA DOCUMENT SOLUTIONS INC.
    Inventor: Oleg Y. Zakharov
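The two proximity parameters above can be sketched with shared n-grams as the "matching phrases": the first parameter counts matched words, the second rewards small distances between subsequent matches. The trigram size, the equal weighting, and the gap transform are illustrative choices; the patent does not fix the combination.

```python
from collections import defaultdict

def matching_phrases(a_words, b_words, n=3):
    """All (i, j) positions where an n-gram of document A matches one of B."""
    index = defaultdict(list)
    for i in range(len(a_words) - n + 1):
        index[tuple(a_words[i:i + n])].append(i)
    return sorted((i, j)
                  for j in range(len(b_words) - n + 1)
                  for i in index.get(tuple(b_words[j:j + n]), ()))

def similarity_rate(a_text, b_text, n=3):
    a, b = a_text.lower().split(), b_text.lower().split()
    matches = matching_phrases(a, b, n)
    if not matches:
        return 0.0
    # First proximity parameter: share of A's words inside matching phrases.
    covered = {k for i, _ in matches for k in range(i, i + n)}
    word_param = len(covered) / len(a)
    # Second proximity parameter: closeness of subsequent matches in A.
    gaps = [matches[k + 1][0] - matches[k][0] for k in range(len(matches) - 1)]
    gap_param = 1.0 / (1.0 + (sum(gaps) / len(gaps) if gaps else 0.0))
    return 0.5 * word_param + 0.5 * gap_param
```

Documents sharing many densely packed phrases score high; documents with no shared phrases score zero.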
  • Patent number: 12125482
    Abstract: An example apparatus for recognizing speech includes an audio receiver to receive a stream of audio. The apparatus also includes a key phrase detector to detect a key phrase in the stream of audio. The apparatus further includes a model adapter to dynamically adapt a model based on the detected key phrase. The apparatus also includes a query recognizer to detect a voice query following the key phrase in the stream of audio via the adapted model.
    Type: Grant
    Filed: November 22, 2019
    Date of Patent: October 22, 2024
    Assignee: Intel Corporation
    Inventors: Krzysztof Czarnowski, Munir Nikolai Alexander Georges, Tobias Bocklet, Georg Stemmer
  • Patent number: 12125490
    Abstract: Techniques for a digital assistant to receive intent input from a secondary user are provided. A digital assistant receives a query intent from a primary user, wherein the primary user is authorized to provide query intents. It is determined, based on the query intent, that intent input will be provided by a secondary user, wherein the secondary user is not authorized to provide query intents. Intent input provided by the secondary user is received. The digital assistant processes the query intent using the intent input. The results of the query intent are provided to the primary user.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: October 22, 2024
    Assignee: MOTOROLA SOLUTIONS, INC.
    Inventors: Woei Chyuan Tan, Bing Qin Lim, Guo Dong Gan, Chun Meng Tan
  • Patent number: 12118323
    Abstract: An approach for generating an optimized video of a speaker, translated from a source language into a target language with the speaker's lips synchronized to the translated speech, while balancing optimization of the translation quality. A source video may be fed into a neural machine translation model, which may synthesize a plurality of potential translations. The translations may be received by a generative adversarial network, which generates video for each translation and classifies each as in sync or out of sync. A lip-syncing score may then be computed for each of the generated videos that are classified as in sync.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: October 15, 2024
    Assignee: International Business Machines Corporation
    Inventors: Sathya Santhar, Sridevi Kannan, Sarbajit K. Rakshit, Samuel Mathew Jawaharlal
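The final selection step above — filter to the translations the discriminator classified as in sync, then balance translation quality against lip-sync score — can be sketched as a ranking function. The candidate tuples, scores, and 0.6/0.4 weighting are all made-up illustrations.

```python
def best_translation(candidates):
    """Choose among synthesized translations: prefer those the adversarial
    discriminator classified as in sync, then maximize a weighted balance of
    translation quality and lip-sync score.

    Each candidate: (text, translation_score, in_sync, sync_score)."""
    pool = [c for c in candidates if c[2]] or list(candidates)
    return max(pool, key=lambda c: 0.6 * c[1] + 0.4 * c[3])
```

Note that the top raw translation can lose to a slightly weaker one whose generated video syncs better with the speaker's lips.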
  • Patent number: 12118320
    Abstract: Systems and methods for conducting communications between a user and an Artificial Intelligence (AI) character model are provided. An example method includes determining a context of a dialog between the AI character model and the user, the context being determined based on a data stream received from a client-side computing device associated with the user; receiving a message of the user in the dialog; generating, based on the context and the message, an input to a language model configured to predict a response to the message; providing the input to the language model to obtain the response; and transmitting the response to the client-side computing device, where the client-side computing device presents the response to the user. The input to the language model includes the message expanded by a keyword associated with the context. The context includes an intent of the user and an emotional state of the user.
    Type: Grant
    Filed: April 27, 2023
    Date of Patent: October 15, 2024
    Assignee: Theai, Inc.
    Inventors: Ilya Gelfenbeyn, Mikhail Ermolenko, Kylan Gibbs
  • Patent number: 12118984
    Abstract: Systems and methods are presented herein for providing a user with a notification, or access to content, based on the user's factual discourse during a conversation with other users. A first user may provide a first statement. A second user may provide a second statement. An application determines the first and the second statement are associated with first and second user profiles, respectively. The application analyzes the elements of each respective statement and determines there is a conflict between the user statements. In response to determining there is a conflict between the respective statements, the application generates a respective search query to verify each respective statement. When the application determines there is an answer that resolves the conflict between the respective statements, the application generates a notification for the users that comprises the answer that resolves the conflict and may include access to content affirming the answer.
    Type: Grant
    Filed: November 11, 2020
    Date of Patent: October 15, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose