Patents Examined by Richemond Dorvil
  • Patent number: 12198713
    Abstract: A lip sync image generation device based on machine learning according to a disclosed embodiment includes two artificial neural network models: an image synthesis model, which takes a person background image and an utterance audio signal as input and generates a lip sync image, and a lip sync discrimination model, which discriminates the degree of match between the lip sync image generated by the image synthesis model and the utterance audio signal input to the image synthesis model.
    Type: Grant
    Filed: June 17, 2021
    Date of Patent: January 14, 2025
    Assignee: DEEPBRAIN AI INC.
    Inventor: Gyeong Su Chae
  • Patent number: 12190905
    Abstract: Embodiments described herein provide a machine-learning architecture for modeling quality measures for enrollment signals. Modeling these enrollment signals enables the machine-learning architecture to identify deviations from an expected or ideal enrollment signal in future test-phase calls. These deviations can be used to generate quality measures for the various audio descriptors or characteristics of audio signals. The quality measures can then be fused at the score level with the speaker recognition model's embedding comparisons for verifying the speaker. Fusing the quality measures with the similarity scoring essentially calibrates the speaker recognition model's outputs based on what is expected for the enrolled caller versus what was actually observed for the current inbound caller.
    Type: Grant
    Filed: August 20, 2021
    Date of Patent: January 7, 2025
    Assignee: Pindrop Security, Inc.
    Inventors: Hrishikesh Rao, Kedar Phatak, Elie Khoury
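The score-level fusion in the abstract above can be sketched in a few lines. The deviation-based quality measures, the descriptor vectors, and the linear weighting below are all illustrative assumptions; the patent's actual quality models are learned.

```python
import numpy as np

def quality_measures(enrolled_descriptors, test_descriptors):
    """Quality measures: deviations of the test call's audio descriptors
    (e.g. SNR, duration) from what was observed at enrollment."""
    e = np.asarray(enrolled_descriptors, dtype=float)
    t = np.asarray(test_descriptors, dtype=float)
    return np.abs(t - e)

def fused_score(embedding_similarity, enrolled_descriptors, test_descriptors, weights):
    """Score-level fusion: calibrate the recognizer's similarity score by
    penalizing deviation from the enrolled caller's expected audio quality."""
    penalty = float(np.dot(weights, quality_measures(enrolled_descriptors,
                                                     test_descriptors)))
    return embedding_similarity - penalty
```

A test call whose descriptors match enrollment keeps its raw similarity; a degraded call is scored down before the verification threshold is applied.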
  • Patent number: 12183321
    Abstract: Processor(s) of a client device can: receive audio data that captures a spoken utterance of a user of the client device; process, using an on-device speech recognition model, the audio data to generate a predicted textual segment that is a prediction of the spoken utterance; cause at least part of the predicted textual segment to be rendered (e.g., visually and/or audibly); receive further user interface input that is a correction of the predicted textual segment to an alternate textual segment; and generate a gradient based on comparing at least part of the predicted output to ground truth output that corresponds to the alternate textual segment. The gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model and/or is transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
    Type: Grant
    Filed: October 5, 2023
    Date of Patent: December 31, 2024
    Assignee: GOOGLE LLC
    Inventors: Françoise Beaufays, Johan Schalkwyk, Giovanni Motta
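The correction-to-gradient step above can be illustrated with a toy linear token classifier: the user's corrected text supplies the ground-truth label, and the cross-entropy gradient either updates local weights or is shipped to a server. This stands in for backpropagation through a real on-device recognizer; all names and shapes are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def correction_gradient(features, W, target_id):
    """Gradient of the cross-entropy loss w.r.t. W, using the corrected
    (alternate) token as ground truth."""
    probs = softmax(W @ features)
    grad_logits = probs.copy()
    grad_logits[target_id] -= 1.0           # dL/dlogits for cross-entropy
    return np.outer(grad_logits, features)  # dL/dW

def apply_update(W, grad, lr=0.1):
    # Local on-device step; alternatively, `grad` is transmitted for
    # federated averaging into the global model.
    return W - lr * grad
```

After one update, the model assigns higher probability to the token the user actually meant.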
  • Patent number: 12170092
    Abstract: The present technology relates to a signal processing device, a method, and a program that can obtain a signal with higher sound quality. The signal processing device includes: a calculation unit that calculates a parameter for generating a difference signal corresponding to an input compressed sound source signal on the basis of a prediction coefficient and the input compressed sound source signal, the prediction coefficient being obtained by learning using, as training data, a difference signal between an original sound signal and a learning compressed sound source signal obtained by compressing and coding the original sound signal; a difference signal generation unit that generates the difference signal on the basis of the parameter and the input compressed sound source signal; and a synthesis unit that synthesizes the generated difference signal and the input compressed sound source signal. The present technology can be applied to a signal processing device.
    Type: Grant
    Filed: February 20, 2020
    Date of Patent: December 17, 2024
    Assignee: Sony Group Corporation
    Inventor: Takao Fukui
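The pipeline in the abstract above — learn a mapping from the compressed signal to the (original − compressed) difference signal, then add the predicted difference back at playback — can be sketched with a linear predictor. The least-squares fit over lagged samples is a stand-in for the patent's learned prediction coefficients.

```python
import numpy as np

def learn_coefficients(compressed, original, order=4):
    """Fit coefficients mapping lagged samples of the compressed signal
    to the difference signal (original - compressed)."""
    diff = original - compressed
    X = np.stack([np.roll(compressed, k) for k in range(order)], axis=1)
    coeffs, *_ = np.linalg.lstsq(X, diff, rcond=None)
    return coeffs

def enhance(compressed, coeffs):
    """Generate the difference signal from the parameters and synthesize
    it with the compressed input."""
    X = np.stack([np.roll(compressed, k) for k in range(len(coeffs))], axis=1)
    return compressed + X @ coeffs
```

With a toy coding loss that a linear map can undo, the enhanced output recovers the original signal.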
  • Patent number: 12165663
    Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
    Type: Grant
    Filed: November 14, 2022
    Date of Patent: December 10, 2024
    Assignee: GOOGLE LLC
    Inventors: Beat Gfeller, Dominik Roblek, Félix de Chaumont Quitry, Marco Tagliasacchi
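The self-supervision in the abstract above works because the "label" comes free from sampling: when two slices are drawn from an unlabeled signal, their temporal distance is known exactly. A minimal sketch of the pair sampler and the distance loss (function names are illustrative):

```python
import numpy as np

def sample_slice_pair(audio, slice_len, rng):
    """Sample two slices from an unlabeled signal; the ground-truth target
    (their temporal distance) comes for free from the sampling itself."""
    max_start = len(audio) - slice_len
    i, j = rng.integers(0, max_start + 1, size=2)
    distance = abs(int(i) - int(j))
    return audio[i:i + slice_len], audio[j:j + slice_len], distance

def distance_loss(predicted, ground_truth):
    """Squared error between the model's distance estimate and the true
    offset; this is what gets backpropagated end to end."""
    return (predicted - ground_truth) ** 2
```

A model trained this way never needs human annotation, only raw audio.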
  • Patent number: 12165179
    Abstract: An analytics system receives, from a client computing device, a request to generate a presentation. The analytics system accesses one or more feedback datasets of feedback data. The feedback data comprises unstructured data available from multiple data stores. The analytics system generates, for each feedback dataset, a respective feedback text and a respective sentiment score indicating a degree of negativity associated with the respective feedback text. For a combination of a plurality of generated feedback texts, the analytics system selects a set of themes based at least on a plurality of generated sentiment scores. Each sentiment score of the plurality of generated sentiment scores is associated with one of the plurality of generated feedback texts. The analytics system generates a presentation file that indicates the set of themes. The analytics system causes the presentation file to be transmitted to the client computing device.
    Type: Grant
    Filed: March 11, 2022
    Date of Patent: December 10, 2024
    Assignee: TREDENCE INC.
    Inventors: Ankush Chopra, Aravind Chandramouli, Ashutosh Rajesh Kothiwala, Shiven Purohit, Siddharth Shukla, Shubham Pandey, Soumendra Mohanty
  • Patent number: 12154541
    Abstract: A method, computer program product, and computing system for receiving feature-based voice data associated with a first acoustic domain. One or more reverberation-based augmentations may be performed on at least a portion of the feature-based voice data, thus defining reverberation-augmented feature-based voice data.
    Type: Grant
    Filed: March 10, 2021
    Date of Patent: November 26, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dushyant Sharma, Patrick A. Naylor, James W. Fosburgh, Do Yeong Kim
  • Patent number: 12154574
    Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
    Type: Grant
    Filed: November 9, 2023
    Date of Patent: November 26, 2024
    Assignee: Google LLC
    Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
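The scoring scheme above reduces to counting how many of each model's verification results carry the performance metric and comparing the two counts. The patent leaves the metric abstract; the sketch below represents it as a hypothetical tag on each result, and assumes fewer flagged results means better capability.

```python
def model_score(results, metric):
    """Fraction of a model's verification results that include the given
    performance metric."""
    if not results:
        return 0.0
    flagged = sum(1 for r in results if metric in r.get("metrics", ()))
    return flagged / len(results)

def alternative_is_better(primary_results, alternative_results, metric="mismatch"):
    """Compare verification capability of the alternative model against the
    primary model via their metric scores (assumed: lower is better)."""
    return model_score(alternative_results, metric) < model_score(primary_results, metric)
```

This lets an alternative model be evaluated on live traffic before it replaces the primary one.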
  • Patent number: 12154544
    Abstract: A speech-processing system receives input data representing text. An encoder processes segments of the text to determine embedding data, and a decoder processes the embedding data to determine one or more categories associated with each segment. Output data is determined by selecting words based on the segments and categories.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: November 26, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Michal Czuczman, You Wang, Masaki Noguchi, Viacheslav Klimkov
  • Patent number: 12147767
    Abstract: Implementations are described herein for recommending actions based on entity or entity type. In various implementations, a partial free-form natural language input may be received from a user at an input component of a computing device. The partial free-form natural language input may identify an entity without identifying a responsive action and may be directed by the user to an automated assistant that operates at least in part on the computing device. The partial free-form natural language input may be analyzed to identify the entity. Based on the identified entity, a plurality or superset of candidate responsive actions may be identified, filtered, and/or ranked based on one or more signals. The automated assistant may then provide output that recommends one or more of the candidate responsive actions based on the ranking and/or filtering.
    Type: Grant
    Filed: May 8, 2023
    Date of Patent: November 19, 2024
    Assignee: GOOGLE LLC
    Inventors: Keun Soo Yim, Kyung Yul Lim, Umesh Patil
  • Patent number: 12142289
    Abstract: An adaptive echo cancellation system introduces an acoustic reference signal to audio content being transmitted to the speaker for playback. The acoustic reference signal is an out-of-band signal, such as an ultrasonic signal, which is typically not audible to humans. The microphone of the mobile device receives the audio content played back by the speaker as well as audio content introduced by the user (e.g., the speech of the user). The adaptive echo cancellation system detects the acoustic reference signal and determines a time delay between when the acoustic reference signal was introduced to the audio content and when the audio content including the acoustic reference signal was received by the mobile device. Echo is cancelled from the received audio content based on this determined time delay.
    Type: Grant
    Filed: February 3, 2022
    Date of Patent: November 12, 2024
    Assignee: Motorola Mobility LLC
    Inventors: Seungho Kim, Joseph C. Dwyer, Giles T. Davis
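The core of the scheme above is measuring the lag between injecting the out-of-band reference and hearing it back at the microphone; cross-correlation is the standard way to find that lag. The noise burst in the test stands in for a real ultrasonic reference, and the one-tap canceller is a deliberate simplification.

```python
import numpy as np

def estimate_delay(reference, captured, fs):
    """Estimate how long after injection the reference reappears in the
    microphone capture: the cross-correlation peak gives the lag."""
    corr = np.correlate(captured, reference, mode="full")
    lag = int(np.argmax(corr)) - (len(reference) - 1)
    return max(lag, 0) / fs

def cancel_echo(captured, far_end, delay_samples, gain=1.0):
    """Toy echo canceller: subtract the delayed far-end audio."""
    echo = np.zeros_like(captured)
    echo[delay_samples:delay_samples + len(far_end)] = \
        gain * far_end[:len(captured) - delay_samples]
    return captured - echo
```

In practice the gain and delay drift, so the estimate is refreshed continuously from the reference tone.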
  • Patent number: 12142259
    Abstract: A method of detecting live speech comprises: receiving a signal containing speech; obtaining a first component of the received signal in a first frequency band, wherein the first frequency band includes audio frequencies; and obtaining a second component of the received signal in a second frequency band higher than the first frequency band. Then, modulation of the first component of the received signal is detected; modulation of the second component of the received signal is detected; and the modulation of the first component and the modulation of the second component are compared. It may then be determined that the speech is not live speech if the modulation of the first component of the received signal differs from the modulation of the second component of the received signal.
    Type: Grant
    Filed: May 16, 2023
    Date of Patent: November 12, 2024
    Assignee: Cirrus Logic Inc.
    Inventors: John Paul Lesso, Toru Ido
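The comparison above can be sketched directly: split the signal into an audio band and a higher band, extract each band's amplitude modulation (envelope), and correlate the two. The specific band edges, the FFT-mask bandpass, and the moving-average envelope are all simplifying assumptions.

```python
import numpy as np

def band_component(x, fs, lo, hi):
    """Isolate a frequency band via FFT masking (simple zero-phase bandpass)."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    X[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(X, n=len(x))

def envelope(x, win=256):
    """Amplitude modulation estimate: rectify, then smooth."""
    kernel = np.ones(win) / win
    return np.convolve(np.abs(x), kernel, mode="same")

def modulation_similarity(signal, fs, audio_band=(100, 4000), high_band=(8000, 20000)):
    """Correlate the modulation of the audio-band and high-band components;
    live speech modulates both bands together."""
    low = envelope(band_component(signal, fs, *audio_band))
    high = envelope(band_component(signal, fs, *high_band))
    low, high = low - low.mean(), high - high.mean()
    denom = np.linalg.norm(low) * np.linalg.norm(high)
    return float(low @ high / denom) if denom > 0 else 0.0
```

A loudspeaker replaying recorded speech tends to distort or drop the high band, so mismatched modulation flags a possible spoof.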
  • Patent number: 12142292
    Abstract: An audio emitter configured to emit a sound creates a high-frequency copy of the sound to be emitted. The high-frequency copy of the sound is superimposed over the sound, resulting in a composite signal. The composite signal is emitted by the emitter. The high-frequency copy is at a frequency inaudible to humans, enabling a receiver to identify the emitter and/or the sound.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: November 12, 2024
    Assignee: International Business Machines Corporation
    Inventors: Samuel B. Hawker, Alexander John Naylor-Teece, Bhavnit Patel, Grace Jansen
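One way to sketch the scheme above: each emitter superimposes its copy on an emitter-specific near-ultrasonic carrier, and the receiver identifies the emitter by which registered carrier band carries energy. Amplitude modulation and the specific carrier frequencies are assumptions; the patent does not fix the embedding scheme.

```python
import numpy as np

def embed_copy(sound, fs, carrier_hz, gain=0.05):
    """Superimpose a high-frequency copy of `sound`: the audible sound
    amplitude-modulates an (ideally inaudible) carrier."""
    t = np.arange(len(sound)) / fs
    return sound + gain * sound * np.cos(2 * np.pi * carrier_hz * t)

def identify_emitter(signal, fs, emitter_carriers, bw=1000.0):
    """Identify the emitter by which registered carrier band holds the most
    energy above the audible range."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)

    def band_energy(c):
        return spectrum[(freqs > c - bw) & (freqs < c + bw)].sum()

    return max(emitter_carriers, key=lambda name: band_energy(emitter_carriers[name]))
```

Because the copy sits above human hearing, listeners perceive only the original sound while receivers can still attribute it.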
  • Patent number: 12136414
    Abstract: Audio signals representing a current utterance in a conversation and a dialog history including at least information associated with past utterances corresponding to the current utterance in the conversation can be received. The dialog history can be encoded into an embedding. A spoken language understanding neural network model can be trained to perform a spoken language understanding task based on input features including at least speech features associated with the received audio signals and the embedding. An encoder can also be trained to encode a given dialog history into an embedding. The spoken language understanding task can include predicting a dialog action of an utterance. The spoken language understanding task can include predicting a dialog intent or overall topic of the conversation.
    Type: Grant
    Filed: August 18, 2021
    Date of Patent: November 5, 2024
    Assignee: International Business Machines Corporation
    Inventors: Samuel Thomas, Jatin Ganhotra, Hong-Kwang Kuo, Sachindra Joshi, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury
  • Patent number: 12124798
    Abstract: A method is disclosed for calculating similarity rates between electronic documents. The similarity rate is calculated based on a count of matching phrases between the electronic documents and distances between subsequent matching phrases in each of the electronic documents. A system is also disclosed for comparing the electronic documents to obtain their similarity rates. A computing device determines at least one first proximity parameter based on the number of matched words in a matching phrase and at least one second proximity parameter based on distances between the subsequent matching phrases in each of the electronic documents. The similarity rate is determined based on the first and second proximity parameters.
    Type: Grant
    Filed: August 30, 2021
    Date of Patent: October 22, 2024
    Assignee: KYOCERA DOCUMENT SOLUTIONS INC.
    Inventor: Oleg Y. Zakharov
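The two proximity parameters above can be sketched with shared n-grams as the "matching phrases": the first parameter counts matched words, the second rewards small distances between subsequent matches. The trigram size, the equal weighting, and the gap transform are illustrative choices; the patent does not fix the combination.

```python
from collections import defaultdict

def matching_phrases(a_words, b_words, n=3):
    """All (i, j) positions where an n-gram of document A matches one of B."""
    index = defaultdict(list)
    for i in range(len(a_words) - n + 1):
        index[tuple(a_words[i:i + n])].append(i)
    return sorted((i, j)
                  for j in range(len(b_words) - n + 1)
                  for i in index.get(tuple(b_words[j:j + n]), ()))

def similarity_rate(a_text, b_text, n=3):
    a, b = a_text.lower().split(), b_text.lower().split()
    matches = matching_phrases(a, b, n)
    if not matches:
        return 0.0
    # First proximity parameter: share of A's words inside matching phrases.
    covered = {k for i, _ in matches for k in range(i, i + n)}
    word_param = len(covered) / len(a)
    # Second proximity parameter: closeness of subsequent matches in A.
    gaps = [matches[k + 1][0] - matches[k][0] for k in range(len(matches) - 1)]
    gap_param = 1.0 / (1.0 + (sum(gaps) / len(gaps) if gaps else 0.0))
    return 0.5 * word_param + 0.5 * gap_param
```

Documents sharing many densely packed phrases score high; documents with no shared phrases score zero.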
  • Patent number: 12125482
    Abstract: An example apparatus for recognizing speech includes an audio receiver to receive a stream of audio. The apparatus also includes a key phrase detector to detect a key phrase in the stream of audio. The apparatus further includes a model adapter to dynamically adapt a model based on the detected key phrase. The apparatus also includes a query recognizer to detect a voice query following the key phrase in the stream of audio via the adapted model.
    Type: Grant
    Filed: November 22, 2019
    Date of Patent: October 22, 2024
    Assignee: Intel Corporation
    Inventors: Krzysztof Czarnowski, Munir Nikolai Alexander Georges, Tobias Bocklet, Georg Stemmer
  • Patent number: 12125490
    Abstract: Techniques for a digital assistant to receive intent input from a secondary user are provided. A digital assistant receives a query intent from a primary user, wherein the primary user is authorized to provide query intents. It is determined, based on the query intent, that intent input will be provided by a secondary user, wherein the secondary user is not authorized to provide query intents. Intent input provided by the secondary user is received. The digital assistant processes the query intent using the intent input. The results of the query intent are provided to the primary user.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: October 22, 2024
    Assignee: MOTOROLA SOLUTIONS, INC.
    Inventors: Woei Chyuan Tan, Bing Qin Lim, Guo Dong Gan, Chun Meng Tan
  • Patent number: 12118323
    Abstract: An approach for generating an optimized video of a speaker, translated from a source language into a target language with the speaker's lips synchronized to the translated speech, while balancing optimization of the translation quality. A source video may be fed into a neural machine translation model, which may synthesize a plurality of potential translations. The translations may be received by a generative adversarial network, which generates video for each translation and classifies each as in sync or out of sync. A lip-syncing score may then be computed for each of the generated videos that are classified as in sync.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: October 15, 2024
    Assignee: International Business Machines Corporation
    Inventors: Sathya Santhar, Sridevi Kannan, Sarbajit K. Rakshit, Samuel Mathew Jawaharlal
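The final selection step above — filter to the translations the discriminator classified as in sync, then balance translation quality against lip-sync score — can be sketched as a ranking function. The candidate tuples, scores, and 0.6/0.4 weighting are all made-up illustrations.

```python
def best_translation(candidates):
    """Choose among synthesized translations: prefer those the adversarial
    discriminator classified as in sync, then maximize a weighted balance of
    translation quality and lip-sync score.

    Each candidate: (text, translation_score, in_sync, sync_score)."""
    pool = [c for c in candidates if c[2]] or list(candidates)
    return max(pool, key=lambda c: 0.6 * c[1] + 0.4 * c[3])
```

Note that the top raw translation can lose to a slightly weaker one whose generated video syncs better with the speaker's lips.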
  • Patent number: 12118320
    Abstract: Systems and methods for conducting communications between a user and an Artificial Intelligence (AI) character model are provided. An example method includes determining a context of a dialog between the AI character model and the user, the context being determined based on a data stream received from a client-side computing device associated with the user; receiving a message of the user in the dialog; generating, based on the context and the message, an input to a language model configured to predict a response to the message; providing the input to the language model to obtain the response; and transmitting the response to the client-side computing device, where the client-side computing device presents the response to the user. The input to the language model includes the message expanded by a keyword associated with the context. The context includes an intent of the user and an emotional state of the user.
    Type: Grant
    Filed: April 27, 2023
    Date of Patent: October 15, 2024
    Assignee: Theai, Inc.
    Inventors: Ilya Gelfenbeyn, Mikhail Ermolenko, Kylan Gibbs
  • Patent number: 12118984
    Abstract: Systems and methods are presented herein for providing a user with a notification, or access to content, based on the user's factual discourse during a conversation with other users. A first user may provide a first statement. A second user may provide a second statement. An application determines the first and the second statement are associated with first and second user profiles, respectively. The application analyzes the elements of each respective statement and determines there is a conflict between the user statements. In response to determining there is a conflict between the respective statements, the application generates a respective search query to verify each respective statement. When the application determines there is an answer that resolves the conflict between the respective statements, the application generates a notification for the users that comprises the answer that resolves the conflict and may include access to content affirming the answer.
    Type: Grant
    Filed: November 11, 2020
    Date of Patent: October 15, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose