Patents Examined by Richemond Dorvil
-
Patent number: 12198713
Abstract: A lip sync image generation device based on machine learning according to a disclosed embodiment includes an image synthesis model, which is an artificial neural network model, and which uses a person background image and an utterance audio signal as an input to generate a lip sync image, and a lip sync discrimination model, which is an artificial neural network model, and which discriminates the degree of match between the lip sync image generated by the image synthesis model and the utterance audio signal input to the image synthesis model.
Type: Grant
Filed: June 17, 2021
Date of Patent: January 14, 2025
Assignee: DEEPBRAIN AI INC.
Inventor: Gyeong Su Chae
-
Patent number: 12190905
Abstract: Embodiments described herein provide for a machine-learning architecture for modeling quality measures for enrollment signals. Modeling these enrollment signals enables the machine-learning architecture to identify deviations from expected or ideal enrollment signal in future test phase calls. These differences can be used to generate quality measures for the various audio descriptors or characteristics of audio signals. The quality measures can then be fused at the score-level with the speaker recognition's embedding comparisons for verifying the speaker. Fusing the quality measures with the similarity scoring essentially calibrates the speaker recognition's outputs based on the realities of what is actually expected for the enrolled caller and what was actually observed for the current inbound caller.
Type: Grant
Filed: August 20, 2021
Date of Patent: January 7, 2025
Assignee: Pindrop Security, Inc.
Inventors: Hrishikesh Rao, Kedar Phatak, Elie Khoury
-
Patent number: 12183321
Abstract: Processor(s) of a client device can: receive audio data that captures a spoken utterance of a user of the client device; process, using an on-device speech recognition model, the audio data to generate a predicted textual segment that is a prediction of the spoken utterance; cause at least part of the predicted textual segment to be rendered (e.g., visually and/or audibly); receive further user interface input that is a correction of the predicted textual segment to an alternate textual segment; and generate a gradient based on comparing at least part of the predicted output to ground truth output that corresponds to the alternate textual segment. The gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model and/or is transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
Type: Grant
Filed: October 5, 2023
Date of Patent: December 31, 2024
Assignee: GOOGLE LLC
Inventors: Françoise Beaufays, Johan Schalkwyk, Giovanni Motta
-
Patent number: 12170092
Abstract: The present technology relates to a signal processing device, a method, and a program that can obtain a signal with higher sound quality. The signal processing device includes: a calculation unit that calculates a parameter for generating a difference signal corresponding to an input compressed sound source signal on the basis of a prediction coefficient and the input compressed sound source signal, the prediction coefficient being obtained by learning using, as training data, a difference signal between an original sound signal and a learning compressed sound source signal obtained by compressing and coding the original sound signal; a difference signal generation unit that generates the difference signal on the basis of the parameter and the input compressed sound source signal; and a synthesis unit that synthesizes the generated difference signal and the input compressed sound source signal. The present technology can be applied to a signal processing device.
Type: Grant
Filed: February 20, 2020
Date of Patent: December 17, 2024
Assignee: Sony Group Corporation
Inventor: Takao Fukui
-
Patent number: 12165663
Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
Type: Grant
Filed: November 14, 2022
Date of Patent: December 10, 2024
Assignee: GOOGLE LLC
Inventors: Beat Gfeller, Dominik Roblek, Félix de Chaumont Quitry, Marco Tagliasacchi
-
Patent number: 12165179
Abstract: An analytics system receives, from a client computing device, a request to generate a presentation. The analytics system accesses one or more feedback datasets of feedback data. The feedback data comprises unstructured data available from multiple data stores. The analytics system generates, for each feedback dataset, a respective feedback text and a respective sentiment score indicating a degree of negativity associated with the respective feedback text. For a combination of a plurality of generated feedback texts, the analytics system selects a set of themes based at least on a plurality of generated sentiment scores. Each sentiment score of the plurality of generated sentiment scores is associated with one of the plurality of generated feedback texts. The analytics system generates a presentation file that indicates the set of themes. The analytics system causes the presentation file to be transmitted to the client computing device.
Type: Grant
Filed: March 11, 2022
Date of Patent: December 10, 2024
Assignee: TREDENCE INC.
Inventors: Ankush Chopra, Aravind Chandramouli, Ashutosh Rajesh Kothiwala, Shiven Purohit, Siddharth Shukla, Shubham Pandey, Soumendra Mohanty
-
Patent number: 12154541
Abstract: A method, computer program product, and computing system for receiving feature-based voice data associated with a first acoustic domain. One or more reverberation-based augmentations may be performed on at least a portion of the feature-based voice data, thus defining reverberation-augmented feature-based voice data.
Type: Grant
Filed: March 10, 2021
Date of Patent: November 26, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Dushyant Sharma, Patrick A. Naylor, James W. Fosburgh, Do Yeong Kim
-
Patent number: 12154574
Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
Type: Grant
Filed: November 9, 2023
Date of Patent: November 26, 2024
Assignee: Google LLC
Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
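
The counting-and-comparing idea in the abstract above can be illustrated with a minimal Python sketch. This is not the patented method: the `VerificationResult` type, the `"false_reject"` metric tag, and the rule that a lower count means a better model are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class VerificationResult:
    """One verification attempt (hypothetical structure)."""
    verified: bool                 # did the model accept the user as registered?
    metric: Optional[str] = None   # hypothetical tag, e.g. "false_reject"


def score(results: Iterable[VerificationResult], metric: str) -> int:
    """Count the results that carry the given performance metric."""
    return sum(1 for r in results if r.metric == metric)


def alternative_is_better(primary: Iterable[VerificationResult],
                          alternative: Iterable[VerificationResult],
                          metric: str = "false_reject") -> bool:
    """Assumption: the metric marks an error, so a lower count is better."""
    return score(alternative, metric) < score(primary, metric)


primary_results = [
    VerificationResult(False, "false_reject"),
    VerificationResult(True),
    VerificationResult(False, "false_reject"),
]
alternative_results = [
    VerificationResult(True),
    VerificationResult(False, "false_reject"),
    VerificationResult(True),
]
```

Here the primary model has two tagged results and the alternative has one, so under the stated assumption `alternative_is_better(primary_results, alternative_results)` evaluates to `True`.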
-
Patent number: 12154544
Abstract: A speech-processing system receives input data representing text. An encoder processes segments of the text to determine embedding data, and a decoder processes the embedding data to determine one or more categories associated with each segment. Output data is determined by selecting words based on the segments and categories.
Type: Grant
Filed: March 18, 2021
Date of Patent: November 26, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Michal Czuczman, You Wang, Masaki Noguchi, Viacheslav Klimkov
-
Patent number: 12147767
Abstract: Implementations are described herein for recommending actions based on entity or entity type. In various implementations, a partial free-form natural language input may be received from a user at an input component of a computing device. The partial free-form natural language input may identify an entity without identifying a responsive action and may be directed by the user to an automated assistant that operates at least in part on the computing device. The partial free-form natural language input may be analyzed to identify the entity. Based on the identified entity, a plurality or superset of candidate responsive actions may be identified, filtered, and/or ranked based on one or more signals. The automated assistant may then provide output that recommends one or more of the candidate responsive actions based on the ranking and/or filtering.
Type: Grant
Filed: May 8, 2023
Date of Patent: November 19, 2024
Assignee: GOOGLE LLC
Inventors: Keun Soo Yim, Kyung Yul Lim, Umesh Patil
-
Patent number: 12142289
Abstract: An adaptive echo cancellation system introduces an acoustic reference signal to audio content being transmitted to the speaker for playback. The acoustic reference signal is an out-of-band signal, such as an ultrasonic signal, which is typically not audible to humans. The microphone of the mobile device receives the audio content played back by the speaker as well as audio content introduced by the user (e.g., the speech of the user). The adaptive echo cancellation system detects the acoustic reference signal and determines a time delay between when the acoustic reference signal was introduced to the audio content and when the audio content including the acoustic reference signal was received by the mobile device. Echo is cancelled from the received audio content based on this determined time delay.
Type: Grant
Filed: February 3, 2022
Date of Patent: November 12, 2024
Assignee: Motorola Mobility LLC
Inventors: Seungho Kim, Joseph C. Dwyer, Giles T. Davis
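
One common way to estimate the kind of time delay described in the abstract above is to slide the known reference over the received samples and pick the lag with the highest cross-correlation. The sketch below assumes that technique; the abstract does not say how the delay is actually determined, and the function name `estimate_delay` and the toy signals are hypothetical.

```python
from typing import Sequence


def estimate_delay(reference: Sequence[float], received: Sequence[float]) -> int:
    """Return the lag, in samples, at which `reference` best aligns with `received`.

    Uses a brute-force cross-correlation; this is an assumed technique,
    not one taken from the abstract.
    """
    if len(received) < len(reference):
        raise ValueError("received signal is shorter than the reference")

    best_lag, best_score = 0, float("-inf")
    for lag in range(len(received) - len(reference) + 1):
        # Dot product of the reference with the window starting at `lag`.
        s = sum(r * received[lag + i] for i, r in enumerate(reference))
        if s > best_score:
            best_lag, best_score = lag, s
    return best_lag


# Toy data: the reference pattern appears 7 samples into the received signal.
reference_signal = [1.0, -1.0, 1.0, 1.0, -1.0]
received_signal = [0.0] * 7 + reference_signal + [0.0] * 4
delay_in_samples = estimate_delay(reference_signal, received_signal)
```

Dividing the returned lag by the sampling rate gives the delay in seconds. A real system would also have to isolate the out-of-band reference from speech and playback content before correlating, which this sketch leaves out.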
-
Patent number: 12142259
Abstract: A method of detecting live speech comprises: receiving a signal containing speech; obtaining a first component of the received signal in a first frequency band, wherein the first frequency band includes audio frequencies; and obtaining a second component of the received signal in a second frequency band higher than the first frequency band. Then, modulation of the first component of the received signal is detected; modulation of the second component of the received signal is detected; and the modulation of the first component of the received signal and the modulation of the second component of the received signal are compared. It may then be determined that the speech may not be live speech, if the modulation of the first component of the received signal differs from the modulation of the second component of the received signal.
Type: Grant
Filed: May 16, 2023
Date of Patent: November 12, 2024
Assignee: Cirrus Logic Inc.
Inventors: John Paul Lesso, Toru Ido
-
Patent number: 12142292
Abstract: An audio emitter configured to emit a sound creates a high-frequency copy of the sound to be emitted. The high-frequency copy of the sound is superimposed over the sound, resulting in a composite signal. The composite signal is emitted by the emitter. The high-frequency copy is at a frequency inaudible to humans, enabling a receiver to identify the emitter and/or the sound.
Type: Grant
Filed: March 23, 2021
Date of Patent: November 12, 2024
Assignee: International Business Machines Corporation
Inventors: Samuel B. Hawker, Alexander John Naylor-Teece, Bhavnit Patel, Grace Jansen
-
Patent number: 12136414
Abstract: Audio signals representing a current utterance in a conversation and a dialog history including at least information associated with past utterances corresponding to the current utterance in the conversation can be received. The dialog history can be encoded into an embedding. A spoken language understanding neural network model can be trained to perform a spoken language understanding task based on input features including at least speech features associated with the received audio signals and the embedding. An encoder can also be trained to encode a given dialog history into an embedding. The spoken language understanding task can include predicting a dialog action of an utterance. The spoken language understanding task can include predicting a dialog intent or overall topic of the conversation.
Type: Grant
Filed: August 18, 2021
Date of Patent: November 5, 2024
Assignee: International Business Machines Corporation
Inventors: Samuel Thomas, Jatin Ganhotra, Hong-Kwang Kuo, Sachindra Joshi, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury
-
Patent number: 12124798
Abstract: A method is disclosed for calculating similarity rates between electronic documents. The similarity rate is calculated based on a count of matching phrases between the electronic documents and distances between subsequent matching phrases in each of the electronic documents. A system is also disclosed for comparing the electronic documents to obtain their similarity rates. A computing device determines at least one first proximity parameter based on the number of matched words in a matching phrase and at least one second proximity parameter based on distances between the subsequent matching phrases in each of the electronic documents. The similarity rate is determined based on the first and second proximity parameters.
Type: Grant
Filed: August 30, 2021
Date of Patent: October 22, 2024
Assignee: KYOCERA DOCUMENT SOLUTIONS INC.
Inventor: Oleg Y. Zakharov
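
The two ingredients named in the abstract above, matched-phrase lengths and the distances between subsequent matching phrases, can be extracted with Python's standard `difflib`. This is only a sketch: the abstract gives no formula for combining them into a similarity rate, so none is invented here, and the helper names, whitespace tokenization, and `min_words` threshold are assumptions.

```python
from difflib import Match, SequenceMatcher
from typing import List, Tuple


def matching_phrases(doc_a: str, doc_b: str, min_words: int = 2) -> List[Match]:
    """Return runs of at least `min_words` consecutive words shared by both documents.

    Each Match has .a and .b (start word index in each document) and .size.
    """
    words_a, words_b = doc_a.lower().split(), doc_b.lower().split()
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    # get_matching_blocks() ends with a zero-size sentinel, which the filter drops.
    return [m for m in matcher.get_matching_blocks() if m.size >= min_words]


def phrase_gaps(phrases: List[Match]) -> List[Tuple[int, int]]:
    """For each pair of subsequent phrases, count the words between them in each document."""
    return [
        (cur.a - (prev.a + prev.size), cur.b - (prev.b + prev.size))
        for prev, cur in zip(phrases, phrases[1:])
    ]


text_a = "the quick brown fox jumps over the lazy dog"
text_b = "a quick brown fox sleeps near the lazy dog"
phrases = matching_phrases(text_a, text_b)
gaps = phrase_gaps(phrases)
```

For these two sentences the shared phrases are "quick brown fox" and "the lazy dog", each three words long, separated by two words in both documents. The phrase sizes would feed a first proximity parameter and the gaps a second; how they are weighted is left open.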
-
Patent number: 12125482
Abstract: An example apparatus for recognizing speech includes an audio receiver to receive a stream of audio. The apparatus also includes a key phrase detector to detect a key phrase in the stream of audio. The apparatus further includes a model adapter to dynamically adapt a model based on the detected key phrase. The apparatus also includes a query recognizer to detect a voice query following the key phrase in a stream of audio via the adapted model.
Type: Grant
Filed: November 22, 2019
Date of Patent: October 22, 2024
Assignee: Intel Corporation
Inventors: Krzysztof Czarnowski, Munir Nikolai Alexander Georges, Tobias Bocklet, Georg Stemmer
-
Patent number: 12125490
Abstract: Techniques for a digital assistant to receive intent input from a secondary user are provided. A digital assistant receives a query intent from a primary user, wherein the primary user is authorized to provide query intents. It is determined, based on the query intent, that intent input will be provided by a secondary user, wherein the secondary user is not authorized to provide query intents. Intent input provided by the secondary user is received. The digital assistant processes the query intent using the intent input. The results of the query intent are provided to the primary user.
Type: Grant
Filed: June 18, 2020
Date of Patent: October 22, 2024
Assignee: MOTOROLA SOLUTIONS, INC.
Inventors: Woei Chyuan Tan, Bing Qin Lim, Guo Dong Gan, Chun Meng Tan
-
Patent number: 12118323
Abstract: An approach for generating an optimized video of a speaker, translated from a source language into a target language with the speaker's lips synchronized to the translated speech, while balancing optimization of the translation into a target language. A source video may be fed into a neural machine translation model. The model may synthesize a plurality of potential translations. The translations may be received by a generative adversarial network which generates video for each translation and classifies the translations as in-sync or out of sync. A lip-syncing score may be determined for each of the generated videos that are classified as in-sync.
Type: Grant
Filed: September 23, 2021
Date of Patent: October 15, 2024
Assignee: International Business Machines Corporation
Inventors: Sathya Santhar, Sridevi Kannan, Sarbajit K. Rakshit, Samuel Mathew Jawaharlal
-
Patent number: 12118320
Abstract: Systems and methods for conducting communications between a user and an Artificial Intelligence (AI) character model are provided. An example method includes determining a context of a dialog between the AI character model and the user, the context being determined based on a data stream received from a client-side computing device associated with the user; receiving a message of the user in the dialog; and generating, based on the context and the message, an input to a language model configured to predict a response to the message; providing the input to the language model to obtain the response; and transmitting the response to the client-side computing device, where the client-side computing device presents the response to the user. The input to the language model includes the message expanded by a keyword associated with the context. The context includes an intent of the user and an emotional state of the user.
Type: Grant
Filed: April 27, 2023
Date of Patent: October 15, 2024
Assignee: Theai, Inc.
Inventors: Ilya Gelfenbeyn, Mikhail Ermolenko, Kylan Gibbs
-
Patent number: 12118984
Abstract: Systems and methods are presented herein for providing a user with a notification, or access to content, based on the user's factual discourse during a conversation with other users. A first user may provide a first statement. A second user may provide a second statement. An application determines the first and the second statement are associated with first and second user profiles, respectively. The application analyzes the elements of each respective statement and determines there is a conflict between the user statements. In response to determining there is a conflict between the respective statements, the application generates a respective search query to verify each respective statement. When the application determines there is an answer that resolves the conflict between the respective statements, the application generates a notification for the users that comprises the answer that resolves the conflict and may include access to content affirming the answer.
Type: Grant
Filed: November 11, 2020
Date of Patent: October 15, 2024
Assignee: Rovi Guides, Inc.
Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose