Use Of Phonemic Categorization Or Speech Recognition Prior To Speaker Recognition Or Verification (EPO) Patents (Class 704/E17.011)
  • Patent number: 12148433
    Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
    Type: Grant
    Filed: October 11, 2023
    Date of Patent: November 19, 2024
    Assignee: Google LLC
    Inventors: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno
  • Patent number: 11978435
    Abstract: This invention relates generally to speech processing and more particularly to end-to-end automatic speech recognition (ASR) that utilizes long contextual information. Some embodiments of the invention provide a system and a method for end-to-end ASR suitable for recognizing long audio recordings such as lectures and conversational speech. This disclosure includes a Transformer-based ASR system that utilizes contextual information, wherein the Transformer accepts multiple utterances at the same time and predicts the transcript for the last utterance. This is repeated in a sliding-window fashion with one-utterance shifts to recognize the entire recording. In addition, some embodiments of the present invention may use acoustic and/or text features obtained from only the previous utterances spoken by the same speaker as the last utterance when the long audio recording includes multiple speakers.
    Type: Grant
    Filed: October 13, 2020
    Date of Patent: May 7, 2024
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux
  • Patent number: 11783839
    Abstract: Embodiments described herein provide for a voice biometrics system that executes machine-learning architectures capable of passive, active, continuous, or static operations, or a combination thereof. The system enrolls speakers passively and/or continuously, in some cases in addition to actively and/or statically. The system may dynamically generate and update profiles corresponding to end-users who contact a call center. The system may determine a level of enrollment for the enrollee profiles that limits the types of functions that the user may access. The system may update the profiles as new contact events are received or based on certain temporal triggering conditions.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: October 10, 2023
    Assignee: PINDROP SECURITY, INC.
    Inventors: Payas Gupta, Terry Nelms, II
  • Patent number: 11721324
    Abstract: A computer-implemented method, system and computer program product for providing high quality speech recognition. A first speech-to-text model is selected to perform speech recognition of a customer's spoken words and a second speech-to-text model is selected to perform speech recognition of the agent's spoken words during a call. The combined results of the speech-to-text models used to process the customer's and agent's spoken words are then analyzed to generate a reference speech-to-text result. The customer speech data that was processed by the first speech-to-text model is reprocessed by multiple other speech-to-text models. A similarity analysis is performed on the results of these speech-to-text models with respect to the reference speech-to-text result resulting in similarity scores being assigned to these speech-to-text models.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: August 8, 2023
    Assignee: International Business Machines Corporation
    Inventors: Yuan Jin, Xi Xi Liu, Li ping Wang, Fan Xiao Xin, Zheng Ping Chu
  • Patent number: 11551689
    Abstract: Embodiments of the present invention provide a computer system, a computer program product, and a method that comprises analyzing a received voice command by identifying a plurality of contextual factors associated with at least one user in a plurality of users using a natural language processing algorithm; dynamically identifying the at least one user in the plurality of users based on an analysis of the identified contextual factors associated with the received voice command; transmitting the received voice command to another computing device within a plurality of computing devices associated with another user in the plurality of users; and generating a line of communication between the plurality of computing devices based on a correlation between a summation of a plurality of security factors and a predetermined threshold of risk associated with authenticating an identity of each user within the plurality of users.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: January 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shailendra Moyal, Sarbajit K. Rakshit
  • Patent number: 8694322
    Abstract: A voice-activated user interface for an application is described. The voice-activated user interface invokes a speech recognition component to recognize audio input from a user. If the audio input is a command, then a validation component is invoked to determine whether to validate the command prior to execution.
    Type: Grant
    Filed: October 21, 2005
    Date of Patent: April 8, 2014
    Assignee: Microsoft Corporation
    Inventors: Alex G. Snitkovskiy, David Mowatt, Felix G. T. I. Andrew, Robert Edward Dewar, Oliver Scholz
  • Publication number: 20100131272
    Abstract: Apparatuses and methods for generating and verifying a voice signature of a message and computer readable medium thereof are provided. The generation and verification ends both use the same set of pronounceable symbols. The set of pronounceable symbols comprises a plurality of pronounceable units, and each of the pronounceable units comprises an index and a pronounceable symbol. The generation end converts the message into a message digest by a hash function and generates a plurality of designated pronounceable symbols according to the message digest. A user utters the designated pronounceable symbols to generate the voice signature. After receiving the message and the voice signature, the verification end performs voice authentication to determine a user identity of the voice signature, performs speech recognition to determine the relation between the message and the voice signature, and determines whether the user generates the voice signature for the message.
    Type: Application
    Filed: January 6, 2009
    Publication date: May 27, 2010
    Applicant: INSTITUTE FOR INFORMATION INDUSTRY
    Inventor: Jui-Ming WU
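
The digest-to-symbol step described in the abstract of publication 20100131272 can be illustrated with a minimal sketch. It is not taken from the application itself: the 16-entry symbol table, the use of SHA-256 as the hash function, the four-bit indexing scheme, and the function names `designated_symbols` and `content_matches` are all assumptions made for illustration. Voice authentication of the speaker is outside the scope of the sketch; only the link between a message and the symbols to be uttered is shown.

```python
import hashlib

# Hypothetical set of pronounceable units shared by both ends.
# The list position is the index; the string is the pronounceable symbol.
SYMBOLS = ["ba", "de", "fi", "go", "ku", "la", "me", "no",
           "pa", "ri", "so", "tu", "va", "we", "yo", "zu"]


def designated_symbols(message: str, count: int = 8) -> list[str]:
    """Map a message to the symbols the user is asked to utter.

    Assumption: SHA-256 is the hash function, and each 4-bit half of a
    digest byte selects one of the 16 symbols. The abstract only says
    "a hash function" and does not specify the mapping.
    """
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    symbols = []
    for byte in digest:
        symbols.append(SYMBOLS[byte >> 4])    # high four bits
        symbols.append(SYMBOLS[byte & 0x0F])  # low four bits
    return symbols[:count]


def content_matches(message: str, recognized: list[str], count: int = 8) -> bool:
    """Verification-end check: do the symbols produced by speech
    recognition match the symbols derived from the received message?"""
    return designated_symbols(message, count) == list(recognized)


if __name__ == "__main__":
    msg = "transfer 100 to account 42"
    to_utter = designated_symbols(msg)
    print("Symbols to utter:", " ".join(to_utter))
    print("Matches message:", content_matches(msg, to_utter))
```

In this sketch both ends derive the same symbol sequence from the message alone, so a verification end only has to compare the recognized symbols against its own derivation. A real system would pair this check with speaker authentication on the same audio, as the abstract describes.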