Specialized Models Patents (Class 704/250)
  • Patent number: 11948066
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing sequences using convolutional neural networks. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
    Type: Grant
    Filed: July 14, 2021
    Date of Patent: April 2, 2024
    Assignee: DeepMind Technologies Limited
    Inventors: Aaron Gerard Antonius van den Oord, Sander Etienne Lea Dieleman, Nal Emmerich Kalchbrenner, Karen Simonyan, Oriol Vinyals, Lasse Espeholt
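The entry above (and its continuation, 11080591, later in this list) describes an autoregressive convolutional model that, at each time step, consumes the samples generated so far and outputs a score distribution over possible next audio samples. A minimal sketch of that idea follows; the layer count, channel widths, and 256-way sample quantization are illustrative assumptions, not the patented architecture.

```python
# Illustrative sketch of an autoregressive convolutional audio model of the
# kind described above (NOT the patented architecture; sizes are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConvNet(nn.Module):
    def __init__(self, n_classes=256, channels=32, layers=6):
        super().__init__()
        self.convs = nn.ModuleList()
        in_ch = n_classes
        for i in range(layers):
            dilation = 2 ** i                      # exponentially growing receptive field
            self.convs.append(nn.Conv1d(in_ch, channels, kernel_size=2,
                                        dilation=dilation))
            in_ch = channels
        self.out = nn.Conv1d(channels, n_classes, kernel_size=1)   # "output layer"

    def forward(self, x_onehot):
        # x_onehot: (batch, n_classes, time) -- the current sequence of samples
        h = x_onehot
        for conv in self.convs:
            pad = (conv.kernel_size[0] - 1) * conv.dilation[0]
            h = F.relu(conv(F.pad(h, (pad, 0))))   # left-pad => causal convolution
        return self.out(h)                         # scores over possible audio samples

# One generation step: feed the samples so far, take the score distribution at
# the last time step, and sample the next audio value from it.
model = CausalConvNet()
seq = torch.zeros(1, 256, 1)
seq[0, 128, 0] = 1.0                               # start from a "silence" sample
logits = model(seq)[:, :, -1]                      # distribution for the next sample
next_sample = torch.distributions.Categorical(logits=logits).sample()
```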
  • Patent number: 11942078
    Abstract: A computer-implemented method is provided for improving the recognition accuracy of digital speech. The method includes receiving the digital speech. The method further includes splitting the digital speech into overlapping chunks. The method also includes computing a bidirectional encoder embedding of each of the overlapping chunks to obtain bidirectional encoder embeddings. The method additionally includes combining the bidirectional encoder embeddings. The method further includes interpreting, by a speech recognition system, the digital speech using the combined bidirectional encoder embeddings.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: March 26, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: George Andrei Saon
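The chunk-and-combine procedure in the entry above can be sketched roughly as follows; the chunk length, hop size, frame-averaging rule, and the stand-in `encode` function are assumptions made purely for illustration.

```python
# Sketch of splitting speech into overlapping chunks, embedding each chunk with
# a (stand-in) bidirectional encoder, and combining the overlapping embeddings.
import numpy as np

def encode(chunk):
    """Stand-in for a bidirectional encoder: one embedding vector per frame."""
    return np.tile(chunk[:, None], (1, 4))          # (frames, dim=4), dummy values

def embed_with_overlap(speech, chunk_len=100, hop=50):
    n, dim = len(speech), 4
    summed = np.zeros((n, dim))
    counts = np.zeros((n, 1))
    for start in range(0, n, hop):                  # overlapping chunks
        end = min(start + chunk_len, n)
        emb = encode(speech[start:end])             # bidirectional embedding of the chunk
        summed[start:end] += emb
        counts[start:end] += 1
        if end == n:
            break
    return summed / counts                          # combine: average overlapping frames

speech = np.random.randn(230)                       # toy "digital speech" frames
combined = embed_with_overlap(speech)
print(combined.shape)                               # (230, 4): one combined embedding per frame
```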
  • Patent number: 11900926
    Abstract: Examples of the present disclosure describe systems and methods for dynamically expanding acronyms in audio content. In aspects, a user access of an audio resource may be detected. The audio content of the audio resource may be evaluated to identify acronyms. One or more of the identified acronyms may be evaluated based on a user-specific context of the user and/or a global context associated with the user. Based on the evaluated context(s), expansion candidates and corresponding confidence scores may be determined for each identified acronym. Based on the confidence scores, an expansion candidate may be selected and used to replace the identified acronym when the audio content is consumed by the user.
    Type: Grant
    Filed: October 5, 2020
    Date of Patent: February 13, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Amelia Bateman
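As a rough illustration of scoring acronym expansion candidates against a user-specific context and a global context, the sketch below blends candidate frequencies from two hypothetical context tables; the tables, the weighting, and all names are invented for the example.

```python
# Illustrative scoring of acronym expansion candidates from a user-specific and
# a global context, as in the entry above; the weighting scheme is an assumption.
USER_CONTEXT = {"ML": {"machine learning": 12, "maximum likelihood": 3}}
GLOBAL_CONTEXT = {"ML": {"machine learning": 900, "maximum likelihood": 450, "milliliter": 2000}}

def expansion_candidates(acronym, user_weight=0.7):
    user = USER_CONTEXT.get(acronym, {})
    glob = GLOBAL_CONTEXT.get(acronym, {})
    scores = {}
    for cand in set(user) | set(glob):
        u = user.get(cand, 0) / (sum(user.values()) or 1)    # user-specific evidence
        g = glob.get(cand, 0) / (sum(glob.values()) or 1)    # global evidence
        scores[cand] = user_weight * u + (1 - user_weight) * g
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

best, confidence = expansion_candidates("ML")[0]
print(best, round(confidence, 3))   # expansion used to replace "ML" in the audio content
```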
  • Patent number: 11868725
    Abstract: Provided are a server, a client device, and operation methods thereof for training a language model. The server, the client device, and the operation methods thereof identify a word or phrase including a named entity that is incorrectly pronounced by a user or is difficult for the user to accurately pronounce from an input text for use in training a natural language understanding (NLU) model, generate text candidates for use in training the NLU model by replacing the identified word or phrase with a word or phrase predicted to be uttered by the user and having high phonetic similarity to the identified word or phrase, and train the NLU model by using the generated text candidates.
    Type: Grant
    Filed: January 4, 2021
    Date of Patent: January 9, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hejung Yang, Kwangyoun Kim, Sungsoo Kim
  • Patent number: 11841932
    Abstract: A system and method may adjust the threshold or other settings in a biometric comparison system, or provide a report or display of parameters. Over a series of comparisons of authentication biometric samples (e.g. authentication VPs) to enrollment biometric samples (e.g. enrollment VPs), the authentication samples may be stored, and scores resulting from the biometric comparisons may be stored in a first set of scores. A second set of biometric comparisons may be created, each using a pairing of a stored authentication biometric sample and an enrollment biometric sample, each biometric comparison resulting in a score, the scores forming a second set of scores. The first and second sets of scores may be combined to produce a third set of scores, and an iterative process may be performed over the third set of scores to update the parameters of the Gaussian distribution of the third set of scores.
    Type: Grant
    Filed: April 14, 2020
    Date of Patent: December 12, 2023
    Assignee: Nice Ltd.
    Inventors: Roman Frenkel, Matan Keret, Amit Sharon
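One way to read the score-combination and iterative Gaussian-parameter update described above is sketched below; the outlier-trimmed re-estimation loop is an assumption, since the abstract does not specify the exact iterative procedure.

```python
# Sketch of combining two score sets and iteratively re-estimating Gaussian
# parameters over the combined ("third") set; the trimming rule is assumed.
import numpy as np

def update_gaussian(first_scores, second_scores, n_iter=5, trim_sigma=3.0):
    scores = np.concatenate([first_scores, second_scores])       # third set of scores
    mu, sigma = scores.mean(), scores.std()
    for _ in range(n_iter):
        kept = scores[np.abs(scores - mu) <= trim_sigma * sigma]  # drop far outliers
        mu, sigma = kept.mean(), kept.std()
    return mu, sigma            # parameters used to adjust thresholds / build a report

rng = np.random.default_rng(0)
first = rng.normal(0.2, 1.0, 500)     # scores from live authentication comparisons
second = rng.normal(0.1, 1.1, 500)    # scores from re-paired stored samples
print(update_gaussian(first, second))
```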
  • Patent number: 11837249
    Abstract: Systems, methods and non-transitory computer readable media for processing audio and visually presenting information are provided. Audio data captured by one or more audio sensors included in a wearable apparatus from an environment of a wearer of the wearable apparatus may be obtained. The audio data may be analyzed to obtain textual information. The audio data may be analyzed to associate different portions of the textual information with different speakers. A head mounted display system may be used to present each portion of the textual information in a presentation region associated with the speaker associated with the portion of the textual information.
    Type: Grant
    Filed: August 7, 2021
    Date of Patent: December 5, 2023
    Inventor: Ron Zass
  • Patent number: 11837236
    Abstract: This speech processing device is provided with: a contribution degree estimation means which calculates a contribution degree representing a quality of a segment of the speech signal; and a speaker feature calculation means which calculates a feature from the speech signal, for recognizing attribute information of the speech signal, using the contribution degree as a weight of the segment of the speech signal.
    Type: Grant
    Filed: December 8, 2021
    Date of Patent: December 5, 2023
    Assignee: NEC CORPORATION
    Inventors: Hitoshi Yamamoto, Takafumi Koshinaka
  • Patent number: 11823659
    Abstract: A request including audio data is received from a voice-enabled device. A string of phonemes present in the utterance is determined through speech recognition. At a later time, a subsequent user input corresponding to the request may be received, in which the user input is associated with one or more text keywords. The subsequent user input may be obtained in response to an active request. Alternatively, feedback may not be actively elicited, but rather collected passively. However it is obtained, the one or more keywords associated with the subsequent user input may be associated with the string of phonemes to indicate that the user is saying or means those words when they produce that string of phonemes. A user-specific speech recognition key for the user account is then updated to associate the string of phonemes with these words. A general speech recognition model can also be trained using the association.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: November 21, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Julia Reinspach, Oleg Rokhlenko, Ramakanthachary Gottumukkala, Giovanni Clemente, Ankit Agrawal, Swayam Bhardwaj, Guy Michaeli, Vaidyanathan Puthucode Krishnamoorthy, Costantino Vlachos, Nalledath P. Vinodkrishnan, Shaun M. Vickers, Sethuraman Ramachandran, Charles C. Moore
  • Patent number: 11797942
    Abstract: Aspects of this disclosure relate to apparatus and method for applicant scoring. Apparatus includes at least a processor and a memory communicatively connected to the at least a processor. Processor is configured to receive a user-associated data set to be stored on a resource data storage system, parse the user-associated data set into a keyword record, generate a user metric, based on a plurality of weighted values, as a function of the keyword record, receive identification of a posting of a plurality of postings, and match the posting of the plurality of postings to a user as a function of the user metric.
    Type: Grant
    Filed: March 9, 2022
    Date of Patent: October 24, 2023
    Assignee: MY JOB MATCHER, INC.
    Inventor: Arran Stewart
  • Patent number: 11776540
    Abstract: A system configured to enable remote control to allow a first user to provide assistance to a second user. The system may receive a command from the second user granting remote control to the first user, enabling the first user to initiate a voice command on behalf of the second user. In some examples, the system may enable the remote control by treating a voice command originating from the first user as though it originated from the second user instead. For example, the system may receive the voice command from a first device associated with the first user but may route the voice command as though it was received by a second device associated with the second user.
    Type: Grant
    Filed: January 6, 2020
    Date of Patent: October 3, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Peng Wang, Pathivada Rajsekhar Naidu
  • Patent number: 11756534
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
    Type: Grant
    Filed: January 26, 2022
    Date of Patent: September 12, 2023
    Assignee: Google LLC
    Inventors: Bo Li, Ron Weiss, Michiel A. U. Bacchiani, Tara N. Sainath, Kevin William Wilson
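A bare-bones filter-and-sum sketch of the two-channel adaptive beamforming step in the entry above follows; the tap-prediction network is replaced by a random stand-in, and the filter length is an assumption.

```python
# Sketch of the filter-and-sum step described above: per-channel FIR filter
# taps are predicted from both channels, applied, and summed into one channel.
# The tap-prediction network is a stand-in; filter length is an assumption.
import numpy as np

def predict_filter_taps(ch1, ch2, n_taps=16, seed=0):
    """Stand-in for the neural network that emits filter parameters per channel."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=n_taps), rng.normal(size=n_taps)

def adaptive_beamform(ch1, ch2):
    taps1, taps2 = predict_filter_taps(ch1, ch2)
    filtered1 = np.convolve(ch1, taps1, mode="same")   # first filter on channel 1
    filtered2 = np.convolve(ch2, taps2, mode="same")   # second filter on channel 2
    return filtered1 + filtered2                       # single combined channel

ch1 = np.random.randn(16000)    # 1 s of audio from microphone 1
ch2 = np.random.randn(16000)    # 1 s of audio from microphone 2
combined = adaptive_beamform(ch1, ch2)                 # fed to the recognition network
print(combined.shape)
```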
  • Patent number: 11727942
    Abstract: Systems and methods may generate, by a computer, a voice model for an enrollee based upon a set of one or more features extracted from a first audio sample received at a first time; receive at a second time a second audio sample associated with a caller; generate a likelihood score for the second audio sample by applying the voice model associated with the enrollee on the set of features extracted from the second audio sample associated with the caller, the likelihood score indicating a likelihood that the caller is the enrollee; calibrate the likelihood score based upon a time interval from the first time to the second time and at least one of: an enrollee age at the first time and an enrollee gender; and authenticate the caller as the enrollee upon the computer determining that the likelihood score satisfies a predetermined threshold score.
    Type: Grant
    Filed: May 17, 2022
    Date of Patent: August 15, 2023
    Assignee: PINDROP SECURITY, INC.
    Inventors: Elie Khoury, Matthew Garland
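The calibration step in the entry above can be illustrated with a simple compensation function over the enrollment-to-call interval, enrollee age, and gender; the compensation terms and threshold below are assumptions, not values from the patent.

```python
# Illustrative calibration of a voice-match likelihood score using the time
# since enrollment and the enrollee's age and gender; coefficients are assumed.
def calibrate_score(raw_score, years_since_enrollment, enrollee_age=None,
                    enrollee_gender=None):
    adjusted = raw_score
    adjusted += 0.02 * years_since_enrollment           # compensate for expected voice drift
    if enrollee_age is not None and enrollee_age < 18:
        adjusted += 0.01 * years_since_enrollment       # younger voices change faster
    if enrollee_gender == "male":
        adjusted += 0.005 * years_since_enrollment      # gender-specific aging term
    return adjusted

THRESHOLD = 0.75
score = calibrate_score(0.70, years_since_enrollment=4, enrollee_age=16,
                        enrollee_gender="male")
print("authenticated" if score >= THRESHOLD else "rejected", round(score, 3))
```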
  • Patent number: 11688300
    Abstract: There is provided herein a method for assessing a speech/lingual quality of a subject, the method comprising: providing a content-containing stimulus to a user; recording the user's vocal response to the stimulus and/or to instructions related thereto; processing the user's recorded vocal response to measure/extract/compute at least one linguistic (prosodic) parameter and at least one acoustic parameter; transforming the user's vocal response into a transformed text section, which is based on a processing unit's interpretation of the user's verbal response; comparing the transformed text section to a predetermined text section, which represents the user's expected response; and computing an output signal indicative of at least one speech/lingual quality of the user, based at least on data resulting from the text comparison, the at least one measured/extracted/computed linguistic parameter and the at least one acoustic parameter.
    Type: Grant
    Filed: April 17, 2019
    Date of Patent: June 27, 2023
    Assignee: AMPLIO LEARNING TECHNOLOGIES LTD.
    Inventors: Yoav Medan, Yair Shapira, Liron Mick
  • Patent number: 11676604
    Abstract: Systems, methods, and computer-readable storage media for generating a quantifiable communication profile of a participant, then using that communication profile to identify fake media and/or verify media as authentic. An exemplary system can receive communications in which a participant was one of the parties communicating, perform natural language processing and statistical analyses on the received communications, and generate a participant communication profile which details how the participant communicates and prefers to be communicated with. The system can also receive media purported to be from the participant, perform natural language processing on the purported media, compare the syntax of the purported media to the participant's communication profile, and identify, based on word usage, the media as a fake.
    Type: Grant
    Filed: June 8, 2020
    Date of Patent: June 13, 2023
    Assignee: Capital One Services, LLC
    Inventors: Christopher Camenares, Joseph Boayue, Lee Adcock, Ana Cruz, Nahid Farhady Ghalaty
  • Patent number: 11670291
    Abstract: Systems, methods, and storage media for providing an interface for textual editing through speech are disclosed. One aspect of the disclosure may include storing a narrative textual message and presenting the narrative textual message via a client computing platform. Speech uttered by a user may be received. Speech recognition may be performed on the speech to determine words spoken by the user. The words determined from the speech may be analyzed to detect whether an editing command was spoken. The editing command may include a parameter and an edited value for the parameter. The editing command may be processed to insert the edited value for the parameter or replace an initial value of the parameter with the edited value. Information may be transmitted to the client computing platform that causes the narrative textual message to be visually presented including edits based on the editing command.
    Type: Grant
    Filed: February 22, 2019
    Date of Patent: June 6, 2023
    Assignee: Suki AI, Inc.
    Inventors: Punit Soni, Karthik Rajan, Erin Palm, Matthew Pallakoff, Sanket Agarwal
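The editing-command flow in the entry above can be sketched by scanning the recognized words for a command of the form "set <parameter> to <value>" and applying it to a stored narrative; the command grammar and the field-based narrative below are illustrative assumptions.

```python
# Sketch of the speech-editing flow above: detect an editing command in the
# recognized words and insert or replace the parameter's value in the narrative.
import re

narrative = {"blood pressure": "120/80", "heart rate": "72 bpm"}   # stored narrative fields

def process_recognized_speech(words, narrative):
    spoken = " ".join(words).lower()
    match = re.search(r"set (?P<param>.+?) to (?P<value>.+)", spoken)
    if match:                                   # an editing command was detected
        param = match.group("param").strip()
        value = match.group("value").strip()
        narrative[param] = value                # insert or replace the value
    return narrative

updated = process_recognized_speech(["Set", "blood", "pressure", "to", "130/85"],
                                    narrative)
print(updated)   # rendered and sent back to the client for visual presentation
```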
  • Patent number: 11664032
    Abstract: Multi-agent input coordination can be used for acoustic collaboration of multiple listening agents deployed in smart devices on a premises, improving the accuracy of identifying requests and specifying where that request should be honored, improving quality of detection, and providing better understanding of user commands and user intent throughout the premises. A processor or processors such as those in a smart speaker can identify audio requests received through at least two agents in a network and determine at which of the agents to actively process a selected audio request. The identification can make use of techniques such as location context and secondary trait analysis. The audio request can include simultaneous audio requests received through at least two agents, differing audio requests received from different requesters, or both.
    Type: Grant
    Filed: October 12, 2021
    Date of Patent: May 30, 2023
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: James Pratt, Timothy Innes, Eric Zavesky, Nigel Bradley
  • Patent number: 11636860
    Abstract: Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.
    Type: Grant
    Filed: July 21, 2020
    Date of Patent: April 25, 2023
    Assignee: Verint Systems Ltd.
    Inventors: Alex Gorodetski, Oana Sidi, Ron Wein, Ido Shapira
  • Patent number: 11615239
    Abstract: The present disclosure relates to systems for identifying instances of natural language input, determining intent classifications associated with instances of natural language input, and generating responses based on the determined intent classifications. In particular, the disclosed systems intelligently identify and group instances of natural language input based on characteristics of the user input. Additionally, the disclosed systems determine intent classifications for the instances of natural language input based on message queuing in order to delay responses to the user input in ways that increase accuracy of the responses, while retaining a conversational aspect of the ongoing chat. Moreover, in one or more embodiments, the disclosed systems generate responses utilizing natural language.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: March 28, 2023
    Assignee: Adobe Inc.
    Inventors: Oliver Brdiczka, Ajinkya Kale, Piyush Chandra, Tracy King, Abhishek Gupta, Sourabh Goel, Nitin Garg, Deepika Naryani, Feroz Ahmad, Vikas Sagar
  • Patent number: 11580989
    Abstract: A training method of training a speaker identification model which receives voice data as an input and outputs speaker identification information for identifying a speaker of an utterance included in the voice data is provided. The training method includes: performing voice quality conversion of first voice data of a first speaker to generate second voice data of a second speaker; and performing training of the speaker identification model using, as training data, the first voice data and the second voice data.
    Type: Grant
    Filed: August 18, 2020
    Date of Patent: February 14, 2023
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Misaki Doi, Takahiro Kamai, Kousuke Itakura
  • Patent number: 11582237
    Abstract: Systems and methods for privacy-protecting hybrid cloud and premise stream processing are disclosed. In one embodiment, in an information processing device comprising at least one computer processor, a method for processing a voice communication including restricted content may include: (1) receiving from an electronic device, a customer communication; (2) identifying restricted content in the customer communication; (3) masking or marking the restricted content in the customer communication; (4) communicating the customer communication with the masked or marked restricted content to a cloud processor; (5) receiving a processed responsive communication comprising the masked or marked restricted content from the cloud processor; (6) unmasking or unmarking the restricted content in the processed responsive communication; and (7) communicating the processed responsive communication comprising the unmasked or unmarked restricted content to the electronic device.
    Type: Grant
    Filed: October 9, 2020
    Date of Patent: February 14, 2023
    Assignee: JPMORGAN CHASE BANK, N.A.
    Inventors: John F. Buford, John Roussochatzakis, Anjur Sundaresan Krishnakumar, Constantine Barzacanos, Uthra J. Ramanujam
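The mask-before-cloud and unmask-after-cloud flow in the entry above can be sketched with a token map that remembers the restricted values locally; the regular expressions and token format below are assumptions.

```python
# Sketch of masking restricted content before cloud processing and unmasking it
# in the processed response; patterns and token format are illustrative.
import re

PATTERNS = {"ACCOUNT": re.compile(r"\b\d{12,16}\b"),
            "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")}

def mask(text):
    mapping, counter = {}, 0
    for label, pattern in PATTERNS.items():
        def repl(m):
            nonlocal counter
            token = f"<{label}_{counter}>"
            mapping[token] = m.group(0)          # remember the original restricted value
            counter += 1
            return token
        text = pattern.sub(repl, text)
    return text, mapping

def unmask(text, mapping):
    for token, original in mapping.items():
        text = text.replace(token, original)     # restore before returning to the device
    return text

masked, mapping = mask("My account 123456789012 and SSN 123-45-6789")
cloud_response = f"Confirmed details for {masked.split()[2]}"   # cloud reply echoes the token
print(unmask(cloud_response, mapping))
```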
  • Patent number: 11580441
    Abstract: A model training method and an apparatus thereof are provided. The method includes reading a portion of sample data in a sample full set to form a sample subset; mapping a model parameter related to the portion of sample data from a first feature component for the sample full set to a second feature component for the sample subset; and training a model based on the portion of sample data having the second feature component. The size of the copy of model parameter(s) on a sample computer can be reduced after mapping, thus greatly reducing the amount of training data and minimizing the memory occupancy of the computer. Memory of a sample computer is used to place vectors and to store and load samples, thereby enabling machine learning and the training of large-scale models with relatively low resource overhead while minimizing the loss of efficiency.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: February 14, 2023
    Inventors: Yi Ding, Jin Yu, Huaidong Xiong, Xu Chen
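The feature-component mapping in the entry above can be sketched as an index remapping from the full set's feature space to a compact space covering only the subset, so the worker holds a small local parameter copy; the sparse sample format and the gradient update below are illustrative assumptions.

```python
# Sketch of the parameter-mapping idea above: only the features that actually
# appear in the sample subset get a slot in the local parameter copy.
import numpy as np

full_set = [  # each sample: list of (global_feature_index, value), plus a label
    ([(3, 1.0), (999_991, 0.5)], 1.0),
    ([(3, 2.0), (42, 1.5)], -1.0),
    ([(7, 1.0), (999_991, 1.0)], 1.0),
]
subset = full_set[:2]                       # the portion of samples read by this worker

# First feature component space -> second (compact) feature component space
global_ids = sorted({idx for feats, _ in subset for idx, _ in feats})
to_local = {g: l for l, g in enumerate(global_ids)}

local_w = np.zeros(len(global_ids))         # small local copy of model parameters

def train_step(sample, lr=0.1):
    feats, label = sample
    pred = sum(local_w[to_local[g]] * v for g, v in feats)
    err = pred - label
    for g, v in feats:
        local_w[to_local[g]] -= lr * err * v   # update only mapped components

for sample in subset:
    train_step(sample)
print(dict(zip(global_ids, local_w)))       # results keyed back to global indices
```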
  • Patent number: 11550904
    Abstract: A system includes memory and a processor in communication with the memory. The processor is programmed to define a fingerprint that includes a baseline measurement of a physical attribute of at least a separate processor during an enrollment period of the system, wherein the enrollment period includes measuring the physical attribute of the processor prior to runtime operation, receiving a runtime measurement from a sensor regarding the physical attribute of at least the separate processor during runtime, comparing the runtime measurement of the physical attribute to the fingerprint, and outputting a multi-dimensional domain image in response to the runtime measurement.
    Type: Grant
    Filed: August 25, 2020
    Date of Patent: January 10, 2023
    Inventors: Jorge Guajardo Merchan, Stefan Gehrer, Benjamin Hettwer, Samarjit Das
  • Patent number: 11545146
    Abstract: A user device configured to perform wake-up word detection in a target language.
    Type: Grant
    Filed: November 10, 2016
    Date of Patent: January 3, 2023
    Assignee: Cerence Operating Company
    Inventors: Xiao-Lin Ren, Jianzhong Teng
  • Patent number: 11545154
    Abstract: Disclosed is a method and apparatus for speaker recognition. The method includes determining whether an input feature vector corresponding to a voice signal of a speaker meets a candidate similarity criterion with at least one registered data included in a registration database, selectively, based on a result of the determining of whether the input feature vector meets the candidate similarity criterion, constructing a candidate list based on the input feature vector, determining whether a candidate input feature vector, among one or more candidate input feature vectors constructed in the candidate list in the selective constructing of the candidate list, meets a registration update similarity criterion with the at least one registered data, and selectively, based on a result of the determination of whether the candidate input feature vector meets the registration update similarity criterion, updating the registration database based on the candidate input feature vector.
    Type: Grant
    Filed: May 12, 2020
    Date of Patent: January 3, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sung-Un Park, Kyuhong Kim
  • Patent number: 11528248
    Abstract: Systems, computer program products, and methods are described herein for intelligent multimodal classification in a distributed technical environment. The present invention is configured to retrieve one or more multimodal communications from a data repository; initiate one or more feature extraction algorithms on the one or more communication modalities to extract one or more features; generate a training dataset based on at least the one or more features extracted from the one or more communication modalities; initiate one or more machine learning algorithms on the training dataset to generate a first set of parameters; receive an unseen multimodal communication; generate an unseen dataset based on at least the unseen multimodal communication; classify, using the first set of parameters, the unseen multimodal communication into one or more class labels; and initiate an execution of one or more actions on the unseen multimodal communication based on at least the classification.
    Type: Grant
    Filed: June 10, 2020
    Date of Patent: December 13, 2022
    Assignee: BANK OF AMERICA CORPORATION
    Inventors: Harikrishnan Rajeev, Vinita Gummalla
  • Patent number: 11521642
    Abstract: Methods and systems include sending recording data of a call to a first server and a second server, wherein the recording data includes a first voice of a first participant of the call and a second voice of a second participant of the call; receiving, from the first server, a first emotion score representing a degree of a first emotion associated with the first voice, and a second emotion score representing a degree of a second emotion associated with the first voice; receiving, from the second server, a first sentiment score, a second sentiment score, and a third sentiment score; determining a quality score and classification data for the recording data based on the first emotion score, the second emotion score, the first sentiment score, the second sentiment score, and the third sentiment score; and outputting the quality score and the classification data for visualization of the recording data.
    Type: Grant
    Filed: November 3, 2020
    Date of Patent: December 6, 2022
    Assignee: Fidelity Information Services, LLC
    Inventor: Rajiv Ramanjani
  • Patent number: 11514903
    Abstract: The present technology relates to an information processing device and an information processing method that make it possible to generate interaction data with less cost. Provided is the information processing device including a processor that generates, on the basis of interaction history information, a coupling context to be coupled to a context of interest to be noticed among a plurality of contexts. This makes it possible to generate interaction data with less cost. The present technology is applicable as server-side service of a voice interaction system, for example.
    Type: Grant
    Filed: July 25, 2018
    Date of Patent: November 29, 2022
    Assignee: SONY CORPORATION
    Inventor: Junki Ohmura
  • Patent number: 11443750
    Abstract: A method and apparatus for authenticating a user based on an utterance input includes obtaining an input audio signal based on the utterance input of the user; obtaining, from the input audio signal, at least one audio signal of an utterance section and at least one audio signal of a non-utterance section; generating environment information indicating an environment in which the utterance input is received, based on the at least one audio signal of the non-utterance section; obtaining a result of a comparison between the generated environment information and registration environment information indicating an environment in which a registration utterance input corresponding to a previously registered registration audio signal corresponding to the user is received; adjusting an authentication criterion for authenticating the user based on the result of the comparison; and authenticating the user based on the adjusted authentication criterion and the input audio signal.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: September 13, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Keunseok Cho, Jaeyoung Roh, Donghan Jang, Jiwon Hyung, Jaewon Lee
  • Patent number: 11417327
    Abstract: An electronic apparatus is provided. The electronic device includes: a storage configured to store recognition related information and misrecognition related information of a trigger word for entering a speech recognition mode; and a processor configured to identify whether or not the speech recognition mode is activated on the basis of characteristic information of a received uttered speech and the recognition related information, identify a similarity between text information of the received uttered speech and text information of the trigger word, and update at least one of the recognition related information or the misrecognition related information on the basis of whether or not the speech recognition mode is activated and the similarity.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: August 16, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Chanhee Choi
  • Patent number: 11410645
    Abstract: A user device configured to perform wake-up word detection in a target language.
    Type: Grant
    Filed: November 10, 2016
    Date of Patent: August 9, 2022
    Assignee: Cerence Operating Company
    Inventors: Xiao-Lin Ren, Jianzhong Teng
  • Patent number: 11404067
    Abstract: A method of operating an electronic device and an electronic device thereof are provided. The method includes receiving a first voice signal of a first user, authenticating whether the first user has authority to control the electronic device, based on the first voice signal, and determining an instruction corresponding to the first voice signal based on an authentication result and controlling the electronic device according to the instruction. The electronic device includes a receiver configured to receive a first voice signal of a first user and at least one processor configured to authenticate whether the first user has authority to control the electronic device based on the first voice signal, determine an instruction corresponding to the first voice signal, and control the electronic device according to the instruction.
    Type: Grant
    Filed: June 15, 2020
    Date of Patent: August 2, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Anas Toma, Ahmad Abu Shariah, Hadi Jadallah
  • Patent number: 11335353
    Abstract: A score indicating a likelihood that a first subject is the same as a second subject may be calibrated to compensate for aging of the first subject between samples of age-sensitive biometric characteristics. Age of the first subject obtained at a first sample time and age of the second subject obtained at a second sample time may be averaged, and an age approximation may be generated based on at least the age average and an interval between the first and second samples. The age approximation, the interval between the first and second sample times, and an obtained gender of the subject are used to calibrate the likelihood score.
    Type: Grant
    Filed: June 1, 2020
    Date of Patent: May 17, 2022
    Assignee: PINDROP SECURITY, INC.
    Inventors: Elie Khoury, Matthew Garland
  • Patent number: 11328711
    Abstract: A user adaptive conversation apparatus generates a talk for a conversation based on emotional and ethical states of a user. A voice recognition unit converts a talk of the user in a conversational situation into a natural language script form to generate talk information. An artificial visualization unit generates situation information by recognizing the talking situation from a video and generates intention information indicating an intention of the talk. A natural language analysis unit converts the situation information and the intention information into the natural language script form and analyzes the talk information, the intention information, and the situation information.
    Type: Grant
    Filed: July 5, 2019
    Date of Patent: May 10, 2022
    Assignee: KOREA ELECTRONICS TECHNOLOGY INSTITUTE
    Inventors: Saim Shin, Hyedong Jung, Jinyea Jang
  • Patent number: 11322155
    Abstract: A method and apparatus for establishing a voiceprint model, a computer device, and a storage medium are described herein. The method includes: collecting speech acoustic features in a speech signal to form a plurality of cluster structures; calculating an average value and a standard deviation of the plurality of cluster structures and then performing coordinate transformation and activation function calculation to obtain a feature vector; and obtaining a voiceprint model based on the feature vector.
    Type: Grant
    Filed: July 6, 2018
    Date of Patent: May 3, 2022
    Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
    Inventors: Yuanzhe Cai, Jianzong Wang, Ning Cheng, Jing Xiao
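The voiceprint-model pipeline in the entry above (cluster the acoustic features, take per-cluster mean and standard deviation, then apply a coordinate transformation and an activation function) can be sketched as follows; the cluster count, projection size, and choice of tanh are illustrative assumptions.

```python
# Sketch of the voiceprint pipeline above: cluster structures -> per-cluster
# mean/std -> coordinate transformation -> activation -> feature vector.
import numpy as np
from sklearn.cluster import KMeans

def voiceprint(frames, n_clusters=4, out_dim=8, seed=0):
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(frames)
    stats = []
    for c in range(n_clusters):                     # statistics per cluster structure
        members = frames[km.labels_ == c]
        stats.append(np.concatenate([members.mean(axis=0), members.std(axis=0)]))
    stats = np.concatenate(stats)
    rng = np.random.default_rng(seed)
    projection = rng.normal(size=(len(stats), out_dim))   # coordinate transformation
    return np.tanh(stats @ projection)              # activation -> voiceprint vector

frames = np.random.randn(200, 13)                   # e.g. 200 MFCC frames of one speaker
print(voiceprint(frames).shape)                     # (8,) speaker embedding
```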
  • Patent number: 11233490
    Abstract: A method includes detecting an input that triggers a virtual assistant (VA) on an electronic device (ED) to perform a task that includes outputting audio content through a speaker associated with the ED. The method includes identifying a type of the audio content to be outputted through the speaker. The method includes determining whether a registered user of the ED is present in proximity to the ED. Each registered user is associated with a unique user identifier. The method includes, in response to determining that no registered user is present in proximity to the ED, outputting the audio content via the speaker at a current volume level of the ED. The method includes in response to determining that a registered user is in proximity to the ED, outputting the audio content at a selected, preferred volume level based on pre-determined or pre-established volume preference settings of the registered user.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: January 25, 2022
    Assignee: Motorola Mobility LLC
    Inventors: Wendel Malta De Assis, Amit Kumar Agrawal, Lucie Corinne Evelyne Papelier
  • Patent number: 11200903
    Abstract: A method of speaker verification comprises receiving an audio signal representing speech. While the audio signal is being received, features of the received audio signal are extracted. The extracted features, of at least a part of the received audio signal corresponding to the speech of at least one speaker, are summarised, and the summarised extracted features are stored. In response to a request for a speaker verification process relating to at least one enrolled user, the speaker verification process is performed using the previously summarised features.
    Type: Grant
    Filed: February 4, 2020
    Date of Patent: December 14, 2021
    Assignee: Cirrus Logic, Inc.
    Inventors: Carlos Vaquero Avilés-Casco, Michael Page
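The streaming summarization in the entry above can be sketched as folding frame-level features into a running summary while audio is still arriving, then verifying against an enrolled model on request; the running mean and the cosine-score verification are assumptions for illustration.

```python
# Sketch of summarising features during reception and verifying later from the
# stored summary only; the summary statistic and scoring rule are assumed.
import numpy as np

class FeatureSummariser:
    def __init__(self, dim):
        self.sum = np.zeros(dim)
        self.count = 0

    def add_frame(self, feature_vector):        # called while audio is being received
        self.sum += feature_vector
        self.count += 1

    def summary(self):
        return self.sum / max(self.count, 1)    # stored summarised features

def verify(summary, enrolled_embedding, threshold=0.7):
    score = np.dot(summary, enrolled_embedding) / (
        np.linalg.norm(summary) * np.linalg.norm(enrolled_embedding))
    return score >= threshold

summariser = FeatureSummariser(dim=16)
for _ in range(100):                            # incoming audio frames
    summariser.add_frame(np.random.randn(16))
enrolled = np.random.randn(16)                  # enrolled user's stored embedding
print(verify(summariser.summary(), enrolled))
```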
  • Patent number: 11194968
    Abstract: The present invention concerns a text analysis system, the text analysis system being adapted for utilizing a topic model to provide a document representation. The topic model is based on learning performed on a text corpus utilizing hidden layer representations associated to words of the text corpus, wherein each hidden layer representation pertains to a specific word of the text corpus and is based on a word environment including words occurring before and after the specific word in a text of the text corpus.
    Type: Grant
    Filed: May 31, 2018
    Date of Patent: December 7, 2021
    Assignee: SIEMENS AKTIENGESELLSCHAFT
    Inventors: Florian Büttner, Pankaj Gupta
  • Patent number: 11176950
    Abstract: Disclosed herein are an apparatus and method for recognizing a voice speaker. The apparatus for recognizing a voice speaker includes a voice feature extraction unit configured to extract a feature vector from a voice signal inputted through a microphone; and a speaker recognition unit configured to calculate a speaker recognition score by selecting a reverberant environment from multiple reverberant environment learning data sets based on the feature vector extracted by the voice feature extraction unit and to recognize a speaker by assigning a weight depending on the selected reverberant environment to the speaker recognition score.
    Type: Grant
    Filed: March 20, 2019
    Date of Patent: November 16, 2021
    Assignee: Hyundai Mobis Co., Ltd.
    Inventors: Yu Jin Jung, Ki Hee Park, Chang Won Lee, Doh Hyun Kim, Tae Kyung Kim, Tae Yoon Son, Joon Hyuk Chang, Joon Young Yang
  • Patent number: 11170131
    Abstract: Techniques for ensuring differential privacy in top-K selection are provided. In one technique, multiple items and multiple counts are identified in response to a query. For each count, which corresponds to a different item, a noise value is generated and added to the count to generate a noisy value, and the noisy value is added to a set of noisy values that is initially empty. A particular noise value is generated for a particular count and added to the particular count to generate a noisy threshold. The particular noise value is generated using a different technique than the technique used to generate each noise value in the set. Based on the noisy threshold, a subset of the noisy values is identified, where each noisy value in the subset is less than the noisy threshold. A response to the query is generated that excludes items that correspond to the subset.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: November 9, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ryan M. Rogers, David Anthony Durfee, Sean S. Peng, Ya Xu
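The noisy-threshold top-K selection in the entry above can be sketched as follows; the choice of Gumbel noise for the counts, Laplace noise for the threshold (the "different technique"), the epsilon scaling, and the use of the (k+1)-th largest count as the threshold are illustrative assumptions.

```python
# Sketch of differentially private top-K selection with a noisy threshold:
# noise the counts, noise a threshold count with a different mechanism, and
# exclude items whose noisy value falls below the noisy threshold.
import numpy as np

def noisy_top_k(items, counts, k, epsilon=1.0, seed=0):
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    noisy = counts + rng.gumbel(scale=2.0 / epsilon, size=len(counts))  # set of noisy values
    threshold_count = np.sort(counts)[-(k + 1)]           # the "particular count"
    noisy_threshold = threshold_count + rng.laplace(scale=2.0 / epsilon)
    return [item for item, value in zip(items, noisy)
            if value > noisy_threshold]                    # exclude values below threshold

items = ["java", "python", "go", "rust", "perl"]
counts = [950, 900, 400, 380, 20]
print(noisy_top_k(items, counts, k=2))
```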
  • Patent number: 11164587
    Abstract: A personal assistant operation is provided for teaching a personal assistant device names preferred by the user for sensor activated devices. For this purpose, a method includes the personal assistant device receiving a request from a user to activate a requested device which the user has identified with a requested name which is unrecognized by the personal assistant device, determining a most likely candidate device from a list of candidate devices to activate in response to the request, activating the determined most likely candidate device, and identifying and saving the requested name as the name of the most likely candidate device in response to receiving confirmation from the user that the determined most likely candidate device is the requested device.
    Type: Grant
    Filed: January 15, 2019
    Date of Patent: November 2, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bret A. Greenstein, Lisa Seacat DeLuca, Amit Fisher, Steven R. Abrams
  • Patent number: 11120792
    Abstract: An electronic device and method are disclosed. The electronic device includes a communication interface, a microphone, a speaker, a processor and a memory. The processor executes the method, including receiving a first message through a communication interface from an external device, in response to receiving a first user input associated with the first message through an input interface including a microphone, converting first text included in the first message into speech, outputting the speech through an output interface of the electronic device including a speaker, determining whether to execute an additional operation associated with the first message, based on at least one of a state of the first electronic device, the first message, and a second user input received via the microphone, and initiating a conversational mode for executing the additional operation based at least partly on the determination.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: September 14, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jihyun Kim, Gajin Song, Dongho Jang, Sangwook Kang, Hyunwoo Kang, Soojung Choi
  • Patent number: 11080591
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing sequences using convolutional neural networks. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
    Type: Grant
    Filed: September 6, 2017
    Date of Patent: August 3, 2021
    Assignee: DeepMind Technologies Limited
    Inventors: Aaron Gerard Antonius van den Oord, Sander Etienne Lea Dieleman, Nal Emmerich Kalchbrenner, Karen Simonyan, Oriol Vinyals, Lasse Espeholt
  • Patent number: 11068735
    Abstract: A reliability calculation apparatus calculates a reliability of a recognized object that is recognized as being present on a route on which a vehicle travels. The reliability of the recognized object is used for driving control of the vehicle. In the reliability calculation apparatus, a provisional reliability calculating unit determines a feature quantity of each of a plurality of recognized objects, selects a plurality of pairs of recognized objects from the plurality of recognized objects, determines a difference in feature quantity of each pair of recognized objects, and calculates a reliability that decreases as the difference in feature quantity increases as a provisional reliability of each pair of recognized objects. An integrated reliability calculating unit calculates an integrated reliability of each recognized object from the provisional reliabilities of the plurality of pairs of recognized objects.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: July 20, 2021
    Assignee: DENSO CORPORATION
    Inventors: Masaya Okada, Shunya Kumano, Hiroaki Niino, Mitsuhiro Tokimasa, Takumi Uematsu
  • Patent number: 11038824
    Abstract: The method includes receiving sender media that was recorded by a sender device associated with a sender. The method further comprises playing, by a recipient device, the sender media for a recipient. The method further comprises detecting that the recipient is speaking. The method further comprises recording recipient media based on detecting that the recipient is speaking. The method further comprises determining a location in the sender media at which the recipient media is to be included. The method further comprises generating combined media that includes at least a portion of the sender media and the recipient media at the location.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: June 15, 2021
    Assignee: Google LLC
    Inventor: Matthew Sharifi
  • Patent number: 11010562
    Abstract: One embodiment provides a method, including: receiving a story, wherein the story is received in a text-based format; identifying conversational blocks contained within the story, wherein the identifying conversational blocks comprises (i) identifying conversational turns within the story, (ii) identifying topic switches between conversational turns, and (iii) grouping conversational turns occurring between topic switches into a conversational block; for each of the conversational blocks, obtaining at least one image illustrating the corresponding conversational block, wherein the obtaining comprises (i) extracting concepts from the conversational block and (ii) identifying images illustrating the concepts; and generating a visual storyline representing the story by ordering the images obtained for each of the conversational blocks, based upon an order of the conversational blocks within the story, wherein the generating comprises refining the images to maintain consistency between the images.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: May 18, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Smitkumar Narotambhai Marvaniya, Padmanabha Venkatagiri Seshadri, Vijay Ekambaram
  • Patent number: 10958466
    Abstract: Methods and apparatuses for environmental control systems are described. In one example, a method includes monitoring a voice communications call between a local call participant located in a building space and a remote call participant. The method includes detecting a change in a user state of the local call participant from the voice communications call. The method further includes adjusting an environmental parameter in the building space responsive to detecting the change in the user state of the local call participant.
    Type: Grant
    Filed: May 3, 2018
    Date of Patent: March 23, 2021
    Assignee: Plantronics, Inc.
    Inventors: Marcus Dennis Rose, Evan Harris Benway
  • Patent number: 10923130
    Abstract: An electronic device and method are disclosed, including a communication circuit, a speaker, a microphone and a processor. The processor implements the method, including receiving a first voice input through the microphone, executing first voiceprint authentication on the first voice input including determining whether the first voice input matches voice information corresponding to a user stored in the electronic device, when the first voice input does not match the voice information, transmit a request message to at least one external electronic device for execution of a second voiceprint authentication, receive a response message indicating whether the first voice input is authenticated under the second voiceprint authentication, receive a second voice input through the microphone, and transmit second data related to the second voice input to an external server through the communication circuit for Automatic Speech Recognition (ASR) or Natural Language Understanding (NLU) on the second data.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: February 16, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Gyucheol Choi, Youngkyoo Kim
  • Patent number: 10884503
    Abstract: Methods, computing devices, and computer-program products are provided for implementing a virtual personal assistant. In various implementations, a virtual personal assistant can be configured to receive sensory input, including at least two different types of information. The virtual personal assistant can further be configured to determine semantic information from the sensory input, and to identify a context-specific framework. The virtual personal assistant can further be configured to determine a current intent. Determining the current intent can include using the semantic information and the context-specific framework. The virtual personal assistant can further be configured to determine a current input state. Determining the current input state can include using the semantic information and one or more behavioral models. The behavioral models can include one or more interpretations of previously-provided semantic information.
    Type: Grant
    Filed: October 24, 2016
    Date of Patent: January 5, 2021
    Assignee: SRI International
    Inventors: Ajay Divakaran, Amir Tamrakar, Girish Acharya, William Mark, Greg Ho, Jihua Huang, David Salter, Edgar Kalns, Michael Wessel, Min Yin, James Carpenter, Brent Mombourquette, Kenneth Nitz, Elizabeth Shriberg, Eric Law, Michael Frandsen, Hyong-Gyun Kim, Cory Albright, Andreas Tsiartas
  • Patent number: 10733230
    Abstract: Approaches presented herein enable automatic creation of metadata for contents of a video. More specifically, a video and a script corresponding to the video are obtained. A location corresponding to an object in at least one shot of the video is extracted. This at least one shot includes a series of adjacent frames. The extracted location is saved as an annotation area in an annotation knowledge base. An element of a plot of the video is extracted from the script. This element of the plot is derived from content of the video in combination with content of the script. The extracted element of the plot is saved in a narrative knowledge base.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: August 4, 2020
    Assignee: INHA UNIVERSITY RESEARCH AND BUSINESS FOUNDATION
    Inventor: Geun Sik Jo
  • Patent number: 10726848
    Abstract: Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.
    Type: Grant
    Filed: January 22, 2018
    Date of Patent: July 28, 2020
    Assignee: VERINT SYSTEMS LTD.
    Inventors: Alex Gorodetski, Oana Sidi, Ron Wein, Ido Shapira