Specialized Models Patents (Class 704/250)
-
Patent number: 12257511
Abstract: A gaming headset receives a plurality of audio channels during play of a particular game, determines an identity and/or an age of a gamer that is wearing the gaming headset, and configures operation of the gaming headset based on the determined identity and/or age of the gamer. The gaming headset logs into the particular game using credentials associated with the determined identity and/or age, and selects settings based on the determined identity and/or age of the gamer that is wearing the gaming headset. One or more sensors communicatively coupled to the gaming headset may be activated and information received from the sensors is utilized to determine the identity and/or the age of the gamer that is wearing the gaming headset. The sensors determine, for example, the head size, ear size, voice characteristics and/or biometric information for the gamer that is wearing the gaming headset.
Type: Grant
Filed: August 1, 2023
Date of Patent: March 25, 2025
Assignee: Voyetra Turtle Beach, Inc.
Inventors: Richard Kulavik, Michael A. Jessup
-
Patent number: 12217743
Abstract: Deep recurrent neural networks applied to speech recognition. The deep recurrent neural networks (RNNs) are preferably implemented by stacked long short-term memory bidirectional RNNs. The RNNs are trained using end-to-end training with suitable regularisation.
Type: Grant
Filed: July 28, 2023
Date of Patent: February 4, 2025
Assignee: Google LLC
Inventor: Alexander B. Graves
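The abstract above names the class of model rather than an implementation. As a rough illustration only, the sketch below stacks bidirectional LSTM layers and trains them end to end with a CTC objective (a common pairing for such models, not something the abstract specifies); all sizes, label counts, and tensors are made up.

```python
# Minimal sketch (not the patented system): a stacked bidirectional LSTM
# acoustic model trained end to end with CTC. Shapes and sizes are illustrative.
import torch
import torch.nn as nn

class StackedBiLSTM(nn.Module):
    def __init__(self, n_mels=80, hidden=320, layers=5, n_labels=29):
        super().__init__()
        self.rnn = nn.LSTM(n_mels, hidden, num_layers=layers,
                           bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_labels)      # labels include a CTC blank

    def forward(self, feats):                            # feats: (batch, time, n_mels)
        h, _ = self.rnn(feats)
        return self.out(h).log_softmax(dim=-1)           # (batch, time, n_labels)

model = StackedBiLSTM()
ctc = nn.CTCLoss(blank=0)
feats = torch.randn(2, 200, 80)                          # two utterances, 200 frames each
targets = torch.randint(1, 29, (2, 30))
loss = ctc(model(feats).transpose(0, 1),                 # CTC expects (time, batch, labels)
           targets, torch.tensor([200, 200]), torch.tensor([30, 30]))
loss.backward()
```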
-
Patent number: 12216819
Abstract: Provided herein are methods of determining a cognitive load of a sensorized device user in certain aspects. Provided herein are also methods of adjusting an interface between a user and a sensorized device in some aspects. Related devices, systems, and computer program products are also provided.
Type: Grant
Filed: April 28, 2022
Date of Patent: February 4, 2025
Assignee: THE JOHNS HOPKINS UNIVERSITY
Inventors: Mathias Unberath, Kinjal Shah, Wenhao Gu
-
Patent number: 12170098
Abstract: The present disclosure discloses a sound detection method. The method includes: obtaining an initial sound signal and a spatial distribution spectrum of the initial sound signal; segmenting the initial sound signal, to obtain a target sound segment, and obtaining a timestamp corresponding to the target sound segment, the target sound segment including a speech of at least one object, and the timestamp being used for indicating a start time of the target sound segment and an end time of the target sound segment; segmenting the spatial distribution spectrum by using the timestamp, to obtain a spatial distribution spectrum segment corresponding to the target sound segment; and inputting the target sound segment and the spatial distribution spectrum segment into a sound detection model, to obtain a first sound detection result, the first sound detection result being used for describing whether sound of multiple objects exists in the initial sound signal.
Type: Grant
Filed: August 26, 2022
Date of Patent: December 17, 2024
Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd.
Inventors: Shiliang Zhang, Siqi Zheng, Weilong Huang
-
Patent number: 12165650
Abstract: A system, method, and computer-program product includes receiving speech audio of a multi-turn conversation, generating, via a speech-to-text process, a transcript of the speech audio, wherein the transcript of the speech audio textually segments speech spoken during the multi-turn conversation into a plurality of utterances, generating a speaker diarization prompt that includes contextual information about a plurality of speakers participating in the multi-turn conversation, inputting, to a large language model, the speaker diarization prompt and the transcript of the speech audio, and obtaining, from the large language model, an output comprising an enhanced transcript of the speech audio, wherein the enhanced transcript of the speech audio textually segments the speech spoken during the multi-turn conversation into a plurality of refined utterances and associates a speaker identification value with each of the plurality of refined utterances.
Type: Grant
Filed: April 12, 2024
Date of Patent: December 10, 2024
Assignee: SAS INSTITUTE INC.
Inventors: Xiaolong Li, Xiaozhuo Cheng, Xu Yang
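A minimal sketch of how such a diarization prompt might be assembled and handed to a language model. The prompt wording, the speaker-context fields, and the `call_llm` callable are assumptions for illustration, not the patented prompt.

```python
# Illustrative sketch only: build a speaker-diarization prompt from a transcript
# plus contextual speaker information, then pass it to a large language model.
# `call_llm` is a stand-in for whatever LLM client is available.
def build_diarization_prompt(utterances, speakers):
    context = "\n".join(f"- {s['name']}: {s['role']}" for s in speakers)
    transcript = "\n".join(f"[{i}] {u}" for i, u in enumerate(utterances))
    return (
        "You are refining a speech-to-text transcript of a multi-turn conversation.\n"
        f"Known participants:\n{context}\n\n"
        "Re-segment the utterances where needed and label each refined utterance "
        "with the most likely speaker.\n\n"
        f"Transcript:\n{transcript}"
    )

def diarize(utterances, speakers, call_llm):
    prompt = build_diarization_prompt(utterances, speakers)
    return call_llm(prompt)          # expected to return the enhanced transcript

# Example prompt construction
speakers = [{"name": "Agent", "role": "support representative"},
            {"name": "Caller", "role": "customer"}]
utterances = ["hi thanks for calling how can i help",
              "my card was declined yesterday"]
print(build_diarization_prompt(utterances, speakers))
```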
-
Patent number: 12131729
Abstract: A language model score calculation apparatus calculates a prediction probability of a word w_i as a language model score of a language model based on a recurrent neural network. The language model score calculation apparatus includes a memory; and a processor configured to execute converting a word w_{i-1} that is observed immediately before the word w_i into a word vector φ(w_{i-1}); converting a speaker label r_{i-1} corresponding to the word w_{i-1} and a speaker label r_i corresponding to the word w_i into a speaker vector ψ(r_{i-1}) and a speaker vector ψ(r_i), respectively; calculating a word history vector s_i by using the word vector φ(w_{i-1}), the speaker vector ψ(r_{i-1}), and a word history vector s_{i-1} that is obtained when a prediction probability of the word w_{i-1} is calculated; and calculating a prediction probability of the word w_i by using the word history vector s_i and the speaker vector ψ(r_i).
Type: Grant
Filed: June 21, 2019
Date of Patent: October 29, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka
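One plausible reading of that recurrence, written as a toy numpy step (not the patented implementation): the history vector is updated from the previous word vector, the previous speaker vector, and the previous history, and the next-word distribution is computed from the updated history together with the current speaker vector. All dimensions and parameter matrices are illustrative.

```python
# Toy speaker-aware recurrent language model step under the assumptions above.
import numpy as np

rng = np.random.default_rng(0)
V, D, H = 1000, 64, 128                       # vocab size, embedding dim, history dim
E_word = rng.normal(size=(V, D))              # word embedding table  (phi)
E_spk  = rng.normal(size=(2, D))              # speaker embedding table (psi), 2 speakers
W = rng.normal(size=(H, 2 * D + H)) * 0.01    # recurrence weights
U = rng.normal(size=(V, H + D)) * 0.01        # output weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lm_step(w_prev, r_prev, r_cur, s_prev):
    x = np.concatenate([E_word[w_prev], E_spk[r_prev], s_prev])
    s_cur = np.tanh(W @ x)                            # word-history vector s_i
    p = softmax(U @ np.concatenate([s_cur, E_spk[r_cur]]))
    return p, s_cur                                   # p[w] = P(w_i = w | history)

p, s = lm_step(w_prev=42, r_prev=0, r_cur=1, s_prev=np.zeros(H))
print(p.shape, p.sum())                               # (1000,) ~1.0
```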
-
Patent number: 12124999
Abstract: The present invention relates to a method and a system for managing an automatic evaluation model for an interview video, and a computer-readable medium.
Type: Grant
Filed: January 2, 2020
Date of Patent: October 22, 2024
Assignee: GENESIS LAB, INC.
Inventors: Daehun Yoo, Youngbok Lee
-
Patent number: 12112767
Abstract: A method, computer system, and a computer program product for audio data augmentation are provided. Sets of audio data from different sources may be obtained. A respective normalization factor for at least two sources of the different sources may be calculated. The normalization factors from the at least two sources may be mixed to determine a mixed normalization factor. A first set of the sets may be normalized by using the mixed normalization factor to obtain training data for training an acoustic model.
Type: Grant
Filed: May 21, 2021
Date of Patent: October 8, 2024
Assignee: International Business Machines Corporation
Inventors: Toru Nagano, Takashi Fukuda, Masayuki Suzuki
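A minimal sketch of the mixing idea, assuming per-source mean and standard deviation serve as the normalization factors (the abstract does not fix a particular statistic) and a simple linear interpolation as the mixing rule.

```python
# Sketch under stated assumptions: mix normalization statistics from two sources
# and apply the mixed factor to one source's features to create augmented data.
import numpy as np

def norm_factor(feats):
    return feats.mean(axis=0), feats.std(axis=0) + 1e-8

def mix_factors(f_a, f_b, alpha=0.5):
    mean = alpha * f_a[0] + (1 - alpha) * f_b[0]
    std  = alpha * f_a[1] + (1 - alpha) * f_b[1]
    return mean, std

def normalize(feats, factor):
    mean, std = factor
    return (feats - mean) / std

src_a = np.random.randn(500, 40) * 2.0 + 1.0   # e.g. close-talk microphone features
src_b = np.random.randn(300, 40) * 0.5 - 0.3   # e.g. far-field microphone features
mixed = mix_factors(norm_factor(src_a), norm_factor(src_b), alpha=0.7)
augmented = normalize(src_a, mixed)            # training data for the acoustic model
```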
-
Patent number: 12094454
Abstract: Implementations described herein include detecting a stream of audio data that captures a spoken utterance of the user and that captures ambient noise occurring within a threshold time period of the spoken utterance being spoken by the user. Implementations further include processing a portion of the audio data that includes the ambient noise to determine ambient noise classification(s), processing a portion of the audio data that includes the spoken utterance to generate a transcription, processing both the transcription and the ambient noise classification(s) with a machine learning model to generate a user intent and parameter(s) for the user intent, and performing one or more automated assistant actions based on the user intent and using the parameter(s).
Type: Grant
Filed: January 5, 2022
Date of Patent: September 17, 2024
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Matthew Sharifi
-
Patent number: 12045269
Abstract: An apparatus for generating a digital assistant is disclosed. The apparatus includes at least a processor and a memory communicatively connected to the at least a processor. The memory instructs the processor to extract a plurality of background data and a plurality of contextual data from the user dataset. The memory instructs the processor to receive at least one user query from a user. The memory instructs the processor to generate a query response as a function of the at least one user query and the contextual data using the digital assistant. The memory instructs the processor to display the query response using the digital assistant on a display device.
Type: Grant
Filed: October 10, 2023
Date of Patent: July 23, 2024
Inventor: David Cook
-
Patent number: 12039980
Abstract: A method for a user device, including receiving a first acoustic input of a user speaking a wake-up word in the target language; providing a first acoustic feature derived from the first acoustic input to an acoustic model stored on the user device to obtain a first sequence of speech units corresponding to the wake-up word spoken by the user in the target language, the acoustic model trained on a corpus of training data in a source language different than the target language; receiving a second acoustic input including the wake-up word in the target language; providing a second acoustic feature derived from the second acoustic input to the acoustic model to obtain a second sequence of speech units corresponding to the wake-up word in the target language; and comparing the first and second sequences of speech units to recognize the wake-up word in the target language.
Type: Grant
Filed: November 22, 2022
Date of Patent: July 16, 2024
Assignee: CERENCE OPERATING COMPANY
Inventors: Xiao-Lin Ren, Jianzhong Teng
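The abstract does not prescribe how the two speech-unit sequences are compared; the sketch below uses plain edit distance with a normalized threshold purely to make that comparison step concrete. The unit sequences shown are hypothetical.

```python
# Illustrative comparison of an enrolled speech-unit sequence with an observed
# one; edit distance is an assumption, not the patented metric.
def edit_distance(a, b):
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)] for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[len(a)][len(b)]

def is_wake_word(enrolled, observed, max_norm_dist=0.25):
    dist = edit_distance(enrolled, observed) / max(len(enrolled), 1)
    return dist <= max_norm_dist

enrolled = ["n", "i", "h", "au", "x", "iao", "du"]   # hypothetical unit sequence
observed = ["n", "i", "h", "au", "x", "iao", "du"]
print(is_wake_word(enrolled, observed))              # True
```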
-
Patent number: 12008997
Abstract: A conferencing system for determining a participant's status during a conference may include (1) a conference server configured to identify one or more conference topics, and (2) one or more participant devices in communication with the conference server, wherein each participant device is assigned to a unique participant. A natural language processor (NLP) may be in communication with the conference server, wherein the NLP processor is configured to identify a participant and one or more topics on which the participant comments. A ranking engine is configured to identify one or both of (a) whether the participant was an originator of an idea related to a conference topic, and (b) the participant's contribution to a conference topic.
Type: Grant
Filed: May 17, 2022
Date of Patent: June 11, 2024
Assignee: Mitel Networks Corporation
Inventors: Logendra Naidoo, Shayane Umar, Mithu Babu
-
Patent number: 12001862
Abstract: In one embodiment, a method includes receiving a user input including an ambiguous mention by a user from the user's client system, analyzing the user input to identify intents and slots based on a natural-language understanding module, where at least one slot is an ambiguous slot corresponding to the ambiguous mention, identifying candidate entities corresponding to the ambiguous slot, ranking the candidate entities based on the user's user profile data, sending to the user's client system instructions for prompting the user to select a candidate entity from a set of candidate entities having a rank greater than a threshold rank, where the selected candidate entity is resolved to the ambiguous slot, and storing the selected candidate entity together with the ambiguous mention as a mention-entity pair as part of the user profile data responsive to receiving the selection from the user.
Type: Grant
Filed: September 19, 2018
Date of Patent: June 4, 2024
Assignee: Meta Platforms, Inc.
Inventors: Xiaohu Liu, Baiyang Liu
-
Patent number: 11996087
Abstract: Methods and systems are described for recognizing, based on a voice input, a user and/or a voice command. An algorithm is described herein that processes data associated with a voice input. The data may indicate characteristics of the voice such as a gender, an age, or accent associated with the voice and other metadata. For example, the system may process the data and determine the gender of a voice. The determined characteristics may be used as an input into a voice recognition engine to improve the accuracy of identifying the user who spoke the voice input and identifying a voice command associated with the voice input. For example, the determined gender may be used as a parameter to improve the accuracy of an identified user (e.g., the speaker) or command. The algorithm may adjust, based on gender, parameters such as confidence thresholds used to match voices and voice commands.
Type: Grant
Filed: April 30, 2021
Date of Patent: May 28, 2024
Assignee: Comcast Cable Communications, LLC
Inventors: Navdeep Jain, Hongcheng Wang
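A minimal sketch, assuming a simple per-characteristic lookup of confidence thresholds (the abstract leaves the exact adjustment scheme open); the threshold values and characteristic keys are invented for illustration.

```python
# Characteristic-dependent match threshold for a voice recognition engine (toy values).
DEFAULT_THRESHOLD = 0.80
THRESHOLD_BY_CHARACTERISTIC = {          # hypothetical tuning values
    ("female", "adult"): 0.78,
    ("male", "adult"): 0.79,
    ("unknown", "child"): 0.85,
}

def match_threshold(gender, age_group):
    return THRESHOLD_BY_CHARACTERISTIC.get((gender, age_group), DEFAULT_THRESHOLD)

def identify_user(match_scores, gender, age_group):
    """match_scores: dict of user_id -> similarity from the recognition engine."""
    threshold = match_threshold(gender, age_group)
    best_user = max(match_scores, key=match_scores.get)
    return best_user if match_scores[best_user] >= threshold else None

print(identify_user({"alice": 0.83, "bob": 0.71}, "female", "adult"))  # alice
```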
-
Patent number: 11995163
Abstract: A biometric information authentication device includes a control unit to, during biometric authentication by comparing biometric information of a user read by a biometric information sensor to preliminarily registered biometric information of a registered person, issue a notification to encourage the user to register different biometric information when a predetermined condition for determining that a biometric authentication after a current biometric authentication is highly likely to fail is met.
Type: Grant
Filed: August 6, 2019
Date of Patent: May 28, 2024
Assignees: KABUSHIKI KAISHA TOKAI RIKA DENKI SEISAKUSHO, TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: Rijin Owaki, Yosuke Ohashi, Takahiko Ando, Yuya Goto, Yasuhisa Ohta, Naoyuki Takada, Daisuke Ogawa
-
Patent number: 11948066
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing sequences using convolutional neural networks. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
Type: Grant
Filed: July 14, 2021
Date of Patent: April 2, 2024
Assignee: DeepMind Technologies Limited
Inventors: Aaron Gerard Antonius van den Oord, Sander Etienne Lea Dieleman, Nal Emmerich Kalchbrenner, Karen Simonyan, Oriol Vinyals, Lasse Espeholt
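Not the patented architecture, only a toy version of the loop it describes: a causal convolutional subnetwork consumes the samples generated so far, and an output layer turns its representation into a score distribution over possible next samples, one time step at a time.

```python
# Toy autoregressive convolutional generator; layer sizes and quantization are illustrative.
import torch
import torch.nn as nn

class TinyCausalGenerator(nn.Module):
    def __init__(self, n_quant=256, channels=32):
        super().__init__()
        self.embed = nn.Embedding(n_quant, channels)
        self.conv = nn.Conv1d(channels, channels, kernel_size=2)   # causal via left padding
        self.out = nn.Linear(channels, n_quant)                    # output layer -> scores

    def forward(self, samples):                 # samples: (batch, time) quantized ints
        x = self.embed(samples).transpose(1, 2)                    # (batch, channels, time)
        x = torch.relu(self.conv(nn.functional.pad(x, (1, 0))))    # alternative representation
        return self.out(x.transpose(1, 2))                         # (batch, time, n_quant)

model = TinyCausalGenerator()
generated = torch.zeros(1, 1, dtype=torch.long)                    # seed sample
for _ in range(16):                                                # generate 16 more samples
    scores = model(generated)[:, -1]                               # distribution for next step
    nxt = torch.distributions.Categorical(logits=scores).sample()
    generated = torch.cat([generated, nxt.unsqueeze(1)], dim=1)
print(generated.shape)                                             # torch.Size([1, 17])
```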
-
Patent number: 11942078
Abstract: A computer-implemented method is provided for improving recognition accuracy of digital speech. The method includes receiving the digital speech. The method further includes splitting the digital speech into overlapping chunks. The method also includes computing a bidirectional encoder embedding of each of the overlapping chunks to obtain bidirectional encoder embeddings. The method additionally includes combining the bidirectional encoder embeddings. The method further includes interpreting, by a speech recognition system, the digital speech using the combined bidirectional encoder embeddings.
Type: Grant
Filed: February 26, 2021
Date of Patent: March 26, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: George Andrei Saon
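A small sketch of the chunking idea only. Chunk and hop sizes are illustrative, the "encoder" is a stand-in matrix, and the combination rule (averaging encoder outputs over overlapping frames) is an assumption the abstract does not specify.

```python
# Split a long utterance into overlapping chunks, encode each, merge back into one sequence.
import numpy as np

def split_overlapping(frames, chunk=100, hop=50):
    return [(s, frames[s:s + chunk]) for s in range(0, max(len(frames) - hop, 1), hop)]

def combine(encoded_chunks, total_len, dim):
    acc = np.zeros((total_len, dim))
    cnt = np.zeros((total_len, 1))
    for start, enc in encoded_chunks:
        acc[start:start + len(enc)] += enc
        cnt[start:start + len(enc)] += 1
    return acc / np.maximum(cnt, 1)             # average over overlapping coverage

frames = np.random.randn(230, 40)                    # 230 feature frames
W = np.random.randn(40, 64)
encode = lambda x: x @ W                             # stand-in for the bidirectional encoder
encoded_chunks = [(s, encode(c)) for s, c in split_overlapping(frames)]
combined = combine(encoded_chunks, len(frames), 64)  # what the recognizer would consume
print(combined.shape)                                # (230, 64)
```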
-
Patent number: 11900926
Abstract: Examples of the present disclosure describe systems and methods for dynamically expanding acronyms in audio content. In aspects, a user access of an audio resource may be detected. The audio content of the audio resource may be evaluated to identify acronyms. One or more of the identified acronyms may be evaluated based on a user-specific context of the user and/or a global context associated with the user. Based on the evaluated context(s), expansion candidates and corresponding confidence scores may be determined for each identified acronym. Based on the confidence scores, an expansion candidate may be selected and used to replace the identified acronym when the audio content is consumed by the user.
Type: Grant
Filed: October 5, 2020
Date of Patent: February 13, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventor: Amelia Bateman
-
Patent number: 11868725
Abstract: Provided are a server, a client device, and operation methods thereof for training a language model. The server, the client device, and the operation methods thereof identify a word or phrase including a named entity that is incorrectly pronounced by a user or is difficult for the user to accurately pronounce from an input text for use in training a natural language understanding (NLU) model, generate text candidates for use in training the NLU model by replacing the identified word or phrase with a word or phrase predicted to be uttered by the user and having high phonetic similarity to the identified word or phrase, and train the NLU model by using the generated text candidates.
Type: Grant
Filed: January 4, 2021
Date of Patent: January 9, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Hejung Yang, Kwangyoun Kim, Sungsoo Kim
-
Patent number: 11841932
Abstract: A system and method may adjust the threshold or other settings in a biometric comparison system, or provide a report or display of parameters. Over a series of comparisons of authentication biometric samples (e.g. authentication VPs) to enrollment biometric samples (e.g. enrollment VPs), the authentication samples may be stored, and scores resulting from the biometric comparisons may be stored in a first set of scores. A second set of biometric comparisons may be created, each using a pairing of a stored authentication biometric sample and an enrollment biometric sample, each biometric comparison resulting in a score, the scores forming a second set of scores. The first and second sets of scores may be combined to produce a third set of scores, and an iterative process may be performed over the third set of scores to update the parameters of the Gaussian distribution of the third set of scores.
Type: Grant
Filed: April 14, 2020
Date of Patent: December 12, 2023
Assignee: Nice Ltd.
Inventors: Roman Frenkel, Matan Keret, Amit Sharon
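The abstract leaves the iterative update unspecified; the sketch below shows one common choice, purely for illustration: expectation-maximization over the pooled score set to fit Gaussian parameters that a threshold adjustment could then use.

```python
# EM fit of a two-component 1-D Gaussian model to pooled comparison scores (toy data).
import numpy as np

def fit_gaussians_em(scores, n_iter=50):
    mu = np.array([scores.min(), scores.max()], dtype=float)   # impostor vs genuine init
    sigma = np.array([scores.std(), scores.std()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score
        lik = pi * np.exp(-0.5 * ((scores[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step: update component weights, means, and standard deviations
        nk = resp.sum(axis=0)
        pi = nk / len(scores)
        mu = (resp * scores[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (scores[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
    return pi, mu, sigma

first_set = np.random.normal(0.3, 0.1, 400)     # e.g. stored authentication scores
second_set = np.random.normal(0.7, 0.1, 100)    # e.g. re-scored stored pairings
third_set = np.concatenate([first_set, second_set])
print(fit_gaussians_em(third_set))
```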
-
Patent number: 11837249
Abstract: Systems, methods and non-transitory computer readable media for processing audio and visually presenting information are provided. Audio data captured by one or more audio sensors included in a wearable apparatus from an environment of a wearer of the wearable apparatus may be obtained. The audio data may be analyzed to obtain textual information. The audio data may be analyzed to associate different portions of the textual information with different speakers. A head mounted display system may be used to present each portion of the textual information in a presentation region associated with the speaker associated with the portion of the textual information.
Type: Grant
Filed: August 7, 2021
Date of Patent: December 5, 2023
Inventor: Ron Zass
-
Patent number: 11837236
Abstract: This speech processing device is provided with: a contribution degree estimation means which calculates a contribution degree representing a quality of a segment of the speech signal; and a speaker feature calculation means which calculates a feature from the speech signal, for recognizing attribute information of the speech signal, using the contribution degree as a weight of the segment of the speech signal.
Type: Grant
Filed: December 8, 2021
Date of Patent: December 5, 2023
Assignee: NEC CORPORATION
Inventors: Hitoshi Yamamoto, Takafumi Koshinaka
-
Patent number: 11823659
Abstract: A request including audio data is received from a voice-enabled device. A string of phonemes present in the utterance is determined through speech recognition. At a later time, a subsequent user input corresponding to the request may be received, in which the user input is associated with one or more text keywords. The subsequent user input may be obtained in response to an active request. Alternatively, feedback may not be actively elicited, but rather collected passively. However it is obtained, the one or more keywords associated with the subsequent user input may be associated with the string of phonemes to indicate that the user is saying or means those words when they produce that string of phonemes. A user-specific speech recognition key for the user account is then updated to associate the string of phonemes with these words. A general speech recognition model can also be trained using the association.
Type: Grant
Filed: December 11, 2019
Date of Patent: November 21, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Julia Reinspach, Oleg Rokhlenko, Ramakanthachary Gottumukkala, Giovanni Clemente, Ankit Agrawal, Swayam Bhardwaj, Guy Michaeli, Vaidyanathan Puthucode Krishnamoorthy, Costantino Vlachos, Nalledath P. Vinodkrishnan, Shaun M. Vickers, Sethuraman Ramachandran, Charles C. Moore
-
Patent number: 11797942
Abstract: Aspects of this disclosure relate to apparatus and method for applicant scoring. Apparatus includes at least a processor and a memory communicatively connected to the at least a processor. Processor is configured to receive a user-associated data set to be stored on a resource data storage system, parse the user-associated data set into a keyword record, generate a user metric, based on a plurality of weighted values, as a function of the keyword record, receive identification of a posting of a plurality of postings, and match the posting of the plurality of postings to a user as a function of the user metric.
Type: Grant
Filed: March 9, 2022
Date of Patent: October 24, 2023
Assignee: MY JOB MATCHER, INC.
Inventor: Arran Stewart
-
Patent number: 11776540
Abstract: A system configured to enable remote control to allow a first user to provide assistance to a second user. The system may receive a command from the second user granting remote control to the first user, enabling the first user to initiate a voice command on behalf of the second user. In some examples, the system may enable the remote control by treating a voice command originating from the first user as though it originated from the second user instead. For example, the system may receive the voice command from a first device associated with the first user but may route the voice command as though it was received by a second device associated with the second user.
Type: Grant
Filed: January 6, 2020
Date of Patent: October 3, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Peng Wang, Pathivada Rajsekhar Naidu
-
Patent number: 11756534
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
Type: Grant
Filed: January 26, 2022
Date of Patent: September 12, 2023
Assignee: Google LLC
Inventors: Bo Li, Ron Weiss, Michiel A. U. Bacchiani, Tara N. Sainath, Kevin William Wilson
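A toy illustration only; the filter length, the tiny statistics-based "network" that predicts the taps, and the filter-and-sum combination are all assumptions standing in for the trained neural components.

```python
# Predict per-channel FIR filters from both channels, filter each, sum into one channel.
import numpy as np

def predict_filters(ch1, ch2, taps=16, seed=0):
    rng = np.random.default_rng(seed)
    stats = np.concatenate([[ch1.mean(), ch1.std()], [ch2.mean(), ch2.std()]])
    W1, W2 = rng.normal(size=(taps, 4)), rng.normal(size=(taps, 4))   # stand-in for a trained net
    return np.tanh(W1 @ stats), np.tanh(W2 @ stats)

def beamform(ch1, ch2):
    f1, f2 = predict_filters(ch1, ch2)
    return np.convolve(ch1, f1, mode="same") + np.convolve(ch2, f2, mode="same")

ch1 = np.random.randn(16000)     # one second of audio from microphone 1
ch2 = np.random.randn(16000)     # one second of audio from microphone 2
combined = beamform(ch1, ch2)    # single channel passed on to the recognizer network
print(combined.shape)            # (16000,)
```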
-
Patent number: 11727942
Abstract: Systems and methods may generate, by a computer, a voice model for an enrollee based upon a set of one or more features extracted from a first audio sample received at a first time; receive at a second time a second audio sample associated with a caller; generate a likelihood score for the second audio sample by applying the voice model associated with the enrollee on the set of features extracted from the second audio sample associated with the caller, the likelihood score indicating a likelihood that the caller is the enrollee; calibrate the likelihood score based upon a time interval from the first time to the second time and at least one of: an enrollee age at the first time and an enrollee gender; and authenticate the caller as the enrollee upon the computer determining that the likelihood score satisfies a predetermined threshold score.
Type: Grant
Filed: May 17, 2022
Date of Patent: August 15, 2023
Assignee: PINDROP SECURITY, INC.
Inventors: Elie Khoury, Matthew Garland
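A minimal sketch with a made-up linear calibration (the actual calibration function is not published in the abstract): the raw likelihood score is adjusted by the enrollment-to-call interval and the enrollee's age before the threshold comparison.

```python
# Toy calibration of a speaker-verification likelihood score; constants are invented.
def calibrate(score, years_elapsed, enroll_age=None, gender=None,
              interval_penalty=0.01, threshold=0.75):
    # gender is accepted to mirror the abstract but is unused in this toy version
    adjusted = score - interval_penalty * years_elapsed        # voices drift over time
    if enroll_age is not None and enroll_age < 18:
        adjusted -= 0.02 * years_elapsed                       # assumed faster drift for young voices
    return adjusted, adjusted >= threshold

adjusted, accepted = calibrate(score=0.80, years_elapsed=3, enroll_age=16, gender="F")
print(round(adjusted, 3), accepted)    # 0.71 False
```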
-
Patent number: 11688300
Abstract: There is provided herein a method for assessing a speech/lingual quality of a subject, the method comprising: providing a content-containing stimulus to a user; recording the user's vocal response to the stimulus and/or to instructions related thereto; processing the user's recorded vocal response to measure/extract/compute at least one linguistics (prosodic) parameter and at least one acoustic parameter; transforming the user's vocal response into a transformed text section, which is based on a processing unit's interpretation of the user's verbal response; comparing the transformed text section to a predetermined text section, which represents the user's expected response; and computing an output signal indicative of at least one speech/lingual quality of the user, based at least on data resulted from the texts comparison, the at least one measured/extracted/computed linguistic parameter and the at least one acoustic parameter.
Type: Grant
Filed: April 17, 2019
Date of Patent: June 27, 2023
Assignee: AMPLIO LEARNING TECHNOLOGIES LTD.
Inventors: Yoav Medan, Yair Shapira, Liron Mick
-
Patent number: 11676604
Abstract: Systems, methods, and computer-readable storage media for generating a quantifiable communication profile of a participant, then using that communication profile to identify fake media and/or verify media as authentic. An exemplary system can receive communications in which a participant was one of the parties communicating, perform natural language processing and statistical analyses on the received communications, and generate a participant communication profile which details how the participant communicates and prefers to be communicated with. The system can also receive media purported to be from the participant, perform natural language processing on the purported media, compare the syntax of the purported media to the participant's communication profile, and identify, based on word usage, the media as a fake.
Type: Grant
Filed: June 8, 2020
Date of Patent: June 13, 2023
Assignee: Capital One Services, LLC
Inventors: Christopher Camenares, Joseph Boayue, Lee Adcock, Ana Cruz, Nahid Farhady Ghalaty
-
Patent number: 11670291
Abstract: Systems, methods, and storage media for providing an interface for textual editing through speech are disclosed. One aspect of the disclosure may include storing a narrative textual message and presenting the narrative textual message via a client computing platform. Speech uttered by a user may be received. Speech recognition may be performed on the speech to determine words spoken by the user. The words determined from the speech may be analyzed to detect whether an editing command was spoken. The editing command may include a parameter and an edited value for the parameter. The editing command may be processed to insert the edited value for the parameter or replace an initial value to the parameter with the edited value. Information may be transmitted to the client computing platform that causes the narrative textual message to be visually presented including edits based on the editing command.
Type: Grant
Filed: February 22, 2019
Date of Patent: June 6, 2023
Assignee: Suki AI, Inc.
Inventors: Punit Soni, Karthik Rajan, Erin Palm, Matthew Pallakoff, Sanket Agarwal
-
Patent number: 11664032
Abstract: Multi-agent input coordination can be used for acoustic collaboration of multiple listening agents deployed in smart devices on a premises, improving the accuracy of identifying requests and specifying where that request should be honored, improving quality of detection, and providing better understanding of user commands and user intent throughout the premises. A processor or processors such as those in a smart speaker can identify audio requests received through at least two agents in a network and determine at which of the agents to actively process a selected audio request. The identification can make use of techniques such as location context and secondary trait analysis. The audio request can include simultaneous audio requests received through at least two agents, differing audio requests received from different requesters, or both.
Type: Grant
Filed: October 12, 2021
Date of Patent: May 30, 2023
Assignee: AT&T Intellectual Property I, L.P.
Inventors: James Pratt, Timothy Innes, Eric Zavesky, Nigel Bradley
-
Patent number: 11636860
Abstract: Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.
Type: Grant
Filed: July 21, 2020
Date of Patent: April 25, 2023
Assignee: Verint Systems Ltd.
Inventors: Alex Gorodetski, Oana Sidi, Ron Wein, Ido Shapira
-
Patent number: 11615239
Abstract: The present disclosure relates to systems for identifying instances of natural language input, determining intent classifications associated with instances of natural language input, and generating responses based on the determined intent classifications. In particular, the disclosed systems intelligently identify and group instances of natural language input based on characteristics of the user input. Additionally, the disclosed systems determine intent classifications for the instances of natural language input based on message queuing in order to delay responses to the user input in ways that increase accuracy of the responses, while retaining a conversational aspect of the ongoing chat. Moreover, in one or more embodiments, the disclosed systems generate responses utilizing natural language.
Type: Grant
Filed: March 31, 2020
Date of Patent: March 28, 2023
Assignee: Adobe Inc.
Inventors: Oliver Brdiczka, Ajinkya Kale, Piyush Chandra, Tracy King, Abhishek Gupta, Sourabh Goel, Nitin Garg, Deepika Naryani, Feroz Ahmad, Vikas Sagar
-
Patent number: 11582237
Abstract: Systems and methods for privacy-protecting hybrid cloud and premise stream processing are disclosed. In one embodiment, in an information processing device comprising at least one computer processor, a method for processing a voice communication including restricted content may include: (1) receiving from an electronic device, a customer communication; (2) identifying restricted content in the customer communication; (3) masking or marking the restricted content in the customer communication; (4) communicating the customer communication with the masked or marked restricted content to a cloud processor; (5) receiving a processed responsive communication comprising the masked or marked restricted content from the cloud processor; (6) unmasking or unmarking the restricted content in the processed responsive communication; and (7) communicating the processed responsive communication comprising the unmasked or unmarked restricted content to the electronic device.
Type: Grant
Filed: October 9, 2020
Date of Patent: February 14, 2023
Assignee: JPMORGAN CHASE BANK, N.A.
Inventors: John F. Buford, John Roussochatzakis, Anjur Sundaresan Krishnakumar, Constantine Barzacanos, Uthra J. Ramanujam
-
Patent number: 11580989
Abstract: A training method of training a speaker identification model which receives voice data as an input and outputs speaker identification information for identifying a speaker of an utterance included in the voice data is provided. The training method includes: performing voice quality conversion of first voice data of a first speaker to generate second voice data of a second speaker; and performing training of the speaker identification model using, as training data, the first voice data and the second voice data.
Type: Grant
Filed: August 18, 2020
Date of Patent: February 14, 2023
Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
Inventors: Misaki Doi, Takahiro Kamai, Kousuke Itakura
-
Patent number: 11580441
Abstract: A model training method and an apparatus thereof are provided. The method includes reading a portion of sample data in a sample full set to form a sample subset; mapping a model parameter related to the portion of sample data from a first feature component for the sample full set to a second feature component for the sample subset; and training a model based on the portion of sample data having the second feature component. A size of a copy of model parameter(s) on a sample computer can be reduced after mapping, thus greatly reducing an amount of training data and minimizing the occupancy of memory of the computer. Memory of a sample computer is used to place vectors, and store and load samples, thereby performing machine learning and training large-scale models with relatively low resource overhead under a condition of minimizing the loss of efficiency.
Type: Grant
Filed: September 28, 2018
Date of Patent: February 14, 2023
Inventors: Yi Ding, Jin Yu, Huaidong Xiong, Xu Chen
-
Patent number: 11550904
Abstract: A system includes memory, a processor in communication with the memory. The processor is programmed to define a fingerprint that includes a baseline measurement of a physical attribute of at least a separate processor during an enrollment period of the system, wherein the enrollment period includes measuring the physical attribute of the processor prior to runtime operation, receiving a runtime measurement from a sensor regarding the physical attribute of at least the separate processor during runtime, comparing the runtime measurement of the physical attribute to the fingerprint, and outputting a multi-dimensional domain image in response to the runtime measurement.
Type: Grant
Filed: August 25, 2020
Date of Patent: January 10, 2023
Inventors: Jorge Guajardo Merchan, Stefan Gehrer, Benjamin Hettwer, Samarjit Das
-
Patent number: 11545154
Abstract: Disclosed is a method and apparatus for speaker recognition. The method includes determining whether an input feature vector corresponding to a voice signal of a speaker meets a candidate similarity criterion with at least one registered data included in a registration database, selectively, based on a result of the determining of whether the input feature vector meets the candidate similarity criterion, constructing a candidate list based on the input feature vector, determining whether a candidate input feature vector, among one or more candidate input feature vectors constructed in the candidate list in the selective constructing of the candidate list, meets a registration update similarity criterion with the at least one registered data, and selectively, based on a result of the determination of whether the candidate input feature vector meets the registration update similarity criterion, updating the registration database based on the candidate input feature vector.
Type: Grant
Filed: May 12, 2020
Date of Patent: January 3, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sung-Un Park, Kyuhong Kim
-
Patent number: 11545146
Abstract: A user device configured to perform wake-up word detection in a target language.
Type: Grant
Filed: November 10, 2016
Date of Patent: January 3, 2023
Assignee: Cerence Operating Company
Inventors: Xiao-Lin Ren, Jianzhong Teng
-
Patent number: 11528248
Abstract: Systems, computer program products, and methods are described herein for intelligent multimodal classification in a distributed technical environment. The present invention is configured to retrieve one or more multimodal communications from a data repository; initiate one or more feature extraction algorithms on the one or more communication modalities to extract one or more features; generate a training dataset based on at least the one or more features extracted from the one or more communication modalities; initiate one or more machine learning algorithms on the training dataset to generate a first set of parameters; receive an unseen multimodal communication; generate an unseen dataset based on at least the unseen multimodal communication; classify, using the first set of parameters, the unseen multimodal communication into one or more class labels; and initiate an execution of one or more actions on the unseen multimodal communication based on at least the classification.
Type: Grant
Filed: June 10, 2020
Date of Patent: December 13, 2022
Assignee: BANK OF AMERICA CORPORATION
Inventors: Harikrishnan Rajeev, Vinita Gummalla
-
Patent number: 11521642
Abstract: Methods and systems include sending recording data of a call to a first server and a second server, wherein the recording data includes a first voice of a first participant of the call and a second voice of a second participant of the call; receiving, from the first server, a first emotion score representing a degree of a first emotion associated with the first voice, and a second emotion score representing a degree of a second emotion associated with the first voice; receiving, from the second server, a first sentiment score, a second sentiment score, and a third sentiment score; determining a quality score and classification data for the recording data based on the first emotion score, the second emotion score, the first sentiment score, the second sentiment score, and the third sentiment score; and outputting the quality score and the classification data for visualization of the recording data.
Type: Grant
Filed: November 3, 2020
Date of Patent: December 6, 2022
Assignee: Fidelity Information Services, LLC
Inventor: Rajiv Ramanjani
-
Patent number: 11514903
Abstract: The present technology relates to an information processing device and an information processing method that make it possible to generate interaction data at lower cost. Provided is the information processing device including a processor that generates, on the basis of interaction history information, a coupling context to be coupled to a context of interest to be noticed among a plurality of contexts. This makes it possible to generate interaction data at lower cost. The present technology is applicable as a server-side service of a voice interaction system, for example.
Type: Grant
Filed: July 25, 2018
Date of Patent: November 29, 2022
Assignee: SONY CORPORATION
Inventor: Junki Ohmura
-
Patent number: 11443750
Abstract: A method and apparatus for authenticating a user based on an utterance input includes obtaining an input audio signal based on the utterance input of the user; obtaining, from the input audio signal, at least one audio signal of an utterance section and at least one audio signal of a non-utterance section; generating environment information indicating an environment in which the utterance input is received, based on the at least one audio signal of the non-utterance section; obtaining a result of a comparison between the generated environment information and registration environment information indicating an environment in which a registration utterance input corresponding to a previously registered registration audio signal corresponding to the user is received; adjusting an authentication criterion for authenticating the user based on the result of the comparison; and authenticating the user based on the adjusted authentication criterion and the input audio signal.
Type: Grant
Filed: December 2, 2019
Date of Patent: September 13, 2022
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Keunseok Cho, Jaeyoung Roh, Donghan Jang, Jiwon Hyung, Jaewon Lee
-
Patent number: 11417327
Abstract: An electronic apparatus is provided. The electronic device includes: a storage configured to store recognition related information and misrecognition related information of a trigger word for entering a speech recognition mode; and a processor configured to identify whether or not the speech recognition mode is activated on the basis of characteristic information of a received uttered speech and the recognition related information, identify a similarity between text information of the received uttered speech and text information of the trigger word, and update at least one of the recognition related information or the misrecognition related information on the basis of whether or not the speech recognition mode is activated and the similarity.
Type: Grant
Filed: November 27, 2019
Date of Patent: August 16, 2022
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Chanhee Choi
-
Patent number: 11410645
Abstract: A user device configured to perform wake-up word detection in a target language.
Type: Grant
Filed: November 10, 2016
Date of Patent: August 9, 2022
Assignee: Cerence Operating Company
Inventors: Xiao-Lin Ren, Jianzhong Teng
-
Patent number: 11404067
Abstract: A method of operating an electronic device and an electronic device thereof are provided. The method includes receiving a first voice signal of a first user, authenticating whether the first user has authority to control the electronic device, based on the first voice signal, and determining an instruction corresponding to the first voice signal based on an authentication result and controlling the electronic device according to the instruction. The electronic device includes a receiver configured to receive a first voice signal of a first user and at least one processor configured to authenticate whether the first user has authority to control the electronic device based on the first voice signal, determine an instruction corresponding to the first voice signal, and control the electronic device according to the instruction.
Type: Grant
Filed: June 15, 2020
Date of Patent: August 2, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Anas Toma, Ahmad Abu Shariah, Hadi Jadallah
-
Patent number: 11335353
Abstract: A score indicating a likelihood that a first subject is the same as a second subject may be calibrated to compensate for aging of the first subject between samples of age-sensitive biometric characteristics. Age of the first subject obtained at a first sample time and age of the second subject obtained at a second sample time may be averaged, and an age approximation may be generated based on at least the age average and an interval between the first and second samples. The age approximation, the interval between the first and second sample times, and an obtained gender of the subject are used to calibrate the likelihood score.
Type: Grant
Filed: June 1, 2020
Date of Patent: May 17, 2022
Assignee: PINDROP SECURITY, INC.
Inventors: Elie Khoury, Matthew Garland
-
Patent number: 11328711
Abstract: A user adaptive conversation apparatus generating a talk for a conversation based on emotional and ethical states of a user. A voice recognition unit converts a talk of the user in a conversational situation into a natural language script form to generate talk information. An artificial visualization unit generates situation information by recognizing a talking situation from a video and generates intention information indicating an intention of the talk. A natural language analysis unit converts the situation information and the intention information into the natural language script form. A natural language analysis unit analyzes the talk information, the intention information, and the situation information.
Type: Grant
Filed: July 5, 2019
Date of Patent: May 10, 2022
Assignee: KOREA ELECTRONICS TECHNOLOGY INSTITUTE
Inventors: Saim Shin, Hyedong Jung, Jinyea Jang
-
Patent number: 11322155
Abstract: A method and apparatus for establishing a voiceprint model, a computer device, and a storage medium are described herein. The method includes: collecting speech acoustic features in a speech signal to form a plurality of cluster structures; calculating an average value and a standard deviation of the plurality of cluster structures and then performing coordinate transformation and activation function calculation to obtain a feature vector; and obtaining a voiceprint model based on the feature vector.
Type: Grant
Filed: July 6, 2018
Date of Patent: May 3, 2022
Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
Inventors: Yuanzhe Cai, Jianzong Wang, Ning Cheng, Jing Xiao
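A rough sketch under stated assumptions: k-means stands in for the clustering step, the per-cluster means and standard deviations are concatenated, and an assumed linear "coordinate transformation" followed by a tanh activation produces the voiceprint feature vector.

```python
# Toy voiceprint construction from frame-level acoustic features.
import numpy as np

def kmeans(x, k=4, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((x[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([x[labels == j].mean(axis=0) if (labels == j).any()
                            else centers[j] for j in range(k)])
    return labels

def voiceprint(acoustic_feats, k=4, out_dim=64, seed=0):
    labels = kmeans(acoustic_feats, k)
    stats = []
    for j in range(k):                                   # per-cluster mean and standard deviation
        cluster = acoustic_feats[labels == j]
        if len(cluster) == 0:                            # guard against an empty cluster
            cluster = acoustic_feats
        stats.append(cluster.mean(axis=0))
        stats.append(cluster.std(axis=0))
    stats = np.concatenate(stats)
    rng = np.random.default_rng(seed)
    T = rng.normal(size=(out_dim, stats.size)) * 0.1     # stand-in coordinate transformation
    return np.tanh(T @ stats)                            # voiceprint feature vector

feats = np.random.randn(300, 20)                         # frame-level acoustic features
print(voiceprint(feats).shape)                           # (64,)
```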
-
Patent number: 11233490
Abstract: A method includes detecting an input that triggers a virtual assistant (VA) on an electronic device (ED) to perform a task that includes outputting audio content through a speaker associated with the ED. The method includes identifying a type of the audio content to be outputted through the speaker. The method includes determining whether a registered user of the ED is present in proximity to the ED. Each registered user is associated with a unique user identifier. The method includes, in response to determining that no registered user is present in proximity to the ED, outputting the audio content via the speaker at a current volume level of the ED. The method includes, in response to determining that a registered user is in proximity to the ED, outputting the audio content at a selected, preferred volume level based on pre-determined or pre-established volume preference settings of the registered user.
Type: Grant
Filed: November 21, 2019
Date of Patent: January 25, 2022
Assignee: Motorola Mobility LLC
Inventors: Wendel Malta De Assis, Amit Kumar Agrawal, Lucie Corinne Evelyne Papelier