Update Patterns Patents (Class 704/244)
-
Patent number: 12198677
Abstract: A method of end-to-end speaker diarization (EESD) using neural speaker clustering, performed by at least one processor, is provided. The method includes generating a set of speech labels corresponding to a set of speakers based on an input stream. The speech labels indicate whether dialogue of a speaker is speech or non-speech. The method further includes generating dialogue based on the set of speakers, extracting speaker embeddings from the dialogue, mapping the speaker embeddings to a cluster identification (ID), an overlapped speech value, or a non-speech value based on a neural network, and outputting EESD labels based on the mapping.
Type: Grant
Filed: May 27, 2022
Date of Patent: January 14, 2025
Assignee: TENCENT AMERICA LLC
Inventors: Chunlei Zhang, Dong Yu
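The mapping step this abstract describes (speaker embeddings to a cluster ID, an overlapped-speech value, or a non-speech value) is performed by a neural network in the patent; the sketch below substitutes a much simpler nearest-centroid rule so the label space is concrete. The centroids, thresholds, and sentinel values are illustrative assumptions, not the patented method.

```python
import numpy as np

NON_SPEECH, OVERLAP = -1, -2   # sentinel labels alongside cluster IDs 0..K-1

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def map_embeddings(embeddings, centroids, speech_thr=0.3, overlap_margin=0.05):
    """Map each speaker embedding to a cluster ID, OVERLAP, or NON_SPEECH.

    embeddings: (N, D) per-segment speaker embeddings.
    centroids:  (K, D) cluster centroids (assumed to be pre-computed).
    A weak best similarity is read as non-speech, a near-tie between the top
    two centroids as overlapped speech; both thresholds are assumptions.
    """
    labels = []
    for emb in embeddings:
        sims = np.array([cosine(emb, c) for c in centroids])
        top = np.sort(sims)[::-1]
        best = top[0]
        second = top[1] if len(top) > 1 else -1.0
        if best < speech_thr:
            labels.append(NON_SPEECH)
        elif best - second < overlap_margin:
            labels.append(OVERLAP)
        else:
            labels.append(int(np.argmax(sims)))
    return labels

rng = np.random.default_rng(0)
centroids = rng.normal(size=(2, 16))
segments = rng.normal(size=(5, 16))
print(map_embeddings(segments, centroids))
```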
-
Patent number: 12147511
Abstract: A system and method for controlling the display of sensitive information in a home work environment is provided. The method comprises: providing a work computing device including a first processor for executing a web-based computer application accessible over a network for displaying information contained in at least one data field; and providing an administrative computing device in communication with the work computing device including a second processor configured to execute computer executable instructions for designating the data field as either restricted or unrestricted. When the data field is designated as restricted, the computer executable instructions mask the information contained in the data field so that the information is not viewable on a display of the work computing device. The system and method also provide for the selective unmasking of the masked information using the work computing device, with data associated with the unmasking being communicated to the administrative computing device.
Type: Grant
Filed: November 23, 2021
Date of Patent: November 19, 2024
Assignee: Sutherland Global Services Inc.
Inventors: Manikhandan Venugopal, Shaikh Ashif, Ganesan Ramalingam, Amin A. Sarfraz, Kumar T. Suresh
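As a rough illustration of the restricted/unrestricted field policy described above, the sketch below masks designated fields before display and allows selective unmasking. The field names, policy set, and mask character are assumptions for the example; the patented system also reports unmasking events back to the administrative device, which is only noted in a comment here.

```python
RESTRICTED_FIELDS = {"ssn", "card_number"}   # designated by the admin device (assumed names)

def mask_record(record, unmasked=frozenset()):
    """Return a copy of the record with restricted fields masked for display.

    Fields listed in `unmasked` have been selectively revealed; in the patented
    flow that unmasking event would also be reported to the administrative device.
    """
    masked = {}
    for field, value in record.items():
        if field in RESTRICTED_FIELDS and field not in unmasked:
            masked[field] = "*" * len(str(value))   # not viewable on the work device
        else:
            masked[field] = value
    return masked

print(mask_record({"name": "Jane Doe", "ssn": "123-45-6789"}))
print(mask_record({"name": "Jane Doe", "ssn": "123-45-6789"}, unmasked={"ssn"}))
```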
-
Patent number: 12106755
Abstract: Techniques are described herein for warm word arbitration between automated assistant devices. A method includes: determining that warm word arbitration is to be initiated between a first assistant device and one or more additional assistant devices, including a second assistant device; broadcasting, by the first assistant device, to the one or more additional assistant devices, an active set of warm words for the first assistant device; for each of the one or more additional assistant devices, receiving, from the additional assistant device, an active set of warm words for the additional assistant device; identifying a matching warm word included in the active set of warm words for the first assistant device and included in the active set of warm words for the second assistant device; and enabling or disabling detection of the matching warm word by the first assistant device, in response to identifying the matching warm word.
Type: Grant
Filed: January 11, 2022
Date of Patent: October 1, 2024
Assignee: GOOGLE LLC
Inventors: Matthew Sharifi, Victor Carbune
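The arbitration logic lends itself to a small set-based sketch: each device broadcasts its active warm-word set, matches are found by intersection, and detection of a matching warm word stays enabled on only one device. The `first_wins` flag stands in for whatever arbitration criterion the devices actually apply and is an assumption of this example.

```python
def arbitrate_warm_words(first_active, others_active, first_wins=True):
    """Return the warm words the first device should keep detecting.

    first_active: set of warm words active on the first assistant device.
    others_active: list of warm-word sets broadcast by the other devices.
    For every matching warm word, detection stays enabled on only one device;
    `first_wins` stands in for the real arbitration criterion (assumed).
    """
    keep = set(first_active)
    for other in others_active:
        for word in first_active & other:      # matching warm word on both devices
            if not first_wins:
                keep.discard(word)             # let the other device handle it
    return keep

print(arbitrate_warm_words({"stop", "volume up"}, [{"stop", "next"}], first_wins=False))
# {'volume up'} -- "stop" is disabled here and left to the second device
```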
-
Patent number: 12032717
Abstract: One example method includes transcribing a portion of the audio component to create a transcription file that includes text, searching the text of the transcription file and identifying information in the text that may include personal information, defining a textual window that includes the information, evaluating the text in the textual window to identify personal information, and masking the personal information in the audio component of the recording. The personal information may be masked with information of a non-personal nature.
Type: Grant
Filed: March 27, 2020
Date of Patent: July 9, 2024
Assignee: EMC IP HOLDING COMPANY LLC
Inventors: Idan Richman Goshen, Avitan Gefen
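A minimal sketch of the textual-window idea: hint words open a window of surrounding words, digit runs inside the window are flagged as probable personal information, and the flagged positions would then be masked in the corresponding audio. The hint list, window size, and regexes are assumptions, not the patented detector.

```python
import re

PII_HINTS = re.compile(r"\b(account|ssn|card)\b", re.IGNORECASE)  # assumed hint terms
DIGIT_RUN = re.compile(r"\d{4,}")

def find_pii_word_indices(words, window=6):
    """Flag word positions inside a textual window around each hint word.

    Digit runs near a hint like "account" are treated as personal information.
    In the patented flow these indices would be aligned to time offsets so the
    matching stretch of the audio component could be masked.
    """
    flagged = set()
    for i, word in enumerate(words):
        if PII_HINTS.search(word):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            flagged.update(j for j in range(lo, hi) if DIGIT_RUN.search(words[j]))
    return flagged

words = "my account number is 48213377 thank you".split()
pii = find_pii_word_indices(words)
print(" ".join("<masked>" if i in pii else w for i, w in enumerate(words)))
```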
-
Patent number: 11996115
Abstract: A sound processing apparatus includes a feature value extractor configured to perform a Fourier transform and then a cepstral analysis of a sound signal and to extract, as feature values of the sound signal, values including frequency components obtained by the Fourier transform of the sound signal and a value based on a result obtained by the cepstral analysis of the sound signal.
Type: Grant
Filed: December 18, 2019
Date of Patent: May 28, 2024
Assignee: NEC CORPORATION
Inventor: Mitsuru Sendoda
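The feature recipe named in the abstract (Fourier transform followed by cepstral analysis) can be sketched with NumPy: FFT magnitudes serve as the frequency components, and the peak of the real cepstrum supplies one value derived from the cepstral analysis. Window choice, FFT size, and the quefrency-peak feature are illustrative assumptions.

```python
import numpy as np

def extract_features(frame, n_fft=512):
    """Toy extractor: FFT magnitudes plus one value from a cepstral analysis.

    Follows the general recipe in the abstract (Fourier transform, then
    cepstral analysis). Windowing, the FFT size, and returning the dominant
    quefrency peak as the cepstrum-derived value are illustrative choices.
    """
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed, n_fft))      # frequency components
    log_spectrum = np.log(spectrum + 1e-10)
    cepstrum = np.fft.irfft(log_spectrum)                # real cepstrum
    # dominant quefrency peak, often used as a rough pitch-period estimate
    peak_quefrency = int(np.argmax(cepstrum[20:n_fft // 2])) + 20
    return spectrum, peak_quefrency

rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 110 * np.arange(400) / 8000) + 0.05 * rng.normal(size=400)
spectrum, quefrency = extract_features(frame)
print(spectrum.shape, quefrency)   # quefrency near 8000/110 = ~73 samples
```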
-
Patent number: 11947702
Abstract: In aspects of personal content managed during device screen recording, a wireless device has a display screen to display digital image content, and a screen recording session captures the digital image content and audio data. The wireless device implements a content control module that determines the screen recording session captures personal content associated with a user of the wireless device, the personal content being captured as part of the digital image content or the audio data. The content control module can generate a user screen recording having a user authorization access level, the user screen recording including the digital image content and/or the audio data, as well as the personal content unaltered for user review. The content control module can also generate a shareable screen recording having a share authorization access level, the shareable screen recording including the digital image content and/or the audio data with the personal content obfuscated.
Type: Grant
Filed: January 21, 2021
Date of Patent: April 2, 2024
Assignee: Motorola Mobility LLC
Inventors: Amit Kumar Agrawal, Gautham Prabhakar Natakala, Shaung Wu
-
Patent number: 11886542
Abstract: Systems and processes for prediction using a generative adversarial network and distillation technology are provided. For example, an input is received at a first portion of a language model. A first output distribution is obtained, based on the input, from the first portion of the language model. Using a first training model, the language model is adjusted based on the first output distribution. The first output distribution is received at a second portion of the language model. A first representation of the input is obtained, based on the first output distribution, from the second portion of the language model. The language model is adjusted, using a second training model, based on the first representation of the input. Using the adjusted language model, an output is provided based on a received user input.
Type: Grant
Filed: May 20, 2021
Date of Patent: January 30, 2024
Assignee: Apple Inc.
Inventor: Jerome R. Bellegarda
-
Patent number: 11727918
Abstract: In some implementations, a set of audio recordings capturing utterances of a user is received by a first speech-enabled device. Based on the set of audio recordings, the first speech-enabled device generates a first user voice recognition model for use in subsequently recognizing a voice of the user at the first speech-enabled device. Further, a particular user account associated with the first voice recognition model is determined, and an indication is received that a second speech-enabled device is associated with the particular user account. In response to receiving the indication, the set of audio recordings is provided to the second speech-enabled device. Based on the set of audio recordings, the second speech-enabled device generates a second user voice recognition model for use in subsequently recognizing the voice of the user at the second speech-enabled device.
Type: Grant
Filed: July 14, 2021
Date of Patent: August 15, 2023
Assignee: GOOGLE LLC
Inventors: Ignacio Lopez Moreno, Diego Melendo Casado
-
Patent number: 11703939
Abstract: The present disclosure provides a signal processing device, including a signal collector, an instruction converter, and a processor. Examples of the present disclosure may achieve precise recognition of users' intentions and bring operational conveniences to users.
Type: Grant
Filed: October 30, 2018
Date of Patent: July 18, 2023
Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD
Inventors: Tianshi Chen, Shuai Hu, Shengyuan Zhou, Xishan Zhang
-
Patent number: 11646038
Abstract: A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel may include receiving audio stream data of the audio stream with speech from a speaker to be authenticated speaking with a second speaker. A voiceprint may be generated for each data chunk in the audio stream data divided into a plurality of data chunks. The voiceprint for each data chunk may be assessed as to whether the voiceprint has speech belonging to the speaker to be authenticated or to the second speaker using representative voiceprints of both speakers. An accumulated voiceprint may be generated using the verified data chunks with speech of the speaker to be authenticated. The accumulated voiceprint may be compared to the reference voiceprint of the speaker to be authenticated for authenticating the speaker speaking with the second speaker over the audio channel.
Type: Grant
Filed: November 17, 2020
Date of Patent: May 9, 2023
Assignee: NICE LTD.
Inventors: Alon Menahem Shoa, Roman Frenkel, Matan Keret
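A toy version of the chunk-wise flow described above, assuming per-chunk voiceprints are already available as vectors: each chunk is attributed to whichever representative voiceprint it is closer to, the verified chunks are averaged into an accumulated voiceprint, and that accumulated voiceprint is scored against the reference. The cosine metric, plain averaging, and threshold are assumptions, not the patented algorithm.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def authenticate_stream(chunk_prints, ref_target, ref_other, accept_thr=0.7):
    """Chunk-wise separation followed by authentication on an accumulated voiceprint.

    chunk_prints: per-chunk voiceprint vectors from the two-speaker stream.
    ref_target / ref_other: representative voiceprints of the speaker to be
    authenticated and the second speaker. Cosine scoring, plain averaging of
    the verified chunks, and the acceptance threshold are all assumptions.
    """
    verified = [c for c in chunk_prints
                if cosine(c, ref_target) > cosine(c, ref_other)]
    if not verified:
        return False, 0.0
    accumulated = np.mean(verified, axis=0)          # accumulated voiceprint
    score = cosine(accumulated, ref_target)          # compare against the reference
    return score >= accept_thr, score

rng = np.random.default_rng(1)
target, other = rng.normal(size=32), rng.normal(size=32)
chunks = [target + 0.3 * rng.normal(size=32) for _ in range(4)] + \
         [other + 0.3 * rng.normal(size=32) for _ in range(4)]
print(authenticate_stream(chunks, target, other))
```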
-
Patent number: 11645515
Abstract: Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of the last hidden layer, and segment those activations by the label of corresponding training data. Clustering is applied to the retained activations of each segment, and a cluster assessment is conducted for each cluster associated with each label to distinguish clusters with potentially poisoned activations from clusters populated with legitimate activations. The assessment includes executing a set of analyses and integrating the results of the analyses into a determination as to whether a training data set is poisonous based on determining if resultant activation clusters are poisoned.
Type: Grant
Filed: September 16, 2019
Date of Patent: May 9, 2023
Assignee: International Business Machines Corporation
Inventors: Nathalie Baracaldo Angel, Bryant Chen, Biplav Srivastava, Heiko H. Ludwig
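One of the simpler cluster assessments the abstract alludes to can be sketched as follows, assuming scikit-learn is available: the last-hidden-layer activations for each label are split into two clusters, and a label whose minority cluster is unusually small is flagged as potentially poisoned. The two-cluster split and the size threshold are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_suspicious_labels(activations_by_label, small_cluster_frac=0.35):
    """Cluster last-hidden-layer activations per label and flag lopsided splits.

    activations_by_label: {label: (N, D) array of retained activations}.
    Splitting each label's activations into two clusters and treating an
    unusually small cluster as potentially poisoned is one simple assessment;
    the 2-cluster split and the fraction threshold are assumed parameters.
    """
    suspicious = {}
    for label, acts in activations_by_label.items():
        assignment = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(acts)
        frac = float(np.mean(assignment == 0))
        suspicious[label] = min(frac, 1.0 - frac) < small_cluster_frac
    return suspicious

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(90, 8))
poisoned = np.vstack([rng.normal(0.0, 1.0, size=(80, 8)),
                      rng.normal(6.0, 0.5, size=(10, 8))])
print(flag_suspicious_labels({"clean_label": clean, "poisoned_label": poisoned}))
```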
-
Patent number: 11586585
Abstract: Systems and methods described herein facilitate the search and presentation of historical data for wireless network usage and provide a mechanism for high-redundancy, low-latency record retrieval of data from large data sets. Network devices divide data for a historical data store into separate record type groups, store metadata for each record type in an application database, partition each record type group by date in a historical record database that is different from the application database, and form, within each date partition, buckets of common hash values of a key parameter from each record. When a user performs a query, the network devices generate a record-specific query form based on the record type metadata to obtain lookup parameters; generate a search hash value using a key parameter from the lookup parameters; and generate a query expression based on the record type, lookup parameters, and the search hash value.
Type: Grant
Filed: January 6, 2021
Date of Patent: February 21, 2023
Assignee: Verizon Patent and Licensing Inc.
Inventors: David C. Eads, Robert Glenn Capps, Jr., Edward M. Foltz, Hema G. Chhatpar
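The partition-and-bucket layout can be sketched in a few lines: records are grouped by date partition, and within a partition they are placed in a bucket derived from a stable hash of the key parameter, so a lookup touches one partition and one bucket. The bucket count, the `msisdn` key name, and the in-memory dict stand-in for the historical record database are assumptions.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 64   # assumed bucket count

def bucket_of(key_value):
    """Stable hash of the key parameter, mapped to a bucket number."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_BUCKETS

# date partition -> bucket -> records (in-memory stand-in for the historical store)
store = defaultdict(lambda: defaultdict(list))

def insert(record, date, key_param):
    store[date][bucket_of(record[key_param])].append(record)

def lookup(date, key_param, value):
    """A query touches a single date partition and a single hash bucket."""
    return [r for r in store[date][bucket_of(value)] if r.get(key_param) == value]

insert({"msisdn": "15551234567", "bytes": 1200}, "2020-06-01", "msisdn")
print(lookup("2020-06-01", "msisdn", "15551234567"))
```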
-
Patent number: 11580455
Abstract: Techniques and solutions are described for facilitating the use of machine learning techniques. In some cases, filters can be defined for multiple segments of a training data set. Model segments corresponding to respective segments can be trained using an appropriate subset of the training data set. When a request for a machine learning result is made, filter criteria for the request can be determined and an appropriate model segment can be selected and used for processing the request. One or more hyperparameter values can be defined for a machine learning scenario. When a machine learning scenario is selected for execution, the one or more hyperparameter values for the machine learning scenario can be used to configure a machine learning algorithm used by the machine learning scenario.
Type: Grant
Filed: April 1, 2020
Date of Patent: February 14, 2023
Assignee: SAP SE
Inventor: Siar Sarferaz
-
Patent number: 11550873
Abstract: A method includes: generating a plurality of individuals of a current generation in accordance with a plurality of individuals of a previous generation to acquire values of an objective function for individuals each representing a variable by evolutionary computation; calculating, for each of partial individuals of the plurality of individuals of the current generation generated by the generating processing, a first value of the objective function by a predetermined method; approximately calculating, for each of the plurality of individuals of the current generation, a second value of the objective function with lower precision than the predetermined method; computing a fitness difference representing a difference between the plurality of individuals of the current generation in accordance with the first value or the second value; and controlling the precision of the approximate calculation based on the fitness difference and a precision difference between the first value and the second value.
Type: Grant
Filed: March 27, 2020
Date of Patent: January 10, 2023
Assignee: FUJITSU LIMITED
Inventor: Yukito Tsunoda
-
Patent number: 11501767
Abstract: The invention relates to a method for operating a motor vehicle having an operating device that includes a speech recognition and language determination device. In a first operating mode with a first operating language, a voice input of a user of the motor vehicle is recognized and a check is made as to whether the language of the voice input corresponds to the first operating language. Depending on a result of the check, a confidence value is assigned to the voice input, describing the probability that the language of the voice input is a second operating language. Depending on the assigned confidence value, a query signal is generated, describing a request to the user, understandable in the second operating language, to indicate the operating mode or the operating language to be set. In response to a received operating signal, the indicated operating mode or operating language is set.
Type: Grant
Filed: November 28, 2017
Date of Patent: November 15, 2022
Assignee: Audi AG
Inventors: Christian Al Haddad, Stefan Maiwald
-
Patent number: 11367443
Abstract: Disclosed is an electronic device and a method for controlling the electronic device. The electronic device includes: a microphone, a communication interface, a memory for storing at least one instruction, and a processor configured to execute the at least one instruction to: determine whether a user is present around the electronic device based on voice data of the user obtained via the microphone, determine a device group including the electronic device and at least one other electronic device present around the electronic device, identify at least one device from the device group as a hub device to perform a voice recognition, and based on identifying the electronic device as the hub device, obtain, through the communication interface, a voice data of the user from one or more of the at least one other electronic device, and perform the voice recognition.
Type: Grant
Filed: December 16, 2019
Date of Patent: June 21, 2022
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Sangwon Ahn, Seongil Hahm, Jeongin Kim, Seongho Byeon, Jaesick Shin, Junsik Jeong
-
Patent number: 11335329
Abstract: Performance of Automatic Speech Recognition (ASR) that is robust against real-world noises and channel distortions is critical. Embodiments herein provide a method and system for generating synthetic multi-conditioned data sets for additive noise and channel distortion for training multi-conditioned acoustic models for robust ASR. The method provides a generative noise model that generates a plurality of types of noise signals, modeling additive noise as a weighted linear combination of a plurality of noise basis signals and channel distortion based on estimated channel responses. The generative noise model is a parametric model in which the basis function selection, the number of basis functions to be combined linearly, and the weights applied to the combinations are tunable, thereby enabling generation of a wide variety of noise signals. Further, the noise signals are added to a set of training speech utterances under a set of constraints, providing the multi-conditioned data sets that imitate real-world effects.
Type: Grant
Filed: March 24, 2020
Date of Patent: May 17, 2022
Assignee: Tata Consultancy Services Limited
Inventors: Meetkumar Hemakshu Soni, Sonal Joshi, Ashish Panda
-
Patent number: 11314942
Abstract: A computer-implemented method for providing agent-assisted transcriptions of user utterances. A user utterance is received in response to a prompt provided to the user at a remote client device. An automatic transcription is generated from the utterance using a language model based upon an application or context, and presented to a human agent. The agent reviews the transcription and may replace at least a portion of the transcription with a corrected transcription. As the agent inputs the corrected transcription, accelerants comprising suggested text to be inputted are presented to the agent. The accelerants may be determined based upon an agent input, an application or context of the transcription, the portion of the transcription being replaced, or any combination thereof. In some cases, the user provides textual input, for which the agent transcribes an associated intent with the aid of one or more accelerants.
Type: Grant
Filed: March 20, 2020
Date of Patent: April 26, 2022
Assignee: Interactions LLC
Inventors: Ethan Selfridge, Michael Johnston, Robert Lifgren, James Dreher, John Leonard
-
Patent number: 11302325
Abstract: A chatbot learns a person's related "intents" when asking for information and thereafter, in response to an initial query, which the chatbot answers, the chatbot generates a secondary dialogue, either providing the person with additional information or inquiring as to whether the person wishes to know more about a subject. The chatbot may use an external trigger such as time, event, etc. and automatically generate a query or give information to the person without any initial query from the person.
Type: Grant
Filed: April 8, 2020
Date of Patent: April 12, 2022
Assignee: Sony Interactive Entertainment Inc.
Inventors: Marie Kitajima, Masanori Omote
-
Patent number: 11227606
Abstract: A compact, self-authenticating, and speaker-verifiable record of an audio communication involving one or more persons comprises a record, encoded on a non-transitory, computer-readable medium, that consists essentially of: a voiceprint for each person whose voice is encoded in the record; a plurality of transcription records, where each transcription record consists essentially of a computer-generated speech-to-text decoding of an utterance and voiceprint associating information that associates a speaker of the utterance with one of the voiceprints stored in the record; and self-authenticating information sufficient to determine whether any of the information encoded in the communication record has been altered.
Type: Grant
Filed: September 30, 2019
Date of Patent: January 18, 2022
Assignee: Medallia, Inc.
Inventors: Wayne Ramprashad, David Garrod
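One way to realize "self-authenticating information" is a digest over a canonical serialization of the voiceprints and transcription records, recomputed at verification time; the patent does not mandate this particular construction, and the identifiers below are made up for the example.

```python
import hashlib
import json

def record_digest(voiceprints, transcription_records):
    """SHA-256 over a canonical serialization of the record contents."""
    canonical = json.dumps({"voiceprints": voiceprints,
                            "transcriptions": transcription_records},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

record = {
    "voiceprints": ["vp-agent-7f3a", "vp-caller-91bc"],            # assumed encodings
    "transcriptions": [{"speaker": "vp-caller-91bc", "text": "hello"},
                       {"speaker": "vp-agent-7f3a", "text": "how can I help"}],
}
record["auth"] = record_digest(record["voiceprints"], record["transcriptions"])

# verification: recompute and compare; any alteration changes the digest
print(record["auth"] == record_digest(record["voiceprints"], record["transcriptions"]))  # True
record["transcriptions"][0]["text"] = "goodbye"                     # simulate tampering
print(record["auth"] == record_digest(record["voiceprints"], record["transcriptions"]))  # False
```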
-
Patent number: 11151981
Abstract: A computer implemented method, apparatus, and computer program product for a sound system. Speech recognition is performed on input audio data comprising speech input to a sound system. Speech recognition is additionally performed on at least one instance of output audio data comprising speech reproduced by one or more audio speakers of the sound system. A difference between a result of speech recognition performed on the input audio data and a result of speech recognition performed on an instance of corresponding output audio data is determined. The quality of the reproduced speech is determined as unsatisfactory when the difference is greater than or equal to a threshold. A corrective action may be performed, to improve the quality of the speech reproduced by the sound system, if it is determined that the speech quality of the reproduced sound is unsatisfactory.
Type: Grant
Filed: October 10, 2019
Date of Patent: October 19, 2021
Assignee: International Business Machines Corporation
Inventors: Alexander John Naylor-Teece, Andrew James Dunnings, Oliver Paul Masters
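The comparison step reduces to measuring how far the two recognition results diverge and testing the difference against a threshold. The sketch below uses a word-level difference based on difflib, with an assumed threshold; the patent does not specify this particular metric.

```python
from difflib import SequenceMatcher

def word_difference(input_result, output_result):
    """Fraction of word-level mismatch between the two recognition results."""
    a, b = input_result.lower().split(), output_result.lower().split()
    return 1.0 - SequenceMatcher(None, a, b).ratio()   # 0.0 means identical

def reproduction_ok(input_result, output_result, threshold=0.25):
    """Quality is unsatisfactory when the difference reaches the threshold."""
    return word_difference(input_result, output_result) < threshold

print(reproduction_ok("please lower the volume", "please lower the volume"))  # True
print(reproduction_ok("please lower the volume", "fleas mower the column"))   # False
```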
-
Patent number: 11145314
Abstract: Embodiments of the present disclosure provide a method and apparatus for voice identification, a device and a computer readable storage medium. The method may include: for an inputted voice signal, obtaining a first piece of decoded acoustic information by a first acoustic model and obtaining a second piece of decoded acoustic information by a second acoustic model, where the second acoustic model is generated by joint modeling of an acoustic model and a language model. The method may further include determining a first group of candidate identification results based on the first piece of decoded acoustic information, determining a second group of candidate identification results based on the second piece of decoded acoustic information, and then determining a final identification result for the voice signal based on the first group of candidate identification results and the second group of candidate identification results.
Type: Grant
Filed: March 6, 2020
Date of Patent: October 12, 2021
Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Inventors: Xingyuan Peng, Junyao Shao, Lei Jia
-
Patent number: 11145309
Abstract: An apparatus includes processor(s) to: use an acoustic model to generate a first set of probabilities of speech sounds uttered within speech audio; derive at least a first candidate word most likely spoken in the speech audio using the first set; analyze the first set to derive a degree of uncertainty therefor; compare the degree of uncertainty to a threshold; in response to at least the degree of uncertainty being less than the threshold, select the first candidate word as a next word most likely spoken in the speech audio; in response to at least the degree of uncertainty being greater than the threshold, select, as the next word most likely spoken in the speech audio, a second candidate word indicated as being most likely spoken based on a second set of probabilities generated by a language model; and add the next word most likely spoken to a transcript.
Type: Grant
Filed: March 18, 2021
Date of Patent: October 12, 2021
Assignee: SAS INSTITUTE INC.
Inventor: Xu Yang
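A compact sketch of the decision rule: the acoustic model's word distribution is reduced to a Shannon-entropy "degree of uncertainty", and the acoustic candidate is kept when the entropy is below a threshold, otherwise the language-model candidate is used. The entropy measure and threshold value are assumptions for illustration.

```python
import numpy as np

def pick_next_word(acoustic_probs, lm_probs, entropy_threshold=1.5):
    """Choose between the acoustic-model and language-model candidates.

    Shannon entropy of the acoustic distribution serves as the "degree of
    uncertainty"; the threshold value is an illustrative assumption.
    """
    p = np.array(list(acoustic_probs.values()), dtype=float)
    p = p / p.sum()
    entropy = float(-(p * np.log2(p + 1e-12)).sum())
    if entropy < entropy_threshold:
        return max(acoustic_probs, key=acoustic_probs.get)   # confident: keep acoustic word
    return max(lm_probs, key=lm_probs.get)                   # uncertain: defer to the LM

confident = {"seven": 0.90, "heaven": 0.05, "eleven": 0.05}
uncertain = {"seven": 0.35, "heaven": 0.33, "eleven": 0.32}
lm = {"eleven": 0.6, "seven": 0.3, "heaven": 0.1}
print(pick_next_word(confident, lm), pick_next_word(uncertain, lm))  # seven eleven
```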
-
Patent number: 11087741
Abstract: Embodiments of the present disclosure include methods, apparatuses, devices, and computer readable storage mediums for processing far-field environmental noise. The method can comprise processing collected far-field environmental noise into a noise segment in a predetermined format. The method can further comprise establishing a far-field voice recognition model based on the noise segment and a near-field voice segment; and determining validity of the noise segment based on the far-field voice recognition model. The solution of the present disclosure can optimize the anti-noise performance of the far-field voice recognition model by differentiated training of noise in different user scenarios of a far-field voice recognition product.
Type: Grant
Filed: January 22, 2019
Date of Patent: August 10, 2021
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Jianwei Sun, Chao Li, Xin Li, Weixin Zhu, Ming Wen
-
Patent number: 11087743
Abstract: In some implementations, an utterance is determined to include a particular user speaking a hotword based at least on a first set of samples of the particular user speaking the hotword. In response to determining that an utterance includes a particular user speaking a hotword based at least on a first set of samples of the particular user speaking the hotword, at least a portion of the utterance is stored as a new sample. A second set of samples of the particular user speaking the utterance is obtained, where the second set of samples includes the new sample and less than all the samples in the first set of samples. A second utterance is determined to include the particular user speaking the hotword based at least on the second set of samples of the user speaking the hotword.
Type: Grant
Filed: November 13, 2019
Date of Patent: August 10, 2021
Assignee: GOOGLE LLC
Inventors: Ignacio Lopez Moreno, Diego Melendo Casado
-
Patent number: 11074909
Abstract: Provided are a device for recognizing a speech input including a named entity from a user and an operating method thereof. The device is configured to: generate a weighted finite state transducer model by using a vocabulary list including a plurality of named entities; obtain a first string from a speech input received from a user, by using a first decoding model; obtain a second string by using a second decoding model that uses the weighted finite state transducer model, the second string including a word sequence, which corresponds to at least one named entity, and an unrecognized word sequence not identified as a named entity; and output a text corresponding to the speech input by substituting the unrecognized word sequence of the second string with a word sequence included in the first string.
Type: Grant
Filed: June 26, 2020
Date of Patent: July 27, 2021
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Kyungmin Lee, Youngho Han, Sangyoon Kim, Donguk Jung, Aahwan Kudumula, Changwoo Han
-
Patent number: 11069337
Abstract: A voice-content control device includes a voice classifying unit configured to analyze a voice spoken by a user and acquired by a voice acquiring unit to classify the voice as either a first voice or a second voice, a process executing unit configured to analyze the acquired voice to execute processing required by the user, and a voice-content generating unit configured to generate, based on content of the executed processing, an output sentence that is text data for a voice to be output to the user. The voice-content generating unit is further configured to generate a first output sentence as the output sentence when the analyzed voice has been classified as the first voice, and to generate, as the output sentence, a second output sentence in which information is omitted as compared to the first output sentence when the analyzed voice has been classified as the second voice.
Type: Grant
Filed: March 4, 2019
Date of Patent: July 20, 2021
Assignee: JVC KENWOOD Corporation
Inventor: Tatsumi Naganuma
-
Patent number: 11043223
Abstract: Technologies related to voiceprint recognition model construction are disclosed. In an implementation, a first voice input from a user is received. One or more predetermined keywords from the first voice input are detected. One or more voice segments corresponding to the one or more predetermined keywords are recorded. The voiceprint recognition model is trained based on the one or more voice segments. A second voice input is received from a user, and the user's identity is verified based on the second voice input using the voiceprint recognition model.
Type: Grant
Filed: June 19, 2020
Date of Patent: June 22, 2021
Assignee: Advanced New Technologies Co., Ltd.
Inventor: Qing Ling
-
Patent number: 11021113
Abstract: A camera module includes a camera imaging a region outside a rear end portion of a vehicle and a storage storing first and second dictionary information corresponding to a first area and a second area. When the camera takes an image of a pedestrian and a detected latitude and longitude correspond to the first area, the camera module recognizes the image of the pedestrian based on the first dictionary information, outputs a first vehicle control signal based on a recognition result, and outputs a status that the first dictionary information is used. When the camera takes an image of a pedestrian and a detected latitude and longitude correspond to the second area, the camera module recognizes the image of the pedestrian based on the second dictionary information, outputs a second vehicle control signal based on a recognition result, and outputs a status that the second dictionary information is used.
Type: Grant
Filed: March 3, 2020
Date of Patent: June 1, 2021
Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
Inventors: Teruo Sakamoto, Sangwon Kim
-
Patent number: 11024287
Abstract: A method, a device, and a storage medium for correcting an error in a speech recognition result are provided. The method includes: performing phonetic notation on a speech recognition result to be corrected, to obtain a pinyin corresponding to the speech recognition result; obtaining one or more candidate texts according to the pinyin, and determining an optimum candidate text from the one or more candidate texts; judging whether the optimum candidate text satisfies a preset condition; and determining the optimum candidate text as a corrected result of the speech recognition result to be corrected in response to satisfying the preset condition.
Type: Grant
Filed: January 25, 2017
Date of Patent: June 1, 2021
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventor: Shujie Yao
-
Patent number: 11017783
Abstract: A device includes a processor configured to determine a feature vector based on an utterance and to determine a first embedding vector by processing the feature vector using a trained embedding network. The processor is configured to determine a first distance metric based on distances between the first embedding vector and each embedding vector of a speaker template. The processor is configured to determine, based on the first distance metric, that the utterance is verified to be from a particular user. The processor is configured to, based on a comparison of a first particular distance metric associated with the first embedding vector to a second distance metric associated with a first test embedding vector of the speaker template, generate an updated speaker template by adding the first embedding vector as a second test embedding vector and removing the first test embedding vector from test embedding vectors of the speaker template.
Type: Grant
Filed: March 8, 2019
Date of Patent: May 25, 2021
Assignee: QUALCOMM Incorporated
Inventors: Sunkuk Moon, Bicheng Jiang, Erik Visser
-
Patent number: 10957322
Abstract: Provided is a speech processing apparatus including a word string estimation unit that estimates a word string equivalent to input speech among word strings included in dictionary data, and a calculation unit that calculates, for an element part constituting the word string estimated by the word string estimation unit, a certainty factor that a content of the element part is equivalent to a content of a corresponding part in the input speech.
Type: Grant
Filed: May 31, 2017
Date of Patent: March 23, 2021
Assignee: SONY CORPORATION
Inventors: Emiru Tsunoo, Toshiyuki Kumakura
-
Patent number: 10929606
Abstract: A method for intelligent assistance includes identifying one or more insertion points within an input comprising text for providing additional information. A follow-up expression that includes at least a portion of the input and the additional information at the one or more insertion points is generated for clarifying or supplementing meaning of the input.
Type: Grant
Filed: February 23, 2018
Date of Patent: February 23, 2021
Assignee: Samsung Electronics Co., Ltd.
Inventors: Justin C. Martineau, Avik Ray, Hongxia Jin
-
Patent number: 10909468
Abstract: In one embodiment, a set of training data consisting of inliers may be obtained. A supervised classification model may be trained using the set of training data to identify outliers. The supervised classification model may be applied to generate an anomaly score for a data point. It may be determined whether the data point is an outlier based, at least in part, upon the anomaly score.
Type: Grant
Filed: February 27, 2015
Date of Patent: February 2, 2021
Assignee: Verizon Media Inc.
Inventors: Makoto Yamada, Chao Qin, Hua Ouyang, Achint Thomas, Yi Chang
-
Patent number: 10885920
Abstract: A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel may include receiving audio stream data of the audio stream with speech from a speaker to be authenticated speaking with a second speaker. A voiceprint may be generated for each data chunk in the audio stream data divided into a plurality of data chunks. The voiceprint for each data chunk may be assessed as to whether the voiceprint has speech belonging to the speaker to be authenticated or to the second speaker using representative voiceprints of both speakers. An accumulated voiceprint may be generated using the verified data chunks with speech of the speaker to be authenticated. The accumulated voiceprint may be compared to the reference voiceprint of the speaker to be authenticated for authenticating the speaker speaking with the second speaker over the audio channel.
Type: Grant
Filed: December 31, 2018
Date of Patent: January 5, 2021
Assignee: NICE LTD
Inventors: Alon Menahem Shoa, Roman Frenkel, Matan Keret
-
Patent number: 10885899
Abstract: A method includes receiving initial training data associated with a trigger phrase in a device and training a voice model in the device using the initial training data. The voice model is used to identify a plurality of voice commands in the device initiated using the trigger phrase. Collection of additional training data from the plurality of voice commands and retraining of the voice model in the device are iteratively performed using the additional training data. A device includes a microphone and a processor to receive initial training data associated with a trigger phrase using the microphone, train a voice model in the device using the initial training data, use the voice model to identify a plurality of voice commands initiated using the trigger phrase, and iteratively collect additional training data from the plurality of voice commands and retrain the voice model in the device using the additional training data.
Type: Grant
Filed: October 9, 2018
Date of Patent: January 5, 2021
Assignee: Motorola Mobility LLC
Inventors: Boby Iyer, Amit Kumar Agrawal
-
Patent number: 10878068
Abstract: An authentication system, comprising: one or more inputs, for receiving biometric input signals from a user; a routing module, configured to selectively route the biometric input signals from the one or more inputs to one or more of a plurality of components, the plurality of components including a biometric authentication module, for processing the biometric input signals and generating an authentication result; and a security module, for receiving a control instruction for the routing module, determining whether or not the control instruction complies with one or more rules, and controlling the routing module based on the control instruction responsive to a determination that the control instruction complies with the one or more rules.
Type: Grant
Filed: August 3, 2017
Date of Patent: December 29, 2020
Assignee: Cirrus Logic, Inc.
Inventors: Ryan Roberts, Michael Page
-
Patent number: 10831442
Abstract: An approach is provided that receives, from a user, an amalgamation at a digital assistant. The amalgamation includes one or more words spoken by the user that are captured by a digital microphone and a set of digital images corresponding to one or more gestures that are performed by the user with the digital images captured by a digital camera. The system then determines an action that is responsive to the amalgamation and then performs the determined action.
Type: Grant
Filed: October 19, 2018
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Jeremy R. Fox, Gregory J. Boss, Kelley Anders, Sarbajit K. Rakshit
-
Patent number: 10826857
Abstract: Some embodiments provide a non-transitory machine-readable medium that stores a program. The program receives a message from a client device. The program further determines a language from a plurality of languages associated with the message. The program also determines a model from a plurality of models that corresponds to the determined language. Based on the determined model, the program further determines a function from a plurality of functions provided by a computing device that is associated with the message. The program also sends the computing device a request to perform the function.
Type: Grant
Filed: October 20, 2017
Date of Patent: November 3, 2020
Assignee: SAP SE
Inventors: Christopher Trudeau, John Dietz, Amanda Casari, Richard Puckett
-
Patent number: 10818299
Abstract: A method of verifying a user identity using a Web-based multimodal interface can include sending, to a remote computing device, a multimodal markup language document that, when rendered by the remote computing device, queries a user for a user identifier and causes audio of the user's voice to be sent to a multimodal, Web-based application. The user identifier and the audio can be received at about a same time from the client device. The audio can be compared with a voice print associated with the user identifier. The user at the remote computing device can be selectively granted access to the system according to a result obtained from the comparing step.
Type: Grant
Filed: May 12, 2014
Date of Patent: October 27, 2020
Assignee: Nuance Communications, Inc.
Inventors: David Jaramillo, Gerald M. McCobb
-
Patent number: 10770062
Abstract: Techniques are disclosed for adjusting a ranking of information content of a software application based on feedback from a user. One embodiment presented herein includes a method comprising receiving, at a computing device, an audio stream comprising audio of the user, the audio being indicative of feedback related to information content. The method further comprises analyzing the audio stream for paralinguistic information to determine an attribute of the user. The method further comprises adjusting a ranking of the information content based on at least one of the feedback and additional feedback and the determined attribute of the user.
Type: Grant
Filed: September 9, 2019
Date of Patent: September 8, 2020
Assignee: INTUIT INC.
Inventors: Raymond Chan, Igor A. Podgorny, Benjamin Indyk
-
Patent number: 10762904
Abstract: A method of operating an electronic device and an electronic device thereof are provided. The method includes receiving a first voice signal of a first user, authenticating whether the first user has authority to control the electronic device, based on the first voice signal, and determining an instruction corresponding to the first voice signal based on an authentication result and controlling the electronic device according to the instruction. The electronic device includes a receiver configured to receive a first voice signal of a first user and at least one processor configured to authenticate whether the first user has authority to control the electronic device based on the first voice signal, determine an instruction corresponding to the first voice signal, and control the electronic device according to the instruction.
Type: Grant
Filed: February 24, 2017
Date of Patent: September 1, 2020
Assignee: Samsung Electronics Co., Ltd.
Inventors: Anas Toma, Ahmad Abu Shariah, Hadi Jadallah
-
Patent number: 10714094
Abstract: Technologies related to voiceprint recognition model construction are disclosed. In an implementation, a first voice input from a user is received. One or more predetermined keywords from the first voice input are detected. One or more voice segments corresponding to the one or more predetermined keywords are recorded. The voiceprint recognition model is trained based on the one or more voice segments. A second voice input is received from a user, and the user's identity is verified based on the second voice input using the voiceprint recognition model.
Type: Grant
Filed: January 12, 2018
Date of Patent: July 14, 2020
Assignee: Alibaba Group Holding Limited
Inventor: Qing Ling
-
Patent number: 10652655
Abstract: A volume and speech frequency level adjustment method, system, and computer program product include learning a preferred level and a characteristic of at least one of volume and speech frequency from a historical conference conversation, detecting a context characteristic of an ongoing conversation and an interaction of a user with a device, determining a cognitive state and a contextual situation of the user in relation to the ongoing conversation as a function of at least one of the context characteristic, a preferred level and a characteristic of the volume or the speech frequency, and the interaction, determining at least one factor to trigger an audio level modulation based on the function, and dynamically adjusting audio levels of the ongoing conversation for the user based on the at least one factor.
Type: Grant
Filed: April 30, 2019
Date of Patent: May 12, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Komminist Weldemariam, Abdigani Diriye, Michael S. Gordon, Heike E. Riel
-
Patent number: 10636419
Abstract: A chatbot learns a person's related "intents" when asking for information and thereafter, in response to an initial query, which the chatbot answers, the chatbot generates a secondary dialogue, either providing the person with additional information or inquiring as to whether the person wishes to know more about a subject. The chatbot may use an external trigger such as time, event, etc. and automatically generate a query or give information to the person without any initial query from the person.
Type: Grant
Filed: December 6, 2017
Date of Patent: April 28, 2020
Assignee: Sony Interactive Entertainment Inc.
Inventors: Marie Kitajima, Masanori Omote
-
Patent number: 10629184
Abstract: Cepstral variance normalization is described for audio feature extraction.
Type: Grant
Filed: December 22, 2014
Date of Patent: April 21, 2020
Assignee: Intel Corporation
Inventors: Tobias Bocklet, Adam Marek
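The one-line abstract corresponds to a well-known operation; a per-utterance form of cepstral mean and variance normalization is sketched below, with the caveat that the patent may cover streaming or otherwise different variants.

```python
import numpy as np

def cmvn(cepstra, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.

    cepstra: (frames, coefficients) matrix of cepstral features. Each
    coefficient track is shifted to zero mean and scaled to unit variance.
    """
    mean = cepstra.mean(axis=0, keepdims=True)
    std = cepstra.std(axis=0, keepdims=True)
    return (cepstra - mean) / (std + eps)

rng = np.random.default_rng(0)
feats = rng.normal(loc=3.0, scale=2.5, size=(200, 13))   # e.g. 13 MFCCs per frame
normalized = cmvn(feats)
print(normalized.mean(axis=0).round(3)[:3], normalized.std(axis=0).round(3)[:3])
```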
-
Patent number: 10592604
Abstract: Techniques for inverse text normalization are provided. In some examples, speech input is received and a spoken-form text representation of the speech input is generated. The spoken-form text representation includes a token sequence. A feature representation is determined for the spoken-form text representation and a sequence of labels is determined based on the feature representation. The sequence of labels is assigned to the token sequence and specifies a plurality of edit operations to perform on the token sequence. Each edit operation of the plurality of edit operations corresponds to one of a plurality of predetermined types of edit operations. A written-form text representation of the speech input is generated by applying the plurality of edit operations to the token sequence in accordance with the sequence of labels. A task responsive to the speech input is performed using the generated written-form text representation.
Type: Grant
Filed: June 29, 2018
Date of Patent: March 17, 2020
Assignee: Apple Inc.
Inventors: Ernest J. Pusateri, Bharat Ram Ambati, Elizabeth S. Brooks, Donald R. McAllaster, Venkatesh Nagesha, Ondrej Platek
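The label-driven edit step can be sketched by pairing each spoken-form token with an (operation, argument) label and applying the operations in order. The tiny operation set (KEEP, DELETE, REPLACE, APPEND) is an assumed illustration of the "predetermined types of edit operations", not the actual label inventory used by the patent.

```python
def apply_itn_labels(tokens, labels):
    """Apply per-token edit operations to turn spoken-form text into written form.

    labels[i] is an (operation, argument) pair for tokens[i]; the operation set
    here (KEEP, DELETE, REPLACE, APPEND) is an assumed illustration of the
    "predetermined types of edit operations".
    """
    written = []
    for token, (op, arg) in zip(tokens, labels):
        if op == "KEEP":
            written.append(token)
        elif op == "DELETE":
            continue
        elif op == "REPLACE":
            written.append(arg)
        elif op == "APPEND":              # keep the token and glue a suffix onto it
            written.append(token + arg)
    return " ".join(written)

tokens = ["meet", "at", "five", "p", "m"]
labels = [("KEEP", ""), ("KEEP", ""), ("REPLACE", "5"), ("REPLACE", "p.m."), ("DELETE", "")]
print(apply_itn_labels(tokens, labels))   # meet at 5 p.m.
```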
-
Patent number: 10573296
Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
Type: Grant
Filed: December 10, 2018
Date of Patent: February 25, 2020
Assignee: Apprente LLC
Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz
-
Patent number: 10565191
Abstract: Systems and methods for utilizing a cognitive device are disclosed. A method includes: receiving, by a computer device, a query from a cognitive device; processing, by the computer device, the query to generate a processed query; transmitting, by the computer device, the processed query to a mobile device; receiving, by the computer device, an action query result from the mobile device based on the mobile device receiving the processed query and performing an action query; transmitting, by the computer device, the action query result to the cognitive device based on receiving the action query result.
Type: Grant
Filed: June 5, 2017
Date of Patent: February 18, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Trent W. Boyer
-
Patent number: 10559299
Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
Type: Grant
Filed: June 10, 2019
Date of Patent: February 11, 2020
Assignee: Apprente LLC
Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz