Update Patterns Patents (Class 704/244)
  • Patent number: 11947702
    Abstract: In aspects of personal content managed during device screen recording, a wireless device has a display screen to display digital image content, and a screen recording session captures the digital image content and audio data. The wireless device implements a content control module that determines the screen recording session captures personal content associated with a user of the wireless device, the personal content being captured as part of the digital image content or the audio data. The content control module can generate a user screen recording having a user authorization access level, the user screen recording including the digital image content and/or the audio data, as well as the personal content unaltered for user review. The content control module can also generate a shareable screen recording having a share authorization access level, the shareable screen recording including the digital image content and/or the audio data with the personal content obfuscated.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: April 2, 2024
    Assignee: Motorola Mobility LLC
    Inventors: Amit Kumar Agrawal, Gautham Prabhakar Natakala, Shaung Wu
  • Patent number: 11886542
    Abstract: Systems and processes for prediction using generative adversarial network and distillation technology are provided. For example, an input is received at a first portion of a language model. A first output distribution is obtained, based on the input, from the first portion of the language model. Using a first training model, the language model is adjusted based on the first output distribution. The first output distribution is received at a second portion of the language model. A first representation of the input is obtained, based on the first output distribution, from the second portion of the language model. The language model is adjusted, using a second training model, based on the first representation of the input. Using the adjusted language model, an output is provided based on a received user input.
    Type: Grant
    Filed: May 20, 2021
    Date of Patent: January 30, 2024
    Assignee: Apple Inc.
    Inventor: Jerome R. Bellegarda
  • Patent number: 11727918
    Abstract: In some implementations, a set of audio recordings capturing utterances of a user is received by a first speech-enabled device. Based on the set of audio recordings, the first speech-enabled device generates a first user voice recognition model for use in subsequently recognizing a voice of the user at the first speech-enabled device. Further, a particular user account associated with the first voice recognition model is determined, and an indication is received that a second speech-enabled device is associated with the particular user account. In response to receiving the indication, the set of audio recordings is provided to the second speech-enabled device. Based on the set of audio recordings, the second speech-enabled device generates a second user voice recognition model for use in subsequently recognizing the voice of the user at the second speech-enabled device.
    Type: Grant
    Filed: July 14, 2021
    Date of Patent: August 15, 2023
    Assignee: GOOGLE LLC
    Inventors: Ignacio Lopez Moreno, Diego Melendo Casado
  • Patent number: 11703939
    Abstract: The present disclosure provides a signal processing device, including a signal collector, an instruction converter, and a processor. Examples of the present disclosure may achieve precise recognition of users' intentions and bring operational conveniences to users.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: July 18, 2023
    Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD
    Inventors: Tianshi Chen, Shuai Hu, Shengyuan Zhou, Xishan Zhang
  • Patent number: 11646038
    Abstract: A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel may include receiving audio stream data of the audio stream with speech from a speaker to be authenticated speaking with a second speaker. A voiceprint may be generated for each data chunk in the audio stream data divided into a plurality of data chunks. The voiceprint for each data chunk may be assessed as to whether the voiceprint has speech belonging to the speaker to be authenticated or to the second speaker using representative voiceprints of both speakers. An accumulated voiceprint may be generated using the verified data chunks with speech of the speaker to be authenticated. The accumulated voiceprint may be compared to the reference voiceprint of the speaker to be authenticated for authenticating the speaker speaking with the second speaker over the audio channel.
    Type: Grant
    Filed: November 17, 2020
    Date of Patent: May 9, 2023
    Assignee: NICE LTD.
    Inventors: Alon Menahem Shoa, Roman Frenkel, Matan Keret
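A minimal sketch of the chunk-level flow described in the abstract of patent 11646038, not the patented implementation: each data chunk yields a voiceprint, chunks are attributed to the speaker to be authenticated or to the second speaker by comparison with representative voiceprints, the attributed chunks are accumulated, and the accumulated voiceprint is compared against the reference. The embedding function, cosine similarity, and the 0.7 threshold are placeholder assumptions.

```python
import numpy as np

def embed(chunk: np.ndarray) -> np.ndarray:
    """Placeholder voiceprint extractor: any fixed-size embedding would do here."""
    rng = np.random.default_rng(abs(hash(chunk.tobytes())) % (2**32))
    return rng.standard_normal(64)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def authenticate(stream: np.ndarray, ref_target: np.ndarray, ref_other: np.ndarray,
                 chunk_size: int = 16000, threshold: float = 0.7) -> bool:
    """Divide the stream into chunks, keep chunks closer to the target's
    representative voiceprint, accumulate them, and verify the result."""
    chunks = [stream[i:i + chunk_size] for i in range(0, len(stream), chunk_size)]
    kept = []
    for chunk in chunks:
        vp = embed(chunk)
        # Attribute the chunk to whichever speaker's representative voiceprint is closer.
        if cosine(vp, ref_target) > cosine(vp, ref_other):
            kept.append(vp)
    if not kept:
        return False
    accumulated = np.mean(kept, axis=0)          # accumulated voiceprint
    return cosine(accumulated, ref_target) >= threshold

# Toy usage with random audio and random reference voiceprints.
audio = np.random.randn(16000 * 5)
print(authenticate(audio, np.random.randn(64), np.random.randn(64)))
```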
  • Patent number: 11645515
    Abstract: Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of the last hidden layer, and segment those activations by the label of corresponding training data. Clustering is applied to the retained activations of each segment, and a cluster assessment is conducted for each cluster associated with each label to distinguish clusters with potentially poisoned activations from clusters populated with legitimate activations. The assessment includes executing a set of analyses and integrating the results of the analyses into a determination as to whether a training data set is poisonous based on determining if resultant activation clusters are poisoned.
    Type: Grant
    Filed: September 16, 2019
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Nathalie Baracaldo Angel, Bryant Chen, Biplav Srivastava, Heiko H. Ludwig
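The clustering step outlined in the abstract of patent 11645515 can be illustrated with a short sketch: collect last-hidden-layer activations per label, cluster each label's activations, and flag labels whose clusters are suspiciously unbalanced. The two-cluster KMeans and the size-ratio test below are simplifying assumptions, not the patent's full set of analyses.

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_suspect_labels(activations: np.ndarray, labels: np.ndarray,
                        ratio_threshold: float = 0.15) -> list:
    """Cluster each label's last-hidden-layer activations into two groups and
    flag labels where the smaller cluster is an unusually small fraction,
    a common signature of a poisoned subpopulation."""
    suspects = []
    for label in np.unique(labels):
        acts = activations[labels == label]
        if len(acts) < 10:
            continue
        assignments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(acts)
        smaller = min(np.mean(assignments == 0), np.mean(assignments == 1))
        if smaller < ratio_threshold:
            suspects.append(int(label))
    return suspects

# Toy data: label 1 contains a small injected cluster far from the rest.
rng = np.random.default_rng(0)
clean = rng.normal(0, 1, (200, 32))
poisoned = np.vstack([rng.normal(0, 1, (190, 32)), rng.normal(8, 0.5, (10, 32))])
acts = np.vstack([clean, poisoned])
labs = np.array([0] * 200 + [1] * 200)
print(flag_suspect_labels(acts, labs))   # expected: [1]
```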
  • Patent number: 11586585
    Abstract: Systems and methods described herein facilitate the search and presentation of historical data for wireless network usage and provide a mechanism for high-redundancy, low-latency record retrieval of data from large data sets. Network devices divide data for a historical data store into separate record type groups, store metadata for each record type in an application database, partition each record type group by date in a historical record database that is different from the application database, and form, within each date partition, buckets of common hash values of a key parameter from each record. When a user performs a query, the network devices generate a record-specific query form based on the record type metadata to obtain lookup parameters; generate a search hash value using a key parameter from the lookup parameters; and generate a query expression based on the record type, lookup parameters, and the search hash value.
    Type: Grant
    Filed: January 6, 2021
    Date of Patent: February 21, 2023
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: David C. Eads, Robert Glenn Capps, Jr., Edward M. Foltz, Hema G. Chhatpar
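A toy sketch of the storage layout described in patent 11586585's abstract, assuming an in-memory dictionary in place of the application and historical record databases: records are grouped by record type, partitioned by date, and bucketed by a hash of a key parameter, so a query only scans one small bucket. The class name and bucket count are illustrative.

```python
import hashlib
from collections import defaultdict

class HistoricalStore:
    """Toy record store: record type -> date partition -> hash bucket -> records."""

    def __init__(self, num_buckets: int = 16):
        self.num_buckets = num_buckets
        self.partitions = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

    def _bucket(self, key_value: str) -> int:
        digest = hashlib.sha256(key_value.encode()).hexdigest()
        return int(digest, 16) % self.num_buckets

    def insert(self, record_type: str, date: str, key_value: str, record: dict) -> None:
        self.partitions[record_type][date][self._bucket(key_value)].append(record)

    def query(self, record_type: str, date: str, key_param: str, key_value: str) -> list:
        # Only the bucket matching the search hash value is scanned.
        bucket = self.partitions[record_type][date][self._bucket(key_value)]
        return [r for r in bucket if r.get(key_param) == key_value]

store = HistoricalStore()
store.insert("data_session", "2023-01-05", "355551234", {"msisdn": "355551234", "bytes": 1200})
store.insert("data_session", "2023-01-05", "355559999", {"msisdn": "355559999", "bytes": 900})
print(store.query("data_session", "2023-01-05", "msisdn", "355551234"))
```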
  • Patent number: 11580455
    Abstract: Techniques and solutions are described for facilitating the use of machine learning techniques. In some cases, filters can be defined for multiple segments of a training data set. Model segments corresponding to respective segments can be trained using an appropriate subset of the training data set. When a request for a machine learning result is made, filter criteria for the request can be determined and an appropriate model segment can be selected and used for processing the request. One or more hyperparameter values can be defined for a machine learning scenario. When a machine learning scenario is selected for execution, the one or more hyperparameter values for the machine learning scenario can be used to configure a machine learning algorithm used by the machine learning scenario.
    Type: Grant
    Filed: April 1, 2020
    Date of Patent: February 14, 2023
    Assignee: SAP SE
    Inventor: Siar Sarferaz
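A compact sketch of the segmentation idea in patent 11580455's abstract, under the assumption that a model segment can be represented as one estimator per filter value and a machine learning scenario as a named bundle of hyperparameter values: requests are routed to the segment matching their filter criteria, and the scenario's hyperparameters configure the algorithm.

```python
import numpy as np
from sklearn.linear_model import Ridge

# One hyperparameter set per machine learning scenario (illustrative values).
SCENARIOS = {"demand_forecast": {"alpha": 0.5}, "price_forecast": {"alpha": 2.0}}

def train_segments(X, y, segment_col, scenario):
    """Train one model segment per distinct value of the filter column."""
    params = SCENARIOS[scenario]
    segments = {}
    for value in np.unique(X[:, segment_col]):
        mask = X[:, segment_col] == value
        features = np.delete(X[mask], segment_col, axis=1)
        segments[value] = Ridge(**params).fit(features, y[mask])
    return segments

def predict(segments, x_row, segment_col):
    """Select the model segment whose filter value matches the request."""
    model = segments[x_row[segment_col]]
    return model.predict(np.delete(x_row, segment_col).reshape(1, -1))[0]

rng = np.random.default_rng(1)
X = np.column_stack([rng.integers(0, 2, 100), rng.normal(size=(100, 3))])
y = X[:, 1] * 3 + X[:, 0]
models = train_segments(X, y, segment_col=0, scenario="demand_forecast")
print(predict(models, X[0], segment_col=0))
```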
  • Patent number: 11550873
    Abstract: A method includes: generating a plurality of individuals of a current generation in accordance with a plurality of individuals of a previous generation to acquire values of an objective function for individuals each representing a variable by evolutionary computation; calculating, for each of partial individuals of the plurality of individuals of the current generation generated by the generating processing, a first value of the objective function by a predetermined method; approximately calculating, for each of the plurality of individuals of the current generation, a second value of the objective function with lower precision than the predetermined method; computing a fitness difference representing a difference between the plurality of individuals of the current generation in accordance with the first value or the second value; and controlling the precision of the approximate calculation based on the fitness difference and a precision difference between the first value and the second value.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: January 10, 2023
    Assignee: FUJITSU LIMITED
    Inventor: Yukito Tsunoda
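A small sketch of the precision-control idea in patent 11550873's abstract, assuming a toy objective where the "predetermined method" is an exact evaluation and the approximation adds noise whose scale stands in for precision: the approximation is tightened whenever its error becomes comparable to the fitness differences it is supposed to resolve.

```python
import numpy as np

rng = np.random.default_rng(0)

def exact_fitness(x):          # the "predetermined" (expensive, precise) evaluation
    return -np.sum(x**2)

def approx_fitness(x, sigma):  # cheap approximation with controllable precision
    return exact_fitness(x) + rng.normal(0.0, sigma)

pop = rng.normal(0, 3, (20, 5))
sigma = 1.0                                        # current approximation error scale
for gen in range(30):
    # Generate the current generation from the previous one (mutation only, for brevity).
    children = pop + rng.normal(0, 0.3, pop.shape)
    # Exact values for a subset ("partial individuals"), approximate values for all.
    exact = np.array([exact_fitness(x) for x in children[:5]])
    approx = np.array([approx_fitness(x, sigma) for x in children])
    # Fitness difference between individuals vs. precision difference of the two evaluations.
    fitness_spread = np.std(approx)
    precision_gap = np.mean(np.abs(exact - approx[:5]))
    if precision_gap > 0.25 * fitness_spread:      # too coarse to rank individuals reliably
        sigma *= 0.5                               # tighten the approximation
    # Select the better half by approximate fitness and refill the population.
    survivors = children[np.argsort(approx)[-10:]]
    pop = np.vstack([survivors, survivors + rng.normal(0, 0.3, survivors.shape)])

print(round(float(np.max([exact_fitness(x) for x in pop])), 3), "sigma:", sigma)
```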
  • Patent number: 11501767
    Abstract: The invention relates to a method for operating a motor vehicle having an operating device, which includes a speech recognition and language determination device. In a first operating mode with a first operating language, a voice input of a user of the motor vehicle is recognized and a check is made as to whether the language of the voice input corresponds to the first operating language. Depending on a result of the check, a confidence value is assigned to the voice input, describing the probability that the language of the voice input is a second operating language. Depending on the assigned confidence value, a query signal is generated that describes a request, understandable in the second operating language, asking the user to indicate the operating mode or the operating language to be set. In response to a received operating signal, the indicated operating mode or operating language is set.
    Type: Grant
    Filed: November 28, 2017
    Date of Patent: November 15, 2022
    Assignee: Audi AG
    Inventors: Christian Al Haddad, Stefan Maiwald
  • Patent number: 11367443
    Abstract: Disclosed is an electronic device and a method for controlling the electronic device. The electronic device includes: a microphone, a communication interface, a memory for storing at least one instruction, and a processor configured to execute the at least one instruction to: determine whether a user is present around the electronic device based on voice data of the user obtained via the microphone, determine a device group including the electronic device and at least one other electronic device present around the electronic device, identify at least one device from the device group as a hub device to perform a voice recognition, and based on identifying the electronic device as the hub device, obtain, through the communication interface, a voice data of the user from one or more of the at least one other electronic device, and perform the voice recognition.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: June 21, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Sangwon Ahn, Seongil Hahm, Jeongin Kim, Seongho Byeon, Jaesick Shin, Junsik Jeong
  • Patent number: 11335329
    Abstract: Robust performance of Automatic Speech Recognition (ASR) against real-world noises and channel distortions is critical. Embodiments herein provide a method and system for generating synthetic multi-conditioned data sets covering additive noise and channel distortion, for training multi-conditioned acoustic models for robust ASR. The method provides a generative noise model that generates a plurality of types of noise signals: additive noise based on a weighted linear combination of a plurality of noise basis signals, and channel distortion based on estimated channel responses. The generative noise model is parametric; the basis function selection, the number of basis functions combined linearly, and the weightages applied to the combinations are tunable, enabling generation of a wide variety of noise signals. Further, the noise signals are added to a set of training speech utterances under a set of constraints, providing the multi-conditioned data sets that imitate real-world effects.
    Type: Grant
    Filed: March 24, 2020
    Date of Patent: May 17, 2022
    Assignee: Tata Consultancy Services Limited
    Inventors: Meetkumar Hemakshu Soni, Sonal Joshi, Ashish Panda
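A numpy sketch of the generative noise model as patent 11335329's abstract describes it: a noise signal is formed as a weighted linear combination of noise basis signals, passed through an estimated channel response, and added to a clean training utterance under an SNR constraint. The basis signals, channel impulse response, and SNR value are synthetic placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
SR, DUR = 16000, 2
N = SR * DUR

# Noise basis signals (placeholders for e.g. babble, fan, street recordings).
bases = [rng.standard_normal(N),                                        # white noise
         np.convolve(rng.standard_normal(N), np.ones(50) / 50, "same"), # low-pass hum
         np.sign(rng.standard_normal(N)) * 0.3]                         # clipped crackle

def generate_noise(weights, channel_ir):
    """Weighted linear combination of basis signals, then channel distortion."""
    mix = sum(w * b for w, b in zip(weights, bases))
    return np.convolve(mix, channel_ir, mode="same")

def add_noise_at_snr(clean, noise, snr_db):
    """Scale the generated noise so the mixture meets the target SNR constraint."""
    p_clean, p_noise = np.mean(clean**2), np.mean(noise**2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

clean_utt = np.sin(2 * np.pi * 220 * np.arange(N) / SR)   # stand-in for a speech utterance
channel = rng.standard_normal(64) * np.exp(-np.arange(64) / 10.0)  # estimated channel response
noisy = add_noise_at_snr(clean_utt, generate_noise([0.6, 0.3, 0.1], channel), snr_db=10)
print(noisy.shape, round(float(np.mean(noisy**2)), 4))
```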
  • Patent number: 11314942
    Abstract: A computer-implemented method for providing agent-assisted transcriptions of user utterances. A user utterance is received in response to a prompt provided to the user at a remote client device. An automatic transcription is generated from the utterance using a language model based upon an application or context, and presented to a human agent. The agent reviews the transcription and may replace at least a portion of the transcription with a corrected transcription. As the agent inputs the corrected transcription, accelerants comprising suggested text to be input are presented to the agent. The accelerants may be determined based upon an agent input, an application or context of the transcription, the portion of the transcription being replaced, or any combination thereof. In some cases, the user provides textual input, for which the agent transcribes an associated intent with the aid of one or more accelerants.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: April 26, 2022
    Assignee: Interactions LLC
    Inventors: Ethan Selfridge, Michael Johnston, Robert Lifgren, James Dreher, John Leonard
  • Patent number: 11302325
    Abstract: A chatbot learns a person's related “intents” when asking for information and thereafter, in response to an initial query, which the chatbot answers, the chatbot generates a secondary dialogue, either providing the person with additional information or inquiring as to whether the person wishes to know more about a subject. The chatbot may use an external trigger such as time, event, etc. and automatically generate a query or give information to the person without any initial query from the person.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: April 12, 2022
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Marie Kitajima, Masanori Omote
  • Patent number: 11227606
    Abstract: A compact, self-authenticating, and speaker-verifiable record of an audio communication involving one or more persons comprises a record, encoded on a non-transitory, computer-readable medium, that consists essentially of: a voiceprint for each person whose voice is encoded in the record; a plurality of transcription records, where each transcription record consists essentially of a computer-generated speech-to-text decoding of an utterance and voiceprint associating information that associates a speaker of the utterance with one of the voiceprints stored in the record; and self-authenticating information sufficient to determine whether any of the information encoded in the communication record has been altered.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: January 18, 2022
    Assignee: Medallia, Inc.
    Inventors: Wayne Ramprashad, David Garrod
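The record structure in patent 11227606's abstract can be sketched with standard-library hashing: voiceprints, transcription records (decoded text plus a speaker association), and self-authenticating information in the form of a digest over the encoded content, whose verification fails if anything is altered. The JSON encoding and SHA-256 digest are assumptions; the patent does not prescribe them.

```python
import hashlib, json

def build_record(voiceprints: dict, transcriptions: list) -> dict:
    """voiceprints: speaker_id -> voiceprint vector; transcriptions: list of
    {"speaker": speaker_id, "text": decoded utterance} in utterance order."""
    body = {"voiceprints": voiceprints, "transcriptions": transcriptions}
    encoded = json.dumps(body, sort_keys=True).encode()
    # Self-authenticating information: a digest over everything else in the record.
    return {**body, "digest": hashlib.sha256(encoded).hexdigest()}

def verify_record(record: dict) -> bool:
    body = {k: v for k, v in record.items() if k != "digest"}
    encoded = json.dumps(body, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest() == record["digest"]

rec = build_record(
    voiceprints={"agent": [0.12, -0.4, 0.9], "caller": [0.55, 0.1, -0.3]},
    transcriptions=[{"speaker": "caller", "text": "I'd like to check my balance."},
                    {"speaker": "agent", "text": "Sure, let me pull that up."}],
)
print(verify_record(rec))                 # True
rec["transcriptions"][0]["text"] = "I'd like to close my account."
print(verify_record(rec))                 # False: alteration detected
```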
  • Patent number: 11151981
    Abstract: A computer implemented method, apparatus, and computer program product for a sound system. Speech recognition is performed on input audio data comprising speech input to a sound system. Speech recognition is additionally performed on at least one instance of output audio data comprising speech reproduced by one or more audio speakers of the sound system. A difference between a result of speech recognition performed on the input audio data and a result of speech recognition performed on an instance of corresponding output audio data is determined. The quality of the reproduced speech is determined as unsatisfactory when the difference is greater than or equal to a threshold. A corrective action may be performed, to improve the quality of the speech reproduced by the sound system, if it is determined that the speech quality of the reproduced sound is unsatisfactory.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: October 19, 2021
    Assignee: International Business Machines Corporation
    Inventors: Alexander John Naylor-Teece, Andrew James Dunnings, Oliver Paul Masters
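A sketch of only the comparison logic in patent 11151981's abstract, with the speech recognizer stubbed out: recognize the input audio and the reproduced output audio, compute a word error rate between the two results, and report quality as unsatisfactory when the difference meets the threshold. The recognize stub and the 0.3 threshold are placeholders.

```python
def word_error_rate(ref: str, hyp: str) -> float:
    """Standard Levenshtein distance over words, normalized by reference length."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(r)][len(h)] / max(len(r), 1)

def recognize(audio) -> str:
    """Placeholder for a real speech-to-text call on input or loopback audio."""
    return audio  # in this sketch the "audio" is already a transcript

def reproduction_unsatisfactory(input_audio, output_audio, threshold: float = 0.3) -> bool:
    diff = word_error_rate(recognize(input_audio), recognize(output_audio))
    return diff >= threshold   # trigger corrective action when quality is unsatisfactory

print(reproduction_unsatisfactory("turn the volume down please",
                                  "turn the volume town cheese"))   # True at threshold 0.3
```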
  • Patent number: 11145314
    Abstract: Embodiments of the present disclosure provide a method and apparatus for voice identification, a device and a computer readable storage medium. The method may include: for an inputted voice signal, obtaining a first piece of decoded acoustic information by a first acoustic model and obtaining a second piece of decoded acoustic information by a second acoustic model, where the second acoustic model is generated by joint modeling of an acoustic model and a language model. The method may further include determining a first group of candidate identification results based on the first piece of decoded acoustic information, determining a second group of candidate identification results based on the second piece of decoded acoustic information, and then determining a final identification result for the voice signal based on the first group of candidate identification results and the second group of candidate identification results.
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: October 12, 2021
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Xingyuan Peng, Junyao Shao, Lei Jia
  • Patent number: 11145309
    Abstract: An apparatus includes processor(s) to: use an acoustic model to generate a first set of probabilities of speech sounds uttered within speech audio; derive at least a first candidate word most likely spoken in the speech audio using the first set; analyze the first set to derive a degree of uncertainty therefor; compare the degree of uncertainty to a threshold; in response to at least the degree of uncertainty being less than the threshold, select the first candidate word as a next word most likely spoken in the speech audio; in response to at least the degree of uncertainty being greater than the threshold, select, as the next word most likely spoken in the speech audio, a second candidate word indicated as being most likely spoken based on a second set of probabilities generated by a language model; and add the next word most likely spoken to a transcript.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: October 12, 2021
    Assignee: SAS INSTITUTE INC.
    Inventor: Xu Yang
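A minimal sketch of the selection rule in patent 11145309's abstract, assuming the acoustic model's per-word output is a probability distribution over a small vocabulary and that normalized entropy serves as the degree of uncertainty; the language-model fallback is stubbed as a trivial lookup.

```python
import math

VOCAB = ["wreck", "recognize", "a", "nice", "beach", "speech"]

def entropy(probs) -> float:
    """Normalized Shannon entropy of a distribution: 0 = certain, 1 = uniform."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def lm_candidate(history: list) -> str:
    """Placeholder language model: picks a likely continuation from context."""
    return "speech" if history and history[-1] == "nice" else "a"

def next_word(acoustic_probs, history, threshold: float = 0.6) -> str:
    acoustic_best = VOCAB[max(range(len(VOCAB)), key=lambda i: acoustic_probs[i])]
    if entropy(acoustic_probs) < threshold:
        return acoustic_best                  # acoustic model is confident enough
    return lm_candidate(history)              # fall back to the language model's pick

confident = [0.02, 0.9, 0.02, 0.02, 0.02, 0.02]
uncertain = [0.18, 0.18, 0.12, 0.17, 0.17, 0.18]
print(next_word(confident, ["please"]))       # "recognize" (low uncertainty)
print(next_word(uncertain, ["a", "nice"]))    # "speech" (high uncertainty, LM fallback)
```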
  • Patent number: 11087741
    Abstract: Embodiments of the present disclosure include methods, apparatuses, devices, and computer readable storage mediums for processing far-field environmental noise. The method can comprise processing collected far-field environmental noise into a noise segment in a predetermined format. The method can further comprise establishing a far-field voice recognition model based on the noise segment and a near-field voice segment; and determining validity of the noise segment based on the far-field voice recognition model. The solution of the present disclosure can optimize the anti-noise performance of the far-field voice recognition model by differentiated training on noise from different user scenarios of a far-field voice recognition product.
    Type: Grant
    Filed: January 22, 2019
    Date of Patent: August 10, 2021
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Jianwei Sun, Chao Li, Xin Li, Weixin Zhu, Ming Wen
  • Patent number: 11087743
    Abstract: In some implementations, an utterance is determined to include a particular user speaking a hotword based at least on a first set of samples of the particular user speaking the hotword. In response to determining that an utterance includes a particular user speaking a hotword based at least on a first set of samples of the particular user speaking the hotword, at least a portion of the utterance is stored as a new sample. A second set of samples of the particular user speaking the utterance is obtained, where the second set of samples includes the new sample and less than all the samples in the first set of samples. A second utterance is determined to include the particular user speaking the hotword based at least on the second set of samples of the user speaking the hotword.
    Type: Grant
    Filed: November 13, 2019
    Date of Patent: August 10, 2021
    Assignee: GOOGLE LLC
    Inventors: Ignacio Lopez Moreno, Diego Melendo Casado
  • Patent number: 11074909
    Abstract: Provided are a device for recognizing a speech input including a named entity from a user and an operating method thereof. The device is configured to: generate a weighted finite state transducer model by using a vocabulary list including a plurality of named entities; obtain a first string from a speech input received from a user, by using a first decoding model; obtain a second string by using a second decoding model that uses the weighted finite state transducer model, the second string including a word sequence, which corresponds to at least one named entity, and an unrecognized word sequence not identified as a named entity; and output a text corresponding to the speech input by substituting the unrecognized word sequence of the second string with a word sequence included in the first string.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: July 27, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kyungmin Lee, Youngho Han, Sangyoon Kim, Donguk Jung, Aahwan Kudumula, Changwoo Han
  • Patent number: 11069337
    Abstract: A voice-content control device includes a voice classifying unit configured to analyze a voice spoken by a user and acquired by a voice acquiring unit to classify the voice as either one of a first voice or a second voice, a process executing unit configured to analyze the acquired voice to execute processing required by the user, and a voice-content generating unit configured to generate, based on content of the executed processing, an output sentence that is text data for a voice to be output to the user, wherein the voice-content generating unit is further configured to generate a first output sentence as the output sentence when the analyzed voice has been classified as the first voice, and generate, as the output sentence, a second output sentence in which information is omitted as compared to the first output sentence when the analyzed voice has been classified as the second voice.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: July 20, 2021
    Assignee: JVC KENWOOD Corporation
    Inventor: Tatsumi Naganuma
  • Patent number: 11043223
    Abstract: Technologies related to voiceprint recognition model construction are disclosed. In an implementation, a first voice input from a user is received. One or more predetermined keywords from the first voice input are detected. One or more voice segments corresponding to the one or more predetermined keywords are recorded. The voiceprint recognition model is trained based on the one or more voice segments. A second voice input is received from a user, and the user's identity is verified based on the second voice input using the voiceprint recognition model.
    Type: Grant
    Filed: June 19, 2020
    Date of Patent: June 22, 2021
    Assignee: Advanced New Technologies Co., Ltd.
    Inventor: Qing Ling
  • Patent number: 11024287
    Abstract: A method, a device, and a storage medium for correcting an error in a speech recognition result are provided. The method includes: performing phonetic notation on a speech recognition result to be corrected, to obtain a pinyin corresponding to the speech recognition result; obtaining one or more candidate texts according to the pinyin, and determining an optimum candidate text from the one or more candidate texts; judging whether the optimum candidate text satisfies a preset condition; and determining the optimum candidate text as a corrected result of the speech recognition result to be corrected in response to satisfying the preset condition.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: June 1, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Shujie Yao
  • Patent number: 11021113
    Abstract: A camera module includes a camera imaging a region outside a rear end portion of a vehicle and a storage storing first and second dictionary information corresponding to a first area and a second area. When the camera takes an image of a pedestrian and a detected latitude and longitude correspond to the first area, the camera module recognizes the image of the pedestrian based on the first dictionary information, outputs a first vehicle control signal based on a recognition result, and outputs a status that the first dictionary information is used. When the camera takes an image of a pedestrian and a detected latitude and longitude correspond to the second area, the camera module recognizes the image of the pedestrian based on the second dictionary information, outputs a second vehicle control signal based on a recognition result, and outputs a status that the second dictionary information is used.
    Type: Grant
    Filed: March 3, 2020
    Date of Patent: June 1, 2021
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Teruo Sakamoto, Sangwon Kim
  • Patent number: 11017783
    Abstract: A device includes a processor configured to determine a feature vector based on an utterance and to determine a first embedding vector by processing the feature vector using a trained embedding network. The processor is configured to determine a first distance metric based on distances between the first embedding vector and each embedding vector of a speaker template. The processor is configured to determine, based on the first distance metric, that the utterance is verified to be from a particular user. The processor is configured to, based on a comparison of a first particular distance metric associated with the first embedding vector to a second distance metric associated with a first test embedding vector of the speaker template, generate an updated speaker template by adding the first embedding vector as a second test embedding vector and removing the first test embedding vector from test embedding vectors of the speaker template.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: May 25, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Sunkuk Moon, Bicheng Jiang, Erik Visser
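A sketch of the template-maintenance rule as patent 11017783's abstract lays it out, using cosine distance over numpy vectors: an utterance embedding is verified against the speaker template, and if its distance metric beats that of the weakest test embedding, it replaces it. The enrollment/test split, the averaging, and the 0.6 threshold are illustrative choices.

```python
import numpy as np

def cos_dist(a, b):
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def verify_and_update(utt_emb, enroll_embs, test_embs, verify_threshold=0.6):
    """Verify an utterance embedding against the speaker template (enrollment +
    test embeddings) and, if verified and closer to the template than the worst
    test embedding, swap it in as a new test embedding."""
    template = enroll_embs + test_embs
    dists = [cos_dist(utt_emb, e) for e in template]
    verified = float(np.mean(dists)) < verify_threshold     # first distance metric
    if not verified:
        return False, test_embs
    # Compare the utterance's metric against each test embedding's own metric.
    def template_score(e):
        return float(np.mean([cos_dist(e, t) for t in enroll_embs]))
    utt_score = float(np.mean([cos_dist(utt_emb, t) for t in enroll_embs]))
    worst_idx = int(np.argmax([template_score(e) for e in test_embs]))
    if utt_score < template_score(test_embs[worst_idx]):
        test_embs = test_embs[:worst_idx] + test_embs[worst_idx + 1:] + [utt_emb]
    return True, test_embs

rng = np.random.default_rng(0)
speaker = rng.normal(0, 1, 16)
enroll = [speaker + rng.normal(0, 0.1, 16) for _ in range(3)]
tests = [speaker + rng.normal(0, 0.4, 16) for _ in range(2)]
ok, tests = verify_and_update(speaker + rng.normal(0, 0.05, 16), enroll, tests)
print(ok, len(tests))
```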
  • Patent number: 10957322
    Abstract: Provided is a speech processing apparatus including a word string estimation unit that estimates a word string equivalent to input speech among word strings included in dictionary data, and a calculation unit that calculates, for an element part constituting the word string estimated by the word string estimation unit, a certainty factor indicating that a content of the element part is equivalent to a content of a corresponding part in the input speech.
    Type: Grant
    Filed: May 31, 2017
    Date of Patent: March 23, 2021
    Assignee: SONY CORPORATION
    Inventors: Emiru Tsunoo, Toshiyuki Kumakura
  • Patent number: 10929606
    Abstract: A method for intelligent assistance includes identifying one or more insertion points within an input comprising text for providing additional information. A follow-up expression that includes at least a portion of the input and the additional information at the one or more insertion points is generated for clarifying or supplementing meaning of the input.
    Type: Grant
    Filed: February 23, 2018
    Date of Patent: February 23, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Justin C. Martineau, Avik Ray, Hongxia Jin
  • Patent number: 10909468
    Abstract: In one embodiment, a set of training data consisting of inliers may be obtained. A supervised classification model may be trained using the set of training data to identify outliers. The supervised classification model may be applied to generate an anomaly score for a data point. It may be determined whether the data point is an outlier based, at least in part, upon the anomaly score.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: February 2, 2021
    Assignee: Verizon Media Inc.
    Inventors: Makoto Yamada, Chao Qin, Hua Ouyang, Achint Thomas, Yi Chang
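Patent 10909468's abstract does not say how a supervised classifier is trained from inliers alone; a common workaround, assumed in this sketch, is to pair the inliers with synthetic points drawn uniformly over an expanded feature range and train a binary classifier against them, using the predicted outlier probability as the anomaly score.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Training data consisting only of inliers (a tight Gaussian cluster here).
inliers = rng.normal(0, 1, (500, 4))

# Assumption: pair the inliers with synthetic points drawn uniformly over an
# expanded feature range, so a supervised classifier can be trained.
lo, hi = inliers.min(axis=0) - 2, inliers.max(axis=0) + 2
synthetic = rng.uniform(lo, hi, (500, 4))

X = np.vstack([inliers, synthetic])
y = np.array([0] * len(inliers) + [1] * len(synthetic))   # 1 = "looks like an outlier"
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def anomaly_score(point: np.ndarray) -> float:
    """Probability the supervised model assigns to the outlier class."""
    return float(clf.predict_proba(point.reshape(1, -1))[0, 1])

def is_outlier(point: np.ndarray, threshold: float = 0.5) -> bool:
    return anomaly_score(point) >= threshold

print(is_outlier(np.zeros(4)))          # False: deep inside the inlier cluster
print(is_outlier(np.full(4, 5.0)))      # True: far outside it
```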
  • Patent number: 10885920
    Abstract: A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel may include receiving audio stream data of the audio stream with speech from a speaker to be authenticated speaking with a second speaker. A voiceprint may be generated for each data chunk in the audio stream data divided into a plurality of data chunks. The voiceprint for each data chunk may be assessed as to whether the voiceprint has speech belonging to the speaker to be authenticated or to the second speaker using representative voiceprints of both speakers. An accumulated voiceprint may be generated using the verified data chunks with speech of the speaker to be authenticated. The accumulated voiceprint may be compared to the reference voiceprint of the speaker to be authenticated for authenticating the speaker speaking with the second speaker over the audio channel.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: January 5, 2021
    Assignee: NICE LTD
    Inventors: Alon Menahem Shoa, Roman Frenkel, Matan Keret
  • Patent number: 10885899
    Abstract: A method includes receiving initial training data associated with a trigger phrase in a device and training a voice model in the device using the initial training data. The voice model is used to identify a plurality of voice commands in the device initiated using the trigger phrase. Collection of additional training data from the plurality of voice commands and retraining of the voice model in the device are iteratively performed using the additional training data. A device includes a microphone and a processor to receive initial training data associated with a trigger phrase using the microphone, train a voice model in the device using the initial training data, use the voice model to identify a plurality of voice commands initiated using the trigger phrase, and iteratively collect additional training data from the plurality of voice commands and retrain the voice model in the device using the additional training data.
    Type: Grant
    Filed: October 9, 2018
    Date of Patent: January 5, 2021
    Assignee: Motorola Mobility LLC
    Inventors: Boby Iyer, Amit Kumar Agrawal
  • Patent number: 10878068
    Abstract: An authentication system, comprising: one or more inputs, for receiving biometric input signals from a user; a routing module, configured to selectively route the biometric input signals from the one or more inputs to one or more of a plurality of components, the plurality of components including a biometric authentication module, for processing the biometric input signals and generating an authentication result; and a security module, for receiving a control instruction for the routing module, determining whether or not the control instruction complies with one or more rules, and controlling the routing module based on the control instruction responsive to a determination that the control instruction complies with the one or more rules.
    Type: Grant
    Filed: August 3, 2017
    Date of Patent: December 29, 2020
    Assignee: Cirrus Logic, Inc.
    Inventors: Ryan Roberts, Michael Page
  • Patent number: 10831442
    Abstract: An approach is provided that receives, from a user, an amalgamation at a digital assistant. The amalgamation includes one or more words spoken by the user that are captured by a digital microphone and a set of digital images corresponding to one or more gestures that are performed by the user with the digital images captured by a digital camera. The system then determines an action that is responsive to the amalgamation and then performs the determined action.
    Type: Grant
    Filed: October 19, 2018
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Jeremy R. Fox, Gregory J. Boss, Kelley Anders, Sarbajit K. Rakshit
  • Patent number: 10826857
    Abstract: Some embodiments provide a non-transitory machine-readable medium that stores a program. The program receives a message from a client device. The program further determines a language from a plurality of languages associated with the message. The program also determines a model from a plurality of models that corresponds to the determined language. Based on the determined model, the program further determines a function from a plurality of functions provided by a computing device that is associated with the message. The program also sends the computing device a request to perform the function.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: November 3, 2020
    Assignee: SAP SE
    Inventors: Christopher Trudeau, John Dietz, Amanda Casari, Richard Puckett
  • Patent number: 10818299
    Abstract: A method of verifying a user identity using a Web-based multimodal interface can include sending, to a remote computing device, a multimodal markup language document that, when rendered by the remote computing device, queries a user for a user identifier and causes audio of the user's voice to be sent to a multimodal, Web-based application. The user identifier and the audio can be received at about the same time from the client device. The audio can be compared with a voice print associated with the user identifier. The user at the remote computing device can be selectively granted access to the system according to a result obtained from the comparing step.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: October 27, 2020
    Assignee: Nuance Communications, Inc.
    Inventors: David Jaramillo, Gerald M. McCobb
  • Patent number: 10770062
    Abstract: Techniques are disclosed for adjusting a ranking of information content of a software application based on feedback from a user. One embodiment presented herein includes a method comprising receiving, at a computing device, an audio stream comprising audio of the user, the audio being indicative of feedback related to information content. The method further comprises analyzing the audio stream for paralinguistic information to determine an attribute of the user. The method further comprises adjusting a ranking of the information content based on at least one of the feedback and additional feedback and the determined attribute of the user.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: September 8, 2020
    Assignee: INTUIT INC.
    Inventors: Raymond Chan, Igor A. Podgorny, Benjamin Indyk
  • Patent number: 10762904
    Abstract: A method of operating an electronic device and an electronic device thereof are provided. The method includes receiving a first voice signal of a first user, authenticating whether the first user has authority to control the electronic device, based on the first voice signal, and determining an instruction corresponding to the first voice signal based on an authentication result and controlling the electronic device according to the instruction. The electronic device includes a receiver configured to receive a first voice signal of a first user and at least one processor configured to authenticate whether the first user has authority to control the electronic device based on the first voice signal, determine an instruction corresponding to the first voice signal, and control the electronic device according to the instruction.
    Type: Grant
    Filed: February 24, 2017
    Date of Patent: September 1, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Anas Toma, Ahmad Abu Shariah, Hadi Jadallah
  • Patent number: 10714094
    Abstract: Technologies related to voiceprint recognition model construction are disclosed. In an implementation, a first voice input from a user is received. One or more predetermined keywords from the first voice input are detected. One or more voice segments corresponding to the one or more predetermined keywords are recorded. The voiceprint recognition model is trained based on the one or more voice segments. A second voice input is received from a user, and the user's identity is verified based on the second voice input using the voiceprint recognition model.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: July 14, 2020
    Assignee: Alibaba Group Holding Limited
    Inventor: Qing Ling
  • Patent number: 10652655
    Abstract: A volume and speech frequency level adjustment method, system, and computer program product include learning a preferred level and a characteristic of at least one of volume and speech frequency from a historical conference conversation, detecting a context characteristic of an ongoing conversation and an interaction of a user with a device, determining a cognitive state and a contextual situation of the user in relation to the ongoing conversation as a function of at least one of the context characteristic, a preferred level and a characteristic of the volume or the speech frequency, and the interaction, determining at least one factor to trigger an audio level modulation based on the function, and dynamically adjusting audio levels of the ongoing conversation for the user based on the at least one factor.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 12, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Komminist Weldemariam, Abdigani Diriye, Michael S. Gordon, Heike E. Riel
  • Patent number: 10636419
    Abstract: A chatbot learns a person's related “intents” when asking for information and thereafter, in response to an initial query, which the chatbot answers, the chatbot generates a secondary dialogue, either providing the person with additional information or inquiring as to whether the person wishes to know more about a subject. The chatbot may use an external trigger such as time, event, etc. and automatically generate a query or give information to the person without any initial query from the person.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: April 28, 2020
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Marie Kitajima, Masanori Omote
  • Patent number: 10629184
    Abstract: Cepstral variance normalization is described for audio feature extraction.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: April 21, 2020
    Assignee: Intel Corporation
    Inventors: Tobias Bocklet, Adam Marek
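Since patent 10629184's abstract is a single sentence, a brief sketch of cepstral mean and variance normalization in general (not the patent's specific method) may help: each cepstral coefficient is shifted to zero mean and scaled to unit variance across the frames of an utterance.

```python
import numpy as np

def cepstral_mean_variance_normalize(cepstra: np.ndarray) -> np.ndarray:
    """cepstra: (num_frames, num_coefficients) matrix of cepstral features.
    Returns features with zero mean and unit variance per coefficient,
    computed over the frames of the utterance."""
    mean = cepstra.mean(axis=0, keepdims=True)
    std = cepstra.std(axis=0, keepdims=True)
    return (cepstra - mean) / np.maximum(std, 1e-8)   # guard against constant coefficients

# Toy utterance: 200 frames of 13 MFCC-like coefficients with an arbitrary offset/scale.
rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=3.0, size=(200, 13))
normalized = cepstral_mean_variance_normalize(features)
print(normalized.mean(axis=0).round(6)[:3], normalized.std(axis=0).round(6)[:3])
```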
  • Patent number: 10592604
    Abstract: Techniques for inverse text normalization are provided. In some examples, speech input is received and a spoken-form text representation of the speech input is generated. The spoken-form text representation includes a token sequence. A feature representation is determined for the spoken-form text representation and a sequence of labels is determined based on the feature representation. The sequence of labels is assigned to the token sequence and specifies a plurality of edit operations to perform on the token sequence. Each edit operation of the plurality of edit operations corresponds to one of a plurality of predetermined types of edit operations. A written-form text representation of the speech input is generated by applying the plurality of edit operations to the token sequence in accordance with the sequence of labels. A task responsive to the speech input is performed using the generated written-form text representation.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: March 17, 2020
    Assignee: Apple Inc.
    Inventors: Ernest J. Pusateri, Bharat Ram Ambati, Elizabeth S. Brooks, Donald R. McAllaster, Venkatesh Nagesha, Ondrej Platek
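A toy sketch of the label-then-edit pipeline described in patent 10592604's abstract, with the trained sequence labeler replaced by a hard-coded lookup: each spoken-form token receives a label naming one of a few predetermined edit operations, and applying the operations to the token sequence yields the written form. The label set and token table are invented for illustration.

```python
# Predetermined edit-operation types (illustrative, not the patent's label set).
SPOKEN_TO_DIGIT = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
                   "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}

def label_tokens(tokens):
    """Stand-in for the trained sequence labeler: assign an edit operation per token."""
    labels = []
    for tok in tokens:
        if tok in SPOKEN_TO_DIGIT:
            labels.append("TO_DIGIT_NO_SPACE")     # rewrite and glue to previous token
        elif tok == "dollars":
            labels.append("PREPEND_DOLLAR_SIGN")   # drop word, prefix $ to the number
        else:
            labels.append("KEEP")
    return labels

def apply_edits(tokens, labels):
    """Apply the per-token edit operations to build the written-form text."""
    out = []
    for tok, lab in zip(tokens, labels):
        if lab == "KEEP":
            out.append(tok)
        elif lab == "TO_DIGIT_NO_SPACE":
            digit = SPOKEN_TO_DIGIT[tok]
            if out and out[-1].isdigit():
                out[-1] += digit                   # merge consecutive digits
            else:
                out.append(digit)
        elif lab == "PREPEND_DOLLAR_SIGN":
            if out:
                out[-1] = "$" + out[-1]
    return " ".join(out)

spoken = "send five zero dollars to alice".split()
print(apply_edits(spoken, label_tokens(spoken)))   # send $50 to alice
```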
  • Patent number: 10573296
    Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: February 25, 2020
    Assignee: Apprente LLC
    Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz
  • Patent number: 10565191
    Abstract: Systems and methods for utilizing a cognitive device are disclosed. A method includes: receiving, by a computer device, a query from a cognitive device; processing, by the computer device, the query to generate a processed query; transmitting, by the computer device, the processed query to a mobile device; receiving, by the computer device, an action query result from the mobile device based on the mobile device receiving the processed query and performing an action query; transmitting, by the computer device, the action query result to the cognitive device based on receiving the action query result.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: February 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Trent W. Boyer
  • Patent number: 10559299
    Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
    Type: Grant
    Filed: June 10, 2019
    Date of Patent: February 11, 2020
    Assignee: Apprente LLC
    Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz
  • Patent number: 10553218
    Abstract: In a speaker recognition apparatus, audio features are extracted from a received recognition speech signal, and first order Gaussian mixture model (GMM) statistics are generated therefrom based on a universal background model that includes a plurality of speaker models. The first order GMM statistics are normalized with regard to a duration of the received speech signal. The deep neural network reduces a dimensionality of the normalized first order GMM statistics, and outputs a voiceprint corresponding to the recognition speech signal.
    Type: Grant
    Filed: September 19, 2017
    Date of Patent: February 4, 2020
    Assignee: Pindrop Security, Inc.
    Inventors: Elie Khoury, Matthew Garland
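A rough sketch of the pipeline in patent 10553218's abstract, using scikit-learn's GaussianMixture as the universal background model and a fixed random projection standing in for the trained deep neural network: posterior-weighted first-order statistics are accumulated against the UBM components, normalized by the number of frames (duration), and reduced to a fixed-size voiceprint.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Universal background model trained on pooled "background" feature frames.
background = rng.normal(0, 1, (2000, 20))
ubm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0).fit(background)

def first_order_stats(frames: np.ndarray) -> np.ndarray:
    """Posterior-weighted first-order statistics against the UBM components,
    normalized by the utterance duration (number of frames)."""
    post = ubm.predict_proba(frames)                 # (num_frames, num_components)
    stats = post.T @ frames                          # (num_components, feat_dim)
    return (stats / len(frames)).ravel()             # duration normalization, flattened

# Placeholder for the trained DNN: a fixed random projection to a low-dimensional voiceprint.
projection = rng.normal(0, 1, (first_order_stats(background[:10]).size, 32))

def voiceprint(frames: np.ndarray) -> np.ndarray:
    return first_order_stats(frames) @ projection

utt_a = rng.normal(0.5, 1, (300, 20))    # stand-ins for two utterances' feature frames
utt_b = rng.normal(0.5, 1, (150, 20))
a, b = voiceprint(utt_a), voiceprint(utt_b)
print(a.shape, round(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))), 3))
```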
  • Patent number: 10535354
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: January 14, 2020
    Assignee: Google LLC
    Inventor: Raziel Alvarez Guevara
  • Patent number: 10515640
    Abstract: An example apparatus for generating dialogue includes an audio receiver to receive audio data including speech. The apparatus also includes a verification score generator to generate a verification score based on the audio data. The apparatus further includes a user detector to detect that the verification score exceeds a lower threshold but does not exceed a higher threshold. The apparatus includes a dialogue generator to generate dialogue to solicit additional audio data to be used to generate an updated verification score in response to detecting that the verification score exceeds a lower threshold but does not exceed a higher threshold.
    Type: Grant
    Filed: November 8, 2017
    Date of Patent: December 24, 2019
    Assignee: Intel Corporation
    Inventors: Jonathan Huang, David Pearce, Willem M. Beltman
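A sketch of just the decision logic in patent 10515640's abstract, with score generation stubbed: a verification score below the lower threshold rejects, above the higher threshold accepts, and the in-between case triggers generated dialogue that solicits additional audio for an updated score. The thresholds and the scoring stub are placeholders.

```python
import random

LOWER, HIGHER = 0.4, 0.8

def verification_score(audio_chunks: list) -> float:
    """Placeholder scorer: more enrolled-sounding audio nudges the score upward."""
    return min(1.0, 0.35 + 0.2 * len(audio_chunks) + random.uniform(-0.05, 0.05))

def authenticate(initial_audio: str, max_turns: int = 3):
    chunks = [initial_audio]
    for _ in range(max_turns):
        score = verification_score(chunks)
        if score > HIGHER:
            return "accepted", score
        if score < LOWER:
            return "rejected", score
        # Score is between the thresholds: generate dialogue to solicit more audio.
        prompt = "I couldn't quite confirm it's you. Could you say a bit more?"
        print("assistant:", prompt)
        chunks.append("additional user audio")      # would come from the microphone
    return "undetermined", score

random.seed(0)
print(authenticate("play my workout playlist"))
```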
  • Patent number: 10453117
    Abstract: A system capable of performing natural language understanding (NLU) using different application domains in parallel. A model takes incoming query text and determines a list of potential supplemental intent categories corresponding to the text. Supplemental applications within those categories are then identified as likely candidates for responding to the query. Application specific domains, including NLU components for the particular supplemental applications, are then activated and process the query text in parallel. Further, certain system default domains may also process incoming queries substantially in parallel with the supplemental applications. The different results are scored and ranked to determine highest scoring NLU results.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: October 22, 2019
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Simon Peter Reavely, Rohit Prasad, Imre Attila Kiss, Manoj Sindhwani
  • Patent number: 10447315
    Abstract: In one embodiment, a system provides for optimizing an error rate of data through a communication channel. The system includes a data generator operable to generate a training sequence as a Markov code, and to propagate the training sequence through the communication channel. The system also includes a Soft Output Viterbi Algorithm (SOVA) detector operable to estimate data values of the training sequence after propagation through the communication channel. The system also includes an optimizer operable to compare the estimated data values to the generated training sequence, to determine an error rate based on the comparison, and to change the training sequence based on the Markov code to lower the error rate of the data through the communication channel.
    Type: Grant
    Filed: August 15, 2017
    Date of Patent: October 15, 2019
    Assignee: Seagate Technologies LLC
    Inventor: Raman Venkataramani