Update Patterns Patents (Class 704/244)
  • Patent number: 11367443
    Abstract: Disclosed is an electronic device and a method for controlling the electronic device. The electronic device includes: a microphone, a communication interface, a memory for storing at least one instruction, and a processor configured to execute the at least one instruction to: determine whether a user is present around the electronic device based on voice data of the user obtained via the microphone, determine a device group including the electronic device and at least one other electronic device present around the electronic device, identify at least one device from the device group as a hub device to perform voice recognition, and, based on identifying the electronic device as the hub device, obtain, through the communication interface, voice data of the user from one or more of the at least one other electronic device, and perform the voice recognition.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: June 21, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Sangwon Ahn, Seongil Hahm, Jeongin Kim, Seongho Byeon, Jaesick Shin, Junsik Jeong
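The hub-device election described above can be sketched as follows; the abstract does not specify the selection criterion, so picking the device reporting the strongest voice signal is an assumption for illustration only.

```python
def elect_hub(device_group):
    """Elect one device from the group to act as the voice-recognition hub.
    Criterion here: strongest reported voice signal (an assumption)."""
    return max(device_group, key=lambda d: d["signal_strength"])["id"]

group = [
    {"id": "tv", "signal_strength": 0.4},
    {"id": "speaker", "signal_strength": 0.9},
    {"id": "fridge", "signal_strength": 0.2},
]
```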
  • Patent number: 11335329
    Abstract: Performance of Automatic Speech Recognition (ASR) that is robust against real-world noises and channel distortions is critical. Embodiments herein provide a method and system for generating synthetic multi-conditioned data sets for additive noise and channel distortion, for training multi-conditioned acoustic models for robust ASR. The method provides a generative noise model that generates a plurality of types of noise signals, modeling additive noise as a weighted linear combination of a plurality of noise basis signals and channel distortion based on estimated channel responses. The generative noise model is a parametric model in which the basis function selection, the number of basis functions to be combined linearly, and the weights applied to the combinations are tunable, thereby enabling generation of a wide variety of noise signals. Further, the noise signals are added to a set of training speech utterances under a set of constraints, providing multi-conditioned data sets that imitate real-world effects.
    Type: Grant
    Filed: March 24, 2020
    Date of Patent: May 17, 2022
    Assignee: Tata Consultancy Services Limited
    Inventors: Meetkumar Hemakshu Soni, Sonal Joshi, Ashish Panda
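The core step of this generative noise model, a weighted linear combination of noise basis signals mixed into training speech under a constraint, can be sketched as below. Treating the constraint as a target SNR, and the basis/speech shapes, are assumptions; the patent covers a broader family of constraints.

```python
import numpy as np

def synthesize_noise(basis_signals, weights):
    """Weighted linear combination of noise basis signals (generative noise model)."""
    basis = np.asarray(basis_signals, dtype=float)
    w = np.asarray(weights, dtype=float)
    return w @ basis  # shape: (num_samples,)

def add_noise_at_snr(speech, noise, snr_db):
    """Scale the synthetic noise so the mixture meets a target SNR constraint."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)      # stand-in for one training utterance
basis = rng.standard_normal((3, 16000))  # three noise basis signals
noisy = add_noise_at_snr(speech, synthesize_noise(basis, [0.5, 0.3, 0.2]), snr_db=10)
```

Varying the basis selection, the number of combined bases, and the weights yields the "wide variety of noise signals" the abstract refers to.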
  • Patent number: 11314942
    Abstract: A computer-implemented method for providing agent assisted transcriptions of user utterances. A user utterance is received in response to a prompt provided to the user at a remote client device. An automatic transcription is generated from the utterance using a language model based upon an application or context, and presented to a human agent. The agent reviews the transcription and may replace at least a portion of the transcription with a corrected transcription. As the agent inputs the corrected transcription, accelerants are presented to the agent comprising suggested text to be inputted. The accelerants may be determined based upon an agent input, an application or context of the transcription, the portion of the transcription being replaced, or any combination thereof. In some cases, the user provides textual input, to which the agent transcribes an intent associated with the input with the aid of one or more accelerants.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: April 26, 2022
    Assignee: Interactions LLC
    Inventors: Ethan Selfridge, Michael Johnston, Robert Lifgren, James Dreher, John Leonard
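A minimal sketch of one kind of accelerant: prefix-based completion over phrases common in the application context. The matching strategy is an assumption; the patent also covers accelerants driven by the replaced span and other signals.

```python
def accelerants(agent_prefix, phrases, limit=3):
    """Suggest completions for the agent's partial input, drawn from phrases
    common in this application context (prefix matching is an assumption)."""
    return [p for p in phrases if p.startswith(agent_prefix)][:limit]

context_phrases = ["check my balance", "check my order status", "cancel my order"]
```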
  • Patent number: 11302325
    Abstract: A chatbot learns a person's related “intents” when asking for information and thereafter, in response to an initial query, which the chatbot answers, the chatbot generates a secondary dialogue, either providing the person with additional information or inquiring as to whether the person wishes to know more about a subject. The chatbot may use an external trigger such as time, event, etc. and automatically generate a query or give information to the person without any initial query from the person.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: April 12, 2022
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Marie Kitajima, Masanori Omote
  • Patent number: 11227606
    Abstract: A compact, self-authenticating, and speaker-verifiable record of an audio communication involving one or more persons comprises a record, encoded on a non-transitory, computer-readable medium, that consists essentially of: a voiceprint for each person whose voice is encoded in the record; a plurality of transcription records, where each transcription record consists essentially of a computer-generated speech-to-text decoding of an utterance and voiceprint associating information that associates a speaker of the utterance with one of the voiceprints stored in the record; and self-authenticating information sufficient to determine whether any of the information encoded in the communication record has been altered.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: January 18, 2022
    Assignee: Medallia, Inc.
    Inventors: Wayne Ramprashad, David Garrod
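One way to realize "self-authenticating information" is a digest over the record's contents, recomputed at verification time. The SHA-256-over-canonical-JSON scheme below is an illustrative stand-in, not the patent's actual mechanism.

```python
import hashlib
import json

def seal_record(voiceprints, transcriptions):
    """Attach a digest so later alteration of the record can be detected.
    (SHA-256 over a canonical JSON encoding; an illustrative stand-in.)"""
    body = {"voiceprints": voiceprints, "transcriptions": transcriptions}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "digest": digest}

def verify_record(record):
    """Recompute the digest over the body and compare with the stored one."""
    body = {k: v for k, v in record.items() if k != "digest"}
    expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return record["digest"] == expected

rec = seal_record(
    voiceprints={"spk1": "a1b2c3"},
    transcriptions=[{"speaker": "spk1", "text": "hello world"}],
)
```

Each transcription entry carries the speaker association the abstract describes; any edit to either part invalidates the digest.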
  • Patent number: 11151981
    Abstract: A computer implemented method, apparatus, and computer program product for a sound system. Speech recognition is performed on input audio data comprising speech input to a sound system. Speech recognition is additionally performed on at least one instance of output audio data comprising speech reproduced by one or more audio speakers of the sound system. A difference between a result of speech recognition performed on the input audio data and a result of speech recognition performed on an instance of corresponding output audio data is determined. The quality of the reproduced speech is determined as unsatisfactory when the difference is greater than or equal to a threshold. A corrective action may be performed, to improve the quality of the speech reproduced by the sound system, if it is determined that the speech quality of the reproduced sound is unsatisfactory.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: October 19, 2021
    Assignee: International Business Machines Corporation
    Inventors: Alexander John Naylor-Teece, Andrew James Dunnings, Oliver Paul Masters
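The comparison between input-side and output-side recognition results can be sketched as a word-level edit distance tested against a threshold. The distance metric and the threshold value are assumptions; the abstract only requires "a difference" and "a threshold".

```python
def word_error_count(ref, hyp):
    """Word-level Levenshtein distance between two transcripts."""
    r, h = ref.split(), hyp.split()
    d = list(range(len(h) + 1))
    for i, rw in enumerate(r, 1):
        prev, d[0] = d[0], i
        for j, hw in enumerate(h, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (rw != hw))
    return d[len(h)]

def reproduction_unsatisfactory(input_text, output_text, threshold=2):
    """Flag degraded playback when the transcript difference reaches the threshold."""
    return word_error_count(input_text, output_text) >= threshold
```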
  • Patent number: 11145309
    Abstract: An apparatus includes processor(s) to: use an acoustic model to generate a first set of probabilities of speech sounds uttered within speech audio; derive at least a first candidate word most likely spoken in the speech audio using the first set; analyze the first set to derive a degree of uncertainty therefor; compare the degree of uncertainty to a threshold; in response to at least the degree of uncertainty being less than the threshold, select the first candidate word as a next word most likely spoken in the speech audio; in response to at least the degree of uncertainty being greater than the threshold, select, as the next word most likely spoken in the speech audio, a second candidate word indicated as being most likely spoken based on a second set of probabilities generated by a language model; and add the next word most likely spoken to a transcript.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: October 12, 2021
    Assignee: SAS INSTITUTE INC.
    Inventor: Xu Yang
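A sketch of the uncertainty gate: here the "degree of uncertainty" is taken to be the entropy of the acoustic model's word posteriors, which is an assumption; the abstract does not fix the metric.

```python
import math

def entropy(probs):
    """Uncertainty of a probability distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def next_word(acoustic_probs, lm_best, threshold=1.0):
    """Use the acoustic model's top word when its distribution is confident;
    otherwise fall back to the language model's candidate."""
    am_best = max(acoustic_probs, key=acoustic_probs.get)
    if entropy(acoustic_probs.values()) < threshold:
        return am_best          # acoustic model is confident enough
    return lm_best              # too uncertain: defer to the language model
```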
  • Patent number: 11145314
    Abstract: Embodiments of the present disclosure provide a method and apparatus for voice identification, a device and a computer readable storage medium. The method may include: for an inputted voice signal, obtaining a first piece of decoded acoustic information by a first acoustic model and obtaining a second piece of decoded acoustic information by a second acoustic model, the second acoustic model being generated by joint modeling of an acoustic model and a language model. The method may further include determining a first group of candidate identification results based on the first piece of decoded acoustic information, determining a second group of candidate identification results based on the second piece of decoded acoustic information, and then determining a final identification result for the voice signal based on the first group of candidate identification results and the second group of candidate identification results.
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: October 12, 2021
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Xingyuan Peng, Junyao Shao, Lei Jia
  • Patent number: 11087743
    Abstract: In some implementations, an utterance is determined to include a particular user speaking a hotword based at least on a first set of samples of the particular user speaking the hotword. In response to determining that an utterance includes a particular user speaking a hotword based at least on a first set of samples of the particular user speaking the hotword, at least a portion of the utterance is stored as a new sample. A second set of samples of the particular user speaking the utterance is obtained, where the second set of samples includes the new sample and less than all the samples in the first set of samples. A second utterance is determined to include the particular user speaking the hotword based at least on the second set of samples of the user speaking the hotword.
    Type: Grant
    Filed: November 13, 2019
    Date of Patent: August 10, 2021
    Assignee: GOOGLE LLC
    Inventors: Ignacio Lopez Moreno, Diego Melendo Casado
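The sample-refresh idea, where each newly verified utterance is stored as a new sample while an old one drops out so the second set contains "less than all the samples in the first set", can be sketched with a bounded deque. The similarity scoring is a stand-in for real hotword/speaker matching.

```python
from collections import deque

class HotwordSamples:
    """Rolling enrollment set: each verified utterance is stored as a new sample,
    evicting the oldest, so the model can track gradual voice changes."""

    def __init__(self, initial_samples, max_size=5):
        self.samples = deque(initial_samples, maxlen=max_size)

    def verify(self, utterance, score_fn, threshold=0.7):
        score = max(score_fn(utterance, s) for s in self.samples)
        if score >= threshold:
            self.samples.append(utterance)  # new sample replaces the oldest
            return True
        return False

enrolled = HotwordSamples(["s1", "s2", "s3"], max_size=3)

def similar(a, b):
    """Stand-in similarity score; a real system would compare audio features."""
    return 1.0 if a.startswith("s") else 0.0
```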
  • Patent number: 11087741
    Abstract: Embodiments of the present disclosure include methods, apparatuses, devices, and computer readable storage mediums for processing far-field environmental noise. The method can comprise processing collected far-field environmental noise into a noise segment in a predetermined format. The method can further comprise establishing a far-field voice recognition model based on the noise segment and a near-field voice segment; and determining validity of the noise segment based on the far-field voice recognition model. The solution of the present disclosure can optimize anti-noise performance of the far-field voice recognition model by differentiated training of noise in different user scenarios of a far-field voice recognition product.
    Type: Grant
    Filed: January 22, 2019
    Date of Patent: August 10, 2021
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Jianwei Sun, Chao Li, Xin Li, Weixin Zhu, Ming Wen
  • Patent number: 11074909
    Abstract: Provided are a device for recognizing a speech input including a named entity from a user and an operating method thereof. The device is configured to: generate a weighted finite state transducer model by using a vocabulary list including a plurality of named entities; obtain a first string from a speech input received from a user, by using a first decoding model; obtain a second string by using a second decoding model that uses the weighted finite state transducer model, the second string including a word sequence, which corresponds to at least one named entity, and an unrecognized word sequence not identified as a named entity; and output a text corresponding to the speech input by substituting the unrecognized word sequence of the second string with a word sequence included in the first string.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: July 27, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kyungmin Lee, Youngho Han, Sangyoon Kim, Donguk Jung, Aahwan Kudumula, Changwoo Han
  • Patent number: 11069337
    Abstract: A voice-content control device includes a voice classifying unit configured to analyze a voice spoken by a user and acquired by a voice acquiring unit to classify the voice as either one of a first voice or a second voice, a process executing unit configured to analyze the acquired voice to execute processing required by the user, and a voice-content generating unit configured to generate, based on content of the executed processing, output sentence that is text data for a voice to be output to the user, wherein the voice-content generating unit is further configured to generate a first output sentence as the output sentence when the analyzed voice has been classified as the first voice, and generate a second output sentence in which information is omitted as compared to the first output sentence as the output sentence when the analyzed voice has been classified as the second voice.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: July 20, 2021
    Assignee: JVC KENWOOD Corporation
    Inventor: Tatsumi Naganuma
  • Patent number: 11043223
    Abstract: Technologies related to voiceprint recognition model construction are disclosed. In an implementation, a first voice input from a user is received. One or more predetermined keywords from the first voice input are detected. One or more voice segments corresponding to the one or more predetermined keywords are recorded. The voiceprint recognition model is trained based on the one or more voice segments. A second voice input is received from a user, and the user's identity is verified based on the second voice input using the voiceprint recognition model.
    Type: Grant
    Filed: June 19, 2020
    Date of Patent: June 22, 2021
    Assignee: Advanced New Technologies Co., Ltd.
    Inventor: Qing Ling
  • Patent number: 11021113
    Abstract: A camera module includes a camera imaging a region outside a rear end portion of a vehicle and a storage storing first and second dictionary information corresponding to a first area and a second area. When the camera takes an image of a pedestrian and a detected latitude and longitude correspond to the first area, the camera module recognizes the image of the pedestrian based on the first dictionary information, outputs a first vehicle control signal based on a recognition result, and outputs a status that the first dictionary information is used. When the camera takes an image of a pedestrian and a detected latitude and longitude correspond to the second area, the camera module recognizes the image of the pedestrian based on the second dictionary information, outputs a second vehicle control signal based on a recognition result, and outputs a status that the second dictionary information is used.
    Type: Grant
    Filed: March 3, 2020
    Date of Patent: June 1, 2021
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Teruo Sakamoto, Sangwon Kim
  • Patent number: 11024287
    Abstract: A method, a device, and a storage medium for correcting an error in a speech recognition result are provided. The method includes: performing phonetic notation on a speech recognition result to be corrected, to obtain a pinyin corresponding to the speech recognition result; obtaining one or more candidate texts according to the pinyin, and determining an optimum candidate text from the one or more candidate texts; judging whether the optimum candidate text satisfies a preset condition; and determining the optimum candidate text as a corrected result of the speech recognition result to be corrected in response to satisfying the preset condition.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: June 1, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Shujie Yao
  • Patent number: 11017783
    Abstract: A device includes a processor configured to determine a feature vector based on an utterance and to determine a first embedding vector by processing the feature vector using a trained embedding network. The processor is configured to determine a first distance metric based on distances between the first embedding vector and each embedding vector of a speaker template. The processor is configured to determine, based on the first distance metric, that the utterance is verified to be from a particular user. The processor is configured to, based on a comparison of a first particular distance metric associated with the first embedding vector to a second distance metric associated with a first test embedding vector of the speaker template, generate an updated speaker template by adding the first embedding vector as a second test embedding vector and removing the first test embedding vector from test embedding vectors of the speaker template.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: May 25, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Sunkuk Moon, Bicheng Jiang, Erik Visser
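A simplified sketch of the template verification and update: cosine distance as the metric, a fixed threshold, and replacing the worst-matching template entry with the new embedding. All three choices are assumptions; the patent's swap rule compares specific distance metrics between test embeddings.

```python
import numpy as np

def cosine_dist(a, b):
    """Cosine distance between two embedding vectors."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify_and_update(embedding, template, threshold=0.4):
    """Accept the utterance if its embedding is close enough to the speaker
    template; on success, swap the new embedding in for the worst-matching
    template entry (a simplification of the patent's test-embedding swap)."""
    dists = [cosine_dist(embedding, t) for t in template]
    if min(dists) > threshold:
        return False, template          # not verified; template unchanged
    worst = int(np.argmax(dists))
    return True, template[:worst] + template[worst + 1:] + [embedding]
```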
  • Patent number: 10957322
    Abstract: Provided is a speech processing apparatus including a word string estimation unit that estimates a word string equivalent to input speech among word strings included in dictionary data, and a calculation unit that calculates, for an element part constituting the word string estimated by the word string estimation unit, a certainty factor in which a content of the element part is equivalent to a content of a corresponding part in the input speech.
    Type: Grant
    Filed: May 31, 2017
    Date of Patent: March 23, 2021
    Assignee: SONY CORPORATION
    Inventors: Emiru Tsunoo, Toshiyuki Kumakura
  • Patent number: 10929606
    Abstract: A method for intelligent assistance includes identifying one or more insertion points within an input comprising text for providing additional information. A follow-up expression that includes at least a portion of the input and the additional information at the one or more insertion points is generated for clarifying or supplementing meaning of the input.
    Type: Grant
    Filed: February 23, 2018
    Date of Patent: February 23, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Justin C. Martineau, Avik Ray, Hongxia Jin
  • Patent number: 10909468
    Abstract: In one embodiment, a set of training data consisting of inliers may be obtained. A supervised classification model may be trained using the set of training data to identify outliers. The supervised classification model may be applied to generate an anomaly score for a data point. It may be determined whether the data point is an outlier based, at least in part, upon the anomaly score.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: February 2, 2021
    Assignee: Verizon Media Inc.
    Inventors: Makoto Yamada, Chao Qin, Hua Ouyang, Achint Thomas, Yi Chang
  • Patent number: 10885920
    Abstract: A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel may include receiving audio stream data of the audio stream with speech from a speaker to be authenticated speaking with a second speaker. A voiceprint may be generated for each data chunk in the audio stream data divided into a plurality of data chunks. The voiceprint for each data chunk may be assessed as to whether the voiceprint has speech belonging to the speaker to be authenticated or to the second speaker using representative voiceprints of both speakers. An accumulated voiceprint may be generated using the verified data chunks with speech of the speaker to be authenticated. The accumulated voiceprint may be compared to the reference voiceprint of the speaker to be authenticated for authenticating the speaker speaking with the second speaker over the audio channel.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: January 5, 2021
    Assignee: NICE LTD
    Inventors: Alon Menahem Shoa, Roman Frenkel, Matan Keret
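The chunk-wise assessment and accumulation can be sketched as follows, with fixed-length embeddings standing in for voiceprints and cosine similarity as the comparison; both are assumptions about the implementation.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two voiceprint embeddings."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def authenticate(chunk_embeddings, rep_target, rep_other, reference, threshold=0.8):
    """Keep the chunks closer to the target speaker's representative voiceprint,
    average them into an accumulated voiceprint, and score it against the
    reference voiceprint of the speaker to be authenticated."""
    target_chunks = [c for c in chunk_embeddings
                     if cosine(c, rep_target) > cosine(c, rep_other)]
    if not target_chunks:
        return False
    accumulated = np.mean(target_chunks, axis=0)
    return cosine(accumulated, reference) >= threshold
```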
  • Patent number: 10885899
    Abstract: A method includes receiving initial training data associated with a trigger phrase in a device and training a voice model in the device using the initial training data. The voice model is used to identify a plurality of voice commands in the device initiated using the trigger phrase. Collection of additional training data from the plurality of voice commands and retraining of the voice model in the device are iteratively performed using the additional training data. A device includes a microphone and a processor to receive initial training data associated with a trigger phrase using the microphone, train a voice model in the device using the initial training data, use the voice model to identify a plurality of voice commands initiated using the trigger phrase, and iteratively collect additional training data from the plurality of voice commands and retrain the voice model in the device using the additional training data.
    Type: Grant
    Filed: October 9, 2018
    Date of Patent: January 5, 2021
    Assignee: Motorola Mobility LLC
    Inventors: Boby Iyer, Amit Kumar Agrawal
  • Patent number: 10878068
    Abstract: An authentication system, comprising: one or more inputs, for receiving biometric input signals from a user; a routing module, configured to selectively route the biometric input signals from the one or more inputs to one or more of a plurality of components, the plurality of components including a biometric authentication module, for processing the biometric input signals and generating an authentication result; and a security module, for receiving a control instruction for the routing module, determining whether or not the control instruction complies with one or more rules, and controlling the routing module based on the control instruction responsive to a determination that the control instruction complies with the one or more rules.
    Type: Grant
    Filed: August 3, 2017
    Date of Patent: December 29, 2020
    Assignee: Cirrus Logic, Inc.
    Inventors: Ryan Roberts, Michael Page
  • Patent number: 10831442
    Abstract: An approach is provided that receives, from a user, an amalgamation at a digital assistant. The amalgamation includes one or more words spoken by the user that are captured by a digital microphone and a set of digital images corresponding to one or more gestures that are performed by the user with the digital images captured by a digital camera. The system then determines an action that is responsive to the amalgamation and then performs the determined action.
    Type: Grant
    Filed: October 19, 2018
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Jeremy R. Fox, Gregory J. Boss, Kelley Anders, Sarbajit K. Rakshit
  • Patent number: 10826857
    Abstract: Some embodiments provide a non-transitory machine-readable medium that stores a program. The program receives a message from a client device. The program further determines a language from a plurality of languages associated with the message. The program also determines a model from a plurality of models that corresponds to the determined language. Based on the determined model, the program further determines a function from a plurality of functions provided by a computing device that is associated with the message. The program also sends the computing device a request to perform the function.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: November 3, 2020
    Assignee: SAP SE
    Inventors: Christopher Trudeau, John Dietz, Amanda Casari, Richard Puckett
  • Patent number: 10818299
    Abstract: A method of verifying a user identity using a Web-based multimodal interface can include sending, to a remote computing device, a multimodal markup language document that, when rendered by the remote computing device, queries a user for a user identifier and causes audio of the user's voice to be sent to a multimodal, Web-based application. The user identifier and the audio can be received at about the same time from the client device. The audio can be compared with a voice print associated with the user identifier. The user at the remote computing device can be selectively granted access to the system according to a result obtained from the comparing step.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: October 27, 2020
    Assignee: Nuance Communications, Inc.
    Inventors: David Jaramillo, Gerald M. McCobb
  • Patent number: 10770062
    Abstract: Techniques are disclosed for adjusting a ranking of information content of a software application based on feedback from a user. One embodiment presented herein includes a method comprising receiving, at a computing device, an audio stream comprising audio of the user, the audio being indicative of feedback related to information content. The method further comprises analyzing the audio stream for paralinguistic information to determine an attribute of the user. The method further comprises adjusting a ranking of the information content based on at least one of the feedback and additional feedback and the determined attribute of the user.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: September 8, 2020
    Assignee: INTUIT INC.
    Inventors: Raymond Chan, Igor A. Podgorny, Benjamin Indyk
  • Patent number: 10762904
    Abstract: A method of operating an electronic device and an electronic device thereof are provided. The method includes receiving a first voice signal of a first user, authenticating whether the first user has authority to control the electronic device, based on the first voice signal, and determining an instruction corresponding to the first voice signal based on an authentication result and controlling the electronic device according to the instruction. The electronic device includes a receiver configured to receive a first voice signal of a first user and at least one processor configured to authenticate whether the first user has authority to control the electronic device based on the first voice signal, determine an instruction corresponding to the first voice signal, and control the electronic device according to the instruction.
    Type: Grant
    Filed: February 24, 2017
    Date of Patent: September 1, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Anas Toma, Ahmad Abu Shariah, Hadi Jadallah
  • Patent number: 10714094
    Abstract: Technologies related to voiceprint recognition model construction are disclosed. In an implementation, a first voice input from a user is received. One or more predetermined keywords from the first voice input are detected. One or more voice segments corresponding to the one or more predetermined keywords are recorded. The voiceprint recognition model is trained based on the one or more voice segments. A second voice input is received from a user, and the user's identity is verified based on the second voice input using the voiceprint recognition model.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: July 14, 2020
    Assignee: Alibaba Group Holding Limited
    Inventor: Qing Ling
  • Patent number: 10652655
    Abstract: A volume and speech frequency level adjustment method, system, and computer program product include learning a preferred level and a characteristic of at least one of volume and speech frequency from a historical conference conversation, detecting a context characteristic of an ongoing conversation and an interaction of a user with a device, determining a cognitive state and a contextual situation of the user in relation to the ongoing conversation as a function of at least one of the context characteristic, a preferred level and a characteristic of the volume or the speech frequency, and the interaction, determining at least one factor to trigger an audio level modulation based on the function, and dynamically adjusting audio levels of the ongoing conversation for the user based on the at least one factor.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 12, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Komminist Weldemariam, Abdigani Diriye, Michael S. Gordon, Heike E. Riel
  • Patent number: 10636419
    Abstract: A chatbot learns a person's related “intents” when asking for information and thereafter, in response to an initial query, which the chatbot answers, the chatbot generates a secondary dialogue, either providing the person with additional information or inquiring as to whether the person wishes to know more about a subject. The chatbot may use an external trigger such as time, event, etc. and automatically generate a query or give information to the person without any initial query from the person.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: April 28, 2020
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Marie Kitajima, Masanori Omote
  • Patent number: 10629184
    Abstract: Cepstral variance normalization is described for audio feature extraction.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: April 21, 2020
    Assignee: Intel Corporation
    Inventors: Tobias Bocklet, Adam Marek
  • Patent number: 10592604
    Abstract: Techniques for inverse text normalization are provided. In some examples, speech input is received and a spoken-form text representation of the speech input is generated. The spoken-form text representation includes a token sequence. A feature representation is determined for the spoken-form text representation and a sequence of labels is determined based on the feature representation. The sequence of labels is assigned to the token sequence and specifies a plurality of edit operations to perform on the token sequence. Each edit operation of the plurality of edit operations corresponds to one of a plurality of predetermined types of edit operations. A written-form text representation of the speech input is generated by applying the plurality of edit operations to the token sequence in accordance with the sequence of labels. A task responsive to the speech input is performed using the generated written-form text representation.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: March 17, 2020
    Assignee: Apple Inc.
    Inventors: Ernest J. Pusateri, Bharat Ram Ambati, Elizabeth S. Brooks, Donald R. McAllaster, Venkatesh Nagesha, Ondrej Platek
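Applying a sequence of edit-operation labels to spoken-form tokens might look like the sketch below. The label inventory (KEEP, DIGIT, MERGE_DIGIT, DELETE) and the digit table are invented for illustration; the patent only requires a plurality of predetermined edit-operation types.

```python
SPOKEN_DIGITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

def apply_itn_labels(tokens, labels):
    """Apply per-token edit-operation labels to turn spoken-form tokens into
    written form (illustrative label set)."""
    out = []
    for token, label in zip(tokens, labels):
        if label == "KEEP":
            out.append(token)
        elif label == "DIGIT":
            out.append(str(SPOKEN_DIGITS[token]))
        elif label == "MERGE_DIGIT":
            out[-1] += str(SPOKEN_DIGITS[token])  # join to previous token, no space
        elif label == "DELETE":
            pass  # drop filler tokens
    return " ".join(out)
```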
  • Patent number: 10573296
    Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: February 25, 2020
    Assignee: Apprente LLC
    Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz
  • Patent number: 10565191
    Abstract: Systems and methods for utilizing a cognitive device are disclosed. A method includes: receiving, by a computer device, a query from a cognitive device; processing, by the computer device, the query to generate a processed query; transmitting, by the computer device, the processed query to a mobile device; receiving, by the computer device, an action query result from the mobile device based on the mobile device receiving the processed query and performing an action query; and transmitting, by the computer device, the action query result to the cognitive device based on receiving the action query result.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: February 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Trent W. Boyer
  • Patent number: 10559299
    Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
    Type: Grant
    Filed: June 10, 2019
    Date of Patent: February 11, 2020
    Assignee: Apprente LLC
    Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz
  • Patent number: 10553218
    Abstract: In a speaker recognition apparatus, audio features are extracted from a received recognition speech signal, and first order Gaussian mixture model (GMM) statistics are generated therefrom based on a universal background model that includes a plurality of speaker models. The first order GMM statistics are normalized with regard to a duration of the received speech signal. A deep neural network reduces the dimensionality of the normalized first order GMM statistics and outputs a voiceprint corresponding to the recognition speech signal.
    Type: Grant
    Filed: September 19, 2017
    Date of Patent: February 4, 2020
    Assignee: Pindrop Security, Inc.
    Inventors: Elie Khoury, Matthew Garland
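The duration normalization in this abstract is a standard trick for making utterance statistics comparable across lengths. A minimal sketch, assuming a toy 2-component, 1-D GMM in place of a trained universal background model: accumulate posterior-weighted first order statistics over the frames, then divide by the frame count so a short and a long utterance with the same content yield the same statistics.

```python
import math

GMM = [  # (weight, mean, variance) per component; toy values for illustration
    (0.5, 0.0, 1.0),
    (0.5, 3.0, 1.0),
]

def posterior(x):
    """Per-component responsibility of one feature frame x."""
    likes = [w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
             for w, m, v in GMM]
    total = sum(likes)
    return [l / total for l in likes]

def normalized_first_order_stats(frames):
    """Accumulate first-order Baum-Welch stats, then normalize by duration."""
    f = [0.0] * len(GMM)
    for x in frames:
        for c, p in enumerate(posterior(x)):
            f[c] += p * x
    T = len(frames)               # duration in frames
    return [fi / T for fi in f]   # normalization makes stats length-invariant

short = normalized_first_order_stats([0.0, 3.0])
longer = normalized_first_order_stats([0.0, 3.0] * 10)
# short and longer are (numerically) the same despite different durations
```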
  • Patent number: 10535354
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method include actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: January 14, 2020
    Assignee: Google LLC
    Inventor: Raziel Alvarez Guevara
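The selection step in this abstract can be sketched as scoring each candidate's acoustic data against the enrollment data and keeping the most similar subset for building the detection model. The feature vectors and the choice of cosine similarity below are illustrative assumptions, not the patented method.

```python
def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def select_candidates(enrollment, candidates, k):
    """Return the k candidates most similar to the enrollment acoustic data."""
    scored = sorted(candidates, key=lambda c: cosine(enrollment, c["features"]),
                    reverse=True)
    return scored[:k]

enrollment = [1.0, 0.0, 0.5]
candidates = [
    {"speaker": "a", "features": [0.9, 0.1, 0.4]},
    {"speaker": "b", "features": [-1.0, 0.2, 0.0]},
    {"speaker": "c", "features": [1.0, 0.0, 0.6]},
]
subset = select_candidates(enrollment, candidates, k=2)
print([c["speaker"] for c in subset])  # the two speakers closest to enrollment
```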
  • Patent number: 10515640
    Abstract: An example apparatus for generating dialogue includes an audio receiver to receive audio data including speech. The apparatus also includes a verification score generator to generate a verification score based on the audio data. The apparatus further includes a user detector to detect that the verification score exceeds a lower threshold but does not exceed a higher threshold. The apparatus includes a dialogue generator to generate dialogue to solicit additional audio data to be used to generate an updated verification score in response to detecting that the verification score exceeds a lower threshold but does not exceed a higher threshold.
    Type: Grant
    Filed: November 8, 2017
    Date of Patent: December 24, 2019
    Assignee: Intel Corporation
    Inventors: Jonathan Huang, David Pearce, Willem M. Beltman
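The two-threshold logic in this abstract is straightforward to sketch: a score above the higher threshold verifies the user, a score at or below the lower threshold rejects, and the grey zone in between triggers dialogue soliciting additional audio. The threshold values and prompt text below are illustrative assumptions.

```python
LOWER, HIGHER = 0.4, 0.8  # hypothetical verification-score thresholds

def handle_verification(score):
    """Decide among verify, reject, and solicit-more-audio."""
    if score > HIGHER:
        return "verified"
    if score <= LOWER:
        return "rejected"
    # Grey zone: generate dialogue to solicit additional audio for an
    # updated verification score.
    return "ask: could you please repeat that?"

print(handle_verification(0.9))  # verified
print(handle_verification(0.6))  # dialogue soliciting more audio
print(handle_verification(0.2))  # rejected
```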
  • Patent number: 10453117
    Abstract: A system capable of performing natural language understanding (NLU) using different application domains in parallel. A model takes incoming query text and determines a list of potential supplemental intent categories corresponding to the text. Supplemental applications within those categories are then identified as likely candidates for responding to the query. Application specific domains, including NLU components for the particular supplemental applications, are then activated and process the query text in parallel. Further, certain system default domains may also process incoming queries substantially in parallel with the supplemental applications. The different results are scored and ranked to determine highest scoring NLU results.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: October 22, 2019
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Simon Peter Reavely, Rohit Prasad, Imre Attila Kiss, Manoj Sindhwani
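The parallel-domain flow this abstract describes can be sketched as running several domain-specific NLU components on the same query text concurrently, then scoring and ranking the results. The two toy domains and their keyword-based scores below are stand-ins invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def music_domain(text):
    """Toy NLU component for a music application domain."""
    return {"domain": "music", "score": 0.9 if "play" in text else 0.1}

def weather_domain(text):
    """Toy NLU component for a weather application domain."""
    return {"domain": "weather", "score": 0.9 if "weather" in text else 0.1}

def rank_nlu_results(text, domains):
    """Run each domain's NLU on the query in parallel, then rank by score."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda d: d(text), domains))
    return sorted(results, key=lambda r: r["score"], reverse=True)

best = rank_nlu_results("play some jazz", [music_domain, weather_domain])[0]
print(best["domain"])  # music
```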
  • Patent number: 10447315
    Abstract: In one embodiment, a system provides for optimizing an error rate of data through a communication channel. The system includes a data generator operable to generate a training sequence as a Markov code, and to propagate the training sequence through the communication channel. The system also includes a Soft Output Viterbi Algorithm (SOVA) detector operable to estimate data values of the training sequence after propagation through the communication channel. The system also includes an optimizer operable to compare the estimated data values to the generated training sequence, to determine an error rate based on the comparison, and to change the training sequence based on the Markov code to lower the error rate of the data through the communication channel.
    Type: Grant
    Filed: August 15, 2017
    Date of Patent: October 15, 2019
    Assignee: Seagate Technology LLC
    Inventor: Raman Venkataramani
  • Patent number: 10438585
    Abstract: A voice recording device connected to a network, comprising: a voice recording circuit that acquires voice and records it as a voice file; a transmission circuit that transmits the voice file to the network; a control circuit including an information extraction section that extracts information associated with the voice file; and a display that displays the associated information.
    Type: Grant
    Filed: April 29, 2017
    Date of Patent: October 8, 2019
    Assignee: Olympus Corporation
    Inventors: Kenta Yumoto, Takafumi Onishi, Kazushi Fujitani, Ryusuke Hamakawa
  • Patent number: 10438593
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method include actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
    Type: Grant
    Filed: July 22, 2015
    Date of Patent: October 8, 2019
    Assignee: Google LLC
    Inventor: Raziel Alvarez Guevara
  • Patent number: 10410628
    Abstract: Techniques are disclosed for adjusting a ranking of information content of a software application based on feedback from a user. One embodiment presented herein includes a method comprising receiving, at a computing device, an audio stream comprising audio of the user, the audio being indicative of feedback related to information content. The method further comprises analyzing the audio stream for paralinguistic information to determine an attribute of the user. The method further comprises adjusting a ranking of the information content based on at least one of the feedback or additional feedback, and on the determined attribute of the user.
    Type: Grant
    Filed: June 11, 2018
    Date of Patent: September 10, 2019
    Assignee: INTUIT, INC.
    Inventors: Raymond Chan, Igor A. Podgorny, Benjamin Indyk
  • Patent number: 10410627
    Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model, includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: September 10, 2019
    Assignee: Google LLC
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno Mengibar
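The dictionary-generation step in this abstract pairs a word extracted from the transcript with the pronunciation extracted from the aligned audio. A minimal sketch, where the alignment result is a toy stand-in for an actual transcript/audio synchronization step:

```python
pronunciation_dictionary = {}

def add_dictionary_entry(word, phones):
    """Generate a pronunciation dictionary entry for an extracted word."""
    pronunciation_dictionary.setdefault(word, set()).add(tuple(phones))

# Hypothetical output of synchronizing a transcript with its audio recording:
alignments = [
    ("tomato", ["t", "ah", "m", "ey", "t", "ow"]),
    ("tomato", ["t", "ah", "m", "aa", "t", "ow"]),
]
for word, phones in alignments:
    add_dictionary_entry(word, phones)

print(len(pronunciation_dictionary["tomato"]))  # two pronunciation variants
```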
  • Patent number: 10403288
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speaker diarization are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The actions further include determining that the audio data includes an utterance of a predefined hotword spoken by a first speaker. The actions further include identifying a first portion of the audio data that includes speech from the first speaker. The actions further include identifying a second portion of the audio data that includes speech from a second, different speaker. The actions further include transmitting the first portion of the audio data that includes speech from the first speaker and suppressing transmission of the second portion of the audio data that includes speech from the second, different speaker.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: September 3, 2019
    Assignee: Google LLC
    Inventors: Aleksandar Kracun, Richard Cameron Rose
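The selective transmission this abstract describes can be sketched as a filter: once the hotword speaker is identified, only that speaker's audio segments are forwarded and other speakers' segments are suppressed. The segment labels below are assumed to come from an upstream diarization step.

```python
def filter_for_hotword_speaker(segments, hotword_speaker):
    """Keep audio from the speaker who said the hotword; suppress the rest."""
    return [s["audio"] for s in segments if s["speaker"] == hotword_speaker]

segments = [
    {"speaker": "spk1", "audio": "ok computer, what's on my calendar"},
    {"speaker": "spk2", "audio": "tell it to add a meeting"},
    {"speaker": "spk1", "audio": "and the weather tomorrow"},
]
out = filter_for_hotword_speaker(segments, hotword_speaker="spk1")
print(len(out))  # 2 segments transmitted, 1 suppressed
```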
  • Patent number: 10295965
    Abstract: A vector of actual sensor values is received. A maturity of a model is determined and the maturity is defined for sensors. A function that translates model maturity to model range inhibition measure is determined. A model range inhibition (MRI) measure is determined. An MRI limit based upon the MRI measure is determined. The received vector is compared to the MRI limit and the model is selectively changed based upon the comparing. In other aspects, vectors are received having actual values of driver and response sensors. A function that provides a set of boundaries between acceptable observations and unacceptable observations is also determined. Measures of similarity between vectors are determined. The measures of similarity and the function are compared and the model is selectively changed based upon the comparing.
    Type: Grant
    Filed: September 6, 2013
    Date of Patent: May 21, 2019
    Assignee: GE INTELLIGENT PLATFORMS, INC.
    Inventors: Devang Jagdish Gandhi, James Paul Herzog
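The comparison step in this abstract can be sketched as deriving a model range inhibition (MRI) limit from model maturity and checking an incoming sensor vector against it before the model is allowed to change. The maturity-to-measure function and limit formula below are illustrative assumptions, not the patented definitions.

```python
def mri_measure(maturity):
    """Hypothetical function translating model maturity (0..1) to an MRI measure."""
    return 1.0 - maturity          # immature models inhibit a wider range

def should_update_model(vector, baseline, maturity):
    """Change the model only when the vector falls inside the MRI limit."""
    limit = mri_measure(maturity)  # MRI limit derived from the measure
    deviation = max(abs(v - b) for v, b in zip(vector, baseline))
    return deviation <= limit

baseline = [1.0, 2.0]
print(should_update_model([1.1, 2.1], baseline, maturity=0.5))  # True
print(should_update_model([2.0, 3.5], baseline, maturity=0.5))  # False
```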
  • Patent number: 10204621
    Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and a hybrid estimation technique, transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a nonlinear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: February 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel
  • Patent number: 10204620
    Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and maximum likelihood linear regression (MLLR), transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a linear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: February 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel
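The transformation step shared by the two IBM patents above can be sketched as passing labeled speech features through a speaker-dependent linear transform of the kind MLLR estimates, before the transformed data is used to adjust the DNN acoustic model. The 2-D transform matrix and bias below are invented for illustration.

```python
A = [[1.1, 0.0],   # hypothetical MLLR-style transform matrix
     [0.0, 0.9]]
b = [0.2, -0.1]    # hypothetical bias term

def transform_features(frame):
    """Apply the linear transform y = A @ x + b to one feature frame."""
    return [sum(A[i][j] * frame[j] for j in range(len(frame))) + b[i]
            for i in range(len(A))]

transformed = [transform_features(f) for f in [[1.0, 2.0], [0.0, 0.0]]]
print(transformed)  # approximately [[1.3, 1.7], [0.2, -0.1]]
```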
  • Patent number: 10192548
    Abstract: An electronic device includes a microphone that receives an audio signal that includes a spoken trigger phrase, and a processor that is electrically coupled to the microphone. The processor measures characteristics of the audio signal, and determines, based on the measured characteristics, whether the spoken trigger phrase is acceptable for trigger phrase model training. If the spoken trigger phrase is determined not to be acceptable for trigger phrase model training, the processor rejects the trigger phrase for trigger phrase model training.
    Type: Grant
    Filed: June 2, 2017
    Date of Patent: January 29, 2019
    Assignee: Google Technology Holdings LLC
    Inventors: Joel A. Clark, Tenkasi V. Ramabadran, Mark A. Jasiuk
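The gating this abstract describes, measuring characteristics of the captured trigger phrase and rejecting unsuitable ones before model training, can be sketched as a few threshold checks. The specific checks and threshold values below are illustrative assumptions.

```python
MIN_DURATION_S = 0.5          # hypothetical minimum phrase length
MIN_SNR_DB = 10.0             # hypothetical minimum signal-to-noise ratio
MAX_CLIPPED_FRACTION = 0.01   # hypothetical maximum clipped-sample fraction

def acceptable_for_training(duration_s, snr_db, clipped_fraction):
    """Reject trigger phrases that are too short, too noisy, or clipped."""
    if duration_s < MIN_DURATION_S:
        return False
    if snr_db < MIN_SNR_DB:
        return False
    if clipped_fraction > MAX_CLIPPED_FRACTION:
        return False
    return True

print(acceptable_for_training(1.2, 18.0, 0.0))  # True: clean utterance
print(acceptable_for_training(1.2, 4.0, 0.0))   # False: too noisy
```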
  • Patent number: 10192554
    Abstract: A method may include obtaining audio data originating at a first device during a communication session between the first device and a second device, providing the audio data to a first speech recognition system to generate a first transcript based on the audio data, and directing the first transcript to the second device. The method may also include, in response to obtaining a quality indication regarding a quality of the first transcript, multiplexing the audio data to provide the audio data to a second speech recognition system to generate a second transcript based on the audio data, while continuing to provide the audio data to the first speech recognition system and direct the first transcript to the second device, and, in response to obtaining a transfer indication that occurs after multiplexing of the audio data, directing the second transcript to the second device instead of the first transcript.
    Type: Grant
    Filed: February 26, 2018
    Date of Patent: January 29, 2019
    Assignee: Sorenson IP Holdings, LLC
    Inventors: Kenneth Boehme, Michael Holm, Shane Roylance
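The hand-off in this abstract can be sketched as a small router with two events: a quality indication multiplexes the audio to a second ASR system while the first keeps running, and a later transfer indication switches which system's transcript is forwarded to the device. The event and system names below are illustrative.

```python
class TranscriptRouter:
    def __init__(self):
        self.recipients = {"asr1"}   # systems currently receiving audio
        self.active = "asr1"         # system whose transcript is forwarded

    def on_quality_indication(self):
        """Multiplex the audio: feed both ASR systems in parallel."""
        self.recipients.add("asr2")

    def on_transfer_indication(self):
        """Direct the second system's transcript to the device instead."""
        self.active = "asr2"

router = TranscriptRouter()
router.on_quality_indication()
print(sorted(router.recipients))  # both systems now receive audio
router.on_transfer_indication()
print(router.active)  # asr2
```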