Update Patterns Patents (Class 704/244)
  • Patent number: 11069337
    Abstract: A voice-content control device includes a voice classifying unit configured to analyze a voice spoken by a user and acquired by a voice acquiring unit to classify the voice as either a first voice or a second voice, a process executing unit configured to analyze the acquired voice to execute processing required by the user, and a voice-content generating unit configured to generate, based on content of the executed processing, an output sentence that is text data for a voice to be output to the user, wherein the voice-content generating unit is further configured to generate a first output sentence as the output sentence when the analyzed voice has been classified as the first voice, and to generate, as the output sentence, a second output sentence in which information is omitted as compared to the first output sentence when the analyzed voice has been classified as the second voice.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: July 20, 2021
    Assignee: JVC KENWOOD Corporation
    Inventor: Tatsumi Naganuma
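A minimal sketch, not JVC KENWOOD's implementation, of the class-dependent verbosity idea described above: the same processing result is rendered as a full sentence for one voice class and as an abbreviated sentence for the other. The classifier, the `is_repeat` feature, and the weather payload are hypothetical.

```python
# Hypothetical illustration of class-dependent response verbosity.
def classify_voice(audio_features: dict) -> str:
    """Toy stand-in for the voice classifying unit."""
    # Assumption: a repeated or hurried utterance counts as the "second voice".
    return "second" if audio_features.get("is_repeat") else "first"

def generate_output_sentence(voice_class: str, result: dict) -> str:
    """Render the processing result with class-dependent verbosity."""
    if voice_class == "first":
        return (f"Tomorrow's weather in {result['city']} is {result['sky']}, "
                f"with a high of {result['high_c']} degrees.")
    # Second output sentence: information omitted relative to the first one.
    return f"{result['sky']}, high of {result['high_c']}."

result = {"city": "Yokohama", "sky": "sunny", "high_c": 21}
print(generate_output_sentence(classify_voice({"is_repeat": True}), result))
```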
  • Patent number: 11043223
    Abstract: Technologies related to voiceprint recognition model construction are disclosed. In an implementation, a first voice input from a user is received. One or more predetermined keywords from the first voice input are detected. One or more voice segments corresponding to the one or more predetermined keywords are recorded. The voiceprint recognition model is trained based on the one or more voice segments. A second voice input is received from a user, and the user's identity is verified based on the second voice input using the voiceprint recognition model.
    Type: Grant
    Filed: June 19, 2020
    Date of Patent: June 22, 2021
    Assignee: Advanced New Technologies Co., Ltd.
    Inventor: Qing Ling
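For illustration only, a sketch of the keyword-segment enrollment/verification flow under stated assumptions: frame-level features for the recorded keyword segments are already available, and a small Gaussian mixture model stands in for the patent's voiceprint recognition model.

```python
# Hypothetical voiceprint enrollment/verification on keyword segments.
import numpy as np
from sklearn.mixture import GaussianMixture

def enroll(keyword_segments: list) -> GaussianMixture:
    """Train a per-user voiceprint model on frames from the keyword segments."""
    frames = np.vstack(keyword_segments)              # (n_frames, n_features)
    return GaussianMixture(n_components=8, covariance_type="diag").fit(frames)

def verify(model: GaussianMixture, test_frames, threshold: float = -45.0) -> bool:
    """Accept the claimed identity if the mean log-likelihood is high enough."""
    return model.score(test_frames) >= threshold      # score() = mean log-likelihood

# Synthetic features standing in for three recorded keyword segments.
rng = np.random.default_rng(0)
voiceprint = enroll([rng.normal(size=(200, 20)) for _ in range(3)])
print(verify(voiceprint, rng.normal(size=(150, 20))))
```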
  • Patent number: 11024287
    Abstract: A method, a device, and a storage medium for correcting an error in a speech recognition result are provided. The method includes: performing phonetic notation on a speech recognition result to be corrected, to obtain a pinyin corresponding to the speech recognition result; obtaining one or more candidate texts according to the pinyin, and determining an optimum candidate text from the one or more candidate texts; judging whether the optimum candidate text satisfies a preset condition; and determining the optimum candidate text as a corrected result of the speech recognition result to be corrected in response to satisfying the preset condition.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: June 1, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Shujie Yao
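A toy sketch of the phonetic-correction loop (not Baidu's method): the recognized text is converted to pinyin, candidates sharing that pinyin are retrieved, and the best candidate replaces the original only when a preset condition is met. `to_pinyin`, the phrase book, and the scoring function are placeholders.

```python
# Hypothetical pinyin-based ASR error correction.
PHRASE_BOOK = {"zhang san": ["张三"], "bei jing": ["北京"]}   # candidates by pinyin

def to_pinyin(text: str) -> str:
    """Placeholder phonetic-notation step (a real system might use a pinyin
    converter library)."""
    return {"帐三": "zhang san", "背景": "bei jing"}.get(text, text)

def correct(asr_text: str, lm_score, min_gain: float = 0.5) -> str:
    candidates = PHRASE_BOOK.get(to_pinyin(asr_text), [])
    if not candidates:
        return asr_text
    best = max(candidates, key=lm_score)
    # Preset condition: the candidate must score clearly better than the original.
    return best if lm_score(best) - lm_score(asr_text) >= min_gain else asr_text

print(correct("帐三", lm_score=lambda t: 1.0 if t == "张三" else 0.0))
```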
  • Patent number: 11021113
    Abstract: A camera module includes a camera imaging a region outside a rear end portion of a vehicle and a storage storing first and second dictionary information corresponding to a first area and a second area. When the camera takes an image of a pedestrian and a detected latitude and longitude correspond to the first area, the camera module recognizes the image of the pedestrian based on the first dictionary information, outputs a first vehicle control signal based on a recognition result, and outputs a status that the first dictionary information is used. When the camera takes an image of a pedestrian and a detected latitude and longitude correspond to the second area, the camera module recognizes the image of the pedestrian based on the second dictionary information, outputs a second vehicle control signal based on a recognition result, and outputs a status that the second dictionary information is used.
    Type: Grant
    Filed: March 3, 2020
    Date of Patent: June 1, 2021
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Teruo Sakamoto, Sangwon Kim
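A toy sketch of the area-dependent dictionary selection described above, under the assumption that areas are simple latitude/longitude boxes; the area bounds, dictionary identifiers, and status strings are invented.

```python
# Hypothetical selection of a recognition dictionary by detected position.
AREAS = {
    "first area":  {"lat": (35.0, 36.0), "lon": (139.0, 140.0), "dict": "dict_area_1"},
    "second area": {"lat": (34.0, 35.0), "lon": (135.0, 136.0), "dict": "dict_area_2"},
}

def select_dictionary(lat: float, lon: float):
    for name, a in AREAS.items():
        if a["lat"][0] <= lat < a["lat"][1] and a["lon"][0] <= lon < a["lon"][1]:
            return a["dict"], f"status: {name} dictionary in use"
    return "dict_default", "status: default dictionary in use"

print(select_dictionary(35.68, 139.69))   # -> ('dict_area_1', 'status: first area dictionary in use')
```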
  • Patent number: 11017783
    Abstract: A device includes a processor configured to determine a feature vector based on an utterance and to determine a first embedding vector by processing the feature vector using a trained embedding network. The processor is configured to determine a first distance metric based on distances between the first embedding vector and each embedding vector of a speaker template. The processor is configured to determine, based on the first distance metric, that the utterance is verified to be from a particular user. The processor is configured to, based on a comparison of a first particular distance metric associated with the first embedding vector to a second distance metric associated with a first test embedding vector of the speaker template, generate an updated speaker template by adding the first embedding vector as a second test embedding vector and removing the first test embedding vector from test embedding vectors of the speaker template.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: May 25, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Sunkuk Moon, Bicheng Jiang, Erik Visser
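A minimal numpy sketch, not QUALCOMM's method, of the template-maintenance idea: an utterance embedding is verified by its mean distance to the speaker template and, if accepted, replaces the worst-matching test embedding. Thresholds and dimensions are invented.

```python
# Hypothetical speaker-template update based on embedding distances.
import numpy as np

def mean_distance(embedding: np.ndarray, vectors: np.ndarray) -> float:
    """Mean Euclidean distance from one embedding to a set of embeddings."""
    return float(np.linalg.norm(vectors - embedding, axis=1).mean())

def maybe_update_template(embedding, template, test_indices, accept_thresh=1.0):
    if mean_distance(embedding, template) > accept_thresh:
        return template, False                          # utterance not verified
    # Replace the worst-matching test embedding with the new one.
    dists = [mean_distance(template[i], np.delete(template, i, axis=0))
             for i in test_indices]
    template[test_indices[int(np.argmax(dists))]] = embedding
    return template, True

rng = np.random.default_rng(1)
template = rng.normal(scale=0.1, size=(5, 16))          # 5 stored embeddings
_, verified = maybe_update_template(rng.normal(scale=0.1, size=16),
                                    template, test_indices=[3, 4])
print("verified:", verified)
```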
  • Patent number: 10957322
    Abstract: Provided is a speech processing apparatus including a word string estimation unit that estimates a word string equivalent to input speech among word strings included in dictionary data, and a calculation unit that calculates, for an element part constituting the word string estimated by the word string estimation unit, a certainty factor that the content of the element part is equivalent to the content of a corresponding part in the input speech.
    Type: Grant
    Filed: May 31, 2017
    Date of Patent: March 23, 2021
    Assignee: SONY CORPORATION
    Inventors: Emiru Tsunoo, Toshiyuki Kumakura
  • Patent number: 10929606
    Abstract: A method for intelligent assistance includes identifying one or more insertion points within an input comprising text for providing additional information. A follow-up expression that includes at least a portion of the input and the additional information at the one or more insertion points is generated for clarifying or supplementing meaning of the input.
    Type: Grant
    Filed: February 23, 2018
    Date of Patent: February 23, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Justin C. Martineau, Avik Ray, Hongxia Jin
  • Patent number: 10909468
    Abstract: In one embodiment, a set of training data consisting of inliers may be obtained. A supervised classification model may be trained using the set of training data to identify outliers. The supervised classification model may be applied to generate an anomaly score for a data point. It may be determined whether the data point is an outlier based, at least in part, upon the anomaly score.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: February 2, 2021
    Assignee: Verizon Media Inc.
    Inventors: Makoto Yamada, Chao Qin, Hua Ouyang, Achint Thomas, Yi Chang
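An illustrative sketch of training on inliers only and scoring new points for anomaly; a one-class SVM is used here as a stand-in for the patent's supervised classification model, so treat it as an analogy rather than the claimed method.

```python
# Train on inliers only, then score a new data point for anomaly.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
inliers = rng.normal(size=(500, 4))                     # training data: inliers only
model = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(inliers)

candidate = np.array([[6.0, 6.0, 6.0, 6.0]])            # far from the inlier cloud
anomaly_score = -model.decision_function(candidate)[0]  # higher = more anomalous
print("outlier" if anomaly_score > 0 else "inlier", round(anomaly_score, 3))
```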
  • Patent number: 10885899
    Abstract: A method includes receiving initial training data associated with a trigger phrase in a device and training a voice model in the device using the initial training data. The voice model is used to identify a plurality of voice commands in the device initiated using the trigger phrase. Collection of additional training data from the plurality of voice commands and retraining of the voice model in the device are iteratively performed using the additional training data. A device includes a microphone and a processor to receive initial training data associated with a trigger phrase using the microphone, train a voice model in the device using the initial training data, use the voice model to identify a plurality of voice commands initiated using the trigger phrase, and iteratively collect additional training data from the plurality of voice commands and retrain the voice model in the device using the additional training data.
    Type: Grant
    Filed: October 9, 2018
    Date of Patent: January 5, 2021
    Assignee: Motorola Mobility LLC
    Inventors: Boby Iyer, Amit Kumar Agrawal
  • Patent number: 10885920
    Abstract: A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel may include receiving audio stream data of the audio stream with speech from a speaker to be authenticated speaking with a second speaker. A voiceprint may be generated for each data chunk in the audio stream data divided into a plurality of data chunks. The voiceprint for each data chunk may be assessed as to whether the voiceprint has speech belonging to the speaker to be authenticated or to the second speaker using representative voiceprints of both speakers. An accumulated voiceprint may be generated using the verified data chunks with speech of the speaker to be authenticated. The accumulated voiceprint may be compared to the reference voiceprint of the speaker to be authenticated for authenticating the speaker speaking with the second speaker over the audio channel.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: January 5, 2021
    Assignee: NICE LTD
    Inventors: Alon Menahem Shoa, Roman Frenkel, Matan Keret
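A hypothetical sketch of the chunk-level separation idea: keep only the chunks whose voiceprint is closer to the target speaker's reference than to the other speaker's, average them into an accumulated voiceprint, and authenticate against the reference. `embed_chunk` is a placeholder for a real voiceprint extractor.

```python
# Hypothetical chunk filtering, accumulation, and authentication.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def authenticate(chunks, embed_chunk, ref_target, ref_other, threshold=0.7):
    embeddings = [embed_chunk(c) for c in chunks]
    kept = [e for e in embeddings if cosine(e, ref_target) > cosine(e, ref_other)]
    if not kept:
        return False
    accumulated = np.mean(kept, axis=0)                 # accumulated voiceprint
    return cosine(accumulated, ref_target) >= threshold

rng = np.random.default_rng(0)
ref_a, ref_b = rng.normal(size=32), rng.normal(size=32)
chunks = [object()] * 4                                 # stand-ins for audio chunks
print(authenticate(chunks, lambda c: ref_a + 0.1 * rng.normal(size=32), ref_a, ref_b))
```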
  • Patent number: 10878068
    Abstract: An authentication system, comprising: one or more inputs, for receiving biometric input signals from a user; a routing module, configured to selectively route the biometric input signals from the one or more inputs to one or more of a plurality of components, the plurality of components including a biometric authentication module, for processing the biometric input signals and generating an authentication result; and a security module, for receiving a control instruction for the routing module, determining whether or not the control instruction complies with one or more rules, and controlling the routing module based on the control instruction responsive to a determination that the control instruction complies with the one or more rules.
    Type: Grant
    Filed: August 3, 2017
    Date of Patent: December 29, 2020
    Assignee: Cirrus Logic, Inc.
    Inventors: Ryan Roberts, Michael Page
  • Patent number: 10831442
    Abstract: An approach is provided that receives, from a user, an amalgamation at a digital assistant. The amalgamation includes one or more words spoken by the user that are captured by a digital microphone and a set of digital images corresponding to one or more gestures that are performed by the user with the digital images captured by a digital camera. The system then determines an action that is responsive to the amalgamation and then performs the determined action.
    Type: Grant
    Filed: October 19, 2018
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Jeremy R. Fox, Gregory J. Boss, Kelley Anders, Sarbajit K. Rakshit
  • Patent number: 10826857
    Abstract: Some embodiments provide a non-transitory machine-readable medium that stores a program. The program receives a message from a client device. The program further determines a language from a plurality of languages associated with the message. The program also determines a model from a plurality of models that corresponds to the determined language. Based on the determined model, the program further determines a function from a plurality of functions provided by a computing device that is associated with the message. The program also sends the computing device a request to perform the function.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: November 3, 2020
    Assignee: SAP SE
    Inventors: Christopher Trudeau, John Dietz, Amanda Casari, Richard Puckett
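A toy dispatch sketch of the described flow under stated assumptions: a trivial language detector selects a per-language intent model, which maps the message to a device function to request. The model contents and device identifier are invented.

```python
# Hypothetical language-dependent routing of a message to a device function.
MODELS = {
    "en": lambda text: "turn_on_lights" if "light" in text.lower() else "unknown",
    "de": lambda text: "turn_on_lights" if "licht" in text.lower() else "unknown",
}

def detect_language(text: str) -> str:
    # Placeholder: a real system would use a trained language identifier.
    return "de" if "licht" in text.lower() else "en"

def handle_message(text: str) -> dict:
    function = MODELS[detect_language(text)](text)
    return {"device": "hub-01", "request": function}    # request sent to the device

print(handle_message("Bitte das Licht einschalten"))
```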
  • Patent number: 10818299
    Abstract: A method of verifying a user identity using a Web-based multimodal interface can include sending, to a remote computing device, a multimodal markup language document that, when rendered by the remote computing device, queries a user for a user identifier and causes audio of the user's voice to be sent to a multimodal, Web-based application. The user identifier and the audio can be received at about the same time from the client device. The audio can be compared with a voice print associated with the user identifier. The user at the remote computing device can be selectively granted access to the system according to a result obtained from the comparing step.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: October 27, 2020
    Assignee: Nuance Communications, Inc.
    Inventors: David Jaramillo, Gerald M. McCobb
  • Patent number: 10770062
    Abstract: Techniques are disclosed for adjusting a ranking of information content of a software application based on feedback from a user. One embodiment presented herein includes a method comprising receiving, at a computing device, an audio stream comprising audio of the user, the audio being indicative of feedback related to information content. The method further comprises analyzing the audio stream for paralinguistic information to determine an attribute of the user. The method further comprises adjusting a ranking of the information content based on at least one of the feedback and additional feedback and the determined attribute of the user.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: September 8, 2020
    Assignee: INTUIT INC.
    Inventors: Raymond Chan, Igor A. Podgorny, Benjamin Indyk
  • Patent number: 10762904
    Abstract: A method of operating an electronic device and an electronic device thereof are provided. The method includes receiving a first voice signal of a first user, authenticating whether the first user has authority to control the electronic device, based on the first voice signal, and determining an instruction corresponding to the first voice signal based on an authentication result and controlling the electronic device according to the instruction. The electronic device includes a receiver configured to receive a first voice signal of a first user and at least one processor configured to authenticate whether the first user has authority to control the electronic device based on the first voice signal, determine an instruction corresponding to the first voice signal, and control the electronic device according to the instruction.
    Type: Grant
    Filed: February 24, 2017
    Date of Patent: September 1, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Anas Toma, Ahmad Abu Shariah, Hadi Jadallah
  • Patent number: 10714094
    Abstract: Technologies related to voiceprint recognition model construction are disclosed. In an implementation, a first voice input from a user is received. One or more predetermined keywords from the first voice input are detected. One or more voice segments corresponding to the one or more predetermined keywords are recorded. The voiceprint recognition model is trained based on the one or more voice segments. A second voice input is received from a user, and the user's identity is verified based on the second voice input using the voiceprint recognition model.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: July 14, 2020
    Assignee: Alibaba Group Holding Limited
    Inventor: Qing Ling
  • Patent number: 10652655
    Abstract: A volume and speech frequency level adjustment method, system, and computer program product include learning a preferred level and a characteristic of at least one of volume and speech frequency from a historical conference conversation, detecting a context characteristic of an ongoing conversation and an interaction of a user with a device, determining a cognitive state and a contextual situation of the user in relation to the ongoing conversation as a function of at least one of the context characteristic, a preferred level and a characteristic of the volume or the speech frequency, and the interaction, determining at least one factor to trigger an audio level modulation based on the function, and dynamically adjusting audio levels of the ongoing conversation for the user based on the at least one factor.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 12, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Komminist Weldemariam, Abdigani Diriye, Michael S. Gordon, Heike E. Riel
  • Patent number: 10636419
    Abstract: A chatbot learns a person's related “intents” when asking for information and thereafter, in response to an initial query, which the chatbot answers, the chatbot generates a secondary dialogue, either providing the person with additional information or inquiring as to whether the person wishes to know more about a subject. The chatbot may use an external trigger such as time, event, etc. and automatically generate a query or give information to the person without any initial query from the person.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: April 28, 2020
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Marie Kitajima, Masanori Omote
  • Patent number: 10629184
    Abstract: Cepstral variance normalization is described for audio feature extraction.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: April 21, 2020
    Assignee: Intel Corporation
    Inventors: Tobias Bocklet, Adam Marek
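The abstract is terse, so here is a minimal sketch of standard cepstral mean and variance normalization (CMVN), the technique this patent builds on: each cepstral coefficient is shifted to zero mean and scaled to unit variance over the utterance, which suppresses stationary channel effects.

```python
# Per-utterance cepstral mean and variance normalization.
import numpy as np

def cmvn(cepstra: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """cepstra: (n_frames, n_coeffs) matrix, e.g. MFCC features."""
    return (cepstra - cepstra.mean(axis=0)) / (cepstra.std(axis=0) + eps)

features = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(300, 13))
normalized = cmvn(features)
print(normalized.mean(axis=0).round(6)[:3], normalized.std(axis=0).round(3)[:3])
```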
  • Patent number: 10592604
    Abstract: Techniques for inverse text normalization are provided. In some examples, speech input is received and a spoken-form text representation of the speech input is generated. The spoken-form text representation includes a token sequence. A feature representation is determined for the spoken-form text representation and a sequence of labels is determined based on the feature representation. The sequence of labels is assigned to the token sequence and specifies a plurality of edit operations to perform on the token sequence. Each edit operation of the plurality of edit operations corresponds to one of a plurality of predetermined types of edit operations. A written-form text representation of the speech input is generated by applying the plurality of edit operations to the token sequence in accordance with the sequence of labels. A task responsive to the speech input is performed using the generated written-form text representation.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: March 17, 2020
    Assignee: Apple Inc.
    Inventors: Ernest J. Pusateri, Bharat Ram Ambati, Elizabeth S. Brooks, Donald R. McAllaster, Venkatesh Nagesha, Ondrej Platek
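A hypothetical sketch of label-driven inverse text normalization: each spoken-form token receives one of a few predetermined edit operations, and applying the labels yields written-form text. The label sequence is hard-coded here, whereas the patent derives it from a feature representation of the token sequence.

```python
# Apply per-token edit-operation labels to produce written-form text.
def apply_edits(tokens, labels):
    out = []
    for token, label in zip(tokens, labels):
        op, _, arg = label.partition(":")
        if op == "KEEP":
            out.append(token)
        elif op == "SUB":                  # replace the token with a written form
            out.append(arg)
        elif op == "APPEND" and out:       # glue a written form onto the previous token
            out[-1] += arg
        elif op == "DELETE":
            pass
    return " ".join(out)

tokens = ["wake", "me", "at", "seven", "thirty", "a", "m"]
labels = ["KEEP", "KEEP", "KEEP", "SUB:7", "APPEND::30", "APPEND: AM", "DELETE"]
print(apply_edits(tokens, labels))         # -> "wake me at 7:30 AM"
```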
  • Patent number: 10573296
    Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: February 25, 2020
    Assignee: Apprente LLC
    Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz
  • Patent number: 10565191
    Abstract: Systems and methods for utilizing a cognitive device are disclosed. A method includes: receiving, by a computer device, a query from a cognitive device; processing, by the computer device, the query to generate a processed query; transmitting, by the computer device, the processed query to a mobile device; receiving, by the computer device, an action query result from the mobile device based on the mobile device receiving the processed query and performing an action query; and transmitting, by the computer device, the action query result to the cognitive device based on receiving the action query result.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: February 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Trent W. Boyer
  • Patent number: 10559299
    Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
    Type: Grant
    Filed: June 10, 2019
    Date of Patent: February 11, 2020
    Assignee: Apprente LLC
    Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz
  • Patent number: 10553218
    Abstract: In a speaker recognition apparatus, audio features are extracted from a received recognition speech signal, and first order Gaussian mixture model (GMM) statistics are generated therefrom based on a universal background model that includes a plurality of speaker models. The first order GMM statistics are normalized with regard to a duration of the received speech signal. The deep neural network reduces a dimensionality of the normalized first order GMM statistics, and outputs a voiceprint corresponding to the recognition speech signal.
    Type: Grant
    Filed: September 19, 2017
    Date of Patent: February 4, 2020
    Assignee: Pindrop Security, Inc.
    Inventors: Elie Khoury, Matthew Garland
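An illustrative numpy sketch of duration-normalized first-order GMM statistics (the deep-neural-network embedding stage is omitted). A small Gaussian mixture stands in for the universal background model, and all dimensions are invented.

```python
# Duration-normalized first-order GMM statistics for an utterance.
import numpy as np
from sklearn.mixture import GaussianMixture

def normalized_first_order_stats(frames, ubm):
    """frames: (n_frames, n_features). Dividing by the frame count makes
    utterances of different durations comparable."""
    posteriors = ubm.predict_proba(frames)              # (n_frames, n_components)
    first_order = posteriors.T @ frames                 # (n_components, n_features)
    return (first_order / frames.shape[0]).ravel()

rng = np.random.default_rng(0)
ubm = GaussianMixture(n_components=4).fit(rng.normal(size=(2000, 10)))
stats = normalized_first_order_stats(rng.normal(size=(300, 10)), ubm)
print(stats.shape)   # (40,) -> would be fed to a DNN to produce the voiceprint
```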
  • Patent number: 10535354
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method include actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: January 14, 2020
    Assignee: Google LLC
    Inventor: Raziel Alvarez Guevara
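A toy sketch of the candidate-selection step: score each candidate utterance's similarity to the enrollment utterance and keep the most similar ones to seed a speaker-specific detection model. Embeddings here are random placeholders rather than real acoustic data.

```python
# Select the candidate utterances most similar to the enrollment utterance.
import numpy as np

def select_candidates(enroll_emb, candidate_embs, k=3):
    sims = candidate_embs @ enroll_emb / (
        np.linalg.norm(candidate_embs, axis=1) * np.linalg.norm(enroll_emb))
    top = np.argsort(sims)[::-1][:k]                    # indices of best matches
    return top, sims[top]

rng = np.random.default_rng(0)
idx, scores = select_candidates(rng.normal(size=64), rng.normal(size=(20, 64)))
print(idx, scores.round(2))   # these candidates would seed the detection model
```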
  • Patent number: 10515640
    Abstract: An example apparatus for generating dialogue includes an audio receiver to receive audio data including speech. The apparatus also includes a verification score generator to generate a verification score based on the audio data. The apparatus further includes a user detector to detect that the verification score exceeds a lower threshold but does not exceed a higher threshold. The apparatus includes a dialogue generator to generate dialogue to solicit additional audio data to be used to generate an updated verification score in response to detecting that the verification score exceeds a lower threshold but does not exceed a higher threshold.
    Type: Grant
    Filed: November 8, 2017
    Date of Patent: December 24, 2019
    Assignee: Intel Corporation
    Inventors: Jonathan Huang, David Pearce, Willem M. Beltman
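A toy sketch of the two-threshold behavior described above: a score above the higher threshold verifies the user, a score below the lower threshold rejects, and anything in between triggers a clarifying prompt so more audio can be collected and rescored. The threshold values are invented.

```python
# Two-threshold verification with a dialogue fallback in the uncertain band.
def respond(verification_score: float, low: float = 0.4, high: float = 0.8) -> str:
    if verification_score >= high:
        return "verified"
    if verification_score <= low:
        return "rejected"
    return "Could you say that again?"   # solicit additional audio, then rescore

for score in (0.9, 0.6, 0.2):
    print(score, "->", respond(score))
```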
  • Patent number: 10453117
    Abstract: A system capable of performing natural language understanding (NLU) using different application domains in parallel. A model takes incoming query text and determines a list of potential supplemental intent categories corresponding to the text. Supplemental applications within those categories are then identified as likely candidates for responding to the query. Application specific domains, including NLU components for the particular supplemental applications, are then activated and process the query text in parallel. Further, certain system default domains may also process incoming queries substantially in parallel with the supplemental applications. The different results are scored and ranked to determine highest scoring NLU results.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: October 22, 2019
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Simon Peter Reavely, Rohit Prasad, Imre Attila Kiss, Manoj Sindhwani
  • Patent number: 10447315
    Abstract: In one embodiment, a system provides for optimizing an error rate of data through a communication channel. The system includes a data generator operable to generate a training sequence as a Markov code, and to propagate the training sequence through the communication channel. The system also includes a Soft Output Viterbi Algorithm (SOVA) detector operable to estimate data values of the training sequence after propagation through the communication channel. The system also includes an optimizer operable to compare the estimated data values to the generated training sequence, to determine an error rate based on the comparison, and to change the training sequence based on the Markov code to lower the error rate of the data through the communication channel.
    Type: Grant
    Filed: August 15, 2017
    Date of Patent: October 15, 2019
    Assignee: Seagate Technologies LLC
    Inventor: Raman Venkataramani
  • Patent number: 10438593
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method include actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
    Type: Grant
    Filed: July 22, 2015
    Date of Patent: October 8, 2019
    Assignee: Google LLC
    Inventor: Raziel Alvarez Guevara
  • Patent number: 10438585
    Abstract: A voice recording device connectable to a network, comprising a voice recording circuit that acquires voice and records the acquired voice as a voice file, a transmission circuit that transmits the voice file to the network, and a control circuit, the control circuit including an information extraction section that extracts associated information that has been associated with the voice file, and a display that displays the associated information for the voice file.
    Type: Grant
    Filed: April 29, 2017
    Date of Patent: October 8, 2019
    Assignee: Olympus Corporation
    Inventors: Kenta Yumoto, Takafumi Onishi, Kazushi Fujitani, Ryusuke Hamakawa
  • Patent number: 10410628
    Abstract: Techniques are disclosed for adjusting a ranking of information content of a software application based on feedback from a user. One embodiment presented herein includes a method comprising receiving, at a computing device, an audio stream comprising audio of the user, the audio being indicative of feedback related to information content. The method further comprises analyzing the audio stream for paralinguistic information to determine an attribute of the user. The method further comprises adjusting a ranking of the information content based on at least one of the feedback and additional feedback and the determined attribute of the user.
    Type: Grant
    Filed: June 11, 2018
    Date of Patent: September 10, 2019
    Assignee: INTUIT, INC.
    Inventors: Raymond Chan, Igor A. Podgorny, Benjamin Indyk
  • Patent number: 10410627
    Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model, includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: September 10, 2019
    Assignee: Google LLC
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno Mengibar
  • Patent number: 10403288
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speaker diarization are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The actions further include determining that the audio data includes an utterance of a predefined hotword spoken by a first speaker. The actions further include identifying a first portion of the audio data that includes speech from the first speaker. The actions further include identifying a second portion of the audio data that includes speech from a second, different speaker. The actions further include transmitting the first portion of the audio data that includes speech from the first speaker and suppressing transmission of the second portion of the audio data that includes speech from the second, different speaker.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: September 3, 2019
    Assignee: Google LLC
    Inventors: Aleksandar Kracun, Richard Cameron Rose
  • Patent number: 10295965
    Abstract: A vector of actual sensor values is received. A maturity of a model is determined and the maturity is defined for sensors. A function that translates model maturity to model range inhibition measure is determined. A model range inhibition (MRI) measure is determined. An MRI limit based upon the MRI measure is determined. The received vector is compared to the MRI limit and the model is selectively changed based upon the comparing. In other aspects, vectors are received having actual values of driver and response sensors. A function that provides a set of boundaries between acceptable observations and unacceptable observations is also determined. Measures of similarity between vectors are determined. The measures of similarity and the function are compared and the model is selectively changed based upon the comparing.
    Type: Grant
    Filed: September 6, 2013
    Date of Patent: May 21, 2019
    Assignee: GE INTELLIGENT PLATFORMS, INC
    Inventors: Devang Jagdish Gandhi, James Paul Herzog
  • Patent number: 10204620
    Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and maximum likelihood linear regression (MLLR), transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a linear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: February 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel
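A minimal sketch of the feature-transformation step in the spirit of MLLR-style adaptation: labeled speech features are mapped through an affine transform estimated for the target speaker before being used to adjust the DNN acoustic model. The transform here is a random placeholder rather than an estimated one.

```python
# Apply a (placeholder) speaker-dependent affine transform to labeled features.
import numpy as np

rng = np.random.default_rng(0)
n_features = 40
W = np.eye(n_features) + 0.05 * rng.normal(size=(n_features, n_features))
b = 0.1 * rng.normal(size=n_features)

labeled_features = rng.normal(size=(500, n_features))   # (frames, features)
transformed = labeled_features @ W.T + b                # speaker-adapted features
print(transformed.shape)  # these frames would then fine-tune the DNN acoustic model
```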
  • Patent number: 10204621
    Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and a hybrid estimation technique, transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a nonlinear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: February 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel
  • Patent number: 10192548
    Abstract: An electronic device includes a microphone that receives an audio signal that includes a spoken trigger phrase, and a processor that is electrically coupled to the microphone. The processor measures characteristics of the audio signal, and determines, based on the measured characteristics, whether the spoken trigger phrase is acceptable for trigger phrase model training. If the spoken trigger phrase is determined not to be acceptable for trigger phrase model training, the processor rejects the trigger phrase for trigger phrase model training.
    Type: Grant
    Filed: June 2, 2017
    Date of Patent: January 29, 2019
    Assignee: Google Technology Holdings LLC
    Inventors: Joel A. Clark, Tenkasi V. Ramabadran, Mark A. Jasiuk
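A hypothetical acceptability gate for trigger-phrase training data: measure a few simple signal characteristics and reject recordings unlikely to yield a good model. The specific checks and thresholds are invented for illustration.

```python
# Decide whether a recorded trigger phrase is acceptable for model training.
import numpy as np

def acceptable_for_training(samples: np.ndarray, sample_rate: int = 16000) -> bool:
    duration = len(samples) / sample_rate
    rms = np.sqrt(np.mean(samples ** 2))
    clipped_fraction = np.mean(np.abs(samples) > 0.99)
    return (0.5 <= duration <= 3.0          # not too short, not too long
            and rms > 0.01                  # loud enough
            and clipped_fraction < 0.01)    # essentially no clipping

rng = np.random.default_rng(0)
utterance = 0.1 * rng.normal(size=16000)    # one second of placeholder audio
print(acceptable_for_training(utterance))
```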
  • Patent number: 10192554
    Abstract: A method may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to a first speech recognition system to generate a first transcript based on the audio data and directing the first transcript to the second device. The method may also include in response to obtaining a quality indication regarding a quality of the first transcript, multiplexing the audio data to provide the audio data to a second speech recognition system to generate a second transcript based on the audio data while continuing to provide the audio data to the first speech recognition system and direct the first transcript to the second device, and in response to obtaining a transfer indication that occurs after multiplexing of the audio data, directing the second transcript to the second device instead of the first transcript.
    Type: Grant
    Filed: February 26, 2018
    Date of Patent: January 29, 2019
    Assignee: Sorenson IP Holdings, LLC
    Inventors: Kenneth Boehme, Michael Holm, Shane Roylance
  • Patent number: 10152975
    Abstract: A method, device, system, and computer medium for providing interactive advertising are provided. For example, a device may request an advertisement from a remote server, receive the advertisement, receive a response from a user who is listening and/or watching the advertisement, and transmit the response to the server for further action. The user may input a response by speaking. A server may receive an advertisement request from the device, select an advertisement based on pre-defined one or more criteria, transmit the selected advertisement to the device for play, receive from the device a response to the selected advertisement, and then perform an action corresponding to the received response.
    Type: Grant
    Filed: January 5, 2015
    Date of Patent: December 11, 2018
    Assignee: XAPPMEDIA, INC.
    Inventors: Patrick B. Higbie, John P. Kelvie, Michael M. Myers, Franklin D. Raines
  • Patent number: 10152973
    Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that it may be used to update the models and data as it becomes available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
    Type: Grant
    Filed: November 16, 2015
    Date of Patent: December 11, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Bjorn Hoffmeister, Hugh Evan Secker-Walker, Jeffrey Cornelius O'Neill
  • Patent number: 10146765
    Abstract: A text prediction engine, a system comprising a text prediction engine, and a method for generating sequence predictions. The text prediction engine, system and method generate multiple sequence predictions based on evidence sources and models, with each sequence prediction having a sequence and associated probability estimate.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: December 4, 2018
    Assignee: Touchtype Ltd.
    Inventors: Benjamin Medlock, Douglas Alexander Harper Orr
  • Patent number: 10140976
    Abstract: Methods and systems for language processing includes training one or more automatic speech recognition models using an automatic speech recognition dictionary. A set of N automatic speech recognition hypotheses for an input is determined, based on the one or more automatic speech recognition models, using a processor. A best hypothesis is selected using a discriminative language model and a list of relevant words. Natural language processing is performed on the best hypothesis.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: November 27, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Tohru Nagano
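An illustrative rescoring sketch: combine a placeholder discriminative language-model score with a bonus for words from a relevant-word list, then pick the best of the N hypotheses. The word list, weights, and scorer are invented.

```python
# Pick the best ASR hypothesis using an LM score plus a relevant-word bonus.
RELEVANT_WORDS = {"invoice", "refund", "account"}

def pick_best(hypotheses, lm_score, relevance_weight=0.5):
    def total(hyp):
        bonus = sum(word in RELEVANT_WORDS for word in hyp.lower().split())
        return lm_score(hyp) + relevance_weight * bonus
    return max(hypotheses, key=total)

n_best = ["I need a refund on my account",
          "I knead a re fund on my a count"]
print(pick_best(n_best, lm_score=lambda h: -0.1 * len(h.split())))
```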
  • Patent number: 10127927
    Abstract: A method for emotion or speaking style recognition and/or clustering comprises receiving one or more speech samples, generating a set of training data by extracting one or more acoustic features from every frame of the one or more speech samples, and generating a model from the set of training data, wherein the model identifies emotion or speaking style dependent information in the set of training data. The method may further comprise receiving one or more test speech samples, generating a set of test data by extracting one or more acoustic features from every frame of the one or more test speech samples, transforming the set of test data using the model to better represent emotion/speaking style dependent information, and using the transformed data for clustering and/or classification to discover speech with similar emotion or speaking style.
    Type: Grant
    Filed: June 18, 2015
    Date of Patent: November 13, 2018
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Ozlem Kalinli-Akbacak, Ruxin Chen
  • Patent number: 10083696
    Abstract: A method for determining user liveness is provided that includes calculating, by a computing device, a spectral property difference between voice biometric data captured from a user and user record voice biometric data. The user and the computing device constitute a user-computing device pair, and the voice biometric data is captured by the computing device during a verification transaction. Moreover, the method includes inputting the spectral property difference into a machine learning algorithm, calculating an output score with the machine learning algorithm, and determining the voice biometric data was captured from a live user when the output score satisfies a threshold score.
    Type: Grant
    Filed: March 7, 2017
    Date of Patent: September 25, 2018
    Assignee: DAON HOLDINGS LIMITED
    Inventor: Raphael Blouet
  • Patent number: 10079687
    Abstract: The embodiments herein provide a method and system for password recovery using fuzzy logic. The system includes a receiving module, a validation module, an authentication module, a display module, a memory module, and a network interface. The system uses a phonetic algorithm such as the Soundex algorithm to enable the password recovery process. The user credentials received through the receiving module are validated by the validation module at the time of accessing the application. The authentication module is configured to authenticate the user using fuzzy logic derived from a phonetic algorithm, by matching the user's answers with the stored answers to compute a score, which is compared with a threshold score. The user is enabled to unlock the user device when the computed validation score is greater than the threshold score.
    Type: Grant
    Filed: April 12, 2016
    Date of Patent: September 18, 2018
    Assignee: ILANTUS TECHNOLOGIES PVT. LTD.
    Inventors: Ashutosh Kumar Mishra, Saurav Sharma, Deepika Kuntar
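To make the mechanism concrete, here is a small sketch of Soundex-based fuzzy matching of security answers; the scoring rule and threshold are assumptions, not ILANTUS's implementation.

```python
# Soundex-based fuzzy comparison of a stored answer and a provided answer.
def soundex(word: str) -> str:
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4", **dict.fromkeys("mn", "5"),
             "r": "6"}
    word = word.lower()
    result, prev = word[0].upper(), codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":               # h and w do not reset the previous code
            prev = code
    return (result + "000")[:4]

def answer_score(stored: str, provided: str) -> float:
    """Fraction of word pairs whose Soundex codes match."""
    pairs = list(zip(stored.lower().split(), provided.lower().split()))
    return sum(soundex(a) == soundex(b) for a, b in pairs) / max(len(pairs), 1)

score = answer_score("Smith Street", "Smyth Streat")
print(score, score >= 0.75)              # compared against a threshold to unlock
```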
  • Patent number: 9997157
    Abstract: Systems and methods are provided for improving language models for speech recognition by personalizing knowledge sources utilized by the language models to specific users or user-population characteristics. A knowledge source, such as a knowledge graph, is personalized for a particular user by mapping entities or user actions from usage history for the user, such as query logs, to the knowledge source. The personalized knowledge source may be used to build a personal language model by training a language model with queries corresponding to entities or entity pairs that appear in usage history. In some embodiments, a personalized knowledge source for a specific user can be extended based on personalized knowledge sources of similar users.
    Type: Grant
    Filed: May 16, 2014
    Date of Patent: June 12, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Murat Akbacak, Dilek Z. Hakkani-Tur, Gokhan Tur, Larry P. Heck, Benoit Dumoulin
  • Patent number: 9984678
    Abstract: Various technologies described herein pertain to adapting a speech recognizer to input speech data. A first linear transform can be selected from a first set of linear transforms based on a value of a first variability source corresponding to the input speech data, and a second linear transform can be selected from a second set of linear transforms based on a value of a second variability source corresponding to the input speech data. The linear transforms in the first and second sets can compensate for the first variability source and the second variability source, respectively. Moreover, the first linear transform can be applied to the input speech data to generate intermediate transformed speech data, and the second linear transform can be applied to the intermediate transformed speech data to generate transformed speech data. Further, speech can be recognized based on the transformed speech data to obtain a result.
    Type: Grant
    Filed: March 23, 2012
    Date of Patent: May 29, 2018
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Michael Lewis Seltzer, Alejandro Acero
  • Patent number: 9959862
    Abstract: A speech recognition apparatus based on a deep-neural-network (DNN) sound model includes a memory and a processor. As the processor executes a program stored in the memory, the processor generates sound-model state sets corresponding to a plurality of pieces of set training speech data included in multi-set training speech data, generates a multi-set state cluster from the sound-model state sets, and sets the multi-set training speech data as an input node and the multi-set state cluster as output nodes so as to learn a DNN structured parameter.
    Type: Grant
    Filed: June 20, 2016
    Date of Patent: May 1, 2018
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Byung Ok Kang, Jeon Gue Park, Hwa Jeon Song, Yun Keun Lee, Eui Sok Chung
  • Patent number: 9953636
    Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model, includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Grant
    Filed: October 9, 2015
    Date of Patent: April 24, 2018
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno Mengibar