Patents in Class 704/244 (Update Patterns)
-
Patent number: 10438593
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
Type: Grant
Filed: July 22, 2015
Date of Patent: October 8, 2019
Assignee: Google LLC
Inventor: Raziel Alvarez Guevara
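The candidate-selection step described above can be sketched in a few lines: score every candidate utterance against the enrollment utterance and keep the top-k most similar for building the detection model. This is a hypothetical illustration, not the patent's method — the feature vectors below are plain lists of floats standing in for real acoustic embeddings, and cosine similarity is an assumed scoring function.

```python
# Hypothetical sketch: rank candidate utterances by similarity to the
# enrollment utterance and keep the top-k for detection-model training.
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def select_candidates(enrollment, candidates, k):
    """Return the k candidate vectors most similar to the enrollment vector."""
    ranked = sorted(candidates,
                    key=lambda c: cosine_similarity(enrollment, c),
                    reverse=True)
    return ranked[:k]

# Toy vectors: the first and third candidates resemble the enrollment.
enroll = [1.0, 0.0, 1.0]
cands = [[1.0, 0.1, 0.9], [0.0, 1.0, 0.0], [0.9, 0.0, 1.1]]
top = select_candidates(enroll, cands, 2)
```

A real system would use speaker embeddings from an acoustic front end in place of the toy vectors.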
-
Patent number: 10438585
Abstract: A voice recording device that connects/is connected to a network, comprising a voice recording circuit that acquires voice and records the acquired voice as a voice file, a transmission circuit that transmits the voice file to a network, and a control circuit, the control circuit including an information extraction section that extracts associated information that has been associated with the voice file, and a display that displays the associated information associated with a voice data file.
Type: Grant
Filed: April 29, 2017
Date of Patent: October 8, 2019
Assignee: Olympus Corporation
Inventors: Kenta Yumoto, Takafumi Onishi, Kazushi Fujitani, Ryusuke Hamakawa
-
Patent number: 10410628
Abstract: Techniques are disclosed for adjusting a ranking of information content of a software application based on feedback from a user. One embodiment presented herein includes a method comprising receiving, at a computing device, an audio stream comprising audio of the user, the audio being indicative of feedback related to information content. The method further comprises analyzing the audio stream for paralinguistic information to determine an attribute of the user. The method further comprises adjusting a ranking of the information content based on at least one of the feedback and additional feedback and the determined attribute of the user.
Type: Grant
Filed: June 11, 2018
Date of Patent: September 10, 2019
Assignee: INTUIT, INC.
Inventors: Raymond Chan, Igor A. Podgorny, Benjamin Indyk
-
Patent number: 10410627
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model, includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
Type: Grant
Filed: March 15, 2018
Date of Patent: September 10, 2019
Assignee: Google LLC
Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno Mengibar
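The "revise probabilities based on recent language usage" idea above can be sketched as interpolating a baseline unigram model with a distribution estimated from fresh query logs. This is a minimal illustration under stated assumptions: the interpolation weight `alpha`, the toy counts, and the unigram framing are all invented for the example, not taken from the patent.

```python
# Hypothetical sketch: blend baseline word probabilities with word
# frequencies observed in recent search queries.
from collections import Counter

def update_unigrams(baseline, queries, alpha=0.3):
    """Interpolate baseline probabilities with recent query frequencies."""
    counts = Counter(w for q in queries for w in q.split())
    total = sum(counts.values())
    updated = {}
    for word, p in baseline.items():
        recent = counts[word] / total if total else 0.0
        updated[word] = (1 - alpha) * p + alpha * recent
    return updated

# A word that has surged in queries gets its probability boosted.
baseline = {"weather": 0.01, "selfie": 0.0001}
queries = ["selfie stick", "best selfie app"]
model = update_unigrams(baseline, queries)
```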
-
Patent number: 10403288
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speaker diarization are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The actions further include determining that the audio data includes an utterance of a predefined hotword spoken by a first speaker. The actions further include identifying a first portion of the audio data that includes speech from the first speaker. The actions further include identifying a second portion of the audio data that includes speech from a second, different speaker. The actions further include transmitting the first portion of the audio data that includes speech from the first speaker and suppressing transmission of the second portion of the audio data that includes speech from the second, different speaker.
Type: Grant
Filed: October 17, 2017
Date of Patent: September 3, 2019
Assignee: Google LLC
Inventors: Aleksandar Kracun, Richard Cameron Rose
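The transmit/suppress step above can be sketched as a filter: segments whose speaker representation is close to that of the hotword speaker are kept, the rest are suppressed. Everything concrete here is an assumption for illustration — the toy 2-D embeddings, the Euclidean distance metric, and the threshold are not from the patent.

```python
# Hypothetical sketch: keep only audio segments attributed to the speaker
# who uttered the hotword, based on embedding distance.
import math

def filter_segments(hotword_emb, segments, threshold=0.5):
    """Keep segments whose embedding is within `threshold` of the hotword speaker."""
    kept = []
    for emb, audio in segments:
        if math.dist(hotword_emb, emb) <= threshold:
            kept.append(audio)  # transmit this portion
        # otherwise: suppress transmission of this portion
    return kept

hotword = (0.0, 1.0)
segments = [((0.1, 0.9), "hey computer, set a timer"),
            ((2.0, -1.0), "background talker")]
transmitted = filter_segments(hotword, segments)
```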
-
Patent number: 10295965
Abstract: A vector of actual sensor values is received. A maturity of a model is determined and the maturity is defined for sensors. A function that translates model maturity to model range inhibition measure is determined. A model range inhibition (MRI) measure is determined. An MRI limit based upon the MRI measure is determined. The received vector is compared to the MRI limit and the model is selectively changed based upon the comparing. In other aspects, vectors are received having actual values of driver and response sensors. A function that provides a set of boundaries between acceptable observations and unacceptable observations is also determined. Measures of similarity between vectors are determined. The measures of similarity and the function are compared and the model is selectively changed based upon the comparing.
Type: Grant
Filed: September 6, 2013
Date of Patent: May 21, 2019
Assignee: GE INTELLIGENT PLATFORMS, INC.
Inventors: Devang Jagdish Gandhi, James Paul Herzog
-
Patent number: 10204620
Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and maximum likelihood linear regression (MLLR), transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a linear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
Type: Grant
Filed: September 7, 2016
Date of Patent: February 12, 2019
Assignee: International Business Machines Corporation
Inventors: Xiaodong Cui, Vaibhava Goel
-
Patent number: 10204621
Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and a hybrid estimation technique, transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a nonlinear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
Type: Grant
Filed: September 7, 2016
Date of Patent: February 12, 2019
Assignee: International Business Machines Corporation
Inventors: Xiaodong Cui, Vaibhava Goel
-
Patent number: 10192548
Abstract: An electronic device includes a microphone that receives an audio signal that includes a spoken trigger phrase, and a processor that is electrically coupled to the microphone. The processor measures characteristics of the audio signal, and determines, based on the measured characteristics, whether the spoken trigger phrase is acceptable for trigger phrase model training. If the spoken trigger phrase is determined not to be acceptable for trigger phrase model training, the processor rejects the trigger phrase for trigger phrase model training.
Type: Grant
Filed: June 2, 2017
Date of Patent: January 29, 2019
Assignee: Google Technology Holdings LLC
Inventors: Joel A. Clark, Tenkasi V. Ramabadran, Mark A. Jasiuk
-
Patent number: 10192554
Abstract: A method may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to a first speech recognition system to generate a first transcript based on the audio data and directing the first transcript to the second device. The method may also include in response to obtaining a quality indication regarding a quality of the first transcript, multiplexing the audio data to provide the audio data to a second speech recognition system to generate a second transcript based on the audio data while continuing to provide the audio data to the first speech recognition system and direct the first transcript to the second device, and in response to obtaining a transfer indication that occurs after multiplexing of the audio data, directing the second transcript to the second device instead of the first transcript.
Type: Grant
Filed: February 26, 2018
Date of Patent: January 29, 2019
Assignee: Sorenson IP Holdings, LLC
Inventors: Kenneth Boehme, Michael Holm, Shane Roylance
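The switch-over logic above can be sketched as a small state machine: audio always reaches the primary recognizer; a quality indication starts multiplexing to a secondary recognizer; and a later transfer indication makes the secondary transcript the one delivered. The class and recognizer stand-ins below are hypothetical, simplified to synchronous calls.

```python
# Hypothetical sketch of quality-triggered multiplexing and transfer
# between two speech recognition systems.
class TranscriptRouter:
    def __init__(self, primary, secondary):
        self.primary = primary        # callable: audio -> transcript
        self.secondary = secondary    # callable: audio -> transcript
        self.multiplexing = False
        self.transferred = False

    def on_quality_indication(self):
        # Start sending audio to the secondary recognizer as well.
        self.multiplexing = True

    def on_transfer_indication(self):
        # Valid only after multiplexing has begun.
        if self.multiplexing:
            self.transferred = True

    def transcribe(self, audio):
        first = self.primary(audio)   # primary keeps receiving audio throughout
        second = self.secondary(audio) if self.multiplexing else None
        return second if self.transferred else first

router = TranscriptRouter(lambda a: "ok-ish " + a, lambda a: "better " + a)
before = router.transcribe("hello")
router.on_quality_indication()
router.on_transfer_indication()
after = router.transcribe("hello")
```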
-
Patent number: 10152975
Abstract: A method, device, system, and computer medium for providing interactive advertising are provided. For example, a device may request an advertisement from a remote server, receive the advertisement, receive a response from a user who is listening and/or watching the advertisement, and transmit the response to the server for further action. The user may input a response by speaking. A server may receive an advertisement request from the device, select an advertisement based on one or more pre-defined criteria, transmit the selected advertisement to the device for play, receive from the device a response to the selected advertisement, and then perform an action corresponding to the received response.
Type: Grant
Filed: January 5, 2015
Date of Patent: December 11, 2018
Assignee: XAPPMEDIA, INC.
Inventors: Patrick B. Higbie, John P. Kelvie, Michael M. Myers, Franklin D. Raines
-
Patent number: 10152973
Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that it may be used to update the models and data as it becomes available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
Type: Grant
Filed: November 16, 2015
Date of Patent: December 11, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Bjorn Hoffmeister, Hugh Evan Secker-Walker, Jeffrey Cornelius O'Neill
-
Patent number: 10146765
Abstract: A text prediction engine, a system comprising a text prediction engine, and a method for generating sequence predictions. The text prediction engine, system and method generate multiple sequence predictions based on evidence sources and models, with each sequence prediction having a sequence and associated probability estimate.
Type: Grant
Filed: June 3, 2016
Date of Patent: December 4, 2018
Assignee: Touchtype Ltd.
Inventors: Benjamin Medlock, Douglas Alexander Harper Orr
-
Patent number: 10140976
Abstract: Methods and systems for language processing include training one or more automatic speech recognition models using an automatic speech recognition dictionary. A set of N automatic speech recognition hypotheses for an input is determined, based on the one or more automatic speech recognition models, using a processor. A best hypothesis is selected using a discriminative language model and a list of relevant words. Natural language processing is performed on the best hypothesis.
Type: Grant
Filed: December 14, 2015
Date of Patent: November 27, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Tohru Nagano
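The hypothesis-selection step above can be sketched as N-best rescoring: each hypothesis gets its recognizer score plus a bonus per word that appears in a list of domain-relevant words. The scores, bonus weight, and word list below are invented for illustration; a real system would use a trained discriminative language model rather than a flat per-word bonus.

```python
# Hypothetical sketch: pick the best hypothesis from an N-best list by
# combining the ASR score with a relevant-word bonus.
def rescore(nbest, relevant_words, bonus=0.5):
    """Return the hypothesis maximizing asr_score + bonus * relevant-word hits."""
    def total(item):
        text, asr_score = item
        hits = sum(1 for w in text.split() if w in relevant_words)
        return asr_score + bonus * hits
    return max(nbest, key=total)[0]

# The second hypothesis has a slightly higher ASR score, but the first
# contains two relevant words and wins after rescoring.
nbest = [("recognize speech", 1.0), ("wreck a nice beach", 1.1)]
best = rescore(nbest, {"recognize", "speech"})
```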
-
Patent number: 10127927
Abstract: A method for emotion or speaking style recognition and/or clustering comprises receiving one or more speech samples, generating a set of training data by extracting one or more acoustic features from every frame of the one or more speech samples, and generating a model from the set of training data, wherein the model identifies emotion or speaking style dependent information in the set of training data. The method may further comprise receiving one or more test speech samples, generating a set of test data by extracting one or more acoustic features from every frame of the one or more test speech samples, transforming the set of test data using the model to better represent emotion/speaking style dependent information, and using the transformed data for clustering and/or classification to discover speech with similar emotion or speaking style.
Type: Grant
Filed: June 18, 2015
Date of Patent: November 13, 2018
Assignee: Sony Interactive Entertainment Inc.
Inventors: Ozlem Kalinli-Akbacak, Ruxin Chen
-
Patent number: 10083696
Abstract: A method for determining user liveness is provided that includes calculating, by a computing device, a spectral property difference between voice biometric data captured from a user and user record voice biometric data. The user and the computing device constitute a user-computing device pair, and the voice biometric data is captured by the computing device during a verification transaction. Moreover, the method includes inputting the spectral property difference into a machine learning algorithm, calculating an output score with the machine learning algorithm, and determining the voice biometric data was captured from a live user when the output score satisfies a threshold score.
Type: Grant
Filed: March 7, 2017
Date of Patent: September 25, 2018
Assignee: DAON HOLDINGS LIMITED
Inventor: Raphael Blouet
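The score-versus-threshold decision above can be illustrated with a toy pipeline: compute a spectral-property difference between captured and enrolled voice data, map it to a score, and compare against a threshold. The logistic mapping and the threshold below are assumptions made for the example; the patent feeds the difference to a trained machine learning algorithm, which this sketch does not implement.

```python
# Toy sketch: spectral difference -> score -> liveness decision.
import math

def liveness_score(captured_spectrum, enrolled_spectrum):
    """Map the mean absolute spectral difference to a (0, 1) score."""
    diff = sum(abs(c - e) for c, e in zip(captured_spectrum, enrolled_spectrum))
    diff /= len(captured_spectrum)
    # Larger difference -> lower score (assumed logistic mapping).
    return 1.0 / (1.0 + math.exp(diff - 1.0))

def is_live(captured, enrolled, threshold=0.5):
    """Accept as live when the score satisfies the threshold."""
    return liveness_score(captured, enrolled) >= threshold

# A close spectral match passes; a badly mismatched (e.g. replayed) one fails.
live = is_live([0.2, 0.4, 0.1], [0.25, 0.38, 0.12])
spoof = is_live([0.9, 3.0, 2.5], [0.25, 0.38, 0.12])
```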
-
Patent number: 10079687
Abstract: The embodiments herein provide a method and system for password recovery using fuzzy logic. The system includes a receiving module, a validation module, an authentication module, a display module, a memory module, and a network interface. The system uses a phonetic algorithm such as the Soundex algorithm for enabling the password recovery process. The user credentials received through the receiving module are validated with the validation module at the time of accessing the application. The authentication module is configured to authenticate the user using a fuzzy logic derived from a phonetic algorithm, by matching the answers of the user with the stored answers to compute a score which is compared with a threshold score. The user is enabled to unlock the user device when the computed validation score is greater than the threshold score.
Type: Grant
Filed: April 12, 2016
Date of Patent: September 18, 2018
Assignee: ILANTUS TECHNOLOGIES PVT. LTD.
Inventors: Ashutosh Kumar Mishra, Saurav Sharma, Deepika Kuntar
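The Soundex algorithm named above is well defined and small enough to show in full: it reduces a word to its first letter plus three digits, so answers that sound alike ("Smith" and "Smyth") map to the same code. The implementation below is the classic American Soundex; using it for fuzzy answer matching, as in the abstract, is illustrated in the last line.

```python
# Classic American Soundex: first letter + three digits, with the
# H/W rule (h and w do not break runs of equal digits).
def soundex(word):
    codes = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3",
             "l": "4", "mn": "5", "r": "6"}

    def digit(ch):
        for letters, d in codes.items():
            if ch in letters:
                return d
        return ""  # vowels, h, w, y carry no code

    word = word.lower()
    result = word[0].upper()
    prev = digit(word[0])
    for ch in word[1:]:
        d = digit(ch)
        if d and d != prev:
            result += d
        if ch not in "hw":
            prev = d
    return (result + "000")[:4]

# Fuzzy answer check: phonetically equal answers produce equal codes.
match = soundex("Smith") == soundex("Smyth")
```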
-
Patent number: 9997157
Abstract: Systems and methods are provided for improving language models for speech recognition by personalizing knowledge sources utilized by the language models to specific users or user-population characteristics. A knowledge source, such as a knowledge graph, is personalized for a particular user by mapping entities or user actions from usage history for the user, such as query logs, to the knowledge source. The personalized knowledge source may be used to build a personal language model by training a language model with queries corresponding to entities or entity pairs that appear in usage history. In some embodiments, a personalized knowledge source for a specific user can be extended based on personalized knowledge sources of similar users.
Type: Grant
Filed: May 16, 2014
Date of Patent: June 12, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Murat Akbacak, Dilek Z. Hakkani-Tur, Gokhan Tur, Larry P. Heck, Benoit Dumoulin
-
Patent number: 9984678
Abstract: Various technologies described herein pertain to adapting a speech recognizer to input speech data. A first linear transform can be selected from a first set of linear transforms based on a value of a first variability source corresponding to the input speech data, and a second linear transform can be selected from a second set of linear transforms based on a value of a second variability source corresponding to the input speech data. The linear transforms in the first and second sets can compensate for the first variability source and the second variability source, respectively. Moreover, the first linear transform can be applied to the input speech data to generate intermediate transformed speech data, and the second linear transform can be applied to the intermediate transformed speech data to generate transformed speech data. Further, speech can be recognized based on the transformed speech data to obtain a result.
Type: Grant
Filed: March 23, 2012
Date of Patent: May 29, 2018
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Michael Lewis Seltzer, Alejandro Acero
-
Patent number: 9959862
Abstract: A speech recognition apparatus based on a deep-neural-network (DNN) sound model includes a memory and a processor. As the processor executes a program stored in the memory, the processor generates sound-model state sets corresponding to a plurality of pieces of set training speech data included in multi-set training speech data, generates a multi-set state cluster from the sound-model state sets, and sets the multi-set training speech data as an input node and the multi-set state cluster as output nodes so as to learn a DNN structured parameter.
Type: Grant
Filed: June 20, 2016
Date of Patent: May 1, 2018
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventors: Byung Ok Kang, Jeon Gue Park, Hwa Jeon Song, Yun Keun Lee, Eui Sok Chung
-
Patent number: 9953636
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model, includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
Type: Grant
Filed: October 9, 2015
Date of Patent: April 24, 2018
Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno Mengibar
-
Patent number: 9928831
Abstract: A speech data recognition method, apparatus, and server are for distinguishing regional accent. The speech data recognition method includes: calculating a speech recognition confidence and/or a signal-to-noise ratio of the speech data, and screening a regional speech data from the speech data based on the speech recognition confidence and/or the signal-to-noise ratio of the speech data; and determining a region to which the regional speech data belongs based on a regional attribute of the regional speech data. The regional speech data are automatically recognized from the mass speech data by calculating the speech recognition confidence, the signal-to-noise ratio of the speech data or the combination thereof, thereby avoiding manual labeling of the speech data and enhancing the efficiency of the speech data processing.
Type: Grant
Filed: December 18, 2014
Date of Patent: March 27, 2018
Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Inventors: Dan Su, Zhao Yin
-
Patent number: 9892726
Abstract: Features are disclosed for modifying a statistical model to more accurately discriminate between classes of input data. A subspace of the total model parameter space can be learned such that individual points in the subspace, corresponding to the various classes, are discriminative with respect to the classes. The subspace can be learned using an iterative process whereby an initial subspace is used to generate data and maximize an objective function. The objective function can correspond to maximizing the posterior probability of the correct class for a given input. The initial subspace, data, and objective function can be used to generate a new subspace that better discriminates between classes. The process may be repeated as desired. A model modified using such a subspace can be used to classify input data.
Type: Grant
Filed: December 17, 2014
Date of Patent: February 13, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Sri Venkata Surya Siva Rama Krishna Garimella, Spyridon Matsoukas, Ariya Rastrow, Bjorn Hoffmeister
-
Patent number: 9852732
Abstract: A method of operating a communication system includes generating a transcript of at least a portion of a conversation between a plurality of users. The transcript includes a plurality of subsets of characters. The method further includes displaying the transcript on a plurality of communication devices, identifying an occurrence of at least one selected subset of characters from the plurality of subsets of characters, and querying a definition source for at least one definition for the selected subset of characters. The definition for the selected subset of characters is displayed on the plurality of communication devices.
Type: Grant
Filed: October 7, 2010
Date of Patent: December 26, 2017
Assignee: Avaya Inc.
Inventors: David L. Chavez, Larry J. Hardouin
-
Patent number: 9799350
Abstract: An apparatus and method for verifying an utterance based on multi-event detection information in a natural language speech recognition system. The apparatus includes a noise processor configured to process noise of an input speech signal, a feature extractor configured to extract features of speech data obtained through the noise processing, an event detector configured to detect events of the plurality of speech features occurring in the speech data using the noise-processed data and data of the extracted features, a decoder configured to perform speech recognition using a plurality of preset speech recognition models for the extracted feature data, and an utterance verifier configured to calculate confidence measurement values in units of words and sentences using information on the plurality of events detected by the event detector and a preset utterance verification model and perform utterance verification according to the calculated confidence measurement values.
Type: Grant
Filed: June 17, 2016
Date of Patent: October 24, 2017
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventors: Jeom Ja Kang, Hwa Jeon Song, Jeon Gue Park, Hoon Chung
-
Patent number: 9779722
Abstract: A system and method for adapting a speech recognition and generation system. The system and method include providing a speech recognition and generation engine that processes speech received from a user and providing a dictionary adaptation module that adds out of vocabulary words to a baseline dictionary of the speech recognition and generation system. Words are added by extracting words that are encountered and adding out of vocabulary words to the baseline dictionary of the speech recognition and generation system.
Type: Grant
Filed: November 5, 2013
Date of Patent: October 3, 2017
Assignee: GM Global Technology Operations LLC
Inventors: Ron M. Hecht, Omer Tsimhoni, Timothy J. Grost
-
Patent number: 9767792
Abstract: A system and method for learning alternate pronunciations for speech recognition is disclosed. Alternative name pronunciations may be covered, through pronunciation learning, that have not been previously covered in a general pronunciation dictionary. In an embodiment, the detection of phone-level and syllable-level mispronunciations in words and sentences may be based on acoustic models trained by Hidden Markov Models. Mispronunciations may be detected by comparing the likelihood of the potential state of the targeting pronunciation unit with a pre-determined threshold through a series of tests. It is also within the scope of an embodiment to detect accents.
Type: Grant
Filed: October 12, 2016
Date of Patent: September 19, 2017
Assignee: Interactive Intelligence Group, Inc.
Inventors: Zhenhao Ge, Vivek Tyagi, Aravind Ganapathiraju, Ananth Nagaraja Iyer, Scott Allen Randal, Felix Immanuel Wyss
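The threshold test above can be sketched simply: score each spoken phone under its expected acoustic model and flag phones whose likelihood falls below a preset threshold. The log-likelihood values and the threshold below are invented stand-ins; a real implementation would obtain these scores from HMM-based forced alignment rather than a hand-written list.

```python
# Hypothetical sketch: flag phones whose (log-)likelihood under the
# expected acoustic model falls below a preset threshold.
def flag_mispronunciations(phone_loglikes, threshold=-5.0):
    """Return the phones scored below the threshold (likely mispronounced)."""
    return [phone for phone, ll in phone_loglikes if ll < threshold]

# Toy scores for the word "good": the final phone scores poorly.
utterance = [("g", -2.1), ("uh", -1.4), ("d", -7.8)]
flagged = flag_mispronunciations(utterance)
```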
-
Patent number: 9754586
Abstract: The present invention concerns methods and apparatus for identifying and assigning meaning to words not recognized by a vocabulary or grammar of a speech recognition system. In an embodiment of the invention, the word may be in an acoustic vocabulary of the speech recognition system, but may be unrecognized by an embedded grammar of a language model of the speech recognition system. In another embodiment of the invention, the word may not be recognized by any vocabulary associated with the speech recognition system. In embodiments of the invention, at least one hypothesis is generated for an utterance not recognized by the speech recognition system. If the at least one hypothesis meets at least one predetermined criterion, a word or more corresponding to the at least one hypothesis is added to the vocabulary of the speech recognition system.
Type: Grant
Filed: June 5, 2008
Date of Patent: September 5, 2017
Assignee: Nuance Communications, Inc.
Inventors: Sabine Deligne, Ramesh A. Gopinath, Dimitri Kanevsky, Mahesh Viswanathan
-
Patent number: 9721560
Abstract: A low power sound recognition sensor is configured to receive an analog signal that may contain a signature sound. Sound parameter information is extracted from the analog signal and compared to a sound parameter reference stored locally with the sound recognition sensor to detect when the signature sound is received in the analog signal. A trigger signal is generated when a signature sound is detected. A portion of the extracted sound parameter information is sent to a remote training location for adaptive training when a signature sound detection error occurs. An updated sound parameter reference from the remote training location is received in response to the adaptive training.
Type: Grant
Filed: November 2, 2015
Date of Patent: August 1, 2017
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Lin Sun, Wei Ma
-
Patent number: 9684648
Abstract: Determining a subject type for an entity in a text segment. A text segment is selected, which includes one or more single-word or multi-word entities. Natural language processing is performed on the selected text segment to identify entities that constitute subjects of the selected text segment. One entity is selected. A variant annotation is associated with the selected entity. The variant annotation reflects multiple subject types for the selected entity and a value for each subject type. The most probable subject type is determined for the selected entity, based on a combination of natural language processing rules and dictionary listings. The value of the annotation is incremented for the subject type corresponding to the most probable subject type for the selected entity, so that the highest value of the annotation indicates the most probable subject type for the selected entity within the selected text segment.
Type: Grant
Filed: May 31, 2012
Date of Patent: June 20, 2017
Assignee: International Business Machines Corporation
Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker
-
Patent number: 9666181
Abstract: A tuning system for tuning a speech recognition system includes a transmitter for sending a user response to a speech recognition system. The user response is based at least in part on a test stimulus that may be generated by the control system. A receiver receives a recognized response from the speech recognition system; this recognized response is based at least in part on the associated user response. An adjustment module adjusts at least one parameter of the speech recognition system based at least in part on at least one of the test stimulus, the associated user response, and the recognized response.
Type: Grant
Filed: February 7, 2013
Date of Patent: May 30, 2017
Assignees: University of Florida Research Foundation, Inc., Cochlear Limited
Inventors: Lee S. Krause, Bonny Banerjee, Mark Skowronski, Rahul Shrivastav, Alice E. Holmes
-
Patent number: 9659559
Abstract: Phonetic distances are empirically measured as a function of speech recognition engine recognition error rates. The error rates are determined by comparing a recognized speech file with a reference file. The phonetic distances can be normalized to earlier measurements. The phonetic distances/error rates can also be used to improve speech recognition engine grammar selection, as an aid in language training and evaluation, and in other applications.
Type: Grant
Filed: June 25, 2009
Date of Patent: May 23, 2017
Assignee: ADACEL SYSTEMS, INC.
Inventor: Chang-Qing Shu
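The empirical measurement above can be sketched by aligning a recognized transcript against its reference and turning per-phone substitution rates into a phonetic "distance" (here simply the substitution rate itself). The position-by-position comparison below is a deliberate simplification; a real measurement would use edit-distance alignment to handle insertions and deletions.

```python
# Toy sketch: derive per-phone substitution rates by comparing a
# recognized phone sequence with a reference phone sequence.
from collections import Counter

def phonetic_distances(reference, recognized):
    """Substitution rate for each reference phone, from paired transcripts."""
    subs = Counter()
    seen = Counter()
    for ref, hyp in zip(reference, recognized):
        seen[ref] += 1
        if ref != hyp:
            subs[ref] += 1
    return {phone: subs[phone] / seen[phone] for phone in seen}

# "bat bad" recognized as "pat bat": /b/ confused once, /d/ always.
ref = ["b", "a", "t", "b", "a", "d"]
hyp = ["p", "a", "t", "b", "a", "t"]
dist = phonetic_distances(ref, hyp)
```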
-
Patent number: 9633001
Abstract: Content is received and compared against rules for identifying a type of content. Each rule has both segmented and unsegmented patterns. The content is matched against the patterns and assigned a confidence score that is higher if the content matches a segmented pattern and lower if the content matches an unsegmented pattern.
Type: Grant
Filed: June 23, 2015
Date of Patent: April 25, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Mauktik Gandhi, Charles Lamanna, Vidyaraman Sankaranarayanan, Raimundo Pontes Filho
-
Patent number: 9626971
Abstract: Method for text-dependent Speaker Recognition using a speaker adapted Universal Background Model, wherein the speaker adapted Universal Background Model is a speaker adapted Hidden Markov Model comprising channel correction.
Type: Grant
Filed: September 20, 2013
Date of Patent: April 18, 2017
Assignee: Cirrus Logic International Semiconductor Ltd.
Inventors: Luis Buera Rodriguez, Carlos Vaquero Aviles-Casco, Marta Garcia Gomar, Antonio Miguel Artiaga
-
Patent number: 9538114
Abstract: A system includes a voice converter converting a first voice command into a first electrical command and a command library having library contents. A language responsiveness module (LRM) stores the first electrical command in a temporary set when a first control command cannot be determined from the library contents. A voice prompt module receives a second voice command when the first control command cannot be determined from the library contents. The voice converter converts a second voice command into a second electrical command corresponding to the second voice command. The LRM compares the second electrical command to the command library. The LRM determines a second control command corresponding to the second electrical command in response to comparing the second voice command to the command library and stores the first voice command in the command library after determining the control command corresponding to the second voice command.
Type: Grant
Filed: March 15, 2013
Date of Patent: January 3, 2017
Assignee: The DIRECTV Group, Inc.
Inventors: John M. Han, Vlad Zarney, Vladislav Kolegayev, Becky P. Chen
-
Patent number: 9524718
Abstract: The speech recognition result through the general-purpose server and that through the specialized speech recognition server are integrated in an optimum manner, thereby providing a speech recognition function with the fewest errors. The specialized speech recognition server 108 is constructed with the words contained in the user dictionary data in use, and the performance of the general-purpose speech recognition server 106 is preliminarily evaluated with such user dictionary data. Based on such evaluation result, information related to which recognition results through the specialized and general-purpose speech recognition servers are adopted, and to how the adopted recognition results are weighted to obtain an optimum recognition result, is preliminarily retained in the form of a database.
Type: Grant
Filed: April 3, 2013
Date of Patent: December 20, 2016
Assignee: Clarion Co., Ltd.
Inventors: Yasunari Obuchi, Takeshi Homma
-
Patent number: 9489940
Abstract: The technology of the present application provides a method and apparatus for dynamically updating a language model across a large number of similarly situated users. The system identifies individual changes to user profiles and evaluates each change for broader application, such as a dialect correction for a speech recognition engine. An administrator for the system identifies similarly situated user profiles and downloads the profile change to effect a dynamic change to the language model of similarly situated users.
Type: Grant
Filed: June 11, 2012
Date of Patent: November 8, 2016
Assignee: NVOQ INCORPORATED
Inventor: Charles Corfield
-
Patent number: 9478216
Abstract: A method for speech recognition is implemented in the specific form of computer processes that function in a computer processor. That is, one or more computer processes: process a speech input to produce a sequence of representative speech vectors and perform multiple recognition passes to determine a recognition output corresponding to the speech input. At least one generic recognition pass is based on a generic speech recognition arrangement using generic modeling of a broad general class of input speech. And at least one adapted recognition pass is based on a speech adapted arrangement using pre-adapted modeling of a specific sub-class of the general class of input speech.
Type: Grant
Filed: December 8, 2009
Date of Patent: October 25, 2016
Assignee: Nuance Communications, Inc.
Inventors: Daniel Willett, Lambert Mathias, Chuang He, Jianxiong Wu
-
Patent number: 9472187
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer.
Type: Grant
Filed: May 25, 2016
Date of Patent: October 18, 2016
Assignee: Google Inc.
Inventors: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
-
Patent number: 9406299
Abstract: A computer-implemented method is described for speaker adaptation in automatic speech recognition. Speech recognition data from a particular speaker is used for adaptation of an initial speech recognition acoustic model to produce a speaker adapted acoustic model. A speaker dependent differential acoustic model is determined that represents differences between the initial speech recognition acoustic model and the speaker adapted acoustic model. In addition, an approach is also disclosed to estimate speaker-specific feature or model transforms over multiple sessions. This is achieved by updating the previously estimated transform using only adaptation statistics of the current session.
Type: Grant
Filed: March 8, 2012
Date of Patent: August 2, 2016
Assignee: Nuance Communications, Inc.
Inventors: Christian Gollan, Daniel Willett
-
Patent number: 9378735
Abstract: Features are disclosed for estimating affine transforms in Log Filter-Bank Energy Space ("LFBE" space) in order to adapt artificial neural network-based acoustic models to a new speaker or environment. Neural network-based acoustic models may be trained using concatenated LFBEs as input features. The affine transform may be estimated by minimizing the least squares error between corresponding linear and bias transform parts for the resultant neural network feature vector and some standard speaker-specific feature vector obtained for a GMM-based acoustic model using constrained Maximum Likelihood Linear Regression ("cMLLR") techniques. Alternatively, the affine transform may be estimated by minimizing the least squares error between the resultant transformed neural network feature and some standard speaker-specific feature obtained for a GMM-based acoustic model.
Type: Grant
Filed: December 19, 2013
Date of Patent: June 28, 2016
Assignee: Amazon Technologies, Inc.
Inventors: Sri Venkata Surya Siva Rama Krishna Garimella, Bjorn Hoffmeister, Nikko Strom
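The least-squares criterion at the heart of this abstract can be shown with a toy fit. Real cMLLR-style transforms use a full matrix plus bias over the whole feature vector; the sketch below fits a scalar affine transform y ≈ a·x + b in closed form, purely to illustrate the least-squares idea, with made-up feature values.

```python
# Toy sketch: closed-form least-squares fit of an affine transform
# y = a*x + b that maps one feature stream onto a target speaker-specific
# feature stream (per-dimension, i.e. diagonal linear part plus bias).

def fit_affine(xs, ys):
    """Least-squares fit of y = a*x + b for paired scalar features."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

# Synthetic log filter-bank energies: targets are exactly 2*x + 1,
# so the fit recovers a = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_affine(xs, ys)
print(a, b)  # 2.0 1.0
```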
-
Patent number: 9330663
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data; determining that an initial portion of the audio data corresponds to an initial portion of a hotword; in response to determining that the initial portion of the audio data corresponds to the initial portion of the hotword, selecting, from among a set of one or more actions that are performed when the entire hotword is detected, a subset of the one or more actions; and causing one or more actions of the subset to be performed.
Type: Grant
Filed: March 20, 2014
Date of Patent: May 3, 2016
Assignee: Google Inc.
Inventor: Matthew Sharifi
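The partial-hotword idea can be sketched in a few lines. The hotword, action names, and the prefix-matching rule below are illustrative assumptions, not the patented detection method (which operates on audio, not text).

```python
# Toy sketch: when the start of the input matches the start of the hotword,
# perform a subset of the actions normally triggered by the full hotword
# (e.g. waking the audio pipeline early), committing the rest only on a
# full match.

HOTWORD = "ok google"
FULL_ACTIONS = ["wake_audio_pipeline", "open_mic_session", "launch_assistant"]
# Actions safe to run speculatively on a partial match.
EARLY_SUBSET = ["wake_audio_pipeline"]

def actions_for(audio_prefix):
    if audio_prefix == HOTWORD:
        return FULL_ACTIONS
    if HOTWORD.startswith(audio_prefix):
        return EARLY_SUBSET  # partial match: prepare, but don't commit
    return []

print(actions_for("ok go"))      # ['wake_audio_pipeline']
print(actions_for("ok google"))  # all three actions
print(actions_for("hello"))      # []
```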
-
Patent number: 9305554
Abstract: A method and device for recognizing an utterance. The method includes transmitting context data associated with a first device to a second device. A first speech recognition model is received from the second device. The first speech recognition model is a subset of a second speech recognition model present at the second device. The first speech recognition model is based on the context data. It is determined whether the utterance can be recognized at the first device based on the first speech recognition model. If the utterance cannot be recognized at the first device, then at least a portion of the utterance is sent to the second device. If the utterance can be recognized at the first device, then an action associated with the recognized utterance is performed.
Type: Grant
Filed: July 16, 2014
Date of Patent: April 5, 2016
Assignee: Samsung Electronics Co., Ltd.
Inventors: Arun Jagatheesan, Juhan Lee, Jong Hoon Ahnn
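The two-device flow reads as a local-first lookup with a server fallback, sketched below. The local grammar contents and the server's behavior are illustrative assumptions; real models operate on audio, not text.

```python
# Sketch: the first device holds a context-dependent subset model received
# from the second device, tries local recognition first, and forwards the
# utterance only when the subset model cannot recognize it.

# Subset model built from context data (say, the user is in a "music" context).
LOCAL_MODEL = {"play", "pause", "next track"}

def recognize(utterance, send_to_server):
    """Try the on-device subset model first; fall back to the second device."""
    if utterance in LOCAL_MODEL:
        return ("local", utterance)
    return ("server", send_to_server(utterance))

# Stand-in for the second device's full model, which can recognize anything here.
full_model_lookup = lambda u: u.upper()

print(recognize("pause", full_model_lookup))     # handled locally
print(recognize("call mom", full_model_lookup))  # sent to the second device
```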
-
Patent number: 9208780
Abstract: The processing efficiency and estimation accuracy of a voice activity detection apparatus are improved. An acoustic signal analyzer receives a digital acoustic signal containing a speech signal and a noise signal, generates a non-speech GMM and a speech GMM adapted to a noise environment, by using a silence GMM and a clean-speech GMM in each frame of the digital acoustic signal, and calculates the output probabilities of dominant Gaussian distributions of the GMMs. A speech state probability to non-speech state probability ratio calculator calculates a speech state probability to non-speech state probability ratio based on a state transition model of a speech state and a non-speech state, by using the output probabilities; and a voice activity detection unit judges, from the speech state probability to non-speech state probability ratio, whether the acoustic signal in the frame is in the speech state or in the non-speech state and outputs only the acoustic signal in the speech state.
Type: Grant
Filed: July 15, 2010
Date of Patent: December 8, 2015
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Masakiyo Fujimoto, Tomohiro Nakatani
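The frame-wise decision this abstract describes can be sketched as a two-state model whose speech-to-non-speech probability ratio labels each frame. The transition probabilities and per-frame model likelihoods below are made up for illustration; the actual GMM adaptation and probability computation are far richer.

```python
# Minimal sketch: per-frame output log-probabilities from speech and
# non-speech models are combined with a two-state transition model, and the
# speech-to-non-speech probability ratio decides each frame's label.

import math

# Transition log-probabilities between speech (S) and non-speech (N) states.
LOG_TRANS = {("S", "S"): math.log(0.9), ("S", "N"): math.log(0.1),
             ("N", "N"): math.log(0.9), ("N", "S"): math.log(0.1)}

def detect(frames):
    """frames: list of (log p(frame|speech GMM), log p(frame|non-speech GMM))."""
    log_s = log_n = math.log(0.5)  # uniform initial state probabilities
    labels = []
    for ls, ln in frames:
        new_s = ls + max(log_s + LOG_TRANS[("S", "S")], log_n + LOG_TRANS[("N", "S")])
        new_n = ln + max(log_s + LOG_TRANS[("S", "N")], log_n + LOG_TRANS[("N", "N")])
        log_s, log_n = new_s, new_n
        # Ratio > 1 (log ratio > 0) means the frame is judged to be speech.
        labels.append("speech" if log_s - log_n > 0 else "non-speech")
    return labels

# Three noise-like frames followed by three speech-like frames.
frames = [(-8.0, -2.0), (-7.5, -2.5), (-8.2, -2.1),
          (-2.0, -8.0), (-1.5, -7.0), (-2.2, -8.5)]
print(detect(frames))  # 3x non-speech, then 3x speech
```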
-
Patent number: 9171547
Abstract: Included are embodiments for multi-pass analytics. At least one embodiment of a method includes receiving data associated with a communication, performing first tier analytics on the received data, and performing second tier analytics on the received data, where the second tier analytics determines different characteristics of the received data than the first tier analytics.
Type: Grant
Filed: October 12, 2011
Date of Patent: October 27, 2015
Assignee: VERINT AMERICAS INC.
Inventor: Joseph Watson
-
Patent number: 9165028
Abstract: Methods and apparatus related to ranking modifications of a previous query. For example, modifications of a previous query may be generated based on a current query issued subsequent to the previous query by substituting one or more n-grams of the previous query with one or more n-grams of the current query. One or more measures of each of the modifications may be identified and, based on such measures, a ranking of each of the modifications may be determined. One of the modifications may be selected as a submission query based on the rankings of the modifications. The submission query may be selected for submission in lieu of, or in addition to, the current query.
Type: Grant
Filed: July 24, 2013
Date of Patent: October 20, 2015
Assignee: GOOGLE INC.
Inventors: Bruce Christensen, Pravir Kumar Gupta, Jan Kuipers
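The generate-then-rank step can be sketched with unigram substitution. The scoring measure below (a hypothetical popularity table) is an assumption; the patent describes using one or more unspecified measures of each modification.

```python
# Illustrative sketch: candidate rewrites of the previous query are generated
# by substituting each of its unigrams with unigrams from the current query,
# then ranked by a toy scoring measure.

from itertools import product

def generate_modifications(previous, current):
    """Substitute each term of the previous query with each term of the current one."""
    prev_terms = previous.split()
    cur_terms = current.split()
    mods = set()
    for i, new_term in product(range(len(prev_terms)), cur_terms):
        candidate = prev_terms[:]
        candidate[i] = new_term
        mods.add(" ".join(candidate))
    return mods

def rank(mods, popularity):
    """Rank candidates by a made-up popularity table (higher = better)."""
    return sorted(mods, key=lambda m: popularity.get(m, 0), reverse=True)

mods = generate_modifications("weather in paris", "what about london")
popularity = {"weather in london": 42}  # hypothetical query-log counts
best = rank(mods, popularity)[0]
print(best)  # weather in london
```

A follow-up query like "what about london" thus rewrites the previous query rather than being searched verbatim.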
-
Patent number: 9153230
Abstract: A method for executing a mobile speech recognition software application based on a multi-layer neural network model includes providing one or more frames of an audio signal to a hardware accelerator in the mobile device for classification. The hardware accelerator includes a multiplier-accumulator (MAC) unit to perform matrix multiplication operations involved in computing the neural network output.
Type: Grant
Filed: October 23, 2012
Date of Patent: October 6, 2015
Assignee: Google Inc.
Inventor: Juha-Pekka Maaninen
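The multiply-accumulate operation the abstract attributes to the hardware unit can be shown in software. This is a sketch only: the weights, bias, and activation are made-up values, and a real MAC unit performs these loops in fixed-function hardware.

```python
# Sketch: a dense neural-network layer's output is a matrix-vector product
# computed as repeated multiply-accumulates, followed by a bias add (folded
# into the accumulator's initial value here) and a nonlinearity.

import math

def mac_layer(weights, bias, inputs):
    """Compute one dense layer output via explicit multiply-accumulate loops."""
    outputs = []
    for row, b in zip(weights, bias):
        acc = b
        for w, x in zip(row, inputs):
            acc += w * x  # the MAC step: multiply, then accumulate
        outputs.append(1.0 / (1.0 + math.exp(-acc)))  # sigmoid activation
    return outputs

weights = [[0.5, -0.2], [0.1, 0.4]]
bias = [0.0, 0.1]
frame = [1.0, 2.0]  # e.g. two features of one audio frame
out = mac_layer(weights, bias, frame)
print(out)
```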
-
Patent number: 9099092
Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
Type: Grant
Filed: January 10, 2014
Date of Patent: August 4, 2015
Assignee: Nuance Communications, Inc.
Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
-
Patent number: 9082403
Abstract: The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels and updates a classification model associated with the classifier using the pseudo-semantic label.
Type: Grant
Filed: December 15, 2011
Date of Patent: July 14, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yun-Cheng Ju, James Garnet Droppo, III
-
Publication number: 20150142436
Abstract: The present invention allows feedback from operator workstations to be used to update databases used for providing automated information services. When an automated process fails, recorded speech of the caller is passed on to the operator for decision making. Based on the selections made by the operator in light of the speech or other interactions with the caller, a comparison is made between the speech and the selections made by the operator to arrive at information to update the databases in the information services automation system. Thus, when the operator inputs the words corresponding to the speech provided at the information services automation system, the speech may be associated with those words. The association between the speech and the words may be used to update different databases in the information services automation system.
Type: Application
Filed: January 30, 2015
Publication date: May 21, 2015
Inventors: Bruce Bokish, Michael Craig Presnell