Specialized Models Patents (Class 704/255)
  • Patent number: 10304443
    Abstract: An electronic device and a voice recognition method thereof are provided. A voice recognition method of an electronic device includes receiving a trigger voice; storing a characteristic of the trigger voice for voice verification by analyzing the trigger voice; when a user issues a voice command, determining whether the user voice command comprises the characteristic belonging to the trigger voice; and performing a function of the electronic device corresponding to the user voice command based on the determination. Accordingly, users may easily use the voice recognition function of the electronic device.
    Type: Grant
    Filed: October 7, 2014
    Date of Patent: May 28, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Nam-yeong Kwon, Hee-seob Ryu, Kyung-mi Park, Chan-sik Bok, Chan-hee Choi
  • Patent number: 10297247
    Abstract: Various implementations disclosed herein include a phonotactic post-processor configured to rescore the N-best phoneme candidates output by a primary ensemble phoneme neural network using a priori phonotactic information. In various implementations, one of the scored set of the N-best phoneme candidates is selected as a preferred estimate for a one-phoneme output decision by the phonotactic post-processor. In some implementations, the one-phoneme output decision is an estimate of the most likely detected and recognized phoneme in a frame based on a function of posterior probabilities generated by an ensemble phoneme neural network, as well as phonotactic information and statistical performance characterizations incorporated by the phonotactic post-processor.
    Type: Grant
    Filed: August 28, 2016
    Date of Patent: May 21, 2019
    Assignee: MALASPINA LABS (BARBADOS), INC.
    Inventor: Robert Alex Fuhrman
  • Patent number: 10296160
    Abstract: Systems and processes are disclosed for virtual assistant request recognition using live usage data and data relating to future events. User requests that are received but not recognized can be used to generate candidate request templates. A count can be associated with each candidate request template and can be incremented each time a matching candidate request template is received. When a count reaches a threshold level, the corresponding candidate request template can be used to train a virtual assistant to recognize and respond to similar user requests in the future. In addition, data relating to future events can be mined to extract relevant information that can be used to populate both recognized user request templates and candidate user request templates. Populated user request templates (e.g., whole expected utterances) can then be used to recognize user requests and disambiguate user intent as future events become relevant.
    Type: Grant
    Filed: December 6, 2013
    Date of Patent: May 21, 2019
    Assignee: APPLE INC.
    Inventors: Rushin N. Shah, Devang K. Naik
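The counting step in the abstract above (unrecognized requests become candidate templates, and a template is used for training once its count reaches a threshold) can be sketched roughly as follows. This is only an illustration: `to_template`, `CandidateTemplates`, and the threshold value are hypothetical names and choices, not taken from the patent.

```python
from collections import Counter

# Hypothetical value; the abstract only says "a threshold level".
PROMOTION_THRESHOLD = 3


def to_template(request: str) -> str:
    """Hypothetical normalizer: lowercase the request and replace numbers with a slot."""
    words = ["<num>" if w.isdigit() else w for w in request.lower().split()]
    return " ".join(words)


class CandidateTemplates:
    """Counts templates derived from unrecognized requests and promotes frequent ones."""

    def __init__(self, threshold: int = PROMOTION_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()
        self.promoted = set()

    def observe_unrecognized(self, request: str) -> bool:
        """Count the request's template; return True if it just reached the threshold."""
        template = to_template(request)
        if template in self.promoted:
            return False  # already handed over for training
        self.counts[template] += 1
        if self.counts[template] >= self.threshold:
            self.promoted.add(template)
            return True
        return False
```

A caller would pass each unrecognized utterance to `observe_unrecognized` and start training on a template the first time the method returns `True`.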
  • Patent number: 10276188
    Abstract: Systems and methods are provided for analyzing voice-based audio inputs. A voice-based audio input associated with a user (e.g., wherein the voice-based audio input is a prompt or a command) is received and measures of one or more features are extracted. One or more parameters are calculated based on the measures of the one or more features. The occurrence of one or more mistriggers is identified by inputting the one or more parameters into a predictive model. Further, systems and methods are provided for identifying human mental health states using mobile device data. Mobile device data (including sensor data) associated with a mobile device corresponding to a user is received. Measurements are derived from the mobile device data and input into a predictive model. The predictive model is executed and outputs probability values of one or more symptoms associated with the user.
    Type: Grant
    Filed: September 13, 2016
    Date of Patent: April 30, 2019
    Assignee: Cogito Corporation
    Inventors: Joshua Feast, Ali Azarbayejani, Skyler Place
  • Patent number: 10224036
    Abstract: In at least one exemplary embodiment for automated document identification and language dictation recognition systems, the system comprises a database capable of receiving a plurality of verbal records, each verbal record comprising at least one identifier and at least one verbal feature, and a processor operably coupled to the database, where the processor has and executes a software program. The processor is operational to identify a subset of the plurality of verbal records from the database, extract at least one verbal feature from the identified records, analyze the at least one verbal feature of the subset of the plurality of verbal records, process the subset of the plurality of records using the analyzed feature according to at least one reasoning approach, generate a processed verbal record using the processed subset of the plurality of records, and deliver the processed verbal record to a recipient.
    Type: Grant
    Filed: June 15, 2017
    Date of Patent: March 5, 2019
    Assignee: InfraWare, Inc.
    Inventors: Nathan Lindle, Nick Mahurin
  • Patent number: 10198509
    Abstract: A complex video event classification, search and retrieval system can generate a semantic representation of a video or of segments within the video, based on one or more complex events that are depicted in the video, without the need for manual tagging. The system can use the semantic representations to, among other things, provide enhanced video search and retrieval capabilities.
    Type: Grant
    Filed: January 25, 2016
    Date of Patent: February 5, 2019
    Assignee: SRI International
    Inventors: Hui Cheng, Harpreet Singh Sawhney, Ajay Divakaran, Qian Yu, Jingen Liu, Amir Tamrakar, Saad Ali, Omar Javed
  • Patent number: 10186262
    Abstract: A speech recognition system interprets both spoken system commands and application commands. Users may speak commands to an open microphone of a computing device that may be interpreted by at least two speech recognizers operating simultaneously. The first speech recognizer interprets operating system commands and the second speech recognizer interprets application commands. The system commands may include at least opening and closing an application and the application commands may include at least a game command or navigation within a menu. A reserve word may be used to identify whether the command is for the operating system or application. A user's cadence may also indicate whether the speech is a global command or application command. A speech recognizer may include a natural language software component located in a remote computing device, such as in the so-called cloud.
    Type: Grant
    Filed: July 31, 2013
    Date of Patent: January 22, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Christian Klein, Dennis Mooney, II, Thomas Soemo, Leo Soong, Joseph Wheeler
  • Patent number: 10157608
    Abstract: According to an embodiment, a voice processing device includes an interface system, a determining processor, and a predicting processor. The interface system is configured to receive neutral voice data representing audio in a neutral voice of a user. The determining processor is configured to determine a predictive parameter based at least in part on the neutral voice data. The predicting processor is configured to predict a voice conversion model for converting the neutral voice of the speaker to a target voice using at least the predictive parameter.
    Type: Grant
    Filed: February 15, 2017
    Date of Patent: December 18, 2018
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yamato Ohtani, Yu Nasu, Masatsune Tamura, Masahiro Morita
  • Patent number: 10140982
    Abstract: A method for using speech disfluencies detected in speech input to assist in interpreting the input is provided. The method includes providing access to a set of content items with metadata describing the content items, and receiving a speech input intended to identify a desired content item. The method further includes detecting a speech disfluency in the speech input and determining a measure of confidence of a user in a portion of the speech input following the speech disfluency. If the confidence measure is lower than a threshold value, the method includes determining an alternative query input based on replacing the portion of the speech input following the speech disfluency with another word or phrase. The method further includes selecting content items based on comparing the speech input, the alternative query input (when the confidence measure is low), and the metadata associated with the content items.
    Type: Grant
    Filed: August 31, 2017
    Date of Patent: November 27, 2018
    Assignee: VEVEO, INC.
    Inventors: Murali Aravamudan, Daren Gill, Sashikumar Venkataraman, Vineet Agarwal, Ganesh Ramamoorthy
  • Patent number: 10132519
    Abstract: An air conditioning system includes a collector that obtains user's voice data; a voice recognizer that generates character-string data on the basis of the voice data; a memory that stores therein a search table having a standard keyword that indicates the type of sensitivity to the environment inside a room and derivative keywords that indicate a same type of sensitivity as the standard keyword, while indicating different sensitivity levels; an analyzer that determines the type and level of user's sensitivity on the basis of the character-string data and the search table; and a controller that outputs a control signal to an adjustment device in an indoor device, the adjustment device adjusting the environment inside a room on the basis of the type and level of sensitivity, which are determined by the analyzer.
    Type: Grant
    Filed: April 3, 2015
    Date of Patent: November 20, 2018
    Assignee: Mitsubishi Electric Corporation
    Inventors: Toshiki Iuchi, Kazuyuki Katayama
  • Patent number: 10102860
    Abstract: In at least one exemplary embodiment for common phrase identification and language dictation recognition systems and methods for using the same, the system comprises a database capable of receiving a plurality of verbal records, each verbal record comprising at least one identifier and at least one verbal feature, and a processor operably coupled to the database, where the processor has and executes a software program. The processor is operational to identify a subset of the plurality of verbal records from the database, extract at least one verbal feature from the identified records, analyze the at least one verbal feature of the subset of the plurality of verbal records, process the subset of the plurality of records using the analyzed feature according to at least one reasoning approach, generate a processed verbal record using the processed subset of the plurality of records, and deliver the processed verbal record to a recipient.
    Type: Grant
    Filed: June 15, 2017
    Date of Patent: October 16, 2018
    Assignee: InfraWare, Inc.
    Inventors: Nathan Lindle, Nick Mahurin
  • Patent number: 10102200
    Abstract: A mechanism is provided for improving predicate parses (or logical representations of a passage) using semantic knowledge. In response to encountering an ambiguous decision point during a syntactic analysis of a portion of natural language content, a candidate meaning of the ambiguous decision point is generated. Characteristics of the ambiguous decision point are evaluated based on a semantic knowledge base to determine a semantic meaning associated with the ambiguous decision point. A determination is made as to whether the semantic meaning supports or refutes the candidate meaning. In response to determining that the semantic meaning refutes the candidate meaning, the candidate meaning of the ambiguous decision point is overridden based on the semantic meaning to include the semantic meaning as a final meaning for the ambiguous decision point. The portion of natural language content is then processed based on the final meaning for the ambiguous decision point.
    Type: Grant
    Filed: August 25, 2016
    Date of Patent: October 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Brendan C. Bull, David Contreras, Paul L. Felt
  • Patent number: 10095243
    Abstract: A method for controlling a drone includes receiving a natural language request for information about a spatial location, parsing the natural language request into data requests, configuring a flight plan and controlling one or more drones to fly over the spatial location to obtain data types based on the data requests, and extracting and analyzing data to answer the request. The method can include extracting data points from the data types, obtaining labels from a user for one or more of the data points, predicting labels for unlabeled data points from a learning algorithm using the labels obtained from the user, determining the predicted labels are true labels for the unlabeled data points and combining the extracted data, the user labeled data points and the true labeled data points to answer the request for information. The learning algorithm may be active learning using a support vector machine.
    Type: Grant
    Filed: August 9, 2016
    Date of Patent: October 9, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gregory F. Boland, James R. Kozloski, Yu Ma, Justin G. Manweiler, Kevin E. Siemonsen, Umut Topkara, Katherine Vogt, Justin D. Weisz
  • Patent number: 10068565
    Abstract: An exemplary computer system configured to train an ASR using the output from a TTS engine.
    Type: Grant
    Filed: December 8, 2014
    Date of Patent: September 4, 2018
    Inventor: Fathy Yassa
  • Patent number: 10026398
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting follow-up queries to an initial transcription of an utterance. In some implementations, one or more follow-up queries that are pre-associated with a transcription of an initial utterance of a user are identified. A new or modified language model in which a respective probability associated with one or more of the follow-up queries is increased with respect to an initial language model is obtained. Subsequent audio data corresponding to a subsequent utterance of the user is then received. The subsequent audio data is processed using the new or modified language model to generate a transcription of the subsequent utterance. The transcription of the subsequent utterance is then provided for output to the user.
    Type: Grant
    Filed: July 8, 2016
    Date of Patent: July 17, 2018
    Assignee: Google LLC
    Inventors: Behshad Behzadi, Dmitry Osmakov, Martin Baeuml, Gleb Skobeltsyn
  • Patent number: 10008200
    Abstract: According to an embodiment, a decoder searches a finite state transducer and outputs an output symbol string corresponding to a signal that is input or corresponding to a feature sequence of the signal that is input. The decoder includes a token operating unit and a duplication eliminator. The token operating unit is configured to, every time the signal or the feature is input, propagate each of a plurality of tokens, which is assigned with a state of the head of a path being searched, according to the finite state transducer. The duplication eliminator is configured to eliminate duplication of two or more tokens which have the same state assigned thereto and for which respective previously-passed transitions are assigned with the same input symbol.
    Type: Grant
    Filed: December 18, 2014
    Date of Patent: June 26, 2018
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Manabu Nagao
  • Patent number: 9990583
    Abstract: Methods, systems and computer program products are disclosed for detecting patterns in a data stream that match multi-pattern rules. One embodiment of the invention provides a method of recognizing a specified group of patterns in a data stream. The method comprises identifying a rule for said specified group of patterns in the data stream, and using a first array of finite state machines to scan the data stream for at least some of the patterns in the specified group. For patterns in the specified group that are found in the data stream by the first array of finite state machines, pattern identifiers are sent to a second array of finite state machines. The second array of finite state machines determines if the specified group of patterns is in the data stream in accordance with the identified rule by, at least in part, using said pattern identifiers.
    Type: Grant
    Filed: December 21, 2015
    Date of Patent: June 5, 2018
    Assignee: International Business Machines Corporation
    Inventor: Jan van Lunteren
  • Patent number: 9934452
    Abstract: Systems and techniques are provided for pruning a node from a possible nodes list for a Hidden Markov Model with label transition node pruning. The node may be a label transition node. A frame may be at a predicted segmentation point in decoding input with the Hidden Markov Model. The node may be scored at the frame. The node may be pruned from the possible nodes list for the frame when the score for the node is greater than the sum of a best score among nodes on the possible nodes list for the frame and a beam threshold minus a penalty term. A possible nodes list may be generated for a subsequent frame using label selection. A second node may be pruned from the possible nodes list for the subsequent frame with early pruning.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: April 3, 2018
    Assignee: Google LLC
    Inventor: Yasuhisa Fujii
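The pruning condition stated in the abstract above (prune when the node score exceeds the best score plus the beam threshold minus a penalty term) can be illustrated with a small sketch. It assumes that scores are costs, so lower is better, which is how the inequality reads; the function name and the dictionary representation of the possible nodes list are hypothetical.

```python
def prune_label_transition_nodes(scores, beam_threshold, penalty):
    """Return the nodes that survive pruning at one frame.

    scores: dict mapping node -> cost at this frame (assumption: lower is better).
    A node is pruned when score > best + beam_threshold - penalty.
    """
    if not scores:
        return {}
    best = min(scores.values())
    cutoff = best + beam_threshold - penalty
    # Keep only nodes whose score does not exceed the tightened cutoff.
    return {node: s for node, s in scores.items() if s <= cutoff}
```

Subtracting the penalty tightens the beam at predicted segmentation points, so fewer label transition nodes survive there than under an ordinary beam cutoff.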
  • Patent number: 9916299
    Abstract: Technology is disclosed that improves language coverage by selecting sentences to be used as training data for a language processing engine. The technology accomplishes the selection of a number of sentences by obtaining a group of sentences, computing a score for each sentence, sorting the sentences based on their scores, and selecting a number of sentences with the highest scores. The scores can be computed by dividing a sum of frequency values of unseen words (or n-grams) in the sentence by a length of the sentence. The frequency values can be based on posts in one or more particular domains, such as the public domain, the private domain, or other specialized domains.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: March 13, 2018
    Assignee: Facebook, Inc.
    Inventor: Matthias Gerhard Eck
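The scoring rule in the abstract above (sum of frequency values of unseen words divided by sentence length, then sort and take the top scores) can be written as a short sketch. The function names, whitespace tokenization, and the choice to count every occurrence of a repeated unseen word are assumptions made for illustration only.

```python
def coverage_score(sentence, seen_words, frequency):
    """Sum of frequency values of unseen words, divided by sentence length in words."""
    words = sentence.lower().split()  # assumption: simple whitespace tokenization
    if not words:
        return 0.0
    unseen = [w for w in words if w not in seen_words]
    return sum(frequency.get(w, 0) for w in unseen) / len(words)


def select_sentences(sentences, seen_words, frequency, k):
    """Sort sentences by score, highest first, and return the top k."""
    ranked = sorted(
        sentences,
        key=lambda s: coverage_score(s, seen_words, frequency),
        reverse=True,
    )
    return ranked[:k]
```

Dividing by length keeps long sentences from winning merely because they contain more words; a short sentence made up mostly of frequent unseen words scores highest.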
  • Patent number: 9904666
    Abstract: One or more embodiments present a script to a user in an interactive script environment. A digital representation of a manuscript is analyzed. This digital representation includes a set of roles and a set of information associated with each role in the set of roles. An active role in the set of roles that is associated with a given user is identified based on the analyzing. At least a portion of the manuscript is presented to the given user via a user interface. The portion includes at least a subset of information in the set of information. Information within the set of information that is associated with the active role is presented in a visually different manner than information within the set of information that is associated with a non-active role, which is a role that is associated with a user other than the given user.
    Type: Grant
    Filed: November 3, 2014
    Date of Patent: February 27, 2018
    Inventor: Randall Lee Threewits
  • Patent number: 9842593
    Abstract: Predetermined services are provided using preset instructions. A transcript of audible content provided over an electronic network and received at a communications device is analyzed to determine whether a trigger is present in the audible content. When the trigger is present in the audible content, preset instructions correlated with the trigger and instructing how to provide a predetermined service are identified. The predetermined service is provided by following the preset instructions.
    Type: Grant
    Filed: November 14, 2014
    Date of Patent: December 12, 2017
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Shiv Kumar, Paritosh Bajpay, Anand K. Singh, Satyendra Tripathi
  • Patent number: 9842592
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using non-linguistic context. In some implementations, context data indicating non-linguistic context for the utterance is received. Based on the context data, feature scores for one or more non-linguistic features are generated. The feature scores for the non-linguistic features are provided to a language model trained to process scores for non-linguistic features. The output from the language model is received, and a transcription for the utterance is determined using the output of the language model.
    Type: Grant
    Filed: February 12, 2014
    Date of Patent: December 12, 2017
    Assignee: Google Inc.
    Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar
  • Patent number: 9836452
    Abstract: Methods and systems are provided for discriminating ambiguous expressions to enhance user experience. For example, a natural language expression may be received by a speech recognition component. The natural language expression may include at least one of words, terms, and phrases of text. A dialog hypothesis set from the natural language expression may be created by using contextual information. In some cases, the dialog hypothesis set has at least two dialog hypotheses. A plurality of dialog responses may be generated for the dialog hypothesis set. The dialog hypothesis set may be ranked based on an analysis of the plurality of the dialog responses. An action may be performed based on ranking the dialog hypothesis set.
    Type: Grant
    Filed: December 30, 2014
    Date of Patent: December 5, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jean-Philippe Robichaud, Ruhi Sarikaya
  • Patent number: 9830909
    Abstract: A speech recognition method and system enables user-configurable speech commands. For a given speech command, the speech recognition engine provides a mechanism for the end-user to select speech command terms to use in substitution for the given speech command. The speech recognition engine, or module thereof, forms a replacement command for the given speech command from the user-selected speech command terms. The speech recognition engine thereafter is synonymously responsive to user utterance of the replacement command as though the user issued/voiced the associated counterpart given speech command.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: November 28, 2017
    Assignee: Kopin Corporation
    Inventor: Christopher Parkinson
  • Patent number: 9817881
    Abstract: A method, apparatus, and tangible computer readable medium for processing a Hidden Markov Model (HMM) structure are disclosed herein. For example, the method includes receiving Hidden Markov Model (HMM) information from an external system. The method also includes processing back pointer data and first HMM states scores for one or more NULL states in the HMM information. Second HMM state scores are processed for one or more non-NULL states in the HMM information based on at least one predecessor state. Further, the method includes transferring the second HMM state scores to the external system.
    Type: Grant
    Filed: October 16, 2013
    Date of Patent: November 14, 2017
    Assignee: Cypress Semiconductor Corporation
    Inventors: Ojas A. Bapat, Richard M. Fastow, Jens Olson, Kenichi Kumatani
  • Patent number: 9805371
    Abstract: A user may respond to a request of another user by entering text, such as a customer service representative responding to a customer. Suggestions of responses may be provided to the responding user so that the responding user may use a suggested response instead of entering text to provide a response. Previous messages between the two users and other information may be used to determine an appropriate suggested response. A conversation feature vector may be determined from previous messages, and response feature vectors may be determined from the conversation feature vector. The response feature vectors may be used to determine an appropriate suggested response.
    Type: Grant
    Filed: September 1, 2016
    Date of Patent: October 31, 2017
    Assignee: ASAPP, INC.
    Inventors: Gustavo Sapoznik, Shawn Henry
  • Patent number: 9792907
    Abstract: Techniques related to key phrase detection for applications such as wake on voice are discussed. Such techniques may include updating a start state based rejection model and a key phrase model based on scores of sub-phonetic units from an acoustic model to generate a rejection likelihood score and a key phrase likelihood score and determining whether received audio input is associated with a predetermined key phrase based on the rejection likelihood score and the key phrase likelihood score.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: October 17, 2017
    Assignee: Intel IP Corporation
    Inventors: Tobias Bocklet, Joachim Hofer
  • Patent number: 9753918
    Abstract: A speech translation system and methods for cross-lingual communication that enable users to improve and customize content and usage of the system easily. The methods include, in response to receiving an utterance including a first term associated with a field, translating the utterance into a second language. In response to receiving an indication to add the first term associated with the field to a first recognition lexicon, adding the first term associated with the field and the determined translation to a first machine translation module and to a shared database for a community associated with the field of the first term associated with the field, wherein the first term associated with the field added to the shared database is accessible by the community.
    Type: Grant
    Filed: January 5, 2015
    Date of Patent: September 5, 2017
    Assignee: Facebook, Inc.
    Inventors: Alexander Waibel, Ian R. Lane
  • Patent number: 9747893
    Abstract: A computer-based, unsupervised training method for an N-gram language model includes reading, by a computer, recognition results obtained as a result of speech recognition of speech data; acquiring, by the computer, a reliability for each of the read recognition results; referring, by the computer, to the recognition result and the acquired reliability to select an N-gram entry; and training, by the computer, the N-gram language model about selected one or more of the N-gram entries using all recognition results.
    Type: Grant
    Filed: October 6, 2016
    Date of Patent: August 29, 2017
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
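One possible reading of the two-stage procedure in the abstract above is sketched here: N-gram entries are selected only from recognition results judged reliable, and the selected entries are then counted over all recognition results. The abstract does not say how reliability is used, so the fixed `min_reliability` cutoff, the restriction to bigrams, and the function names are all assumptions.

```python
from collections import Counter


def bigrams(words):
    """Return the list of adjacent word pairs."""
    return list(zip(words, words[1:]))


def train_counts(results, min_reliability):
    """results: list of (recognized_text, reliability) pairs.

    Stage 1: select bigram entries from results with reliability >= min_reliability.
    Stage 2: count the selected entries across ALL results, reliable or not.
    """
    selected = set()
    for text, reliability in results:
        if reliability >= min_reliability:
            selected.update(bigrams(text.split()))

    counts = Counter()
    for text, _ in results:
        for entry in bigrams(text.split()):
            if entry in selected:
                counts[entry] += 1
    return counts
```

In this sketch a bigram seen only in low-reliability output never enters the model, while a selected bigram still gains count from low-reliability results that contain it.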
  • Patent number: 9721561
    Abstract: In a speech recognition system, deep neural networks (DNNs) are employed in phoneme recognition. While DNNs typically provide better phoneme recognition performance than other techniques, such as Gaussian mixture models (GMM), adapting a DNN to a particular speaker is a real challenge. According to at least one example embodiment, speech data and corresponding speaker data are both applied as input to a DNN. In response, the DNN generates a prediction of a phoneme based on the input speech data and the corresponding speaker data. The speaker data may be generated from the corresponding speech data.
    Type: Grant
    Filed: December 5, 2013
    Date of Patent: August 1, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Yun Tang, Venkatesh Nagesha, Xing Fan
  • Patent number: 9699298
    Abstract: An electronic communication device and method for communication including automatically selecting, composing, and/or presenting a number of proposed responses to a message is disclosed. The device can present the incoming message content to the user together with an alert to the user that one or more proposed responses are available for review and selection. The device can calculate a confidence metric for each proposed response. The user alert and/or presentation of the messages can be made to vary in intensity in accordance with the confidence metric of the one or more proposed responses. The presentation of the proposed responses can be alternated and/or combined with a method for the device to receive user input to the content of the proposed response. The user can edit a proposed response rather than composing a complete response.
    Type: Grant
    Filed: August 10, 2015
    Date of Patent: July 4, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Mark Rubinstein, Jason Nelson
  • Patent number: 9684650
    Abstract: A penalized loss is optimized using a corpus of language samples respective to a set of parameters of a language model. The penalized loss includes a function measuring predictive accuracy of the language model respective to the corpus of language samples and a penalty comprising a tree-structured norm. The trained language model with optimized values for the parameters generated by the optimizing is applied to predict a symbol following a sequence of symbols of the language modeled by the language model. In some embodiments the penalty comprises a tree-structured lp-norm, such as a tree-structured l2-norm or a tree-structured l∞-norm. In some embodiments a tree-structured l∞-norm operates on a collapsed suffix trie in which any series of suffixes of increasing lengths which are always observed in the same context are collapsed into a single node. The optimizing may be performed using a proximal step algorithm.
    Type: Grant
    Filed: September 10, 2014
    Date of Patent: June 20, 2017
    Assignee: XEROX CORPORATION
    Inventors: Anil Kumar Nelakanti, Guillaume M. Bouchard, Cedric Archambeau, Francis Bach, Julien Mairal
  • Patent number: 9640178
    Abstract: A speech recognition method and system enables user-configurable speech commands. For a given speech command, the speech recognition engine provides a mechanism for the end-user to select speech command terms to use in substitution for the given speech command. The speech recognition engine, or module thereof, forms a replacement command for the given speech command from the user-selected speech command terms. The speech recognition engine thereafter is synonymously responsive to user utterance of the replacement command as though the user issued/voiced the associated counterpart given speech command.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: May 2, 2017
    Assignee: Kopin Corporation
    Inventor: Christopher Parkinson
  • Patent number: 9626354
    Abstract: For using a tone indicator to determine a language for text recognition, an apparatus for using a tone indicator to identify language in text recognition is disclosed. The apparatus for text recognition using a tone indicator includes a text input module, a tone indicator module, and a language identification module. A method and computer program product also perform the functions of the apparatus.
    Type: Grant
    Filed: January 21, 2014
    Date of Patent: April 18, 2017
    Assignee: Lenovo (Singapore) PTE. LTD.
    Inventor: Xin Feng
  • Patent number: 9620111
    Abstract: Described herein are systems and methods for the generation and maintenance of language models. Language models are developed based at least in part on transaction data from one or more users. These transactions may include purchases and other interactions between one or more users and one or more online merchants. The language models may be associated with a particular user or group of users. Ongoing transactions may modify the language models. The language models may be used to recognize spoken input from the one or more users.
    Type: Grant
    Filed: May 1, 2012
    Date of Patent: April 11, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Matthew P. Bell, Stephen Polansky, Yuzo Watanabe
  • Patent number: 9594744
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transcribing utterances into written text are disclosed. The methods, systems, and apparatus include actions of obtaining a lexicon model that maps phones to spoken text and obtaining a language model that assigns probabilities to written text. Further includes generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of the written text to an item of the spoken text. Additionally, the actions include constructing a decoding network for transcribing utterances into written text, by composing the lexicon model, the inverse of the transducer, and the language model.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: March 14, 2017
    Assignee: Google Inc.
    Inventors: Hasim Sak, Francoise Beaufays
  • Patent number: 9564125
    Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
    Type: Grant
    Filed: October 22, 2013
    Date of Patent: February 7, 2017
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III, Ute Winter
  • Patent number: 9542604
    Abstract: A method and apparatus of providing a combined summary by receiving monitored audio and video are provided. The method includes: receiving audio and video captured by at least one network camera; generating a video summary by detecting at least one video event from at least one of the audio and the video; generating an audio summary by detecting at least one audio event from at least one of the audio and the video; extracting at least one section of the video summary corresponding to the at least one audio event, and storing the extracted at least one section of the video summary with the audio summary; and providing a display of the video reproducing apparatus with a video summary control interface for controlling the video summary and an audio summary control interface for controlling the audio summary.
    Type: Grant
    Filed: May 13, 2015
    Date of Patent: January 10, 2017
    Assignee: Hanwha Techwin Co., Ltd.
    Inventor: Sungbong Cho
  • Patent number: 9530402
    Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
    Type: Grant
    Filed: October 22, 2013
    Date of Patent: December 27, 2016
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III, Ute Winter
  • Patent number: 9484025
    Abstract: The disclosure includes a system and method for configuring custom vocabularies for personalized speech recognition. The system includes a processor and a memory storing instructions that when executed cause the system to: detect a provisioning trigger event; determine a state of a journey associated with a user based on the provisioning trigger event; determine one or more interest places based on the state of the journey; populate a place vocabulary associated with the user using the one or more interest places; filter the place vocabulary based on one or more place filtering parameters; and register the filtered place vocabulary for the user.
    Type: Grant
    Filed: October 15, 2013
    Date of Patent: November 1, 2016
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Divya Sai Toopran, Vinuth Rai, Rahul Parundekar
  • Patent number: 9471064
    Abstract: A method for controlling a drone includes receiving a natural language request for information about a spatial location, parsing the natural language request into data requests, configuring a flight plan and controlling one or more drones to fly over the spatial location to obtain data types based on the data requests, and extracting and analyzing data to answer the request. The method can include extracting data points from the data types, obtaining labels from a user for one or more of the data points, predicting labels for unlabeled data points from a learning algorithm using the labels obtained from the user, determining the predicted labels are true labels for the unlabeled data points and combining the extracted data, the user labeled data points and the true labeled data points to answer the request for information. The learning algorithm may be active learning using a support vector machine.
    Type: Grant
    Filed: December 8, 2015
    Date of Patent: October 18, 2016
    Assignee: International Business Machines Corporation
    Inventors: Gregory F. Boland, James R. Kozloski, Yu Ma, Justin G. Manweiler, Kevin E. Siemonsen, Umut Topkara, Katherine Vogt, Justin D. Weisz
  • Patent number: 9449599
    Abstract: Various embodiments contemplate systems and methods for performing automatic speech recognition (ASR) and natural language understanding (NLU) that enable high accuracy recognition and understanding of freely spoken utterances which may contain proper names and similar entities. The proper name entities may contain or be comprised wholly of words that are not present in the vocabularies of these systems as normally constituted. Recognition of the other words in the utterances in question—e.g., words that are not part of the proper name entities—may occur at regular, high recognition accuracy. Various embodiments provide as output not only accurately transcribed running text of the complete utterance, but also a symbolic representation of the meaning of the input, including appropriate symbolic representations of proper name entities, adequate to allow a computer system to respond appropriately to the spoken request without further analysis of the user's input.
    Type: Grant
    Filed: May 30, 2014
    Date of Patent: September 20, 2016
    Assignee: PROMPTU SYSTEMS CORPORATION
    Inventor: Harry William Printz
  • Patent number: 9406298
    Abstract: Most speaker recognition systems use i-vectors which are compact representations of speaker voice characteristics. Typical i-vector extraction procedures are complex in terms of computations and memory usage. According to an embodiment, a method and corresponding apparatus for speaker identification, comprise determining a representation for each component of a variability operator, representing statistical inter- and intra-speaker variability of voice features with respect to a background statistical model, in terms of a linear operator common to all components of the variability operator and having a first dimension larger than a second dimension of the components of the variability operator; computing statistical voice characteristics of a particular speaker using the determined representations; and employing the statistical voice characteristics of the particular speaker in performing speaker recognition.
    Type: Grant
    Filed: April 4, 2013
    Date of Patent: August 2, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Sandro Cumani, Pietro Laface
  • Patent number: 9367612
    Abstract: A system identifies a set of initial segments of a time-based data item, such as audio. The segments can be defined at regular time intervals within the time-based data item. The initial segments are short segments. The system computes a short-timescale vectorial representation for each initial segment and compares the short-timescale vectorial representation for each initial segment with other short-timescale vectorial representations of the segments in a time duration within the time-based data item (e.g., audio) immediately preceding or immediately following the initial segment. The system generates a representation of long-timescale information for the time-based data item based on a comparison of the short-timescale vectorial representations of the initial segments and the short-timescale vectorial representations of immediate segments. The representation of long-timescale information identifies an underlying repetition structure of the time-based data item, such as rhythm or phrasing in an audio item.
    Type: Grant
    Filed: November 18, 2011
    Date of Patent: June 14, 2016
    Assignee: GOOGLE INC.
    Inventors: Douglas Eck, Jay Yagnik
  • Patent number: 9367526
    Abstract: A language processing application employs a classing function optimized for the underlying production application context for which it is expected to process speech. A combination of class based and word based features generates a classing function optimized for a particular production application, meaning that a language model employing the classing function uses word classes having a high likelihood of accurately predicting word sequences encountered by a language model invoked by the production application. The classing function optimizes word classes by aligning the objective of word classing with the underlying language processing task to be performed by the production application. The classing function is optimized to correspond to usage in the production application context using class-based and word-based features by computing a likelihood of a word in an n-gram and a frequency of a word within a class of the n-gram.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: June 14, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Paul Vozila, Maximilian Bisani, Yi Su, Stephen M. Chu, Stanley F. Chen, Ruhi Sarikaya, Bhuvana Ramabhadran
  • Patent number: 9288327
    Abstract: An apparatus and method for routing an incoming call includes the apparatus configured to perform a routing decision resulting in the incoming call being routed to the voicemail of a called user. The apparatus further creates a transcription of a voicemail message being recorded on the voicemail of the called user. This is done by applying a speech analysis process to the voicemail message as the voicemail message is being recorded. Information about a context of the incoming call is determined from the transcription and a second routing decision is performed based on the information.
    Type: Grant
    Filed: May 14, 2014
    Date of Patent: March 15, 2016
    Assignee: Mitel Networks Corporation
    Inventors: Trung Tim Trinh, Tom Gray
  • Patent number: 9275128
    Abstract: Generating a document index comprises: obtaining a document to be indexed; determining whether each monadic partition obtained from the document is a filter character and if so, forming a polynary partition with the monadic partition and at least one adjacent monadic partition and indexing the polynary partition, otherwise, indexing the monadic partition. Querying data comprising: receiving a data query, determining whether each monadic partition obtained from the data query is a filter character and if so, forming a polynary partition with the monadic partition and at least one adjacent monadic partition and using the polynary partition to obtain search results, otherwise, using the monadic partition to obtain search results; and combining search results to form a final query search result.
    Type: Grant
    Filed: July 20, 2010
    Date of Patent: March 1, 2016
    Assignee: Alibaba Group Holding Limited
    Inventors: Lei Wei, Jiaxiang Shen
  • Patent number: 9275039
    Abstract: According to an embodiment, an information processing device includes a first set generating unit that generates a first set of states reachable by following ε-transitions in a forward direction and a reverse direction among states included in a finite state automaton; a second set generating unit that generates a second set representing a set of states included in each of a plurality of first sets connected by one transition that is not an ε-transition and the first set that is not connected to another first set by one transition that is not an ε-transition; a combination generating unit that obtains, among combinations of ε-removal patterns to be respectively applied to the ε-transitions in the second set, a combination with which the number of transitions after removing ε-transitions is the smallest; and a removing unit that removes ε-transitions in the second set by the removal patterns included in the obtained combination.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: March 1, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Manabu Nagao
  • Patent number: 9256457
    Abstract: A system for providing an interactive response system for hosted services may include a processor and a memory. The processor may facilitate steps of receiving data streams from participant devices, such as devices participating in a multimedia conference, and transmitting, in response to receiving the data streams from the participant devices, the data streams to the other of the participant devices. The steps may further include receiving, from one of the participant devices, a request to initiate communication with an invitee device, processing the data streams to determine a language associated with the data streams, and providing, to the invitee device, a communication in the determined language. The communication provided to the invitee device may be an initial communication of an interactive response system, such as an interactive voice response system or an interactive messaging response system.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: February 9, 2016
    Assignee: Google Inc.
    Inventors: Matthew John Leske, Jarkko Oikarinen, Tomas Lundqvist
  • Patent number: 9224384
    Abstract: Embodiments of the present invention include an acoustic processing device, a method for acoustic signal processing, and a speech recognition system. The speech processing device can include a processing unit, a histogram pruning unit, and a pre-pruning unit. The processing unit is configured to calculate one or more Hidden Markov Model (HMM) pruning thresholds. The histogram pruning unit is configured to prune one or more HMM states to generate one or more active HMM states. The pruning is based on the one or more pruning thresholds. The pre-pruning unit is configured to prune the one or more active HMM states based on an adjustable pre-pruning threshold. Further, the adjustable pre-pruning threshold is based on the one or more pruning thresholds.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: December 29, 2015
    Assignee: Cypress Semiconductor Corporation
    Inventor: Ojas Ashok Bapat
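
As a rough illustration of the pruning flow described in the abstract of patent 9224384, the following Python sketch derives a histogram-based cutoff, prunes HMM states to an active set, and then applies a tighter pre-pruning cutoff derived from the first one. The abstract does not say how the thresholds are computed, so the cost convention (lower is better), the bin count, the `ratio` parameter, and all function names here are assumptions made for illustration, not the patented method.

```python
import math
from typing import Dict


def histogram_threshold(costs: Dict[int, float], max_active: int,
                        num_bins: int = 32) -> float:
    """Return an exclusive cost cutoff that keeps at most about max_active states.

    Costs are bucketed into equal-width bins; bins are accumulated from the
    best (lowest) cost upward until adding the next bin would exceed the budget.
    """
    if not costs or len(costs) <= max_active:
        return math.inf  # nothing needs to be pruned
    lo, hi = min(costs.values()), max(costs.values())
    if hi == lo:
        return math.inf  # all states tie; a histogram cannot separate them
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for cost in costs.values():
        counts[min(int((cost - lo) / width), num_bins - 1)] += 1
    kept = 0
    for i, count in enumerate(counts):
        # Always keep the best bin (i == 0), even if it overshoots the budget.
        if i > 0 and kept + count > max_active:
            return lo + i * width
        kept += count
    return math.inf


def prune(costs: Dict[int, float], threshold: float) -> Dict[int, float]:
    """Keep only states whose cost is strictly below the cutoff."""
    return {state: c for state, c in costs.items() if c < threshold}


def pre_prune(active: Dict[int, float], threshold: float,
              ratio: float = 0.2) -> Dict[int, float]:
    """Hypothetical pre-pruning step: tighten the cutoff toward the best cost.

    `ratio` in (0, 1] is the adjustable knob; 1.0 leaves the cutoff unchanged.
    """
    if not active or math.isinf(threshold):
        return dict(active)
    best = min(active.values())
    return prune(active, best + ratio * (threshold - best))


# Example: six HMM states with path costs, budget of three active states.
state_costs = {0: 1.0, 1: 2.0, 2: 3.0, 3: 10.0, 4: 20.0, 5: 33.0}
cutoff = histogram_threshold(state_costs, max_active=3)
active_states = prune(state_costs, cutoff)
survivors = pre_prune(active_states, cutoff, ratio=0.2)
```

Deriving the pre-pruning cutoff from the histogram cutoff mirrors the abstract's statement that the adjustable pre-pruning threshold is based on the pruning thresholds; the linear interpolation used here is only one possible choice.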