Specialized Models Patents (Class 704/255)
  • Patent number: 9842592
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using non-linguistic context. In some implementations, context data indicating non-linguistic context for the utterance is received. Based on the context data, feature scores for one or more non-linguistic features are generated. The feature scores for the non-linguistic features are provided to a language model trained to process scores for non-linguistic features. The output from the language model is received, and a transcription for the utterance is determined using the output of the language model.
    Type: Grant
    Filed: February 12, 2014
    Date of Patent: December 12, 2017
    Assignee: Google Inc.
    Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar
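    Illustrative sketch: the abstract above describes feeding scores for non-linguistic features into a language model; a minimal stand-in is to combine a base language-model score with weighted context-feature scores when re-ranking candidate transcriptions. The feature names, weights, and scores below are hypothetical, not taken from the patent.
    ```python
    # Hedged sketch: log-linear re-ranking with non-linguistic context features.
    def context_feature_scores(candidate, context):
        """Scores for hypothetical non-linguistic features of a candidate transcription."""
        scores = {}
        food_words = {"pizza", "sushi", "reservation"}
        # Feature: candidate mentions food words while a restaurant app is in the foreground.
        scores["app_match"] = 1.0 if (context["app"] == "restaurants"
                                      and food_words & set(candidate.split())) else 0.0
        # Feature: greeting words are more likely in the morning.
        scores["morning_greeting"] = 1.0 if (context["hour"] < 12
                                             and "morning" in candidate) else 0.0
        return scores

    def rescore(candidates, base_lm_logprob, context, weights):
        """Combine base LM log-probabilities with weighted non-linguistic feature scores."""
        rescored = []
        for cand in candidates:
            score = base_lm_logprob[cand]
            for name, value in context_feature_scores(cand, context).items():
                score += weights[name] * value
            rescored.append((score, cand))
        return max(rescored)[1]

    candidates = ["good morning", "good mourning"]
    base_lm_logprob = {"good morning": -4.1, "good mourning": -3.9}
    context = {"app": "email", "hour": 9}
    weights = {"app_match": 0.5, "morning_greeting": 1.0}
    print(rescore(candidates, base_lm_logprob, context, weights))  # -> "good morning"
    ```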
  • Patent number: 9842593
    Abstract: Predetermined services are provided using preset instructions. A transcript of audible content provided over an electronic network and received at a communications device is analyzed to determine whether a trigger is present in the audible content. When the trigger is present in the audible content, preset instructions correlated with the trigger and instructing how to provide a predetermined service are identified. The predetermined service is provided by following the preset instructions.
    Type: Grant
    Filed: November 14, 2014
    Date of Patent: December 12, 2017
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Shiv Kumar, Paritosh Bajpay, Anand K. Singh, Satyendra Tripathi
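    Illustrative sketch: the trigger/preset-instruction flow above can be pictured as scanning a transcript for known trigger phrases and executing the preset instructions correlated with the first match. The trigger phrases and service steps below are invented for illustration.
    ```python
    # Hypothetical trigger phrases mapped to preset instructions for a service.
    PRESETS = {
        "appointment reminder": ["look up calendar entry", "send confirmation text"],
        "prescription ready":   ["notify user", "offer pharmacy directions"],
    }

    def find_trigger(transcript):
        """Return the first trigger phrase present in the transcript, if any."""
        text = transcript.lower()
        for trigger in PRESETS:
            if trigger in text:
                return trigger
        return None

    def provide_service(transcript):
        trigger = find_trigger(transcript)
        if trigger is None:
            return []
        # Provide the predetermined service by following the preset instructions.
        return [f"executing: {step}" for step in PRESETS[trigger]]

    print(provide_service("Hello, this is a call to confirm your appointment reminder."))
    ```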
  • Patent number: 9836452
    Abstract: Methods and systems are provided for discriminating ambiguous expressions to enhance user experience. For example, a natural language expression may be received by a speech recognition component. The natural language expression may include at least one of words, terms, and phrases of text. A dialog hypothesis set from the natural language expression may be created by using contextual information. In some cases, the dialog hypothesis set has at least two dialog hypotheses. A plurality of dialog responses may be generated for the dialog hypothesis set. The dialog hypothesis set may be ranked based on an analysis of the plurality of the dialog responses. An action may be performed based on ranking the dialog hypothesis set.
    Type: Grant
    Filed: December 30, 2014
    Date of Patent: December 5, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jean-Philippe Robichaud, Ruhi Sarikaya
  • Patent number: 9830909
    Abstract: A speech recognition method and system enables user-configurable speech commands. For a given speech command, the speech recognition engine provides a mechanism for the end-user to select speech command terms to use in substitution for the given speech command. The speech recognition engine, or module thereof, forms a replacement command for the given speech command from the user-selected speech command terms. The speech recognition engine thereafter is synonymously responsive to user utterance of the replacement command as though the user issued/voiced the associated counterpart given speech command.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: November 28, 2017
    Assignee: Kopin Corporation
    Inventor: Christopher Parkinson
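    Illustrative sketch: the user-configurable substitution described above amounts to a replacement table consulted before command dispatch, so the replacement phrase triggers the same action as the original command. Command names and phrases below are invented.
    ```python
    # Built-in speech commands and user-selected replacement terms (both hypothetical).
    BUILTIN_COMMANDS = {"navigate home": "NAVIGATE_HOME", "call office": "CALL_OFFICE"}
    replacements = {}  # replacement phrase -> built-in command phrase

    def configure_replacement(builtin_phrase, user_phrase):
        """Let the end user pick their own terms for an existing speech command."""
        if builtin_phrase not in BUILTIN_COMMANDS:
            raise ValueError(f"unknown command: {builtin_phrase}")
        replacements[user_phrase] = builtin_phrase

    def dispatch(utterance):
        """Respond to the replacement phrase exactly as to the original command."""
        phrase = replacements.get(utterance, utterance)
        return BUILTIN_COMMANDS.get(phrase, "NO_MATCH")

    configure_replacement("navigate home", "take me back")
    print(dispatch("take me back"))   # -> NAVIGATE_HOME
    print(dispatch("navigate home"))  # -> NAVIGATE_HOME
    ```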
  • Patent number: 9817881
    Abstract: A method, apparatus, and tangible computer readable medium for processing a Hidden Markov Model (HMM) structure are disclosed herein. For example, the method includes receiving Hidden Markov Model (HMM) information from an external system. The method also includes processing back pointer data and first HMM state scores for one or more NULL states in the HMM information. Second HMM state scores are processed for one or more non-NULL states in the HMM information based on at least one predecessor state. Further, the method includes transferring the second HMM state scores to the external system.
    Type: Grant
    Filed: October 16, 2013
    Date of Patent: November 14, 2017
    Assignee: Cypress Semiconductor Corporation
    Inventors: Ojas A. Bapat, Richard M. Fastow, Jens Olson, Kenichi Kumatani
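    Illustrative sketch: the state-score processing above distinguishes NULL states, which forward scores without consuming an observation, from non-NULL states, which add an observation score to the best predecessor score, with back pointers recording the chosen predecessor. The toy topology and scores below are invented, not the patented hardware interface.
    ```python
    NEG_INF = float("-inf")

    # Toy HMM topology: (state, is_null, predecessors). Scores are log-probabilities.
    STATES = [
        ("entry", True,  []),
        ("s1",    False, ["entry"]),
        ("s2",    False, ["s1"]),
        ("exit",  True,  ["s2"]),
    ]

    def frame_update(prev, obs_scores):
        """One frame of HMM state-score propagation with back pointers."""
        cur, backptr = {}, {}
        # Non-NULL states: best previous-frame predecessor score plus observation score.
        for name, is_null, preds in STATES:
            if is_null:
                continue
            best_pred = max(preds, key=lambda p: prev.get(p, NEG_INF))
            cur[name] = prev.get(best_pred, NEG_INF) + obs_scores[name]
            backptr[name] = best_pred
        # NULL states: propagate the best current-frame score, no observation consumed.
        for name, is_null, preds in STATES:
            if not is_null:
                continue
            if not preds:  # entry state keeps its incoming score
                cur[name], backptr[name] = prev.get(name, NEG_INF), None
                continue
            best_pred = max(preds, key=lambda p: cur.get(p, NEG_INF))
            cur[name] = cur.get(best_pred, NEG_INF)
            backptr[name] = best_pred
        return cur, backptr

    prev = {"entry": 0.0, "s1": NEG_INF, "s2": NEG_INF, "exit": NEG_INF}
    print(frame_update(prev, {"s1": -1.2, "s2": -0.7}))
    ```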
  • Patent number: 9805371
    Abstract: A user may respond to a request of another user by entering text, such as a customer service representative responding to a customer. Suggestions of responses may be provided to the responding user so that the responding user may use a suggested response instead of entering text to provide a response. Previous messages between the two users and other information may be used to determine an appropriate suggested response. A conversation feature vector may be determined from previous messages, and response feature vectors may be determined from the conversation feature vector. The response feature vectors may be used to determine an appropriate suggested response.
    Type: Grant
    Filed: September 1, 2016
    Date of Patent: October 31, 2017
    Assignee: ASAPP, INC.
    Inventors: Gustavo Sapoznik, Shawn Henry
  • Patent number: 9792907
    Abstract: Techniques related to key phrase detection for applications such as wake on voice are discussed. Such techniques may include updating a start state based rejection model and a key phrase model based on scores of sub-phonetic units from an acoustic model to generate a rejection likelihood score and a key phrase likelihood score and determining whether received audio input is associated with a predetermined key phrase based on the rejection likelihood score and the key phrase likelihood score.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: October 17, 2017
    Assignee: Intel IP Corporation
    Inventors: Tobias Bocklet, Joachim Hofer
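    Illustrative sketch: the decision in the abstract above comes down to accumulating a rejection likelihood score and a key phrase likelihood score from per-frame acoustic-model outputs and waking only when the key phrase score beats the rejection score by a margin. The frame scores and threshold below are invented.
    ```python
    # Per-frame log-likelihoods from a hypothetical acoustic model: one score for the
    # best rejection (background) unit and one for the expected key-phrase unit.
    frames = [
        {"rejection": -2.0, "keyphrase": -1.0},
        {"rejection": -2.5, "keyphrase": -0.8},
        {"rejection": -1.5, "keyphrase": -1.1},
    ]

    THRESHOLD = 1.5  # tunable log-likelihood-ratio threshold (illustrative)

    def detect_keyphrase(frames, threshold=THRESHOLD):
        rejection_score = sum(f["rejection"] for f in frames)
        keyphrase_score = sum(f["keyphrase"] for f in frames)
        # Wake when the key phrase model beats the rejection model by a margin.
        return (keyphrase_score - rejection_score) > threshold

    print(detect_keyphrase(frames))  # True: (-2.9) - (-6.0) = 3.1 > 1.5
    ```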
  • Patent number: 9753918
    Abstract: A speech translation system and methods for cross-lingual communication that enable users to improve and customize content and usage of the system easily. The methods include, in response to receiving an utterance including a first term associated with a field, translating the utterance into a second language. In response to receiving an indication to add the first term associated with the field to a first recognition lexicon, adding the first term associated with the field and the determined translation to a first machine translation module and to a shared database for a community associated with the field of the first term associated with the field, wherein the first term associated with the field added to the shared database is accessible by the community.
    Type: Grant
    Filed: January 5, 2015
    Date of Patent: September 5, 2017
    Assignee: Facebook, Inc.
    Inventors: Alexander Waibel, Ian R. Lane
  • Patent number: 9747893
    Abstract: A computer-based, unsupervised training method for an N-gram language model includes reading, by a computer, recognition results obtained as a result of speech recognition of speech data; acquiring, by the computer, a reliability for each of the read recognition results; referring, by the computer, to the recognition result and the acquired reliability to select an N-gram entry; and training, by the computer, the N-gram language model on the selected one or more N-gram entries using all recognition results.
    Type: Grant
    Filed: October 6, 2016
    Date of Patent: August 29, 2017
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
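    Illustrative sketch: the selection step above can be approximated by admitting n-gram entries only from recognition results whose reliability clears a threshold, then accumulating counts for those entries over all recognition results. The threshold, data, and confidence values below are invented.
    ```python
    from collections import Counter

    def select_ngram_entries(results, n=2, min_reliability=0.8):
        """Pick n-gram entries from high-reliability recognition results."""
        entries = set()
        for text, reliability in results:
            if reliability < min_reliability:
                continue
            words = text.split()
            entries.update(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
        return entries

    def count_selected(results, entries, n=2):
        """Accumulate counts for the selected entries over all recognition results."""
        counts = Counter()
        for text, _ in results:
            words = text.split()
            for i in range(len(words) - n + 1):
                gram = tuple(words[i:i + n])
                if gram in entries:
                    counts[gram] += 1
        return counts

    results = [("turn on the radio", 0.95),
               ("turn of the radio", 0.40),   # low reliability: contributes counts only
               ("turn on the lights", 0.90)]
    entries = select_ngram_entries(results)
    print(count_selected(results, entries).most_common(3))
    ```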
  • Patent number: 9721561
    Abstract: In a speech recognition system, deep neural networks (DNNs) are employed in phoneme recognition. While DNNs typically provide better phoneme recognition performance than other techniques, such as Gaussian mixture models (GMM), adapting a DNN to a particular speaker is a real challenge. According to at least one example embodiment, speech data and corresponding speaker data are both applied as input to a DNN. In response, the DNN generates a prediction of a phoneme based on the input speech data and the corresponding speaker data. The speaker data may be generated from the corresponding speech data.
    Type: Grant
    Filed: December 5, 2013
    Date of Patent: August 1, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Yun Tang, Venkatesh Nagesha, Xing Fan
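    Illustrative sketch: the arrangement above applies speech features and speaker data jointly as network input; a simple way to picture it is concatenating each acoustic feature vector with a fixed speaker vector computed from that speaker's own speech before the forward pass. Layer sizes and the speaker-vector computation below are placeholders, not the patented adaptation method.
    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def speaker_vector(speech_frames, dim=8):
        """Placeholder speaker summary derived from the speaker's own frames
        (stands in for an i-vector-like representation)."""
        proj = rng.standard_normal((speech_frames.shape[1], dim))
        return speech_frames.mean(axis=0) @ proj

    def dnn_forward(x, weights):
        """Tiny feed-forward net: one ReLU hidden layer, softmax over phoneme classes."""
        W1, b1, W2, b2 = weights
        h = np.maximum(0.0, x @ W1 + b1)
        logits = h @ W2 + b2
        e = np.exp(logits - logits.max())
        return e / e.sum()

    n_frames, feat_dim, spk_dim, hidden, n_phones = 50, 40, 8, 32, 10
    frames = rng.standard_normal((n_frames, feat_dim))
    spk = speaker_vector(frames, spk_dim)

    weights = (rng.standard_normal((feat_dim + spk_dim, hidden)) * 0.1,
               np.zeros(hidden),
               rng.standard_normal((hidden, n_phones)) * 0.1,
               np.zeros(n_phones))

    # Concatenate the per-frame acoustic features with the fixed speaker vector.
    frame_input = np.concatenate([frames[0], spk])
    print(dnn_forward(frame_input, weights))  # phoneme posteriors for one frame
    ```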
  • Patent number: 9699298
    Abstract: An electronic communication device and method for communication including automatically selecting, composing, and/or presenting a number of proposed responses to a message is disclosed. The device can present the incoming message content to the user together with an alert to the user that one or more proposed responses are available for review and selection. The device can calculate a confidence metric for each proposed response. The user alert and/or presentation of the messages can be made to vary in intensity in accordance with the confidence metric of the one or more proposed responses. The presentation of the proposed responses can be alternated and/or combined with a method for the device to receive user input to the content of the proposed response. The user can edit a proposed response rather than composing a complete response.
    Type: Grant
    Filed: August 10, 2015
    Date of Patent: July 4, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Mark Rubinstein, Jason Nelson
  • Patent number: 9684650
    Abstract: A penalized loss is optimized using a corpus of language samples respective to a set of parameters of a language model. The penalized loss includes a function measuring predictive accuracy of the language model respective to the corpus of language samples and a penalty comprising a tree-structured norm. The trained language model with optimized values for the parameters generated by the optimizing is applied to predict a symbol following sequence of symbols of the language modeled by the language model. In some embodiments the penalty comprises a tree-structured lp-norm, such as a tree-structured l2-norm or a tree-structured l∞-norm. In some embodiments a tree-structured l∞-norm operates on a collapsed suffix trie in which any series of suffixes of increasing lengths which are always observed in the same context are collapsed into a single node. The optimizing may be performed using a proximal step algorithm.
    Type: Grant
    Filed: September 10, 2014
    Date of Patent: June 20, 2017
    Assignee: XEROX CORPORATION
    Inventors: Anil Kumar Nelakanti, Guillaume M. Bouchard, Cedric Archambeau, Francis Bach, Julien Mairal
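    Illustrative sketch: the optimization above minimizes a predictive loss plus a tree-structured norm penalty with proximal steps. The sketch below substitutes a plain group-l2 penalty (group soft-thresholding) for the tree-structured norm to show the proximal-gradient pattern; groups, loss, step size, and data are invented.
    ```python
    import numpy as np

    def prox_group_l2(w, groups, lam):
        """Proximal operator of lam * sum_g ||w_g||_2 (group soft-thresholding)."""
        out = w.copy()
        for g in groups:
            norm = np.linalg.norm(w[g])
            out[g] = 0.0 if norm <= lam else (1.0 - lam / norm) * w[g]
        return out

    def proximal_gradient(X, y, groups, lam=0.1, step=0.005, iters=1000):
        """Minimize 0.5*||Xw - y||^2 + lam * group-l2 penalty by proximal steps."""
        w = np.zeros(X.shape[1])
        for _ in range(iters):
            grad = X.T @ (X @ w - y)              # gradient of the smooth loss term
            w = prox_group_l2(w - step * grad, groups, step * lam)
        return w

    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 6))
    true_w = np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0])   # only the first group active
    y = X @ true_w + 0.01 * rng.standard_normal(100)
    groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
    print(np.round(proximal_gradient(X, y, groups), 2))
    ```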
  • Patent number: 9640178
    Abstract: A speech recognition method and system enables user-configurable speech commands. For a given speech command, the speech recognition engine provides a mechanism for the end-user to select speech command terms to use in substitution for the given speech command. The speech recognition engine, or module thereof, forms a replacement command for the given speech command from the user-selected speech command terms. The speech recognition engine thereafter is synonymously responsive to user utterance of the replacement command as though the user issued/voiced the associated counterpart given speech command.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: May 2, 2017
    Assignee: Kopin Corporation
    Inventor: Christopher Parkinson
  • Patent number: 9626354
    Abstract: For using a tone indicator to determine a language for text recognition, an apparatus for using a tone indicator to identify language in text recognition is disclosed. The apparatus for text recognition using a tone indicator includes a text input module, a tone indicator module, and a language identification module. A method and computer program product also perform the functions of the apparatus.
    Type: Grant
    Filed: January 21, 2014
    Date of Patent: April 18, 2017
    Assignee: Lenovo (Singapore) PTE. LTD.
    Inventor: Xin Feng
  • Patent number: 9620111
    Abstract: Described herein are systems and methods for the generation and maintenance of language models. Language models are developed based at least in part on transaction data from one or more users. These transactions may include purchases and other interactions between one or more users and one or more online merchants. The language models may be associated with a particular user or group of users. Ongoing transactions may modify the language models. The language models may be used to recognize spoken input from the one or more users.
    Type: Grant
    Filed: May 1, 2012
    Date of Patent: April 11, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Matthew P. Bell, Stephen Polansky, Yuzo Watanabe
  • Patent number: 9594744
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transcribing utterances into written text are disclosed. The methods, systems, and apparatus include actions of obtaining a lexicon model that maps phones to spoken text and obtaining a language model that assigns probabilities to written text. The actions further include generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of the written text to an item of the spoken text. Additionally, the actions include constructing a decoding network for transcribing utterances into written text by composing the lexicon model, the inverse of the transducer, and the language model.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: March 14, 2017
    Assignee: Google Inc.
    Inventors: Hasim Sak, Francoise Beaufays
  • Patent number: 9564125
    Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
    Type: Grant
    Filed: October 22, 2013
    Date of Patent: February 7, 2017
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III, Ute Winter
  • Patent number: 9542604
    Abstract: A method and apparatus for providing a combined summary by receiving monitored audio and video are provided. The method includes: receiving audio and video captured by at least one network camera; generating a video summary by detecting at least one video event from at least one of the audio and the video; generating an audio summary by detecting at least one audio event from at least one of the audio and the video; extracting at least one section of the video summary corresponding to the at least one audio event, and storing the extracted at least one section of the video summary with the audio summary; and providing a display of the video reproducing apparatus with a video summary control interface for controlling the video summary and an audio summary control interface for controlling the audio summary.
    Type: Grant
    Filed: May 13, 2015
    Date of Patent: January 10, 2017
    Assignee: Hanwha Techwin Co., Ltd.
    Inventor: Sungbong Cho
  • Patent number: 9530402
    Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
    Type: Grant
    Filed: October 22, 2013
    Date of Patent: December 27, 2016
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III, Ute Winter
  • Patent number: 9484025
    Abstract: The disclosure includes a system and method for configuring custom vocabularies for personalized speech recognition. The system includes a processor and a memory storing instructions that when executed cause the system to: detect a provisioning trigger event; determine a state of a journey associated with a user based on the provisioning trigger event; determine one or more interest places based on the state of the journey; populate a place vocabulary associated with the user using the one or more interest places; filter the place vocabulary based on one or more place filtering parameters; and register the filtered place vocabulary for the user.
    Type: Grant
    Filed: October 15, 2013
    Date of Patent: November 1, 2016
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Divya Sai Toopran, Vinuth Rai, Rahul Parundekar
  • Patent number: 9471064
    Abstract: A method for controlling a drone includes receiving a natural language request for information about a spatial location, parsing the natural language request into data requests, configuring a flight plan and controlling one or more drones to fly over the spatial location to obtain data types based on the data requests, and extracting and analyzing data to answer the request. The method can include extracting data points from the data types, obtaining labels from a user for one or more of the data points, predicting labels for unlabeled data points from a learning algorithm using the labels obtained from the user, determining the predicted labels are true labels for the unlabeled data points and combining the extracted data, the user labeled data points and the true labeled data points to answer the request for information. The learning algorithm may be active learning using a support vector machine.
    Type: Grant
    Filed: December 8, 2015
    Date of Patent: October 18, 2016
    Assignee: International Business Machines Corporation
    Inventors: Gregory F. Boland, James R. Kozloski, Yu Ma, Justin G. Manweiler, Kevin E. Siemonsen, Umut Topkara, Katherine Vogt, Justin D. Weisz
  • Patent number: 9449599
    Abstract: Various embodiments contemplate systems and methods for performing automatic speech recognition (ASR) and natural language understanding (NLU) that enable high accuracy recognition and understanding of freely spoken utterances which may contain proper names and similar entities. The proper name entities may contain or be comprised wholly of words that are not present in the vocabularies of these systems as normally constituted. Recognition of the other words in the utterances in question—e.g., words that are not part of the proper name entities—may occur at regular, high recognition accuracy. Various embodiments provide as output not only accurately transcribed running text of the complete utterance, but also a symbolic representation of the meaning of the input, including appropriate symbolic representations of proper name entities, adequate to allow a computer system to respond appropriately to the spoken request without further analysis of the user's input.
    Type: Grant
    Filed: May 30, 2014
    Date of Patent: September 20, 2016
    Assignee: PROMPTU SYSTEMS CORPORATION
    Inventor: Harry William Printz
  • Patent number: 9406298
    Abstract: Most speaker recognition systems use i-vectors which are compact representations of speaker voice characteristics. Typical i-vector extraction procedures are complex in terms of computations and memory usage. According to an embodiment, a method and corresponding apparatus for speaker identification, comprise determining a representation for each component of a variability operator, representing statistical inter- and intra-speaker variability of voice features with respect to a background statistical model, in terms of a linear operator common to all components of the variability operator and having a first dimension larger than a second dimension of the components of the variability operator; computing statistical voice characteristics of a particular speaker using the determined representations; and employing the statistical voice characteristics of the particular speaker in performing speaker recognition.
    Type: Grant
    Filed: April 4, 2013
    Date of Patent: August 2, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Sandro Cumani, Pietro Laface
  • Patent number: 9367526
    Abstract: A language processing application employs a classing function optimized for the underlying production application context for which it is expected to process speech. A combination of class based and word based features generates a classing function optimized for a particular production application, meaning that a language model employing the classing function uses word classes having a high likelihood of accurately predicting word sequences encountered by a language model invoked by the production application. The classing function optimizes word classes by aligning the objective of word classing with the underlying language processing task to be performed by the production application. The classing function is optimized to correspond to usage in the production application context using class-based and word-based features by computing a likelihood of a word in an n-gram and a frequency of a word within a class of the n-gram.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: June 14, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Paul Vozila, Maximilian Bisani, Yi Su, Stephen M. Chu, Stanley F. Chen, Ruhi Sarikaya, Bhuvana Ramabhadran
  • Patent number: 9367612
    Abstract: A system identifies a set of initial segments of a time-based data item, such as audio. The segments can be defined at regular time intervals within the time-based data item. The initial segments are short segments. The system computes a short-timescale vectorial representation for each initial segment and compares the short-timescale vectorial representation for each initial segment with other short-timescale vectorial representations of the segments in a time duration within the time-based data item (e.g., audio) immediately preceding or immediately following the initial segment. The system generates a representation of long-timescale information for the time-based data item based on a comparison of the short-timescale vectorial representations of the initial segments and the short-timescale vectorial representations of immediate segments. The representation of long-timescale information identifies an underlying repetition structure of the time-based data item, such as rhythm or phrasing in an audio item.
    Type: Grant
    Filed: November 18, 2011
    Date of Patent: June 14, 2016
    Assignee: GOOGLE INC.
    Inventors: Douglas Eck, Jay Yagnik
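    Illustrative sketch: the comparison step above can be approximated by giving each short segment a vector (here a magnitude spectrum) and comparing it with the segments in a window immediately before and after it; peaks at a repeated lag in the resulting similarity matrix reveal repetition structure such as rhythm. Segment length, window, and similarity measure are illustrative choices.
    ```python
    import numpy as np

    def segment_vectors(signal, seg_len):
        """Short-timescale vectorial representation: magnitude spectrum per segment."""
        n_seg = len(signal) // seg_len
        segs = signal[:n_seg * seg_len].reshape(n_seg, seg_len)
        return np.abs(np.fft.rfft(segs, axis=1))

    def neighborhood_similarity(vectors, window=8):
        """Cosine similarity of each segment with segments in a surrounding window."""
        norm = vectors / (np.linalg.norm(vectors, axis=1, keepdims=True) + 1e-9)
        n = len(norm)
        sim = np.zeros((n, 2 * window + 1))
        for i in range(n):
            for k, j in enumerate(range(i - window, i + window + 1)):
                if 0 <= j < n:
                    sim[i, k] = norm[i] @ norm[j]
        return sim  # peaks at a fixed lag indicate a repeating period

    sr, seg_len = 8000, 200
    t = np.arange(sr * 2) / sr
    audio = np.sin(2 * np.pi * 440 * t) * (np.sin(2 * np.pi * 2 * t) > 0)  # pulsed tone
    sim = neighborhood_similarity(segment_vectors(audio, seg_len))
    print(sim.shape, sim[20].round(2))
    ```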
  • Patent number: 9288327
    Abstract: An apparatus and method for routing an incoming call includes the apparatus configured to perform a routing decision resulting in the incoming call being routed to the voicemail of a called user. The apparatus further creates a transcription of a voicemail message being recorded on the voicemail of the called user. This is done by applying a speech analysis process to the voicemail message as the voicemail message is being recorded. Information about a context of the incoming call is determined from the transcription and a second routing decision is performed based on the information.
    Type: Grant
    Filed: May 14, 2014
    Date of Patent: March 15, 2016
    Assignee: Mitel Networks Corporation
    Inventors: Trung Tim Trinh, Tom Gray
  • Patent number: 9275039
    Abstract: According to an embodiment, an information processing device includes a first set generating unit that generates a first set of states reachable by following ε-transitions in a forward direction and a reverse direction among states included in a finite state automaton; a second set generating unit that generates a second set representing a set of states included in each of a plurality of first sets connected by one transition that is not an ε-transition and the first set that is not connected to another first set by one transition that is not an ε-transition; a combination generating unit that obtains, among combinations of ε-removal patterns to be respectively applied to the ε-transitions in the second set, a combination with which the number of transitions after removing ε-transitions is the smallest; and a removing unit that removes ε-transitions in the second set by the removal patterns included in the obtained combination.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: March 1, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Manabu Nagao
  • Patent number: 9275128
    Abstract: Generating a document index comprises: obtaining a document to be indexed; determining whether each monadic partition obtained from the document is a filter character and if so, forming a polynary partition with the monadic partition and at least one adjacent monadic partition and indexing the polynary partition, otherwise, indexing the monadic partition. Querying data comprises: receiving a data query, determining whether each monadic partition obtained from the data query is a filter character and if so, forming a polynary partition with the monadic partition and at least one adjacent monadic partition and using the polynary partition to obtain search results, otherwise, using the monadic partition to obtain search results; and combining search results to form a final query search result.
    Type: Grant
    Filed: July 20, 2010
    Date of Patent: March 1, 2016
    Assignee: Alibaba Group Holding Limited
    Inventors: Lei Wei, Jiaxiang Shen
  • Patent number: 9256457
    Abstract: A system for providing an interactive response system for hosted services may include a processor and a memory. The processor may facilitate steps of receiving data streams from participant devices, such as devices participating in a multimedia conference, and transmitting, in response to receiving the data streams from the participant devices, the data streams to the other of the participant devices. The steps may further include receiving, from one of the participant devices, a request to initiate communication with an invitee device, processing the data streams to determine a language associated with the data streams, and providing, to the invitee device, a communication in the determined language. The communication provided to the invitee device may be an initial communication of an interactive response system, such as an interactive voice response system or an interactive messaging response system.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: February 9, 2016
    Assignee: Google Inc.
    Inventors: Matthew John Leske, Jarkko Oikarinen, Tomas Lundqvist
  • Patent number: 9224384
    Abstract: Embodiments of the present invention include an acoustic processing device, a method for acoustic signal processing, and a speech recognition system. The speech processing device can include a processing unit, a histogram pruning unit, and a pre-pruning unit. The processing unit is configured to calculate one or more Hidden Markov Model (HMM) pruning thresholds. The histogram pruning unit is configured to prune one or more HMM states to generate one or more active HMM states. The pruning is based on the one or more pruning thresholds. The pre-pruning unit is configured to prune the one or more active HMM states based on an adjustable pre-pruning threshold. Further, the adjustable pre-pruning threshold is based on the one or more pruning thresholds.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: December 29, 2015
    Assignee: Cypress Semiconductor Corporation
    Inventor: Ojas Ashok Bapat
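    Illustrative sketch: the two thresholds above can be pictured as (1) a pruning threshold derived from the current best score (beam) tightened so only a bounded number of states survive (histogram pruning), and (2) an adjustable pre-pruning threshold derived from that pruning threshold. Scores, beam width, and margins below are invented.
    ```python
    def compute_threshold(state_scores, beam=10.0, max_active=4):
        """Beam plus histogram pruning threshold from the current best score."""
        best = max(state_scores.values())
        beam_threshold = best - beam
        # Histogram pruning: tighten so at most max_active states survive.
        kept = sorted(state_scores.values(), reverse=True)[:max_active]
        return max(beam_threshold, kept[-1])

    def prune(state_scores, threshold):
        return {s: v for s, v in state_scores.items() if v >= threshold}

    def pre_prune(state_scores, threshold, margin=2.0):
        """Pre-pruning with an adjustable threshold based on the pruning threshold."""
        return {s: v for s, v in state_scores.items() if v >= threshold - margin}

    scores = {"s0": -5.0, "s1": -7.5, "s2": -18.0, "s3": -9.0, "s4": -6.2, "s5": -30.0}
    thr = compute_threshold(scores)
    active = prune(scores, thr)          # active states kept for this frame
    expand = pre_prune(scores, thr)      # looser set considered for expansion
    print(thr, sorted(active), sorted(expand))
    ```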
  • Patent number: 9225879
    Abstract: A method for video sequential alignment is provided. The method includes inputting a first video sequence and a second video sequence, and extracting features of the first video sequence and the second video sequence. The method also includes generating an expanded path using a hierarchical sequential alignment algorithm and generating masks containing search space using keyframe matching. Further, the method includes calculating the lowest path cost within search space to find a minimal alignment path using a block A* algorithm and outputting a final alignment result after applying the block A* algorithm to search space.
    Type: Grant
    Filed: December 27, 2013
    Date of Patent: December 29, 2015
    Assignee: TCL RESEARCH AMERICA INC.
    Inventors: Armin Kappeler, Haiguang Li, Michail Iliadis, Haohong Wang
  • Patent number: 9202459
    Abstract: Methods and systems are provided for managing speech dialog of a speech system. In one embodiment, a method includes: receiving a first utterance from a user of the speech system; determining a first list of possible results from the first utterance, wherein the first list includes at least two elements that each represent a possible result; analyzing the at least two elements of the first list to determine an ambiguity of the elements; and generating a speech prompt to the user based on partial orthography and the ambiguity.
    Type: Grant
    Filed: April 19, 2013
    Date of Patent: December 1, 2015
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Eli Tzirkel-Hancock, Gaurav Talwar, Xufang Zhao, Greg T. Lindemann
  • Patent number: 9190058
    Abstract: Embodiments related to recognizing speech inputs are disclosed. One disclosed embodiment provides a method for recognizing a speech input including receiving depth information of a physical space from a depth camera, determining an identity of a user in the physical space based on the depth information, receiving audio information from one or more microphones, and determining a speech input from the audio input. If the speech input comprises an ambiguous term, the ambiguous term in the speech input is compared to one or more of depth image data received from the depth image sensor and digital content consumption information for the user to identify an unambiguous term corresponding to the ambiguous term. After identifying the unambiguous term, an action is taken on the computing device based on the speech input and the unambiguous term.
    Type: Grant
    Filed: January 25, 2013
    Date of Patent: November 17, 2015
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventor: Christian Klein
  • Patent number: 9110888
    Abstract: A service server apparatus is provided which can realize a service as if a speaker has a conversation in a face-to-face manner using existing applications. The service server apparatus includes a service activating unit receiving an instruction for performing a task during a communication, a telephone/call control unit recording a call speech of the speaker during the communication between speech communication terminal devices, a speech recognizing enabler, a text translating enabler, and a speech synthesizing enabler performing the task based on the recorded call speech when the instruction is received, and a communication control enabler providing task data obtained through a performance of the task to the speech communication terminal device. The task data is text data obtained by converting the call speech into a text, translated text data obtained by translating the text data into another language, and speech data obtained by converting the translated text data into a speech.
    Type: Grant
    Filed: May 24, 2012
    Date of Patent: August 18, 2015
    Assignee: NTT DOCOMO, Inc.
    Inventors: Kazunori Nasu, Seiji Shibata, Mari Akuzawa, Ken Uchiyama
  • Patent number: 9104768
    Abstract: Techniques are described for aggregating person-based information on a computing device. The computing device includes a display device and one or more processors. The processors are configured to determine a context of the computing device, and identify a contact associated with the currently-executing application and a plurality of contact identifiers associated with the contact. The processors are configured to receive, from two or more different sources and based on at least one contact, information associated with the contact that includes a respective portion of information received from each of the different sources, determine, based on the context of the computing device, a respective priority value for each of the portions of the information, select, based on the respective priority value, one or more portions of the information, and output, for display, a graphical user interface including at least the selected one or more portions of the information.
    Type: Grant
    Filed: April 2, 2014
    Date of Patent: August 11, 2015
    Assignee: Google Inc.
    Inventors: Marco Paglia, Alexander Faaborg
  • Patent number: 9093076
    Abstract: A multipass processing system includes a first grammar-based speech recognition system that compares a spoken utterance to a sub-grammar. The sub-grammar includes keywords or key phrases from active grammars that each uniquely identifies one of many application engines. The first grammar-based speech recognition system generates a first grammar-based speech recognition result and a first grammar-based confidence score. A demultiplexer receives the spoken utterance through an input. The demultiplexer transmits the spoken utterance to one of many other grammar-based speech recognition systems based on the first grammar-based speech recognition result.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: July 28, 2015
    Assignee: 2236008 Ontario Inc.
    Inventor: Darrin Kenneth Fry
  • Patent number: 9093075
    Abstract: A method is disclosed herein for recognizing a repeated utterance in a mobile computing device via a processor. A first utterance is detected being spoken into a first mobile computing device. Likewise, a second utterance is detected being spoken into a second mobile computing device within a predetermined time period. The second utterance substantially matches the first spoken utterance and the first and second mobile computing devices are communicatively coupled to each other. The processor enables capturing, at least temporarily, a matching utterance for performing a subsequent processing function. The performed subsequent processing function is based on a type of captured utterance.
    Type: Grant
    Filed: April 20, 2012
    Date of Patent: July 28, 2015
    Assignee: Google Technology Holdings LLC
    Inventors: Rachid M Alameh, Jiri Slaby, Hisashi D Watanabe
  • Patent number: 9076451
    Abstract: An operation determination processing section of a center extracts words included in the utterance of a driver and an operator, reads an attribute associated with each word from a synonym and related word DB in which an attribute is stored so as to be associated with each word, reads a domain of a candidate or the like for the task associated with the attribute from the synonym and related word DB in which domains of a candidate for a task associated with the read attribute or domains of a task to be actually performed are stored, totals the domains read for each word for words included in the utterance of the driver or the like, and estimates those related to a domain with a highest total score as the candidate for the task and the task to be actually performed. In this manner, it is possible to estimate the task with high accuracy.
    Type: Grant
    Filed: April 12, 2011
    Date of Patent: July 7, 2015
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Toshiyuki Namba, Hiroaki Sekiyama, Tatsuyuki Oka, Minako Fujishiro, Katsutoshi Okawa, Emi Otani, Atsushi Sasaki, Yasuhiko Fujita
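    Illustrative sketch: the estimation above works like a vote, where each word maps to attributes, each attribute maps to candidate task domains, and the domain with the highest total score becomes the task candidate. The word/attribute/domain tables below are invented.
    ```python
    from collections import Counter

    # Hypothetical synonym-and-related-word table: word -> attributes.
    WORD_ATTRIBUTES = {
        "hungry": ["meal"], "lunch": ["meal"], "nearby": ["location"],
        "gas": ["fuel"], "station": ["fuel", "location"],
    }
    # Hypothetical attribute -> candidate task domains.
    ATTRIBUTE_DOMAINS = {
        "meal": ["restaurant_search"],
        "location": ["navigation", "restaurant_search"],
        "fuel": ["gas_station_search", "navigation"],
    }

    def estimate_task(utterance):
        totals = Counter()
        for raw in utterance.lower().split():
            word = raw.strip(".,?!")
            for attribute in WORD_ATTRIBUTES.get(word, []):
                totals.update(ATTRIBUTE_DOMAINS.get(attribute, []))
        return totals.most_common()  # highest-scoring domain is the task candidate

    print(estimate_task("I am hungry, is there lunch nearby"))
    ```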
  • Patent number: 9069755
    Abstract: Described is a technology by which a probability is estimated for a token in a sequence of tokens based upon a number of zero or more times (actual counts) that the sequence was observed in training data. The token may be a word in a word sequence, and the estimated probability may be used in a statistical language model. A discount parameter is set independently of interpolation parameters. If the sequence was observed at least once in the training data, a discount probability and an interpolation probability are computed and summed to provide the estimated probability. If the sequence was not observed, the probability is estimated by computing a backoff probability. Also described are various ways to obtain the discount parameter and interpolation parameters.
    Type: Grant
    Filed: March 11, 2010
    Date of Patent: June 30, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Robert Carter Moore
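    Illustrative sketch: the estimate described above has the familiar discounted/interpolated form for observed sequences and a backoff form for unobserved ones, with the discount set independently of the interpolation weights. The bigram counts, discount value, and weighting below are invented simplifications.
    ```python
    from collections import Counter

    corpus = "the cat sat on the mat the cat ran".split()
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    total = len(corpus)
    D = 0.75  # discount parameter, set independently of the interpolation weights

    def unigram_prob(w):
        return unigrams[w] / total

    def bigram_prob(history, w):
        h_count = unigrams[history]
        followers = len([b for b in bigrams if b[0] == history])
        weight = D * followers / h_count   # interpolation / backoff mass
        c = bigrams[(history, w)]
        if c > 0:
            # Observed: discounted probability plus an interpolation probability.
            return (c - D) / h_count + weight * unigram_prob(w)
        # Unobserved: estimate by a backoff probability.
        return weight * unigram_prob(w)

    print(round(bigram_prob("the", "cat"), 3))  # observed bigram
    print(round(bigram_prob("the", "ran"), 3))  # unseen bigram -> backoff
    ```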
  • Patent number: 9064495
    Abstract: In some embodiments, a user device receives a voice signal corresponding to a user utterance. The user device may set a time marker corresponding to a point in time in the voice signal. The voice signal and the time marker may be transmitted to a server device. The server device may perform speech recognition using the voice signal. The server device may determine a time offset corresponding to a difference in time between an end point of the user utterance and a time associated with the time marker. The server device may determine a response to the user utterance. The server device may transmit the time offset and the response to the user device. The user device may use the time offset to determine a user-perceived latency between the end of the user utterance and a beginning of the response.
    Type: Grant
    Filed: May 7, 2013
    Date of Patent: June 23, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: Fred Torok, Peter Spalding VanLund
  • Patent number: 9047868
    Abstract: A specific language model for speech recognition may be built. In some embodiments, the specific language model is associated with a user and built using a corpus of text obtained from a user computing device. In some embodiments, a sequence of words is constructed from the corpus of text. The sequence of words may be obfuscated, and the obfuscated sequence of words may be stored in the specific language model. A server or a user device may use the specific language model in conjunction with a general language model to perform speech recognition on an utterance made by the user.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: June 2, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: Jeffrey C. O'Neill, Ryan P. Thomas
  • Patent number: 9043208
    Abstract: A speech recognition system, method of recognizing speech and a computer program product therefor. A client device identified with a context for an associated user selectively streams audio to a provider computer, e.g., a cloud computer. Speech recognition receives streaming audio, maps utterances to specific textual candidates and determines a likelihood of a correct match for each mapped textual candidate. A context model selectively winnows candidates to resolve recognition ambiguity according to context whenever multiple textual candidates are recognized as potential matches for the same mapped utterance. Matches are used to update the context model, which may be used for multiple users in the same context.
    Type: Grant
    Filed: August 10, 2012
    Date of Patent: May 26, 2015
    Assignee: International Business Machines Corporation
    Inventors: Fernando Luiz Koch, Julio Nogima
  • Patent number: 9043213
    Abstract: A speech recognition method including the steps of receiving a speech input from a known speaker of a sequence of observations and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model. The acoustic model has a plurality of model parameters describing probability distributions which relate a word or part thereof to an observation and has been trained using first training data and adapted using second training data to said speaker. The speech recognition method also determines the likelihood of a sequence of observations occurring in a given language using a language model and combines the likelihoods determined by the acoustic model and the language model and outputs a sequence of words identified from said speech input signal. The acoustic model is context based for the speaker, the context based information being contained in the model using a plurality of decision trees and the structure of the decision trees is based on second training data.
    Type: Grant
    Filed: January 26, 2011
    Date of Patent: May 26, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Byung Ha Chun
  • Patent number: 9037464
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: May 19, 2015
    Assignee: Google Inc.
    Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
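    Illustrative sketch: the pipeline above trains an embedding function and a classifier jointly on word sequences, then keeps the learned vector for each vocabulary word as its numeric representation. The toy version below uses a lookup-table embedding and a softmax classifier predicting the next word; the corpus, dimensions, and training loop are simplifications, not the patented specifics.
    ```python
    import numpy as np

    sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    V, dim = len(vocab), 16

    rng = np.random.default_rng(0)
    embed = rng.standard_normal((V, dim)) * 0.1   # embedding function parameters
    clf = rng.standard_normal((dim, V)) * 0.1     # classifier parameters

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    # Train the embedding function and classifier to predict the following word.
    lr = 0.1
    for _ in range(200):
        for s in sentences:
            for cur, nxt in zip(s, s[1:]):
                i, j = index[cur], index[nxt]
                p = softmax(embed[i] @ clf)
                grad_logits = p.copy()
                grad_logits[j] -= 1.0             # d(cross-entropy)/d(logits)
                grad_embed = clf @ grad_logits
                clf -= lr * np.outer(embed[i], grad_logits)
                embed[i] -= lr * grad_embed

    # Associate each vocabulary word with its learned numeric representation.
    representations = {w: embed[index[w]] for w in vocab}
    print({w: np.round(v[:3], 2) for w, v in representations.items()})
    ```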
  • Patent number: 9037465
    Abstract: A method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.
    Type: Grant
    Filed: February 21, 2013
    Date of Patent: May 19, 2015
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: I. Dan Melamed, Andrej Ljolje, Bernard Renger, Yeon-Jun Kim, David J. Smith
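    Illustrative sketch: the check above searches a communication for pre-determined phrases tied to an event or a precursor event and rates the recipient on their presence. The phrase lists and rating rule below are invented.
    ```python
    # Hypothetical pre-determined phrases for a precursor event and for events.
    PRECURSOR_PHRASES = {"may i have your account number"}
    EVENT_PHRASES = {"this call may be recorded", "is there anything else"}

    def rate_recipient(communication):
        text = communication.lower()
        precursor_hit = any(p in text for p in PRECURSOR_PHRASES)
        event_hits = [p for p in EVENT_PHRASES if p in text]
        # Rate the recipient on the presence of the required phrases.
        score = len(event_hits) + (1 if precursor_hit else 0)
        return {"precursor_detected": precursor_hit,
                "events_detected": event_hits,
                "compliance_rating": score / (len(EVENT_PHRASES) + 1)}

    call = ("Hello, this call may be recorded for quality. "
            "May I have your account number? ... Is there anything else I can do?")
    print(rate_recipient(call))
    ```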
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9026429
    Abstract: In one embodiment, a method includes receiving one or more characters of a character string as a user enters the character string into a graphical user interface (GUI) of a computing device. The method also includes determining a degree of difficulty of the user entering the character string into the GUI of the computing device. The method further includes, if the degree of difficulty is at least approximately equal to or exceeds a pre-determined threshold, providing for display to the user an auto-suggestion for completing the character string for the user.
    Type: Grant
    Filed: December 5, 2012
    Date of Patent: May 5, 2015
    Assignee: Facebook, Inc.
    Inventors: Jenny Yuen, Luke St. Clair
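    Illustrative sketch: the gating above scores how difficult the partial entry appears to be and offers an auto-suggestion only when that score meets a threshold. The difficulty heuristic, threshold, and completion table below are placeholders.
    ```python
    COMPLETIONS = {"restaur": "restaurant", "appoin": "appointment"}

    def entry_difficulty(partial, seconds_typing, backspaces):
        """Hypothetical difficulty score for entering a character string."""
        chars_per_sec = len(partial) / max(seconds_typing, 0.1)
        return backspaces * 0.5 + (1.0 / max(chars_per_sec, 0.1))

    def maybe_suggest(partial, seconds_typing, backspaces, threshold=1.0):
        if entry_difficulty(partial, seconds_typing, backspaces) < threshold:
            return None  # typing is easy enough; do not interrupt
        for prefix, full in COMPLETIONS.items():
            if prefix.startswith(partial) or partial.startswith(prefix):
                return full
        return None

    print(maybe_suggest("restau", seconds_typing=9.0, backspaces=3))  # -> "restaurant"
    print(maybe_suggest("restau", seconds_typing=1.0, backspaces=0))  # -> None
    ```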
  • Patent number: 9020818
    Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Patent number: 9015044
    Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 21, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Patent number: 9015047
    Abstract: A method, a computer readable medium and a system for reporting automatic speech recognition that comprises collecting an utterance, analyzing the utterance, receiving a translation of the utterance, and determining a difference between the analyzed utterance and the translated utterance. An embodiment of the disclosure includes updating the utterance analysis based upon the determined difference, correlating the analyzed utterance to the translated utterance and tracking the determined difference by a translator. In another embodiment the disclosure includes reporting, categorizing, sorting, and grouping the determined difference.
    Type: Grant
    Filed: February 12, 2007
    Date of Patent: April 21, 2015
    Assignee: West Corporation
    Inventors: Aaron Scott Fisher, Prashanta Pradhan