Specialized Models Patents (Class 704/255)
-
Patent number: 9842592
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using non-linguistic context. In some implementations, context data indicating non-linguistic context for the utterance is received. Based on the context data, feature scores for one or more non-linguistic features are generated. The feature scores for the non-linguistic features are provided to a language model trained to process scores for non-linguistic features. The output from the language model is received, and a transcription for the utterance is determined using the output of the language model.
Type: Grant
Filed: February 12, 2014
Date of Patent: December 12, 2017
Assignee: Google Inc.
Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar
-
Patent number: 9842593
Abstract: Predetermined services are provided using preset instructions. A transcript of audible content provided over an electronic network and received at a communications device is analyzed to determine whether a trigger is present in the audible content. When the trigger is present in the audible content, preset instructions correlated with the trigger and instructing how to provide a predetermined service are identified. The predetermined service is provided by following the preset instructions.
Type: Grant
Filed: November 14, 2014
Date of Patent: December 12, 2017
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Shiv Kumar, Paritosh Bajpay, Anand K. Singh, Satyendra Tripathi
-
Patent number: 9836452
Abstract: Methods and systems are provided for discriminating ambiguous expressions to enhance user experience. For example, a natural language expression may be received by a speech recognition component. The natural language expression may include at least one of words, terms, and phrases of text. A dialog hypothesis set from the natural language expression may be created by using contextual information. In some cases, the dialog hypothesis set has at least two dialog hypotheses. A plurality of dialog responses may be generated for the dialog hypothesis set. The dialog hypothesis set may be ranked based on an analysis of the plurality of the dialog responses. An action may be performed based on ranking the dialog hypothesis set.
Type: Grant
Filed: December 30, 2014
Date of Patent: December 5, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jean-Philippe Robichaud, Ruhi Sarikaya
-
Patent number: 9830909
Abstract: A speech recognition method and system enables user-configurable speech commands. For a given speech command, the speech recognition engine provides a mechanism for the end-user to select speech command terms to use in substitution for the given speech command. The speech recognition engine, or module thereof, forms a replacement command for the given speech command from the user-selected speech command terms. The speech recognition engine thereafter is synonymously responsive to user utterance of the replacement command as though the user issued/voiced the associated counterpart given speech command.
Type: Grant
Filed: March 31, 2017
Date of Patent: November 28, 2017
Assignee: Kopin Corporation
Inventor: Christopher Parkinson
-
Patent number: 9817881
Abstract: A method, apparatus, and tangible computer readable medium for processing a Hidden Markov Model (HMM) structure are disclosed herein. For example, the method includes receiving Hidden Markov Model (HMM) information from an external system. The method also includes processing back pointer data and first HMM state scores for one or more NULL states in the HMM information. Second HMM state scores are processed for one or more non-NULL states in the HMM information based on at least one predecessor state. Further, the method includes transferring the second HMM state scores to the external system.
Type: Grant
Filed: October 16, 2013
Date of Patent: November 14, 2017
Assignee: Cypress Semiconductor Corporation
Inventors: Ojas A. Bapat, Richard M. Fastow, Jens Olson, Kenichi Kumatani
-
Patent number: 9805371
Abstract: A user may respond to a request of another user by entering text, such as a customer service representative responding to a customer. Suggestions of responses may be provided to the responding user so that the responding user may use a suggested response instead of entering text to provide a response. Previous messages between the two users and other information may be used to determine an appropriate suggested response. A conversation feature vector may be determined from previous messages, and response feature vectors may be determined from the conversation feature vector. The response feature vectors may be used to determine an appropriate suggested response.
Type: Grant
Filed: September 1, 2016
Date of Patent: October 31, 2017
Assignee: ASAPP, INC.
Inventors: Gustavo Sapoznik, Shawn Henry
-
Patent number: 9792907
Abstract: Techniques related to key phrase detection for applications such as wake on voice are discussed. Such techniques may include updating a start state based rejection model and a key phrase model based on scores of sub-phonetic units from an acoustic model to generate a rejection likelihood score and a key phrase likelihood score and determining whether received audio input is associated with a predetermined key phrase based on the rejection likelihood score and the key phrase likelihood score.
Type: Grant
Filed: November 24, 2015
Date of Patent: October 17, 2017
Assignee: Intel IP Corporation
Inventors: Tobias Bocklet, Joachim Hofer
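The abstract above describes accumulating a rejection likelihood score and a key phrase likelihood score and comparing them. A minimal illustrative sketch of that decision rule follows; the function name `detect_keyphrase`, the per-frame score pairs, and the threshold value are all assumptions for illustration, not the patented method itself.

```python
def detect_keyphrase(frame_log_scores, threshold=2.0):
    """Toy wake-word decision: accumulate a rejection score (log score of
    the best competing sub-phonetic unit per frame) and a key phrase score
    (log score of the expected unit per frame), then accept the key phrase
    when its accumulated score beats rejection by `threshold`."""
    rejection, keyphrase = 0.0, 0.0
    for rej_score, key_score in frame_log_scores:
        rejection += rej_score
        keyphrase += key_score
    return (keyphrase - rejection) >= threshold
```

In a real wake-on-voice system the two scores would come from Viterbi updates over the rejection and key phrase models rather than a simple sum.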
-
Patent number: 9753918
Abstract: A speech translation system and methods for cross-lingual communication that enable users to easily improve and customize content and usage of the system. The methods include, in response to receiving an utterance including a first term associated with a field, translating the utterance into a second language. In response to receiving an indication to add the first term associated with the field to a first recognition lexicon, adding the first term associated with the field and the determined translation to a first machine translation module and to a shared database for a community associated with the field of the first term associated with the field, wherein the first term associated with the field added to the shared database is accessible by the community.
Type: Grant
Filed: January 5, 2015
Date of Patent: September 5, 2017
Assignee: Facebook, Inc.
Inventors: Alexander Waibel, Ian R. Lane
-
Patent number: 9747893
Abstract: A computer-based, unsupervised training method for an N-gram language model includes reading, by a computer, recognition results obtained as a result of speech recognition of speech data; acquiring, by the computer, a reliability for each of the read recognition results; referring, by the computer, to the recognition result and the acquired reliability to select an N-gram entry; and training, by the computer, the N-gram language model on the selected one or more N-gram entries using all recognition results.
Type: Grant
Filed: October 6, 2016
Date of Patent: August 29, 2017
Assignee: International Business Machines Corporation
Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
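The selection step described above, picking N-gram entries only from recognition hypotheses whose reliability is high, can be sketched as follows. This is an illustrative simplification: the tuple format `(text, confidence)`, the `min_confidence` cutoff, and counting with `Counter` are assumptions, and the patent's final training pass over all recognition results is not shown.

```python
from collections import Counter

def select_ngram_entries(recognition_results, min_confidence=0.8, n=2):
    """Count n-gram entries only from hypotheses whose confidence meets
    the cutoff, mimicking reliability-based entry selection.

    recognition_results: iterable of (hypothesis_text, confidence) pairs.
    Returns a Counter keyed by n-gram tuples, with <s>/</s> padding.
    """
    counts = Counter()
    for text, confidence in recognition_results:
        if confidence < min_confidence:
            continue  # unreliable hypothesis: contributes no entries
        tokens = ["<s>"] * (n - 1) + text.split() + ["</s>"]
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts
```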
-
Patent number: 9721561
Abstract: In a speech recognition system, deep neural networks (DNNs) are employed in phoneme recognition. While DNNs typically provide better phoneme recognition performance than other techniques, such as Gaussian mixture models (GMM), adapting a DNN to a particular speaker is a real challenge. According to at least one example embodiment, speech data and corresponding speaker data are both applied as input to a DNN. In response, the DNN generates a prediction of a phoneme based on the input speech data and the corresponding speaker data. The speaker data may be generated from the corresponding speech data.
Type: Grant
Filed: December 5, 2013
Date of Patent: August 1, 2017
Assignee: Nuance Communications, Inc.
Inventors: Yun Tang, Venkatesh Nagesha, Xing Fan
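One common way to feed both speech data and speaker data to a DNN, as the abstract describes, is to append a fixed per-speaker vector to every acoustic frame. The sketch below shows only that input construction; the function name and dimensions are assumptions, and the actual patent may combine the inputs differently.

```python
import numpy as np

def speaker_aware_input(acoustic_frames, speaker_vector):
    """Append a per-speaker embedding (e.g., an i-vector-like vector
    derived from the same speech) to every acoustic frame, producing
    the augmented input matrix fed to the phoneme-classifying DNN.

    acoustic_frames: (T, D) array of frame features.
    speaker_vector:  (S,) array, constant across the utterance.
    Returns a (T, D + S) array.
    """
    num_frames = acoustic_frames.shape[0]
    tiled = np.tile(speaker_vector, (num_frames, 1))  # repeat for each frame
    return np.concatenate([acoustic_frames, tiled], axis=1)
```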
-
Patent number: 9699298
Abstract: An electronic communication device and method for communication including automatically selecting, composing, and/or presenting a number of proposed responses to a message is disclosed. The device can present the incoming message content to the user together with an alert to the user that one or more proposed responses are available for review and selection. The device can calculate a confidence metric for each proposed response. The user alert and/or presentation of the messages can be made to vary in intensity in accordance with the confidence metric of the one or more proposed responses. The presentation of the proposed responses can be alternated and/or combined with a method for the device to receive user input to the content of the proposed response. The user can edit a proposed response rather than composing a complete response.
Type: Grant
Filed: August 10, 2015
Date of Patent: July 4, 2017
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Mark Rubinstein, Jason Nelson
-
Patent number: 9684650
Abstract: A penalized loss is optimized using a corpus of language samples respective to a set of parameters of a language model. The penalized loss includes a function measuring predictive accuracy of the language model respective to the corpus of language samples and a penalty comprising a tree-structured norm. The trained language model with optimized values for the parameters generated by the optimizing is applied to predict a symbol following a sequence of symbols of the language modeled by the language model. In some embodiments the penalty comprises a tree-structured ℓp-norm, such as a tree-structured ℓ2-norm or a tree-structured ℓ∞-norm. In some embodiments a tree-structured ℓ∞-norm operates on a collapsed suffix trie in which any series of suffixes of increasing lengths which are always observed in the same context are collapsed into a single node. The optimizing may be performed using a proximal step algorithm.
Type: Grant
Filed: September 10, 2014
Date of Patent: June 20, 2017
Assignee: XEROX CORPORATION
Inventors: Anil Kumar Nelakanti, Guillaume M. Bouchard, Cedric Archambeau, Francis Bach, Julien Mairal
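The penalized loss described above has a standard general form, sketched here in notation of my own choosing (the loss L, regularization weight λ, suffix-trie node groups g, and group subvectors θ_g are assumed labels, not the patent's notation):

```latex
\hat{\theta}
  = \arg\min_{\theta}\;
    \underbrace{-\sum_{i} \log p_{\theta}(w_i \mid h_i)}_{\text{predictive loss on the corpus}}
    \;+\;
    \lambda\,\Omega_{\mathrm{tree}}(\theta),
\qquad
\Omega_{\mathrm{tree}}(\theta) = \sum_{g \in \mathcal{T}} \bigl\lVert \theta_g \bigr\rVert_p
```

Here each group g is a node of the (possibly collapsed) suffix trie T, and taking p = 2 or p = ∞ gives the tree-structured ℓ2- and ℓ∞-norms mentioned in the abstract; a proximal step algorithm alternates gradient steps on the loss with the proximal operator of the penalty.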
-
Patent number: 9640178
Abstract: A speech recognition method and system enables user-configurable speech commands. For a given speech command, the speech recognition engine provides a mechanism for the end-user to select speech command terms to use in substitution for the given speech command. The speech recognition engine, or module thereof, forms a replacement command for the given speech command from the user-selected speech command terms. The speech recognition engine thereafter is synonymously responsive to user utterance of the replacement command as though the user issued/voiced the associated counterpart given speech command.
Type: Grant
Filed: December 17, 2014
Date of Patent: May 2, 2017
Assignee: Kopin Corporation
Inventor: Christopher Parkinson
-
Patent number: 9626354
Abstract: An apparatus for using a tone indicator to determine a language in text recognition is disclosed. The apparatus includes a text input module, a tone indicator module, and a language identification module. A method and computer program product also perform the functions of the apparatus.
Type: Grant
Filed: January 21, 2014
Date of Patent: April 18, 2017
Assignee: Lenovo (Singapore) PTE. LTD.
Inventor: Xin Feng
-
Patent number: 9620111
Abstract: Described herein are systems and methods for the generation and maintenance of language models. Language models are developed based at least in part on transaction data from one or more users. These transactions may include purchases and other interactions between one or more users and one or more online merchants. The language models may be associated with a particular user or group of users. Ongoing transactions may modify the language models. The language models may be used to recognize spoken input from the one or more users.
Type: Grant
Filed: May 1, 2012
Date of Patent: April 11, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Matthew P. Bell, Stephen Polansky, Yuzo Watanabe
-
Patent number: 9594744
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transcribing utterances into written text are disclosed. The methods, systems, and apparatus include actions of obtaining a lexicon model that maps phones to spoken text and obtaining a language model that assigns probabilities to written text. The actions further include generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of the written text to an item of the spoken text. Additionally, the actions include constructing a decoding network for transcribing utterances into written text, by composing the lexicon model, the inverse of the transducer, and the language model.
Type: Grant
Filed: March 14, 2013
Date of Patent: March 14, 2017
Assignee: Google Inc.
Inventors: Hasim Sak, Francoise Beaufays
-
Patent number: 9564125
Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
Type: Grant
Filed: October 22, 2013
Date of Patent: February 7, 2017
Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III, Ute Winter
-
Patent number: 9542604
Abstract: A method and apparatus of providing a combined summary by receiving monitored audio and video are provided. The method includes: receiving audio and video captured by at least one network camera; generating a video summary by detecting at least one video event from at least one of the audio and the video; generating an audio summary by detecting at least one audio event from at least one of the audio and the video; extracting at least one section of the video summary corresponding to the at least one audio event, and storing the extracted at least one section of the video summary with the audio summary; and providing a display of the video reproducing apparatus with a video summary control interface for controlling the video summary and an audio summary control interface for controlling the audio summary.
Type: Grant
Filed: May 13, 2015
Date of Patent: January 10, 2017
Assignee: Hanwha Techwin Co., Ltd.
Inventor: Sungbong Cho
-
Patent number: 9530402
Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
Type: Grant
Filed: October 22, 2013
Date of Patent: December 27, 2016
Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III, Ute Winter
-
Patent number: 9484025
Abstract: The disclosure includes a system and method for configuring custom vocabularies for personalized speech recognition. The system includes a processor and a memory storing instructions that when executed cause the system to: detect a provisioning trigger event; determine a state of a journey associated with a user based on the provisioning trigger event; determine one or more interest places based on the state of the journey; populate a place vocabulary associated with the user using the one or more interest places; filter the place vocabulary based on one or more place filtering parameters; and register the filtered place vocabulary for the user.
Type: Grant
Filed: October 15, 2013
Date of Patent: November 1, 2016
Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: Divya Sai Toopran, Vinuth Rai, Rahul Parundekar
-
Patent number: 9471064
Abstract: A method for controlling a drone includes receiving a natural language request for information about a spatial location, parsing the natural language request into data requests, configuring a flight plan and controlling one or more drones to fly over the spatial location to obtain data types based on the data requests, and extracting and analyzing data to answer the request. The method can include extracting data points from the data types, obtaining labels from a user for one or more of the data points, predicting labels for unlabeled data points from a learning algorithm using the labels obtained from the user, determining the predicted labels are true labels for the unlabeled data points and combining the extracted data, the user labeled data points and the true labeled data points to answer the request for information. The learning algorithm may be active learning using a support vector machine.
Type: Grant
Filed: December 8, 2015
Date of Patent: October 18, 2016
Assignee: International Business Machines Corporation
Inventors: Gregory F. Boland, James R. Kozloski, Yu Ma, Justin G. Manweiler, Kevin E. Siemonsen, Umut Topkara, Katherine Vogt, Justin D. Weisz
-
Patent number: 9449599
Abstract: Various embodiments contemplate systems and methods for performing automatic speech recognition (ASR) and natural language understanding (NLU) that enable high accuracy recognition and understanding of freely spoken utterances which may contain proper names and similar entities. The proper name entities may contain or be comprised wholly of words that are not present in the vocabularies of these systems as normally constituted. Recognition of the other words in the utterances in question, e.g., words that are not part of the proper name entities, may occur at regular, high recognition accuracy. Various embodiments provide as output not only accurately transcribed running text of the complete utterance, but also a symbolic representation of the meaning of the input, including appropriate symbolic representations of proper name entities, adequate to allow a computer system to respond appropriately to the spoken request without further analysis of the user's input.
Type: Grant
Filed: May 30, 2014
Date of Patent: September 20, 2016
Assignee: PROMPTU SYSTEMS CORPORATION
Inventor: Harry William Printz
-
Patent number: 9406298
Abstract: Most speaker recognition systems use i-vectors, which are compact representations of speaker voice characteristics. Typical i-vector extraction procedures are complex in terms of computations and memory usage. According to an embodiment, a method and corresponding apparatus for speaker identification comprise determining a representation for each component of a variability operator, representing statistical inter- and intra-speaker variability of voice features with respect to a background statistical model, in terms of a linear operator common to all components of the variability operator and having a first dimension larger than a second dimension of the components of the variability operator; computing statistical voice characteristics of a particular speaker using the determined representations; and employing the statistical voice characteristics of the particular speaker in performing speaker recognition.
Type: Grant
Filed: April 4, 2013
Date of Patent: August 2, 2016
Assignee: Nuance Communications, Inc.
Inventors: Sandro Cumani, Pietro Laface
-
Patent number: 9367526
Abstract: A language processing application employs a classing function optimized for the underlying production application context for which it is expected to process speech. A combination of class based and word based features generates a classing function optimized for a particular production application, meaning that a language model employing the classing function uses word classes having a high likelihood of accurately predicting word sequences encountered by a language model invoked by the production application. The classing function optimizes word classes by aligning the objective of word classing with the underlying language processing task to be performed by the production application. The classing function is optimized to correspond to usage in the production application context using class-based and word-based features by computing a likelihood of a word in an n-gram and a frequency of a word within a class of the n-gram.
Type: Grant
Filed: July 26, 2011
Date of Patent: June 14, 2016
Assignee: Nuance Communications, Inc.
Inventors: Paul Vozila, Maximilian Bisani, Yi Su, Stephen M. Chu, Stanley F. Chen, Ruhi Sarikaya, Bhuvana Ramabhadran
-
Patent number: 9367612
Abstract: A system identifies a set of initial segments of a time-based data item, such as audio. The segments can be defined at regular time intervals within the time-based data item. The initial segments are short segments. The system computes a short-timescale vectorial representation for each initial segment and compares the short-timescale vectorial representation for each initial segment with other short-timescale vectorial representations of the segments in a time duration within the time-based data item (e.g., audio) immediately preceding or immediately following the initial segment. The system generates a representation of long-timescale information for the time-based data item based on a comparison of the short-timescale vectorial representations of the initial segments and the short-timescale vectorial representations of immediate segments. The representation of long-timescale information identifies an underlying repetition structure of the time-based data item, such as rhythm or phrasing in an audio item.
Type: Grant
Filed: November 18, 2011
Date of Patent: June 14, 2016
Assignee: GOOGLE INC.
Inventors: Douglas Eck, Jay Yagnik
-
Patent number: 9288327
Abstract: An apparatus and method for routing an incoming call includes the apparatus configured to perform a routing decision resulting in the incoming call being routed to the voicemail of a called user. The apparatus further creates a transcription of a voicemail message being recorded on the voicemail of the called user. This is done by applying a speech analysis process to the voicemail message as the voicemail message is being recorded. Information about a context of the incoming call is determined from the transcription and a second routing decision is performed based on the information.
Type: Grant
Filed: May 14, 2014
Date of Patent: March 15, 2016
Assignee: Mitel Networks Corporation
Inventors: Trung Tim Trinh, Tom Gray
-
Patent number: 9275039
Abstract: According to an embodiment, an information processing device includes a first set generating unit that generates a first set of states reachable by following ε-transitions in a forward direction and a reverse direction among states included in a finite state automaton; a second set generating unit that generates a second set representing a set of states included in each of a plurality of first sets connected by one transition that is not an ε-transition and the first set that is not connected to another first set by one transition that is not an ε-transition; a combination generating unit that obtains, among combinations of ε-removal patterns to be respectively applied to the ε-transitions in the second set, a combination with which the number of transitions after removing ε-transitions is the smallest; and a removing unit that removes ε-transitions in the second set by the removal patterns included in the obtained combination.
Type: Grant
Filed: September 14, 2012
Date of Patent: March 1, 2016
Assignee: Kabushiki Kaisha Toshiba
Inventor: Manabu Nagao
-
Patent number: 9275128
Abstract: Generating a document index comprises: obtaining a document to be indexed; determining whether each monadic partition obtained from the document is a filter character and if so, forming a polynary partition with the monadic partition and at least one adjacent monadic partition and indexing the polynary partition, otherwise, indexing the monadic partition. Querying data comprises: receiving a data query; determining whether each monadic partition obtained from the data query is a filter character and if so, forming a polynary partition with the monadic partition and at least one adjacent monadic partition and using the polynary partition to obtain search results, otherwise, using the monadic partition to obtain search results; and combining search results to form a final query search result.
Type: Grant
Filed: July 20, 2010
Date of Patent: March 1, 2016
Assignee: Alibaba Group Holding Limited
Inventors: Lei Wei, Jiaxiang Shen
-
Patent number: 9256457
Abstract: A system for providing an interactive response system for hosted services may include a processor and a memory. The processor may facilitate steps of receiving data streams from participant devices, such as devices participating in a multimedia conference, and transmitting, in response to receiving the data streams from the participant devices, the data streams to the other of the participant devices. The steps may further include receiving, from one of the participant devices, a request to initiate communication with an invitee device, processing the data streams to determine a language associated with the data streams, and providing, to the invitee device, a communication in the determined language. The communication provided to the invitee device may be an initial communication of an interactive response system, such as an interactive voice response system or an interactive messaging response system.
Type: Grant
Filed: March 28, 2012
Date of Patent: February 9, 2016
Assignee: Google Inc.
Inventors: Matthew John Leske, Jarkko Oikarinen, Tomas Lundqvist
-
Patent number: 9224384
Abstract: Embodiments of the present invention include an acoustic processing device, a method for acoustic signal processing, and a speech recognition system. The speech processing device can include a processing unit, a histogram pruning unit, and a pre-pruning unit. The processing unit is configured to calculate one or more Hidden Markov Model (HMM) pruning thresholds. The histogram pruning unit is configured to prune one or more HMM states to generate one or more active HMM states. The pruning is based on the one or more pruning thresholds. The pre-pruning unit is configured to prune the one or more active HMM states based on an adjustable pre-pruning threshold. Further, the adjustable pre-pruning threshold is based on the one or more pruning thresholds.
Type: Grant
Filed: December 21, 2012
Date of Patent: December 29, 2015
Assignee: Cypress Semiconductor Corporation
Inventor: Ojas Ashok Bapat
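The two-stage pruning of HMM states described above can be illustrated with a generic beam-plus-count scheme. This is a textbook-style sketch, not the patented circuit: the function name, the `beam` width, and the `max_active` cap are assumed parameters standing in for the patent's computed thresholds.

```python
def prune_states(state_scores, beam=10.0, max_active=3):
    """Generic two-stage HMM state pruning.

    Stage 1 (threshold pruning): keep states whose log score is within
    `beam` of the best state, yielding the active set.
    Stage 2 (count cap, histogram-style): if too many states survive,
    keep only the `max_active` best-scoring ones.
    """
    best = max(state_scores.values())
    threshold = best - beam
    active = {s: v for s, v in state_scores.items() if v >= threshold}
    if len(active) > max_active:
        keep = sorted(active, key=active.get, reverse=True)[:max_active]
        active = {s: active[s] for s in keep}
    return active
```

A real histogram pruning unit would pick the count threshold from a score histogram rather than a full sort, which is cheaper in hardware.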
-
Patent number: 9225879
Abstract: A method for video sequential alignment is provided. The method includes inputting a first video sequence and a second video sequence, and extracting features of the first video sequence and the second video sequence. The method also includes generating an expanded path using a hierarchical sequential alignment algorithm and generating masks containing search space using keyframe matching. Further, the method includes calculating the lowest path cost within search space to find a minimal alignment path using a block A* algorithm and outputting a final alignment result after applying the block A* algorithm to search space.
Type: Grant
Filed: December 27, 2013
Date of Patent: December 29, 2015
Assignee: TCL RESEARCH AMERICA INC.
Inventors: Armin Kappeler, Haiguang Li, Michail Iliadis, Haohong Wang
-
Patent number: 9202459
Abstract: Methods and systems are provided for managing speech dialog of a speech system. In one embodiment, a method includes: receiving a first utterance from a user of the speech system; determining a first list of possible results from the first utterance, wherein the first list includes at least two elements that each represent a possible result; analyzing the at least two elements of the first list to determine an ambiguity of the elements; and generating a speech prompt to the user based on partial orthography and the ambiguity.
Type: Grant
Filed: April 19, 2013
Date of Patent: December 1, 2015
Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Eli Tzirkel-Hancock, Gaurav Talwar, Xufang Zhao, Greg T. Lindemann
-
Patent number: 9190058
Abstract: Embodiments related to recognizing speech inputs are disclosed. One disclosed embodiment provides a method for recognizing a speech input including receiving depth information of a physical space from a depth camera, determining an identity of a user in the physical space based on the depth information, receiving audio information from one or more microphones, and determining a speech input from the audio input. If the speech input comprises an ambiguous term, the ambiguous term in the speech input is compared to one or more of depth image data received from the depth image sensor and digital content consumption information for the user to identify an unambiguous term corresponding to the ambiguous term. After identifying the unambiguous term, an action is taken on the computing device based on the speech input and the unambiguous term.
Type: Grant
Filed: January 25, 2013
Date of Patent: November 17, 2015
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: Christian Klein
-
Patent number: 9110888
Abstract: A service server apparatus is provided which can realize a service as if a speaker has a conversation in a face-to-face manner using existing applications. The service server apparatus includes a service activating unit receiving an instruction for performing a task during a communication, a telephone/call control unit recording a call speech of the speaker during the communication between speech communication terminal devices, a speech recognizing enabler, a text translating enabler, and a speech synthesizing enabler performing the task based on the recorded call speech when the instruction is received, and a communication control enabler providing task data obtained through a performance of the task to the speech communication terminal device. The task data is text data obtained by converting the call speech into a text, translated text data obtained by translating the text data into another language, and speech data obtained by converting the translated text data into a speech.
Type: Grant
Filed: May 24, 2012
Date of Patent: August 18, 2015
Assignee: NTT DOCOMO, Inc.
Inventors: Kazunori Nasu, Seiji Shibata, Mari Akuzawa, Ken Uchiyama
-
Patent number: 9104768
Abstract: Techniques are described for aggregating person-based information on a computing device. The computing device includes a display device and one or more processors. The processors are configured to determine a context of the computing device, and identify a contact associated with the currently-executing application and a plurality of contact identifiers associated with the contact. The processors are configured to receive, from two or more different sources and based on at least one contact, information associated with the contact that includes a respective portion of information received from each of the different sources, determine, based on the context of the computing device, a respective priority value for each of the portions of the information, select, based on the respective priority value, one or more portions of the information, and output, for display, a graphical user interface including at least the selected one or more portions of the information.
Type: Grant
Filed: April 2, 2014
Date of Patent: August 11, 2015
Assignee: Google Inc.
Inventors: Marco Paglia, Alexander Faaborg
-
Patent number: 9093076
Abstract: A multipass processing system includes a first grammar-based speech recognition system that compares a spoken utterance to a sub-grammar. The sub-grammar includes keywords or key phrases from active grammars that each uniquely identifies one of many application engines. The first grammar-based speech recognition system generates a first grammar-based speech recognition result and a first grammar-based confidence score. A demultiplexer receives the spoken utterance through an input. The demultiplexer transmits the spoken utterance to one of many other grammar-based speech recognition systems based on the first grammar-based speech recognition result.
Type: Grant
Filed: April 30, 2012
Date of Patent: July 28, 2015
Assignee: 2236008 Ontario Inc.
Inventor: Darrin Kenneth Fry
-
Patent number: 9093075Abstract: A method is disclosed herein for recognizing a repeated utterance in a mobile computing device via a processor. A first utterance is detected being spoken into a first mobile computing device. Likewise, a second utterance is detected being spoken into a second mobile computing device within a predetermined time period. The second utterance substantially matches the first spoken utterance and the first and second mobile computing devices are communicatively coupled to each other. The processor enables capturing, at least temporarily, a matching utterance for performing a subsequent processing function. The performed subsequent processing function is based on a type of captured utterance.Type: GrantFiled: April 20, 2012Date of Patent: July 28, 2015Assignee: Google Technology Holdings LLCInventors: Rachid M Alameh, Jiri Slaby, Hirsashi D Watanabe
-
Patent number: 9076451Abstract: An operation determination processing section of a center extracts words included in the utterances of a driver and an operator, reads an attribute associated with each word from a synonym and related word store in which an attribute is stored in association with each word, and reads, for each read attribute, the domains of candidates for the task or the domains of the task to be actually performed from the synonym and related word store in which those domains are stored. It totals the domains read for the words included in the utterance of the driver or the like, and estimates the domain with the highest total score as the candidate for the task and the task to be actually performed. In this manner, it is possible to estimate the task with high accuracy.Type: GrantFiled: April 12, 2011Date of Patent: July 7, 2015Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHAInventors: Toshiyuki Namba, Hiroaki Sekiyama, Tatsuyuki Oka, Minako Fujishiro, Katsutoshi Okawa, Emi Otani, Atsushi Sasaki, Yasuhiko Fujita
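The word-to-attribute-to-domain voting above can be sketched as follows; the toy mappings are illustrative assumptions, not the patent's actual vocabulary.

```python
# Hypothetical word -> attribute and attribute -> domain tables.
WORD_TO_ATTRS = {"hungry": ["food"], "restaurant": ["food", "place"],
                 "gas": ["fuel"]}
ATTR_TO_DOMAINS = {"food": ["restaurant_search"], "place": ["navigation"],
                   "fuel": ["gas_station_search", "navigation"]}

def estimate_task(words):
    """Total domain votes over all extracted words; highest score wins."""
    totals = {}
    for w in words:
        for attr in WORD_TO_ATTRS.get(w, []):
            for dom in ATTR_TO_DOMAINS.get(attr, []):
                totals[dom] = totals.get(dom, 0) + 1
    return max(totals, key=totals.get) if totals else None
```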
-
Patent number: 9069755Abstract: Described is a technology by which a probability is estimated for a token in a sequence of tokens based upon a number of zero or more times (actual counts) that the sequence was observed in training data. The token may be a word in a word sequence, and the estimated probability may be used in a statistical language model. A discount parameter is set independently of interpolation parameters. If the sequence was observed at least once in the training data, a discount probability and an interpolation probability are computed and summed to provide the estimated probability. If the sequence was not observed, the probability is estimated by computing a backoff probability. Also described are various ways to obtain the discount parameter and interpolation parameters.Type: GrantFiled: March 11, 2010Date of Patent: June 30, 2015Assignee: Microsoft Technology Licensing, LLCInventor: Robert Carter Moore
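A minimal sketch of the estimation scheme this abstract describes, for bigrams: a fixed discount D is set independently of the interpolation weight, an observed sequence gets a discounted probability plus an interpolated unigram term, and an unseen history backs off to the unigram distribution. The particular lambda formula (leftover discounted mass) is one standard choice, assumed here for concreteness.

```python
from collections import Counter

def estimate(bigrams, unigrams, history, word, D=0.75):
    """P(word | history) with discounting, interpolation, and backoff."""
    h_count = sum(c for (h, _), c in bigrams.items() if h == history)
    total = sum(unigrams.values())
    p_uni = unigrams.get(word, 0) / total          # backoff distribution
    if h_count == 0:
        return p_uni                               # history unseen: pure backoff
    c_hw = bigrams.get((history, word), 0)
    p_disc = max(c_hw - D, 0) / h_count            # discounted probability
    n_types = len({w for (h, w) in bigrams if h == history})
    lam = D * n_types / h_count                    # interpolation weight from leftover mass
    return p_disc + lam * p_uni
```

With this choice of lambda the estimates over the vocabulary sum to one, which is the usual sanity check for a discounted, interpolated model.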
-
Patent number: 9064495Abstract: In some embodiments, a user device receives a voice signal corresponding to a user utterance. The user device may set a time marker corresponding to a point in time in the voice signal. The voice signal and the time marker may be transmitted to a server device. The server device may perform speech recognition using the voice signal. The server device may determine a time offset corresponding to a difference in time between an end point of the user utterance and a time associated with the time marker. The server device may determine a response to the user utterance. The server device may transmit the time offset and the response to the user device. The user device may use the time offset to determine a user-perceived latency between the end of the user utterance and a beginning of the response.Type: GrantFiled: May 7, 2013Date of Patent: June 23, 2015Assignee: Amazon Technologies, Inc.Inventors: Fred Torok, Peter Spalding VanLund
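The latency bookkeeping above is plain arithmetic, sketched here under the assumption that device and server clocks are unsynchronized, which is exactly why the marker/offset indirection helps: the server only ever reports a time difference, never an absolute time.

```python
def server_offset(marker_time, utterance_end_time):
    """Server side: offset from the marker to the utterance end point."""
    return utterance_end_time - marker_time

def perceived_latency(marker_local_time, offset, response_arrival_local_time):
    """Device side: latency from end of speech to start of the response,
    using only the device's own clock plus the server-reported offset."""
    utterance_end_local = marker_local_time + offset
    return response_arrival_local_time - utterance_end_local
```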
-
Patent number: 9047868Abstract: A specific language model for speech recognition may be built. In some embodiments, the specific language model is associated with a user and built using a corpus of text obtained from a user computing device. In some embodiments, a sequence of words is constructed from the corpus of text. The sequence of words may be obfuscated, and the obfuscated sequence of words may be stored in the specific language model. A server or a user device may use the specific language model in conjunction with a general language model to perform speech recognition on an utterance made by the user.Type: GrantFiled: July 31, 2012Date of Patent: June 2, 2015Assignee: Amazon Technologies, Inc.Inventors: Jeffrey C. O'Neill, Ryan P. Thomas
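One plausible reading of the obfuscation step is sketched below: word sequences from the user's corpus are stored only as salted hashes, so the specific model keeps counts without keeping readable text, and a candidate n-gram is hashed the same way at lookup time. The salted-SHA-256 scheme is an assumption for illustration, not the patent's stated method.

```python
import hashlib

def obfuscate(ngram, salt=b"per-user-salt"):  # salt value is illustrative
    text = " ".join(ngram).encode("utf-8")
    return hashlib.sha256(salt + text).hexdigest()

def build_specific_model(corpus_tokens, n=2):
    """Count n-grams from the user's text, storing only obfuscated keys."""
    model = {}
    for i in range(len(corpus_tokens) - n + 1):
        key = obfuscate(tuple(corpus_tokens[i:i + n]))
        model[key] = model.get(key, 0) + 1
    return model

def score(model, candidate_ngram):
    """Look up a candidate n-gram's count without exposing plain text."""
    return model.get(obfuscate(candidate_ngram), 0)
```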
-
Patent number: 9043208Abstract: A speech recognition system, method of recognizing speech and a computer program product therefor. A client device identified with a context for an associated user selectively streams audio to a provider computer, e.g., a cloud computer. Speech recognition receives streaming audio, maps utterances to specific textual candidates and determines a likelihood of a correct match for each mapped textual candidate. A context model selectively winnows candidates to resolve recognition ambiguity according to context whenever multiple textual candidates are recognized as potential matches for the same mapped utterance. Matches are used to update the context model, which may be used for multiple users in the same context.Type: GrantFiled: August 10, 2012Date of Patent: May 26, 2015Assignee: International Business Machines CorporationInventors: Fernando Luiz Koch, Julio Nogima
-
Patent number: 9043213Abstract: A speech recognition method including the steps of receiving a speech input from a known speaker of a sequence of observations and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model. The acoustic model has a plurality of model parameters describing probability distributions which relate a word or part thereof to an observation and has been trained using first training data and adapted using second training data to said speaker. The speech recognition method also determines the likelihood of the sequence of words occurring in a given language using a language model and combines the likelihoods determined by the acoustic model and the language model and outputs a sequence of words identified from said speech input signal. The acoustic model is context based for the speaker, the context based information being contained in the model using a plurality of decision trees and the structure of the decision trees is based on the second training data.Type: GrantFiled: January 26, 2011Date of Patent: May 26, 2015Assignee: Kabushiki Kaisha ToshibaInventor: Byung Ha Chun
-
Patent number: 9037464Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numeric representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.Type: GrantFiled: March 15, 2013Date of Patent: May 19, 2015Assignee: Google Inc.Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
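The final steps of that pipeline can be sketched as below: apply an embedding function to every vocabulary word to get its high-dimensional numeric representation, then compare words by cosine similarity in that space. The random matrix here merely stands in for parameters that would actually be learned jointly with the classifier.

```python
import math
import random

random.seed(0)

def embed(vocab, dim=16):
    """Map each word to a dim-dimensional vector.
    (Random stand-in for trained embedding-function parameters.)"""
    return {w: [random.gauss(0, 1) for _ in range(dim)] for w in vocab}

def cosine(u, v):
    """Similarity of two word representations in the high-dimensional space."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)
```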
-
Patent number: 9037465Abstract: A method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.Type: GrantFiled: February 21, 2013Date of Patent: May 19, 2015Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.Inventors: I. Dan Melamed, Andrej Ljolje, Bernard Renger, Yeon-Jun Kim, David J. Smith
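A toy sketch of the comparison-and-rating idea: scan the communication for pre-determined phrases tied to events or precursor events, then rate the recipient on whether the phrases required in response are present. All phrase lists are invented for the example.

```python
# Hypothetical pre-determined phrase lists.
EVENT_PHRASES = {"cancel my account"}
PRECURSOR_PHRASES = {"not happy", "speak to a manager"}
REQUIRED_RESPONSES = {"i apologize", "let me help"}

def detect(transcript):
    """Return (event_occurred, precursor_occurred) for a communication."""
    t = transcript.lower()
    event = any(p in t for p in EVENT_PHRASES)
    precursor = any(p in t for p in PRECURSOR_PHRASES)
    return event, precursor

def rate_recipient(recipient_turns):
    """Fraction of required response phrases the recipient actually used."""
    joined = " ".join(recipient_turns).lower()
    hits = sum(1 for p in REQUIRED_RESPONSES if p in joined)
    return hits / len(REQUIRED_RESPONSES)
```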
-
Patent number: 9026442Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.Type: GrantFiled: August 14, 2014Date of Patent: May 5, 2015Assignee: AT&T Intellectual Property I, L.P.Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
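The restructuring step above can be sketched as a weighted sum: the dictionary entry for each phoneme keeps its pronunciation, but its acoustic score becomes a mixture of the native acoustic models of all plausible phonemes. The one-dimensional Gaussian scorer and the weight values here are illustrative stand-ins for the real acoustic models and lattice-derived weights.

```python
import math

def gaussian_likelihood(x, mean, var):
    """Toy one-dimensional acoustic model for a single phoneme."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def restructured_score(x, native_models, weights):
    """Score a frame as a weighted sum over plausible phonemes' native models."""
    return sum(w * gaussian_likelihood(x, *native_models[p])
               for p, w in weights.items())
```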
-
Patent number: 9026429Abstract: In one embodiment, a method includes receiving one or more characters of a character string as a user enters the character string into a graphical user interface (GUI) of a computing device. The method also includes determining a degree of difficulty of the user entering the character string into the GUI of the computing device. The method further includes, if the degree of difficulty is at least approximately equal to or exceeds a pre-determined threshold, providing for display to the user an auto-suggestion for completing the character string for the user.Type: GrantFiled: December 5, 2012Date of Patent: May 5, 2015Assignee: Facebook, Inc.Inventors: Jenny Yuen, Luke St. Clair
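The thresholding logic can be sketched as follows; the difficulty heuristic (string length plus rare characters) and the threshold value are assumptions standing in for whatever measure the device actually uses.

```python
RARE_CHARS = set("zqxj'-")  # illustrative "hard to type" characters

def difficulty(prefix):
    """Hypothetical difficulty measure for a partially entered string."""
    return len(prefix) + 2 * sum(1 for c in prefix.lower() if c in RARE_CHARS)

def maybe_suggest(prefix, dictionary, threshold=6):
    """Offer an auto-suggestion only once difficulty meets the threshold."""
    if difficulty(prefix) < threshold:
        return None  # typing is easy enough; don't interrupt the user
    matches = [w for w in dictionary if w.startswith(prefix)]
    return matches[0] if matches else None
```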
-
Patent number: 9020818Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.Type: GrantFiled: August 20, 2012Date of Patent: April 28, 2015Assignee: Malaspina Labs (Barbados) Inc.Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
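The codebook-update decision can be sketched as below: a candidate formant tuple is added only when it is far enough from every existing tuple to count as new information; otherwise the nearest existing tuple is nudged toward it. The distance metric, threshold, and update rate are all assumptions for illustration.

```python
def update_codebook(codebook, candidate, threshold=100.0, rate=0.1):
    """codebook: list of formant-frequency tuples, e.g. (F1, F2, F3) in Hz."""
    def dist(a, b):
        return max(abs(x - y) for x, y in zip(a, b))

    if not codebook or all(dist(candidate, t) > threshold for t in codebook):
        codebook.append(tuple(candidate))          # enough new information: add
    else:
        nearest = min(codebook, key=lambda t: dist(candidate, t))
        i = codebook.index(nearest)
        codebook[i] = tuple(x + rate * (y - x)     # otherwise refine nearest
                            for x, y in zip(nearest, candidate))
    return codebook
```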
-
Patent number: 9015044Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.Type: GrantFiled: August 20, 2012Date of Patent: April 21, 2015Assignee: Malaspina Labs (Barbados) Inc.Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
-
Patent number: 9015047Abstract: A method, a computer readable medium and a system for reporting automatic speech recognition that comprises collecting an utterance, analyzing the utterance, receiving a translation of the utterance, and determining a difference between the analyzed utterance and the translated utterance. An embodiment of the disclosure includes updating the utterance analysis based upon the determined difference, correlating the analyzed utterance to the translated utterance, and tracking the determined difference by translator. In another embodiment the disclosure includes reporting, categorizing, sorting, and grouping the determined difference.Type: GrantFiled: February 12, 2007Date of Patent: April 21, 2015Assignee: West CorporationInventors: Aaron Scott Fisher, Prashanta Pradhan
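The "determined difference" between the machine-analyzed utterance and the human-translated one can be sketched as a word-level edit distance, the usual basis for word-error-rate style reporting; whether the patent uses exactly this metric is an assumption.

```python
def word_edit_distance(hyp, ref):
    """Minimum word-level substitutions, insertions, and deletions
    to turn the ASR hypothesis into the reference translation."""
    h, r = hyp.split(), ref.split()
    d = [[0] * (len(r) + 1) for _ in range(len(h) + 1)]
    for i in range(len(h) + 1):
        d[i][0] = i                       # deleting all hypothesis words
    for j in range(len(r) + 1):
        d[0][j] = j                       # inserting all reference words
    for i in range(1, len(h) + 1):
        for j in range(1, len(r) + 1):
            cost = 0 if h[i - 1] == r[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match or substitution
    return d[len(h)][len(r)]
```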