Natural Language Patents (Class 704/257)
  • Patent number: 9009040
    Abstract: According to certain embodiments, training a transcription system includes accessing recorded voice data of a user from one or more sources. The recorded voice data comprises voice samples. A transcript of the recorded voice data is accessed. The transcript comprises text representing one or more words of each voice sample. The transcript and the recorded voice data are provided to a transcription system to generate a voice profile for the user. The voice profile comprises information used to convert a voice sample to corresponding text.
    Type: Grant
    Filed: May 5, 2010
    Date of Patent: April 14, 2015
    Assignee: Cisco Technology, Inc.
    Inventors: Todd C. Tatum, Michael A. Ramalho, Paul M. Dunn, Shantanu Sarkar, Tyrone T. Thorsen, Alan D. Gatzke
  • Patent number: 9009042
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating direct speech messages based on voice commands that include indirect speech messages. In one aspect, a method includes receiving a voice input corresponding to an utterance. A determination is made whether a transcription of the utterance includes a command to initiate a communication to a user and a segment that is classified as indirect speech. In response to determining that the transcription of the utterance includes the command and the segment that is classified as indirect speech, the segment that is classified as indirect speech is provided as input to a machine translator. In response to providing the segment that is classified as indirect speech to the machine translator, a direct speech segment is received from the machine translator. A communication is initiated that includes the direct speech segment.
    Type: Grant
    Filed: June 13, 2014
    Date of Patent: April 14, 2015
    Assignee: Google Inc.
    Inventors: Matthias Quasthoff, Simon Tickner
  • Patent number: 9002710
    Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: April 7, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
  • Patent number: 9002705
    Abstract: The present invention provides an interactive device which allows quick utterance recognition results and sequential output thereof and which diminishes a recognition rate decrease even if user's utterance is divided by a short interval into frames for quick decision. The interactive device: sets a recognition section for voice recognition; performs voice recognition for the recognition section; when the voice recognition includes a key phrase, determines response actions corresponding thereto; and executes the response actions. The interactive device repeatedly updates the set recognition terminal point to a frame which is the predetermined time length ahead of the set recognition terminal point to set a plurality of recognition sections. The interactive device performs voice recognition for each recognition section.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: April 7, 2015
    Assignee: Honda Motor Co., Ltd.
    Inventors: Yuichi Yoshida, Taku Osada
  • Publication number: 20150095033
    Abstract: Embodiments provide for tracking a partial dialog state as part of managing a dialog state space, but the embodiments are not so limited. A method of an embodiment jointly models partial state update and named entity recognition using a sequence-based classification or other model, wherein recognition of named entities and a partial state update can be performed in a single processing stage at runtime to generate a distribution over partial dialog states. A system of an embodiment is configured to generate a distribution over partial dialog states at runtime in part using a sequence classification decoding or other algorithm to generate one or more partial dialog state hypotheses and/or a confidence score or measure associated with each hypothesis. Other embodiments are included.
    Type: Application
    Filed: October 2, 2013
    Publication date: April 2, 2015
    Applicant: MICROSOFT CORPORATION
    Inventors: Daniel Boies, Ruhi Sarikaya, Alexandre Rochette, Zhaleh Feizollahi, Nikhil Ramesh
  • Publication number: 20150095159
    Abstract: In certain implementations, a system-initiated dialog with a user may be provided based on prior user interactions. In an implementation, context information determined based on one or more prior interactions of the user with the system may be obtained. A dialog-initiation opportunity may be detected based on the context information. A natural language dialog with the user may be initiated based on the dialog-initiation opportunity. In an implementation, the one or more prior interactions of the user may comprise one or more prior conversations between the user and the system. At least one of the one or more prior conversations may, for example, comprise a natural language utterance of the user and a natural language response of the system to the natural language utterance.
    Type: Application
    Filed: December 8, 2014
    Publication date: April 2, 2015
    Applicant: VOICEBOX TECHNOLOGIES CORPORATION
    Inventors: Michael R. KENNEWICK, Catherine CHEUNG, Larry BALDWIN, Ari SALOMON, Michael TJALVE, Sheetal GUTTIGOLI, Lynn ARMSTRONG, Philippe DI CRISTO, Bernie ZIMMERMAN, Sam MENAKER
  • Patent number: 8996360
    Abstract: A method and an apparatus for generating a journal, which can implement automatic generation of a journal based on data from various sources. The method includes: obtaining a source data set and a journal description data set corresponding to the source data set; calculating an alignment probability between each source data sequence and each journal description data sequence to obtain an alignment probability set; calculating a probability that each journal description data sequence occurs in the journal description data set to obtain an occurrence probability set; determining, according to the alignment probability set and the occurrence probability set and from each journal description data sequence, a target journal description data sequence corresponding to a source data sequence to be translated, and translating the target journal description data sequence into a journal description text.
    Type: Grant
    Filed: June 26, 2014
    Date of Patent: March 31, 2015
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zhenhua Dong, Gong Zhang, Liangwei Wang
  • Patent number: 8996375
    Abstract: A speech processing system which exploits statistical modeling and formal logic to receive and process speech input, which may represent data to be received, such as dictation, or commands to be processed by an operating system, application or process. A command dictionary and dynamic grammars are used in processing speech input to identify, disambiguate and extract commands. The logical processing scheme ensures that putative commands are complete and unambiguous before processing. Context sensitivity may be employed to differentiate data and commands. A multi faceted graphic user interface may be provided for interaction with a user to speech enable interaction with applications and processes that do not necessarily have native support for speech input.
    Type: Grant
    Filed: July 26, 2013
    Date of Patent: March 31, 2015
    Inventors: Jean Gagnon, Philippe Roy, Paul J. Lagassey
  • Publication number: 20150088519
    Abstract: In some embodiments, a recognition result produced by a speech processing system based on an analysis of a speech input is evaluated for indications of potential errors. In some embodiments, sets of words/phrases that may be acoustically similar or otherwise confusable, the misrecognition of which can be significant in the domain, may be used together with a language model to evaluate a recognition result to determine whether the recognition result includes such an indication. In some embodiments, a word/phrase of a set that appears in the result is iteratively replaced with each of the other words/phrases of the set. The result of the replacement may be evaluated using a language model to determine a likelihood of the newly-created string of words appearing in a language and/or domain. The likelihood may then be evaluated to determine whether the result of the replacement is sufficiently likely for an alert to be triggered.
    Type: Application
    Filed: December 1, 2014
    Publication date: March 26, 2015
    Applicant: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
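
The replace-and-rescore check in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the application: the add-one-smoothed bigram model, the absolute log-probability threshold, the function names `train_bigram_model` and `find_alerts`, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Return a log-probability function from a toy add-one-smoothed bigram model."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams)

    def log_prob(sentence):
        tokens = ["<s>"] + sentence.split()
        return sum(
            math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
            for a, b in zip(tokens, tokens[1:])
        )

    return log_prob


def find_alerts(result, confusable_sets, log_prob, threshold):
    """Swap each confusable word for its alternatives and keep likely variants.

    Assumption: a variant triggers an alert when its log-probability is at
    least `threshold`; the abstract only says "sufficiently likely".
    """
    tokens = result.split()
    alerts = []
    for i, token in enumerate(tokens):
        for group in confusable_sets:
            if token in group:
                for alternative in sorted(group - {token}):
                    variant = " ".join(tokens[:i] + [alternative] + tokens[i + 1:])
                    if log_prob(variant) >= threshold:
                        alerts.append(variant)
    return alerts


# Hypothetical domain corpus and confusable set.
corpus = [
    "patient has hypertension",
    "patient has hypertension",
    "patient denies hypotension",
]
log_prob = train_bigram_model(corpus)
confusable_sets = [{"hypertension", "hypotension"}]
alerts = find_alerts("patient has hypotension", confusable_sets, log_prob, math.log(0.03))
print(alerts)
```

A threshold relative to the score of the original result would be an equally plausible reading; the absolute threshold is used here only because it is the simplest to state.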
  • Patent number: 8990070
    Abstract: A method, system and computer program product for building an expression, including utilizing any formal grammar of a context-free language, displaying an expression on a computer display via a graphical user interface, replacing at least one non-terminal display object within the displayed expression with any of at least one non-terminal display object and at least one terminal display object, and repeating the replacing step a plurality of times for a plurality of non-terminal display objects until no non-terminal display objects remain in the displayed expression, wherein the non-terminal display objects correspond to non-terminal elements within the grammar, and wherein the terminal display objects correspond to terminal elements within the grammar.
    Type: Grant
    Filed: November 18, 2011
    Date of Patent: March 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Yigal S. Dayan, Gil Fuchs, Josemina M. Magdalen
  • Patent number: 8990064
    Abstract: A document containing text in a source language may be translated into a target language based on content associated with that document, in conjunction with the present technology. An indication to perform an optimal translation of a document into a target language may be received via a user interface. The document may then be accessed by a computing device. The optimal translation is executed by a preferred translation engine of a plurality of available translation engines. The preferred translation engine is the most likely to produce the most accurate translation of the document among the plurality of available translation engines. Additionally, the preferred translation engine may be identified based on content associated with the document. The document is translated into the target language using the preferred translation engine to obtain a translated document, which may then be outputted by a computing device.
    Type: Grant
    Filed: July 28, 2009
    Date of Patent: March 24, 2015
    Assignee: Language Weaver, Inc.
    Inventors: Daniel Marcu, Radu Soricut, Narayanaswamy Viswanathan
  • Patent number: 8990085
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for handling expected repeat speech queries or other inputs. The method causes a computing device to detect a misrecognized speech query from a user, determine a tendency of the user to repeat speech queries based on previous user interactions, and adapt a speech recognition model based on the determined tendency before an expected repeat speech query. The method can further include recognizing the expected repeat speech query from the user based on the adapted speech recognition model. Adapting the speech recognition model can include modifying an acoustic model, a language model, and a semantic model. Adapting the speech recognition model can also include preparing a personalized search speech recognition model for the expected repeat query based on usage history and entries in a recognition lattice. The method can include retaining unmodified speech recognition models with adapted speech recognition models.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: March 24, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro
  • Patent number: 8983826
    Abstract: One embodiment provides a system for extracting shadow entities from emails. During operation, the system receives a number of document corpora. The system then calculates word-collocation statistics associated with different n-gram sizes for the document corpora. Next, the system receives an email and identifies shadow entities in the email based on the calculated word-collocation statistics for the document corpora.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: March 17, 2015
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Oliver Brdiczka, Petro Hizalev
  • Patent number: 8983840
    Abstract: Techniques, an apparatus and an article of manufacture identifying one or more utterances that are likely to carry the intent of a speaker, from a conversation between two or more parties. A method includes obtaining an input of a set of utterances in chronological order from a conversation between two or more parties, computing an intent confidence value of each utterance by summing intent confidence scores from each of the constituent words of the utterance, wherein intent confidence scores capture each word's influence on the subsequent utterances in the conversation based on (i) the uniqueness of the word in the conversation and (ii) the number of times the word subsequently occurs in the conversation, and generating a ranked order of the utterances from highest to lowest intent confidence value, wherein the highest intent value corresponds to the utterance which is most likely to carry intent of the speaker.
    Type: Grant
    Filed: June 19, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Sachindra Joshi, Saket Saurabh, Ashish Verma
  • Patent number: 8983839
    Abstract: The system and method described herein may dynamically generate a recognition grammar associated with a conversational voice user interface in an integrated voice navigation services environment. In particular, in response to receiving a natural language utterance that relates to a navigation context at the voice user interface, a conversational language processor may generate a dynamic recognition grammar that organizes grammar information based on one or more topological domains. For example, the one or more topological domains may be determined based on a current location associated with a navigation device, whereby a speech recognition engine may use the grammar information organized in the dynamic recognition grammar according to the one or more topological domains to generate one or more interpretations associated with the natural language utterance.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: March 17, 2015
    Assignee: VoiceBox Technologies Corporation
    Inventors: Michael R. Kennewick, Catherine Cheung, Larry Baldwin, Ari Salomon, Michael Tjalve, Sheetal Guttigoli, Lynn Armstrong, Philippe Di Cristo, Bernie Zimmerman, Sam Menaker
  • Patent number: 8977549
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: March 10, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 8977555
    Abstract: Features are disclosed for generating markers for elements or other portions of an audio presentation so that a speech processing system may determine which portion of the audio presentation a user utterance refers to. For example, an utterance may include a pronoun with no explicit antecedent. The marker may be used to associate the utterance with the corresponding content portion for processing. The markers can be provided to a client device with a text-to-speech (“TTS”) presentation. The markers may then be provided to a speech processing system along with a user utterance captured by the client device. The speech processing system, which may include automatic speech recognition (“ASR”) modules and/or natural language understanding (“NLU”) modules, can generate hints based on the marker. The hints can be provided to the ASR and/or NLU modules in order to aid in processing the meaning or intent of a user utterance.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: March 10, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: Fred Torok, Frédéric Johan Georges Deramat, Vikram Kumar Gundeti
  • Patent number: 8977539
    Abstract: A language analysis apparatus of the invention includes division rules, each of which is classified into one of levels according to the degree of risk of causing analysis accuracy problems when applied; a division point candidate generation unit 21 which, when a character string whose length is greater than the predetermined maximum input length is input, generates division point candidates for the input character string by applying the division rules sequentially one by one in the ascending order of the level of risk of causing problems; a division point adjustment unit 22 which, when the length of a division unit candidate obtained by the division point candidate generated by the division point candidate generation unit 21 is less than the maximum input length, selects a combination of division points from among the division point candidates obtained by applying division rules of the same level while ensuring that each division unit is not greater in length than the maximum input length; and a division unit
    Type: Grant
    Filed: March 23, 2010
    Date of Patent: March 10, 2015
    Assignee: NEC Corporation
    Inventors: Shinichi Ando, Kunihiko Sadamasa
  • Patent number: 8972260
    Abstract: In accordance with one embodiment, a method of generating language models for speech recognition includes identifying a plurality of utterances in training data corresponding to speech, generating a frequency count of each utterance in the plurality of utterances, generating a high-frequency plurality of utterances from the plurality of utterances having a frequency that exceeds a predetermined frequency threshold, generating a low-frequency plurality of utterances from the plurality of utterances having a frequency that is below the predetermined frequency threshold, generating a grammar-based language model using the high-frequency plurality of utterances as training data, and generating a statistical language model using the low-frequency plurality of utterances as training data.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: March 3, 2015
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Zhe Feng, Kui Xu, Lin Zhao
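
The frequency-based partition that precedes model training in this abstract might look like the sketch below. The function name `split_by_frequency`, the sample utterances, and the threshold value are assumptions made for illustration; training the grammar-based and statistical models themselves is out of scope here.

```python
from collections import Counter


def split_by_frequency(utterances, threshold):
    """Partition utterances into high- and low-frequency groups.

    Utterances whose count equals the threshold land in neither group,
    because the abstract only covers "exceeds" and "is below".
    """
    counts = Counter(utterances)
    high = {u: c for u, c in counts.items() if c > threshold}
    low = {u: c for u, c in counts.items() if c < threshold}
    return high, low


# Hypothetical training data.
utterances = (
    ["call home"] * 5
    + ["play jazz"] * 4
    + ["navigate to the nearest pharmacy"]
    + ["what is the weather in oslo"]
)
high, low = split_by_frequency(utterances, threshold=3)
print(sorted(high))  # would feed the grammar-based language model
print(sorted(low))   # would feed the statistical language model
```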
  • Patent number: 8972245
    Abstract: Provided are techniques for text prediction using environment hints. A list of words is received, wherein each word in the list of words has an associated weight. For at least one word in the list of words, an environment weight is obtained from an environment dictionary. The associated weight of the at least one word is updated using the obtained environment weight. The words in the list of words are ordered based on the updated, associated weight of each of the words.
    Type: Grant
    Filed: August 20, 2013
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Zachary H. Jones, Aaron J. Quirk, Lin Sun
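
The reweight-and-reorder step in this abstract can be illustrated with a short sketch. The abstract says only that the weight is "updated using" the environment weight, so the multiplicative update, the function name `rerank_with_environment`, and the sample values below are assumptions.

```python
def rerank_with_environment(candidates, environment_dictionary):
    """Reorder (word, weight) candidates using environment weights.

    Assumption: the update is multiplicative, and words missing from the
    environment dictionary keep their original weight (factor 1.0).
    """
    updated = [
        (word, weight * environment_dictionary.get(word, 1.0))
        for word, weight in candidates
    ]
    # Highest updated weight first; sorted() is stable, so ties keep input order.
    return sorted(updated, key=lambda pair: pair[1], reverse=True)


candidates = [("meeting", 0.5), ("meat", 0.4), ("meet", 0.3)]
kitchen = {"meat": 2.0}  # hypothetical environment dictionary
ranked = rerank_with_environment(candidates, kitchen)
print([word for word, _ in ranked])
```

With these sample values, "meat" moves ahead of "meeting" because its updated weight (0.8) exceeds 0.5.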
  • Patent number: 8972264
    Abstract: A method and apparatus for utterance verification are provided for verifying a recognized vocabulary output from speech recognition. The apparatus for utterance verification includes a reference score accumulator, a verification score generator and a decision device. A log-likelihood score obtained from speech recognition is processed by taking a logarithm of the value of the probability of one of feature vectors of an input speech conditioned on one of states of each model vocabulary. A verification score is generated based on the processed result. The verification score is compared with a predetermined threshold value so as to reject or accept the recognized vocabulary.
    Type: Grant
    Filed: December 17, 2012
    Date of Patent: March 3, 2015
    Assignee: Industrial Technology Research Institute
    Inventor: Shih-Chieh Chien
  • Patent number: 8972242
    Abstract: A system may include an extraction engine to extract candidate phrases from a content stream, and an analysis engine to assign the candidate phrases visual cues and display the visual cues to an operator.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: March 3, 2015
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Ming C. Hao, Christian Rohrdantz, Lars-Erik Haug, Umeshwar Dayal, Meichun Hsu, Daniel Keim
  • Publication number: 20150058018
    Abstract: In some aspects, a method of recognizing speech that comprises natural language and at least one word specified in at least one domain-specific vocabulary is provided. The method comprises performing a first speech processing pass comprising identifying, in the speech, a first portion including the natural language and a second portion including the at least one word specified in the at least one domain-specific vocabulary, and recognizing the first portion including the natural language. The method further comprises performing a second speech processing pass comprising recognizing the second portion including the at least one word specified in the at least one domain-specific vocabulary.
    Type: Application
    Filed: August 23, 2013
    Publication date: February 26, 2015
    Applicant: Nuance Communications, Inc.
    Inventors: Munir Nikolai Alexander Georges, Stephan Kanthak
  • Patent number: 8965754
    Abstract: Provided are techniques for text prediction using environment hints. A list of words is received, wherein each word in the list of words has an associated weight. For at least one word in the list of words, an environment weight is obtained from an environment dictionary. The associated weight of the at least one word is updated using the obtained environment weight. The words in the list of words are ordered based on the updated, associated weight of each of the words.
    Type: Grant
    Filed: November 20, 2012
    Date of Patent: February 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Zachary H. Jones, Aaron J. Quirk, Lin Sun
  • Patent number: 8965772
    Abstract: Methods, systems, and products are disclosed for displaying speech command input state information in a multimodal browser including displaying an icon representing a speech command type and displaying an icon representing the input state of the speech command. In typical embodiments, the icon representing a speech command type and the icon representing the input state of the speech command also includes attributes of a single icon. Typical embodiments include accepting from a user a speech command of the speech command type, changing the input state of the speech command, and displaying another icon representing the changed input state of the speech command. Typical embodiments also include displaying the text of the speech command in association with the icon representing the speech command type.
    Type: Grant
    Filed: March 20, 2014
    Date of Patent: February 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Charles W. Cross, Jr., Michael C. Hollinger, Igor R. Jablokov, Benjamin D. Lewis, Hilary A. Pike, Daniel M. Smith, David W. Wintermute, Michael A. Zaitzeff
  • Patent number: 8959020
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for discovery of problematic pronunciations for automatic speech recognition systems. One of the methods includes determining a frequency of occurrences of one or more n-grams in transcribed text and a frequency of occurrences of the n-grams in typed text and classifying a system pronunciation of a word included in the n-grams as correct or incorrect based on the frequencies. The n-grams may comprise one or more words and at least one of the words is classified as incorrect based on the frequencies. The frequencies of the specific n-grams may be determined across a domain using one or more n-grams that typically appear adjacent to the specific n-grams.
    Type: Grant
    Filed: March 29, 2013
    Date of Patent: February 17, 2015
    Assignee: Google Inc.
    Inventors: Brian Strope, Francoise Beaufays, Trevor D. Strohman
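
One way to picture the frequency comparison in this abstract is the sketch below, which uses unigrams only. The ratio test, the `min_ratio` default, the function names, and the sample texts are all assumptions; the abstract does not state how the two frequencies are combined.

```python
from collections import Counter


def relative_frequencies(texts):
    """Map each lowercased word to its share of all word tokens in `texts`."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def flag_suspect_pronunciations(transcribed, typed, min_ratio=0.5):
    """Flag words that are much rarer in transcribed speech than in typed text.

    Assumption: a word typed often but rarely produced by the recognizer
    suggests the system pronunciation for that word is wrong.
    """
    spoken = relative_frequencies(transcribed)
    written = relative_frequencies(typed)
    suspects = [
        word
        for word, typed_freq in written.items()
        if spoken.get(word, 0.0) / typed_freq < min_ratio
    ]
    return sorted(suspects)


# Hypothetical query logs.
typed = ["directions to worcester", "weather in worcester", "directions to boston"]
transcribed = ["directions to boston", "weather in boston", "directions to wooster"]
suspects = flag_suspect_pronunciations(transcribed, typed)
print(suspects)
```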
  • Patent number: 8954319
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for generating a natural language spoken dialog system. The method includes nominating a set of allowed dialog actions and a set of contextual features at each turn in a dialog, and selecting an optimal action from the set of nominated allowed dialog actions using a machine learning algorithm. The method includes generating a response based on the selected optimal action at each turn in the dialog. The set of manually nominated allowed dialog actions can incorporate a set of business rules. Prompt wordings in the generated natural language spoken dialog system can be tailored to a current context while following the set of business rules. A compression label can represent at least one of the manually nominated allowed dialog actions.
    Type: Grant
    Filed: July 23, 2014
    Date of Patent: February 10, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Jason D. Williams
  • Patent number: 8949127
    Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
    Type: Grant
    Filed: February 17, 2014
    Date of Patent: February 3, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
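
The numeric understanding and string validation stages in this abstract can be sketched as below. The rule set (digit words plus "double" and "triple"), the names `words_to_digits` and `is_valid`, and the sample data are hypothetical; the patent's actual rule classes are not given in the abstract.

```python
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
REPEAT_WORDS = {"double": 2, "triple": 3}


def words_to_digits(words):
    """Convert a recognized word string into a digit sequence.

    Words that are neither digit words nor repeat words are treated as
    filler and skipped (an assumed rule).
    """
    digits = []
    repeat = 1
    for word in words:
        word = word.lower()
        if word in REPEAT_WORDS:
            repeat = REPEAT_WORDS[word]
        elif word in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[word] * repeat)
            repeat = 1
        else:
            repeat = 1  # filler cancels a pending "double"/"triple"
    return "".join(digits)


def is_valid(digits, valid_sequences):
    """Check a digit sequence against the validation database (here, a set)."""
    return digits in valid_sequences


recognized = "my number is five five five double oh one two".split()
digits = words_to_digits(recognized)
valid_sequences = {"5550012"}  # hypothetical validation database
print(digits, is_valid(digits, valid_sequences))
```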
  • Patent number: 8949130
    Abstract: In embodiments of the present invention improved capabilities are described for a user interacting with a mobile communication facility, where speech presented by the user is recorded using a mobile communication facility resident capture facility. The recorded speech may be recognized using an external speech recognition facility to produce an external output and a resident speech recognition facility to produce an internal output, where at least one of the external output and the internal output may be selected based on a criterion.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: February 3, 2015
    Assignee: Vlingo Corporation
    Inventor: Michael S. Phillips
  • Publication number: 20150032454
    Abstract: A method and a processing device for managing an interactive speech recognition system are provided. Whether a voice input relates to expected input, at least partially, of any one of a group of menus different from a current menu is determined. If the voice input relates to the expected input, at least partially, of any one of the group of menus different from the current menu, skipping to the one of the group of menus is performed. The group of menus different from the current menu includes menus at multiple hierarchical levels.
    Type: Application
    Filed: October 13, 2014
    Publication date: January 29, 2015
    Inventor: Harry E. BLANCHARD
  • Patent number: 8942981
    Abstract: A natural language call router forwards an incoming call from a caller to an appropriate destination. The call router has a speech recognition mechanism responsive to words spoken by a caller for producing recognized text corresponding to the spoken words. A robust parsing mechanism is responsive to the recognized text for detecting a class of words in the recognized text. The class is defined as a group of words having a common attribute. An interpreting mechanism is responsive to the detected class for determining the appropriate destination for routing the call.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: January 27, 2015
    Assignee: Cellco Partnership
    Inventors: Veronica Klein, Deborah Washington Brown
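
The class-detection and routing steps in this abstract can be illustrated with a minimal sketch. The word classes, destinations, and the function name `route_call` are invented for illustration, and the "robust parsing" here is reduced to ignoring every word that belongs to no class.

```python
# Hypothetical word classes: each groups words sharing a common attribute.
WORD_CLASSES = {
    "billing": {"bill", "charge", "payment", "invoice"},
    "technical_support": {"broken", "outage", "signal", "reset"},
}

# Hypothetical mapping from detected class to call destination.
DESTINATIONS = {
    "billing": "billing_department",
    "technical_support": "support_desk",
}


def route_call(recognized_text, default="operator"):
    """Return a destination for the first word class found in the text.

    Words outside every class are ignored, so extra or misrecognized
    words do not block routing.
    """
    tokens = set(recognized_text.lower().split())
    for class_name, words in WORD_CLASSES.items():
        if tokens & words:
            return DESTINATIONS[class_name]
    return default


print(route_call("um I have a question about my last bill"))
print(route_call("hello there"))
```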
  • Patent number: 8938391
    Abstract: A dynamic exponential, feature-based, language model is continually adjusted per utterance by a user, based on the user's usage history. This adjustment of the model is done incrementally per user, over a large number of users, each with a unique history. The user history can include previously recognized utterances, text queries, and other user inputs. The history data for a user is processed to derive features. These features are then added into the language model dynamically for that user.
    Type: Grant
    Filed: June 12, 2011
    Date of Patent: January 20, 2015
    Assignee: Microsoft Corporation
    Inventors: Geoffrey Zweig, Shuangyu Chang
  • Patent number: 8938386
    Abstract: When redacting natural language text, a classifier is used to provide a sensitive concept model according to features in natural language text and in which the various classes employed are sensitive concepts reflected in the natural language text. Similarly, the classifier is used to provide a utility concepts model based on utility concepts. Based on these models, and for one or more identified sensitive concept and identified utility concept, at least one feature in the natural language text is identified that implicates the at least one identified sensitive topic more than the at least one identified utility concept. At least some of the features thus identified may be perturbed such that the modified natural language text may be provided as at least one redacted document. In this manner, features are perturbed to maximize classification error for sensitive concepts while simultaneously minimizing classification error in the utility concepts.
    Type: Grant
    Filed: March 15, 2011
    Date of Patent: January 20, 2015
    Assignee: Accenture Global Services Limited
    Inventors: Chad Cumby, Rayid Ghani
  • Publication number: 20150019227
    Abstract: A device, method and system are provided for interpreting and executing operations based on multimodal input received at a computing device. The multimodal input can include one or more verbal and non-verbal inputs, such as a combination of speech and gesture inputs received substantially concurrently via suitable user interface means provided on the computing device. One or more target objects are identified from the non-verbal input, and text is recognized from the verbal input. An interaction object is generated using the recognized text and identified target objects, and thus comprises a natural language expression with embedded target objects. The interaction object is then processed to identify one or more operations to be executed.
    Type: Application
    Filed: May 15, 2013
    Publication date: January 15, 2015
    Applicant: XTREME INTERACTIONS, INC.
    Inventor: Joe Anandarajah
  • Patent number: 8935199
    Abstract: A system and a method for linking textual and physical concepts are disclosed. The method includes extracting candidate phrases from a knowledge base for a device, the candidate phrases including noun phrases. A set of candidate concepts is generated, based on the extracted noun phrases. Provision is made, e.g., on a graphical user interface, for a user to generate mapped concepts for physical components of the device by selecting, for each concept to be mapped, a physical component shown in a graphical representation of the device and at least one of the candidate concepts which is to be linked to that physical component. The knowledge base is indexed, based on the mapped concepts. In this way, textual expressions in the knowledge base are linked to a respective physical component through one of the mapped concepts.
    Type: Grant
    Filed: December 14, 2010
    Date of Patent: January 13, 2015
    Assignee: Xerox Corporation
    Inventors: Frederic Roulland, Stefania Castellani, Nicolas Hairon, Pascal Valobra
  • Patent number: 8935168
    Abstract: A state detecting device includes an input unit that receives an input voice sound; an analyzer that calculates a feature parameter of each of a plurality of frames extracted from the voice sound; a calculator that calculates the average of the feature parameters of the frames, determines a threshold on the basis of the average and statistical data representing relationships between other averages of other feature parameters obtained from a plurality of speakers and cumulative frequencies of the other feature parameters, and calculates an appearance frequency of a frame that is among the plurality of frames and whose feature parameter is larger than the threshold; a determining unit that determines, on the basis of the appearance frequency, a strained state of a vocal cord that has made the voice sound; and an output unit that outputs a result of the determination.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: January 13, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 8935153
    Abstract: A natural language incident report resolution method and system are provided. Natural language incident reports received from a user are analyzed to determine a category associated with the incident. A database of existing incidents is analyzed to determine whether a report for the incident has already been submitted. The current status or state of the device associated with the incident is then ascertained and the incident, if new, is added to an incident database. If the incident is preexisting, the incident in the database is updated with the current status. A solution database is then queried to determine any solutions, automatic or manual workflows, that may correct the error or fault associated with the incident. The determined solution is communicated to the device associated with the incident for implementation.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: January 13, 2015
    Assignee: Xerox Corporation
    Inventors: Neil Andrew McKeeman, Jerry Shkavritko
  • Publication number: 20150012260
    Abstract: A method for recognizing a voice includes receiving, as an input, a voice involving multiple languages, recognizing a first voice of the voice by using a voice recognition algorithm matched to a preset primary language, identifying the preset primary language and a non-primary language different from the preset primary language, which are included in the multiple languages, determining a type of the non-primary language based on context information, recognizing a second voice of the voice in the non-primary language by applying a voice recognition algorithm, which is matched to the non-primary language of the determined type, to the second voice, and outputting a result of recognizing the voice which is based on a result of recognizing the first voice and a result of recognizing the second voice.
    Type: Application
    Filed: July 3, 2014
    Publication date: January 8, 2015
    Inventor: Subhojit Chakladar
  • Patent number: 8930179
    Abstract: Architecture that employs an overall grammar as a set of context-specific grammars for recognition of an input, each responsible for a specific context, such as subtask category, geographic region, etc. The grammars together cover the entire domain. Moreover, multiple recognitions can be run in parallel against the same input, where each recognition uses one or more of the context-specific grammars. The multiple intermediate recognition results from the different recognizer-grammars are reconciled by running re-recognition using a dynamically composed grammar based on the multiple recognition results and potentially other domain knowledge, or selecting the winner using a statistical classifier operating on classification features extracted from the multiple recognition results and other domain knowledge.
    Type: Grant
    Filed: June 4, 2009
    Date of Patent: January 6, 2015
    Assignee: Microsoft Corporation
    Inventors: Shuangyu Chang, Michael Levit, Bruce Buntschuh
  • Patent number: 8930191
    Abstract: Methods, systems, and computer-readable storage media related to operating an intelligent digital assistant are disclosed. A user request is received, the user request including at least a speech input received from a user. In response to the user request, (1) an echo of the speech input based on a textual interpretation of the speech input, and (2) a paraphrase of the user request based at least in part on a respective semantic interpretation of the speech input are presented to the user.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: January 6, 2015
    Assignee: Apple Inc.
    Inventors: Thomas Robert Gruber, Harry Joseph Saddler, Adam John Cheyer, Dag Kittlaus, Christopher Dean Brigham, Richard Donald Giuli, Didier Rene Guzzoni, Marcello Bastea-Forte
  • Patent number: 8930187
    Abstract: An apparatus for utilizing textual data and acoustic data corresponding to speech data to detect sentiment may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including evaluating textual data and acoustic data corresponding to voice data associated with captured speech content. The computer program code may further cause the apparatus to analyze the textual data and the acoustic data to detect whether the textual data or the acoustic data includes one or more words indicating at least one sentiment of a user that spoke the speech content. The computer program code may further cause the apparatus to assign at least one predefined sentiment to at least one of the words in response to detecting that the word(s) indicates the sentiment of the user. Corresponding methods and computer program products are also provided.
    Type: Grant
    Filed: January 3, 2012
    Date of Patent: January 6, 2015
    Assignee: Nokia Corporation
    Inventors: Imre Attila Kiss, Joseph Polifroni, Francois Mairesse, Mark Adler
  • Patent number: 8930181
    Abstract: A method performed in a character entry system involves receiving user input and using a Generalized Lexicographic Ordering (GLO) process to determine an order for presentation of one or more completion candidates to the user for selection.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: January 6, 2015
    Inventor: Prashant Parikh
  • Patent number: 8924215
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for recognizing speech. The method includes receiving speech from a user, perceiving at least one speech dialect in the received speech, selecting at least one grammar from a plurality of optimized dialect grammars based on at least one score associated with the perceived speech dialect and the perceived at least one speech dialect, and recognizing the received speech with the selected at least one grammar. Selecting at least one grammar can be further based on a user profile. Multiple grammars can be blended. Predefined parameters can include pronunciation differences, vocabulary, and sentence structure. Optimized dialect grammars can be domain specific. The method can further include recognizing initial received speech with a generic grammar until an optimized dialect grammar is selected. Selecting at least one grammar from a plurality of optimized dialect grammars can be based on a certainty threshold.
    Type: Grant
    Filed: December 23, 2013
    Date of Patent: December 30, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Gregory Pulz, Harry E. Blanchard, Steven H. Lewis, Lan Zhang
  • Patent number: 8924210
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Grant
    Filed: May 28, 2014
    Date of Patent: December 30, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Publication number: 20140379349
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing a search. A system configured to practice the method first receives from an automatic speech recognition (ASR) system a word lattice based on a speech query and receives indexed documents from an information repository. The system composes, based on the word lattice and the indexed documents, at least one triple including a query word, selected indexed document, and weight. The system generates an N-best path through the word lattice based on the at least one triple and re-ranks ASR output based on the N-best path. The system aggregates each weight across the query words to generate N-best listings and returns search results to the speech query based on the re-ranked ASR output and the N-best listings. The lattice can be a confusion network, the arc density of which can be adjusted for a desired performance level.
    Type: Application
    Filed: September 8, 2014
    Publication date: December 25, 2014
    Inventors: Srinivas Bangalore, Taniya Mishra
  • Patent number: 8918318
    Abstract: Speech recognition is enabled for a speaker who uses a speech recognition system by using an extended recognition dictionary suited to the speaker, without requiring any previous learning using an utterance label corresponding to the speech of the speaker.
    Type: Grant
    Filed: January 15, 2008
    Date of Patent: December 23, 2014
    Assignee: NEC Corporation
    Inventor: Yoshifumi Onishi
  • Patent number: 8918321
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for enhancing speech recognition accuracy. The method includes dividing a system dialog turn into segments based on timing of probable user responses, generating a weighted grammar for each segment, exclusively activating the weighted grammar generated for a current segment of the dialog turn during the current segment of the dialog turn, and recognizing user speech received during the current segment using the activated weighted grammar generated for the current segment. The method can further include assigning probability to the weighted grammar based on historical user responses and activating each weighted grammar based on the assigned probability. Weighted grammars can be generated based on a user profile. A weighted grammar can be generated for two or more segments.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: December 23, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Michael Czahor
  • Patent number: 8918320
    Abstract: An apparatus for generating a review based in part on detected sentiment may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including determining a location(s) of the apparatus and a time(s) that the location(s) was determined responsive to capturing voice data of speech content associated with spoken reviews of entities. The computer program code may further cause the apparatus to analyze textual and acoustic data corresponding to the voice data to detect whether the textual or acoustic data includes words indicating a sentiment(s) of a user speaking the speech content. The computer program code may further cause the apparatus to generate a review of an entity corresponding to a spoken review(s) based on assigning a predefined sentiment to a word(s) responsive to detecting that the word indicates the sentiment of the user. Corresponding methods and computer program products are also provided.
    Type: Grant
    Filed: January 3, 2012
    Date of Patent: December 23, 2014
    Assignee: Nokia Corporation
    Inventors: Mark Adler, Imre Attila Kiss, Francois Mairesse, Joseph Polifroni
  • Publication number: 20140372122
    Abstract: A method for recognizing speech including a sequence of words determines a shape of a gesture and a location of the gesture with respect to a display device showing a set of interpretations of the speech. The method determines a type of the word sequence constraint based on the shape of the gesture and determines a value of the word sequence constraint based on the location of the gesture. Next, the speech is recognized using the word sequence constraint.
    Type: Application
    Filed: July 22, 2014
    Publication date: December 18, 2014
    Inventors: Bret Harsham, John Hershey
  • Patent number: 8913720
    Abstract: A method includes receiving a communication from a party at a voice response system and capturing verbal communication spoken by the party. Then a processor creates a voice model associated with the party, the voice model being created by processing the captured verbal communication spoken by the party. The creation of the voice model is imperceptible to the party. The voice model is then stored to provide voice verification of the party during a subsequent communication.
    Type: Grant
    Filed: February 14, 2013
    Date of Patent: December 16, 2014
    Assignee: AT&T Intellectual Property, L.P.
    Inventor: Mazin Gilbert