Update Patterns Patents (Class 704/244)
  • Patent number: 8880495
    Abstract: Audio information is recorded in an overwriteable circular buffer of a computing device. Construction of a search query is initiated by receiving a user input. The user input includes one or more keywords forming a user-defined portion of the search query. At least a portion of the audio information recorded in the overwriteable circular buffer is processed to obtain one or more additional keywords forming an expanded portion of the search query. The portion of the audio information containing the additional keywords is received and recorded in the overwriteable circular buffer prior to receiving the user input. The search query including the user-defined portion and the expanded portion is supplied to a search engine. A response to the search query is received from the search engine. The response is generated by the search engine based on the user-defined portion and the expanded portion of the search query.
    Type: Grant
    Filed: October 16, 2012
    Date of Patent: November 4, 2014
    Inventors: Michael J. Andri, Megan E. McVicar
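As a rough sketch (not from the patent itself), the buffering-and-expansion flow described in the abstract above might look like the following; `AudioKeywordBuffer` and `build_expanded_query` are hypothetical names, and keywords stand in for raw audio:

```python
from collections import deque

class AudioKeywordBuffer:
    """Overwriteable circular buffer holding the most recent keywords
    extracted from ambient audio recorded before any user input."""

    def __init__(self, capacity):
        self._buf = deque(maxlen=capacity)  # oldest entries are overwritten

    def record(self, keyword):
        self._buf.append(keyword)

    def recent_keywords(self, n):
        return list(self._buf)[-n:]

def build_expanded_query(user_keywords, buffer, n_context=2):
    """Combine the user-defined portion of the query with an expanded
    portion drawn from audio recorded before the user input arrived."""
    expanded = [k for k in buffer.recent_keywords(n_context)
                if k not in user_keywords]
    return user_keywords + expanded
```

The combined list would then be supplied to a search engine as a single query.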
  • Publication number: 20140324428
    Abstract: A system and method are provided for improving speech recognition accuracy. Contextual information about user speech may be received, and then speech recognition analysis can be performed on the user speech using the contextual information. This allows the system and method to improve accuracy when performing tasks like searching and navigating using speech recognition.
    Type: Application
    Filed: April 30, 2013
    Publication date: October 30, 2014
    Applicant: eBay Inc.
    Inventor: Eric J. Farraro
  • Publication number: 20140324429
    Abstract: An adaptive dialogue system and also a computer-implemented method for semantic training of a dialogue system are disclosed. In this connection, semantic annotations are generated automatically on the basis of received speech inputs, the semantic annotations being intended for controlling instruments or for communication with a user. For this purpose, at least one speech input is received in the course of an interaction with a user. A sense content of the speech input is registered and appraised, by the speech input being classified on the basis of a trainable semantic model, in order to make a semantic annotation available for the speech input. Further user information connected with the speech input is taken into account if the registered sense content is appraised erroneously, incompletely and/or as untrustworthy. The sense content of the speech input is learned automatically on the basis of the additional user information.
    Type: Application
    Filed: April 24, 2014
    Publication date: October 30, 2014
    Applicant: ELEKTROBIT AUTOMOTIVE GmbH
    Inventors: Karl Weilhammer, Silke Goronzy-Thomae
  • Publication number: 20140324430
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Application
    Filed: July 14, 2014
    Publication date: October 30, 2014
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
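The model-selection fallback chain in the abstract above reduces to a simple priority lookup. A minimal sketch, assuming the model stores are plain dictionaries (the names and types here are illustrative, not from the patent):

```python
def select_speech_model(user_id, supervised, unsupervised, generic):
    """Fallback chain: user-specific supervised model, else an
    unsupervised model, else a generic model."""
    if user_id in supervised:
        return supervised[user_id], "supervised"
    if user_id in unsupervised:
        return unsupervised[user_id], "unsupervised"
    return generic, "generic"
```

Recognition then proceeds with whichever model the chain returns.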
  • Publication number: 20140316783
    Abstract: Systems and methods for vocal keyword training from text are provided. In one example method, text is received via keyboard or a touch screen. The text can include one or more words of language known to a user. The received text can be compiled to generate a signature. The signature can embody a spoken keyword and include a sequence of phonemes or a triphone. The signature can be provided as an input to automatic speech recognition (ASR) software for subsequent comparison to an audible input. In various embodiments, a mobile device receives the audible input and the text, and at least one of the compiling and ASR functionality is distributed to a cloud-based system.
    Type: Application
    Filed: April 9, 2014
    Publication date: October 23, 2014
    Inventor: Eitan Asher Medina
  • Publication number: 20140316782
    Abstract: Methods and systems are provided for managing speech dialog of a speech system. In one embodiment, a method includes: receiving a first utterance from a user of the speech system; determining a first list of possible results from the first utterance, wherein the first list includes at least two elements that each represent a possible result; analyzing the at least two elements of the first list to determine an ambiguity of the elements; and generating a speech prompt to the user based on partial orthography and the ambiguity.
    Type: Application
    Filed: April 19, 2013
    Publication date: October 23, 2014
    Inventors: Eli Tzirkel-Hancock, Gaurav Talwar, Xufang Zhao, Greg T. Lindemann
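One way to read "a speech prompt based on partial orthography and the ambiguity" is a prompt built from the spelling the ambiguous candidates share. A hypothetical sketch of that strategy (not the patent's actual implementation):

```python
import os.path

def disambiguation_prompt(candidates):
    """If the N-best list is ambiguous, ask the user to continue
    spelling past the prefix the candidates have in common."""
    if len(set(candidates)) <= 1:
        return None  # a single distinct result: no ambiguity, no prompt
    prefix = os.path.commonprefix(candidates)
    return (f"I heard several entries starting with '{prefix}'. "
            f"Please spell the letters after '{prefix}'.")
```
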
  • Patent number: 8868423
    Abstract: Systems and methods for controlling access to resources using spoken Completely Automatic Public Turing Tests To Tell Humans And Computers Apart (CAPTCHA) tests are disclosed. In these systems and methods, entities seeking access to resources are required to produce an input utterance that contains at least some audio. That utterance is compared with voice reference data for human and machine entities, and a determination is made as to whether the entity requesting access is a human or a machine. Access is then permitted or refused based on that determination.
    Type: Grant
    Filed: July 11, 2013
    Date of Patent: October 21, 2014
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8862468
    Abstract: A system and method of refining context-free grammars (CFGs). The method includes deriving back-off grammar (BOG) rules from an initially developed CFG and utilizing the initial CFG and the derived BOG rules to recognize user utterances. Based on a response of the initial CFG and the derived BOG rules to the user utterances, at least a portion of the derived BOG rules are utilized to modify the initial CFG and thereby produce a refined CFG. The above method can be carried out iteratively, with each new iteration utilizing a refined CFG from preceding iterations.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: October 14, 2014
    Assignee: Microsoft Corporation
    Inventors: Timothy Paek, Max Chickering, Eric Badger
  • Patent number: 8856002
    Abstract: A universal pattern processing system receives input data and produces output patterns that are best associated with said data. The system uses input means receiving and processing input data, a universal pattern decoder means transforming models using the input data and associating output patterns with original models that are changed least during transforming, and output means outputting best associated patterns chosen by a pattern decoder means.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: October 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Dimitri Kanevsky, David Nahamoo, Tara N Sainath
  • Patent number: 8849664
    Abstract: Methods, systems, and computer programs encoded on a computer storage medium for real-time acoustic adaptation using stability measures are disclosed. The methods include the actions of receiving a transcription of a first portion of a speech session, wherein the transcription of the first portion of the speech session is generated using a speaker adaptation profile. The actions further include receiving a stability measure for a segment of the transcription and determining that the stability measure for the segment satisfies a threshold. Additionally, the actions include triggering an update of the speaker adaptation profile using the segment, or using a portion of speech data that corresponds to the segment. And the actions include receiving a transcription of a second portion of the speech session, wherein the transcription of the second portion of the speech session is generated using the updated speaker adaptation profile.
    Type: Grant
    Filed: July 16, 2013
    Date of Patent: September 30, 2014
    Assignee: Google Inc.
    Inventors: Xin Lei, Petar Aleksic
  • Patent number: 8843371
    Abstract: The instant application includes computationally-implemented systems and methods that include managing adaptation data that is at least partly based on at least one speech interaction of a particular party; facilitating transmission of the adaptation data to a target device when there is an indication of a speech-facilitated transaction between the target device and the particular party, such that the adaptation data is applied to the target device to assist in execution of the speech-facilitated transaction; and facilitating acquisition of adaptation result data that is based on at least one aspect of the speech-facilitated transaction and is to be used in determining whether to modify the adaptation data. In addition to the foregoing, other aspects are described in the claims, drawings, and text.
    Type: Grant
    Filed: August 1, 2012
    Date of Patent: September 23, 2014
    Assignee: Elwha LLC
    Inventors: Royce A. Levien, Richard T. Lord, Robert W. Lord, Mark A. Malamud
  • Patent number: 8843367
    Abstract: An adaptive equalization system that adjusts the spectral shape of a speech signal based on an intelligibility measurement of the speech signal may improve the intelligibility of the output speech signal. Such an adaptive equalization system may include a speech intelligibility measurement module, a spectral shape adjustment module, and an adaptive equalization module. The speech intelligibility measurement module is configured to calculate a speech intelligibility measurement of a speech signal. The spectral shape adjustment module is configured to generate a weighted long-term speech curve based on a first predetermined long-term average speech curve, a second predetermined long-term average speech curve, and the speech intelligibility measurement. The adaptive equalization module is configured to adapt equalization coefficients for the speech signal based on the weighted long-term speech curve.
    Type: Grant
    Filed: May 4, 2012
    Date of Patent: September 23, 2014
    Assignee: 8758271 Canada Inc.
    Inventors: Phillip Alan Hetherington, Xueman Li
  • Publication number: 20140278412
    Abstract: Characterizing an acoustic signal includes extracting a vector from the acoustic signal, where the vector contains information about the nuisance characteristics present in the acoustic signal, and computing a set of likelihoods of the vector for a plurality of classes that model a plurality of nuisance characteristics. Training a system to characterize an acoustic signal includes obtaining training data, the training data comprising a plurality of acoustic signals, where each of the plurality of acoustic signals is associated with one of a plurality of classes that indicates a presence of a specific type of nuisance characteristic, transforming each of the plurality of acoustic signals into a vector that summarizes information about the acoustic characteristics of the signal, to produce a plurality of vectors, and labeling each of the plurality of vectors with one of the plurality of classes.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Applicant: SRI International
    Inventors: Nicolas Scheffer, Luciana Ferrer
  • Publication number: 20140278414
    Abstract: Updating a voice template for recognizing a speaker on the basis of a voice uttered by the speaker is disclosed. Stored voice templates indicate distinctive characteristics of utterances from speakers. Distinctive characteristics are extracted for a specific speaker based on a voice message utterance received from that speaker. The distinctive characteristics are compared to the characteristics indicated by the stored voice templates to select a template that matches within a predetermined threshold. The selected template is updated on the basis of the extracted characteristics.
    Type: Application
    Filed: May 27, 2014
    Publication date: September 18, 2014
    Applicant: International Business Machines Corporation
    Inventors: Yukari Miki, Masami Noguchi
  • Patent number: 8838448
    Abstract: A method is described for use with automatic speech recognition using discriminative criteria for speaker adaptation. An adaptation evaluation is performed of speech recognition performance data for speech recognition system users. Adaptation candidate users are identified based on the adaptation evaluation for whom an adaptation process is likely to improve system performance.
    Type: Grant
    Filed: April 5, 2012
    Date of Patent: September 16, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Dan Ning Jiang, Vaibhava Goel, Dimitri Kanevsky, Yong Qin
  • Publication number: 20140257811
    Abstract: A method for refining a search is provided. Embodiments may include receiving a first speech signal corresponding to a first utterance and receiving a second speech signal corresponding to a second utterance, wherein the second utterance is a refinement to the first utterance. Embodiments may also include identifying information associated with the first speech signal as first speech signal information and identifying information associated with the second speech signal as second speech signal information. Embodiments may also include determining a first quantity of search results based upon the first speech signal information and determining a second quantity of search results based upon the second speech signal information.
    Type: Application
    Filed: March 11, 2013
    Publication date: September 11, 2014
    Applicant: Nuance Communications, Inc.
    Inventor: Jean-Francois Lavallee
  • Publication number: 20140249818
    Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
    Type: Application
    Filed: May 16, 2014
    Publication date: September 4, 2014
    Applicant: MModal IP LLC
    Inventors: Girija Yegnanarayanan, Michael Finke, Juergen Fritsch, Detlef Koll, Monika Woszczyna
  • Publication number: 20140244256
    Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and, if the received speech matches data in the precompiled grammar, returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, a new grammar is dynamically compiled based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
    Type: Application
    Filed: May 2, 2014
    Publication date: August 28, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Harry Blanchard, Steven H. Lewis, Sivaprasad Shankarnarayan, Lan Zhang
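The two-tier lookup in the abstract above (large precompiled grammar plus a small dynamically compiled grammar covering only entries added since the last full compile) can be sketched as follows; `IncrementalGrammar` and its members are hypothetical names, with sets standing in for compiled grammars:

```python
class IncrementalGrammar:
    """Precompiled grammar plus on-demand compilation of the delta."""

    def __init__(self, database):
        self._compiled = set(database)   # the precompiled grammar
        self._known = len(database)      # high-water mark at compile time

    def recognize(self, utterance, database):
        if utterance in self._compiled:
            return utterance             # hit in the precompiled grammar
        # Compile a grammar from only the entries added since the
        # precompiled grammar was built, not the whole database.
        delta = set(database[self._known:])
        return utterance if utterance in delta else None
```

This avoids recompiling a large name directory every time a few entries are added.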
  • Publication number: 20140244255
    Abstract: A semiconductor integrated circuit device for speech recognition includes a conversion candidate setting unit that receives text data indicating words or sentences together with a command and sets the text data in a conversion list in accordance with the command; a standard pattern extracting unit that extracts, from a speech recognition database, a standard pattern corresponding to at least a part of the words or sentences indicated by the text data that is set in the conversion list; a signal processing unit that extracts frequency components of an input speech signal and generates a feature pattern indicating distribution of the frequency components; and a match detecting unit that detects a match between the feature pattern generated from at least a part of the speech signal and the standard pattern and outputs a speech recognition result.
    Type: Application
    Filed: February 14, 2014
    Publication date: August 28, 2014
    Applicant: SEIKO EPSON CORPORATION
    Inventor: Tsutomu Nonaka
  • Patent number: 8818809
    Abstract: Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability.
    Type: Grant
    Filed: June 20, 2013
    Date of Patent: August 26, 2014
    Assignee: Google Inc.
    Inventors: Craig L. Reding, Suzi Levas
  • Patent number: 8812317
    Abstract: Provided are an apparatus and method for recognizing voice commands, the apparatus including: a voice command recognition unit which recognizes an input voice command; a voice command recognition learning unit which learns a recognition-targeted voice command; and a controller which controls the voice command recognition unit to recognize the recognition-targeted voice command from an input voice command, controls the voice command recognition learning unit to learn the input voice command if the voice command recognition is unsuccessful, and performs a particular operation corresponding to the recognized voice command if the voice command recognition is successful.
    Type: Grant
    Filed: September 2, 2009
    Date of Patent: August 19, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jong-hyuk Jang, Seung-kwon Park, Jong-ho Lea
  • Patent number: 8805684
    Abstract: Automatic speech recognition (ASR) may be performed on received utterances. The ASR may be performed by an ASR module of a computing device (e.g., a client device). The ASR may include: generating feature vectors based on the utterances, updating the feature vectors based on feature-space speaker adaptation parameters, transcribing the utterances to text strings, and updating the feature-space speaker adaptation parameters based on the feature vectors. The transcriptions may be based, at least in part, on an acoustic model and the updated feature vectors. Updated speaker adaptation parameters may be received from another computing device and incorporated into the ASR module.
    Type: Grant
    Filed: October 17, 2012
    Date of Patent: August 12, 2014
    Assignee: Google Inc.
    Inventors: Petar Aleksic, Xin Lei
  • Patent number: 8798994
    Abstract: The present invention discloses a solution for conserving computing resources when implementing transformation based adaptation techniques. The disclosed solution limits the amount of speech data used by real-time adaptation algorithms to compute a transformation, which results in substantial computational savings. Appreciably, application of a transform is a relatively low memory and computationally cheap process compared to memory and resource requirements for computing the transform to be applied.
    Type: Grant
    Filed: February 6, 2008
    Date of Patent: August 5, 2014
    Assignee: International Business Machines Corporation
    Inventors: John W. Eckhart, Michael Florio, Radek Hampl, Pavel Krbec, Jonathan Palgon
  • Patent number: 8798995
    Abstract: Topics of potential interest to a user, useful for purposes such as targeted advertising and product recommendations, can be extracted from voice content produced by a user. A computing device can capture voice content, such as when a user speaks into or near the device. One or more sniffer algorithms or processes can attempt to identify trigger words in the voice content, which can indicate a level of interest of the user. For each identified potential trigger word, the device can capture adjacent audio that can be analyzed, on the device or remotely, to attempt to determine one or more keywords associated with that trigger word. The identified keywords can be stored and/or transmitted to an appropriate location accessible to entities such as advertisers or content providers who can use the keywords to attempt to select or customize content that is likely relevant to the user.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: August 5, 2014
    Assignee: Amazon Technologies, Inc.
    Inventor: Kiran K. Edara
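A minimal sketch of the "sniffer" idea in the abstract above: scan a transcript for trigger words and capture the adjacent words as candidate keywords. The function name, window size, and text-based input are assumptions for illustration; the patent operates on captured audio:

```python
def sniff_keywords(transcript, trigger_words, window=2):
    """For each trigger word found, capture up to `window` words on
    either side as candidate keywords associated with that trigger."""
    words = transcript.lower().split()
    hits = {}
    for i, w in enumerate(words):
        if w in trigger_words:
            lo, hi = max(0, i - window), i + window + 1
            hits[w] = [x for x in words[lo:hi] if x != w]
    return hits
```

The resulting keywords could then be stored or forwarded for content selection.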
  • Publication number: 20140207459
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
    Type: Application
    Filed: March 26, 2014
    Publication date: July 24, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose
  • Patent number: 8781821
    Abstract: A method is disclosed for controlling a voice-activated device by interpreting a spoken command as a series of voiced and non-voiced intervals. A responsive action is then performed according to the number of voiced intervals in the command. The method is well-suited to applications having a small number of specific voice-activated response functions. Applications using the inventive method offer numerous advantages over traditional speech recognition systems including speaker universality, language independence, no training or calibration needed, implementation with simple microcontrollers, and extremely low cost. For time-critical applications such as pulsers and measurement devices, where fast reaction is crucial to catch a transient event, the method provides near-instantaneous command response, yet versatile voice control.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: July 15, 2014
    Assignee: Zanavox
    Inventor: David Edward Newman
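Counting voiced intervals, as described in the abstract above, amounts to counting runs of frames above an energy threshold. A hedged sketch (frame energies, threshold, and the action table are illustrative assumptions, not the patent's signal chain):

```python
def count_voiced_intervals(energies, threshold=0.5):
    """Count runs of frames whose energy exceeds a threshold. The
    command is interpreted purely by the number of voiced intervals,
    not their content, so it is speaker- and language-independent."""
    count, in_voiced = 0, False
    for e in energies:
        if e > threshold and not in_voiced:
            count += 1
            in_voiced = True
        elif e <= threshold:
            in_voiced = False
    return count

def dispatch(energies, actions):
    """Map the interval count to an entry in an action table."""
    return actions.get(count_voiced_intervals(energies))
```

This is why the approach needs no training: any utterance with the right number of voiced bursts triggers the action.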
  • Patent number: 8781831
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Grant
    Filed: September 5, 2013
    Date of Patent: July 15, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
  • Patent number: 8775177
    Abstract: A speech recognition process may perform the following operations: performing a preliminary recognition process on first audio to identify candidates for the first audio; generating first templates corresponding to the first audio, where each first template includes a number of elements; selecting second templates corresponding to the candidates, where the second templates represent second audio, and where each second template includes elements that correspond to the elements in the first templates; comparing the first templates to the second templates, where the comparing comprises generating similarity metrics between the first templates and corresponding second templates; applying weights to the similarity metrics to produce weighted similarity metrics, where the weights are associated with corresponding second templates; and using the weighted similarity metrics to determine whether the first audio corresponds to the second audio.
    Type: Grant
    Filed: October 31, 2012
    Date of Patent: July 8, 2014
    Assignee: Google Inc.
    Inventors: Georg Heigold, Patrick An Phu Nguyen, Mitchel Weintraub, Vincent O. Vanhoucke
  • Patent number: 8775178
    Abstract: Updating a voice template for recognizing a speaker on the basis of a voice uttered by the speaker is disclosed. Stored voice templates indicate distinctive characteristics of utterances from speakers. Distinctive characteristics are extracted for a specific speaker based on a voice message utterance received from that speaker. The distinctive characteristics are compared to the characteristics indicated by the stored voice templates to select a template that matches within a predetermined threshold. The selected template is updated on the basis of the extracted characteristics.
    Type: Grant
    Filed: October 27, 2009
    Date of Patent: July 8, 2014
    Assignee: International Business Machines Corporation
    Inventors: Yukari Miki, Masami Noguchi
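The match-then-update loop in the abstract above can be sketched with plain feature vectors. The Euclidean distance, averaging blend rule, and threshold value are assumptions for illustration; the patent does not specify them here:

```python
def update_matching_template(templates, features, threshold=1.0):
    """Find the stored template closest to the extracted characteristics;
    if it matches within the threshold, blend the new features in."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    best = min(templates, key=lambda sid: dist(templates[sid], features))
    if dist(templates[best], features) > threshold:
        return None  # no stored template matches closely enough
    # Update the selected template toward the newly extracted features.
    templates[best] = [(t + f) / 2 for t, f in zip(templates[best], features)]
    return best
```

Updating on every accepted utterance lets the template track gradual drift in a speaker's voice.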
  • Patent number: 8768698
    Abstract: Methods and systems for speech recognition processing are described. In an example, a computing device may be configured to receive information indicative of a frequency of submission of a search query to a search engine for a search query composed of a sequence of words. Based on the frequency of submission of the search query exceeding a threshold, the computing device may be configured to determine groupings of one or more words of the search query based on an order in which the one or more words occur in the sequence of words of the search query. Further, the computing device may be configured to provide information indicating the groupings to a speech recognition system.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: July 1, 2014
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Jeffrey Scott Sorensen, Eugene Weinstein
  • Patent number: 8762149
    Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice, comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify (10) that the speaker's voice corresponds to the speaker whose identity is to be verified, based on the received voice utterance; c) verifying (12, 13) that the received voice utterance is not falsified, preferably after having verified the speaker's voice; and d) accepting (16) the speaker's identity in case both verification steps give a positive result, and not accepting (15) the speaker's identity if either verification step gives a negative result. The invention further refers to a corresponding computer-readable medium and a computer.
    Type: Grant
    Filed: December 10, 2008
    Date of Patent: June 24, 2014
    Inventors: Marta Sánchez Asenjo, Alfredo Gutiérrez Navarro, Alberto Martín de los Santos de las Heras, Marta García Gomar
  • Patent number: 8762148
    Abstract: A method and apparatus for carrying out adaptation using input speech data information even at a low reference pattern recognition performance. A reference pattern adaptation device 2 includes a speech recognition section 18, an adaptation data calculating section 19 and a reference pattern adaptation section 20. The speech recognition section 18 calculates a recognition result teacher label from the input speech data and the reference pattern. The adaptation data calculating section 19 calculates adaptation data composed of a teacher label and speech data. The adaptation data is composed of the input speech data and the recognition result teacher label corrected for adaptation by the recognition error knowledge which is the statistical information of the tendency towards recognition errors of the reference pattern. The reference pattern adaptation section 20 adapts the reference pattern using the adaptation data to generate an adaptation pattern.
    Type: Grant
    Filed: February 16, 2007
    Date of Patent: June 24, 2014
    Assignee: NEC Corporation
    Inventor: Yoshifumi Onishi
  • Patent number: 8751232
    Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. A particular method includes detecting that a frequency of occurrence of a particular type of utterance satisfies a threshold. The method further includes tuning a speech recognition system with respect to the particular type of utterance.
    Type: Grant
    Filed: February 6, 2013
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
  • Patent number: 8751226
    Abstract: A speech processing apparatus 101 includes a recognition feature extracting unit 12 that extracts recognition feature information which is a characteristic of a speech recognition result 15 obtained by performing a speech recognition process on an inputted speech from the speech recognition result 15; a language feature extracting unit 11 that extracts language feature information which is a characteristic of a pre-registered language resource 14 from the language resource 14; and a model learning unit 13 that obtains a verification model 16 by a learning process based on the extracted recognition feature information and language feature information.
    Type: Grant
    Filed: June 18, 2007
    Date of Patent: June 10, 2014
    Assignee: NEC Corporation
    Inventors: Hitoshi Yamamoto, Kiyokazu Miki
  • Publication number: 20140156274
    Abstract: Methods and systems to translate input labels of arcs of a network, corresponding to a sequence of states of the network, to a list of output grammar elements of the arcs, corresponding to a sequence of grammar elements. The network may include a plurality of speech recognition models combined with a weighted finite state machine transducer (WFST). Traversal may include active arc traversal, and may include active arc propagation. Arcs may be processed in parallel, including arcs originating from multiple source states and directed to a common destination state. Self-loops associated with states may be modeled within outgoing arcs of the states, which may reduce synchronization operations. Tasks may be ordered with respect to cache-data locality to associate tasks with processing threads based at least in part on whether another task associated with a corresponding data object was previously assigned to the thread.
    Type: Application
    Filed: June 24, 2013
    Publication date: June 5, 2014
    Inventors: Kisun You, Christopher J. Hughes, Yen-Kuang Chen
  • Publication number: 20140156275
    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
    Type: Application
    Filed: February 10, 2014
    Publication date: June 5, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Allen Louis Gorin, Dilek Z. Hakkani-Tur, Giuseppe Riccardi
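A rough sketch of the selective-sampling step described above, assuming the utterance confidence score is simply the mean of the word confidence scores (the actual combination in the application may differ):

```python
def utterance_confidence(word_confidences):
    """Combine per-word confidence scores into an utterance-level
    score; the mean is one simple choice."""
    return sum(word_confidences) / len(word_confidences)

def select_for_transcription(utterances, threshold):
    """Selectively sample the least-confident utterances for a
    human to transcribe."""
    return [words for words, confs in utterances
            if utterance_confidence(confs) < threshold]

batch = [(["call", "mom"], [0.95, 0.90]),
         (["wreck", "a", "nice", "beach"], [0.40, 0.35, 0.50, 0.45])]
print(select_for_transcription(batch, threshold=0.7))
# [['wreck', 'a', 'nice', 'beach']]
```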
  • Patent number: 8744850
    Abstract: Challenge items for an audible based electronic challenge system are generated using a variety of techniques to identify optimal candidates. The challenge items are intended for use in a computing system that discriminates between humans and text-to-speech (TTS) systems.
    Type: Grant
    Filed: January 14, 2013
    Date of Patent: June 3, 2014
    Assignee: John Nicholas and Kristin Gross
    Inventor: John Nicholas Gross
  • Patent number: 8738377
    Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
    Type: Grant
    Filed: June 7, 2010
    Date of Patent: May 27, 2014
    Assignee: Google Inc.
    Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas Beeferman
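The carrier-phrase lookup with a search fallback might look like the following sketch; the phrase list and actions are hypothetical, and real matching would work on parsed speech rather than raw strings:

```python
CARRIER_PHRASES = {  # hypothetical learned phrase -> actions mapping
    "call": ["open dialer"],
    "navigate to": ["open maps", "start navigation"],
}

def handle_command(spoken):
    """Match the spoken command against known carrier phrases;
    fall back to a (stubbed) web search when nothing matches."""
    for phrase, actions in CARRIER_PHRASES.items():
        if spoken.startswith(phrase):
            return ("actions", actions)
    return ("search", f"web results for: {spoken}")

print(handle_command("navigate to the airport"))
# ('actions', ['open maps', 'start navigation'])
print(handle_command("pictures of cats")[0])  # search
```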
  • Patent number: 8738375
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Grant
    Filed: May 9, 2011
    Date of Patent: May 27, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
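One way to picture the saliency weighting is as a re-scoring pass over ASR hypotheses; the weight values and the combination rule below are invented for illustration and are not the patent's formulation:

```python
def rescore(hypotheses, saliency):
    """Pick the hypothesis whose base score, re-weighted by the mean
    per-word saliency value, is highest. Unknown words get weight 1."""
    def score(hyp):
        words, base = hyp
        weight = sum(saliency.get(w, 1.0) for w in words) / len(words)
        return base * weight
    return max(hypotheses, key=score)

# Saliency derived (hypothetically) from human perception judgments.
saliency = {"refund": 2.0, "the": 0.5}
hyps = [(["i", "want", "a", "refund"], 0.40),
        (["i", "want", "a", "re", "fund"], 0.45)]
print(rescore(hyps, saliency)[0])  # ['i', 'want', 'a', 'refund']
```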
  • Patent number: 8732845
    Abstract: Systems, methods and articles of manufacture for generating a video such that when another person views the video, the other person can view non-private information but not private information of the person who generated the video. A first interview screen is generated by a financial application and displayed to a first person or user of a financial application. The screen includes private data related to the first person. A video of the interview screen is generated and may be transmitted over a network to a second person who may also utilize a financial application. The video is displayed to the second person, but the second person cannot view the private data related to the first person.
    Type: Grant
    Filed: May 18, 2012
    Date of Patent: May 20, 2014
    Assignee: Intuit Inc.
    Inventors: Steven C. Barker, Benjamin J. Kanspedos
  • Patent number: 8731922
    Abstract: A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.
    Type: Grant
    Filed: April 30, 2013
    Date of Patent: May 20, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Robert Wesley Bossemeyer, Jr.
  • Patent number: 8731924
    Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
    Type: Grant
    Filed: August 8, 2011
    Date of Patent: May 20, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur
  • Publication number: 20140136202
    Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; processing the speech data for a pattern of a user competence associated with at least one of task requests and interaction behavior; and selectively updating at least one of a system prompt and an interaction sequence based on the user competence.
    Type: Application
    Filed: October 22, 2013
    Publication date: May 15, 2014
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Robert D. Sims, III, Timothy J. Grost, Ron M. Hecht, Ute Winter
  • Publication number: 20140136201
    Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
    Type: Application
    Filed: October 22, 2013
    Publication date: May 15, 2014
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III, Ute Winter
  • Publication number: 20140136200
    Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: processing a spoken command with one or more models of one or more model types to achieve model results; evaluating a frequency of the model results; and selectively updating the one or more models of the one or more model types based on the evaluating.
    Type: Application
    Filed: October 22, 2013
    Publication date: May 15, 2014
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ute Winter, Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III
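The three GM applications above share a log-then-selectively-update loop. A minimal sketch of the last one, which evaluates the frequency of model results; the model names and the minimum-share criterion are illustrative assumptions:

```python
from collections import Counter

def evaluate_model_results(logged_results):
    """Tally how often each model type produced the accepted result."""
    return Counter(model for model, _ in logged_results)

def models_to_update(freqs, min_share):
    """Select model types whose share of accepted results falls below
    min_share as candidates for selective updating."""
    total = sum(freqs.values())
    return [m for m, n in freqs.items() if n / total < min_share]

log = [("acoustic", "call home"), ("language", "call home"),
       ("language", "play jazz"), ("language", "navigate")]
freqs = evaluate_model_results(log)
print(models_to_update(freqs, min_share=0.5))  # ['acoustic']
```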
  • Patent number: 8725509
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to language models stored for digital language processing. In one aspect, a method includes the actions of generating a language model, including: receiving a collection of n-grams from a corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus, and generating a trie representing the collection of n-grams, the trie being represented using one or more arrays of integers, and compressing an array representation of the trie using block encoding; and using the language model to identify a second probability of a particular string of words occurring.
    Type: Grant
    Filed: June 17, 2009
    Date of Patent: May 13, 2014
    Assignee: Google Inc.
    Inventors: Boulos Harb, Ciprian Chelba, Jeffrey A. Dean, Sanjay Ghemawat
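A dict-based trie keeps the n-gram lookup idea visible, though the patent's contribution is the compact integer-array representation with block-encoded compression, which this sketch deliberately omits:

```python
def build_trie(ngrams):
    """Build a nested-dict trie from (word-tuple, probability) pairs."""
    root = {}
    for words, prob in ngrams:
        node = root
        for w in words:
            node = node.setdefault(w, {})
        node["__prob__"] = prob  # probability of this n-gram in the corpus
    return root

def lookup(trie, words):
    """Return the stored probability of a string of words, if present."""
    node = trie
    for w in words:
        if w not in node:
            return None
        node = node[w]
    return node.get("__prob__")

lm = build_trie([(("the", "cat"), 0.01), (("the", "dog"), 0.02)])
print(lookup(lm, ("the", "dog")))  # 0.02
```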
  • Patent number: 8725511
    Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: May 13, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Harry Blanchard, Steven H. Lewis, Shankarnarayan Sivaprasad, Lan Zhang
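The two-stage matching could be sketched as follows; real grammars are compiled finite-state structures, so the sets here are simple stand-ins:

```python
class GrammarRecognizer:
    """Try the precompiled grammar first; on a miss, dynamically build a
    small delta grammar from entries added after the last compile."""
    def __init__(self, database):
        self.database = list(database)
        self.compiled_upto = len(self.database)
        self.precompiled = set(self.database)  # stand-in for a real grammar

    def add(self, name):
        self.database.append(name)  # new entry, not yet compiled

    def recognize(self, speech):
        if speech in self.precompiled:
            return ("precompiled", speech)
        # compile a grammar based only on data added since compilation
        delta = set(self.database[self.compiled_upto:])
        if speech in delta:
            return ("dynamic", speech)
        return ("no match", None)

rec = GrammarRecognizer(["alice", "bob"])
rec.add("carol")
print(rec.recognize("bob"))    # ('precompiled', 'bob')
print(rec.recognize("carol"))  # ('dynamic', 'carol')
```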
  • Patent number: 8719017
    Abstract: Speech recognition models are dynamically re-configurable based on user information, background information such as background noise, and transducer information such as transducer response characteristics to provide users with alternate input modes to keyboard text entry. The techniques of dynamic re-configurable speech recognition allow speech recognition to be deployed on small devices such as mobile phones and personal digital assistants, as well as in environments such as the office, home, or vehicle, while maintaining recognition accuracy.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: May 6, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Richard C Rose, Bojana Gajic
  • Patent number: 8712757
    Abstract: A method for communication management includes receiving at least one keyword and receiving a replay time span input. Further, the method includes receiving a plurality of communication inputs including at least a first communication input and a second communication input, monitoring at least the first communication input and second communication input for the at least one keyword, and determining an instantiation of the at least one keyword in at least one of the first communication input and second communication input. Additionally, the method includes associating the determined instantiation with one of the first communication input and second communication input, and providing at least a portion of the communication associated with the determined instantiation based on the replay time span input responsive to the instantiation.
    Type: Grant
    Filed: January 10, 2007
    Date of Patent: April 29, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Rick A. Hamilton, II, Peter G. Finn, Christopher J. Dawson, John S. Langford
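A sketch of the keyword-triggered replay over multiple communication inputs; buffering transcript segments with timestamps is one plausible realization, and all names and parameters are illustrative:

```python
from collections import deque

class KeywordReplay:
    """Monitor timestamped transcript segments from several communication
    inputs for a keyword; on an instantiation, replay the portion of the
    matching input that falls within the configured replay time span."""
    def __init__(self, keyword, replay_span, maxlen=100):
        self.keyword = keyword
        self.replay_span = replay_span
        self.buffers = {}  # input id -> deque of (time, text)
        self.maxlen = maxlen

    def feed(self, input_id, t, text):
        buf = self.buffers.setdefault(input_id, deque(maxlen=self.maxlen))
        buf.append((t, text))
        if self.keyword in text:
            # replay only the span within replay_span of the hit
            return [x for ts, x in buf if t - ts <= self.replay_span]
        return None

mon = KeywordReplay("budget", replay_span=10)
mon.feed("call-1", 0, "good morning everyone")
mon.feed("call-1", 8, "let's review the numbers")
print(mon.feed("call-1", 12, "the budget is tight"))
# ["let's review the numbers", 'the budget is tight']
```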
  • Publication number: 20140114661
    Abstract: Methods and systems for speech recognition processing are described. In an example, a computing device may be configured to receive information indicative of a frequency of submission of a search query to a search engine for a search query composed of a sequence of words. Based on the frequency of submission of the search query exceeding a threshold, the computing device may be configured to determine groupings of one or more words of the search query based on an order in which the one or more words occur in the sequence of words of the search query. Further, the computing device may be configured to provide information indicating the groupings to a speech recognition system.
    Type: Application
    Filed: September 24, 2013
    Publication date: April 24, 2014
    Applicant: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Jeffrey Scott Sorensen, Eugene Weinstein
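The grouping step could be pictured as an order-preserving n-gram sweep over a sufficiently frequent query; the exact grouping criteria in the application may differ, and the threshold below is invented:

```python
def word_groupings(query, min_len=2):
    """Return contiguous word groupings of a search query, preserving
    the order in which the words occur in the sequence."""
    words = query.split()
    return [tuple(words[i:i + n])
            for n in range(min_len, len(words) + 1)
            for i in range(len(words) - n + 1)]

def groupings_if_frequent(query, frequency, threshold):
    """Provide groupings to the speech recognition system only when the
    query's submission frequency exceeds the threshold."""
    return word_groupings(query) if frequency > threshold else []

print(groupings_if_frequent("new york times", frequency=5000, threshold=1000))
# [('new', 'york'), ('york', 'times'), ('new', 'york', 'times')]
```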