Update Patterns Patents (Class 704/244)
-
Patent number: 8880495
Abstract: Audio information is recorded in an overwriteable circular buffer of a computing device. Construction of a search query is initiated by receiving a user input. The user input includes one or more keywords forming a user-defined portion of the search query. At least a portion of the audio information recorded in the overwriteable circular buffer is processed to obtain one or more additional keywords forming an expanded portion of the search query. The portion of the audio information containing the additional keywords is received and recorded in the overwriteable circular buffer prior to receiving the user input. The search query including the user-defined portion and the expanded portion is supplied to a search engine. A response to the search query is received from the search engine. The response is generated by the search engine based on the user-defined portion and the expanded portion of the search query.
Type: Grant. Filed: October 16, 2012. Date of Patent: November 4, 2014. Inventors: Michael J. Andri, Megan E. McVicar
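The overwriteable circular buffer at the heart of this abstract maps naturally onto Python's `collections.deque` with a `maxlen`, which drops the oldest element as each new one arrives. A minimal sketch (the class and frame names are hypothetical, not from the patent):

```python
from collections import deque

class AudioRingBuffer:
    """Overwriteable circular buffer: once full, the oldest audio
    frame is discarded automatically as each new frame is appended."""

    def __init__(self, max_frames):
        self.frames = deque(maxlen=max_frames)

    def record(self, frame):
        self.frames.append(frame)

    def snapshot(self):
        # Audio available for expanded-keyword extraction at query time.
        return list(self.frames)

buf = AudioRingBuffer(max_frames=3)
for f in ["f1", "f2", "f3", "f4"]:
    buf.record(f)
print(buf.snapshot())  # oldest frame "f1" has been overwritten
```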
-
Publication number: 20140324428
Abstract: A system and method are provided for improving speech recognition accuracy. Contextual information about user speech may be received, and then speech recognition analysis can be performed on the user speech using the contextual information. This allows the system and method to improve accuracy when performing tasks like searching and navigating using speech recognition.
Type: Application. Filed: April 30, 2013. Publication date: October 30, 2014. Applicant: eBay Inc. Inventor: Eric J. Farraro
-
Publication number: 20140324429
Abstract: An adaptive dialogue system and also a computer-implemented method for semantic training of a dialogue system are disclosed. In this connection, semantic annotations are generated automatically on the basis of received speech inputs, the semantic annotations being intended for controlling instruments or for communication with a user. For this purpose, at least one speech input is received in the course of an interaction with a user. A sense content of the speech input is registered and appraised, by the speech input being classified on the basis of a trainable semantic model, in order to make a semantic annotation available for the speech input. Further user information connected with the speech input is taken into account if the registered sense content is appraised erroneously, incompletely and/or as untrustworthy. The sense content of the speech input is learned automatically on the basis of the additional user information.
Type: Application. Filed: April 24, 2014. Publication date: October 30, 2014. Applicant: Elektrobit Automotive GmbH. Inventors: Karl Weilhammer, Silke Goronzy-Thomae
-
Publication number: 20140324430
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
Type: Application. Filed: July 14, 2014. Publication date: October 30, 2014. Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
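The supervised → unsupervised → generic fallback chain described in the abstract is easy to sketch. Dictionary-keyed model lookup is an assumption for illustration; the publication does not specify how models are stored:

```python
def select_model(user_id, supervised, unsupervised, generic):
    """Fallback chain from the abstract: prefer the user-specific
    supervised model, then an unsupervised model, then a generic one."""
    if user_id in supervised:
        return supervised[user_id]
    if user_id in unsupervised:
        return unsupervised[user_id]
    return generic

supervised = {"alice": "alice-supervised"}
unsupervised = {"bob": "bob-unsupervised"}
print(select_model("alice", supervised, unsupervised, "generic"))
print(select_model("bob", supervised, unsupervised, "generic"))
print(select_model("carol", supervised, unsupervised, "generic"))
```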
-
Publication number: 20140316783
Abstract: Systems and methods for vocal keyword training from text are provided. In one example method, text is received via a keyboard or a touch screen. The text can include one or more words of a language known to the user. The received text can be compiled to generate a signature. The signature can embody a spoken keyword and include a sequence of phonemes or a triphone. The signature can be provided as an input to automatic speech recognition (ASR) software for subsequent comparison to an audible input. In various embodiments, a mobile device receives the audible input and the text, and at least one of the compiling and ASR functionality is distributed to a cloud-based system.
Type: Application. Filed: April 9, 2014. Publication date: October 23, 2014. Inventor: Eitan Asher Medina
-
Publication number: 20140316782
Abstract: Methods and systems are provided for managing speech dialog of a speech system. In one embodiment, a method includes: receiving a first utterance from a user of the speech system; determining a first list of possible results from the first utterance, wherein the first list includes at least two elements that each represent a possible result; analyzing the at least two elements of the first list to determine an ambiguity of the elements; and generating a speech prompt to the user based on partial orthography and the ambiguity.
Type: Application. Filed: April 19, 2013. Publication date: October 23, 2014. Inventors: Eli Tzirkel-Hancock, Gaurav Talwar, Xufang Zhao, Greg T. Lindemann
-
Patent number: 8868423
Abstract: Systems and methods for controlling access to resources using spoken Completely Automatic Public Turing Tests To Tell Humans And Computers Apart (CAPTCHA) tests are disclosed. In these systems and methods, entities seeking access to resources are required to produce an input utterance that contains at least some audio. That utterance is compared with voice reference data for human and machine entities, and a determination is made as to whether the entity requesting access is a human or a machine. Access is then permitted or refused based on that determination.
Type: Grant. Filed: July 11, 2013. Date of Patent: October 21, 2014. Assignee: John Nicholas and Kristin Gross Trust. Inventor: John Nicholas Gross
-
Patent number: 8862468
Abstract: A system and method of refining context-free grammars (CFGs). The method includes deriving back-off grammar (BOG) rules from an initially developed CFG and utilizing the initial CFG and the derived BOG rules to recognize user utterances. Based on a response of the initial CFG and the derived BOG rules to the user utterances, at least a portion of the derived BOG rules are utilized to modify the initial CFG and thereby produce a refined CFG. The above method can be carried out iteratively, with each new iteration utilizing a refined CFG from preceding iterations.
Type: Grant. Filed: December 22, 2011. Date of Patent: October 14, 2014. Assignee: Microsoft Corporation. Inventors: Timothy Paek, Max Chickering, Eric Badger
-
Patent number: 8856002
Abstract: A universal pattern processing system receives input data and produces the output patterns best associated with that data. The system uses input means for receiving and processing input data, universal pattern decoder means for transforming models using the input data and associating output patterns with the original models that change least during transforming, and output means for outputting the best associated patterns chosen by the pattern decoder means.
Type: Grant. Filed: April 11, 2008. Date of Patent: October 7, 2014. Assignee: International Business Machines Corporation. Inventors: Dimitri Kanevsky, David Nahamoo, Tara N. Sainath
-
Patent number: 8849664
Abstract: Methods, systems, and computer programs encoded on a computer storage medium for real-time acoustic adaptation using stability measures are disclosed. The methods include the actions of receiving a transcription of a first portion of a speech session, wherein the transcription of the first portion of the speech session is generated using a speaker adaptation profile. The actions further include receiving a stability measure for a segment of the transcription and determining that the stability measure for the segment satisfies a threshold. Additionally, the actions include triggering an update of the speaker adaptation profile using the segment, or using a portion of speech data that corresponds to the segment. And the actions include receiving a transcription of a second portion of the speech session, wherein the transcription of the second portion of the speech session is generated using the updated speaker adaptation profile.
Type: Grant. Filed: July 16, 2013. Date of Patent: September 30, 2014. Assignee: Google Inc. Inventors: Xin Lei, Petar Aleksic
-
Patent number: 8843371
Abstract: The instant application includes computationally-implemented systems and methods that include managing adaptation data, the adaptation data being at least partly based on at least one speech interaction of a particular party; facilitating transmission of the adaptation data to a target device when there is an indication of a speech-facilitated transaction between the target device and the particular party, such that the adaptation data is to be applied to the target device to assist in execution of the speech-facilitated transaction; and facilitating acquisition of adaptation result data that is based on at least one aspect of the speech-facilitated transaction and is to be used in determining whether to modify the adaptation data. In addition to the foregoing, other aspects are described in the claims, drawings, and text.
Type: Grant. Filed: August 1, 2012. Date of Patent: September 23, 2014. Assignee: Elwha LLC. Inventors: Royce A. Levien, Richard T. Lord, Robert W. Lord, Mark A. Malamud
-
Patent number: 8843367
Abstract: An adaptive equalization system that adjusts the spectral shape of a speech signal based on an intelligibility measurement of the speech signal may improve the intelligibility of the output speech signal. Such an adaptive equalization system may include a speech intelligibility measurement module, a spectral shape adjustment module, and an adaptive equalization module. The speech intelligibility measurement module is configured to calculate a speech intelligibility measurement of a speech signal. The spectral shape adjustment module is configured to generate a weighted long-term speech curve based on a first predetermined long-term average speech curve, a second predetermined long-term average speech curve, and the speech intelligibility measurement. The adaptive equalization module is configured to adapt equalization coefficients for the speech signal based on the weighted long-term speech curve.
Type: Grant. Filed: May 4, 2012. Date of Patent: September 23, 2014. Assignee: 8758271 Canada Inc. Inventors: Phillip Alan Hetherington, Xueman Li
-
Publication number: 20140278412
Abstract: Characterizing an acoustic signal includes extracting a vector from the acoustic signal, where the vector contains information about the nuisance characteristics present in the acoustic signal, and computing a set of likelihoods of the vector for a plurality of classes that model a plurality of nuisance characteristics. Training a system to characterize an acoustic signal includes obtaining training data, the training data comprising a plurality of acoustic signals, where each of the plurality of acoustic signals is associated with one of a plurality of classes that indicates a presence of a specific type of nuisance characteristic, transforming each of the plurality of acoustic signals into a vector that summarizes information about the acoustic characteristics of the signal, to produce a plurality of vectors, and labeling each of the plurality of vectors with one of the plurality of classes.
Type: Application. Filed: March 15, 2013. Publication date: September 18, 2014. Applicant: SRI International. Inventors: Nicolas Scheffer, Luciana Ferrer
-
Publication number: 20140278414
Abstract: Updating a voice template for recognizing a speaker on the basis of a voice uttered by the speaker is disclosed. Stored voice templates indicate distinctive characteristics of utterances from speakers. Distinctive characteristics are extracted for a specific speaker based on a voice message utterance received from that speaker. The distinctive characteristics are compared to the characteristics indicated by the stored voice templates to select a template that matches within a predetermined threshold. The selected template is updated on the basis of the extracted characteristics.
Type: Application. Filed: May 27, 2014. Publication date: September 18, 2014. Applicant: International Business Machines Corporation. Inventors: Yukari Miki, Masami Noguchi
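The match-within-threshold-then-update loop in this abstract can be sketched as follows. Euclidean distance and a running-average update are stand-ins chosen for illustration; the publication does not specify the comparison or update rules:

```python
def match_and_update(templates, features, threshold=1.0, rate=0.2):
    """Find the stored template closest to the extracted characteristics;
    if it matches within the threshold, nudge it toward the new features
    (running average) and return its key, else return None."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    best = min(templates, key=lambda k: dist(templates[k], features))
    if dist(templates[best], features) <= threshold:
        templates[best] = [t + rate * (f - t)
                           for t, f in zip(templates[best], features)]
        return best
    return None

templates = {"t1": [0.0, 0.0], "t2": [5.0, 5.0]}
print(match_and_update(templates, [0.4, 0.0]))   # "t1" matches and is updated
print(match_and_update(templates, [10.0, 10.0])) # no template within threshold
```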
-
Patent number: 8838448
Abstract: A method is described for use with automatic speech recognition using discriminative criteria for speaker adaptation. An adaptation evaluation of speech recognition performance data is performed for speech recognition system users. Based on the adaptation evaluation, adaptation candidate users are identified for whom an adaptation process is likely to improve system performance.
Type: Grant. Filed: April 5, 2012. Date of Patent: September 16, 2014. Assignee: Nuance Communications, Inc. Inventors: Dan Ning Jiang, Vaibhava Goel, Dimitri Kanevsky, Yong Qin
-
Publication number: 20140257811
Abstract: A method for refining a search is provided. Embodiments may include receiving a first speech signal corresponding to a first utterance and receiving a second speech signal corresponding to a second utterance, wherein the second utterance is a refinement to the first utterance. Embodiments may also include identifying information associated with the first speech signal as first speech signal information and identifying information associated with the second speech signal as second speech signal information. Embodiments may also include determining a first quantity of search results based upon the first speech signal information and determining a second quantity of search results based upon the second speech signal information.
Type: Application. Filed: March 11, 2013. Publication date: September 11, 2014. Applicant: Nuance Communications, Inc. Inventor: Jean-Francois Lavallee
-
Publication number: 20140249818
Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
Type: Application. Filed: May 16, 2014. Publication date: September 4, 2014. Applicant: MModal IP LLC. Inventors: Girija Yegnanarayanan, Michael Finke, Juergen Fritsch, Detlef Koll, Monika Woszczyna
-
Publication number: 20140244256
Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and, if the received speech matches data in the precompiled grammar, returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, a new grammar is dynamically compiled based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
Type: Application. Filed: May 2, 2014. Publication date: August 28, 2014. Applicant: AT&T Intellectual Property II, L.P. Inventors: Harry Blanchard, Steven H. Lewis, Sivaprasad Shankarnarayan, Lan Zhang
-
Publication number: 20140244255
Abstract: A semiconductor integrated circuit device for speech recognition includes a conversion candidate setting unit that receives text data indicating words or sentences together with a command and sets the text data in a conversion list in accordance with the command; a standard pattern extracting unit that extracts, from a speech recognition database, a standard pattern corresponding to at least a part of the words or sentences indicated by the text data that is set in the conversion list; a signal processing unit that extracts frequency components of an input speech signal and generates a feature pattern indicating distribution of the frequency components; and a match detecting unit that detects a match between the feature pattern generated from at least a part of the speech signal and the standard pattern and outputs a speech recognition result.
Type: Application. Filed: February 14, 2014. Publication date: August 28, 2014. Applicant: Seiko Epson Corporation. Inventor: Tsutomu Nonaka
-
Patent number: 8818809
Abstract: Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability.
Type: Grant. Filed: June 20, 2013. Date of Patent: August 26, 2014. Assignee: Google Inc. Inventors: Craig L. Reding, Suzi Levas
-
Patent number: 8812317
Abstract: Provided are an apparatus and method for recognizing voice commands, the apparatus including: a voice command recognition unit which recognizes an input voice command; a voice command recognition learning unit which learns a recognition-targeted voice command; and a controller which controls the voice command recognition unit to recognize the recognition-targeted voice command from an input voice command, controls the voice command recognition learning unit to learn the input voice command if the voice command recognition is unsuccessful, and performs a particular operation corresponding to the recognized voice command if the voice command recognition is successful.
Type: Grant. Filed: September 2, 2009. Date of Patent: August 19, 2014. Assignee: Samsung Electronics Co., Ltd. Inventors: Jong-hyuk Jang, Seung-kwon Park, Jong-ho Lea
-
Patent number: 8805684
Abstract: Automatic speech recognition (ASR) may be performed on received utterances. The ASR may be performed by an ASR module of a computing device (e.g., a client device). The ASR may include: generating feature vectors based on the utterances, updating the feature vectors based on feature-space speaker adaptation parameters, transcribing the utterances to text strings, and updating the feature-space speaker adaptation parameters based on the feature vectors. The transcriptions may be based, at least in part, on an acoustic model and the updated feature vectors. Updated speaker adaptation parameters may be received from another computing device and incorporated into the ASR module.
Type: Grant. Filed: October 17, 2012. Date of Patent: August 12, 2014. Assignee: Google Inc. Inventors: Petar Aleksic, Xin Lei
-
Patent number: 8798994
Abstract: The present invention discloses a solution for conserving computing resources when implementing transformation based adaptation techniques. The disclosed solution limits the amount of speech data used by real-time adaptation algorithms to compute a transformation, which results in substantial computational savings. Appreciably, application of a transform is a relatively low memory and computationally cheap process compared to memory and resource requirements for computing the transform to be applied.
Type: Grant. Filed: February 6, 2008. Date of Patent: August 5, 2014. Assignee: International Business Machines Corporation. Inventors: John W. Eckhart, Michael Florio, Radek Hampl, Pavel Krbec, Jonathan Palgon
-
Patent number: 8798995
Abstract: Topics of potential interest to a user, useful for purposes such as targeted advertising and product recommendations, can be extracted from voice content produced by a user. A computing device can capture voice content, such as when a user speaks into or near the device. One or more sniffer algorithms or processes can attempt to identify trigger words in the voice content, which can indicate a level of interest of the user. For each identified potential trigger word, the device can capture adjacent audio that can be analyzed, on the device or remotely, to attempt to determine one or more keywords associated with that trigger word. The identified keywords can be stored and/or transmitted to an appropriate location accessible to entities such as advertisers or content providers who can use the keywords to attempt to select or customize content that is likely relevant to the user.
Type: Grant. Filed: September 23, 2011. Date of Patent: August 5, 2014. Assignee: Amazon Technologies, Inc. Inventor: Kiran K. Edara
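One way to picture the sniffer step, capturing content adjacent to each trigger word, is a windowed scan over a transcript. Working at the word level with a fixed window is a simplifying assumption; the patent describes capturing adjacent audio:

```python
def sniff(transcript_words, triggers, window=2):
    """For each trigger word found, capture the adjacent words as
    candidate keywords (window size is an illustrative choice)."""
    hits = {}
    for i, w in enumerate(transcript_words):
        if w in triggers:
            lo, hi = max(0, i - window), i + window + 1
            hits[w] = [x for x in transcript_words[lo:hi] if x != w]
    return hits

words = "i really love hiking boots for winter".split()
print(sniff(words, {"love"}))
```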
-
Publication number: 20140207459
Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available, and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
Type: Application. Filed: March 26, 2014. Publication date: July 24, 2014. Applicant: AT&T Intellectual Property II, L.P. Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose
-
Patent number: 8781821
Abstract: A method is disclosed for controlling a voice-activated device by interpreting a spoken command as a series of voiced and non-voiced intervals. A responsive action is then performed according to the number of voiced intervals in the command. The method is well-suited to applications having a small number of specific voice-activated response functions. Applications using the inventive method offer numerous advantages over traditional speech recognition systems including speaker universality, language independence, no training or calibration needed, implementation with simple microcontrollers, and extremely low cost. For time-critical applications such as pulsers and measurement devices, where fast reaction is crucial to catch a transient event, the method provides near-instantaneous command response, yet versatile voice control.
Type: Grant. Filed: April 30, 2012. Date of Patent: July 15, 2014. Assignee: Zanavox. Inventor: David Edward Newman
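Counting voiced intervals reduces to counting runs of frames above an energy threshold. A minimal sketch, assuming per-frame energies are already available (the patent's actual voicing detector is not specified here):

```python
def count_voiced_intervals(energies, threshold=0.5):
    """Count runs of frames whose energy exceeds the threshold;
    each run is one voiced interval of the spoken command."""
    count, in_voiced = 0, False
    for e in energies:
        if e > threshold and not in_voiced:
            count += 1
            in_voiced = True
        elif e <= threshold:
            in_voiced = False
    return count

# Two voiced bursts separated by silence -> 2 intervals,
# which would select the device's second response function.
print(count_voiced_intervals([0.9, 0.8, 0.1, 0.1, 0.7, 0.9]))
```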
-
Patent number: 8781831
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
Type: Grant. Filed: September 5, 2013. Date of Patent: July 15, 2014. Assignee: AT&T Intellectual Property I, L.P. Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
-
Patent number: 8775177
Abstract: A speech recognition process may perform the following operations: performing a preliminary recognition process on first audio to identify candidates for the first audio; generating first templates corresponding to the first audio, where each first template includes a number of elements; selecting second templates corresponding to the candidates, where the second templates represent second audio, and where each second template includes elements that correspond to the elements in the first templates; comparing the first templates to the second templates, where the comparing includes generating similarity metrics between the first templates and corresponding second templates; applying weights to the similarity metrics to produce weighted similarity metrics, where the weights are associated with corresponding second templates; and using the weighted similarity metrics to determine whether the first audio corresponds to the second audio.
Type: Grant. Filed: October 31, 2012. Date of Patent: July 8, 2014. Assignee: Google Inc. Inventors: Georg Heigold, Patrick An Phu Nguyen, Mitchel Weintraub, Vincent O. Vanhoucke
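The weighted-similarity comparison above can be sketched as below. The similarity function (distance mapped into (0, 1]) and the per-candidate weights are illustrative stand-ins, since the abstract does not define them:

```python
def weighted_match(first, candidates, weights):
    """first: template from the input audio; candidates: templates for
    candidate words; weights: per-candidate weights. Returns the index
    of the candidate with the highest weighted similarity."""
    def sim(a, b):
        d = sum((x - y) ** 2 for x, y in zip(a, b))
        return 1.0 / (1.0 + d)  # similarity metric in (0, 1]

    scores = [w * sim(first, c) for c, w in zip(candidates, weights)]
    return max(range(len(scores)), key=scores.__getitem__)

cands = [[1.0, 0.0], [0.0, 1.0]]
print(weighted_match([0.9, 0.1], cands, [1.0, 1.0]))  # candidate 0 is closer
```

With unequal weights the decision can flip, which is the point of weighting: prior evidence about a candidate can outvote raw acoustic similarity.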
-
Patent number: 8775178
Abstract: Updating a voice template for recognizing a speaker on the basis of a voice uttered by the speaker is disclosed. Stored voice templates indicate distinctive characteristics of utterances from speakers. Distinctive characteristics are extracted for a specific speaker based on a voice message utterance received from that speaker. The distinctive characteristics are compared to the characteristics indicated by the stored voice templates to select a template that matches within a predetermined threshold. The selected template is updated on the basis of the extracted characteristics.
Type: Grant. Filed: October 27, 2009. Date of Patent: July 8, 2014. Assignee: International Business Machines Corporation. Inventors: Yukari Miki, Masami Noguchi
-
Patent number: 8768698
Abstract: Methods and systems for speech recognition processing are described. In an example, a computing device may be configured to receive information indicative of a frequency of submission to a search engine of a search query composed of a sequence of words. Based on the frequency of submission of the search query exceeding a threshold, the computing device may be configured to determine groupings of one or more words of the search query based on an order in which the one or more words occur in the sequence of words of the search query. Further, the computing device may be configured to provide information indicating the groupings to a speech recognition system.
Type: Grant. Filed: September 24, 2013. Date of Patent: July 1, 2014. Assignee: Google Inc. Inventors: Pedro J. Moreno Mengibar, Jeffrey Scott Sorensen, Eugene Weinstein
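The frequency-gated grouping step might look like the following sketch. Adjacent bigrams are used as the order-preserving groupings; the patent's actual grouping criterion is not specified at this level of detail:

```python
def groupings_for_query(query, frequency, threshold=100):
    """If the query is submitted often enough, emit word groupings
    (here: adjacent bigrams, in order) to pass to the recognizer."""
    if frequency <= threshold:
        return []
    words = query.split()
    return [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]

print(groupings_for_query("new york pizza places", 2500))
print(groupings_for_query("rare query", 5))  # below threshold: no groupings
```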
-
Patent number: 8762149
Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice, comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify (10) that the speaker's voice corresponds to the speaker whose identity is to be verified, based on the received voice utterance; c) verifying (12, 13) that the received voice utterance is not falsified, preferably after having verified the speaker's voice; and d) accepting (16) the speaker's identity in case both verification steps give a positive result, and not accepting (15) it if either verification step gives a negative result. The invention further refers to a corresponding computer readable medium and a computer.
Type: Grant. Filed: December 10, 2008. Date of Patent: June 24, 2014. Inventors: Marta Sánchez Asenjo, Alfredo Gutiérrez Navarro, Alberto Martín de los Santos de las Heras, Marta García Gomar
-
Patent number: 8762148
Abstract: A method and apparatus for carrying out adaptation using input speech data information even at a low reference pattern recognition performance. A reference pattern adaptation device 2 includes a speech recognition section 18, an adaptation data calculating section 19 and a reference pattern adaptation section 20. The speech recognition section 18 calculates a recognition result teacher label from the input speech data and the reference pattern. The adaptation data calculating section 19 calculates adaptation data composed of a teacher label and speech data. The adaptation data is composed of the input speech data and the recognition result teacher label corrected for adaptation by the recognition error knowledge, which is the statistical information of the tendency towards recognition errors of the reference pattern. The reference pattern adaptation section 20 adapts the reference pattern using the adaptation data to generate an adaptation pattern.
Type: Grant. Filed: February 16, 2007. Date of Patent: June 24, 2014. Assignee: NEC Corporation. Inventor: Yoshifumi Onishi
-
Patent number: 8751232
Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. A particular method includes detecting that a frequency of occurrence of a particular type of utterance satisfies a threshold. The method further includes tuning a speech recognition system with respect to the particular type of utterance.
Type: Grant. Filed: February 6, 2013. Date of Patent: June 10, 2014. Assignee: AT&T Intellectual Property I, L.P. Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
-
Patent number: 8751226
Abstract: A speech processing apparatus 101 includes a recognition feature extracting unit 12 that extracts, from a speech recognition result 15 obtained by performing a speech recognition process on an inputted speech, recognition feature information which is a characteristic of that result; a language feature extracting unit 11 that extracts, from a pre-registered language resource 14, language feature information which is a characteristic of the language resource; and a model learning unit 13 that obtains a verification model 16 by a learning process based on the extracted recognition feature information and language feature information.
Type: Grant. Filed: June 18, 2007. Date of Patent: June 10, 2014. Assignee: NEC Corporation. Inventors: Hitoshi Yamamoto, Kiyokazu Miki
-
Publication number: 20140156274
Abstract: Methods and systems to translate input labels of arcs of a network, corresponding to a sequence of states of the network, to a list of output grammar elements of the arcs, corresponding to a sequence of grammar elements. The network may include a plurality of speech recognition models combined with a weighted finite state machine transducer (WFST). Traversal may include active arc traversal, and may include active arc propagation. Arcs may be processed in parallel, including arcs originating from multiple source states and directed to a common destination state. Self-loops associated with states may be modeled within outgoing arcs of the states, which may reduce synchronization operations. Tasks may be ordered with respect to cache-data locality to associate tasks with processing threads based at least in part on whether another task associated with a corresponding data object was previously assigned to the thread.
Type: Application. Filed: June 24, 2013. Publication date: June 5, 2014. Inventors: Kisun You, Christopher J. Hughes, Yen-Kuang Chen
-
Publication number: 20140156275
Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
Type: Application
Filed: February 10, 2014
Publication date: June 5, 2014
Applicant: AT&T Intellectual Property II, L.P.
Inventors: Allen Louis Gorin, Dilek Z. Hakkani-Tur, Giuseppe Riccardi
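The selective-sampling step — combine per-word confidences into an utterance score, then send the least-confident utterances to a human — can be sketched as follows. The mean as the combination function, the function names, and the sample scores are assumptions for illustration:

```python
def utterance_confidence(word_confidences):
    """Combine per-word confidence scores into one utterance-level
    score (a simple mean here; the combination is a design choice)."""
    return sum(word_confidences) / len(word_confidences)

def select_for_transcription(utterances, k):
    """Pick the k least-confident utterances for human labeling."""
    ranked = sorted(utterances, key=lambda u: utterance_confidence(u[1]))
    return [text for text, _ in ranked[:k]]

batch = [
    ("call mom",       [0.95, 0.90]),
    ("whether report", [0.40, 0.35]),   # likely misrecognized
    ("set an alarm",   [0.85, 0.80, 0.90]),
]
select_for_transcription(batch, k=1)   # -> ["whether report"]
```

In the iterative scheme the recognizer is retrained after each batch of new transcriptions, so the confidence estimates improve between sampling rounds.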
-
Patent number: 8744850
Abstract: Challenge items for an audible based electronic challenge system are generated using a variety of techniques to identify optimal candidates. The challenge items are intended for use in a computing system that discriminates between humans and text-to-speech (TTS) systems.
Type: Grant
Filed: January 14, 2013
Date of Patent: June 3, 2014
Assignee: John Nicholas and Kristin Gross
Inventor: John Nicholas Gross
-
Patent number: 8738377
Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
Type: Grant
Filed: June 7, 2010
Date of Patent: May 27, 2014
Assignee: Google Inc.
Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas Beeferman
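The match-or-fall-back-to-search flow above can be sketched in a few lines. The carrier phrases, actions, and prefix-matching rule are all hypothetical stand-ins for the learned list the abstract describes:

```python
# Hypothetical learned mapping from carrier phrases to candidate actions.
carrier_phrases = {
    "call": ["dial contact"],
    "navigate to": ["open maps route"],
}

def handle_command(spoken, search=lambda q: [f"web results for {q!r}"]):
    """Match a spoken command against known carrier phrases;
    fall back to a search when nothing matches."""
    for phrase, actions in carrier_phrases.items():
        if spoken.startswith(phrase):
            return actions
    return search(spoken)

handle_command("call alice")        # -> ["dial contact"]
handle_command("weather tomorrow")  # falls back to web search results
```

The learning loop in the abstract would then watch which presented action the user actually picks and promote that phrase/action pair in the list.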
-
Patent number: 8738375
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
Type: Grant
Filed: May 9, 2011
Date of Patent: May 27, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
-
Patent number: 8732845
Abstract: Systems, methods and articles of manufacture for generating a video such that when another person views the video, the other person can view non-private information but not private information of the person who generated the video. A first interview screen is generated by a financial application and displayed to a first person or user of a financial application. The screen includes private data related to the first person. A video of the interview screen is generated and may be transmitted over a network to a second person who may also utilize a financial application. The video is displayed to the second person, but the second person cannot view the private data related to the first person.
Type: Grant
Filed: May 18, 2012
Date of Patent: May 20, 2014
Assignee: Intuit Inc.
Inventors: Steven C. Barker, Benjamin J. Kanspedos
-
Patent number: 8731922
Abstract: A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.
Type: Grant
Filed: April 30, 2013
Date of Patent: May 20, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Robert Wesley Bossemeyer, Jr.
-
Patent number: 8731924
Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
Type: Grant
Filed: August 8, 2011
Date of Patent: May 20, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventor: Gokhan Tur
-
Publication number: 20140136202
Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; processing the speech data for a pattern of a user competence associated with at least one of task requests and interaction behavior; and selectively updating at least one of a system prompt and an interaction sequence based on the user competence.
Type: Application
Filed: October 22, 2013
Publication date: May 15, 2014
Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Robert D. Sims, III, Timothy J. Grost, Ron M. Hecht, Ute Winter
-
Publication number: 20140136201
Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
Type: Application
Filed: October 22, 2013
Publication date: May 15, 2014
Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III, Ute Winter
-
Publication number: 20140136200
Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: processing a spoken command with one or more models of one or more model types to achieve model results; evaluating a frequency of the model results; and selectively updating the one or more models of the one or more model types based on the evaluating.
Type: Application
Filed: October 22, 2013
Publication date: May 15, 2014
Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Ute Winter, Ron M. Hecht, Timothy J. Grost, Robert D. Sims, III
-
Patent number: 8725509
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to language models stored for digital language processing. In one aspect, a method includes the actions of generating a language model, including: receiving a collection of n-grams from a corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus, and generating a trie representing the collection of n-grams, the trie being represented using one or more arrays of integers, and compressing an array representation of the trie using block encoding; and using the language model to identify a second probability of a particular string of words occurring.
Type: Grant
Filed: June 17, 2009
Date of Patent: May 13, 2014
Assignee: Google Inc.
Inventors: Boulos Harb, Ciprian Chelba, Jeffrey A. Dean, Sanjay Ghemawat
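The two-step structure above — build a trie over the n-gram collection, then walk it to look up a phrase probability — can be sketched with nested dicts. The dict representation, the probability estimate (count over total), and all names are illustrative stand-ins; the patent's integer-array layout and block encoding are not reproduced:

```python
def build_trie(ngram_counts, total):
    """Nested-dict trie over n-grams, storing a probability at the
    node that ends each n-gram (under the reserved key "_p")."""
    root = {}
    for ngram, count in ngram_counts.items():
        node = root
        for word in ngram.split():
            node = node.setdefault(word, {})
        node["_p"] = count / total
    return root

def probability(trie, phrase):
    """Walk the trie word by word; 0.0 if the phrase is unseen."""
    node = trie
    for word in phrase.split():
        if word not in node:
            return 0.0
        node = node[word]
    return node.get("_p", 0.0)

trie = build_trie({"the cat": 3, "the dog": 1}, total=4)
probability(trie, "the cat")   # -> 0.75
```

Packing the same trie into flat integer arrays (one entry per node, child ranges encoded as offsets) is what makes the block-encoding compression step in the abstract possible.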
-
Patent number: 8725511
Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and, if the received speech matches data in the precompiled grammar, returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, a new grammar is dynamically compiled based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
Type: Grant
Filed: July 2, 2013
Date of Patent: May 13, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Harry Blanchard, Steven H. Lewis, Shankarnarayan Sivaprasad, Lan Zhang
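The fallback scheme — try the precompiled grammar first, and only compile the delta of new directory entries on a miss — can be sketched as below. Sets stand in for compiled grammars, and the class and method names are invented for illustration:

```python
class NameGrammar:
    """Precompiled grammar over a directory snapshot; unmatched
    speech triggers a small dynamic compile over new entries only."""

    def __init__(self, directory):
        self.compiled = set(directory)   # stands in for full compilation
        self.snapshot = list(directory)  # database state at compile time

    def recognize(self, heard, directory):
        if heard in self.compiled:
            return heard
        # Miss: compile a grammar over only the entries added since
        # the precompiled grammar was built, then retry.
        new_entries = set(directory) - set(self.snapshot)
        return heard if heard in new_entries else None

g = NameGrammar(["alice", "bob"])
g.recognize("carol", ["alice", "bob", "carol"])   # -> "carol"
```

The payoff is that the expensive full compilation happens rarely, while the per-query dynamic compile covers only the (small) set of names added since.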
-
Patent number: 8719017
Abstract: Speech recognition models are dynamically re-configurable based on user information, background information such as background noise, and transducer information such as transducer response characteristics, to provide users with alternate input modes to keyboard text entry. The techniques of dynamic re-configurable speech recognition provide for deployment of speech recognition on small devices such as mobile phones and personal digital assistants, as well as in environments such as the office, home or vehicle, while maintaining the accuracy of the speech recognition.
Type: Grant
Filed: May 15, 2008
Date of Patent: May 6, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Richard C. Rose, Bojana Gajic
-
Patent number: 8712757
Abstract: A method for communication management includes receiving at least one keyword and receiving a replay time span input. Further, the method includes receiving a plurality of communication inputs including at least a first communication input and a second communication input, monitoring at least the first communication input and second communication input for the at least one keyword, and determining an instantiation of the at least one keyword in at least one of the first communication input and second communication input. Additionally, the method includes associating the determined instantiation with one of the first communication input and second communication input, and providing at least a portion of the communication associated with the determined instantiation based on the replay time span input responsive to the instantiation.
Type: Grant
Filed: January 10, 2007
Date of Patent: April 29, 2014
Assignee: Nuance Communications, Inc.
Inventors: Rick A. Hamilton, II, Peter G. Finn, Christopher J. Dawson, John S. Langford
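The keyword-plus-replay-span behavior can be sketched against a timestamped transcript: when the keyword fires, replay everything from `span` seconds before the hit through the hit. The transcript format, function name, and sample call are all hypothetical:

```python
def replay_on_keyword(transcript, keyword, span):
    """Given (timestamp_seconds, word) pairs, return the words from
    `span` seconds before the first keyword hit through the hit."""
    for t, word in transcript:
        if word == keyword:
            return [w for s, w in transcript if t - span <= s <= t]
    return []

call = [(0, "hello"), (2, "about"), (4, "the"), (5, "budget")]
replay_on_keyword(call, "budget", span=3)   # -> ["about", "the", "budget"]
```

A live system would buffer audio rather than text, monitoring several concurrent communication inputs and replaying the buffered segment of whichever input the keyword instantiation was associated with.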
-
Publication number: 20140114661
Abstract: Methods and systems for speech recognition processing are described. In an example, a computing device may be configured to receive information indicative of the frequency with which a search query, composed of a sequence of words, is submitted to a search engine. Based on the frequency of submission of the search query exceeding a threshold, the computing device may be configured to determine groupings of one or more words of the search query based on an order in which the one or more words occur in the sequence of words of the search query. Further, the computing device may be configured to provide information indicating the groupings to a speech recognition system.
Type: Application
Filed: September 24, 2013
Publication date: April 24, 2014
Applicant: Google Inc.
Inventors: Pedro J. Moreno Mengibar, Jeffrey Scott Sorensen, Eugene Weinstein
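The gating-then-grouping step can be sketched by emitting order-preserving adjacent word pairs for queries whose submission frequency clears the threshold. The choice of adjacent bigrams as the grouping, and all names and numbers, are assumptions for illustration:

```python
def groupings_for_query(query, frequency, threshold, n=2):
    """For a frequently submitted query, emit order-preserving word
    groupings (adjacent n-grams) for use by a speech recognizer."""
    if frequency <= threshold:
        return []
    words = query.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

groupings_for_query("new york pizza", frequency=5000, threshold=100)
# -> ["new york", "york pizza"]
```

Feeding such groupings into the recognizer's language model biases it toward word sequences users actually search for, which is the point of conditioning on submission frequency.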