Patents Examined by David Kovacek
-
Patent number: 9218810
Abstract: Disclosed herein is a system, method and computer-readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment of the invention is a method for classifying utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance; generating a semantic and syntactic graph associated with the received utterance; extracting all n-grams as features from the generated semantic and syntactic graph; and classifying the utterance. Classifying the utterance may be performed in any number of ways, such as using the extracted n-grams, the syntactic and semantic graph, or written rules.
Type: Grant
Filed: April 15, 2014
Date of Patent: December 22, 2015
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
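The claimed pipeline turns an utterance into a semantic/syntactic graph and then harvests every n-gram as a classifier feature. A minimal sketch of the n-gram-extraction step, simplified to a flat token sequence rather than a true graph and using hypothetical part-of-speech tags, might look like:

```python
def extract_ngrams(tokens, max_n=3):
    """Collect every n-gram of length 1..max_n as a feature."""
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(tuple(tokens[i:i + n]))
    return feats

# Hypothetical utterance tokens tagged with syntactic labels.
utterance = ["show/VB", "me/PRP", "flights/NNS"]
features = extract_ngrams(utterance, max_n=2)
# Unigrams and bigrams alike become features for the classifier.
```

In the patent the features come from graph paths, so an n-gram can mix word, part-of-speech, and semantic-tag nodes; the flat version above only shows the harvesting loop.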
-
Patent number: 9202460
Abstract: Methods and apparatus to generate a speech recognition library for use by a speech recognition system are disclosed. An example method comprises identifying a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments; computing a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments; selecting a set of the plurality of audio data segments based on the plurality of difference metrics; identifying a first one of the audio data segments in the set as a representative audio data segment; determining a first phonetic transcription of the representative audio data segment; and adding the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
Type: Grant
Filed: May 14, 2008
Date of Patent: December 1, 2015
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Hisao M. Chang
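The selection step above — score candidate audio segments against a baseline, keep the close ones, then pick one representative — can be sketched with a plain Euclidean distance standing in for the patent's audio difference metric. The fixed-length feature vectors, the threshold, and the medoid choice are all assumptions:

```python
import math

def diff_metric(a, b):
    """Euclidean distance between two fixed-length feature vectors,
    a stand-in for the patent's audio difference metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def representative_segment(baseline, segments, max_diff=5.0):
    """Keep segments close to the baseline, then return the medoid
    (the segment with the smallest total distance to the others)."""
    kept = [s for s in segments if diff_metric(baseline, s) <= max_diff]
    return min(kept, key=lambda s: sum(diff_metric(s, o) for o in kept))
```

A real system would compare cepstral or similar acoustic features per segment; the medoid choice here is just one plausible way to pick "a first one of the audio data segments in the set" as representative.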
-
Patent number: 9171539
Abstract: Embodiments of the invention address the deficiencies of the prior art by providing a method, apparatus, and program product for converting components of a web page to voice prompts for a user. In some embodiments, the method comprises selectively determining at least one HTML component from a plurality of HTML components of a web page to transform into a voice prompt for a mobile system based upon a voice attribute file associated with the web page. The method further comprises transforming the at least one HTML component into parameterized data suitable for use by the mobile system based upon at least a portion of the voice attribute file associated with the at least one HTML component, and transmitting the parameterized data to the mobile system.
Type: Grant
Filed: March 26, 2015
Date of Patent: October 27, 2015
Assignee: Vocollect, Inc.
Inventors: Paul M. Funyak, Norman J. Connors, Paul E. Kolonay, Matthew Aaron Nichols
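The transform described above — select only the HTML components named in the voice attribute file and turn them into prompt parameters — might look roughly like this. The component dicts, the attribute-file schema, and the `template`/`voice` keys are all hypothetical:

```python
def to_voice_prompts(components, voice_attributes):
    """Turn the HTML components named in the voice attribute file
    into parameterized prompt dicts; skip everything else."""
    prompts = []
    for comp in components:
        attrs = voice_attributes.get(comp["id"])
        if attrs is None:
            continue  # component not selected for voice output
        prompts.append({
            "text": attrs.get("template", "{}").format(comp["text"]),
            "voice": attrs.get("voice", "default"),
        })
    return prompts

# Hypothetical page: a quantity field is voiced, an ad banner is not.
components = [{"id": "qty", "text": "5"}, {"id": "ad", "text": "Buy now"}]
attrs = {"qty": {"template": "Pick {} items", "voice": "worker"}}
prompts = to_voice_prompts(components, attrs)
```

The point of the design is that the web page itself stays unchanged; the voice attribute file decides which components become prompts and how they are parameterized for the mobile device.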
-
Patent number: 9165554
Abstract: A method, apparatus and machine-readable medium are provided. A phonotactic grammar is utilized to perform speech recognition on received speech and to generate a phoneme lattice. A document shortlist is generated by using the phoneme lattice to query an index. A grammar is generated from the document shortlist. Data for each of at least one input field is identified based on the received speech and the generated grammar.
Type: Grant
Filed: December 4, 2014
Date of Patent: October 20, 2015
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Cyril Georges Luc Allauzen, Sarangarajan Parthasarathy
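The index-query step can be sketched with a toy inverted index mapping phoneme bigrams to documents, queried with hypothesis paths read off the lattice. The bigram granularity and the counting score are assumptions; the patent does not specify them:

```python
from collections import defaultdict

def build_index(docs):
    """Inverted index: phoneme bigram -> set of document ids."""
    index = defaultdict(set)
    for doc_id, phones in docs.items():
        for i in range(len(phones) - 1):
            index[(phones[i], phones[i + 1])].add(doc_id)
    return index

def shortlist(index, lattice_paths, top_k=2):
    """Score documents by how many lattice-path bigrams they share,
    and return the highest-scoring ones as the shortlist."""
    scores = defaultdict(int)
    for path in lattice_paths:
        for i in range(len(path) - 1):
            for doc_id in index.get((path[i], path[i + 1]), ()):
                scores[doc_id] += 1
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The shortlist then seeds a small recognition grammar, so the second recognition pass only has to discriminate among a handful of candidate field values instead of the whole index.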
-
Patent number: 9165555
Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
Type: Grant
Filed: November 26, 2014
Date of Patent: October 20, 2015
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Vincent Goffin, Andrej Ljolje, Murat Saraclar
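This is the classic vocal-tract-length-normalization loop: warp the spectrum by each candidate factor and keep whichever warp best matches the model. A rough sketch, with a simple linear frequency rescaling and a spectral distance standing in for the likelihood comparison against a real speech model:

```python
import numpy as np

def warp(spectrum, alpha):
    """Linearly rescale the frequency axis of a magnitude spectrum
    (a simplification of the piecewise warps used in practice)."""
    n = len(spectrum)
    src = np.clip(np.arange(n) * alpha, 0, n - 1)
    return np.interp(src, np.arange(n), spectrum)

def best_warping_factor(spectrum, model_spectrum, factors):
    """Try each candidate warping factor; keep the closest match."""
    best, best_dist = None, float("inf")
    for a in factors:
        d = float(np.sum((warp(spectrum, a) - model_spectrum) ** 2))
        if d < best_dist:
            best, best_dist = a, d
    return best
```

In the patented method the comparison is against a trained speech model per speaker-specific segment, not a reference spectrum, but the iterate-and-keep-the-best structure is the same.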
-
Patent number: 9159338
Abstract: Systems and methods of rendering a textual animation are provided. The methods include receiving an audio sample of an audio signal that is being rendered by a media rendering source. The methods also include receiving one or more descriptors for the audio signal based on at least one of a semantic vector, an audio vector, and an emotion vector. Based on the one or more descriptors, a client device may render the textual transcriptions of vocal elements of the audio signal in an animated manner. The client device may further render the textual transcriptions of the vocal elements of the audio signal to be substantially in synchrony to the audio signal being rendered by the media rendering source. In addition, the client device may further receive an identification of a song corresponding to the audio sample, and may render lyrics of the song in an animated manner.
Type: Grant
Filed: December 3, 2010
Date of Patent: October 13, 2015
Assignee: Shazam Entertainment Ltd.
Inventors: Rahul Powar, Avery Li-Chun Wang
-
Patent number: 9135797
Abstract: A method of identifying incidents using mobile devices can include receiving a communication from each of a plurality of mobile devices. Each communication can specify information about a detected sound. Spatial and temporal information can be identified from each communication as well as an indication of a sound signature matching the detected sound. The communications can be compared with a policy specifying spatial and temporal requirements relating to the sound signature indicated by the communications. A notification can be selectively sent according to the comparison.
Type: Grant
Filed: December 28, 2006
Date of Patent: September 15, 2015
Assignee: International Business Machines Corporation
Inventors: Christopher C. Couper, Neil A. Katz, Victor S. Moore
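The policy comparison could amount to clustering: fire a notification only when enough reports of the same sound signature fall within a spatial radius and a time window. A sketch under that assumption, with planar coordinates and hypothetical report/policy fields:

```python
import math

def matches_policy(reports, policy):
    """True when at least min_reports matching-signature reports fall
    inside the policy's radius and time window of some anchor report."""
    hits = [r for r in reports if r["signature"] == policy["signature"]]
    for anchor in hits:
        close = [r for r in hits
                 if abs(r["time"] - anchor["time"]) <= policy["window_s"]
                 and math.dist((r["x"], r["y"]),
                               (anchor["x"], anchor["y"])) <= policy["radius"]]
        if len(close) >= policy["min_reports"]:
            return True
    return False
```

Requiring multiple corroborating devices is what lets the system distinguish a real incident (say, breaking glass heard across a block) from a single spurious match on one phone.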
-
Patent number: 9129599
Abstract: A method, executed on a server serving presence information, for providing dynamically loaded speech recognition parameters to a speech recognition engine can be provided. The method can include storing at least one rule for selecting speech recognition parameters, wherein a rule comprises an if-portion including criteria and a then-portion specifying the speech recognition parameters that must be used when the criteria are met. The method can further include receiving notice that a speech recognition session has been initiated between a user and the speech recognition engine. The method can further include selecting a first set of speech recognition parameters responsive to executing the at least one rule and providing the first set of speech recognition parameters to the speech recognition engine for performing speech recognition of the user.
Type: Grant
Filed: October 18, 2007
Date of Patent: September 8, 2015
Assignee: Nuance Communications, Inc.
Inventors: Girish Dhanakshirur, Baiju D. Mandalia, Wendi L. Nusbickel
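The if-portion/then-portion rule structure maps naturally onto a first-match rule engine. A minimal sketch, with the rule schema, session fields, and parameter names all invented for illustration:

```python
def select_parameters(rules, session, defaults):
    """Return the then-portion of the first rule whose if-portion
    criteria all hold for this session; otherwise the defaults."""
    for rule in rules:
        if all(session.get(k) == v for k, v in rule["if"].items()):
            return rule["then"]
    return defaults

# Hypothetical rules: criteria in "if", parameters to load in "then".
rules = [
    {"if": {"channel": "mobile"},
     "then": {"beam_width": 8, "noise_model": "car"}},
    {"if": {"locale": "en-GB"},
     "then": {"acoustic_model": "uk_english"}},
]
```

In the patented setup the rules would be evaluated when the server is notified that a recognition session has started, and the winning parameter set is pushed to the engine before recognition begins.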
-
Patent number: 9129601
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for dialog modeling. The method includes receiving spoken dialogs annotated to indicate dialog acts and task/subtask information, parsing the spoken dialogs with a hierarchical, parse-based dialog model which operates incrementally from left to right and which only analyzes a preceding dialog context to generate parsed spoken dialogs, and constructing a functional task structure of the parsed spoken dialogs. The method can further either interpret user utterances with the functional task structure of the parsed spoken dialogs or plan system responses to user utterances with the functional task structure of the parsed spoken dialogs. The parse-based dialog model can be a shift-reduce model, a start-complete model, or a connection path model.
Type: Grant
Filed: November 26, 2008
Date of Patent: September 8, 2015
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Amanda Stent, Srinivas Bangalore
-
Patent number: 9081760
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for collecting web data in order to create diverse language models. A system configured to practice the method first crawls, such as via a crawler operating on a computing device, a set of documents in a network of interconnected devices according to a visitation policy, wherein the visitation policy is configured to focus on novelty regions for a current language model built from previous crawling cycles by crawling documents whose vocabulary is considered likely to fill gaps in the current language model. A language model from a previous cycle can be used to guide the creation of a language model in the following cycle. The novelty regions can include documents with high perplexity values over the current language model.
Type: Grant
Filed: March 8, 2011
Date of Patent: July 14, 2015
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Luciano De Andrade Barbosa, Srinivas Bangalore
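Ranking documents by perplexity under the current model is the core of the visitation policy: high perplexity means the model explains the document poorly, so crawling it should fill a gap. A sketch with a unigram model and add-one smoothing standing in for the full language model:

```python
import math
from collections import Counter

def perplexity(tokens, lm, vocab_size):
    """Unigram perplexity with add-one smoothing -- a stand-in for
    whatever language model the crawler actually maintains."""
    total = sum(lm.values())
    log_sum = 0.0
    for t in tokens:
        p = (lm.get(t, 0) + 1) / (total + vocab_size)
        log_sum += math.log(p)
    return math.exp(-log_sum / len(tokens))

def pick_novel_docs(docs, lm, vocab_size, top_k=1):
    """Visitation policy sketch: prefer the documents the current
    model finds most surprising (highest perplexity)."""
    ranked = sorted(docs, key=lambda d: perplexity(d, lm, vocab_size),
                    reverse=True)
    return ranked[:top_k]
```

After each crawl cycle the model is retrained on the newly collected text, so the perplexity ranking in the next cycle steers the crawler toward regions the growing model still covers badly.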
-
Patent number: 9058812
Abstract: In a speech encoder/decoder, a pitch delay contour endpoint modifier is employed to shift the endpoints of a pitch delay interpolation curve up or down. In particular, the endpoints of the pitch delay interpolation curve are shifted based on a variation and/or a standard deviation in pitch delay.
Type: Grant
Filed: July 27, 2005
Date of Patent: June 16, 2015
Assignee: Google Technology Holdings LLC
Inventors: James P. Ashley, Udar Mittal
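In CELP-style coders the pitch delay is interpolated across a frame between two endpoints; this patent shifts those endpoints based on how much the pitch delay varies. The heuristic below (shift toward the recent mean when the standard deviation crosses a threshold) is a hypothetical illustration, not the claimed rule:

```python
import statistics

def adjust_endpoint(endpoint, recent_delays, threshold=2.0, step=1):
    """Shift the interpolation endpoint toward the recent mean when
    the pitch delay varies strongly (hypothetical heuristic)."""
    if len(recent_delays) < 2:
        return endpoint
    if statistics.stdev(recent_delays) > threshold:
        mean = statistics.mean(recent_delays)
        return endpoint + step if mean > endpoint else endpoint - step
    return endpoint

def interpolate_delay(start, end, n_subframes=4):
    """Linear pitch delay contour across the frame's subframes."""
    return [start + (end - start) * i / (n_subframes - 1)
            for i in range(n_subframes)]
```

The payoff of getting the endpoints right is that the interpolated contour tracks the true pitch more closely, which reduces the residual the adaptive codebook has to encode.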
-
Patent number: 9047860
Abstract: A method for concatenating a first frame of samples and a subsequent second frame of samples, the method comprising applying a phase filter adapted to minimize a discontinuity at a boundary between the first and second frames of samples.
Type: Grant
Filed: January 31, 2006
Date of Patent: June 2, 2015
Assignee: Skype
Inventor: Soren Andersen
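The patent's mechanism is a phase filter; as a much simpler stand-in that shows the goal — smoothing the jump where two frames meet — a short linear crossfade over the boundary works:

```python
def concatenate(frame1, frame2, overlap=4):
    """Join two frames with a linear crossfade over the boundary --
    a simple stand-in for the patent's phase-filter approach."""
    out = list(frame1[:-overlap])
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # fade-in weight for frame2
        out.append((1 - w) * frame1[len(frame1) - overlap + i]
                   + w * frame2[i])
    out.extend(frame2[overlap:])
    return out
```

A crossfade attenuates the discontinuity by mixing amplitudes; a phase filter instead adjusts the phase of one frame so the waveforms align at the boundary, which avoids the slight energy dip a crossfade can introduce.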
-
Patent number: 9009048
Abstract: A speech recognition method, medium, and system. The method includes detecting an energy change of each frame making up signals including speech and non-speech signals, and identifying a speech segment corresponding to frames that include only speech signals from among the frames based on the detected energy change.
Type: Grant
Filed: August 1, 2007
Date of Patent: April 14, 2015
Assignees: Samsung Electronics Co., Ltd.; Apple Inc.
Inventors: Giljin Jang, Jeongsu Kim, John S. Bridle, Melvyn J. Hunt
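Energy-based speech/non-speech segmentation reduces to computing per-frame energy and flagging frames whose energy rises well above the noise floor. A minimal sketch (the frame length and the ratio threshold are assumptions):

```python
def frame_energies(samples, frame_len=160):
    """Sum-of-squares energy per non-overlapping frame."""
    return [sum(s * s for s in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def speech_frames(energies, ratio=4.0):
    """Flag frames whose energy jumps well above the quietest frame,
    treating that minimum as an estimate of the noise floor."""
    floor = min(energies) + 1e-9
    return [e / floor > ratio for e in energies]
```

Runs of flagged frames then form the speech segment handed to the recognizer; the claimed method works from the energy *change* between frames, which this floor-ratio test only approximates.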
-
Patent number: 8996384
Abstract: Embodiments of the invention address the deficiencies of the prior art by providing a method, apparatus, and program product for converting components of a web page to voice prompts for a user. In some embodiments, the method comprises selectively determining at least one HTML component from a plurality of HTML components of a web page to transform into a voice prompt for a mobile system based upon a voice attribute file associated with the web page. The method further comprises transforming the at least one HTML component into parameterized data suitable for use by the mobile system based upon at least a portion of the voice attribute file associated with the at least one HTML component, and transmitting the parameterized data to the mobile system.
Type: Grant
Filed: October 30, 2009
Date of Patent: March 31, 2015
Assignee: Vocollect, Inc.
Inventors: Paul M. Funyak, Norman J. Connors, Paul E. Kolonay, Matthew Aaron Nichols
-
Patent number: 8983830
Abstract: An encoding device can achieve both highly effective encoding/decoding and high-quality decoded audio when executing scalable stereo audio encoding using MDCT and ICP. In the encoding device, an MDCT converter executes an MDCT conversion on a residual signal of the left channel/right channel subjected to window processing. An MDCT converter executes an MDCT conversion on the monaural residual signal which has been subjected to the window processing. An ICP analyzer executes an ICP analysis by using the correlation between a frequency coefficient of a high-band portion of the left channel/right channel and a frequency coefficient of a high-band portion of the monaural residual signal so as to generate an ICP parameter of the left channel/right channel residual signal. An ICP parameter quantizer quantizes each of the ICP parameters. A low-band encoding unit executes highly-accurate encoding on the frequency coefficient of the low-band portion of the left channel/right channel residual signal.
Type: Grant
Filed: March 28, 2008
Date of Patent: March 17, 2015
Assignee: Panasonic Intellectual Property Corporation of America
Inventors: Jiong Zhou, Kok Seng Chong, Koji Yoshida
-
Patent number: 8965763
Abstract: Training data from a plurality of utterance-to-text-string mappings of an automatic speech recognition (ASR) system may be selected. Parameters of the ASR system that characterize the utterances and their respective mappings may be determined through application of a first acoustic model and a language model. A second acoustic model and the language model may be applied to the selected training data utterances to determine a second set of utterance-to-text-string mappings. The first set of utterance-to-text-string mappings may be compared to the second set of utterance-to-text-string mappings, and the parameters of the ASR system may be updated based on the comparison.
Type: Grant
Filed: May 1, 2012
Date of Patent: February 24, 2015
Assignee: Google Inc.
Inventors: Ciprian Ioan Chelba, Brian Strope, Preethi Jyothi, Leif Johnson
-
Patent number: 8959014
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models. Speech data and data identifying a transcription for the speech data are received. A phonetic representation for the transcription is accessed. Training sequences are identified for a particular phone in the phonetic representation. Each of the training sequences includes a different set of contextual phones surrounding the particular phone. A partitioning key is identified based on a sequence of phones that occurs in each of the training sequences. A processing module to which the identified partitioning key is assigned is selected. Data identifying the training sequences and a portion of the speech data are transmitted to the selected processing module.
Type: Grant
Filed: June 29, 2012
Date of Patent: February 17, 2015
Assignee: Google Inc.
Inventors: Peng Xu, Fernando Pereira, Ciprian I. Chelba
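The partitioning idea — derive a key from a phone sequence common to all of a phone's training sequences, so related work always lands on the same processing module — resembles MapReduce-style sharding. A sketch in which the key choice (the shortest context window, which occurs inside every longer one) and the hash are assumptions:

```python
import zlib

def partition_module(training_sequences, n_modules):
    """Route all training sequences that share the same central
    context window to one processing module (sketch)."""
    # The shortest context window occurs inside every longer one,
    # so it can serve as the shared partitioning key.
    key = " ".join(min(training_sequences, key=len))
    return zlib.crc32(key.encode()) % n_modules
```

Because every training sequence for the same central context hashes to the same module, each module can accumulate the statistics for its share of the context-dependent models without cross-module coordination.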
-
Patent number: 8938390
Abstract: In one embodiment, a method for detecting autism in a natural language environment uses a microphone, sound recorder, and a computer programmed with software for the specialized purpose of processing recordings captured by the microphone and sound recorder combination. The method includes segmenting an audio signal captured by the microphone and sound recorder combination, using the computer, into a plurality of recording segments. The method further includes determining which of the plurality of recording segments correspond to a key child. The method further includes determining which of the plurality of recording segments that correspond to the key child are classified as key child recordings.
Type: Grant
Filed: February 27, 2009
Date of Patent: January 20, 2015
Assignee: LENA Foundation
Inventors: Dongxin D. Xu, Terrance D. Paul
-
Patent number: 8924212
Abstract: A method, apparatus and machine-readable medium are provided. A phonotactic grammar is utilized to perform speech recognition on received speech and to generate a phoneme lattice. A document shortlist is generated by using the phoneme lattice to query an index. A grammar is generated from the document shortlist. Data for each of at least one input field is identified based on the received speech and the generated grammar.
Type: Grant
Filed: August 26, 2005
Date of Patent: December 30, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Cyril Georges Luc Allauzen, Sarangarajan Parthasarathy
-
Patent number: 8909527
Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
Type: Grant
Filed: June 24, 2009
Date of Patent: December 9, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Vincent Goffin, Andrej Ljolje, Murat Saraclar