Patents Examined by David Kovacek
  • Patent number: 9218810
    Abstract: Disclosed herein is a system, method, and computer-readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment of the invention is a method for classifying utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance; generating a semantic and syntactic graph associated with the received utterance; extracting all n-grams as features from the generated semantic and syntactic graph; and classifying the utterance. Classifying the utterance may be performed in any number of ways, such as using the extracted n-grams, the syntactic and semantic graph, or written rules.
    Type: Grant
    Filed: April 15, 2014
    Date of Patent: December 22, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
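    The n-gram feature extraction described in the abstract above can be sketched as follows. This is a minimal illustration over a flat token sequence; the patent extracts n-grams from paths through a semantic/syntactic graph, which this toy omits, and the function name and example utterance are hypothetical:

```python
def extract_ngrams(tokens, max_n=3):
    # Collect every n-gram of length 1..max_n as a candidate feature
    # for the downstream utterance classifier.
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(tuple(tokens[i:i + n]))
    return feats

# Hypothetical utterance; a real system would feed these features to a classifier.
features = extract_ngrams(["book", "a", "flight"], max_n=2)
```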
  • Patent number: 9202460
    Abstract: Methods and apparatus to generate a speech recognition library for use by a speech recognition system are disclosed. An example method comprises identifying a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments, computing a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments, selecting a set of the plurality of audio data segments based on the plurality of difference metrics, identifying a first one of the audio data segments in the set as a representative audio data segment, determining a first phonetic transcription of the representative audio data segment, and adding the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
    Type: Grant
    Filed: May 14, 2008
    Date of Patent: December 1, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Hisao M. Chang
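    The selection step in the abstract above, computing difference metrics against a baseline segment, filtering, and picking a representative, can be sketched as follows. The distance function, names, and threshold are illustrative assumptions, not the patent's actual metric:

```python
def seg_dist(a, b):
    # Toy difference metric: mean absolute sample difference.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def representative_segment(baseline, segments, max_dist):
    # Keep segments close enough to the baseline, then pick the closest
    # one as the representative for phonetic transcription.
    scored = [(seg_dist(baseline, s), i) for i, s in enumerate(segments)]
    kept = [(d, i) for d, i in scored if d <= max_dist]
    return segments[min(kept)[1]] if kept else None

baseline = [0.0, 1.0, 0.0]
segments = [[0.1, 0.9, 0.0], [5.0, 5.0, 5.0], [0.0, 1.0, 0.1]]
rep = representative_segment(baseline, segments, max_dist=0.5)
```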
  • Patent number: 9171539
    Abstract: Embodiments of the invention address the deficiencies of the prior art by providing a method, apparatus, and program product for converting components of a web page to voice prompts for a user. In some embodiments, the method comprises selectively determining at least one HTML component from a plurality of HTML components of a web page to transform into a voice prompt for a mobile system based upon a voice attribute file associated with the web page. The method further comprises transforming the at least one HTML component into parameterized data suitable for use by the mobile system based upon at least a portion of the voice attribute file associated with the at least one HTML component and transmitting the parameterized data to the mobile system.
    Type: Grant
    Filed: March 26, 2015
    Date of Patent: October 27, 2015
    Assignee: Vocollect, Inc.
    Inventors: Paul M. Funyak, Norman J. Connors, Paul E. Kolonay, Matthew Aaron Nichols
  • Patent number: 9165554
    Abstract: A method, apparatus and machine-readable medium are provided. A phonotactic grammar is utilized to perform speech recognition on received speech and to generate a phoneme lattice. A document shortlist is generated based on using the phoneme lattice to query an index. A grammar is generated from the document shortlist. Data for each of at least one input field is identified based on the received speech and the generated grammar.
    Type: Grant
    Filed: December 4, 2014
    Date of Patent: October 20, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Cyril Georges Luc Allauzen, Sarangarajan Parthasarathy
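    The shortlist step above, using a phoneme lattice to query an index, can be sketched with a toy inverted index from phoneme n-grams to document ids. The index contents, path representation, and names are assumptions; a real lattice would carry weighted arcs rather than flat hypothesis paths:

```python
def shortlist(index, lattice_paths, n=2):
    # Union the postings of every phoneme n-gram on any lattice path.
    docs = set()
    for path in lattice_paths:
        for i in range(len(path) - n + 1):
            docs.update(index.get(tuple(path[i:i + n]), ()))
    return docs

# Hypothetical inverted index and two lattice hypotheses for one utterance.
index = {("k", "ae"): {"cat_doc"}, ("d", "ao"): {"dog_doc"}}
paths = [["k", "ae", "t"], ["k", "ah", "t"]]
docs = shortlist(index, paths)
```

A grammar would then be built from the shortlisted documents only, keeping recognition constrained and fast.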
  • Patent number: 9165555
    Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
    Type: Grant
    Filed: November 26, 2014
    Date of Patent: October 20, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Vincent Goffin, Andrej Ljolje, Murat Saraclar
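    The iterative warping-factor search described above amounts to a grid search that keeps whichever factor best matches the speech model. The sketch below uses a toy frequency-axis scaling and a made-up scoring function; real vocal tract length normalization warps a filterbank, not raw coefficients:

```python
def best_warping_factor(spectrum, model_score, factors):
    # Try each candidate warp factor, keep the one the model scores highest.
    best, best_score = None, float("-inf")
    for a in factors:
        warped = [a * f for f in spectrum]  # toy frequency-axis scaling
        score = model_score(warped)
        if score > best_score:
            best, best_score = a, score
    return best

# Toy "speech model": negative squared distance to a reference spectrum.
reference = [1.1, 2.2]
score = lambda w: -sum((x - y) ** 2 for x, y in zip(w, reference))
best = best_warping_factor([1.0, 2.0], score, [0.9, 1.0, 1.1])
```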
  • Patent number: 9159338
    Abstract: Systems and methods of rendering a textual animation are provided. The methods include receiving an audio sample of an audio signal that is being rendered by a media rendering source. The methods also include receiving one or more descriptors for the audio signal based on at least one of a semantic vector, an audio vector, and an emotion vector. Based on the one or more descriptors, a client device may render the textual transcriptions of vocal elements of the audio signal in an animated manner. The client device may further render the textual transcriptions of the vocal elements of the audio signal to be substantially in synchrony to the audio signal being rendered by the media rendering source. In addition, the client device may further receive an identification of a song corresponding to the audio sample, and may render lyrics of the song in an animated manner.
    Type: Grant
    Filed: December 3, 2010
    Date of Patent: October 13, 2015
    Assignee: Shazam Entertainment Ltd.
    Inventors: Rahul Powar, Avery Li-Chun Wang
  • Patent number: 9135797
    Abstract: A method of identifying incidents using mobile devices can include receiving a communication from each of a plurality of mobile devices. Each communication can specify information about a detected sound. Spatial and temporal information can be identified from each communication as well as an indication of a sound signature matching the detected sound. The communications can be compared with a policy specifying spatial and temporal requirements relating to the sound signature indicated by the communications. A notification can be selectively sent according to the comparison.
    Type: Grant
    Filed: December 28, 2006
    Date of Patent: September 15, 2015
    Assignee: International Business Machines Corporation
    Inventors: Christopher C. Couper, Neil A. Katz, Victor S. Moore
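    The policy comparison above, matching reports on a sound signature and checking spatial and temporal requirements, can be sketched as a clustering test. Field names, the distance convention (positions already in km), and thresholds are illustrative assumptions:

```python
import math

def incident_detected(reports, signature, window_s, radius_km, min_reports):
    # A report matches when its sound signature equals the target; an
    # incident fires when enough matches cluster in time and space.
    matches = [r for r in reports if r["sig"] == signature]
    for anchor in matches:
        near = [r for r in matches
                if abs(r["t"] - anchor["t"]) <= window_s
                and math.dist(r["pos"], anchor["pos"]) <= radius_km]
        if len(near) >= min_reports:
            return True
    return False

reports = [
    {"sig": "glass_break", "t": 0,  "pos": (0.0, 0.0)},
    {"sig": "glass_break", "t": 30, "pos": (0.1, 0.0)},
    {"sig": "siren",       "t": 31, "pos": (0.1, 0.1)},
]
hit = incident_detected(reports, "glass_break",
                        window_s=60, radius_km=1.0, min_reports=2)
```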
  • Patent number: 9129599
    Abstract: A method, executed on a server that serves presence information, for providing dynamically loaded speech recognition parameters to a speech recognition engine can be provided. The method can include storing at least one rule for selecting speech recognition parameters, wherein a rule comprises an if-portion including criteria and a then-portion specifying speech recognition parameters that must be used when the criteria are met. The method can further include receiving notice that a speech recognition session has been initiated between a user and the speech recognition engine. The method can further include selecting a first set of speech recognition parameters responsive to executing the at least one rule and providing to the speech recognition engine the first set of speech recognition parameters for performing speech recognition of the user.
    Type: Grant
    Filed: October 18, 2007
    Date of Patent: September 8, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Girish Dhanakshirur, Baiju D. Mandalia, Wendi L. Nusbickel
  • Patent number: 9129601
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for dialog modeling. The method includes receiving spoken dialogs annotated to indicate dialog acts and task/subtask information, parsing the spoken dialogs with a hierarchical, parse-based dialog model which operates incrementally from left to right and which only analyzes a preceding dialog context to generate parsed spoken dialogs, and constructing a functional task structure of the parsed spoken dialogs. The method can further either interpret user utterances with the functional task structure of the parsed spoken dialogs or plan system responses to user utterances with the functional task structure of the parsed spoken dialogs. The parse-based dialog model can be a shift-reduce model, a start-complete model, or a connection path model.
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: September 8, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Amanda Stent, Srinivas Bangalore
  • Patent number: 9081760
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for collecting web data in order to create diverse language models. A system configured to practice the method first crawls, such as via a crawler operating on a computing device, a set of documents in a network of interconnected devices according to a visitation policy, wherein the visitation policy is configured to focus on novelty regions for a current language model built from previous crawling cycles by crawling documents whose vocabulary is considered likely to fill gaps in the current language model. A language model from a previous cycle can be used to guide the creation of a language model in the following cycle. The novelty regions can include documents with high perplexity values over the current language model.
    Type: Grant
    Filed: March 8, 2011
    Date of Patent: July 14, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Luciano De Andrade Barbosa, Srinivas Bangalore
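    The "novelty region" criterion above, documents with high perplexity under the current language model, can be sketched with a unigram model. The floor probability for unseen words, the threshold, and all names are illustrative assumptions; a production system would use a full n-gram or neural LM:

```python
import math

def perplexity(unigram_lm, tokens, floor=1e-6):
    # Per-token perplexity under a unigram model; unseen words get `floor`.
    logp = sum(math.log(unigram_lm.get(t, floor)) for t in tokens)
    return math.exp(-logp / len(tokens))

def novelty_documents(unigram_lm, docs, threshold):
    # "Novelty regions": documents the current LM finds highly surprising,
    # and therefore worth crawling to fill vocabulary gaps.
    return [d for d in docs if perplexity(unigram_lm, d) > threshold]

lm = {"a": 0.5, "b": 0.5}
docs = [["a", "b"], ["z", "z"]]
novel = novelty_documents(lm, docs, threshold=100.0)
```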
  • Patent number: 9058812
    Abstract: In a speech encoder/decoder a pitch delay contour endpoint modifier is employed to shift the endpoints of a pitch delay interpolation curve up or down. Particularly, the endpoints of the pitch delay interpolation curve are shifted based on a variation and/or a standard deviation in pitch delay.
    Type: Grant
    Filed: July 27, 2005
    Date of Patent: June 16, 2015
    Assignee: GOOGLE TECHNOLOGY HOLDINGS LLC
    Inventors: James P. Ashley, Udar Mittal
  • Patent number: 9047860
    Abstract: A method for concatenating a first frame of samples and a subsequent second frame of samples, the method comprising applying a phase filter adapted to minimize a discontinuity at a boundary between the first and second frames of samples.
    Type: Grant
    Filed: January 31, 2006
    Date of Patent: June 2, 2015
    Assignee: SKYPE
    Inventor: Soren Andersen
  • Patent number: 9009048
    Abstract: A speech recognition method, medium, and system. The method includes detecting an energy change of each frame making up signals including speech and non-speech signals, and identifying a speech segment corresponding to frames that include only speech signals from among the frames based on the detected energy change.
    Type: Grant
    Filed: August 1, 2007
    Date of Patent: April 14, 2015
    Assignees: Samsung Electronics Co., Ltd., Apple Inc.
    Inventors: Giljin Jang, Jeongsu Kim, John S. Bridle, Melvyn J. Hunt
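    The energy-change segmentation above can be sketched as hysteresis on per-frame energy: a segment opens when energy rises past one threshold and closes when it falls below another. The thresholds and frame values are illustrative, and the patent's actual change-detection criterion is more elaborate than this toy:

```python
def detect_speech_segments(frames, rise, fall):
    # Hysteresis on per-frame energy: a segment opens when energy reaches
    # `rise` and closes when it drops below `fall`.
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    segments, in_speech, start = [], False, 0
    for i, e in enumerate(energies):
        if not in_speech and e >= rise:
            in_speech, start = True, i
        elif in_speech and e < fall:
            segments.append((start, i))
            in_speech = False
    if in_speech:
        segments.append((start, len(frames)))
    return segments

frames = [[0, 0], [3, 4], [3, 4], [0, 0]]  # energies: 0, 12.5, 12.5, 0
segs = detect_speech_segments(frames, rise=1.0, fall=1.0)
```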
  • Patent number: 8996384
    Abstract: Embodiments of the invention address the deficiencies of the prior art by providing a method, apparatus, and program product for converting components of a web page to voice prompts for a user. In some embodiments, the method comprises selectively determining at least one HTML component from a plurality of HTML components of a web page to transform into a voice prompt for a mobile system based upon a voice attribute file associated with the web page. The method further comprises transforming the at least one HTML component into parameterized data suitable for use by the mobile system based upon at least a portion of the voice attribute file associated with the at least one HTML component and transmitting the parameterized data to the mobile system.
    Type: Grant
    Filed: October 30, 2009
    Date of Patent: March 31, 2015
    Assignee: Vocollect, Inc.
    Inventors: Paul M. Funyak, Norman J. Connors, Paul E. Kolonay, Matthew Aaron Nichols
  • Patent number: 8983830
    Abstract: An encoding device can achieve both highly effective encoding/decoding and high-quality decoded audio when executing scalable stereo audio encoding by using MDCT and ICP. In the encoding device, an MDCT converter executes an MDCT conversion on a residual signal of the left channel/right channel subjected to window processing. An MDCT converter executes an MDCT conversion on the monaural residual signal which has been subjected to the window processing. An ICP analyzer executes an ICP analysis by using the correlation between a frequency coefficient of a high-band portion of the left channel/right channel and a frequency coefficient of a high-band portion of the monaural residual signal so as to generate an ICP parameter of the left channel/right channel residual signal. An ICP parameter quantizer quantizes each of the ICP parameters. A low-band encoder executes highly accurate encoding on the frequency coefficient of the low-band portion of the left channel/right channel residual signal.
    Type: Grant
    Filed: March 28, 2008
    Date of Patent: March 17, 2015
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Jiong Zhou, Kok Seng Chong, Koji Yoshida
  • Patent number: 8965763
    Abstract: Training data from a plurality of utterance-to-text-string mappings of an automatic speech recognition (ASR) system may be selected. Parameters of the ASR system that characterize the utterances and their respective mappings may be determined through application of a first acoustic model and a language model. A second acoustic model and the language model may be applied to the selected training data utterances to determine a second set of utterance-to-text-string mappings. The first set of utterance-to-text-string mappings may be compared to the second set of utterance-to-text-string mappings, and the parameters of the ASR system may be updated based on the comparison.
    Type: Grant
    Filed: May 1, 2012
    Date of Patent: February 24, 2015
    Assignee: Google Inc.
    Inventors: Ciprian Ioan Chelba, Brian Strope, Preethi Jyothi, Leif Johnson
  • Patent number: 8959014
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models. Speech data and data identifying a transcription for the speech data are received. A phonetic representation for the transcription is accessed. Training sequences are identified for a particular phone in the phonetic representation. Each of the training sequences includes a different set of contextual phones surrounding the particular phone. A partitioning key is identified based on a sequence of phones that occurs in each of the training sequences. A processing module to which the identified partitioning key is assigned is selected. Data identifying the training sequences and a portion of the speech data are transmitted to the selected processing module.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: February 17, 2015
    Assignee: Google Inc.
    Inventors: Peng Xu, Fernando Pereira, Ciprian I. Chelba
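    The partitioning key above, derived from a sequence of phones around the particular phone, can be sketched as a context window: every training sequence containing the same window yields the same key and is routed to the same processing module. The function name and window width are illustrative assumptions:

```python
def partition_key(phones, center, width=1):
    # All training sequences sharing this context window map to the same
    # key, so one processing module sees all data for that phone context.
    lo, hi = max(0, center - width), min(len(phones), center + width + 1)
    return tuple(phones[lo:hi])

key = partition_key(["sil", "k", "ae", "t", "sil"], center=2, width=1)
```

Routing by key (e.g. `hash(key) % num_modules`) keeps all statistics for one context on one worker, which is what allows the distributed training described in the abstract.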
  • Patent number: 8938390
    Abstract: In one embodiment, a method for detecting autism in a natural language environment using a microphone, sound recorder, and a computer programmed with software for the specialized purpose of processing recordings captured by the microphone and sound recorder combination, the computer programmed to execute the method, includes segmenting an audio signal captured by the microphone and sound recorder combination, using the computer programmed for the specialized purpose, into a plurality of recording segments. The method further includes determining which of the plurality of recording segments correspond to a key child. The method further includes determining which of the plurality of recording segments that correspond to the key child are classified as key child recordings.
    Type: Grant
    Filed: February 27, 2009
    Date of Patent: January 20, 2015
    Assignee: LENA Foundation
    Inventors: Dongxin D. Xu, Terrance D. Paul
  • Patent number: 8924212
    Abstract: A method, apparatus and machine-readable medium are provided. A phonotactic grammar is utilized to perform speech recognition on received speech and to generate a phoneme lattice. A document shortlist is generated based on using the phoneme lattice to query an index. A grammar is generated from the document shortlist. Data for each of at least one input field is identified based on the received speech and the generated grammar.
    Type: Grant
    Filed: August 26, 2005
    Date of Patent: December 30, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Cyril Georges Luc Allauzen, Sarangarajan Parthasarathy
  • Patent number: 8909527
    Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
    Type: Grant
    Filed: June 24, 2009
    Date of Patent: December 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Vincent Goffin, Andrej Ljolje, Murat Saraclar