Patents by Inventor Andrej Ljolje
Andrej Ljolje has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8781831
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
Type: Grant
Filed: September 5, 2013
Date of Patent: July 15, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
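The fallback order described in the abstract for patent 8781831 (user-specific supervised model, then unsupervised model, then generic model) can be illustrated with a minimal Python sketch. The function name, the dictionary-based model stores, and the string model identifiers are all hypothetical; the abstract does not specify how models are stored or retrieved.

```python
from typing import Dict, Optional


def select_speech_model(
    user_id: str,
    supervised: Dict[str, str],
    unsupervised: Dict[str, str],
    generic_model: str,
) -> str:
    """Return the most specific model available for the user.

    `supervised` and `unsupervised` are hypothetical stores mapping a user ID
    to a model identifier; `generic_model` is the fallback identifier.
    """
    # First choice: a user-specific supervised model.
    model: Optional[str] = supervised.get(user_id)
    if model is not None:
        return model
    # Second choice: an unsupervised model for the same user.
    model = unsupervised.get(user_id)
    if model is not None:
        return model
    # Last resort: the generic model.
    return generic_model
```

Recognition would then run on the received speech with whichever model the function returns; that step is outside the scope of this sketch.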
-
Patent number: 8756065
Abstract: A method of correlating received communication data with operational communication characteristics is provided. The method includes receiving audible input from a source in a communication over a communications network, recording the received audible input, and transcribing the recorded audible input into a transcript. The method further includes outputting the transcript, specifying features of the transcript to be analyzed, specifying and recording operational communication characteristics particular to the communication, analyzing the transcript for the specified features to identify patterns associated with the audible input, computing statistical correlations of the identified patterns with the operational communication characteristics, and outputting results of the computed statistical correlations on a user interface.
Type: Grant
Filed: December 24, 2008
Date of Patent: June 17, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: I. Dan Melamed, Yeon-Jun Kim, Bernard S. Renger, Andrej Ljolje, David J. Smith
-
Patent number: 8751229
Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
Type: Grant
Filed: November 21, 2008
Date of Patent: June 10, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie
-
Patent number: 8738375
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
Type: Grant
Filed: May 9, 2011
Date of Patent: May 27, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
-
Publication number: 20140032214
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Application
Filed: October 1, 2013
Publication date: January 30, 2014
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. Syrdal
-
Publication number: 20140006024
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
Type: Application
Filed: September 5, 2013
Publication date: January 2, 2014
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
-
Patent number: 8612234
Abstract: A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.
Type: Grant
Filed: October 24, 2011
Date of Patent: December 17, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Andrej Ljolje
-
Patent number: 8600749
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for training adaptation-specific acoustic models. A system practicing the method receives speech and generates a full size model and a reduced size model, the reduced size model starting with a single distribution for each speech sound in the received speech. The system finds speech segment boundaries in the speech using the full size model and adapts features of the speech data using the reduced size model based on the speech segment boundaries and an overall centroid for each speech sound. The system then recognizes speech using the adapted features of the speech. The model can be a Hidden Markov Model (HMM). The reduced size model can also be of a reduced complexity, such as having fewer mixture components than a model of full complexity. Adapting features of speech can include moving the features closer to an overall feature distribution center.
Type: Grant
Filed: December 8, 2009
Date of Patent: December 3, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Andrej Ljolje
-
Patent number: 8589163
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset, receives input speech, generates a speech recognition lattice based on the received input speech using the masked language model, removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset, and recognizes the received speech based on the lattice. Alternatively during the generation step, the system can only add words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask.
Type: Grant
Filed: December 4, 2009
Date of Patent: November 19, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Mazin Gilbert
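The lattice-pruning step in the abstract for patent 8589163 can be sketched in Python under some simplifying assumptions. Here the lattice is reduced to a flat list of arcs, each word carries an integer whose bit k marks it as allowed for adaptation subset k, and words with no mask entry are treated as disallowed. None of these representations or names come from the patent; they are illustrative only.

```python
from typing import Dict, List, Tuple

# Simplified lattice arc: (start node, end node, word). A real recognition
# lattice would also carry scores and timing information.
Arc = Tuple[int, int, str]


def is_allowed(word_masks: Dict[str, int], word: str, subset_index: int) -> bool:
    """Check bit `subset_index` of the word's mask.

    Words missing from `word_masks` are treated as disallowed (an assumption).
    """
    return bool((word_masks.get(word, 0) >> subset_index) & 1)


def prune_lattice(
    arcs: List[Arc], word_masks: Dict[str, int], subset_index: int
) -> List[Arc]:
    """Drop arcs whose word is disallowed for the given adaptation subset."""
    return [arc for arc in arcs if is_allowed(word_masks, arc[2], subset_index)]


# Hypothetical data: "call" is allowed in subsets 0 and 1, "mom" only in
# subset 0, and "bomb" only in subset 1.
example_masks = {"call": 0b11, "mom": 0b01, "bomb": 0b10}
example_arcs = [(0, 1, "call"), (1, 2, "mom"), (1, 2, "bomb")]
pruned_for_subset_0 = prune_lattice(example_arcs, example_masks, 0)
```

The alternative mentioned in the abstract, adding only allowed words during lattice generation, would apply the same `is_allowed` check before an arc is created rather than after.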
-
Patent number: 8548807
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: June 9, 2009
Date of Patent: October 1, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
-
Patent number: 8532993
Abstract: A system and method for performing speech recognition is disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.
Type: Grant
Filed: July 2, 2012
Date of Patent: September 10, 2013
Assignee: AT&T Intellectual Property II, L.P.
Inventor: Andrej Ljolje
-
Patent number: 8532992
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
Type: Grant
Filed: February 8, 2013
Date of Patent: September 10, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
-
Patent number: 8428443
Abstract: A method of providing modified media content is disclosed that includes providing media content to a destination device via a network, where the media content comprises video data and audio data having a first viewing rate. The method further includes receiving data indicating a selection of a second viewing rate via the network and modifying the media content to produce modified media content having approximately the second viewing rate. The modified media content includes modified video data and modified audio data synchronized at approximately the second viewing rate.
Type: Grant
Filed: March 12, 2007
Date of Patent: April 23, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie
-
Patent number: 8412527
Abstract: A method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.
Type: Grant
Filed: June 24, 2009
Date of Patent: April 2, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: I. Dan Melamed, Yeon-Jun Kim, Andrej Ljolje, Bernard S. Renger, David J. Smith
-
Patent number: 8374867
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
Type: Grant
Filed: November 13, 2009
Date of Patent: February 12, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
-
Patent number: 8346549
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker in parallel with the main speech recognizer and the supplemental speech recognizer and combines results from the main and supplemental speech recognizer. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer.
Type: Grant
Filed: December 4, 2009
Date of Patent: January 1, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Mazin Gilbert
-
Publication number: 20120290298
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
Type: Application
Filed: May 9, 2011
Publication date: November 15, 2012
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej LJOLJE, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
-
Patent number: 8312492
Abstract: A method and system of providing media content is disclosed. In a particular embodiment, the method includes receiving media content from a content source at a set-top box device. The media content includes video data having a first playback rate and audio data having the first playback rate. The method further includes transforming the audio data via a non-linear transformation to produce modified audio data having a second playback rate, modifying the video data to produce modified video data having the second playback rate, and synchronizing the modified audio data and the modified video data to produce modified media content having the second playback rate. A network-based media content storage device and associated logic to provide adjusted rate audio content are also disclosed.
Type: Grant
Filed: March 19, 2007
Date of Patent: November 13, 2012
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie
-
Publication number: 20120271635
Abstract: A system and method for performing speech recognition is disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.
Type: Application
Filed: July 2, 2012
Publication date: October 25, 2012
Applicant: AT&T Intellectual Property II, L.P.
Inventor: Andrej Ljolje
-
Patent number: 8296141
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function.
Type: Grant
Filed: November 19, 2008
Date of Patent: October 23, 2012
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
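The normalization constraint in the abstract for patent 8296141, where pronunciation weights sum to 1 within each unit of speech, can be illustrated with a short Python sketch. It covers only the normalization step, not the alignment or discriminative adaptation steps. The nested-dictionary layout, the function name, and the example pronunciations are hypothetical and do not come from the patent.

```python
from typing import Dict

# Maps a unit of speech (here a word) to its candidate pronunciations and
# their raw, unnormalized weights.
Weights = Dict[str, Dict[str, float]]


def normalize_pronunciation_weights(raw: Weights) -> Weights:
    """Rescale weights so that they sum to 1 within each unit of speech."""
    normalized: Weights = {}
    for unit, pronunciations in raw.items():
        total = sum(pronunciations.values())
        if total <= 0:
            raise ValueError(f"weights for {unit!r} must have a positive sum")
        normalized[unit] = {
            pron: weight / total for pron, weight in pronunciations.items()
        }
    return normalized


# Hypothetical raw weights for two pronunciations of one word.
example_raw = {"tomato": {"t ah m ey t ow": 3.0, "t ah m aa t ow": 1.0}}
example_normalized = normalize_pronunciation_weights(example_raw)
```

Any adaptation procedure that updates the weights would need to reapply a normalization like this after each update to keep the sum-to-1 property.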