Patents by Inventor Andrej Ljolje

Andrej Ljolje has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8781831
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Grant
    Filed: September 5, 2013
    Date of Patent: July 15, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
  • Patent number: 8756065
    Abstract: A method of correlating received communication data with operational communication characteristics is provided. The method includes receiving audible input from a source in a communication over a communications network, recording the received audible input, and transcribing the recorded audible input into a transcript. The method further includes outputting the transcript, specifying features of the transcript to be analyzed, specifying and recording operational communication characteristics particular to the communication, analyzing the transcript for the specified features to identify patterns associated with the audible input, computing statistical correlations of the identified patterns with the operational communication characteristics, and outputting results of the computed statistical correlations on a user interface.
    Type: Grant
    Filed: December 24, 2008
    Date of Patent: June 17, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: I. Dan Melamed, Yeon-Jun Kim, Bernard S. Renger, Andrej Ljolje, David J. Smith
  • Patent number: 8751229
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
    Type: Grant
    Filed: November 21, 2008
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie
  • Patent number: 8738375
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Grant
    Filed: May 9, 2011
    Date of Patent: May 27, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
  • Publication number: 20140032214
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Application
    Filed: October 1, 2013
    Publication date: January 30, 2014
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. Syrdal
  • Publication number: 20140006024
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Application
    Filed: September 5, 2013
    Publication date: January 2, 2014
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
  • Patent number: 8612234
    Abstract: A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.
    Type: Grant
    Filed: October 24, 2011
    Date of Patent: December 17, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Andrej Ljolje
  • Patent number: 8600749
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for training adaptation-specific acoustic models. A system practicing the method receives speech and generates a full size model and a reduced size model, the reduced size model starting with a single distribution for each speech sound in the received speech. The system finds speech segment boundaries in the speech using the full size model and adapts features of the speech data using the reduced size model based on the speech segment boundaries and an overall centroid for each speech sound. The system then recognizes speech using the adapted features of the speech. The model can be a Hidden Markov Model (HMM). The reduced size model can also be of a reduced complexity, such as having fewer mixture components than a model of full complexity. Adapting features of speech can include moving the features closer to an overall feature distribution center.
    Type: Grant
    Filed: December 8, 2009
    Date of Patent: December 3, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Andrej Ljolje
  • Patent number: 8589163
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset, receives input speech, generates a speech recognition lattice based on the received input speech using the masked language model, removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset, and recognizes the received speech based on the lattice. Alternatively during the generation step, the system can only add words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask.
    Type: Grant
    Filed: December 4, 2009
    Date of Patent: November 19, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Mazin Gilbert
  • Patent number: 8548807
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: October 1, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 8532993
    Abstract: A system and method for performing speech recognition is disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.
    Type: Grant
    Filed: July 2, 2012
    Date of Patent: September 10, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Andrej Ljolje
  • Patent number: 8532992
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Grant
    Filed: February 8, 2013
    Date of Patent: September 10, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
  • Patent number: 8428443
    Abstract: A method of providing modified media content is disclosed that includes providing media content to a destination device via a network, where the media content comprises video data and audio data have a first viewing rate. The method further includes receiving data indicating a selection of a second viewing rate via the network and modifying the media content to produce modified media content having approximately the second viewing rate. The modified media content includes modified video data and modified audio data synchronized at approximately the second viewing rate.
    Type: Grant
    Filed: March 12, 2007
    Date of Patent: April 23, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie
  • Patent number: 8412527
    Abstract: A method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.
    Type: Grant
    Filed: June 24, 2009
    Date of Patent: April 2, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: I. Dan Melamed, Yeon-Jun Kim, Andrej Ljolje, Bernard S. Renger, David J. Smith
  • Patent number: 8374867
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Grant
    Filed: November 13, 2009
    Date of Patent: February 12, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
  • Patent number: 8346549
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker in parallel with the main speech recognizer and the supplemental speech recognizer and combines results from the main and supplemental speech recognizer. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer.
    Type: Grant
    Filed: December 4, 2009
    Date of Patent: January 1, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Mazin Gilbert
  • Publication number: 20120290298
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Application
    Filed: May 9, 2011
    Publication date: November 15, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej LJOLJE, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
  • Patent number: 8312492
    Abstract: A method and system of providing media content is disclosed. In a particular embodiment, the method includes receiving media content from a content source at a set-top box device. The media content includes video data having a first playback rate and audio data having the first playback rate. The method further includes transforming the audio data via a non-linear transformation to produce modified audio data having a second playback rate, modifying the video data to produce modified video data having the second playback rate, and synchronizing the modified audio data and the modified video data to produce modified media content having the second playback rate. A network-based media content storage device and associated logic to provide adjusted rate audio content are also disclosed.
    Type: Grant
    Filed: March 19, 2007
    Date of Patent: November 13, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie
  • Publication number: 20120271635
    Abstract: A system and method for performing speech recognition is disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.
    Type: Application
    Filed: July 2, 2012
    Publication date: October 25, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventor: Andrej Ljolje
  • Patent number: 8296141
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function.
    Type: Grant
    Filed: November 19, 2008
    Date of Patent: October 23, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje