Patents by Inventor Andrej Ljolje

Andrej Ljolje has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9628741
    Abstract: A method includes processing media content. The media content includes audio data corresponding to a first audio playback rate and video data corresponding to a first video playback rate. Processing the media content includes identifying a speech portion of the audio data. The speech portion includes a consonant portion. The method further includes producing modified media content. The modified media content is produced based on modifying the video data and modifying the audio data. Modifying the audio data includes applying a non-linear transformation to the speech portion identified in the audio data. The method further includes storing the modified media content.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: April 18, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie
  • Patent number: 9607279
    Abstract: A method of detecting pre-determined phrases to determine compliance quality includes determining whether a precursor event has occurred based on a comparison between stored pre-determined phrases and a received communication, and determining a compliance rating based on a presence of a pre-determined phrase associated with the precursor event in the communication.
    Type: Grant
    Filed: April 15, 2015
    Date of Patent: March 28, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: I. Dan Melamed, Andrej Ljolje, Bernard Renger, Yeon-Jun Kim, David J. Smith
  • Patent number: 9576582
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: February 23, 2016
    Date of Patent: February 21, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
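The abstract above describes replacing each dictionary phoneme's acoustic model with a weighted sum of the models of all plausible phonemes. A minimal sketch of that idea, assuming (purely for illustration) that each phoneme model is a single one-dimensional Gaussian; the phoneme labels, means, variances, and weights below are invented, not from the patent:

```python
import numpy as np

def adapted_likelihood(x, plausible_models, weights):
    """Likelihood of observation x under a weighted sum of Gaussian
    phoneme models (the restructured model for one dictionary phoneme).

    plausible_models: list of (mean, variance) pairs, one per plausible phoneme
    weights: mixture weights, assumed to sum to 1
    """
    total = 0.0
    for (mean, var), w in zip(plausible_models, weights):
        # Gaussian density of x under this plausible phoneme's model
        total += w * np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return total

# Illustrative: a dictionary phoneme modeled as a mix of two native phonemes
models = [(1.0, 0.25), (1.6, 0.36)]
weights = [0.7, 0.3]
print(adapted_likelihood(1.2, models, weights))
```

The pronouncing dictionary itself is untouched; only the acoustic model behind each dictionary phoneme becomes a mixture.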
  • Publication number: 20160329045
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Application
    Filed: July 18, 2016
    Publication date: November 10, 2016
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
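The saliency-weighting idea in the abstract above can be sketched as rescoring recognizer word hypotheses by per-word saliency values. The words, scores, and saliency table here are illustrative assumptions, not values from the patent:

```python
# Assumed per-word saliency values (in the patent these would be derived
# from human perception judgments of previous transcripts).
saliency = {"refund": 0.9, "the": 0.1, "account": 0.8}

def rescore(hypotheses, default_saliency=0.5):
    """hypotheses: list of (word, recognizer_score) pairs.
    Returns the words ranked by saliency-weighted score, so that words
    humans judge important dominate the transcript."""
    weighted = [(w, s * saliency.get(w, default_saliency)) for w, s in hypotheses]
    return sorted(weighted, key=lambda t: t[1], reverse=True)

print(rescore([("the", 0.95), ("refund", 0.80), ("account", 0.70)]))
# "refund" ranks first despite a lower raw score
```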
  • Patent number: 9484024
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for handling expected repeat speech queries or other inputs. The method causes a computing device to detect a misrecognized speech query from a user, determine a tendency of the user to repeat speech queries based on previous user interactions, and adapt a speech recognition model based on the determined tendency before an expected repeat speech query. The method can further include recognizing the expected repeat speech query from the user based on the adapted speech recognition model. Adapting the speech recognition model can include modifying an acoustic model, a language model, and a semantic model. Adapting the speech recognition model can also include preparing a personalized search speech recognition model for the expected repeat query based on usage history and entries in a recognition lattice. The method can include retaining unmodified speech recognition models with adapted speech recognition models.
    Type: Grant
    Filed: March 24, 2015
    Date of Patent: November 1, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro
  • Patent number: 9484019
    Abstract: Disclosed herein is a method for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by identifying word and phone alignments and corresponding likelihood scores, and discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information, maximum likelihood training, minimum classification error training, or other functions known to those of skill in the art.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: November 1, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
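The abstract above requires each unit of speech's pronunciation weights to be normalized to sum to 1. A minimal sketch of that normalization step, using an invented two-pronunciation lexicon entry (the phone strings and raw weights are assumptions for illustration):

```python
def normalize_pronunciation_weights(lexicon):
    """Normalize alternative-pronunciation weights so that, for each word,
    they sum to 1, as the abstract requires.

    lexicon: {word: {pronunciation: raw_weight}}
    """
    normalized = {}
    for word, prons in lexicon.items():
        total = sum(prons.values())
        normalized[word] = {p: w / total for p, w in prons.items()}
    return normalized

lex = {"either": {"iy dh er": 3.0, "ay dh er": 1.0}}
print(normalize_pronunciation_weights(lex))
# weights for "either" become 0.75 and 0.25
```

Discriminative adaptation would then adjust these normalized weights (e.g. toward minimum classification error) while keeping the per-word sum at 1.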
  • Patent number: 9431005
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker in parallel with the main speech recognizer and the supplemental speech recognizer and combines results from the main and supplemental speech recognizer. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: August 30, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Mazin Gilbert
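One simple way to combine results from a main and a supplemental recognizer, as the abstract above describes, is confidence-based voting over aligned word slots. This sketch assumes the two outputs are already word-aligned; the words and confidences are illustrative:

```python
def combine(main, supplemental):
    """main, supplemental: lists of (word, confidence) pairs aligned by
    position. Per slot, keep the word with the higher confidence."""
    return [m[0] if m[1] >= s[1] else s[0] for m, s in zip(main, supplemental)]

main = [("please", 0.9), ("cold", 0.4)]
supp = [("please", 0.8), ("hold", 0.7)]
print(combine(main, supp))  # ['please', 'hold']
```

Real systems typically do this with lattice or confusion-network combination rather than positional alignment; this is only the core voting idea.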
  • Patent number: 9431011
    Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.
    Type: Grant
    Filed: September 17, 2014
    Date of Patent: August 30, 2016
    Assignee: Interactions LLC
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9414010
    Abstract: A method includes receiving a command to provide media content configured to be sent to a display device for display at a particular scan rate. The media content includes audio data and video data. The method includes identifying high priority segments of the media content based on the audio data. The high priority segments are to be displayed by the display device at a presentation rate such that the high priority segments displayed at the presentation rate correspond to the media content displayed at the particular scan rate. The method also includes sending the high priority segments to the display device to provide video content and audio content of the requested media content for display.
    Type: Grant
    Filed: May 15, 2012
    Date of Patent: August 9, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Andrej Ljolje
  • Patent number: 9396725
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Grant
    Filed: May 27, 2014
    Date of Patent: July 19, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
  • Publication number: 20160180841
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
    Type: Application
    Filed: March 2, 2016
    Publication date: June 23, 2016
    Inventors: Andrej Ljolje, Alistair D. Conkie
  • Publication number: 20160171984
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Application
    Filed: February 23, 2016
    Publication date: June 16, 2016
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9336773
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Grant
    Filed: May 1, 2015
    Date of Patent: May 10, 2016
    Assignee: Interactions LLC
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
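The model-selection fallback in the abstract above (user-specific supervised model, else unsupervised model, else generic) is straightforward to sketch. The dict-based model stores and model names are illustrative assumptions:

```python
def select_model(user_id, supervised, unsupervised, generic):
    """Fallback selection per the abstract: prefer the user's supervised
    model, then the user's unsupervised model, then a generic model.

    supervised, unsupervised: {user_id: model} lookups (assumed stores)
    generic: model used when no user-specific model exists
    """
    if user_id in supervised:
        return supervised[user_id]
    if user_id in unsupervised:
        return unsupervised[user_id]
    return generic

print(select_model("alice", {"alice": "sup-A"}, {}, "generic"))  # sup-A
print(select_model("bob", {}, {"bob": "unsup-B"}, "generic"))    # unsup-B
print(select_model("carol", {}, {}, "generic"))                  # generic
```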
  • Patent number: 9305546
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
    Type: Grant
    Filed: June 9, 2014
    Date of Patent: April 5, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie
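Since the abstract above notes that every hypothesis for a missing segment can carry an identical acoustic score, ranking reduces to language-model evidence from the surrounding context. A toy sketch with an invented bigram table (the words and probabilities are assumptions, not from the patent):

```python
# Assumed bigram probabilities over the surrounding context.
bigram = {
    ("call", "me"): 0.4, ("call", "mean"): 0.01,
    ("me", "back"): 0.5, ("mean", "back"): 0.02,
}

def best_hypothesis(left, right, candidates, floor=1e-6):
    """Pick the candidate filler for the missing segment that best fits
    between the left and right context words under the bigram model.
    With identical acoustic scores, only the LM score matters."""
    def lm_score(c):
        return bigram.get((left, c), floor) * bigram.get((c, right), floor)
    return max(candidates, key=lm_score)

print(best_hypothesis("call", "back", ["me", "mean"]))  # "me"
```

The same ranking could then feed either recognition (insert the hypothesis into the transcript) or synthesis (speak the filled-in utterance), matching the two embodiments in the abstract.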
  • Patent number: 9305547
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: April 28, 2015
    Date of Patent: April 5, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Publication number: 20150348540
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Application
    Filed: May 27, 2014
    Publication date: December 3, 2015
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
  • Patent number: 9165555
    Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
    Type: Grant
    Filed: November 26, 2014
    Date of Patent: October 20, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Vincent Goffin, Andrej Ljolje, Murat Saraclar
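The iterative warping-factor search in the abstract above can be sketched as trying several frequency-warping factors on a speaker's spectrum and keeping the one that best matches a reference model. This toy version uses a simple linear frequency warp and squared-error match on synthetic spectra; real vocal-tract-length normalization works on filterbank or cepstral features, so treat this purely as the search loop:

```python
import numpy as np

def best_warping_factor(spectrum, freqs, model, factors):
    """Return the warping factor whose warped spectrum is closest (in
    squared error) to the model spectrum, mirroring the iterate-and-keep-
    best loop in the abstract."""
    best, best_err = None, float("inf")
    for a in factors:
        # Linearly warp the frequency axis by factor a, then resample
        warped = np.interp(freqs, freqs * a, spectrum)
        err = float(np.sum((warped - model) ** 2))
        if err < best_err:
            best, best_err = a, err
    return best

freqs = np.linspace(0.0, 1.0, 64)
model = np.sin(2 * np.pi * freqs)            # stand-in "speech model" spectrum
spectrum = np.sin(2 * np.pi * freqs * 1.1)   # speaker spectrum, stretched 10%
print(best_warping_factor(spectrum, freqs, model, [0.9, 1.0, 1.1, 1.2]))
```

The search correctly recovers the factor that undoes the simulated 10% stretch; per the abstract, this would be repeated per speaker-specific segment.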
  • Publication number: 20150248884
    Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
    Type: Application
    Filed: April 30, 2015
    Publication date: September 3, 2015
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie
  • Publication number: 20150243282
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Application
    Filed: April 28, 2015
    Publication date: August 27, 2015
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Publication number: 20150235639
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Application
    Filed: May 1, 2015
    Publication date: August 20, 2015
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer