Patents by Inventor Andrej Ljolje

Andrej Ljolje has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Augmenting voice samples based on distributions of speaker classes

Patent number: 12340793

Abstract: A method of augmenting a training dataset of voice samples is provided. An audio processing system obtains voice samples and groups the voice samples into classes of spectral representations. The system obtains warp distributions associated with the classes of spectral representations and determines spectral change ratios based on a comparison of the warp distributions. The system determines transformations based at least in part on the spectral change ratios and applies the transformations to the voice samples grouped into the classes of spectral representations to generate a set of augmented voice samples. The system compiles the training dataset using at least the set of augmented voice samples. A recognition model is trained using the training dataset.

Type: Grant

Filed: June 28, 2022

Date of Patent: June 24, 2025

Assignee: Interactions LLC

Inventor: Andrej Ljolje

Universal semi-word model for vocabulary contraction in automatic speech recognition

Patent number: 12008986

Abstract: A speech recognition system includes, or has access to, conventional speech recognizer data, including a conventional acoustic model and pronunciation dictionary. The speech recognition system generates restructured speech recognizer data from the conventional speech recognizer data. When used at runtime by a speech recognizer module, the restructured speech recognizer data produces more accurate and efficient results than those produced using the conventional speech recognizer data. The restructuring involves segmenting entries of the conventional pronunciation dictionary and acoustic model according to their constituent phonemes and grouping those entries with the same initial N phonemes, for some integer N (e.g., N=3), and deriving a restructured dictionary with a corresponding semi-word acoustic model for the various grouped entries.
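The grouping step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `group_by_phoneme_prefix`, the toy lexicon, and the ARPAbet-style phoneme labels are all assumptions made for illustration, and deriving the semi-word acoustic model itself is not shown.

```python
from collections import defaultdict


def group_by_phoneme_prefix(lexicon, n=3):
    """Group words whose pronunciations share the same first n phonemes.

    lexicon maps each word to its list of phonemes. Words with fewer than
    n phonemes are grouped under their full pronunciation.
    """
    groups = defaultdict(list)
    for word, phonemes in lexicon.items():
        prefix = tuple(phonemes[:n])
        groups[prefix].append(word)
    return dict(groups)


# Hypothetical toy pronunciation dictionary (illustrative only).
lexicon = {
    "started": ["S", "T", "AA", "R", "T", "AH", "D"],
    "starting": ["S", "T", "AA", "R", "T", "IH", "NG"],
    "stop": ["S", "T", "AA", "P"],
    "cat": ["K", "AE", "T"],
}

groups = group_by_phoneme_prefix(lexicon, n=3)
for prefix, words in groups.items():
    print(" ".join(prefix), "->", words)
```

In this sketch each shared prefix stands in for one entry of a restructured dictionary, so four words contract to two prefix groups.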

Type: Grant

Filed: April 27, 2020

Date of Patent: June 11, 2024

Assignee: Interactions LLC

Inventors: Ilija Zeljkovic, Andrej Ljolje

System and method for speech personalization by need

Patent number: 11620988

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.

Type: Grant

Filed: December 9, 2019

Date of Patent: April 4, 2023

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal

System and method for personalization of acoustic models for automatic speech recognition

Patent number: 10699702

Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
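The selection flow in this abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the patented method: each model is represented by a hypothetical callable returning a `(text, confidence)` pair, the confidence score is used as a stand-in for recognition accuracy, and `max_sd_models` is an assumed parameter representing the limit imposed by available computing resources.

```python
from concurrent.futures import ThreadPoolExecutor


def select_dominant_model(utterance, si_model, sd_models, max_sd_models):
    """Run the selected models in parallel and return the best one.

    si_model: callable for the speaker independent model.
    sd_models: dict of name -> callable for speaker dependent models.
    max_sd_models: how many speaker dependent models resources allow.
    Each callable returns (recognized_text, confidence); confidence is an
    assumed proxy for recognition accuracy.
    """
    selected = {"speaker_independent": si_model}
    for name in list(sd_models)[:max_sd_models]:
        selected[name] = sd_models[name]

    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        futures = {name: pool.submit(model, utterance)
                   for name, model in selected.items()}
        results = {name: future.result() for name, future in futures.items()}

    dominant = max(results, key=lambda name: results[name][1])
    return dominant, results[dominant][0]


# Hypothetical stub recognizers standing in for real acoustic models.
def si_model(audio):
    return ("call my bank", 0.60)


sd_models = {
    "alice": lambda audio: ("call my aunt", 0.90),
    "bob": lambda audio: ("fall my ant", 0.30),
}

dominant, text = select_dominant_model("<audio>", si_model, sd_models, 1)
print(dominant, "->", text)
```

With a budget of one speaker dependent model, only the first one is run alongside the speaker independent model; with a budget of zero, the speaker independent model is selected by default.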

Type: Grant

Filed: December 4, 2017

Date of Patent: June 30, 2020

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie

System and method for speech personalization by need

Patent number: 10504505

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.

Type: Grant

Filed: December 4, 2017

Date of Patent: December 10, 2019

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal

Systems and methods of providing modified media content

Patent number: 10291966

Abstract: A method includes receiving, at a content server from a media device, a request for media content at a first playback rate. The media content is available to the content server at a second playback rate that is different from the first playback rate. The method includes generating modified media content by modifying a first portion of the media content to have a second format corresponding to a third media playback rate. The first portion has a first media characteristic. The third playback rate is different than the first playback rate and is different than the second playback rate. The third playback rate is selected such that the modified media content has a third format corresponding to the first playback rate. The method further includes sending the modified media content from the content server to a media device.

Type: Grant

Filed: March 23, 2017

Date of Patent: May 14, 2019

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie

Multi-channel speech recognition

Patent number: 10199035

Abstract: Systems, methods, and computer-readable storage devices for performing per-channel automatic speech recognition. An example system configured to practice the method combines a first audio signal of a first speaker in a communication session and a second audio signal from a second speaker in the communication session as a first audio channel and a second audio channel. The system can recognize speech in the first audio channel of the recording using a first model specific to the first speaker, and recognize speech in the second audio channel of the recording using a second model specific to the second speaker, wherein the first model is different from the second model. The system can generate recognized speech as an output from the communication session. The system can identify the models based on identifiers of the speakers, such as a telephone number, an IP address, a customer number, or account number.

Type: Grant

Filed: November 22, 2013

Date of Patent: February 5, 2019

Assignee: Nuance Communications, Inc.

Inventors: Ilya Dan Melamed, Andrej Ljolje

System and method for optimizing speech recognition and natural language parameters with user feedback

Patent number: 9984679

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.

Type: Grant

Filed: July 18, 2016

Date of Patent: May 29, 2018

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra

Automatic disclosure detection

Patent number: 9934792

Abstract: A method of detecting pre-determined phrases to determine compliance quality of an agent includes determining a presence of a predetermined input based on a comparison between stored pre-determined phrases and a received communication, and determining a compliance rating of the agent based on a presence of a pre-determined phrase associated with the predetermined input in the communication.
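A minimal sketch of the two-step check described in this abstract follows. It rests on assumptions not stated in the source: phrases are matched by plain case-insensitive substring search on a text transcript, each rule pairs a hypothetical trigger phrase with the disclosure phrase it requires, and the rating is the fraction of triggered rules whose disclosure was found.

```python
def compliance_rating(transcript, rules):
    """Rate a communication against stored phrase rules.

    rules: list of dicts with a "trigger" phrase (the predetermined input)
    and a "required" phrase (the disclosure associated with that input).
    Returns None when no trigger is present, otherwise the fraction of
    triggered rules whose required phrase appears in the transcript.
    """
    # Normalize case and whitespace before the substring comparison.
    text = " ".join(transcript.lower().split())
    triggered = [rule for rule in rules if rule["trigger"] in text]
    if not triggered:
        return None
    satisfied = sum(1 for rule in triggered if rule["required"] in text)
    return satisfied / len(triggered)


# Hypothetical stored phrases (illustrative only).
rules = [
    {"trigger": "credit card", "required": "this call may be recorded"},
    {"trigger": "cancel my account", "required": "cancellation fees may apply"},
]

call = ("Hello, this call may be recorded. "
        "I can take your credit card number, or cancel my account for you.")
print(compliance_rating(call, rules))
```

In the example call both triggers occur but only one required disclosure is spoken, so the sketch rates the call at one half. A production system would work from recognized speech and tolerate recognition errors, which exact substring matching does not.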

Type: Grant

Filed: January 31, 2017

Date of Patent: April 3, 2018

Assignee: AT&T Intellectual Property I, L.P.

Inventors: I. Dan Melamed, Andrej Ljolje, Bernard S. Renger, David J. Smith, Yeon-Jun Kim

System and method for increasing recognition rates of in-vocabulary words by improving pronunciation modeling

Patent number: 9880996

Abstract: The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes overgenerating potential pronunciations by converting portions of symbolic input into a number of possible lexical pronunciation variants based on an established set of conversion rules, wherein the symbolic input comprises labeled speech data, and selecting pronunciations in a speech recognition context from the potential pronunciations to yield selected potential pronunciations. The method further includes retraining the established set of conversion rules based on the selected potential pronunciations.

Type: Grant

Filed: November 12, 2014

Date of Patent: January 30, 2018

Assignee: Nuance Communications, Inc.

Inventors: Alistair D. Conkie, Mazin Gilbert, Andrej Ljolje

System and method for speech personalization by need

Patent number: 9837071

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.

Type: Grant

Filed: April 6, 2015

Date of Patent: December 5, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal

System and method for personalization of acoustic models for automatic speech recognition

Patent number: 9837072

Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.

Type: Grant

Filed: May 15, 2017

Date of Patent: December 5, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie

System and method for handling missing speech data

Patent number: 9773497

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
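The hypothesis-selection step in this abstract can be illustrated with a small sketch. Because each hypothesis can have an identical acoustic score, the choice here is made by the language model alone. The bigram log-probability table, the candidate list, the `floor` value for unseen bigrams, and the function name `best_fill` are all assumptions for illustration, not details from the patent.

```python
def best_fill(left, right, candidates, bigram_logprob, floor=-10.0):
    """Pick the candidate word that best fits between left and right.

    All candidates are treated as acoustically equal, so only the bigram
    language model score (left, word) + (word, right) decides. Unseen
    bigrams receive the assumed floor log-probability.
    """
    def score(word):
        return (bigram_logprob.get((left, word), floor)
                + bigram_logprob.get((word, right), floor))

    return max(candidates, key=score)


# Hypothetical bigram log-probabilities (illustrative only).
bigram_logprob = {
    ("the", "cat"): -1.0,
    ("cat", "sat"): -1.5,
    ("the", "cut"): -4.0,
    ("cut", "sat"): -6.0,
}

# Received speech: "the <missing segment> sat"
filled = best_fill("the", "sat", ["cut", "cat", "cot"], bigram_logprob)
print("the", filled, "sat")
```

A fuller version would also restrict candidates to those matching the identified duration of the missing segment, as the abstract describes.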

Type: Grant

Filed: March 2, 2016

Date of Patent: September 26, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Alistair D. Conkie

Systems and methods of providing modified media content

Publication number: 20170195741

Abstract: A method includes receiving, at a content server from a media device, a request for media content at a first playback rate. The media content is available to the content server at a second playback rate that is different from the first playback rate. The method includes generating modified media content by modifying a first portion of the media content to have a second format corresponding to a third media playback rate. The first portion has a first media characteristic. The third playback rate is different than the first playback rate and is different than the second playback rate. The third playback rate is selected such that the modified media content has a third format corresponding to the first playback rate. The method further includes sending the modified media content from the content server to a media device.

Type: Application

Filed: March 23, 2017

Publication date: July 6, 2017

Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie

System and method for personalization of acoustic models for automatic speech recognition

Patent number: 9653069

Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.

Type: Grant

Filed: April 30, 2015

Date of Patent: May 16, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie

Systems and methods of providing modified media content

Patent number: 9628741

Abstract: A method includes processing media content. The media content includes audio data corresponding to a first audio playback rate and video data corresponding to a first video playback rate. Processing the media content includes identifying a speech portion of the audio data. The speech portion includes a consonant portion. The method further includes producing modified media content. The modified media content is produced based on modifying the video data and modifying the audio data. Modifying the audio data includes applying a non-linear transformation to the speech portion identified in the audio data. The method further includes storing the modified media content.

Type: Grant

Filed: October 11, 2012

Date of Patent: April 18, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie

Automatic disclosure detection

Patent number: 9607279

Abstract: A method of detecting pre-determined phrases to determine compliance quality includes determining whether a precursor event has occurred based on a comparison between stored pre-determined phrases and a received communication, and determining a compliance rating based on a presence of a pre-determined phrase associated with the precursor event in the communication.

Type: Grant

Filed: April 15, 2015

Date of Patent: March 28, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: I. Dan Melamed, Andrej Ljolje, Bernard Renger, Yeon-Jun Kim, David J. Smith

System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring

Patent number: 9576582

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker, resulting in collected speech, and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally, the method includes recognizing, via a processor, additional speech from the target speaker using the custom speech model.

Type: Grant

Filed: February 23, 2016

Date of Patent: February 21, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal

System and method for handling repeat queries due to wrong ASR output by modifying an acoustic, a language and a semantic model

Patent number: 9484024

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for handling expected repeat speech queries or other inputs. The method causes a computing device to detect a misrecognized speech query from a user, determine a tendency of the user to repeat speech queries based on previous user interactions, and adapt a speech recognition model based on the determined tendency before an expected repeat speech query. The method can further include recognizing the expected repeat speech query from the user based on the adapted speech recognition model. Adapting the speech recognition model can include modifying an acoustic model, a language model, and a semantic model. Adapting the speech recognition model can also include preparing a personalized search speech recognition model for the expected repeat query based on usage history and entries in a recognition lattice. The method can include retaining unmodified speech recognition models with adapted speech recognition models.

Type: Grant

Filed: March 24, 2015

Date of Patent: November 1, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro

System and method for discriminative pronunciation modeling for voice search

Patent number: 9484019

Abstract: Disclosed herein is a method for speech recognition. The method includes receiving speech utterances; assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1; for each received speech utterance, optimizing the pronunciation weight by identifying word and phone alignments and corresponding likelihood scores, and discriminatively adapting the pronunciation weight to minimize classification errors; and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information, maximum likelihood training, minimum classification error training, or other functions known to those of skill in the art.
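The normalization constraint mentioned in this abstract (weights summing to 1 at the unit-of-speech level) can be shown with a short sketch. It takes the word as the unit of speech and uses hypothetical raw scores; the discriminative adaptation step itself is not implemented here, and the function name is an assumption.

```python
def normalize_pronunciation_weights(raw_scores):
    """Normalize pronunciation scores so each word's weights sum to 1.

    raw_scores maps a word to a dict of pronunciation -> positive score.
    """
    normalized = {}
    for word, variants in raw_scores.items():
        total = sum(variants.values())
        if total <= 0:
            raise ValueError(f"scores for {word!r} must sum to a positive value")
        normalized[word] = {pron: score / total
                            for pron, score in variants.items()}
    return normalized


# Hypothetical raw scores, e.g. counts from aligned training utterances.
raw = {
    "tomato": {"T AH M EY T OW": 3.0, "T AH M AA T OW": 1.0},
    "data": {"D EY T AH": 1.0, "D AE T AH": 1.0},
}

weights = normalize_pronunciation_weights(raw)
for word, variants in weights.items():
    print(word, variants)
```

Any adaptation procedure that updates the raw scores could reapply this step afterward so the weights for each word remain a valid distribution.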

Type: Grant

Filed: October 11, 2012

Date of Patent: November 1, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
