Patents by Inventor Alistair D. Conkie

Alistair D. Conkie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for prosodically modified unit selection databases

Patent number: 10249290

Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.

Type: Grant

Filed: June 11, 2018

Date of Patent: April 2, 2019

Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal

SYSTEM AND METHOD FOR DISTRIBUTED VOICE MODELS ACROSS CLOUD AND DEVICE FOR EMBEDDED TEXT-TO-SPEECH

Publication number: 20190088249

Abstract: Systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify, in a local cache of text-to-speech units for a text-to-speech voice, an absent text-to-speech unit which is not in the local cache. The system can request from a server the absent text-to-speech unit. The system can then synthesize speech using the text-to-speech units and a received text-to-speech unit from the server.

Type: Application

Filed: November 19, 2018

Publication date: March 21, 2019

Inventors: Benjamin J. STERN, Mark Charles BEUTNAGEL, Alistair D. CONKIE, Horst J. SCHROETER, Amanda Joy STENT
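
The cache-then-request flow described in this abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the function name `synthesize_with_cache`, the use of a plain dictionary as the local cache, the `fetch_from_server` callback, and the idea that synthesis is simple byte concatenation are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def synthesize_with_cache(
    unit_ids: Sequence[str],
    local_cache: Dict[str, bytes],
    fetch_from_server: Callable[[str], bytes],
) -> Tuple[bytes, List[str]]:
    """Concatenate audio for unit_ids, requesting absent units from a server.

    Hypothetical sketch: real unit-selection synthesis does far more than
    join byte strings, and the server protocol is not specified here.
    """
    # Identify units that the local cache lacks (each one only once).
    absent = [u for u in dict.fromkeys(unit_ids) if u not in local_cache]

    # Request each absent unit from the server.
    received = {u: fetch_from_server(u) for u in absent}

    # Synthesize from cached units plus the units received from the server.
    audio = b"".join(
        local_cache[u] if u in local_cache else received[u] for u in unit_ids
    )
    return audio, absent
```

Whether received units are then written back into the local cache is a policy choice the abstract does not settle, so the sketch leaves the cache unmodified.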

System and method for unified normalization in text-to-speech and automatic speech recognition

Patent number: 10199034

Abstract: Systems, methods, and computer-readable storage devices for using a single set of normalization protocols and a single language lexicon (or dictionary) for both TTS and ASR. The system receives input (which is either text to be converted to speech or ASR training text), then normalizes the input. The system produces, using the normalized input and a dictionary configured for both automatic speech recognition and text-to-speech processing, output which is either phonemes corresponding to the input or text corresponding to the input for training the ASR system. When the output is phonemes corresponding to the input, the system generates speech by performing prosody generation and unit selection synthesis using the phonemes. When the output is text corresponding to the input, the system trains both an acoustic model and a language model for use in future speech recognition.

Type: Grant

Filed: August 18, 2014

Date of Patent: February 5, 2019

Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Alistair D. Conkie, Ladan Golipour

System and Method for Unit Selection Text-to-Speech Using a Modified Viterbi Approach

Publication number: 20190019496

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for speech synthesis. A system practicing the method receives a set of ordered lists of speech units, for each respective speech unit in each ordered list in the set of ordered lists, constructs a sublist of speech units from a next ordered list which are suitable for concatenation, performs a cost analysis of paths through the set of ordered lists of speech units based on the sublist of speech units for each respective speech unit, and synthesizes speech using a lowest cost path of speech units through the set of ordered lists based on the cost analysis. The ordered lists can be ordered based on the respective pitch of each speech unit. In one embodiment, speech units which do not have an assigned pitch can be assigned a pitch.

Type: Application

Filed: September 17, 2018

Publication date: January 17, 2019

Inventor: Alistair D. CONKIE
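
As a rough illustration of the search this abstract describes, the sketch below keeps each candidate list ordered by pitch, builds for every unit a sublist of next-list units considered suitable for concatenation, and runs a dynamic-programming pass for the lowest-cost path. Several details are assumptions made for the example and are not taken from the patent: the `Unit` fields, the `max_jump` pitch window used to decide suitability, and the use of absolute pitch difference as the concatenation cost.

```python
from bisect import bisect_left, bisect_right
from typing import List, NamedTuple, Sequence, Tuple


class Unit(NamedTuple):
    name: str
    pitch: float        # assumed to be already assigned for every unit
    target_cost: float  # hypothetical per-unit cost


def lowest_cost_path(
    lists: Sequence[Sequence[Unit]], max_jump: float = 20.0
) -> Tuple[float, List[Unit]]:
    """Return (total cost, path) through one unit per list."""
    if not lists or any(len(l) == 0 for l in lists):
        raise ValueError("every position needs at least one candidate unit")
    # Order every candidate list by pitch so sublists are contiguous slices.
    ordered = [sorted(l, key=lambda u: u.pitch) for l in lists]
    inf = float("inf")
    cost = [[inf] * len(l) for l in ordered]
    back = [[-1] * len(l) for l in ordered]
    for j, u in enumerate(ordered[0]):
        cost[0][j] = u.target_cost

    for i in range(len(ordered) - 1):
        nxt = ordered[i + 1]
        pitches = [u.pitch for u in nxt]
        for j, u in enumerate(ordered[i]):
            if cost[i][j] == inf:
                continue
            # Sublist of next-list units within the assumed pitch window.
            lo = bisect_left(pitches, u.pitch - max_jump)
            hi = bisect_right(pitches, u.pitch + max_jump)
            for k in range(lo, hi):
                c = cost[i][j] + abs(nxt[k].pitch - u.pitch) + nxt[k].target_cost
                if c < cost[i + 1][k]:
                    cost[i + 1][k] = c
                    back[i + 1][k] = j

    last = len(ordered) - 1
    end = min(range(len(ordered[last])), key=lambda j: cost[last][j])
    if cost[last][end] == inf:
        raise ValueError("no path satisfies the concatenation constraint")
    # Follow back pointers from the cheapest final unit.
    path, j = [], end
    for i in range(last, -1, -1):
        path.append(ordered[i][j])
        j = back[i][j]
    return cost[last][end], path[::-1]
```

Restricting each unit to a pitch-bounded sublist means the inner loop visits only a slice of the next list rather than all of it, which is where a search of this shape would save work compared with a full Viterbi pass.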

System and method for distributed voice models across cloud and device for embedded text-to-speech

Patent number: 10134383

Abstract: Systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify, in a local cache of text-to-speech units for a text-to-speech voice, an absent text-to-speech unit which is not in the local cache. The system can request from a server the absent text-to-speech unit. The system can then synthesize speech using the text-to-speech units and a received text-to-speech unit from the server.

Type: Grant

Filed: September 8, 2017

Date of Patent: November 20, 2018

Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Benjamin J. Stern, Mark Charles Beutnagel, Alistair D. Conkie, Horst J. Schroeter, Amanda Joy Stent

SYSTEM AND METHOD FOR PROSODICALLY MODIFIED UNIT SELECTION DATABASES

Publication number: 20180293972

Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.

Type: Application

Filed: June 11, 2018

Publication date: October 11, 2018

Inventors: Alistair D. CONKIE, Ladan GOLIPOUR, Ann K. SYRDAL

System and method for unit selection text-to-speech using a modified Viterbi approach

Patent number: 10079011

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for speech synthesis. A system practicing the method receives a set of ordered lists of speech units, for each respective speech unit in each ordered list in the set of ordered lists, constructs a sublist of speech units from a next ordered list which are suitable for concatenation, performs a cost analysis of paths through the set of ordered lists of speech units based on the sublist of speech units for each respective speech unit, and synthesizes speech using a lowest cost path of speech units through the set of ordered lists based on the cost analysis. The ordered lists can be ordered based on the respective pitch of each speech unit. In one embodiment, speech units which do not have an assigned pitch can be assigned a pitch.

Type: Grant

Filed: May 20, 2014

Date of Patent: September 18, 2018

Assignee: NUANCE COMMUNICATIONS, INC.

Inventor: Alistair D. Conkie

System and Method for Data-Driven Socially Customized Models for Language Generation

Publication number: 20180261209

Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialogue or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.

Type: Application

Filed: May 14, 2018

Publication date: September 13, 2018

Inventors: Taniya MISHRA, Alistair D. CONKIE, Svetlana STOYANCHEV

System and method for prosodically modified unit selection databases

Patent number: 9997154

Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.

Type: Grant

Filed: May 12, 2014

Date of Patent: June 12, 2018

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal

System and method for automatic detection of abnormal stress patterns in unit selection synthesis

Patent number: 9978360

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for detecting and correcting abnormal stress patterns in unit-selection speech synthesis. A system practicing the method detects incorrect stress patterns in selected acoustic units representing speech to be synthesized, and corrects the incorrect stress patterns in the selected acoustic units to yield corrected stress patterns. The system can further synthesize speech based on the corrected stress patterns. In one aspect, the system also classifies the incorrect stress patterns using a machine learning algorithm such as a classification and regression tree, adaptive boosting, support vector machine, and maximum entropy. In this way a text-to-speech unit selection speech synthesizer can produce more natural sounding speech with suitable stress patterns regardless of the stress of units in a unit selection database.

Type: Grant

Filed: February 22, 2016

Date of Patent: May 22, 2018

Assignee: NUANCE COMMUNICATIONS, INC.

Inventors: Yeon-Jun Kim, Mark Charles Beutnagel, Alistair D. Conkie, Ann K. Syrdal

System and method for data-driven socially customized models for language generation

Patent number: 9972309

Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialog or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.

Type: Grant

Filed: August 5, 2016

Date of Patent: May 15, 2018

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Taniya Mishra, Alistair D. Conkie, Svetlana Stoyanchev

Unit-selection text-to-speech synthesis based on predicted concatenation parameters

Patent number: 9934775

Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In an example process, text to be converted to speech is received. The text is represented as a sequence of target units. A plurality of candidate speech segments corresponding to the sequence of target units are selected. Predicted statistical parameters of acoustic features associated with the sequence of target units are determined. The predicted statistical parameters of acoustic features are used to determine target costs and concatenation costs associated with the plurality of candidate speech segments. Based on a combined cost determined from the target costs and concatenation costs, a subset of candidate speech segments is selected from the plurality of candidate speech segments. Speech corresponding to the received text is generated using the subset of candidate speech segments.

Type: Grant

Filed: September 15, 2016

Date of Patent: April 3, 2018

Assignee: Apple Inc.

Inventors: Tuomo J. Raitio, Kishore Sunkeswari Prahallad, Alistair D. Conkie, Ladan Golipour, David A. Winarsky

SYSTEM AND METHOD FOR SPEECH PERSONALIZATION BY NEED

Publication number: 20180090129

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.

Type: Application

Filed: December 4, 2017

Publication date: March 29, 2018

Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. SYRDAL

System and Method for Personalization of Acoustic Models for Automatic Speech Recognition

Publication number: 20180090130

Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.

Type: Application

Filed: December 4, 2017

Publication date: March 29, 2018

Inventors: Andrej LJOLJE, Diamantino Antonio CASEIRO, Alistair D. CONKIE

SYSTEM AND METHOD FOR LOW-LATENCY WEB-BASED TEXT-TO-SPEECH WITHOUT PLUGINS

Publication number: 20180047384

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for reducing latency in web-browsing TTS systems without the use of a plug-in or Flash® module. A system configured according to the disclosed methods allows the browser to send prosodically meaningful sections of text to a web server. A TTS server then converts intonational phrases of the text into audio and responds to the browser with the audio file. The system saves the audio file in a cache, with the file indexed by a unique identifier. As the system continues converting text into speech, when identical text appears the system uses the cached audio corresponding to the identical text without the need for re-synthesis via the TTS server.

Type: Application

Filed: October 23, 2017

Publication date: February 15, 2018

Inventors: Alistair D. CONKIE, Mark Charles BEUTNAGEL, Taniya MISHRA
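
The caching behaviour in this abstract can be illustrated with a small sketch. It is written in Python for brevity even though the patent concerns a browser client, and every specific in it is an assumption for illustration: the punctuation-based `split_phrases` stands in for real intonational-phrase detection, a SHA-256 digest of the phrase text stands in for the unique identifier, and `tts_server` is a hypothetical callable representing the round trip to the TTS server.

```python
import hashlib
import re
from typing import Callable, Dict, List


def split_phrases(text: str) -> List[str]:
    """Split text after punctuation; a crude stand-in for phrase detection."""
    parts = re.split(r"(?<=[.,;:!?])\s+", text.strip())
    return [p for p in parts if p]


class PhraseAudioCache:
    """Cache of synthesized audio, indexed by an identifier per phrase."""

    def __init__(self, tts_server: Callable[[str], bytes]) -> None:
        self._tts_server = tts_server      # hypothetical server call
        self._audio: Dict[str, bytes] = {}

    @staticmethod
    def key(phrase: str) -> str:
        # Assumed identifier scheme: hash of the exact phrase text.
        return hashlib.sha256(phrase.encode("utf-8")).hexdigest()

    def audio_for(self, phrase: str) -> bytes:
        k = self.key(phrase)
        if k not in self._audio:
            # Cache miss: synthesize once via the server and store the result.
            self._audio[k] = self._tts_server(phrase)
        # Identical text reuses the stored audio without re-synthesis.
        return self._audio[k]

    def speak(self, text: str) -> List[bytes]:
        return [self.audio_for(p) for p in split_phrases(text)]
```

Sending phrase-sized sections rather than whole pages is what would let playback begin before the full text is synthesized, and repeated phrases cost no further server requests.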

System and method for increasing recognition rates of in-vocabulary words by improving pronunciation modeling

Patent number: 9880996

Abstract: The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes overgenerating potential pronunciations by converting portions of symbolic input into a number of possible lexical pronunciation variants based on an established set of conversion rules, wherein the symbolic input comprises labeled speech data, and selecting pronunciations in a speech recognition context from the potential pronunciations to yield selected potential pronunciations. The method further includes retraining the established set of conversion rules based on the selected potential pronunciations.

Type: Grant

Filed: November 12, 2014

Date of Patent: January 30, 2018

Assignee: Nuance Communications, Inc.

Inventors: Alistair D. Conkie, Mazin Gilbert, Andrej Ljolje

System and Method for Handling Missing Speech Data

Publication number: 20180018962

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.

Type: Application

Filed: September 25, 2017

Publication date: January 18, 2018

Inventors: Andrej LJOLJE, Alistair D. CONKIE
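
One way to picture the hypothesis-selection step in this abstract is the toy sketch below. It assumes a word-level view in which the missing segment is marked with `None`, every hypothesis receives the same acoustic score, and a caller-supplied bigram log-probability function plays the role of the language model. The function names, the `<s>` and `</s>` boundary tokens, and the bigram model are illustrative assumptions, not details from the patent.

```python
from typing import Callable, List, Optional, Sequence


def best_hypothesis(
    left: str,
    right: str,
    candidates: Sequence[str],
    bigram_logprob: Callable[[str, str], float],
    acoustic_score: float = 0.0,
) -> str:
    """Pick the candidate that best fits the surrounding speech context."""

    def score(word: str) -> float:
        # The acoustic score is identical for every hypothesis, so the
        # ranking is decided by the language-model terms alone.
        return acoustic_score + bigram_logprob(left, word) + bigram_logprob(word, right)

    return max(candidates, key=score)


def fill_missing(
    words: Sequence[Optional[str]],
    candidates: Sequence[str],
    bigram_logprob: Callable[[str, str], float],
) -> List[Optional[str]]:
    """Insert the best hypothesis wherever a word is missing (None).

    Assumes isolated gaps; adjacent missing words are not handled jointly.
    """
    out = list(words)
    for i, w in enumerate(out):
        if w is None:
            left = out[i - 1] if i > 0 else "<s>"
            right = out[i + 1] if i + 1 < len(out) else "</s>"
            out[i] = best_hypothesis(left, right, candidates, bigram_logprob)
    return out
```

A real system would generate hypotheses matching the identified duration of the gap and could also consult a pronouncing lexicon, both of which this sketch omits.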

SYSTEM AND METHOD FOR DISTRIBUTED VOICE MODELS ACROSS CLOUD AND DEVICE FOR EMBEDDED TEXT-TO-SPEECH

Publication number: 20170372692

Abstract: Systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify, in a local cache of text-to-speech units for a text-to-speech voice, an absent text-to-speech unit which is not in the local cache. The system can request from a server the absent text-to-speech unit. The system can then synthesize speech using the text-to-speech units and a received text-to-speech unit from the server.

Type: Application

Filed: September 8, 2017

Publication date: December 28, 2017

Inventors: Benjamin J. STERN, Mark Charles BEUTNAGEL, Alistair D. CONKIE, Horst J. SCHROETER, Amanda Joy STENT

System and method for personalization of acoustic models for automatic speech recognition

Patent number: 9837072

Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.

Type: Grant

Filed: May 15, 2017

Date of Patent: December 5, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie

System and method for speech personalization by need

Patent number: 9837071

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.

Type: Grant

Filed: April 6, 2015

Date of Patent: December 5, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
