Patents by Inventor Alistair D. Conkie

Alistair D. Conkie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

UNIT-SELECTION TEXT-TO-SPEECH SYNTHESIS BASED ON PREDICTED CONCATENATION PARAMETERS

Publication number: 20170345411

Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In an example process, text to be converted to speech is received. The text is represented as a sequence of target units. A plurality of candidate speech segments corresponding to the sequence of target units are selected. Predicted statistical parameters of acoustic features associated with the sequence of target units are determined. The predicted statistical parameters of acoustic features are used to determine target costs and concatenation costs associated with the plurality of candidate speech segments. Based on a combined cost determined from the target costs and concatenation costs, a subset of candidate speech segments is selected from the plurality of candidate speech segments. Speech corresponding to the received text is generated using the subset of candidate speech segments.

Type: Application

Filed: September 15, 2016

Publication date: November 30, 2017

Inventors: Tuomo J. RAITIO, Kishore Sunkeswari PRAHALLAD, Alistair D. CONKIE, Ladan GOLIPOUR, David A. WINARSKY
System and method for low-latency web-based text-to-speech without plugins

Patent number: 9799323

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for reducing latency in web-browsing TTS systems without the use of a plug-in or Flash® module. A system configured according to the disclosed methods allows the browser to send prosodically meaningful sections of text to a web server. A TTS server then converts intonational phrases of the text into audio and responds to the browser with the audio file. The system saves the audio file in a cache, with the file indexed by a unique identifier. As the system continues converting text into speech, when identical text appears the system uses the cached audio corresponding to the identical text without the need for re-synthesis via the TTS server.

Type: Grant

Filed: December 14, 2015

Date of Patent: October 24, 2017

Assignee: Nuance Communications, Inc.

Inventors: Alistair D. Conkie, Mark Charles Beutnagel, Taniya Mishra
System and method for handling missing speech data

Patent number: 9773497

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.

Type: Grant

Filed: March 2, 2016

Date of Patent: September 26, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Alistair D. Conkie
System and method for managing models for embedded speech and language processing

Patent number: 9773498

Abstract: Disclosed herein are systems, methods, and computer-readable storage devices for fetching speech processing models based on context changes in advance of speech requests using the speech processing models. An example local device configured to practice the method, having a local speech processor, and having access to remote speech models, detects a change in context. The change in context can be based on geographical location, language translation, speech in a different language, user language settings, installing or removing an app, and so forth. The local device can determine a speech processing model that is likely to be needed based on the change in context, and that is not stored on the local device. Independently of an explicit request to process speech, the local device can retrieve, from a remote server, the speech processing model for use on the mobile device.

Type: Grant

Filed: December 1, 2016

Date of Patent: September 26, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Benjamin J. Stern, Enrico Luigi Bocchieri, Alistair D. Conkie, Danilo Giulianelli
System and method for distributed voice models across cloud and device for embedded text-to-speech

Patent number: 9761218

Abstract: Systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify, in a local cache of text-to-speech units for a text-to-speech voice an absent text-to-speech unit which is not in the local cache. The system can request from a server the absent text-to-speech unit. The system can then synthesize speech using the text-to-speech units and a received text-to-speech unit from the server.

Type: Grant

Filed: November 30, 2015

Date of Patent: September 12, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Benjamin J. Stern, Mark Charles Beutnagel, Alistair D. Conkie, Horst J. Schroeter, Amanda Joy Stent
System and Method for Personalization of Acoustic Models for Automatic Speech Recognition

Publication number: 20170249937

Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.

Type: Application

Filed: May 15, 2017

Publication date: August 31, 2017

Inventors: Andrej LJOLJE, Diamantino Antonio CASEIRO, Alistair D. CONKIE
Partitioning of Markup Language Documents

Publication number: 20170192943

Abstract: A hybrid markup language document (or “HMLD”) is scanned for a partition boundary. Content in the HMLD that precedes the partition boundary is discarded for simpler and faster processing.

Type: Application

Filed: March 18, 2017

Publication date: July 6, 2017

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Alistair D. Conkie, Mark C. Beutnagel
System and method for personalization of acoustic models for automatic speech recognition

Patent number: 9653069

Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.

Type: Grant

Filed: April 30, 2015

Date of Patent: May 16, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie
Partitioning of markup language documents

Patent number: 9632989

Abstract: A hybrid markup language document (or “HMLD”) is scanned for a partition boundary. Content in the HMLD that precedes the partition boundary is discarded for simpler and faster processing.

Type: Grant

Filed: October 30, 2014

Date of Patent: April 25, 2017

Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Mark C. Beutnagel, Alistair D. Conkie
SYSTEM AND METHOD FOR MANAGING MODELS FOR EMBEDDED SPEECH AND LANGUAGE PROCESSING

Publication number: 20170084270

Abstract: Disclosed herein are systems, methods, and computer-readable storage devices for fetching speech processing models based on context changes in advance of speech requests using the speech processing models. An example local device configured to practice the method, having a local speech processor, and having access to remote speech models, detects a change in context. The change in context can be based on geographical location, language translation, speech in a different language, user language settings, installing or removing an app, and so forth. The local device can determine a speech processing model that is likely to be needed based on the change in context, and that is not stored on the local device. Independently of an explicit request to process speech, the local device can retrieve, from a remote server, the speech processing model for use on the mobile device.

Type: Application

Filed: December 1, 2016

Publication date: March 23, 2017

Inventors: Benjamin J. STERN, Enrico Luigi BOCCHIERI, Alistair D. CONKIE, Danilo GIULIANELLI
System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring

Patent number: 9576582

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.

Type: Grant

Filed: February 23, 2016

Date of Patent: February 21, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
System and method for generalized preselection for unit selection synthesis

Patent number: 9564121

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporates the supplemental phoneset as an extra feature.

Type: Grant

Filed: August 7, 2014

Date of Patent: February 7, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Alistair D. Conkie, Mark Beutnagel, Yeon-Jun Kim, Ann K. Syrdal
System and method for audibly presenting selected text

Patent number: 9558737

Abstract: Disclosed herein are methods for presenting speech from a selected text that is on a computing device. This method includes presenting text on a touch-sensitive display and having that text size within a threshold level so that the computing device can accurately determine the intent of the user when the user touches the touch screen. Once the user touch has been received, the computing device identifies and interprets the portion of text that is to be selected, and subsequently presents the text audibly to the user.

Type: Grant

Filed: July 21, 2015

Date of Patent: January 31, 2017

Assignee: Interactions LLC

Inventors: Alistair D. Conkie, Horst Schroeter
SYSTEM AND METHOD FOR DATA-DRIVEN SOCIALLY CUSTOMIZED MODELS FOR LANGUAGE GENERATION

Publication number: 20170004825

Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialogue or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.

Type: Application

Filed: August 5, 2016

Publication date: January 5, 2017

Inventors: Taniya MISHRA, Alistair D. CONKIE, Svetlana STOYANCHEV
System and method for managing models for embedded speech and language processing

Patent number: 9530416

Abstract: Disclosed herein are systems, methods, and computer-readable storage devices for fetching speech processing models based on context changes in advance of speech requests using the speech processing models. An example local device configured to practice the method, having a local speech processor, and having access to remote speech models, detects a change in context. The change in context can be based on geographical location, language translation, speech in a different language, user language settings, installing or removing an app, and so forth. The local device can determine a speech processing model that is likely to be needed based on the change in context, and that is not stored on the local device. Independently of an explicit request to process speech, the local device can retrieve, from a remote server, the speech processing model for use on the mobile device.

Type: Grant

Filed: October 28, 2013

Date of Patent: December 27, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Benjamin J. Stern, Enrico Luigi Bocchieri, Alistair D. Conkie, Danilo Giulianelli
System and method of synthetic voice generation and modification

Patent number: 9495954

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating a synthetic voice. A system configured to practice the method combines a first database of a first text-to-speech voice and a second database of a second text-to-speech voice to generate a combined database, selects from the combined database, based on a policy, voice units of a phonetic category for the synthetic voice to yield selected voice units, and synthesizes speech based on the selected voice units. The system can synthesize speech without parameterizing the first text-to-speech voice and the second text-to-speech voice. A policy can define, for a particular phonetic category, from which text-to-speech voice to select voice units. The combined database can include multiple text-to-speech voices from different speakers. The combined database can include voices of a single speaker speaking in different styles. The combined database can include voices of different languages.

Type: Grant

Filed: February 22, 2016

Date of Patent: November 15, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Alistair D. Conkie, Ann K. Syrdal
System and method for discriminative pronunciation modeling for voice search

Patent number: 9484019

Abstract: Disclosed herein is a method for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by identifying word and phone alignments and corresponding likelihood scores, and discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information, maximum likelihood training, minimum classification error training, or other functions known to those of skill in the art.

Type: Grant

Filed: October 11, 2012

Date of Patent: November 1, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
System and method for pronunciation modeling

Patent number: 9431011

Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.

Type: Grant

Filed: September 17, 2014

Date of Patent: August 30, 2016

Assignee: Interactions LLC

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
System and method for cloud-based text-to-speech web services

Patent number: 9412359

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating speech. One variation of the method is from a server side, and another variation of the method is from a client side. The server side method, as implemented by a network-based automatic speech processing system, includes first receiving, from a network client independent of knowledge of internal operations of the system, a request to generate a text-to-speech voice. The request can include speech samples, transcriptions of the speech samples, and metadata describing the speech samples. The system extracts sound units from the speech samples based on the transcriptions and generates an interactive demonstration of the text-to-speech voice based on the sound units, the transcriptions, and the metadata, wherein the interactive demonstration hides a back end processing implementation from the network client. The system provides access to the interactive demonstration to the network client.

Type: Grant

Filed: April 13, 2015

Date of Patent: August 9, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Mark Charles Beutnagel, Alistair D. Conkie, Yeon-Jun Kim, Horst Juergen Schroeter
System and method for data-driven socially customized models for language generation

Patent number: 9412358

Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialogue or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.

Type: Grant

Filed: May 13, 2014

Date of Patent: August 9, 2016

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Taniya Mishra, Alistair D. Conkie, Svetlana Stoyanchev

prev 1 2 3 4 5 6 7 … next