Patents by Inventor Alistair D. Conkie

Alistair D. Conkie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170345411
    Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In an example process, text to be converted to speech is received. The text is represented as a sequence of target units. A plurality of candidate speech segments corresponding to the sequence of target units are selected. Predicted statistical parameters of acoustic features associated with the sequence of target units are determined. The predicted statistical parameters of acoustic features are used to determine target costs and concatenation costs associated with the plurality of candidate speech segments. Based on a combined cost determined from the target costs and concatenation costs, a subset of candidate speech segments is selected from the plurality of candidate speech segments. Speech corresponding to the received text is generated using the subset of candidate speech segments.
    Type: Application
    Filed: September 15, 2016
    Publication date: November 30, 2017
    Inventors: Tuomo J. RAITIO, Kishore Sunkeswari PRAHALLAD, Alistair D. CONKIE, Ladan GOLIPOUR, David A. WINARSKY
  • Patent number: 9799323
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for reducing latency in web-browsing TTS systems without the use of a plug-in or Flash® module. A system configured according to the disclosed methods allows the browser to send prosodically meaningful sections of text to a web server. A TTS server then converts intonational phrases of the text into audio and responds to the browser with the audio file. The system saves the audio file in a cache, with the file indexed by a unique identifier. As the system continues converting text into speech, when identical text appears the system uses the cached audio corresponding to the identical text without the need for re-synthesis via the TTS server.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: October 24, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Alistair D. Conkie, Mark Charles Beutnagel, Taniya Mishra
  • Patent number: 9773497
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
    Type: Grant
    Filed: March 2, 2016
    Date of Patent: September 26, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Andrej Ljolje, Alistair D. Conkie
  • Patent number: 9773498
    Abstract: Disclosed herein are systems, methods, and computer-readable storage devices for fetching speech processing models based on context changes in advance of speech requests using the speech processing models. An example local device configured to practice the method, having a local speech processor, and having access to remote speech models, detects a change in context. The change in context can be based on geographical location, language translation, speech in a different language, user language settings, installing or removing an app, and so forth. The local device can determine a speech processing model that is likely to be needed based on the change in context, and that is not stored on the local device. Independently of an explicit request to process speech, the local device can retrieve, from a remote server, the speech processing model for use on the mobile device.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: September 26, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Benjamin J. Stern, Enrico Luigi Bocchieri, Alistair D. Conkie, Danilo Giulianelli
  • Patent number: 9761218
    Abstract: Systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify, in a local cache of text-to-speech units for a text-to-speech voice, an absent text-to-speech unit which is not in the local cache. The system can request from a server the absent text-to-speech unit. The system can then synthesize speech using the text-to-speech units and a received text-to-speech unit from the server.
    Type: Grant
    Filed: November 30, 2015
    Date of Patent: September 12, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Benjamin J. Stern, Mark Charles Beutnagel, Alistair D. Conkie, Horst J. Schroeter, Amanda Joy Stent
  • Publication number: 20170249937
    Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
    Type: Application
    Filed: May 15, 2017
    Publication date: August 31, 2017
    Inventors: Andrej LJOLJE, Diamantino Antonio CASEIRO, Alistair D. CONKIE
  • Publication number: 20170192943
    Abstract: A hybrid markup language document (or “HMLD”) is scanned for a partition boundary. Content in the HMLD that precedes the partition boundary is discarded for simpler and faster processing.
    Type: Application
    Filed: March 18, 2017
    Publication date: July 6, 2017
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Mark C. Beutnagel
  • Patent number: 9653069
    Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: May 16, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie
  • Patent number: 9632989
    Abstract: A hybrid markup language document (or “HMLD”) is scanned for a partition boundary. Content in the HMLD that precedes the partition boundary is discarded for simpler and faster processing.
    Type: Grant
    Filed: October 30, 2014
    Date of Patent: April 25, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mark C. Beutnagel, Alistair D. Conkie
  • Publication number: 20170084270
    Abstract: Disclosed herein are systems, methods, and computer-readable storage devices for fetching speech processing models based on context changes in advance of speech requests using the speech processing models. An example local device configured to practice the method, having a local speech processor, and having access to remote speech models, detects a change in context. The change in context can be based on geographical location, language translation, speech in a different language, user language settings, installing or removing an app, and so forth. The local device can determine a speech processing model that is likely to be needed based on the change in context, and that is not stored on the local device. Independently of an explicit request to process speech, the local device can retrieve, from a remote server, the speech processing model for use on the mobile device.
    Type: Application
    Filed: December 1, 2016
    Publication date: March 23, 2017
    Inventors: Benjamin J. STERN, Enrico Luigi BOCCHIERI, Alistair D. CONKIE, Danilo GIULIANELLI
  • Patent number: 9576582
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: February 23, 2016
    Date of Patent: February 21, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9564121
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
    Type: Grant
    Filed: August 7, 2014
    Date of Patent: February 7, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Mark Beutnagel, Yeon-Jun Kim, Ann K. Syrdal
  • Patent number: 9558737
    Abstract: Disclosed herein are methods for presenting speech from a selected text that is on a computing device. This method includes presenting text on a touch-sensitive display and having that text size within a threshold level so that the computing device can accurately determine the intent of the user when the user touches the touch screen. Once the user touch has been received, the computing device identifies and interprets the portion of text that is to be selected, and subsequently presents the text audibly to the user.
    Type: Grant
    Filed: July 21, 2015
    Date of Patent: January 31, 2017
    Assignee: Interactions LLC
    Inventors: Alistair D. Conkie, Horst Schroeter
  • Publication number: 20170004825
    Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialogue or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.
    Type: Application
    Filed: August 5, 2016
    Publication date: January 5, 2017
    Inventors: Taniya MISHRA, Alistair D. CONKIE, Svetlana STOYANCHEV
  • Patent number: 9530416
    Abstract: Disclosed herein are systems, methods, and computer-readable storage devices for fetching speech processing models based on context changes in advance of speech requests using the speech processing models. An example local device configured to practice the method, having a local speech processor, and having access to remote speech models, detects a change in context. The change in context can be based on geographical location, language translation, speech in a different language, user language settings, installing or removing an app, and so forth. The local device can determine a speech processing model that is likely to be needed based on the change in context, and that is not stored on the local device. Independently of an explicit request to process speech, the local device can retrieve, from a remote server, the speech processing model for use on the mobile device.
    Type: Grant
    Filed: October 28, 2013
    Date of Patent: December 27, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Benjamin J. Stern, Enrico Luigi Bocchieri, Alistair D. Conkie, Danilo Giulianelli
  • Patent number: 9495954
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating a synthetic voice. A system configured to practice the method combines a first database of a first text-to-speech voice and a second database of a second text-to-speech voice to generate a combined database, selects from the combined database, based on a policy, voice units of a phonetic category for the synthetic voice to yield selected voice units, and synthesizes speech based on the selected voice units. The system can synthesize speech without parameterizing the first text-to-speech voice and the second text-to-speech voice. A policy can define, for a particular phonetic category, from which text-to-speech voice to select voice units. The combined database can include multiple text-to-speech voices from different speakers. The combined database can include voices of a single speaker speaking in different styles. The combined database can include voices of different languages.
    Type: Grant
    Filed: February 22, 2016
    Date of Patent: November 15, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9484019
    Abstract: Disclosed herein is a method for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by identifying word and phone alignments and corresponding likelihood scores, and discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information, maximum likelihood training, minimum classification error training, or other functions known to those of skill in the art.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: November 1, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
  • Patent number: 9431011
    Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.
    Type: Grant
    Filed: September 17, 2014
    Date of Patent: August 30, 2016
    Assignee: Interactions LLC
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9412359
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating speech. One variation of the method is from a server side, and another variation of the method is from a client side. The server side method, as implemented by a network-based automatic speech processing system, includes first receiving, from a network client independent of knowledge of internal operations of the system, a request to generate a text-to-speech voice. The request can include speech samples, transcriptions of the speech samples, and metadata describing the speech samples. The system extracts sound units from the speech samples based on the transcriptions and generates an interactive demonstration of the text-to-speech voice based on the sound units, the transcriptions, and the metadata, wherein the interactive demonstration hides a back end processing implementation from the network client. The system provides access to the interactive demonstration to the network client.
    Type: Grant
    Filed: April 13, 2015
    Date of Patent: August 9, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mark Charles Beutnagel, Alistair D. Conkie, Yeon-Jun Kim, Horst Juergen Schroeter
  • Patent number: 9412358
    Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialogue or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.
    Type: Grant
    Filed: May 13, 2014
    Date of Patent: August 9, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Taniya Mishra, Alistair D. Conkie, Svetlana Stoyanchev
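
Illustrative sketch for patent number 9799323 above: the abstract describes caching synthesized audio under a unique identifier so that identical text is not re-synthesized by the TTS server. The Python below is a minimal sketch of that idea only, not the patented implementation. The names PhraseAudioCache, key_for, audio_for, fake_tts, and speak_all are hypothetical, and using a SHA-256 digest of whitespace-normalized text as the identifier is an assumption; the abstract does not say how the identifier is formed.

```python
import hashlib
from typing import Callable, Dict, List


class PhraseAudioCache:
    """Caches synthesized audio per phrase, keyed by a digest of the phrase text."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        # `synthesize` stands in for a round trip to a TTS server (hypothetical).
        self._synthesize = synthesize
        self._cache: Dict[str, bytes] = {}
        self.misses = 0  # number of times the synthesizer was actually called

    @staticmethod
    def key_for(phrase: str) -> str:
        # Assumption: the unique identifier is a SHA-256 digest of the text
        # after collapsing runs of whitespace.
        normalized = " ".join(phrase.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def audio_for(self, phrase: str) -> bytes:
        key = self.key_for(phrase)
        if key not in self._cache:
            # Cache miss: synthesize once and remember the result.
            self.misses += 1
            self._cache[key] = self._synthesize(phrase)
        return self._cache[key]


def fake_tts(phrase: str) -> bytes:
    # Placeholder synthesizer: returns bytes derived from the text, not real audio.
    return b"AUDIO:" + phrase.encode("utf-8")


def speak_all(cache: PhraseAudioCache, phrases: List[str]) -> List[bytes]:
    # Look up (or synthesize) audio for each phrase in order.
    return [cache.audio_for(p) for p in phrases]
```

In this sketch, passing the same phrase twice to speak_all calls the synthesizer only once; the second lookup is served from the in-memory dictionary.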