Patents by Inventor Laurence S. Gillick

Laurence S. Gillick has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Training speech recognition word models from word samples synthesized by Monte Carlo techniques

Patent number: 7133827

Abstract: A new word model is trained from synthetic word samples derived by Monte Carlo techniques from one or more prior word models. The prior word model can be a phonetic word model and the new word model can be a non-phonetic, whole-word, word model. The prior word model can be trained from data that has undergone a first channel normalization and the synthesized word samples from which the new word model is trained can undergo a different channel normalization similar to that to be used in a given speech recognition context. The prior word model can have a first model structure and the new word model can have a second, different, model structure. These differences in model structure can include, for example, differences of model topology; differences of model complexity; and differences in the type of basis function used in a description of such probability distributions.

Type: Grant

Filed: February 6, 2003

Date of Patent: November 7, 2006

Assignee: Voice Signal Technologies, Inc.

Inventors: Laurence S. Gillick, Donald R. McAllaster, Daniel L. Roth
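As a rough illustration of the idea in the abstract for patent 7133827, the sketch below draws synthetic word samples from a prior model by Monte Carlo sampling and estimates a new model from them. It assumes, purely for illustration, that the prior word model is a short sequence of one-dimensional Gaussian states and that the new model stores one Gaussian per frame position. The names `PRIOR_STATES`, `synthesize_samples`, and `train_whole_word_model` are hypothetical and do not come from the patent, and the sketch does not attempt the changes of model structure or channel normalization the abstract mentions.

```python
import random
import statistics

# Hypothetical prior word model: one (mean, stddev) pair per state.
PRIOR_STATES = [(0.0, 1.0), (3.0, 0.5), (-2.0, 2.0)]


def synthesize_samples(states, n_samples, rng):
    """Draw n_samples synthetic word samples, one value per state."""
    return [[rng.gauss(mean, std) for mean, std in states]
            for _ in range(n_samples)]


def train_whole_word_model(samples):
    """Estimate a (mean, stddev) pair per frame position from the samples."""
    positions = list(zip(*samples))
    return [(statistics.fmean(p), statistics.stdev(p)) for p in positions]


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = synthesize_samples(PRIOR_STATES, 5000, rng)
new_model = train_whole_word_model(samples)
```

With enough synthetic samples, the parameters in `new_model` should land close to those of `PRIOR_STATES`, which is the sense in which the new model is trained from the prior one rather than from recorded speech.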
Expanding an effective vocabulary of a speech recognition system

Patent number: 7120582

Abstract: The invention provides techniques for creating and using fragmented word models to increase the effective size of an active vocabulary of a speech recognition system. The active vocabulary represents all words and word fragments that the speech recognition system is able to recognize. Each word may be represented by a combination of acoustic models. As such, the active vocabulary represents the combinations of acoustic models that the speech recognition system may compare to a user's speech to identify acoustic models that best match the user's speech. The effective size of the active vocabulary may be increased by dividing words into constituent components or fragments (for example, prefixes, suffixes, separators, infixes, and roots) and including each component as a separate entry in the active vocabulary.

Type: Grant

Filed: September 7, 1999

Date of Patent: October 10, 2006

Assignee: Dragon Systems, Inc.

Inventors: Jonathan H. Young, Haakon L. Chevalier, Laurence S. Gillick, Toffee A. Albina, Marlboro B. Moore, III, Paul E. Rensing, Jonathan P. Yamron
Methods and systems of routing utterances based on confidence estimates

Patent number: 7003456

Abstract: A computer-based method of routing a message to a system includes receiving a message, and processing the message using large-vocabulary continuous speech recognition to generate a string of text corresponding to the message. The method includes generating a confidence estimate of the string of text corresponding to the message and comparing the confidence estimate to a predetermined threshold. If the confidence estimate satisfies the predetermined threshold, the string of text is forwarded to the system. If the confidence estimate does not satisfy the predetermined threshold, the information relating to the message is forwarded to a transcriptionist. The message may include one or more utterances. Each utterance in the message may be separately or jointly processed. In this way, a confidence estimate may be generated and evaluated for each utterance or for the whole message. Information relating to each utterance may be separately or jointly forwarded based on the results of the generation and evaluation.

Type: Grant

Filed: June 12, 2001

Date of Patent: February 21, 2006

Inventors: Laurence S. Gillick, Robert Roth, Linda Manganaro, Barbara R. Peskin, David C. Petty, Ashwin Rao
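The routing rule described in the abstract for patent 7003456 can be sketched in a few lines. This is a minimal sketch under stated assumptions: the `Recognition` class, the `route` and `route_message` functions, the default threshold of 0.8, and the reading of "satisfies the threshold" as "greater than or equal to" are all hypothetical choices made for illustration. When utterances are processed jointly, the sketch uses the lowest utterance confidence as the message confidence, which is also an assumption rather than something the abstract specifies.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    text: str
    confidence: float  # assumed to lie in [0, 1]


def route(recognition, threshold=0.8):
    """Send confident text to the system, everything else to a person."""
    if recognition.confidence >= threshold:
        return ("system", recognition.text)
    return ("transcriptionist", recognition.text)


def route_message(utterances, threshold=0.8, jointly=False):
    """Route each utterance separately, or the whole message at once."""
    if jointly:
        # Assumption: the weakest utterance sets the message confidence.
        combined = Recognition(
            text=" ".join(u.text for u in utterances),
            confidence=min(u.confidence for u in utterances),
        )
        return [route(combined, threshold)]
    return [route(u, threshold) for u in utterances]
```

Calling `route_message` with `jointly=False` gives one routing decision per utterance, while `jointly=True` gives a single decision for the whole message.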
Signal-to-noise mediated speech recognition algorithm

Publication number: 20040260547

Abstract: A method of processing speech in a noisy environment includes determining, upon a wake-up command, when the environment is too noisy to yield reliable recognition of a user's spoken words, and alerting the user that the environment is too noisy. Determining when the environment is too noisy includes calculating a ratio of signal to noise. The signal corresponds to an amount of energy in the spoken utterance, and the noise corresponds to an amount of energy in the background noise. The method further includes comparing the signal-to-noise ratio to a threshold.

Type: Application

Filed: May 10, 2004

Publication date: December 23, 2004

Applicant: Voice Signal Technologies

Inventors: Jordan Cohen, Daniel L. Roth, Laurence S. Gillick
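The noise check described in the abstract for publication 20040260547 amounts to an energy ratio compared against a threshold. The sketch below is one possible reading, not the method claimed in the application: it assumes energy is the mean squared amplitude of the samples, expresses the ratio in decibels, and uses a hypothetical 10 dB threshold. The function names `energy`, `snr_db`, and `too_noisy` are made up for this example.

```python
import math


def energy(samples):
    """Mean squared amplitude of a sequence of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(utterance, background):
    """Signal-to-noise ratio in decibels (an assumed unit)."""
    signal = energy(utterance)
    noise = energy(background)
    if noise == 0.0:
        return math.inf   # silent background: nothing to compete with
    if signal == 0.0:
        return -math.inf  # silent utterance: no usable signal
    return 10.0 * math.log10(signal / noise)


def too_noisy(utterance, background, threshold_db=10.0):
    """True when the ratio falls below the (hypothetical) threshold."""
    return snr_db(utterance, background) < threshold_db
```

A caller would run `too_noisy` after the wake-up command and alert the user when it returns `True`.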
Multilingual speech recognition

Publication number: 20040210438

Abstract: A method for speech recognition. The method uses a single pronunciation estimator to train acoustic phoneme models and recognize utterances from multiple languages. The method includes accepting text spellings of training words in a plurality of sets of training words, each set corresponding to a different one of a plurality of languages. The method also includes, for each of the sets of training words in the plurality, receiving pronunciations for the training words in the set, the pronunciations being characteristic of native speakers of the language of the set, the pronunciations also being in terms of subword units at least some of which are common to two or more of the languages. The method also includes training a single pronunciation estimator using data comprising the text spellings and the pronunciations of the training words.

Type: Application

Filed: November 17, 2003

Publication date: October 21, 2004

Inventors: Laurence S. Gillick, Thomas E. Lynch, Michael J. Newman, Daniel L. Roth, Steven A. Wegmann, Jonathan P. Yamron
Apparatus, methods, and programming for speech synthesis via bit manipulations of compressed database

Publication number: 20040073428

Abstract: Text-to-speech synthesis modifies the pitch of the sounds it concatenates to generate speech, when such sounds are in compressed, coded form, so as to make them sound better together. The pitch, duration, and energy of such concatenated sounds can be altered to better match, respectively, pitch, duration, and/or energy contours generated from phonetic spelling of the speech to be synthesized, which can, in turn, be derived from the text to be synthesized. The synthesized speech can be generated from the encoded sound of sub-word snippets as well as of one or more whole words. The duration of concatenated sounds can be changed by inserting or deleting sound frames associated with individual snippets. Such text-to-speech can be used to say words recognized by speech recognition, such as to provide feedback on the recognition. Such text-to-speech synthesis can be used in portable devices such as cellphones, PDAs, and/or wrist phones.

Type: Application

Filed: October 10, 2002

Publication date: April 15, 2004

Inventors: Igor Zlokarnik, Laurence S. Gillick, Jordan R. Cohen
Using utterance-level confidence estimates

Publication number: 20020133341

Abstract: A computer-based method of routing a message to a system includes receiving a message, and processing the message using large-vocabulary continuous speech recognition to generate a string of text corresponding to the message. The method includes generating a confidence estimate of the string of text corresponding to the message and comparing the confidence estimate to a predetermined threshold. If the confidence estimate satisfies the predetermined threshold, the string of text is forwarded to the system. If the confidence estimate does not satisfy the predetermined threshold, the information relating to the message is forwarded to a transcriptionist. The message may include one or more utterances. Each utterance in the message may be separately or jointly processed. In this way, a confidence estimate may be generated and evaluated for each utterance or for the whole message. Information relating to each utterance may be separately or jointly forwarded based on the results of the generation and evaluation.

Type: Application

Filed: June 12, 2001

Publication date: September 19, 2002

Inventors: Laurence S. Gillick, Robert Roth, Linda Manganaro, Barbara R. Peskin, David C. Petty, Ashwin Rao
Speech recognition using nonparametric speech models

Patent number: 6224636

Abstract: The content of a speech sample is recognized using a computer system by evaluating the speech sample against a nonparametric set of training observations, for example, utterances from one or more human speakers. The content of the speech sample is recognized based on the evaluation results. The speech recognition process also may rely on a comparison between the speech sample and a parametric model of the training observations.

Type: Grant

Filed: February 28, 1997

Date of Patent: May 1, 2001

Assignee: Dragon Systems, Inc.

Inventors: Steven A. Wegmann, Laurence S. Gillick
Speech recognition language models

Patent number: 6167377

Abstract: Language model results are combined according to a combination expression to produce combined language model results for a set of candidates. A candidate is selected and the combination expression is adjusted using language model results associated with the selected candidate.

Type: Grant

Filed: March 28, 1997

Date of Patent: December 26, 2000

Assignee: Dragon Systems, Inc.

Inventors: Laurence S. Gillick, Joel M. Gould, Robert Roth, Paul A. van Mulbregt, Michael D. Bibeault
Rapid adaptation of speech models

Patent number: 6151575

Abstract: A source-adapted model for use in speech recognition is generated by defining a linear relationship between a first element of an initial model and a first element of the source-adapted model. Thereafter, speech data that corresponds to the first element of the initial model is assembled from a set of speech data for a particular source associated with the source-adapted model. A linear transform that maps between the assembled speech data and the first element of the initial model is then determined. Finally, a first element of the source-adapted model is produced from the first element of the initial model using the linear transform.

Type: Grant

Filed: October 28, 1997

Date of Patent: November 21, 2000

Assignee: Dragon Systems, Inc.

Inventors: Michael Jack Newman, Laurence S. Gillick, Venkatesh Nagesha
Text segmentation and identification of topic using language models

Patent number: 6052657

Abstract: System for segmenting text and identifying segment topics that match a user-specified topic. Topic tracking system creates a set of topic models from training text containing topic boundaries using a clustering algorithm. User supplies topic text. System creates a topic model of the topic text and adds the topic model to the set of topic models. User-supplied test text is segmented according to the set of topic models. Segments relating to the same topic as the topic text are selected.

Type: Grant

Filed: November 25, 1997

Date of Patent: April 18, 2000

Assignee: Dragon Systems, Inc.

Inventors: Jonathan P. Yamron, Paul G. Bamberg, James Barnett, Laurence S. Gillick, Paul A. van Mulbregt
Sequential, nonparametric speech recognition and speaker identification

Patent number: 6029124

Abstract: A speech sample is evaluated using a computer. Training data that include samples of speech are received and stored along with identification of speech elements to which portions of the training data are related. A speech sample is received and speech recognition is performed on the speech sample to produce recognition results. Finally, the recognition results are evaluated in view of the training data and the identification of the speech elements to which the portions of the training data are related. The technique may be used to perform tasks such as speech recognition, speaker identification, and language identification.

Type: Grant

Filed: March 31, 1998

Date of Patent: February 22, 2000

Assignee: Dragon Systems, Inc.

Inventors: Laurence S. Gillick, Andres Corrada-Emmanuel, Michael J. Newman, Barbara R. Peskin
Speaker identification using unsupervised speech models

Patent number: 5946654

Abstract: A speech model is produced for use in determining whether a speaker associated with the speech model produced an unidentified speech sample. First a sample of speech of a particular speaker is obtained. Next, the contents of the sample of speech are identified using speech recognition. Finally, a speech model associated with the particular speaker is produced using the sample of speech and the identified contents thereof. The speech model is produced without using an external mechanism to monitor the accuracy with which the contents were identified.

Type: Grant

Filed: February 21, 1997

Date of Patent: August 31, 1999

Assignee: Dragon Systems, Inc.

Inventors: Michael Jack Newman, Laurence S. Gillick, Yoshiko Ito
Lexical tree pre-filtering in speech recognition

Patent number: 5822730

Abstract: A speech recognition technique uses lexical tree pre-filtering to obtain lists of words for use in performing speech recognition. The lexical tree pre-filtering includes representing a vocabulary of words using a lexical tree and identifying a first subset of the vocabulary that may correspond to speech spoken beginning at a first time by propagating through the lexical tree information about the speech spoken beginning at the first time. A second subset of the vocabulary that may correspond to speech spoken beginning at a second time is identified by propagating through the lexical tree information about the speech spoken beginning at the second time. Words included in the speech are recognized by comparing speech spoken beginning at the first time with words from the first subset of the vocabulary and speech spoken beginning at the second time with words from the second subset of the vocabulary. The state of the lexical tree is not reset between identifying the first and second subsets.

Type: Grant

Filed: August 22, 1996

Date of Patent: October 13, 1998

Assignee: Dragon Systems, Inc.

Inventors: Robert Roth, James K. Baker, Laurence S. Gillick, Alan Walsh
Apparatuses and methods for developing and using models for speech recognition

Patent number: 5715367

Abstract: A computerized system time aligns frames of spoken training data against models of the speech sounds; automatically selects different sets of phonetic context classifications which divide the speech sound models into speech sound groups aligned against acoustically similar frames; creates model components from the frames aligned against speech sound groups with related classifications; and uses these model components to build a separate model for each related speech sound group. A decision tree classifies speech sounds into such groups, and related speech sound groups descend from common tree nodes. New speech samples time aligned against a given speech sound group's model update models of related speech sound groups, decreasing the training data required to adapt the system. The phonetic context classifications can be based on knowledge of which contextual features are associated with acoustic similarity.

Type: Grant

Filed: January 23, 1995

Date of Patent: February 3, 1998

Assignee: Dragon Systems, Inc.

Inventors: Laurence S. Gillick, Francesco Scattone
Systems and methods for word recognition

Patent number: 5680511

Abstract: In one aspect, the invention provides word recognition systems that operate to recognize an unrecognized or ambiguous word that occurs within a passage of words. The system can offer several words as choice words for inserting into the passage to replace the unrecognized word. The system can select the best choice word by using the choice word to extract from a reference source, sample passages of text that relate to the choice word. For example, the system can select the dictionary passage that defines the choice word. The system then compares the selected passage to the current passage, and generates a score that indicates the likelihood that the choice word would occur within that passage of text. The system can select the choice word with the best score to substitute into the passage. The passage of words being analyzed can be any word sequence including an utterance, a portion of handwritten text, a portion of typewritten text or other such sequence of words, numbers and characters.

Type: Grant

Filed: June 7, 1995

Date of Patent: October 21, 1997

Assignee: Dragon Systems, Inc.

Inventors: Janet M. Baker, Laurence S. Gillick, James K. Baker, Jonathan P. Yamron
System for processing a succession of utterances spoken in continuous or discrete form

Patent number: 5526463

Abstract: The system of the invention relates to continuous speech pre-filtering systems for use in discrete and continuous speech recognition computer systems. The speech to be recognized is converted from utterances to frame data sets, which frame data sets are smoothed to generate a smooth frame model over a predetermined number of frames. A resident vocabulary is stored within the computer as clusters of word models which are acoustically similar over a succession of frame periods. A cluster score is generated by the system, which score includes the likelihood of the smooth frames evaluated using a probability model for the cluster against which the smooth frame model is being compared. Cluster sets having cluster scores below a predetermined acoustic threshold are removed from further consideration. The remaining cluster sets are unpacked for determination of a word score for each unpacked word.

Type: Grant

Filed: April 9, 1993

Date of Patent: June 11, 1996

Assignee: Dragon Systems, Inc.

Inventors: Laurence S. Gillick, Robert S. Roth
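The prefiltering steps in the abstract for patent 5526463 (smooth the frames, score each cluster, drop low-scoring clusters, unpack the rest) can be sketched as follows. This is a simplified sketch under several assumptions: frames are single numbers rather than feature vectors, smoothing is a plain average over non-overlapping windows, and each cluster's probability model is a single Gaussian. The names `smooth`, `log_likelihood`, and `prefilter`, and the dictionary layout of a cluster, are hypothetical.

```python
import math


def smooth(frames, window):
    """Average consecutive, non-overlapping groups of `window` frames."""
    return [sum(frames[i:i + window]) / window
            for i in range(0, len(frames) - window + 1, window)]


def log_likelihood(smooth_frames, mean, std):
    """Gaussian log-likelihood of the smoothed frames for one cluster."""
    norm = math.log(std * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((x - mean) / std) ** 2 - norm
               for x in smooth_frames)


def prefilter(frames, clusters, window, threshold):
    """Return the words of every cluster scoring at or above threshold."""
    smoothed = smooth(frames, window)
    survivors = []
    for cluster in clusters:
        score = log_likelihood(smoothed, cluster["mean"], cluster["std"])
        if score >= threshold:
            survivors.extend(cluster["words"])  # unpack the cluster
    return survivors
```

The words returned by `prefilter` are the candidates that would go on to the more expensive per-word scoring stage.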
Large-vocabulary continuous speech prefiltering and processing system

Patent number: 5202952

Abstract: A continuous speech prefiltering system for use in continuous speech recognition computer systems. The speech to be recognized is converted from utterances to frame data sets, which frame data sets are smoothed to generate a smooth frame model over a predetermined number of frames. A resident vocabulary is stored within the computer as clusters of word models which are acoustically similar over a succession of frame periods. A cluster score is generated by the system, which score includes the likelihood of the smooth frames evaluated using a probability model for the cluster against which the smooth frame model is being compared. Cluster sets having cluster scores below a predetermined acoustic threshold are removed from further consideration. The remaining cluster sets are unpacked for determination of a word score for each unpacked word.

Type: Grant

Filed: June 22, 1990

Date of Patent: April 13, 1993

Assignee: Dragon Systems, Inc.

Inventors: Laurence S. Gillick, Robert S. Roth
