Specialized Equations Or Comparisons Patents (Class 704/236)
-
Patent number: 7035867
Abstract: A system for identifying files can use fingerprints to compare various files and determine redundant files. Frequency representations of portions of files, such as Fast Fourier Transforms, can be used as the fingerprints.
Type: Grant
Filed: November 28, 2001
Date of Patent: April 25, 2006
Assignee: Aerocast.com, Inc.
Inventors: Mark R. Thompson, Nathan F. Raciborski
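The fingerprint comparison described above can be sketched as follows. The naive DFT, the number of bins, and the redundancy tolerance are all illustrative assumptions, not details from the patent:

```python
import math

def dft_magnitudes(samples, n_bins=8):
    """Naive DFT magnitude spectrum over the first n_bins frequencies."""
    n = len(samples)
    mags = []
    for k in range(n_bins):
        re = sum(s * math.cos(2 * math.pi * k * t / n) for t, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * t / n) for t, s in enumerate(samples))
        mags.append(math.hypot(re, im))
    return mags

def fingerprint(samples, n_bins=8):
    """Normalized coarse spectrum of a file portion, used as its fingerprint."""
    mags = dft_magnitudes(samples, n_bins)
    total = sum(mags) or 1.0
    return [m / total for m in mags]

def likely_redundant(fp_a, fp_b, tol=1e-3):
    """Flag two files as redundant when their fingerprints are near-identical."""
    return all(abs(a - b) < tol for a, b in zip(fp_a, fp_b))
```

Identical content produces identical fingerprints, so byte-for-byte copies are always flagged; distinct signals separate because their spectral energy lands in different bins.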
-
Patent number: 7035798
Abstract: A trained vector generation section 16 generates beforehand a trained vector V of unvoiced sounds. An LPC Cepstrum analysis section 18 generates a feature vector A of a voice within the non-voice period, an inner product operation section 19 calculates an inner product value V^T A between the feature vector A and the trained vector V, and a threshold generation section 20 generates a threshold ?v on the basis of the inner product value V^T A. Also, the LPC Cepstrum analysis section 18 generates a prediction residual power ? of the signal within the non-voice period, and the threshold generation section 22 generates a threshold THD on the basis of the prediction residual power ?.
Type: Grant
Filed: September 12, 2001
Date of Patent: April 25, 2006
Assignee: Pioneer Corporation
Inventor: Hajime Kobayashi
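The inner-product step above is a projection of a frame's feature vector onto the trained unvoiced-sound vector, with a threshold derived from the observed value. The sketch below uses plain Python lists and an invented margin rule; the patent's actual threshold generation is not specified in the abstract:

```python
def inner_product(v, a):
    """V^T A: projection of feature vector a onto trained vector v."""
    return sum(vi * ai for vi, ai in zip(v, a))

def make_threshold(v, a, margin=0.8):
    """Derive a decision threshold from an observed inner-product value.

    The margin factor is an illustrative assumption, not the patent's rule."""
    return margin * inner_product(v, a)

def looks_unvoiced(v, a_new, threshold):
    """Flag a new frame as unvoiced-like when its projection clears the threshold."""
    return inner_product(v, a_new) >= threshold
```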
-
Patent number: 7031915
Abstract: A speech recognition method, system and program product, the method in one embodiment comprising: obtaining input speech data; initiating a first speech recognition search process with at least one hypothesis; initiating a second speech recognition search process with a plurality of hypotheses; obtaining partial results from the second speech recognition search process, where the partial results include an evaluation of at least one hypothesis that the first speech recognition search process has not evaluated at this point in time; and utilizing the partial results to alter the first speech recognition search process.
Type: Grant
Filed: January 23, 2003
Date of Patent: April 18, 2006
Assignee: Aurilab LLC
Inventor: James K. Baker
-
Patent number: 7031921
Abstract: A method is provided for monitoring audio content available over a network. According to the method, the network is searched for audio files, and audio identifying information is generated for each audio file that is found. It is determined whether the audio identifying information generated for each audio file matches audio identifying information in an audio content database. In one preferred embodiment, each audio file that is found is analyzed so as to generate the audio file information, which is an audio feature signature that is based on the content of the audio file. Also provided is a system for monitoring audio content available over a network.
Type: Grant
Filed: June 29, 2001
Date of Patent: April 18, 2006
Assignee: International Business Machines Corporation
Inventors: Michael C. Pitman, Blake G. Fitch, Steven Abrams, Robert S. Germain
-
Patent number: 7027987
Abstract: A system provides search results from a voice search query. The system receives a voice search query from a user, derives one or more recognition hypotheses, each being associated with a weight, from the voice search query, and constructs a weighted Boolean query using the recognition hypotheses. The system then provides the weighted Boolean query to a search system and provides the results of the search system to a user.
Type: Grant
Filed: February 7, 2001
Date of Patent: April 11, 2006
Assignee: Google Inc.
Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
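Constructing a weighted Boolean query from ranked recognition hypotheses can be sketched as below. The textual syntax (AND-joined terms with a `^weight` suffix, OR-joined clauses) is an illustrative convention, not the format the patent uses:

```python
def build_weighted_query(hypotheses):
    """Combine recognition hypotheses into a weighted Boolean OR query.

    `hypotheses` is a list of (phrase, weight) pairs; higher-weight
    hypotheses are emitted first."""
    clauses = []
    for phrase, weight in sorted(hypotheses, key=lambda hw: -hw[1]):
        terms = " AND ".join(phrase.split())       # all words of one hypothesis must match
        clauses.append(f"({terms})^{weight:.2f}")  # weight attached to the whole clause
    return " OR ".join(clauses)                    # any hypothesis may satisfy the query
```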
-
Patent number: 7016835
Abstract: A characteristic-specific digitization method and apparatus are disclosed that reduce the error rate in converting input information into a computer-readable format. The input information is analyzed and subsets of the input information are classified according to whether the input information exhibits a specific physical parameter affecting recognition accuracy. If the input information exhibits the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. If the input information does not exhibit the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a general recognizer that performs well for typical input information.
Type: Grant
Filed: December 19, 2002
Date of Patent: March 21, 2006
Assignee: International Business Machines Corporation
Inventors: Ellen Marie Eide, Ramesh Ambat Gopinath, Dimitri Kanevsky, Peder Andreas Olsen
-
Patent number: 7010486
Abstract: The invention relates to a speech recognition system and a method of calculating iteration values for free parameters λα^ortho(n) of a maximum-entropy speech model (MESM) with the aid of the generalized iterative scaling training algorithm in a computer-supported speech recognition system, in accordance with the formula λα^ortho(n+1) = G(λα^ortho(n), mα^ortho, …), where n is an iteration parameter, G a mathematical function, α an attribute in the MESM, and mα^ortho a desired orthogonalized boundary value in the MESM for the attribute α. It is an object of the invention to further develop the system and method so that they make a fast computation of the free parameters λ possible without a change of the original training object. According to the invention this object is achieved in that the desired orthogonalized boundary value mα^ortho is calculated by a linear combination of the desired boundary value mα with desired boundary values mβ from attributes β that have a larger range than the attribute α.
Type: Grant
Filed: February 13, 2002
Date of Patent: March 7, 2006
Assignee: Koninklijke Philips Electronics, N.V.
Inventor: Jochen Peters
-
Patent number: 7010484
Abstract: A method of phrase verification verifies a phrase not only according to its confidence measures but also according to neighboring concepts and their confidence tags. First, an utterance is received, and the received utterance is parsed to find a concept sequence. Subsequently, a plurality of tag sequences corresponding to the concept sequence is produced. Then, a first score of each of the tag sequences is calculated. Finally, the tag sequence with the highest first score is selected as the most probable tag sequence, and the tags contained therein are selected as the most probable confidence tags, respectively corresponding to the concepts in the concept sequence.
Type: Grant
Filed: December 12, 2001
Date of Patent: March 7, 2006
Assignee: Industrial Technology Research Institute
Inventor: Yi-Chung Lin
-
Patent number: 7003458
Abstract: An automated voice pattern filtering method implemented in a system having a client side and a server side is disclosed. At the client side, a speech signal is transformed into a first set of spectral parameters which are encoded into a set of spectral shapes that are compared to a second set of spectral parameters corresponding to one or more keywords. From the comparison, the client side determines if the speech signal is acceptable. If so, spectral information indicating a difference in a voice pattern between the speech signal and the keyword(s) is encoded and utilized as a basis to generate a voice pattern filter.
Type: Grant
Filed: January 15, 2002
Date of Patent: February 21, 2006
Assignee: General Motors Corporation
Inventors: Kai-Ten Feng, Jane F. MacFarlane, Stephen C. Habermas
-
Patent number: 6999925
Abstract: The present invention provides a computerized method and apparatus for automatically generating from a first speech recognizer a second speech recognizer which can be adapted to a specific domain. The first speech recognizer can include a first acoustic model with a first decision network and corresponding first phonetic contexts. The first acoustic model can be used as a starting point for the adaptation process. A second acoustic model with a second decision network and corresponding second phonetic contexts for the second speech recognizer can be generated by re-estimating the first decision network and the corresponding first phonetic contexts based on domain-specific training data.
Type: Grant
Filed: November 13, 2001
Date of Patent: February 14, 2006
Assignee: International Business Machines Corporation
Inventors: Volker Fischer, Siegfried Kunzmann, Eric-W. Janke, A. Jon Tyrrell
-
Patent number: 6993483
Abstract: A speech recognizer suitable for distributed speech recognition is robust to missing speech feature vectors. Speech is transmitted via a packet-switched network in the form of basic feature vectors. Missing feature vectors are detected, and replacement feature vectors are estimated by interpolation of received data prior to speech recognition. Features may be converted and interpolation may be accomplished in a spectral domain.
Type: Grant
Filed: November 2, 2000
Date of Patent: January 31, 2006
Assignee: British Telecommunications public limited company
Inventor: Benjamin P Milner
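Estimating replacement feature vectors by interpolating between the nearest received frames can be sketched as below. This is a minimal per-dimension linear interpolation; the patent also covers conversion to a spectral domain before interpolating, which is not shown:

```python
def fill_missing(frames):
    """Replace None entries with linear interpolation between the nearest
    received frames; boundary gaps copy the nearest received frame."""
    filled = list(frames)
    n = len(filled)
    for i in range(n):
        if frames[i] is not None:
            continue
        # nearest received neighbours on each side of the gap
        left = next((j for j in range(i - 1, -1, -1) if frames[j] is not None), None)
        right = next((j for j in range(i + 1, n) if frames[j] is not None), None)
        if left is None and right is None:
            continue  # nothing received at all; leave the gap
        if left is None or right is None:
            src = left if left is not None else right
            filled[i] = list(frames[src])  # extrapolate by repetition at the edges
            continue
        w = (i - left) / (right - left)    # position of the gap between its neighbours
        filled[i] = [(1 - w) * a + w * b for a, b in zip(frames[left], frames[right])]
    return filled
```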
-
Patent number: 6993481
Abstract: According to the invention, a method for detecting speech activity in a signal is disclosed. In one step, a plurality of features is extracted from the signal. An active speech probability density function (PDF) of the plurality of features is modeled, and an inactive speech PDF of the plurality of features is modeled. The active and inactive speech PDFs are adapted to respond to changes in the signal over time. The signal is given a probability-based classification based, at least in part, on the plurality of features. Speech in the signal is distinguished based, at least in part, upon the probability-based classification.
Type: Grant
Filed: December 4, 2001
Date of Patent: January 31, 2006
Assignee: Global IP Sound AB
Inventors: Jan K. Skoglund, Jan T. Linden
-
Patent number: 6985860
Abstract: To achieve an improvement in recognition performance, a non-speech acoustic model correction unit adapts a non-speech acoustic model representing a non-speech state using input data observed during an interval immediately before a speech recognition interval during which speech recognition is performed, by means of one of the maximum likelihood method, the complex statistic method, and the minimum distance-maximum separation theorem.
Type: Grant
Filed: August 30, 2001
Date of Patent: January 10, 2006
Assignee: Sony Corporation
Inventor: Hironaga Nakatsuka
-
Patent number: 6973427
Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, two possible phonetic descriptions are generated. One phonetic description is formed from the text of the word. The other phonetic description is formed by decoding a speech signal representing the user's pronunciation of the word. Both phonetic descriptions are scored based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.
Type: Grant
Filed: December 26, 2000
Date of Patent: December 6, 2005
Assignee: Microsoft Corporation
Inventors: Mei-Yuh Hwang, Fileno A. Alleva, Rebecca C. Weiss
-
Patent number: 6970818
Abstract: The present invention comprises a methodology for implementing a vocabulary set for use in a speech recognition system, and may preferably include a recognizer for analyzing utterances from the vocabulary set to generate N-best lists of recognition candidates. The N-best lists may then be utilized to create an acoustical matrix configured to relate said utterances to top recognition candidates from said N-best lists, as well as a lexical matrix configured to relate the utterances to the top recognition candidates from the N-best lists only when second-highest recognition candidates from the N-best lists are correct recognition results. An utterance ranking may then preferably be created according to composite individual error/accuracy values for each of the utterances. The composite individual error/accuracy values may preferably be derived from both the acoustical matrix and the lexical matrix.
Type: Grant
Filed: March 14, 2002
Date of Patent: November 29, 2005
Assignees: Sony Corporation, Sony Electronics Inc.
Inventors: Xavier Menedez-Pidal, Lex S. Olorenshaw
-
Patent number: 6963834
Abstract: A method for performing speech recognition can include determining a recognition result for received user speech. The recognition result can include recognized text and a corresponding confidence score. The confidence score of the recognition result can be compared to a predetermined minimum threshold. If the confidence score does not exceed the predetermined minimum threshold, the user can be presented with at least one empirically determined alternate word candidate corresponding to the recognition result.
Type: Grant
Filed: May 29, 2001
Date of Patent: November 8, 2005
Assignee: International Business Machines Corporation
Inventors: Matthew W. Hartley, James R. Lewis, David E. Reich
-
Patent number: 6961701
Abstract: An extended-word selecting section calculates a score for a phoneme string formed of one or more phonemes corresponding to a user's speech, and searches a large-vocabulary dictionary for a word having one or more phonemes equal or similar to those of a phoneme string having a score equal to or higher than a predetermined value. A matching section calculates scores for the word found by the extended-word selecting section in addition to a word selected by a preliminary word-selecting section. A control section determines a word string as the result of recognition of the speech uttered by the user.
Type: Grant
Filed: March 3, 2001
Date of Patent: November 1, 2005
Assignee: Sony Corporation
Inventors: Hiroaki Ogawa, Katsuki Minamino, Yasuharu Asano, Helmut Lucke
-
Patent number: 6957183
Abstract: A method for processing digitized speech signals by analyzing redundant features to provide more robust voice recognition. A primary transformation is applied to a source speech signal to extract primary features therefrom. Each of at least one secondary transformation is applied to the source speech signal or extracted primary features to yield at least one set of secondary features statistically dependent on the primary features. At least one predetermined function is then applied to combine the primary features with the secondary features. A recognition answer is generated by pattern matching this combination against predetermined voice recognition templates.
Type: Grant
Filed: March 20, 2002
Date of Patent: October 18, 2005
Assignee: Qualcomm Inc.
Inventors: Narendranath Malayath, Harinath Garudadri
-
Patent number: 6910010
Abstract: A feature extraction and pattern recognition system in which an observation vector forming input data, which represents a certain point in the observation vector space, is mapped to a distribution having a spread in the feature vector space, and a feature distribution parameter representing the distribution is determined. Pattern recognition of the input data is performed based on the feature distribution parameter.
Type: Grant
Filed: October 28, 1998
Date of Patent: June 21, 2005
Assignee: Sony Corporation
Inventors: Naoto Iwahashi, Hongchang Bao, Hitoshi Honda
-
Patent number: 6907367
Abstract: A method for segmenting a signal into segments having similar spectral characteristics is provided. Initially the method generates a table of previous values from older signal values that contains a scoring value for the best segmentation of previous values and a segment length of the last previously identified segment. The method then receives a new sample of the signal and computes a new spectral characteristic function for the signal based on the received sample. A new scoring function is computed from the spectral characteristic function. Segments of the signal are recursively identified based on the newly computed scoring function and the table of previous values. The spectral characteristic function can be a selected one of an autocorrelation function and a discrete Fourier transform. An example is provided for segmenting a speech signal.
Type: Grant
Filed: August 31, 2001
Date of Patent: June 14, 2005
Assignee: The United States of America as represented by the Secretary of the Navy
Inventor: Paul M. Baggenstoss
-
Patent number: 6901365
Abstract: The invention enables even a CPU having low processing performance to find an HMM output probability by simplifying arithmetic operations. The dimensions of an input vector are grouped into several sets, and tables are created for the sets. When an output probability is calculated, codes corresponding to the first through n-th dimensions of the input vector are sequentially obtained, and for each code, by referring to the corresponding table, output values for each table are obtained. By substituting the output values for each table into a formula for finding an output probability, the output probability is found.
Type: Grant
Filed: September 19, 2001
Date of Patent: May 31, 2005
Assignee: Seiko Epson Corporation
Inventor: Yasunaga Miyazawa
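The table-lookup idea above replaces per-frame density evaluation with precomputed per-group scores that are simply summed. The sketch below assumes log-domain scores and an arbitrary `score_fn` standing in for the model's per-group computation; both are illustrative, not the patent's actual tables:

```python
def precompute_tables(groups, score_fn, codebook_size):
    """Build one lookup table per group of vector dimensions.

    `groups` lists the dimension indices in each set; `score_fn(g, code)`
    is whatever per-group log-probability the full model would compute
    (an illustrative stand-in for the patent's codebook scoring)."""
    return [
        [score_fn(g, code) for code in range(codebook_size)]
        for g in range(len(groups))
    ]

def output_log_prob(codes, tables):
    """Sum per-group table entries instead of evaluating HMM densities.

    `codes` holds the codebook index observed for each dimension group."""
    return sum(tables[g][code] for g, code in enumerate(codes))
```

At recognition time only integer indexing and addition remain, which is the point of the simplification for a low-power CPU.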
-
Patent number: 6882970
Abstract: A system is provided for comparing an input query with a number of stored annotations to identify information to be retrieved from a database. The comparison technique divides the input query into a number of fixed-size fragments and identifies how many times each of the fragments occurs within each annotation using a dynamic programming matching technique. The frequencies of occurrence of the fragments in both the query and the annotation are then compared to provide a measure of the similarity between the query and the annotation. The information to be retrieved is then determined from the similarity measures obtained for all the annotations.
Type: Grant
Filed: October 25, 2000
Date of Patent: April 19, 2005
Assignee: Canon Kabushiki Kaisha
Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
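The fragment-frequency comparison can be sketched as below. Note the simplifications: exact substring counting stands in for the patent's dynamic-programming matching, and the overlap score is an illustrative choice:

```python
from collections import Counter

def fragments(seq, size=3):
    """Counts of fixed-size overlapping fragments of a sequence."""
    return Counter(seq[i:i + size] for i in range(len(seq) - size + 1))

def similarity(query, annotation, size=3):
    """Fraction of the query's fragments that also occur in the annotation.

    A stand-in score; the patent compares frequencies found by dynamic
    programming, which also tolerates recognition errors."""
    fq, fa = fragments(query, size), fragments(annotation, size)
    overlap = sum(min(fq[f], fa[f]) for f in fq)
    return overlap / max(sum(fq.values()), 1)
```

Retrieval then reduces to scoring the query against every stored annotation and returning the best-scoring entries.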
-
Patent number: 6879955
Abstract: A signal modification technique facilitates compact voice coding by employing a continuous, rather than piece-wise continuous, time warp contour to modify an original residual signal to match an idealized contour, avoiding edge effects caused by prior art techniques. Warping is executed using a continuous warp contour lacking spatial discontinuities which does not invert or overly distend the positions of adjacent end points in adjacent frames. The linear shift implemented by the warp contour is derived via quadratic approximation or another method, to reduce the complexity of coding and allow for practical and economical implementation. In particular, the algorithm for determining the warp contour uses only a subset of possible contours contained within a sub-range of the range of possible contours. The relative correlation strengths from these contours are modeled as points on a polynomial trace and the optimum warp contour is calculated by maximizing the modeling function.
Type: Grant
Filed: June 29, 2001
Date of Patent: April 12, 2005
Assignee: Microsoft Corporation
Inventor: Ajit V. Rao
-
Patent number: 6868382
Abstract: The generic word label series used for recognition of words uttered by unspecified speakers are stored in the vocabulary label network accumulation processing. The speech of a particular speaker is entered. Based on the input speech, the registered word label series extraction processing generates the registered word label series. The registered word label series of the particular speaker can then be registered with the vocabulary label network accumulation processing.
Type: Grant
Filed: March 9, 2001
Date of Patent: March 15, 2005
Assignee: Asahi Kasei Kabushiki Kaisha
Inventor: Makoto Shozakai
-
Patent number: 6868381
Abstract: A speech recognition system having an input for receiving an input signal indicative of a spoken utterance that is indicative of at least one speech element. The system further includes a first processing unit operative for processing the input signal to derive from a speech recognition dictionary a speech model associated with a given speech element that constitutes a potential match to the at least one speech element. The system further comprises a second processing unit for generating a modified version of the speech model on the basis of the input signal. The system further provides a third processing unit for processing the input signal on the basis of the modified version of the speech model to generate a recognition result indicative of whether the modified version of the at least one speech model constitutes a match to the input signal.
Type: Grant
Filed: December 21, 1999
Date of Patent: March 15, 2005
Assignee: Nortel Networks Limited
Inventors: Stephen Douglas Peters, Daniel Boies, Benoit Dumoulin
-
Patent number: 6868380
Abstract: A speech recognition system for transforming an acoustic signal into a stream of phonetic estimates includes a frequency analyzer for generating a short-time frequency representation of the acoustic signal. A novelty processor separates background components of the representation from region-of-interest components of the representation. The output of the novelty processor includes the region-of-interest components of the representation according to the novelty parameters. An attention processor produces a gating signal as a function of the novelty output according to attention parameters. A coincidence processor produces information regarding co-occurrences between samples of the novelty output over time and frequency. The coincidence processor selectively gates the coincidence output as a function of the gating signal according to one or more coincidence parameters.
Type: Grant
Filed: March 23, 2001
Date of Patent: March 15, 2005
Assignee: Eliza Corporation
Inventor: John Kroeker
-
Patent number: 6850885
Abstract: To increase the accuracy and the flexibility of a method for recognizing speech which employs a keyword spotting process on the basis of a combination of a keyword model (KM) and a garbage model (GM), it is suggested to associate at least one variable penalty value (Ptrans, P1, . . . , P6) with a global penalty (Pglob) so as to increase the recognition of keywords (Kj).
Type: Grant
Filed: December 12, 2001
Date of Patent: February 1, 2005
Assignee: Sony International (Europe) GmbH
Inventors: Daniela Raddino, Ralf Kompe, Thomas Kemp
-
Patent number: 6836758
Abstract: A method and system for speech recognition combines different types of engines in order to recognize user-defined digits and control words, predefined digits and control words, and nametags. Speaker-independent engines are combined with speaker-dependent engines. A Hidden Markov Model (HMM) engine is combined with Dynamic Time Warping (DTW) engines.
Type: Grant
Filed: January 9, 2001
Date of Patent: December 28, 2004
Assignee: Qualcomm Incorporated
Inventors: Ning Bi, Andrew P. DeJaco, Harinath Garudadri, Chienchung Chang, William Yee-Ming Huang, Narendranath Malayath, Suhail Jalil, David Puig Oses, Yingyong Qi
-
Publication number: 20040260548
Abstract: A system and method that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of a segmental switching state space model that employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector is provided. The model parameters are modified based, at least in part, upon a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithm(s) for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussians (MOG) posterior and/or an approximate hidden Markov model (HMM) posterior using a variational technique.
Type: Application
Filed: June 20, 2003
Publication date: December 23, 2004
Inventors: Hagai Attias, Li Deng, Leo J. Lee
-
Patent number: 6823308
Abstract: A speech recognition method for use in a multimodal input system comprises receiving a multimodal input comprising digitized speech as a first modality input and data in at least one further modality input. Features in the speech and in the data in at least one further modality are identified. The identified features in the speech and in the data are used in the recognition of words by comparing the identified features with states in models for the words. The models have states for the recognition of speech and for words having features in at least one further modality associated with the words; the models also have states for the recognition of events in the further modality or each further modality.
Type: Grant
Filed: February 16, 2001
Date of Patent: November 23, 2004
Assignee: Canon Kabushiki Kaisha
Inventors: Robert Alexander Keiller, Nicolas David Fortescue
-
Patent number: 6823304
Abstract: A lead consonant buffer stores a feature parameter preceding a lead voiced sound detected by a voiced sound detector as a feature parameter of a lead consonant. A matching processing unit performs matching processing of a feature parameter of a lead consonant stored in the lead consonant buffer with a feature parameter of a registered pattern. Hence, the matching processing unit can perform matching processing reflecting information on a lead consonant even when no lead consonant can be detected due to noise.
Type: Grant
Filed: July 19, 2001
Date of Patent: November 23, 2004
Assignee: Renesas Technology Corp.
Inventor: Masahiko Ikeda
-
Publication number: 20040186715
Abstract: This invention relates to a non-intrusive speech quality assessment system. The invention provides a method and apparatus for training a quality assessment tool in which a database comprising a plurality of samples, each with an associated mean opinion score, is divided into a plurality of distortion sets of samples according to a distortion criterion; and a distortion-specific assessment handler for each distortion set is trained, such that a fit between a distortion-specific quality measure generated from a distortion-specific plurality of parameters for a sample and the mean opinion score associated with said sample is optimised.
Type: Application
Filed: January 14, 2004
Publication date: September 23, 2004
Applicant: PSYTECHNICS LIMITED
Inventors: Philip Gray, Ludovic Malfait
-
Publication number: 20040186716
Abstract: A processing unit and method are described herein that are capable of estimating a quality of a speech signal transmitted through a wireless network. The processing unit uses a logistic function to map a score output by an objective voice quality method (the PESQ algorithm) into a mean opinion score (MOS), which is an estimate of the quality of the speech signal that was transmitted through the wireless network. The logistic function has the form y = 1 + 4/(1 + exp(−1.7244·x + 5.0187)), where x is the score from the PESQ algorithm, in the range of −0.5 to 4.5, and y is the mapped MOS score, in the range of 1 to 5; if y = 5 the quality of the speech signal is considered excellent, and if y = 1 it is considered bad.
Type: Application
Filed: January 20, 2004
Publication date: September 23, 2004
Applicant: Telefonaktiebolaget LM Ericsson
Inventors: John C. Morfitt, Irina C. Cotanis
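The logistic mapping quoted in this abstract is concrete enough to implement directly (the constants 1.7244 and 5.0187 are from the abstract itself; only the function name is invented here):

```python
import math

def pesq_to_mos(x):
    """Map a PESQ score (roughly -0.5 to 4.5) to an estimated MOS (1 to 5)
    using the logistic function given in the abstract."""
    return 1 + 4 / (1 + math.exp(-1.7244 * x + 5.0187))
```

The function is monotonically increasing, so a better PESQ score never maps to a worse MOS estimate; it crosses the MOS midpoint of 3 at x = 5.0187/1.7244 ≈ 2.91.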
-
Publication number: 20040186714
Abstract: A method, program product and system for speech recognition for use with a base speech recognition process, but which does not affect scoring models in the base speech recognition process, the method comprising in one embodiment: obtaining an output hypothesis from a base speech recognition process that uses a first set of scoring models; obtaining a set of alternative hypotheses; scoring the set of alternative hypotheses based on a second set of different scoring models that is separate from and external to the base speech recognition process and does not affect the scoring models thereof; and selecting a hypothesis with a best score.
Type: Application
Filed: March 18, 2003
Publication date: September 23, 2004
Applicant: Aurilab, LLC
Inventor: James K. Baker
-
Patent number: 6792405
Abstract: A feature extraction process for use in a wireless communication system provides automatic speech recognition based on both spectral envelope and voicing information. The shape of the spectral envelope is used to determine the LSPs of the incoming bitstream, and the adaptive gain coefficients and fixed gain coefficients are used to generate the “voiced” and “unvoiced” feature parameter information.
Type: Grant
Filed: December 5, 2000
Date of Patent: September 14, 2004
Assignee: AT&T Corp.
Inventors: Richard Vandervoort Cox, Hong Kook Kim
-
Patent number: 6788767
Abstract: An apparatus and method for enabling provision of a call return service is disclosed. The apparatus utilizes a method of generating telephone numbers from voice messages. The method includes the step of using speech recognition to isolate a spoken number in a voice message, and confirming to a high degree of accuracy that the spoken number represents a telephone number. The method further includes the step of converting the spoken number into a data sequence representing the telephone number. This data sequence is then made available for immediate or later use.
Type: Grant
Filed: December 28, 2000
Date of Patent: September 7, 2004
Assignee: Gateway, Inc.
Inventor: Jay V. Lambke
-
Publication number: 20040162725
Abstract: A stochastic processor of the present invention comprises a fluctuation generator (15) configured to output an analog quantity having a fluctuation, a fluctuation difference calculation means (401) configured to output fluctuation difference data with an output of the fluctuation generator added to an analog difference between two data, a thresholding unit (47) configured to perform thresholding on an output of the fluctuation difference calculation means to thereby generate a pulse, and a pulse detection means configured to detect the pulse output from the thresholding unit.
Type: Application
Filed: February 20, 2004
Publication date: August 19, 2004
Applicant: Matsushita Electric Industrial Co., Ltd.
Inventors: Michihito Ueda, Kiyoyuki Morita
-
Publication number: 20040158467
Abstract: An automated speech recognition filter is disclosed. The automated speech recognition filter device provides a speech signal to an automated speech platform that approximates an original speech signal as spoken into a transceiver by a user. In providing the speech signal, the automated speech recognition filter determines various models representative of a cumulative signal degradation of the original speech signal from various devices along a transmission signal path and a reception signal path between the transceiver and a device housing the filter. The automated speech platform can thereby provide an audio signal corresponding to a context of the original speech signal.
Type: Application
Filed: February 6, 2004
Publication date: August 12, 2004
Inventors: Stephen C. Habermas, Ognjen Todic, Kai-Ten Feng, Jane F. MacFarlane
-
Publication number: 20040158466
Abstract: Vocal and vocal-like sounds can be characterised and/or identified by using an intelligent classifying method adapted to determine prosodic attributes of the sounds and base a classificatory scheme upon composite functions of these attributes, the composite functions defining a discrimination space. The sounds are segmented before prosodic analysis on a segment-by-segment basis. The prosodic analysis of the sounds involves pitch analysis, intensity analysis, formant analysis and timing analysis. This method can be implemented in systems including language-identification and singing-style-identification systems.
Type: Application
Filed: April 9, 2004
Publication date: August 12, 2004
Inventor: Eduardo Reck Miranda
-
Patent number: 6775652Abstract: Recognizing a stream of speech received as speech vectors over a lossy communications link includes constructing for a speech recognizer a series of speech vectors from packets received over a lossy packetized transmission link, wherein some of the packets associated with each speech vector are lost or corrupted during transmission. Each constructed speech vector is multi-dimensional and includes associated features. Potentially corrupted features within the speech vector are indicated to the speech recognizer when present. Speech recognition is attempted at the speech recognizer on the speech vectors when corrupted features are present. This recognition may be based only on certain or valid features within each speech vector. Retransmission of a missing or corrupted packet is requested when corrupted values are indicated by the indicating step and when the attempted recognition step fails.Type: GrantFiled: June 30, 1998Date of Patent: August 10, 2004Assignee: AT&T Corp.Inventors: Richard Vandervoort Cox, Stephen Michael Marcus, Mazin G. Rahim, Nambirajan Seshadri, Robert Douglas Sharp
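A minimal sketch of recognition restricted to valid features, assuming diagonal-Gaussian acoustic models (the patent does not specify the model form):

```python
import math

def masked_log_likelihood(features, valid, means, variances):
    """Score a speech vector against a diagonal-Gaussian state using only
    the features flagged as valid; dimensions marked corrupted during
    transmission are skipped (marginalised out) rather than trusted."""
    ll = 0.0
    for x, ok, mean, var in zip(features, valid, means, variances):
        if ok:
            ll += -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)
    return ll
```

If the score computed over the surviving features is too poor to support a confident hypothesis, the recognizer can then fall back to requesting retransmission of the missing packet, as the abstract describes.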
-
Patent number: 6772119Abstract: A speaker recognition technique is provided that can operate within the memory and processing constraints of existing portable computing devices. A smaller memory footprint and computational efficiency are achieved using single Gaussian models for each enrolled speaker. During enrollment, features are extracted from one or more enrollment utterances from each enrolled speaker, to generate a target speaker model based on a sample covariance matrix. During a recognition phase, features are extracted from one or more test utterances to generate a test utterance model that is also based on the sample covariance matrix. A sphericity ratio is computed that compares the test utterance model to the target speaker model, as well as a background model. The sphericity ratio indicates how similar test utterance speech is to the speech used when the user was enrolled, as represented by the target speaker model, and how dissimilar the test utterance speech is from the background model.Type: GrantFiled: December 10, 2002Date of Patent: August 3, 2004Assignee: International Business Machines CorporationInventors: Upendra V. Chaudhari, Ganesh N. Ramaswamy, Ran Zilca
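One common form of such a comparison is the arithmetic-harmonic sphericity measure. The sketch below simplifies to diagonal covariances (per-dimension variances), whereas the patent works with full sample covariance matrices:

```python
import math

def sphericity(test_variances, target_variances):
    """Arithmetic-harmonic sphericity between two diagonal covariance
    models: log of the ratio between the arithmetic and harmonic means
    of the per-dimension variance ratios. Zero when the models coincide,
    growing as they diverge."""
    d = len(test_variances)
    ratios = [t / g for t, g in zip(test_variances, target_variances)]
    arith = sum(ratios) / d
    harm = d / sum(1.0 / r for r in ratios)
    return math.log(arith / harm)
```

Because the arithmetic mean always dominates the harmonic mean, the score is non-negative, and a small value indicates the test utterance resembles the enrolled speaker.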
-
Patent number: 6772116Abstract: A method of selecting a language model for decoding received user spoken utterances in a speech recognition system can include a series of steps. The steps can include computing confidence scores for identified closed-class words and computing a running average of the confidence scores for a predetermined number of decoded closed-class words. Additionally, based upon the running average, telegraphic decoding can be selectively enabled.Type: GrantFiled: March 27, 2001Date of Patent: August 3, 2004Assignee: International Business Machines CorporationInventor: James R. Lewis
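The running-average gate can be sketched as follows. The window size, the threshold, and the choice that *low* confidence on closed-class words enables telegraphic decoding are illustrative assumptions; the patent only specifies that the decision is based on the running average:

```python
from collections import deque

def telegraphic_gate(confidence_scores, window=10, threshold=0.6):
    """After each decoded closed-class word, yield the running average of
    the last `window` confidence scores and a flag selecting telegraphic
    decoding when that average falls below the threshold."""
    recent = deque(maxlen=window)
    for score in confidence_scores:
        recent.append(score)
        avg = sum(recent) / len(recent)
        yield avg, avg < threshold
```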
-
Publication number: 20040138884Abstract: A method compresses one or more ordered arrays of integer values. The integer values can represent a vocabulary of a language model, in the form of an N-gram, of an automated speech recognition system. For each ordered array A[.] to be compressed, an inverse array I[.] is defined. One or more split inverse arrays are also defined for each ordered array. The minimum and optimum number of bits required to store the array A[.] in terms of the split arrays and split inverse arrays are determined. Then, the original array is stored in such a way that the total amount of memory used is minimized.Type: ApplicationFiled: January 13, 2003Publication date: July 15, 2004Inventors: Edward W. D. Whittaker, Bhiksha Ramakrishnan
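The core idea of trading a sorted array for its inverse can be sketched as follows; the split-array construction and the bit-level storage optimisation described in the abstract are omitted here:

```python
from bisect import bisect_left

def inverse_array(A, max_value):
    """For a sorted (non-decreasing) integer array A, define
    I[v] = number of entries of A strictly below v."""
    return [bisect_left(A, v) for v in range(max_value + 2)]

def recover(I, i):
    """A[i] is the largest v with I[v] <= i (linear scan for clarity;
    a binary search would be used in practice)."""
    v = 0
    while v + 1 < len(I) and I[v + 1] <= i:
        v += 1
    return v
```

When A is long but its values span a small range, I is shorter and its entries need fewer bits, which is the situation that arises with N-gram index arrays.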
-
Publication number: 20040138883Abstract: A method compresses one or more ordered arrays of integer values. The integer values can represent a vocabulary of a language model, in the form of an N-gram, of an automated speech recognition system. For each ordered array A[.] to be compressed, an inverse array I[.] is defined. One or more split inverse arrays are also defined for each ordered array. The minimum and optimum number of bits required to store the array A[.] in terms of the split arrays and split inverse arrays are determined. Then, the original array is stored in such a way that the total amount of memory used is minimized.Type: ApplicationFiled: January 13, 2003Publication date: July 15, 2004Inventors: Bhiksha Ramakrishnan, Edward W. D. Whittaker
-
Publication number: 20040128130Abstract: Pitch estimation and classification into voiced, unvoiced and transitional speech were performed by a spectro-temporal auto-correlation technique. A peak picking formula was then employed. A weighting function was then applied to the power spectrum. The harmonics-weighted power spectrum underwent mel-scaled band-pass filtering, and the log-energy of the filter outputs was discrete cosine transformed to produce cepstral coefficients. A within-filter cubic-root amplitude compression was applied to reduce amplitude variation without compromising the gain-invariance properties.Type: ApplicationFiled: May 19, 2003Publication date: July 1, 2004Inventors: Kenneth Rose, Liang Gu
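The final feature-extraction steps (within-filter cubic-root compression, log filter energies, DCT) can be sketched as follows; pitch estimation, harmonic weighting, and the mel filter-bank design are assumed to have run already:

```python
import math

def filter_cepstra(band_amplitudes, n_ceps=4):
    """band_amplitudes: per-filter lists of spectral amplitudes.
    Applies cubic-root amplitude compression within each filter, takes
    log energies, then a DCT-II to produce cepstral coefficients."""
    energies = [sum(a ** (1.0 / 3.0) for a in band) for band in band_amplitudes]
    n = len(energies)
    log_e = [math.log(max(e, 1e-10)) for e in energies]
    return [sum(log_e[m] * math.cos(math.pi * k * (m + 0.5) / n)
                for m in range(n))
            for k in range(n_ceps)]
```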
-
Patent number: 6754624Abstract: A method and apparatus for enhancing coding efficiency by reducing illegal or other undesirable packet generation while encoding a signal. The probability of generating illegal or other undesirable packets while encoding a signal is reduced by first analyzing a history of the frequency of codebook values selected while quantizing speech parameters. Codebook entries are then reordered so that the index/indices that create illegal or other undesirable packets contain the least frequently used entry/entries. Reordering multiple codebooks for various parameters further reduces the probability that an illegal or other undesirable packet will be created during signal encoding. The method and apparatus may be applied to reduce the probability of generating illegal null traffic channel data packets while encoding eighth rate speech.Type: GrantFiled: February 13, 2001Date of Patent: June 22, 2004Assignee: Qualcomm, Inc.Inventors: Eddie-Lun Tik Choy, Arasanipalai K. Ananthapadmanabhan, Andrew P. DeJaco
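The reordering step can be sketched as follows (a simplified illustration; the encoder and decoder must of course agree on the permuted ordering):

```python
from collections import Counter

def reorder_codebook(codebook, usage_history, illegal_indices):
    """Permute codebook entries so the least frequently selected ones sit
    at the indices that would produce illegal/undesirable packets.
    usage_history is a list of indices selected during past quantization."""
    counts = Counter(usage_history)
    # entry indices, most frequently used first (ties broken by index)
    by_freq = sorted(range(len(codebook)), key=lambda i: (-counts[i], i))
    safe = [i for i in range(len(codebook)) if i not in illegal_indices]
    new_book = [None] * len(codebook)
    # frequent entries claim safe slots; rare entries land on risky ones
    for entry, slot in zip(by_freq, safe + sorted(illegal_indices)):
        new_book[slot] = codebook[entry]
    return new_book
```

Because the risky indices are now chosen only when their rarely-used entries happen to win the quantization search, the probability of emitting an illegal packet drops without changing the codec's rate or quality.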
-
Patent number: 6754626Abstract: The invention disclosed herein concerns a method of converting speech to text using a hierarchy of contextual models. The hierarchy of contextual models can be statistically smoothed into a language model. The method can include processing text with a plurality of contextual models. Each one of the plurality of contextual models can correspond to a node in a hierarchy of the plurality of contextual models. Also included can be identifying at least one of the contextual models relating to the text and processing subsequent user spoken utterances with the identified at least one contextual model.Type: GrantFiled: March 1, 2001Date of Patent: June 22, 2004Assignee: International Business Machines CorporationInventor: Mark E. Epstein
-
Publication number: 20040111261Abstract: A speaker recognition technique is provided that can operate within the memory and processing constraints of existing portable computing devices. A smaller memory footprint and computational efficiency are achieved using single Gaussian models for each enrolled speaker. During enrollment, features are extracted from one or more enrollment utterances from each enrolled speaker, to generate a target speaker model based on a sample covariance matrix. During a recognition phase, features are extracted from one or more test utterances to generate a test utterance model that is also based on the sample covariance matrix. A sphericity ratio is computed that compares the test utterance model to the target speaker model, as well as a background model. The sphericity ratio indicates how similar test utterance speech is to the speech used when the user was enrolled, as represented by the target speaker model, and how dissimilar the test utterance speech is from the background model.Type: ApplicationFiled: December 10, 2002Publication date: June 10, 2004Applicant: International Business Machines CorporationInventors: Upendra V. Chaudhari, Ganesh N. Ramaswamy, Ran Zilca
-
Publication number: 20040102971Abstract: In a particular embodiment, the disclosure is directed to a method of recognizing input that includes receiving input data; receiving context data associated with the input data, the context data associated with an interpretation mapping; and generating symbolic data from the input data using the interpretation mapping. In another particular embodiment, the disclosure is directed to an input recognition system that includes a context module, an input capture module, and a recognition module. The context module is configured to receive context input and provide context data. The input capture module is configured to receive input data and is configured to provide digitized input data. The recognition module is coupled to the context module and is coupled to the input capture module. The recognition module is configured to receive the digitized input data and to interpret the digitized input data utilizing an interpretation mapping associated with the context data.Type: ApplicationFiled: August 11, 2003Publication date: May 27, 2004Applicant: RECARE, Inc.Inventors: Randolph B. Lipscher, Michael D. Dahlin
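A toy sketch of context-dependent interpretation follows. The mapping contents are entirely hypothetical (RECARE's domain is medical records, so the example uses a clinical abbreviation):

```python
# Hypothetical interpretation mappings keyed by context.
MAPPINGS = {
    "medication": {"asa": "aspirin"},
    "general": {"asa": "as soon as able"},
}

def recognize(digitized_input, context):
    """Interpret digitized input using the mapping selected by context,
    falling back to the input itself when no interpretation applies."""
    mapping = MAPPINGS.get(context, MAPPINGS["general"])
    return mapping.get(digitized_input, digitized_input)
```

The same captured token yields different symbolic data depending on which context the context module reported, which is the behavior the abstract describes.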
-
Patent number: 6725193Abstract: A voice recognition system for use with a communication system having an incoming line carrying an incoming signal from a first end to a second end operably attached to a speaker, and an outgoing line carrying an outgoing signal from a microphone near the speaker. A first speech recognition unit (SRU) detects selected incoming words and a second SRU detects outgoing words. A comparator/signal generator compares the outgoing word with the incoming word and outputs the outgoing word when the outgoing word does not match the incoming word. The first SRU may be delayed relative to the second SRU. The SRUs may also search only for selected words in a template, or may ignore words that are first detected by the other SRU. A signaler may also provide a signal indicating inclusion of one of the selected words in a known incoming signal, with an SRU responding to that signal by ignoring that command word in its template for a selected period of time.Type: GrantFiled: September 13, 2000Date of Patent: April 20, 2004Assignee: Telefonaktiebolaget LM EricssonInventor: Thomas J. Makovicka
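The comparator logic can be sketched as follows, assuming word detections aligned per time step and a hypothetical two-word memory of recent incoming detections:

```python
def filter_outgoing(outgoing_words, incoming_words, window=2):
    """Suppress outgoing detections that merely echo a recently detected
    incoming word (e.g. a far-end prompt leaking into the microphone).
    None marks time steps with no detection."""
    recent_incoming = []
    kept = []
    for out_word, in_word in zip(outgoing_words, incoming_words):
        if in_word is not None:
            recent_incoming.append(in_word)
            recent_incoming = recent_incoming[-window:]
        if out_word is not None and out_word not in recent_incoming:
            kept.append(out_word)
    return kept
```

Only outgoing words that do not match a recent incoming word survive, which mirrors the comparator/signal generator's behavior in the abstract.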