Specialized Equations Or Comparisons Patents (Class 704/236)
  • Patent number: 7035867
    Abstract: A system for identifying files can use fingerprints to compare various files and determine redundant files. Frequency representations of portions of files, such as Fast Fourier Transforms, can be used as the fingerprints.
    Type: Grant
    Filed: November 28, 2001
    Date of Patent: April 25, 2006
    Assignee: Aerocast.com, Inc.
    Inventors: Mark R. Thompson, Nathan F. Raciborski
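The FFT-fingerprint comparison described in the abstract can be sketched as follows. The bin count, pooling scheme, and similarity threshold below are illustrative assumptions, not the patent's parameters.

```python
import numpy as np

def fft_fingerprint(data: bytes, n_bins: int = 32) -> np.ndarray:
    """Hypothetical fingerprint: coarse FFT magnitude spectrum of the raw bytes."""
    samples = np.frombuffer(data, dtype=np.uint8).astype(np.float64)
    spectrum = np.abs(np.fft.rfft(samples))
    # Pool the spectrum into a fixed number of bins so files of different
    # lengths yield comparable fingerprints.
    bins = np.array_split(spectrum, n_bins)
    fp = np.array([b.mean() for b in bins])
    norm = np.linalg.norm(fp)
    return fp / norm if norm > 0 else fp

def likely_redundant(a: bytes, b: bytes, threshold: float = 0.99) -> bool:
    """Flag two files as probable duplicates if fingerprints are nearly parallel."""
    return float(np.dot(fft_fingerprint(a), fft_fingerprint(b))) >= threshold

print(likely_redundant(b"hello world" * 100, b"hello world" * 100))  # identical -> True
```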
  • Patent number: 7035798
    Abstract: A trained vector generation section 16 generates beforehand a trained vector V of unvoiced sounds. An LPC Cepstrum analysis section 18 generates a feature vector A of a voice within the non-voice period, an inner product operation section 19 calculates an inner product value VᵀA between the feature vector A and the trained vector V, and a threshold generation section 20 generates a threshold THv on the basis of the inner product value VᵀA. Also, the LPC Cepstrum analysis section 18 generates a prediction residual power of the signal within the non-voice period, and a threshold generation section 22 generates a threshold THD on the basis of the prediction residual power.
    Type: Grant
    Filed: September 12, 2001
    Date of Patent: April 25, 2006
    Assignee: Pioneer Corporation
    Inventor: Hajime Kobayashi
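The inner-product thresholding idea can be sketched as below: a trained vector V captures the typical direction of unvoiced-sound feature vectors, and inner products observed during a non-voice period set an adaptive threshold. The vectors and margin here are illustrative assumptions, not the patent's values.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

V = [0.6, 0.8]                     # trained vector for unvoiced sounds (unit norm, assumed)

def adaptive_threshold(noise_features, margin=1.5):
    """Derive a threshold from inner products observed in a non-voice period."""
    scores = [dot(f, V) for f in noise_features]
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    return mean + margin * var ** 0.5

# Low-energy features from a non-voice interval yield a small threshold.
noise = [[0.05, -0.02], [-0.03, 0.04], [0.01, 0.0], [-0.02, -0.01]]
th = adaptive_threshold(noise)

unvoiced = [1.2, 1.6]              # feature strongly aligned with V
print(dot(unvoiced, V) > th)       # True: exceeds the adapted threshold
```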
  • Patent number: 7031915
    Abstract: A speech recognition method, system and program product, the method in one embodiment comprising: obtaining input speech data; initiating a first speech recognition search process with at least one hypothesis; initiating a second speech recognition search process with a plurality of hypotheses; obtaining partial results from the second speech recognition search process, where the partial results include an evaluation of at least one hypothesis that the first speech recognition search process has not evaluated at this point in time; and utilizing the partial results to alter the first speech recognition search process.
    Type: Grant
    Filed: January 23, 2003
    Date of Patent: April 18, 2006
    Assignee: Aurilab LLC
    Inventor: James K. Baker
  • Patent number: 7031921
    Abstract: A method is provided for monitoring audio content available over a network. According to the method, the network is searched for audio files, and audio identifying information is generated for each audio file that is found. It is determined whether the audio identifying information generated for each audio file matches audio identifying information in an audio content database. In one preferred embodiment, each audio file that is found is analyzed so as to generate the audio file information, which is an audio feature signature that is based on the content of the audio file. Also provided is a system for monitoring audio content available over a network.
    Type: Grant
    Filed: June 29, 2001
    Date of Patent: April 18, 2006
    Assignee: International Business Machines Corporation
    Inventors: Michael C. Pitman, Blake G. Fitch, Steven Abrams, Robert S. Germain
  • Patent number: 7027987
    Abstract: A system provides search results from a voice search query. The system receives a voice search query from a user, derives one or more recognition hypotheses, each being associated with a weight, from the voice search query, and constructs a weighted boolean query using the recognition hypotheses. The system then provides the weighted boolean query to a search system and provides the results of the search system to a user.
    Type: Grant
    Filed: February 7, 2001
    Date of Patent: April 11, 2006
    Assignee: Google Inc.
    Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
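A toy version of constructing a weighted boolean query from weighted recognition hypotheses might look like this. The query syntax and `^weight` notation are illustrative, not the patent's actual format.

```python
def weighted_boolean_query(hypotheses):
    """hypotheses: list of (text, weight) pairs, e.g. ASR n-best results with
    confidence weights. Returns an OR of AND-grouped terms, each group
    carrying its hypothesis weight."""
    clauses = []
    for text, weight in sorted(hypotheses, key=lambda h: -h[1]):
        terms = " AND ".join(text.split())
        clauses.append(f"({terms})^{weight:.2f}")
    return " OR ".join(clauses)

q = weighted_boolean_query([("paris weather", 0.7), ("pairs weather", 0.2)])
print(q)  # (paris AND weather)^0.70 OR (pairs AND weather)^0.20
```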
  • Patent number: 7016835
    Abstract: A characteristic-specific digitization method and apparatus are disclosed that reduce the error rate in converting input information into a computer-readable format. The input information is analyzed and subsets of the input information are classified according to whether the input information exhibits a specific physical parameter affecting recognition accuracy. If the input information exhibits the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. If the input information does not exhibit the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a general recognizer that performs well for typical input information.
    Type: Grant
    Filed: December 19, 2002
    Date of Patent: March 21, 2006
    Assignee: International Business Machines Corporation
    Inventors: Ellen Marie Eide, Ramesh Ambat Gopinath, Dimitri Kanevsky, Peder Andreas Olsen
  • Patent number: 7010486
    Abstract: The invention relates to a speech recognition system and a method of calculating iteration values for free parameters λα^ortho(n) of a maximum-entropy speech model MESM with the aid of the generalized iterative scaling training algorithm in a computer-supported speech recognition system in accordance with the formula λα^ortho(n+1) = G(λα^ortho(n), mα^ortho, …), where n is an iteration parameter, G a mathematical function, α an attribute in the MESM and mα^ortho a desired orthogonalized boundary value in the MESM for the attribute α. It is an object of the invention to further develop the system and method so that they make a fast computation of the free parameters λ possible without a change of the original training object. According to the invention this object is achieved in that the desired orthogonalized boundary value mα^ortho is calculated by a linear combination of the desired boundary value mα with desired boundary values mβ from attributes β that have a larger range than the attribute α.
    Type: Grant
    Filed: February 13, 2002
    Date of Patent: March 7, 2006
    Assignee: Koninklijke Philips Electronics, N.V.
    Inventor: Jochen Peters
  • Patent number: 7010484
    Abstract: A method of phrase verification to verify a phrase not only according to its confidence measures but also according to neighboring concepts and their confidence tags. First, an utterance is received, and the received utterance is parsed to find a concept sequence. Subsequently, a plurality of tag sequences corresponding to the concept sequence is produced. Then, a first score of each of the tag sequences is calculated. Finally, the tag sequence of the highest first score is selected as the most probable tag sequence, and the tags contained therein are selected as the most probable confidence tags, respectively corresponding to the concepts in the concept sequence.
    Type: Grant
    Filed: December 12, 2001
    Date of Patent: March 7, 2006
    Assignee: Industrial Technology Research Institute
    Inventor: Yi-Chung Lin
  • Patent number: 7003458
    Abstract: An automated voice pattern filtering method implemented in a system having a client side and a server side is disclosed. At the client side, a speech signal is transformed into a first set of spectral parameters which are encoded into a set of spectral shapes that are compared to a second set of spectral parameters corresponding to one or more keywords. From the comparison, the client side determines if the speech signal is acceptable. If so, spectral information indicating a difference in a voice pattern between the speech signal and the keyword(s) is encoded and utilized as a basis to generate a voice pattern filter.
    Type: Grant
    Filed: January 15, 2002
    Date of Patent: February 21, 2006
    Assignee: General Motors Corporation
    Inventors: Kai-Ten Feng, Jane F. MacFarlane, Stephen C. Habermas
  • Patent number: 6999925
    Abstract: The present invention provides a computerized method and apparatus for automatically generating from a first speech recognizer a second speech recognizer which can be adapted to a specific domain. The first speech recognizer can include a first acoustic model with a first decision network and corresponding first phonetic contexts. The first acoustic model can be used as a starting point for the adaptation process. A second acoustic model with a second decision network and corresponding second phonetic contexts for the second speech recognizer can be generated by re-estimating the first decision network and the corresponding first phonetic contexts based on domain-specific training data.
    Type: Grant
    Filed: November 13, 2001
    Date of Patent: February 14, 2006
    Assignee: International Business Machines Corporation
    Inventors: Volker Fischer, Siegfried Kunzmann, Eric-W. Janke, A. Jon Tyrrell
  • Patent number: 6993483
    Abstract: A speech recognizer suitable for distributed speech recognition is robust to missing speech feature vectors. Speech is transmitted via a packet switched network in the form of basic feature vectors. Missing feature vectors are detected and replacement feature vectors are estimated by interpolation of received data prior to speech recognition. Features may be converted and interpolation may be accomplished in a spectral domain.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: January 31, 2006
    Assignee: British Telecommunications public limited company
    Inventor: Benjamin P Milner
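Replacing lost feature vectors by interpolating received neighbours can be sketched as below, using simple linear interpolation in place of the patent's spectral-domain conversion.

```python
def fill_missing(frames):
    """frames: list of feature vectors (lists of floats), or None where a
    packet was lost. Each missing frame is replaced by linear interpolation
    between its nearest received neighbours; leading/trailing gaps copy the
    nearest received frame."""
    known = [i for i, f in enumerate(frames) if f is not None]
    out = list(frames)
    for i, f in enumerate(frames):
        if f is not None:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            out[i] = list(frames[nxt])
        elif nxt is None:
            out[i] = list(frames[prev])
        else:
            t = (i - prev) / (nxt - prev)
            out[i] = [(1 - t) * p + t * q for p, q in zip(frames[prev], frames[nxt])]
    return out

seq = [[0.0, 0.0], None, [2.0, 4.0]]
print(fill_missing(seq)[1])  # [1.0, 2.0]
```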
  • Patent number: 6993481
    Abstract: According to the invention, a method for detecting speech activity in a signal is disclosed. In one step, a plurality of features is extracted from the signal. An active speech probability density function (PDF) of the plurality of features is modeled, and an inactive speech PDF of the plurality of features is modeled. The active and inactive speech PDFs are adapted to respond to changes in the signal over time. The signal is given a probability-based classification based, at least in part, on the plurality of features. Speech in the signal is distinguished based, at least in part, upon the probability-based classification.
    Type: Grant
    Filed: December 4, 2001
    Date of Patent: January 31, 2006
    Assignee: Global IP Sound AB
    Inventors: Jan K. Skoglund, Jan T. Linden
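A minimal sketch of PDF-based speech/non-speech classification on a scalar feature (e.g. frame energy). Gaussian PDFs stand in for the patent's feature models, and the initial parameters and adaptation rate are assumptions.

```python
import math

class AdaptiveVad:
    def __init__(self):
        self.active = {"mean": 10.0, "var": 4.0}    # active-speech PDF (assumed init)
        self.inactive = {"mean": 1.0, "var": 1.0}   # inactive-speech PDF (assumed init)

    @staticmethod
    def _pdf(x, m):
        """Gaussian density with the model's mean and variance."""
        return math.exp(-(x - m["mean"]) ** 2 / (2 * m["var"])) / math.sqrt(2 * math.pi * m["var"])

    def classify(self, x, rate=0.05):
        """Classify a frame by likelihood, then adapt the winning PDF so the
        models track changes in the signal over time."""
        is_speech = self._pdf(x, self.active) > self._pdf(x, self.inactive)
        model = self.active if is_speech else self.inactive
        model["mean"] += rate * (x - model["mean"])
        return is_speech

vad = AdaptiveVad()
print(vad.classify(9.5), vad.classify(0.8))  # True False
```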
  • Patent number: 6985860
    Abstract: To achieve an improvement in recognition performance, a non-speech acoustic model correction unit adapts a non-speech acoustic model representing a non-speech state using input data observed during an interval immediately before a speech recognition interval during which speech recognition is performed, by means of one of the maximum likelihood method, the complex statistic method, and the minimum distance-maximum separation theorem.
    Type: Grant
    Filed: August 30, 2001
    Date of Patent: January 10, 2006
    Assignee: Sony Corporation
    Inventor: Hironaga Nakatsuka
  • Patent number: 6973427
    Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, two possible phonetic descriptions are generated. One phonetic description is formed from the text of the word. The other phonetic description is formed by decoding a speech signal representing the user's pronunciation of the word. Both phonetic descriptions are scored based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.
    Type: Grant
    Filed: December 26, 2000
    Date of Patent: December 6, 2005
    Assignee: Microsoft Corporation
    Inventors: Mei-Yuh Hwang, Fileno A. Alleva, Rebecca C. Weiss
  • Patent number: 6970818
    Abstract: The present invention comprises a methodology for implementing a vocabulary set for use in a speech recognition system, and may preferably include a recognizer for analyzing utterances from the vocabulary set to generate N-best lists of recognition candidates. The N-best lists may then be utilized to create an acoustical matrix configured to relate said utterances to top recognition candidates from said N-best lists, as well as a lexical matrix configured to relate the utterances to the top recognition candidates from the N-best lists only when second-highest recognition candidates from the N-best lists are correct recognition results. An utterance ranking may then preferably be created according to composite individual error/accuracy values for each of the utterances. The composite individual error/accuracy values may preferably be derived from both the acoustical matrix and the lexical matrix.
    Type: Grant
    Filed: March 14, 2002
    Date of Patent: November 29, 2005
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menedez-Pidal, Lex S. Olorenshaw
  • Patent number: 6963834
    Abstract: A method for performing speech recognition can include determining a recognition result for received user speech. The recognition result can include recognized text and a corresponding confidence score. The confidence score of the recognition result can be compared with a predetermined minimum threshold. If the confidence score does not exceed the predetermined minimum threshold, the user can be presented with at least one empirically determined alternate word candidate corresponding to the recognition result.
    Type: Grant
    Filed: May 29, 2001
    Date of Patent: November 8, 2005
    Assignee: International Business Machines Corporation
    Inventors: Matthew W. Hartley, James R. Lewis, David E. Reich
  • Patent number: 6961701
    Abstract: An extended-word selecting section calculates a score for a phoneme string formed of one or more phonemes, corresponding to a user's speech, and searches a large-vocabulary dictionary for a word having one or more phonemes equal to or similar to those of a phoneme string having a score equal to or higher than a predetermined value. A matching section calculates scores for the word searched for by the extended-word selecting section in addition to a word selected by a preliminary word-selecting section. A control section determines a word string as the result of recognition of the speech uttered by the user.
    Type: Grant
    Filed: March 3, 2001
    Date of Patent: November 1, 2005
    Assignee: Sony Corporation
    Inventors: Hiroaki Ogawa, Katsuki Minamino, Yasuharu Asano, Helmut Lucke
  • Patent number: 6957183
    Abstract: A method for processing digitized speech signals by analyzing redundant features to provide more robust voice recognition. A primary transformation is applied to a source speech signal to extract primary features therefrom. Each of at least one secondary transformation is applied to the source speech signal or extracted primary features to yield at least one set of secondary features statistically dependent on the primary features. At least one predetermined function is then applied to combine the primary features with the secondary features. A recognition answer is generated by pattern matching this combination against predetermined voice recognition templates.
    Type: Grant
    Filed: March 20, 2002
    Date of Patent: October 18, 2005
    Assignee: Qualcomm Inc.
    Inventors: Narendranath Malayath, Harinath Garudadri
  • Patent number: 6910010
    Abstract: A feature extraction and pattern recognition system in which an observation vector forming input data, which represents a certain point in the observation vector space, is mapped to a distribution having a spread in the feature vector space, and a feature distribution parameter representing the distribution is determined. Pattern recognition of the input data is performed based on the feature distribution parameter.
    Type: Grant
    Filed: October 28, 1998
    Date of Patent: June 21, 2005
    Assignee: Sony Corporation
    Inventors: Naoto Iwahashi, Hongchang Bao, Hitoshi Honda
  • Patent number: 6907367
    Abstract: A method for segmenting a signal into segments having similar spectral characteristics is provided. Initially the method generates a table of previous values from older signal values that contains a scoring value for the best segmentation of previous values and a segment length of the last previously identified segment. The method then receives a new sample of the signal and computes a new spectral characteristic function for the signal based on the received sample. A new scoring function is computed from the spectral characteristic function. Segments of the signal are recursively identified based on the newly computed scoring function and the table of previous values. The spectral characteristic function can be a selected one of an autocorrelation function and a discrete Fourier transform. An example is provided for segmenting a speech signal.
    Type: Grant
    Filed: August 31, 2001
    Date of Patent: June 14, 2005
    Assignee: The United States of America as represented by the Secretary of the Navy
    Inventor: Paul M. Baggenstoss
  • Patent number: 6901365
    Abstract: The invention enables even a CPU having low processing performance to find an HMM output probability by simplifying arithmetic operations. The dimensions of an input vector are grouped into several sets, and tables are created for the sets. When an output probability is calculated, codes corresponding to the first dimension through the n-th dimension of the input vector are sequentially obtained, and for each code, by referring to the corresponding table, output values for each table are obtained. By substituting the output values for each table into a formula for finding an output probability, the output probability is found.
    Type: Grant
    Filed: September 19, 2001
    Date of Patent: May 31, 2005
    Assignee: Seiko Epson Corporation
    Inventor: Yasunaga Miyazawa
  • Patent number: 6882970
    Abstract: A system is provided for comparing an input query with a number of stored annotations to identify information to be retrieved from a database. The comparison technique divides the input query into a number of fixed-size fragments and identifies how many times each of the fragments occurs within each annotation using a dynamic programming matching technique. The frequencies of occurrence of the fragments in both the query and the annotation are then compared to provide a measure of the similarity between the query and the annotation. The information to be retrieved is then determined from the similarity measures obtained for all the annotations.
    Type: Grant
    Filed: October 25, 2000
    Date of Patent: April 19, 2005
    Assignee: Canon Kabushiki Kaisha
    Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
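The fragment-frequency comparison can be sketched with exact-match counting in place of the patent's dynamic-programming matcher; the fragment size and similarity measure are simplifying assumptions.

```python
from collections import Counter

def fragments(text, size=3):
    """Split text into overlapping fixed-size fragments and count occurrences."""
    s = text.replace(" ", "")
    return Counter(s[i:i + size] for i in range(len(s) - size + 1))

def similarity(query, annotation):
    """Compare fragment-frequency profiles: the fraction of query fragments
    that also occur in the annotation."""
    q, a = fragments(query), fragments(annotation)
    overlap = sum(min(q[f], a[f]) for f in q)
    return overlap / max(sum(q.values()), 1)

print(similarity("picture of taj mahal", "taj mahal pictures"))
print(similarity("picture of taj mahal", "eiffel tower at night"))
```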
  • Patent number: 6879955
    Abstract: A signal modification technique facilitates compact voice coding by employing a continuous, rather than piece-wise continuous, time warp contour to modify an original residual signal to match an idealized contour, avoiding edge effects caused by prior art techniques. Warping is executed using a continuous warp contour lacking spatial discontinuities which does not invert or overly distend the positions of adjacent end points in adjacent frames. The linear shift implemented by the warp contour is derived via quadratic approximation or other method, to reduce the complexity of coding to allow for practical and economical implementation. In particular, the algorithm for determining the warp contour uses only a subset of possible contours contained within a sub-range of the range of possible contours. The relative correlation strengths from these contours are modeled as points on a polynomial trace and the optimum warp contour is calculated by maximizing the modeling function.
    Type: Grant
    Filed: June 29, 2001
    Date of Patent: April 12, 2005
    Assignee: Microsoft Corporation
    Inventor: Ajit V. Rao
  • Patent number: 6868382
    Abstract: The generic word label series used for recognition of words uttered by unspecified speakers are stored in the vocabulary label network accumulation processing. The speech of a particular speaker is entered. Based on the input speech, the registered word label series extraction processing generates the registered word label series. The registered word label series of the particular speaker can then be registered with the vocabulary label network accumulation processing.
    Type: Grant
    Filed: March 9, 2001
    Date of Patent: March 15, 2005
    Assignee: Asahi Kasei Kabushiki Kaisha
    Inventor: Makoto Shozakai
  • Patent number: 6868381
    Abstract: A speech recognition system having an input for receiving an input signal indicative of a spoken utterance that is indicative of at least one speech element. The system further includes a first processing unit operative for processing the input signal to derive from a speech recognition dictionary a speech model associated with a given speech element that constitutes a potential match to the at least one speech element. The system further comprises a second processing unit for generating a modified version of the speech model on the basis of the input signal. The system further provides a third processing unit for processing the input signal on the basis of the modified version of the speech model to generate a recognition result indicative of whether the modified version of the at least one speech model constitutes a match to the input signal.
    Type: Grant
    Filed: December 21, 1999
    Date of Patent: March 15, 2005
    Assignee: Nortel Networks Limited
    Inventors: Stephen Douglas Peters, Daniel Boies, Benoit Dumoulin
  • Patent number: 6868380
    Abstract: A speech recognition system for transforming an acoustic signal into a stream of phonetic estimates includes a frequency analyzer for generating a short-time frequency representation of the acoustic signal. A novelty processor separates background components of the representation from region of interest components of the representation. The output of the novelty processor includes the region of interest components of the representation according to the novelty parameters. An attention processor produces a gating signal as a function of the novelty output according to attention parameters. A coincidence processor produces information regarding co-occurrences between samples of the novelty output over time and frequency. The coincidence processor selectively gates the coincidence output as a function of the gating signal according to one or more coincidence parameters.
    Type: Grant
    Filed: March 23, 2001
    Date of Patent: March 15, 2005
    Assignee: Eliza Corporation
    Inventor: John Kroeker
  • Patent number: 6850885
    Abstract: To increase the accuracy and the flexibility of a method for recognizing speech which employs a keyword spotting process on the basis of a combination of a keyword model (KM) and a garbage model (GM) it is suggested to associate at least one variable penalty value (Ptrans, P1, . . . , P6) with a global penalty (Pglob) so as to increase the recognition of keywords (Kj).
    Type: Grant
    Filed: December 12, 2001
    Date of Patent: February 1, 2005
    Assignee: Sony International (Europe) GmbH
    Inventors: Daniela Raddino, Ralf Kompe, Thomas Kemp
  • Patent number: 6836758
    Abstract: A method and system for speech recognition combines different types of engines in order to recognize user-defined digits and control words, predefined digits and control words, and nametags. Speaker-independent engines are combined with speaker-dependent engines. A Hidden Markov Model (HMM) engine is combined with Dynamic Time Warping (DTW) engines.
    Type: Grant
    Filed: January 9, 2001
    Date of Patent: December 28, 2004
    Assignee: Qualcomm Incorporated
    Inventors: Ning Bi, Andrew P. DeJaco, Harinath Garudadri, Chienchung Chang, William Yee-Ming Huang, Narendranath Malayath, Suhail Jalil, David Puig Oses, Yingyong Qi
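For reference, the Dynamic Time Warping engine named in the abstract computes an alignment cost between two feature sequences. This minimal version uses scalar features and absolute-difference cost as simplifying assumptions.

```python
def dtw(a, b):
    """Classic DTW distance between two scalar sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

# A time-stretched copy of a pattern aligns perfectly; a different pattern does not.
print(dtw([1, 2, 3, 2, 1], [1, 1, 2, 3, 3, 2, 1]))  # 0.0
print(dtw([1, 2, 3, 2, 1], [3, 1, 1, 3, 1]) > 0)    # True
```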
  • Publication number: 20040260548
    Abstract: A system and method that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of segmental switching state space model that employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector is provided. The model parameters are modified based, at least in part, upon a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithm(s) for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussian (MOG) posterior and/or based upon an approximate hidden Markov model (HMM) posterior using a variational technique.
    Type: Application
    Filed: June 20, 2003
    Publication date: December 23, 2004
    Inventors: Hagai Attias, Li Deng, Leo J. Lee
  • Patent number: 6823308
    Abstract: A speech recognition method for use in a multimodal input system comprises receiving a multimodal input comprising digitized speech as a first modality input and data in at least one further modality input. Features in the speech and in the data in at least one further modality are identified. The identified features in the speech and in the data are used in the recognition of words by comparing the identified features with states in models for the words. The models have states for the recognition of speech and for words having features in at least one further modality associated with the words; the models also have states for the recognition of events in each further modality.
    Type: Grant
    Filed: February 16, 2001
    Date of Patent: November 23, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventors: Robert Alexander Keiller, Nicolas David Fortescue
  • Patent number: 6823304
    Abstract: A lead consonant buffer stores a feature parameter preceding a lead voiced sound detected by a voiced sound detector as a feature parameter of a lead consonant. A matching processing unit performs matching processing of a feature parameter of a lead consonant stored in the lead consonant buffer with a feature parameter of a registered pattern. Hence, the matching processing unit can perform matching processing reflecting information on a lead consonant even when no lead consonant can be detected due to a noise.
    Type: Grant
    Filed: July 19, 2001
    Date of Patent: November 23, 2004
    Assignee: Renesas Technology Corp.
    Inventor: Masahiko Ikeda
  • Publication number: 20040186715
    Abstract: This invention relates to a non-intrusive speech quality assessment system. The invention provides a method and apparatus for training a quality assessment tool in which a database comprising a plurality of samples, each with an associated mean opinion score, is divided into a plurality of distortion sets of samples according to a distortion criterion; and a distortion specific assessment handler for each distortion set is trained, such that a fit between a distortion specific quality measure generated from a distortion specific plurality of parameters for a sample and the mean opinion score associated with said sample is optimised.
    Type: Application
    Filed: January 14, 2004
    Publication date: September 23, 2004
    Applicant: PSYTECHNICS LIMITED
    Inventors: Philip Gray, Ludovic Malfait
  • Publication number: 20040186716
    Abstract: A processing unit and method are described herein that are capable of estimating the quality of a speech signal transmitted through a wireless network. The processing unit uses a logistic function to map a score output from an objective voice quality method (the PESQ algorithm) into a mean opinion score (MOS), which is an estimate of the quality of the speech signal that was transmitted through the wireless network. The logistic function has the form: y = 1 + 4/(1 + exp(−1.7244*x + 5.0187)), where x is the score from the PESQ algorithm, which is in the range of −0.5 to 4.5, and y is the mapped MOS score, which is in the range of 1 to 5, wherein if y = 5 the quality of the speech signal is considered excellent and if y = 1 the quality is considered bad.
    Type: Application
    Filed: January 20, 2004
    Publication date: September 23, 2004
    Applicant: Telefonaktiebolaget LM Ericsson
    Inventors: John C. Morfitt, Irina C. Cotanis
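The logistic mapping quoted in the abstract can be transcribed directly; the sample PESQ scores below are illustrative.

```python
import math

def pesq_to_mos(x: float) -> float:
    """Map a PESQ score x in [-0.5, 4.5] to an estimated MOS in [1, 5],
    using the logistic function from the abstract."""
    return 1 + 4 / (1 + math.exp(-1.7244 * x + 5.0187))

for x in (-0.5, 2.0, 4.5):
    print(round(pesq_to_mos(x), 2))  # 1.01, 1.69, 4.76
```

The function is monotonic in x, saturating near 1 for the worst PESQ scores and near 5 for the best, which matches the stated MOS interpretation.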
  • Publication number: 20040186714
    Abstract: A method, program product and system for speech recognition for use with a base speech recognition process, but which does not affect scoring models in the base speech recognition process, the method comprising in one embodiment: obtaining an output hypothesis from a base speech recognition process that uses a first set of scoring models; obtaining a set of alternative hypotheses; scoring the set of alternative hypotheses based on a second set of different scoring models that is separate from and external to the base speech recognition process and does not affect the scoring models thereof; and selecting a hypothesis with a best score.
    Type: Application
    Filed: March 18, 2003
    Publication date: September 23, 2004
    Applicant: Aurilab, LLC
    Inventor: James K. Baker
  • Patent number: 6792405
    Abstract: A feature extraction process for use in a wireless communication system provides automatic speech recognition based on both spectral envelope and voicing information. The shape of the spectral envelope is used to determine the LSPs of the incoming bitstream and the adaptive gain coefficients and fixed gain coefficients are used to generate the “voiced” and “unvoiced” feature parameter information.
    Type: Grant
    Filed: December 5, 2000
    Date of Patent: September 14, 2004
    Assignee: AT&T Corp.
    Inventors: Richard Vandervoort Cox, Hong Kook Kim
  • Patent number: 6788767
    Abstract: An apparatus and method for enabling provision of a call return service are disclosed. The apparatus utilizes a method of generating telephone numbers from voice messages. The method includes the step of using speech recognition to isolate a spoken number in a voice message, and confirming to a high degree of accuracy that the spoken number represents a telephone number. The method further includes the step of converting the spoken number into a data sequence representing the telephone number. This data sequence is then made available for immediate or later use.
    Type: Grant
    Filed: December 28, 2000
    Date of Patent: September 7, 2004
    Assignee: Gateway, Inc.
    Inventor: Jay V. Lambke
  • Publication number: 20040162725
    Abstract: A stochastic processor of the present invention comprises a fluctuation generator (15) configured to output an analog quantity having a fluctuation, a fluctuation difference calculation means (401) configured to output fluctuation difference data with an output of the fluctuation generator added to analog difference between two data, a thresholding unit (47) configured to perform thresholding on an output of the fluctuation difference calculation means to thereby generate a pulse, and a pulse detection means configured to detect the pulse output from the thresholding unit.
    Type: Application
    Filed: February 20, 2004
    Publication date: August 19, 2004
    Applicant: Matsushita Electric Industrial Co., Ltd.
    Inventors: Michihito Ueda, Kiyoyuki Morita
  • Publication number: 20040158467
    Abstract: An automated speech recognition filter is disclosed. The automated speech recognition filter device provides a speech signal to an automated speech platform that approximates an original speech signal as spoken into a transceiver by a user. In providing the speech signal, the automated speech recognition filter determines various models representative of a cumulative signal degradation of the original speech signal from various devices along a transmission signal path and a reception signal path between the transceiver and a device housing the filter. The automated speech platform can thereby provide an audio signal corresponding to a context of the original speech signal.
    Type: Application
    Filed: February 6, 2004
    Publication date: August 12, 2004
    Inventors: Stephen C. Habermas, Ognjen Todic, Kai-Ten Feng, Jane F. MacFarlane
  • Publication number: 20040158466
    Abstract: Vocal and vocal-like sounds can be characterised and/or identified by using an intelligent classifying method adapted to determine prosodic attributes of the sounds and base a classificatory scheme upon composite functions of these attributes, the composite functions defining a discrimination space. The sounds are segmented before prosodic analysis on a segment by segment basis. The prosodic analysis of the sounds involves pitch analysis, intensity analysis, formant analysis and timing analysis. This method can be implemented in systems including language-identification and singing-style-identification systems.
    Type: Application
    Filed: April 9, 2004
    Publication date: August 12, 2004
    Inventor: Eduardo Reck Miranda
  • Patent number: 6775652
    Abstract: Recognizing a stream of speech received as speech vectors over a lossy communications link includes constructing for a speech recognizer a series of speech vectors from packets received over a lossy packetized transmission link, wherein some of the packets associated with each speech vector are lost or corrupted during transmission. Each constructed speech vector is multi-dimensional and includes associated features. Potentially corrupted features within the speech vector are indicated to the speech recognizer when present. Speech recognition is attempted at the speech recognizer on the speech vectors when corrupted features are present. This recognition may be based only on certain or valid features within each speech vector. Retransmission of a missing or corrupted packet is requested when corrupted values are indicated by the indicating step and when the attempted recognition step fails.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: August 10, 2004
    Assignee: AT&T Corp.
    Inventors: Richard Vandervoort Cox, Stephen Michael Marcus, Mazin G. Rahim, Nambirajan Seshadri, Robert Douglas Sharp
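    The abstract above describes reassembling speech vectors from lossy packets, flagging corrupted features, and attempting recognition on the valid features only. A minimal sketch of that idea follows; the packet layout (one feature per packet) and the distance-based matching are illustrative assumptions, not the patented recognizer.

    ```python
    import math

    def build_vector(packets, dim=13):
        """Reassemble a feature vector from received packets; `packets` maps
        feature index -> value, and missing indices count as lost in
        transmission. Returns (vector, validity mask)."""
        vector = [packets.get(i, 0.0) for i in range(dim)]
        valid = [i in packets for i in range(dim)]
        return vector, valid

    def masked_distance(vector, valid, template):
        """Euclidean distance computed only over features marked valid, so
        recognition can still be attempted when some features are corrupted."""
        terms = [(v - t) ** 2 for v, ok, t in zip(vector, valid, template) if ok]
        if not terms:
            raise ValueError("no valid features; request retransmission")
        return math.sqrt(sum(terms))
    ```

    When the masked match fails (or nothing valid remains), the caller would fall back to requesting retransmission, as in the abstract.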
  • Patent number: 6772119
    Abstract: A speaker recognition technique is provided that can operate within the memory and processing constraints of existing portable computing devices. A smaller memory footprint and computational efficiency are achieved using single Gaussian models for each enrolled speaker. During enrollment, features are extracted from one or more enrollment utterances from each enrolled speaker, to generate a target speaker model based on a sample covariance matrix. During a recognition phase, features are extracted from one or more test utterances to generate a test utterance model that is also based on the sample covariance matrix. A sphericity ratio is computed that compares the test utterance model to the target speaker model, as well as a background model. The sphericity ratio indicates how similar test utterance speech is to the speech used when the user was enrolled, as represented by the target speaker model, and how dissimilar the test utterance speech is from the background model.
    Type: Grant
    Filed: December 10, 2002
    Date of Patent: August 3, 2004
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Ganesh N. Ramaswamy, Ran Zilca
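    The single-Gaussian comparison above can be sketched with sample covariance matrices and a sphericity-style score. The specific formula below (arithmetic over geometric mean of the eigenvalues of one covariance whitened by the other) is a common sphericity measure used as an illustrative stand-in; the patented scoring may differ in detail.

    ```python
    import numpy as np

    def sample_covariance(features):
        # features: (n_frames, dim) array of cepstral features
        return np.cov(features, rowvar=False)

    def sphericity(cov_test, cov_model):
        """Arithmetic mean over geometric mean of the eigenvalues of
        cov_test @ inv(cov_model): equals 1 when the covariances match
        and grows as they diverge."""
        eig = np.real(np.linalg.eigvals(cov_test @ np.linalg.inv(cov_model)))
        d = len(eig)
        return eig.mean() / (np.prod(eig) ** (1.0 / d))

    def score(cov_test, cov_target, cov_background):
        """Accept when the test utterance is closer to the target speaker
        model than to the background model."""
        return sphericity(cov_test, cov_background) - sphericity(cov_test, cov_target)
    ```

    Because each enrolled speaker is summarized by one covariance matrix, the memory footprint stays small, which is the point of the technique.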
  • Patent number: 6772116
    Abstract: A method of selecting a language model for decoding received user spoken utterances in a speech recognition system can include a series of steps. The steps can include computing confidence scores for identified closed-class words and computing a running average of the confidence scores over a predetermined number of decoded closed-class words. Based upon the running average, telegraphic decoding can then be selectively enabled.
    Type: Grant
    Filed: March 27, 2001
    Date of Patent: August 3, 2004
    Assignee: International Business Machines Corporation
    Inventor: James R. Lewis
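    The running-average gate described above can be sketched as a small stateful class. The window size, threshold, and the below-threshold trigger condition are illustrative assumptions, not values from the patent.

    ```python
    from collections import deque

    class TelegraphicGate:
        """Track a running average of confidence scores for the last N
        decoded closed-class words (articles, prepositions, ...). A low
        average suggests the speaker is omitting such words, so telegraphic
        decoding is enabled."""
        def __init__(self, window=10, threshold=0.5):
            self.scores = deque(maxlen=window)
            self.threshold = threshold

        def observe(self, confidence):
            self.scores.append(confidence)

        def telegraphic_enabled(self):
            if not self.scores:
                return False
            return sum(self.scores) / len(self.scores) < self.threshold
    ```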
  • Publication number: 20040138884
    Abstract: A method compresses one or more ordered arrays of integer values. The integer values can represent a vocabulary of a language model, in the form of an N-gram, of an automated speech recognition system. For each ordered array A[.] to be compressed, an inverse array I[.] is defined. One or more split inverse arrays are also defined for each ordered array. The minimum and optimum number of bits required to store the array A[.] in terms of the split arrays and split inverse arrays are determined. Then, the original array is stored in such a way that the total amount of memory used is minimized.
    Type: Application
    Filed: January 13, 2003
    Publication date: July 15, 2004
    Inventors: Edward W. D. Whittaker, Bhiksha Ramakrishnan
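    The inverse-array idea above can be illustrated in miniature: an ordered array is fully recoverable from its inverse (cumulative counts), so whichever representation needs fewer fixed-width bits can be stored. This sketch stops short of the patent's split arrays and uses a simple fixed-width cost model as an assumption.

    ```python
    import math

    def fixed_width_bits(n_entries, max_value):
        """Bits to store n_entries fixed-width integers drawn from [0, max_value]."""
        return n_entries * max(1, math.ceil(math.log2(max_value + 1)))

    def inverse_array(sorted_a, max_value):
        """I[v] = number of entries of sorted_a that are <= v."""
        return [sum(1 for x in sorted_a if x <= v) for v in range(max_value + 1)]

    def recover(inv):
        """Rebuild the ordered array from its inverse (losslessness check)."""
        a, prev = [], 0
        for v, count in enumerate(inv):
            a.extend([v] * (count - prev))
            prev = count
        return a

    def cheaper_form(sorted_a, max_value):
        """Pick whichever representation needs fewer bits; the patent goes
        further, splitting the inverse arrays to shave additional bits."""
        direct = fixed_width_bits(len(sorted_a), max_value)
        inverse = fixed_width_bits(max_value + 1, len(sorted_a))
        return ("direct", direct) if direct <= inverse else ("inverse", inverse)
    ```

    For long arrays drawn from a small value range, the inverse form wins; for sparse arrays over a large range, direct storage wins.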
  • Publication number: 20040138883
    Abstract: A method compresses one or more ordered arrays of integer values. The integer values can represent a vocabulary of a language model, in the form of an N-gram, of an automated speech recognition system. For each ordered array A[.] to be compressed, an inverse array I[.] is defined. One or more split inverse arrays are also defined for each ordered array. The minimum and optimum number of bits required to store the array A[.] in terms of the split arrays and split inverse arrays are determined. Then, the original array is stored in such a way that the total amount of memory used is minimized.
    Type: Application
    Filed: January 13, 2003
    Publication date: July 15, 2004
    Inventors: Bhiksha Ramakrishnan, Edward W. D. Whittaker
  • Publication number: 20040128130
    Abstract: Pitch estimation and classification into voiced, unvoiced and transitional speech were performed by a spectro-temporal auto-correlation technique. A peak picking formula was then employed. A weighting function was then applied to the power spectrum. The harmonics weighted power spectrum underwent mel-scaled band-pass filtering, and the log-energy of the filter's output was discrete cosine transformed to produce cepstral coefficients. A within-filter cubic-root amplitude compression was applied to reduce amplitude variation without compromise of the gain invariance properties.
    Type: Application
    Filed: May 19, 2003
    Publication date: July 1, 2004
    Inventors: Kenneth Rose, Liang Gu
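    The back half of the pipeline above (mel-scaled filterbank, amplitude compression, DCT to cepstral coefficients) can be sketched as follows. The filterbank construction is the standard triangular mel design, and applying the cubic-root compression to the filter outputs is a simplified reading of the abstract's within-filter compression; both are assumptions rather than the patented formulation.

    ```python
    import numpy as np

    def mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_filterbank(n_filters, n_fft, sr):
        """Triangular filters spaced evenly on the mel scale."""
        pts = np.linspace(mel(0), mel(sr / 2), n_filters + 2)
        hz = 700.0 * (10 ** (pts / 2595.0) - 1.0)
        bins = np.floor((n_fft + 1) * hz / sr).astype(int)
        fb = np.zeros((n_filters, n_fft // 2 + 1))
        for i in range(1, n_filters + 1):
            left, center, right = bins[i - 1], bins[i], bins[i + 1]
            for b in range(left, center):
                fb[i - 1, b] = (b - left) / max(center - left, 1)
            for b in range(center, right):
                fb[i - 1, b] = (right - b) / max(right - center, 1)
        return fb

    def cepstra(power_spectrum, fb, n_ceps=13):
        """Filterbank energies -> cubic-root amplitude compression (in place
        of the usual log, per the abstract) -> DCT-II to cepstral coeffs."""
        compressed = np.cbrt(fb @ power_spectrum)
        n = len(compressed)
        k = np.arange(n_ceps)[:, None]
        m = np.arange(n)[None, :]
        dct = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
        return dct @ compressed
    ```

    The harmonics weighting from the pitch estimate would be applied to `power_spectrum` before this stage.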
  • Patent number: 6754624
    Abstract: A method and apparatus for enhancing coding efficiency by reducing illegal or other undesirable packet generation while encoding a signal. The probability of generating illegal or other undesirable packets while encoding a signal is reduced by first analyzing a history of the frequency of codebook values selected while quantizing speech parameters. Codebook entries are then reordered so that the index/indices that create illegal or other undesirable packets contain the least frequently used entry/entries. Reordering multiple codebooks for various parameters further reduces the probability that an illegal or other undesirable packet will be created during signal encoding. The method and apparatus may be applied to reduce the probability of generating illegal null traffic channel data packets while encoding eighth rate speech.
    Type: Grant
    Filed: February 13, 2001
    Date of Patent: June 22, 2004
    Assignee: Qualcomm, Inc.
    Inventors: Eddie-Lun Tik Choy, Arasanipalai K. Ananthapadmanabhan, Andrew P. DeJaco
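    The reordering step above can be sketched directly: place the least-frequently-used codebook entries on the indices that would produce illegal packets, so those indices are selected as rarely as possible. The notion of which indices are "illegal" (e.g. ones that would emit a null eighth-rate packet) is taken as an input here.

    ```python
    def reorder_codebook(codebook, usage_counts, illegal_indices):
        """Assign the least-used entries to the illegal indices; remaining
        entries fill the legal indices in their original relative order."""
        order = sorted(range(len(codebook)), key=lambda i: usage_counts.get(i, 0))
        rare = set(order[:len(illegal_indices)])
        rare_entries = [codebook[i] for i in order[:len(illegal_indices)]]
        other_entries = [codebook[i] for i in range(len(codebook)) if i not in rare]
        new = [None] * len(codebook)
        for slot, entry in zip(sorted(illegal_indices), rare_entries):
            new[slot] = entry
        legal_slots = [i for i in range(len(codebook)) if i not in illegal_indices]
        for slot, entry in zip(legal_slots, other_entries):
            new[slot] = entry
        return new
    ```

    Running this per-parameter, across multiple codebooks, compounds the reduction in illegal-packet probability, as the abstract notes.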
  • Patent number: 6754626
    Abstract: The invention disclosed herein concerns a method of converting speech to text using a hierarchy of contextual models. The hierarchy of contextual models can be statistically smoothed into a language model. The method can include processing text with a plurality of contextual models, each corresponding to a node in the hierarchy. The method can also include identifying at least one of the contextual models relating to the text and processing subsequent user spoken utterances with the identified contextual model.
    Type: Grant
    Filed: March 1, 2001
    Date of Patent: June 22, 2004
    Assignee: International Business Machines Corporation
    Inventor: Mark E. Epstein
  • Publication number: 20040111261
    Abstract: A speaker recognition technique is provided that can operate within the memory and processing constraints of existing portable computing devices. A smaller memory footprint and computational efficiency are achieved using single Gaussian models for each enrolled speaker. During enrollment, features are extracted from one or more enrollment utterances from each enrolled speaker, to generate a target speaker model based on a sample covariance matrix. During a recognition phase, features are extracted from one or more test utterances to generate a test utterance model that is also based on the sample covariance matrix. A sphericity ratio is computed that compares the test utterance model to the target speaker model, as well as a background model. The sphericity ratio indicates how similar test utterance speech is to the speech used when the user was enrolled, as represented by the target speaker model, and how dissimilar the test utterance speech is from the background model.
    Type: Application
    Filed: December 10, 2002
    Publication date: June 10, 2004
    Applicant: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Ganesh N. Ramaswamy, Ran Zilca
  • Publication number: 20040102971
    Abstract: In a particular embodiment, the disclosure is directed to a method of recognizing input that includes receiving input data; receiving context data associated with the input data, the context data associated with an interpretation mapping; and generating symbolic data from the input data using the interpretation mapping. In another particular embodiment, the disclosure is directed to an input recognition system that includes a context module, an input capture module, and a recognition module. The context module is configured to receive context input and provide context data. The input capture module is configured to receive input data and is configured to provide digitized input data. The recognition module is coupled to the context module and is coupled to the input capture module. The recognition module is configured to receive the digitized input data and to interpret the digitized input data utilizing an interpretation mapping associated with the context data.
    Type: Application
    Filed: August 11, 2003
    Publication date: May 27, 2004
    Applicant: RECARE, Inc.
    Inventors: Randolph B. Lipscher, Michael D. Dahlin
  • Patent number: 6725193
    Abstract: A voice recognition system for use with a communication system having an incoming line carrying an incoming signal from a first end to a second end operably attached to a speaker, and an outgoing line carrying an outgoing signal from a microphone near the speaker. A first speech recognition unit (SRU) detects selected incoming words and a second SRU detects outgoing words. A comparator/signal generator compares the outgoing word with the incoming word and outputs the outgoing word when the outgoing word does not match the incoming word. The first SRU may be delayed relative to the second SRU. The SRUs may also search only for selected words in a template, or may ignore words that are first detected by the other SRU. A signaler may also provide a signal indicating that one of the selected words is included in a known incoming signal, with an SRU responsive to that signal ignoring that command word in its template for a selected period of time.
    Type: Grant
    Filed: September 13, 2000
    Date of Patent: April 20, 2004
    Assignee: Telefonaktiebolaget LM Ericsson
    Inventor: Thomas J. Makovicka
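    The comparator logic above can be sketched as a small filter: words recognized on the outgoing line are suppressed when they were recently heard on the incoming line, since the match is assumed to be far-end audio leaking through the speaker rather than a user command. The window length is an illustrative assumption.

    ```python
    from collections import deque

    class CommandFilter:
        """Compare outgoing-line words against words recently detected on
        the (delayed) incoming line; matches are ignored, non-matches are
        passed through as genuine user commands."""
        def __init__(self, window=5):
            self.recent_incoming = deque(maxlen=window)

        def incoming_word(self, word):
            self.recent_incoming.append(word)

        def outgoing_word(self, word):
            # emit the word only if it was not just heard on the incoming line
            return None if word in self.recent_incoming else word
    ```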