Distance Patents (Class 704/238)
  • Patent number: 7664639
    Abstract: A telephone dialing speech recognition method includes determining a location associated with a cellular telephone from geographic indications provided by the cellular telephone, and selecting associated search information as a function of the location. Speech-based dialers operating in a car environment often have difficulty determining the digits spoken, since some digits have similar-sounding names in certain languages. To improve recognition performance, constraints are added to the recognition process based on the natural constraints of the dialing process. The method utilizes the selected associated search information when recognizing the incoming speech signal. For speech dialing, if the user defines a location where the phone is used, then the “numbering plan” of that country may be used to constrain certain digits. Such constraining of the speech recognizer significantly improves the recognition results.
    Type: Grant
    Filed: January 14, 2004
    Date of Patent: February 16, 2010
    Assignee: Art Advanced Recognition Technologies, Inc.
    Inventors: Ran Mochary, Eran Dukas
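    The entry above constrains a digit recognizer with the numbering plan of the caller's location. The sketch below is only an illustration of that idea: the three-position plan table and the per-digit acoustic scores are invented placeholders, not the patented method.
```python
# Hypothetical sketch of numbering-plan-constrained digit decoding.
# The plan table and the acoustic scores are invented for illustration only.

# Allowed digits per dial position for a fictitious 3-digit plan:
# e.g. the first digit may not be 0 in this made-up plan.
PLAN = {
    0: set("123456789"),
    1: set("0123456789"),
    2: set("0123456789"),
}

def decode_digits(frame_scores):
    """frame_scores: list of dicts digit -> acoustic score (higher is better)."""
    result = []
    for pos, scores in enumerate(frame_scores):
        allowed = PLAN.get(pos, set("0123456789"))
        # Restrict the search space to digits the numbering plan allows here.
        best = max((d for d in scores if d in allowed), key=lambda d: scores[d])
        result.append(best)
    return "".join(result)

# Example: an "oh"/"four" confusion resolved because position 0 cannot be 0.
scores = [{"0": 0.9, "4": 0.85}, {"5": 0.7, "9": 0.2}, {"1": 0.6, "7": 0.5}]
print(decode_digits(scores))  # -> "451"
```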
  • Patent number: 7629902
    Abstract: The present invention relates to methods and apparatus for preventing power imbalance in a multiple input multiple output (MIMO) wireless precoding system. According to one aspect of the present invention, a codebook is constructed with a first subset of codewords that are constant modulus matrices, and a second subset of codewords that are non-constant modulus matrices. A mapping scheme is established between the first subset of codewords and the second subset of codewords. When a unit of user equipment feeds back a first codeword that is a non-constant modulus matrix, the Node-B may replace the first codeword with a second codeword that is selected from the first subset of codewords and that corresponds to the first codeword in accordance with the mapping scheme.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: December 8, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jianzhong Zhang, Cornelius Van Rensburg, Farooq Khan, Bruno Clerckx, Juho Lee, Zhouyue Pi
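    The abstract above replaces a fed-back non-constant-modulus precoding codeword with a mapped constant-modulus one at the Node-B. The sketch below shows only that replacement step, using an invented two-entry codebook and mapping rather than any standardized codebook.
```python
import numpy as np

def is_constant_modulus(W, tol=1e-9):
    """True if every entry of the precoding matrix has the same magnitude."""
    mags = np.abs(W)
    return bool(np.all(np.abs(mags - mags.flat[0]) < tol))

# Invented toy codebook: index 0 is constant modulus, index 1 is not.
codebook = [
    np.array([[1.0, 1.0], [1.0, -1.0]]) / 2.0,   # constant modulus
    np.array([[1.0, 0.0], [0.0, 0.5]]),          # non-constant modulus
]
# Invented mapping from non-constant-modulus indices to their replacements.
cm_replacement = {1: 0}

def apply_feedback(index):
    """Replace a fed-back non-constant-modulus codeword with its mapped codeword."""
    if not is_constant_modulus(codebook[index]):
        index = cm_replacement[index]
    return codebook[index]

print(apply_feedback(1))   # Node-B precodes with the constant-modulus matrix
```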
  • Patent number: 7590626
    Abstract: A distributional similarity between a word of a search query and a term of a candidate word sequence is used to determine an error model probability that describes the probability of the search query given the candidate word sequence. The error model probability is used to determine a probability of the candidate word sequence given the search query. The probability of the candidate word sequence given the search query is used to select a candidate word sequence as a corrected word sequence for the search query. Distributional similarity is also used to build features that are applied in a maximum entropy model to compute the probability of the candidate word sequence given the search query.
    Type: Grant
    Filed: October 30, 2006
    Date of Patent: September 15, 2009
    Assignee: Microsoft Corporation
    Inventors: Mu Li, Ming Zhou
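    The abstract above selects a corrected query by combining an error model probability P(query | candidate) with a prior over candidates. The sketch below is a generic noisy-channel selection with invented toy probabilities; the patent's distributional-similarity error model and maximum entropy features are not reproduced here.
```python
import math

# Invented toy probabilities, for illustration only.
error_model = {          # P(query | candidate)
    ("briteny spears", "britney spears"): 0.10,
    ("briteny spears", "briteny spears"): 0.60,
}
prior = {                # P(candidate), e.g. estimated from query-log frequencies
    "britney spears": 1e-3,
    "briteny spears": 1e-7,
}

def correct(query, candidates):
    """argmax_c P(c | q), proportional to P(q | c) * P(c)."""
    def score(c):
        return math.log(error_model.get((query, c), 1e-12)) + math.log(prior.get(c, 1e-12))
    return max(candidates, key=score)

print(correct("briteny spears", ["britney spears", "briteny spears"]))
```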
  • Patent number: 7590537
    Abstract: A speech recognition method and apparatus perform speaker clustering and speaker adaptation using average model variation information over speakers while analyzing the quantity variation amount and the directional variation amount. In the speaker clustering method, a speaker group model variation is generated based on the model variation between a speaker-independent model and a training speaker ML model. In the speaker adaptation method, the model whose variation between a test speaker ML model and the speaker group ML model to which the test speaker belongs is most similar to a training speaker group model variation is found, and speaker adaptation is performed on the found model. Herein, the model variations in the speaker clustering and the speaker adaptation are calculated while analyzing both the quantity variation amount and the directional variation amount. The present invention may be applied to speaker adaptation algorithms such as MLLR and MAP.
    Type: Grant
    Filed: December 27, 2004
    Date of Patent: September 15, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Namhoon Kim, Injeong Choi, Yoonkyung Song
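    The abstract above compares model variations by both their quantity (magnitude) and their direction. The sketch below is a simplified illustration that treats a model as a single stacked mean vector, an assumption made here only for brevity.
```python
import numpy as np

def model_variation(model_a, model_b):
    """Return (quantity, direction) of the variation vector model_b - model_a."""
    delta = model_b - model_a
    quantity = np.linalg.norm(delta)                 # how much the model moved
    direction = delta / (quantity + 1e-12)           # unit vector: where it moved
    return quantity, direction

def variation_similarity(var_a, var_b):
    """Compare two variations on both magnitude and direction."""
    qa, da = var_a
    qb, db = var_b
    quantity_ratio = min(qa, qb) / (max(qa, qb) + 1e-12)
    directional_sim = float(np.dot(da, db))          # cosine, since both are unit vectors
    return quantity_ratio * directional_sim

si = np.zeros(4)                                      # toy speaker-independent model
trained = np.array([1.0, 0.5, 0.0, -0.5])
test = np.array([0.9, 0.6, 0.1, -0.4])
print(variation_similarity(model_variation(si, trained), model_variation(si, test)))
```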
  • Patent number: 7565290
    Abstract: A speech recognition apparatus includes a word dictionary having recognition target words, a first acoustic model which expresses a reference pattern of a speech unit by one or more states, a second acoustic model which is lower in precision than said first acoustic model, selection means for selecting one of said first acoustic model and said second acoustic model on the basis of a parameter associated with a state of interest, and likelihood calculation means for calculating a likelihood of an acoustic feature parameter with respect to said acoustic model selected by said selection means.
    Type: Grant
    Filed: June 24, 2005
    Date of Patent: July 21, 2009
    Assignee: Canon Kabushiki Kaisha
    Inventors: Hideo Kuboyama, Toshiaki Fukada, Yasuhiro Komori
  • Patent number: 7478045
    Abstract: In a method for characterizing a signal representing audio content, a measure of the tonality of the signal is determined, whereupon a statement is made about the audio content of the signal on the basis of that measure. The measure of tonality is derived from a quotient whose numerator is the mean of the summed values of spectral components of the signal raised to a first power and whose denominator is the mean of the summed values of spectral components raised to a second power, the first and second powers differing from each other. The tonality measure used for the content analysis is robust to signal distortion, e.g. due to MP3 coding, and correlates highly with the content of the analyzed signal.
    Type: Grant
    Filed: July 15, 2002
    Date of Patent: January 13, 2009
    Assignee: M2ANY GmbH
    Inventors: Eric Allamanche, Jürgen Herre, Oliver Hellmuth, Thorsten Kastner
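    The abstract above defines tonality as a quotient of means of spectral components raised to two different powers. The sketch below implements that quotient directly; the exponent values and FFT size are illustrative assumptions, not the patented parameters.
```python
import numpy as np

def tonality_measure(signal, p1=0.25, p2=2.0, n_fft=1024):
    """Quotient of means of spectral components raised to two different powers.
    The exponents p1 and p2 are illustrative choices, not the patented values."""
    spectrum = np.abs(np.fft.rfft(signal, n_fft))
    numerator = np.mean(spectrum ** p1)
    denominator = np.mean(spectrum ** p2)
    return numerator / (denominator + 1e-12)

t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440 * t)       # tonal signal
noise = np.random.randn(16000)           # noise-like signal
print(tonality_measure(tone), tonality_measure(noise))
```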
  • Patent number: 7475013
    Abstract: A system and method for voice recognition is disclosed. The system enrolls speakers using enrollment voice samples and identification information. An extraction module characterizes enrollment voice samples with high-dimensional feature vectors or speaker data points. A data structuring module organizes data points into a high-dimensional data structure, such as a kd-tree, in which similarity between data points dictates a distance, such as a Euclidean distance, a Minkowski distance, or a Manhattan distance. The system recognizes a speaker using an unidentified voice sample. A data querying module searches the data structure to generate a subset of approximate nearest neighbors based on an extracted high-dimensional feature vector. A data modeling module uses Parzen windows to estimate a probability density function representing how closely characteristics of the unidentified speaker match enrolled speakers, in real-time, without extensive training data or parametric assumptions about data distribution.
    Type: Grant
    Filed: March 26, 2004
    Date of Patent: January 6, 2009
    Assignee: Honda Motor Co., Ltd.
    Inventor: Ryan Rifkin
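    The abstract above indexes enrollment feature vectors in a kd-tree and scores an unknown sample with a Parzen-window estimate over its approximate nearest neighbors. The sketch below uses SciPy's cKDTree and a Gaussian window; the feature extraction step and the enrollment data are invented placeholders.
```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
dim = 16                                            # toy feature dimensionality
enroll = {                                          # invented enrollment vectors per speaker
    "alice": rng.normal(0.0, 1.0, (200, dim)),
    "bob":   rng.normal(2.0, 1.0, (200, dim)),
}
points = np.vstack(list(enroll.values()))
labels = np.array(sum(([name] * len(v) for name, v in enroll.items()), []))
tree = cKDTree(points)                              # kd-tree over enrollment points

def identify(sample, k=20, bandwidth=1.0):
    """Parzen-window score per speaker over the k nearest enrollment points."""
    dist, idx = tree.query(sample, k=k)
    kernel = np.exp(-0.5 * (dist / bandwidth) ** 2)  # Gaussian window
    scores = {name: kernel[labels[idx] == name].sum() for name in enroll}
    return max(scores, key=scores.get)

print(identify(rng.normal(2.0, 1.0, dim)))           # expected: "bob"
```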
  • Publication number: 20080255839
    Abstract: A speech recognition circuit comprising a circuit for providing state identifiers which identify states corresponding to nodes or groups of adjacent nodes in a lexical tree, and for providing scores corresponding to said state identifiers, the lexical tree comprising a model of words; a memory structure for receiving and storing state identifiers identified by a node identifier identifying a node or group of adjacent nodes, said memory structure being adapted to allow lookup to identify particular state identifiers, reading of the scores corresponding to the state identifiers, and writing back of the scores to the memory structure after modification of the scores; an accumulator for receiving score updates corresponding to particular state identifiers from a score update generating circuit which generates the score updates using audio input, for receiving scores from the memory structure, and for modifying said scores by adding said score updates to said scores; and a selector circuit for selecting at least o
    Type: Application
    Filed: September 14, 2005
    Publication date: October 16, 2008
    Applicant: ZENTIAN LIMITED
    Inventors: Guy Larri, Mark Catchpole, Damian Kelly Harris-Dowsett, Timothy Brian Reynolds
  • Patent number: 7421305
    Abstract: The present invention relates to a system and methodology to facilitate automatic management and pruning of audio files residing in a database. Audio fingerprinting is a powerful tool for identifying streaming or file-based audio, using a database of fingerprints. Duplicate detection identifies duplicate audio clips in a set, even if the clips differ in compression quality or duration. The present invention can be provided as a self-contained application that does not require an external database of fingerprints. Also, a user interface provides various options for managing and pruning the audio files.
    Type: Grant
    Filed: February 24, 2004
    Date of Patent: September 2, 2008
    Assignee: Microsoft Corporation
    Inventors: Christopher J. C. Burges, John C. Platt, Daniel Plastina, Erin L. Renshaw
  • Patent number: 7403891
    Abstract: The present invention relates to an apparatus and method for recognizing biological named entities in biological literature based on the Unified Medical Language System (UMLS). The apparatus and method receive the Metathesaurus from the UMLS; construct a concept name database, a single name database and a category keyterm database, which are language resources used to recognize a named entity; receive each concept name stored in the concept name database; extract features of each of the concept names by using data stored in the single name database and the category keyterm database; construct a rule database by creating rules used to recognize the named entity and filtering the rules by using the extracted features; receive a piece of biological literature; extract nouns and noun phrases that are candidate named entities; apply the rules stored in the rule database to the nouns and noun phrases; and recognize the named entities.
    Type: Grant
    Filed: February 13, 2004
    Date of Patent: July 22, 2008
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Soo Jun Park, Tae Hyun Kim, Hyun Sook Lee, Hyun Chul Jang, Seon Hee Park
  • Patent number: 7379868
    Abstract: A differential compression technique is disclosed for compressing individual speaker models, such as Gaussian mixture models, by computing a delta model from the difference between an individual speaker model and a baseline model. Further compression may be applied to the delta model to reduce the large storage requirements generally attributed to speaker models.
    Type: Grant
    Filed: January 2, 2003
    Date of Patent: May 27, 2008
    Assignee: Massachusetts Institute of Technology
    Inventor: Douglas A. Reynolds
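    The abstract above stores a speaker model as its difference from a baseline model, with optional further compression. The sketch below applies the idea to GMM mean vectors only; the uniform quantization step used for the extra compression is an illustrative assumption.
```python
import numpy as np

def compress(speaker_means, baseline_means, step=0.01):
    """Store a coarsely quantized delta instead of the full speaker model."""
    delta = speaker_means - baseline_means
    return np.round(delta / step).astype(np.int16)   # small integers compress well

def decompress(delta_q, baseline_means, step=0.01):
    return baseline_means + delta_q.astype(np.float32) * step

baseline = np.random.randn(512, 39).astype(np.float32)   # toy baseline (UBM-style) means
speaker = baseline + 0.05 * np.random.randn(512, 39).astype(np.float32)
restored = decompress(compress(speaker, baseline), baseline)
print(float(np.max(np.abs(restored - speaker))))          # reconstruction error <= step/2
```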
  • Patent number: 7369993
    Abstract: A system and method of recognizing speech comprises an audio receiving element and a computer server. The audio receiving element and the computer server perform the process steps of the method. The method involves training a stored set of phonemes by converting them into n-dimensional space, where n is a relatively large number. Once the stored phonemes are converted, they are transformed using singular value decomposition to conform the data generally into a hypersphere. The received phonemes from the audio-receiving element are also converted into n-dimensional space and transformed using singular value decomposition to conform the data into a hypersphere. The method compares the transformed received phoneme to each transformed stored phoneme by comparing a first distance from a center of the hypersphere to a point associated with the transformed received phoneme and a second distance from the center of the hypersphere to a point associated with the respective transformed stored phoneme.
    Type: Grant
    Filed: December 29, 2006
    Date of Patent: May 6, 2008
    Assignee: AT&T Corp.
    Inventor: Bishnu Saroop Atal
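    The abstract above matches a received phoneme to stored phonemes by comparing their distances from the center of a hypersphere. The sketch below reduces this to the radial-distance comparison only; the SVD transformation and the stored radii are invented stand-ins, not the trained representations.
```python
import numpy as np

rng = np.random.default_rng(1)
center = np.zeros(32)                                    # toy hypersphere center
radii = {"aa": 2.0, "iy": 4.0, "sh": 6.0, "t": 8.0}      # invented, well-separated radii
stored = {}
for phoneme, r in radii.items():
    v = rng.normal(size=32)
    stored[phoneme] = center + r * v / np.linalg.norm(v)  # point at distance r from center

def radial_distance(point):
    return float(np.linalg.norm(point - center))

def match(received):
    """Pick the stored phoneme whose distance from the center is closest to the
    received phoneme's distance from the center, as the abstract describes."""
    d_recv = radial_distance(received)
    return min(stored, key=lambda p: abs(radial_distance(stored[p]) - d_recv))

print(match(stored["iy"] + 0.01 * rng.normal(size=32)))   # expected: "iy"
```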
  • Patent number: 7363222
    Abstract: A method and database system are disclosed for searching data in at least two databases (Dn), particularly for searching telephone directories or the like. To allow simultaneous access to two or more databases by means of speech recognition, in order to perform a search therein as in a single database, a search term is input by speech via a voice-controlled user interface (28), which is connected to a database primary control apparatus (26) and comprises speech recognition front end means (8, 9) for processing a sound sequence of a search term input by speech to obtain a comparable speech pattern (X) thereof. By means of speech recognition back end means (6) associated with databases (D1-D6), the comparable speech pattern (X) is compared with corresponding speech patterns (An,i) of database entries (En,i) to determine, for each of the at least two databases (Dn), at least that database entry (En,j) whose speech pattern (An,j) best matches the comparable speech pattern (X) of the search term.
    Type: Grant
    Filed: June 24, 2002
    Date of Patent: April 22, 2008
    Assignee: Nokia Corporation
    Inventor: Michael Josenhans
  • Patent number: 7356466
    Abstract: A method and apparatus for calculating an observation probability includes a first operation unit that subtracts a mean of a first plurality of parameters of an input voice signal from a second parameter of the input voice signal, and multiplies the subtraction result to obtain a first output. The first output is squared and accumulated N times in a second operation unit to obtain a second output. A third operation unit subtracts a given weighted value from the second output to obtain a third output, and a comparator stores the third output in order to extract L outputs therefrom, storing the L extracted outputs based on an order of magnitude of the extracted L outputs.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: April 8, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Byung-Ho Min, Tae-Su Kim, Hyun-Woo Park, Ho-Rang Jang, Keun-Cheol Hong, Sung-Jae Kim
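    The abstract above pipelines the score computation as subtract, multiply, square-and-accumulate N times, subtract a weight, and keep the best L results. The sketch below mirrors that pipeline under a diagonal-Gaussian reading of the operations; the scale and weight values are placeholders (in that reading they would come from the variances and the log normalization term).
```python
import numpy as np

def observation_scores(x, means, scales, weights, L=3):
    """x: (N,) feature vector; means, scales: (M, N); weights: (M,).
    Returns indices and values of the L best-scoring entries (lowest value = best here)."""
    diff = (x - means) * scales              # subtract mean, multiply (first unit)
    acc = np.sum(diff ** 2, axis=1)          # square and accumulate N times (second unit)
    score = acc - weights                    # subtract a given weighted value (third unit)
    order = np.argsort(score)                # comparator: keep L outputs by magnitude
    return order[:L], score[order[:L]]

rng = np.random.default_rng(0)
M, N = 8, 13
idx, best = observation_scores(rng.normal(size=N), rng.normal(size=(M, N)),
                               np.ones((M, N)), np.zeros(M))
print(idx, best)
```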
  • Patent number: 7349576
    Abstract: A method for recognition of a handwritten character comprises the steps of determining a plurality of position features defining the handwritten character, and comparing the handwritten character to reference characters stored in a database in order to find the closest matching reference character. The step of comparing comprises the steps of computing a difference between one of the plurality of position features of the handwritten character and a corresponding position feature of one of the reference characters, determining, by lookup in a predefined table, a distance measure based on the computed difference and determining a distance measure for each of the plurality of position features of the handwritten character, and computing a cost function based on the determined distance measures. A device and a computer program for implementing the method are also described.
    Type: Grant
    Filed: January 11, 2002
    Date of Patent: March 25, 2008
    Assignee: Zi Decuma AB
    Inventor: Anders Holtsberg
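    The abstract above replaces the per-feature distance computation with a lookup in a predefined table indexed by the quantized feature difference, and sums the looked-up values into a cost function. The quantization grid and table contents in the sketch below are invented for illustration.
```python
import numpy as np

# Invented lookup table: distance measure indexed by quantized feature difference.
DIFF_BINS = np.linspace(-1.0, 1.0, 21)             # quantization grid for differences
DIST_TABLE = np.abs(np.linspace(-1.0, 1.0, 21))    # here simply |diff|, but any table works

def cost(candidate, reference):
    """Sum of table-looked-up distance measures over all position features."""
    diffs = candidate - reference
    bins = np.clip(np.digitize(diffs, DIFF_BINS) - 1, 0, len(DIST_TABLE) - 1)
    return float(DIST_TABLE[bins].sum())

def recognize(candidate, references):
    """Return the reference character with the lowest cost function value."""
    return min(references, key=lambda name: cost(candidate, references[name]))

refs = {"a": np.array([0.1, 0.2, 0.3, 0.4]), "b": np.array([0.9, 0.8, 0.7, 0.6])}
print(recognize(np.array([0.15, 0.25, 0.35, 0.35]), refs))   # expected: "a"
```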
  • Patent number: 7328153
    Abstract: Copies of original sound recordings are identified by extracting features from the copy, creating a vector of those features, and comparing that vector against a database of vectors. Identification can be performed for copies of sound recordings that have been subjected to compression and other manipulation such that they are not exact replicas of the original. Computational efficiency permits many hundreds of queries to be serviced at the same time. The vectors may be less than 100 bytes, so that many millions of vectors can be stored on a portable device.
    Type: Grant
    Filed: July 22, 2002
    Date of Patent: February 5, 2008
    Assignee: Gracenote, Inc.
    Inventors: Maxwell Wells, Vidya Venkatachalam, Luca Cazzanti, Kwan Fai Cheung, Navdeep Dhillon, Somsak Sukittanon
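    The abstract above identifies a recording by comparing a compact fingerprint vector against a database of vectors. The sketch below is a brute-force nearest-neighbor lookup with a decision threshold; the fingerprint extraction itself and the vector size are placeholders, not the patented features.
```python
import numpy as np

rng = np.random.default_rng(2)
db_vectors = rng.normal(size=(10000, 24)).astype(np.float32)   # toy fingerprint database
db_titles = [f"track_{i}" for i in range(len(db_vectors))]

def identify(query, threshold=1.0):
    """Return the best-matching title, or None if nothing is close enough."""
    d = np.linalg.norm(db_vectors - query, axis=1)
    best = int(np.argmin(d))
    return db_titles[best] if d[best] < threshold else None

# A slightly degraded copy (e.g. after lossy compression) of database entry 42:
print(identify(db_vectors[42] + 0.05 * rng.normal(size=24)))   # expected: "track_42"
```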
  • Patent number: 7321853
    Abstract: The present invention relates to a speech recognition apparatus and a speech recognition method for speech recognition with improved accuracy. A distance calculator 47 determines the distance from a microphone 21 to a user uttering speech. Data indicating the determined distance is supplied to a speech recognition unit 41B. The speech recognition unit 41B has plural sets of acoustic models produced from speech data obtained by capturing speech uttered at various distances. From those sets of acoustic models, the speech recognition unit 41B selects the set of acoustic models produced from speech data uttered at the distance closest to the distance determined by the distance calculator 47, and performs speech recognition using the selected set of acoustic models.
    Type: Grant
    Filed: February 24, 2006
    Date of Patent: January 22, 2008
    Assignee: Sony Corporation
    Inventor: Yasuharu Asano
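    The abstract above selects, from acoustic model sets trained at different speaker-to-microphone distances, the set whose training distance is closest to the measured one. A minimal selection sketch, with placeholder model objects and distances:
```python
# Acoustic model sets keyed by the speaker-to-microphone distance (in meters)
# at which their training speech was captured; the values are placeholders.
model_sets = {0.3: "close_talk_models", 1.0: "mid_range_models", 3.0: "far_field_models"}

def select_model_set(measured_distance):
    """Pick the set whose training distance is closest to the measured distance."""
    best = min(model_sets, key=lambda d: abs(d - measured_distance))
    return model_sets[best]

print(select_model_set(1.4))   # -> "mid_range_models"
```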
  • Patent number: 7315819
    Abstract: A process of identifying a speaker in coded speech data and a process of searching for the speaker are efficiently performed with fewer computations and with a smaller storage capacity. In an information search apparatus, an LSP decoding section extracts and decodes only LSP information from coded speech data which is read for each block. An LPC conversion section converts the LSP information into LPC information. A Cepstrum conversion section converts the obtained LPC information into an LPC Cepstrum which represents features of speech. A vector quantization section performs vector quantization on the LPC Cepstrum. A speaker identification section identifies a speaker on the basis of the result of the vector quantization. Furthermore, the identified speaker is compared with a search condition in a condition comparison section, and based on the result, the search result is output.
    Type: Grant
    Filed: July 23, 2002
    Date of Patent: January 1, 2008
    Assignee: Sony Corporation
    Inventors: Yasuhiro Toguri, Masayuki Nishiguchi
  • Patent number: 7315813
    Abstract: A method of speech segment selection for concatenative synthesis based on a prosody-aligned distance measure is disclosed. The method is based on comparison of speech segments segmented from a speech corpus, wherein speech segments are fully prosody-aligned to each other before distortion is measured. With prosody alignment embedded in the selection process, distortion resulting from possible prosody modification in synthesis can be taken into account objectively in the selection phase. To carry out the purpose of the present invention, automatic segmentation, pitch marking and the PSOLA method work together for prosody alignment. Two distortion measures, MFCC and PSQM, are used for comparing two prosody-aligned speech segments because of human perceptual considerations.
    Type: Grant
    Filed: July 29, 2002
    Date of Patent: January 1, 2008
    Assignee: Industrial Technology Research Institute
    Inventors: Chih-Chung Kuo, Chi-Shiang Kuo
  • Patent number: 7295978
    Abstract: A system for recognizing speech receives an input speech vector and identifies a Gaussian distribution. The system determines an address from the input speech vector (610) and uses the address to retrieve a distance value for the Gaussian distribution from a table (620). The system then determines the probability of the Gaussian distribution using the distance value (630) and recognizes the input speech vector based on the determined probability (640).
    Type: Grant
    Filed: September 5, 2000
    Date of Patent: November 13, 2007
    Assignees: Verizon Corporate Services Group Inc., BBN Technologies Corp.
    Inventors: Richard Mark Schwartz, Jason Charles Davenport, James Donald Van Sciver, Long Nguyen
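    The abstract above turns the input vector into a table address, reads a precomputed distance value, and converts it to a Gaussian probability. The sketch below uses scalar quantization as an invented addressing scheme for a single one-dimensional Gaussian; the actual addressing and table layout are not specified here.
```python
import numpy as np

# Precompute a distance table for one Gaussian over a quantized feature grid.
mean, var = 0.5, 0.04
grid = np.linspace(-2.0, 2.0, 256)                  # 8-bit address space (invented)
distance_table = (grid - mean) ** 2 / var           # Mahalanobis-style distance values

def address(x):
    """Map an input value to a table address (scalar quantization, invented)."""
    step = grid[1] - grid[0]
    return int(np.clip(np.round((x - grid[0]) / step), 0, len(grid) - 1))

def gaussian_probability(x):
    d = distance_table[address(x)]                  # table lookup instead of arithmetic
    return float(np.exp(-0.5 * d) / np.sqrt(2 * np.pi * var))

print(gaussian_probability(0.52))
```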
  • Publication number: 20070185712
    Abstract: A method of measuring the confidence of speech recognition in a speech recognizer compares a phase change point with a phoneme string change point and uses both the difference between the two points and a likelihood ratio; an apparatus using the method is also provided. That is, the method of the present invention includes detecting a phase change point of a speech signal; detecting a phoneme string change point according to a result of speech recognition; and calculating the confidence of the speech recognition by using a difference between the detected phase change point and phoneme string change point. According to the present invention, the performance of confidence measurement may be improved by simultaneously taking into consideration not only a likelihood ratio but also the result of comparing a phase change point with a phoneme string change point.
    Type: Application
    Filed: June 30, 2006
    Publication date: August 9, 2007
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jae-Hoon Jeong, Kwang Cheol Oh
  • Patent number: 7219059
    Abstract: A method and apparatus for generating a pronunciation score by receiving a user phrase intended to conform to a reference phrase and processing the user phrase in accordance with at least one of an articulation-scoring engine, a duration scoring engine and an intonation-scoring engine to derive thereby the pronunciation score.
    Type: Grant
    Filed: July 3, 2002
    Date of Patent: May 15, 2007
    Assignee: Lucent Technologies Inc.
    Inventors: Sunil K. Gupta, Ziyi Lu, Fengguang Zhao
  • Patent number: 7219058
    Abstract: An automatic speech recognition (ASR) system and method is provided for controlling the recognition of speech utterances generated by an end user operating a communications device. The ASR system and method can be used with a mobile device that is used in a communications network. The ASR system can be used for ASR of speech utterances input into a mobile device, to perform compensating techniques using at least one characteristic, and for updating an ASR speech recognizer associated with the ASR system by determining and using a background noise value and a distortion value that is based on the features of the mobile device. The ASR system can be used to augment a limited data input capability of a mobile device, for example, caused by limited input devices physically located on the mobile device.
    Type: Grant
    Filed: October 1, 2001
    Date of Patent: May 15, 2007
    Assignee: AT&T Corp.
    Inventors: Richard C. Rose, Sarangarajan Pathasarathy, Aaron Edward Rosenberg, Shrikanth Sambasivan Narayanan
  • Patent number: 7216076
    Abstract: A system and method of recognizing speech comprises an audio receiving element and a computer server. The audio receiving element and the computer server perform the process steps of the method. The method involves training a stored set of phonemes by converting them into n-dimensional space, where n is a relatively large number. Once the stored phonemes are converted, they are transformed using singular value decomposition to conform the data generally into a hypersphere. The received phonemes from the audio-receiving element are also converted into n-dimensional space and transformed using singular value decomposition to conform the data into a hypersphere. The method compares the transformed received phoneme to each transformed stored phoneme by comparing a first distance from a center of the hypersphere to a point associated with the transformed received phoneme and a second distance from the center of the hypersphere to a point associated with the respective transformed stored phoneme.
    Type: Grant
    Filed: December 19, 2005
    Date of Patent: May 8, 2007
    Assignee: AT&T Corp.
    Inventor: Bishnu Saroop Atal
  • Patent number: 7165030
    Abstract: A method for concatenative speech synthesis includes a processing stage that selects segments based on their symbolic labeling in an efficient graph-based search, which uses a finite-state transducer formalism. This graph-based search uses a representation of concatenation constraints and costs that does not necessarily grow with the size of the source corpus, thereby limiting the increase in computation required for the search as the size of the source corpus increases. In one application of this method, multiple alternative segment sequences are generated and a best segment sequence is then selected using characteristics that depend on specific signal characteristics of the segments.
    Type: Grant
    Filed: September 17, 2001
    Date of Patent: January 16, 2007
    Assignee: Massachusetts Institute of Technology
    Inventors: Jon Rong-Wei Yi, James Robert Glass, Irvine Lee Hetherington
  • Patent number: 7031917
    Abstract: The present invention relates to a speech recognition apparatus and a speech recognition method for speech recognition with improved accuracy. A distance calculator 47 determines the distance from a microphone 21 to a user uttering speech. Data indicating the determined distance is supplied to a speech recognition unit 41B. The speech recognition unit 41B has plural sets of acoustic models produced from speech data obtained by capturing speech uttered at various distances. From those sets of acoustic models, the speech recognition unit 41B selects the set of acoustic models produced from speech data uttered at the distance closest to the distance determined by the distance calculator 47, and performs speech recognition using the selected set of acoustic models.
    Type: Grant
    Filed: October 21, 2002
    Date of Patent: April 18, 2006
    Assignee: Sony Corporation
    Inventor: Yasuharu Asano
  • Patent number: 7006969
    Abstract: A system and method of recognizing speech comprises an audio receiving element and a computer server. The audio receiving element and the computer server perform the process steps of the method. The method involves training a stored set of phonemes by converting them into n-dimensional space, where n is a relatively large number. Once the stored phonemes are converted, they are transformed using singular value decomposition to conform the data generally into a hypersphere. The received phonemes from the audio-receiving element are also converted into n-dimensional space and transformed using singular value decomposition to conform the data into a hypersphere. The method compares the transformed received phoneme to each transformed stored phoneme by comparing a first distance from a center of the hypersphere to a point associated with the transformed received phoneme and a second distance from the center of the hypersphere to a point associated with the respective transformed stored phoneme.
    Type: Grant
    Filed: November 1, 2001
    Date of Patent: February 28, 2006
    Assignee: AT&T Corp.
    Inventor: Bishnu Saroop Atal
  • Patent number: 6983245
    Abstract: A spectral distance calculator includes a calculator for performing spectral distance calculations to compare a spectrum of an input signal in the presence of a noise signal with a reference spectrum. A memory pre-stores a noise spectrum from the noise signal. A masking unit masks the spectral distance between the input spectrum and the reference spectrum with respect to the pre-stored noise spectrum.
    Type: Grant
    Filed: June 7, 2000
    Date of Patent: January 3, 2006
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Alberto Jimenez Felstrom, Jim Rasmusson
  • Patent number: 6978237
    Abstract: A speech recognition support method in a system to retrieve a map in response to a user's input speech. The user's speech is recognized and a recognition result is obtained. If the recognition result represents a point on the map, a distance between the point and a base point on the map is calculated. It is then determined whether the distance is above a threshold. If the distance is above the threshold, an inquiry to confirm whether the recognition result is correct is output to the user.
    Type: Grant
    Filed: October 20, 2003
    Date of Patent: December 20, 2005
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Mitsuyoshi Tachimori, Hiroshi Kanazawa
  • Patent number: 6976019
    Abstract: The present invention relates to phonetic self-improving search engines. The search engine may include a phonetic database having a plurality of phonetic equivalent formulas stored therein, each of the phonetic equivalent formulas being associated with at least one respective pronounceable unit. After an initial query in a primary database fails to produce a positive result, an error memory database may be queried with a search string to obtain a positive result based on records of previously failed searches which ultimately found a positive result. If no record is found, the search string may be parsed into at least one pronounceable unit. Phonetically equivalent formulas may be applied to the at least one pronounceable unit to create at least one phonetic search string which is re-queried into the error memory database and the primary database. Successful positive results may be stored with the search string in the error memory database.
    Type: Grant
    Filed: April 19, 2002
    Date of Patent: December 13, 2005
    Inventor: Arash M Davallou
  • Patent number: 6961702
    Abstract: The invention relates to a method for generating an adapted reference for automatic speech recognition. In a first step, recognition is performed based on a spoken utterance and a recognition result corresponding to a currently valid reference is obtained. In a second step, the currently valid reference is adapted in accordance with the utterance in order to create an adapted reference. In a third step, the adapted reference is assessed and it is decided whether the adapted reference is to be used for further recognition.
    Type: Grant
    Filed: November 6, 2001
    Date of Patent: November 1, 2005
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Stefan Dobler, Andreas Kiessling, Ralph Schleifer, Raymond Brückner
  • Patent number: 6907398
    Abstract: A method is described for compressing the storage space required by HMM prototypes in an electronic memory. For this purpose prescribed HMM prototypes are mapped onto compressed HMM prototypes with the aid of a neural network (encoder). These can be stored with a smaller storage space than the uncompressed HMM prototypes. A second neural network (decoder) serves to reconstruct the HMM prototypes.
    Type: Grant
    Filed: September 6, 2001
    Date of Patent: June 14, 2005
    Assignee: Siemens Aktiengesellschaft
    Inventor: Harald Hoege
  • Patent number: 6879954
    Abstract: A method is provided for improving pattern matching in a speech recognition system having a plurality of acoustic models. The improved method includes: receiving continuous speech input; generating a sequence of acoustic feature vectors that represent temporal and spectral behavior of the speech input; loading a first group of acoustic feature vectors from the sequence of acoustic feature vectors into a memory workspace accessible to a processor; loading an acoustic model from the plurality of acoustic models into the memory workspace; and determining a similarity measure for each acoustic feature vector of the first group of acoustic feature vectors in relation to the acoustic model. Prior to retrieving another group of acoustic feature vectors, similarity measures are computed for the first group of acoustic feature vectors in relation to each of the acoustic models employed by the speech recognition system.
    Type: Grant
    Filed: April 22, 2002
    Date of Patent: April 12, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Patrick Nguyen, Luca Rigazio
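    The abstract above scores one group of feature vectors against every acoustic model before the next group is fetched, a cache-blocking loop order. The sketch below shows that loop order with each model reduced to a single diagonal Gaussian for brevity; the block size and model form are assumptions.
```python
import numpy as np

def blocked_similarities(features, means, inv_vars, block=64):
    """features: (T, D); means, inv_vars: (M, D). Returns (T, M) similarity scores,
    computed one block of frames at a time against all models (cache blocking)."""
    T, M = features.shape[0], means.shape[0]
    scores = np.empty((T, M))
    for start in range(0, T, block):
        group = features[start:start + block]           # load one group of feature vectors
        for m in range(M):                               # score it against every model...
            diff = group - means[m]
            scores[start:start + block, m] = -0.5 * np.sum(diff * diff * inv_vars[m], axis=1)
    return scores                                        # ...before fetching the next group

rng = np.random.default_rng(3)
print(blocked_similarities(rng.normal(size=(300, 39)),
                           rng.normal(size=(10, 39)),
                           np.ones((10, 39))).shape)     # (300, 10)
```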
  • Patent number: 6873955
    Abstract: Partial waveform data representative of a waveform shape variation are extracted from supplied waveform data, and the extracted partial waveform data are stored along with time position information indicative of their respective time positions. In reproduction, the partial waveform data and time position information are read out, then the partial waveform data are arranged on the time axis in accordance with the time position information, and a waveform is produced on the basis of the waveform data arranged on the time axis. In another implementation, sets of sample identification information and time position information are obtained in accordance with a performance tone waveform to be reproduced, and sample data are obtained from a database in accordance with the sample identification information. The thus-obtained sample data are arranged on the time axis in accordance with the time position information, and the desired waveform is produced on the basis of the sample data arranged on the time axis.
    Type: Grant
    Filed: September 22, 2000
    Date of Patent: March 29, 2005
    Assignee: Yamaha Corporation
    Inventors: Hideo Suzuki, Motoichi Tamura, Satoshi Usa
  • Patent number: 6870848
    Abstract: A communication system includes a packet-based data network coupled to various network entities. The communications system includes a community that has a call processing system and various other devices, such as an integrated voice response (IVR) system and plural agent systems. The call processing system includes a combination of logical entities that perform call processing tasks. As an example, the logical entities may include a Session Initiation Protocol (SIP) proxy, a SIP client, and a SIP server. In response to a call request from outside the community, the call processing system, under control of the server, sends back responses to the originating system. The call processing system also establishes a call with a network element inside the community, such as the IVR system. The IVR system is capable of receiving input data from the originating system. Based on the received input data, the call processing system can reconnect or forward the call to one of the agent systems.
    Type: Grant
    Filed: June 7, 2000
    Date of Patent: March 22, 2005
    Assignee: Nortel Networks Limited
    Inventor: Andrew J. Prokop
  • Patent number: 6868378
    Abstract: The invention relates to a process and a system for voice recognition in a noisy signal. In a preferred embodiment, the system (2) comprises modules for detecting speech (30) and for formulating a noise model (31), a module (40) for quantifying the energy level of the noise and for comparing it with preestablished energy spans, a parameterization pathway (5) comprising an optional denoising module (51) with a Wiener filter, a module (52) for calculating the spectral energy in Bark windows, a module (50, 530) for applying a configuration of shift values (531), by adding these values to the Bark coefficients as a function of the quantification (40), so as to modify the parameterization, a module (54) for calculating vectors of parameters, and a block (6) for recognizing shapes, which performs the voice recognition by comparison with vectors of parameters prerecorded during a learning phase.
    Type: Grant
    Filed: November 19, 1999
    Date of Patent: March 15, 2005
    Assignee: Thomson-CSF Sextant
    Inventor: Pierre-Albert Breton
  • Patent number: 6839671
    Abstract: In this invention, dialogue states for a dialogue model are created using a training corpus of example human-human dialogues. Dialogue states are modelled at the turn level rather than at the move level, and the dialogue states are derived from the training corpus. The range of operator dialogue utterances is actually quite small in many services and therefore may be categorized into a set of predetermined meanings. This is an important assumption which is not true of general conversation, but is often true of conversations between telephone operators and people. Phrases are specified which have specific substitution and deletion penalties; for example, the two phrases “I would like to” and “can I” may be specified as a possible substitution with a low or zero penalty. This allows common equivalent phrases to be given low substitution penalties. Insignificant phrases such as ‘erm’ are given low or zero deletion penalties.
    Type: Grant
    Filed: December 19, 2000
    Date of Patent: January 4, 2005
    Assignee: British Telecommunications public limited company
    Inventors: David J. Attwater, Michael D. Edgington, Peter J. Durston
  • Patent number: 6836758
    Abstract: A method and system for speech recognition combines different types of engines in order to recognize user-defined digits and control words, predefined digits and control words, and nametags. Speaker-independent engines are combined with speaker-dependent engines. A Hidden Markov Model (HMM) engine is combined with Dynamic Time Warping (DTW) engines.
    Type: Grant
    Filed: January 9, 2001
    Date of Patent: December 28, 2004
    Assignee: Qualcomm Incorporated
    Inventors: Ning Bi, Andrew P. DeJaco, Harinath Garudadri, Chienchung Chang, William Yee-Ming Huang, Narendranath Malayath, Suhail Jalil, David Puig Oses, Yingyong Qi
  • Patent number: 6836760
    Abstract: A method and apparatus to use semantic inference with speech recognition systems includes recognizing at least one spoken word, processing the spoken word using a context-free grammar, deriving an output from the context-free grammar, and translating the output to a predetermined command.
    Type: Grant
    Filed: September 29, 2000
    Date of Patent: December 28, 2004
    Assignee: Apple Computer, Inc.
    Inventors: Jerome R. Bellegarda, Kim E. A. Silverman
  • Patent number: 6823304
    Abstract: A lead consonant buffer stores a feature parameter preceding a lead voiced sound detected by a voiced sound detector as a feature parameter of a lead consonant. A matching processing unit performs matching processing of a feature parameter of a lead consonant stored in the lead consonant buffer with a feature parameter of a registered pattern. Hence, the matching processing unit can perform matching processing reflecting information on a lead consonant even when no lead consonant can be detected due to noise.
    Type: Grant
    Filed: July 19, 2001
    Date of Patent: November 23, 2004
    Assignee: Renesas Technology Corp.
    Inventor: Masahiko Ikeda
  • Patent number: 6788767
    Abstract: An apparatus and method for enabling provision of a call return service is disclosed. The apparatus utilizes a method of generating telephone numbers from voice messages. The method includes the step of using speech recognition to isolate a spoken number in a voice message, and confirming to a high degree of accuracy that the spoken number represents a telephone number. The method further includes the step of converting the spoken number into a data sequence representing the telephone number. This data sequence is then made available for immediate or later use.
    Type: Grant
    Filed: December 28, 2000
    Date of Patent: September 7, 2004
    Assignee: Gateway, Inc.
    Inventor: Jay V. Lambke
  • Patent number: 6778957
    Abstract: Disclosed is a method of automated handset identification, comprising receiving a sample speech input signal from a sample handset; deriving a cepstral covariance sample matrix from said sample speech signal; calculating, with a distance metric, all distances between said sample matrix and one or more cepstral covariance handset matrices, wherein each said handset matrix is derived from a plurality of speech signals taken from different speakers through the same handset; and determining if the smallest of said distances is below a predetermined threshold value.
    Type: Grant
    Filed: August 21, 2001
    Date of Patent: August 17, 2004
    Assignee: International Business Machines Corporation
    Inventors: Zhong-Hua Wang, David Lubensky, Cheng Wu
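    The abstract above compares a cepstral covariance matrix of the sample against per-handset covariance matrices using a distance metric and a threshold. The sketch below uses one common symmetric covariance distance as an illustrative stand-in; the patent does not specify that this is its metric, and the data here are synthetic.
```python
import numpy as np

def cepstral_covariance(cepstra):
    """cepstra: (T, D) frame-level cepstral vectors -> (D, D) covariance matrix."""
    return np.cov(cepstra, rowvar=False)

def cov_distance(A, B):
    """Symmetric distance: tr(A B^-1) + tr(B A^-1) - 2D (zero iff A == B)."""
    D = A.shape[0]
    return float(np.trace(A @ np.linalg.inv(B)) + np.trace(B @ np.linalg.inv(A)) - 2 * D)

def identify_handset(sample_cov, handset_covs, threshold=5.0):
    name, dist = min(((n, cov_distance(sample_cov, C)) for n, C in handset_covs.items()),
                     key=lambda t: t[1])
    return name if dist < threshold else None            # unknown handset otherwise

rng = np.random.default_rng(4)
handsets = {"carbon_button": cepstral_covariance(rng.normal(size=(2000, 12))),
            "electret": cepstral_covariance(2.0 * rng.normal(size=(2000, 12)))}
sample = cepstral_covariance(2.0 * rng.normal(size=(500, 12)))
print(identify_handset(sample, handsets))                 # expected: "electret"
```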
  • Publication number: 20040158468
    Abstract: A method, program product, and system for speech recognition, the method comprising in one embodiment pruning a hypothesis based on a first criterion; storing information about the pruned hypothesis; and reactivating the pruned hypothesis if a second criterion is met. In an embodiment, the first criterion may be that another hypothesis has a better score at that time by some predetermined amount. In an embodiment, the stored information may comprise at least one of a score for the pruned hypothesis, an identification of the hypothesis that caused the pruning, and the frame in which the pruning took place. In a further embodiment, the reactivating step may use at least some of the stored information about the pruned hypothesis in performing the reactivation, and the second criterion may be that a revised score for the hypothesis that caused the pruning is worse by some predetermined amount than an original expected score calculated for that hypothesis.
    Type: Application
    Filed: February 12, 2003
    Publication date: August 12, 2004
    Applicant: Aurilab, LLC
    Inventor: James K. Baker
  • Patent number: 6754626
    Abstract: The invention disclosed herein concerns a method of converting speech to text using a hierarchy of contextual models. The hierarchy of contextual models can be statistically smoothed into a language model. The method can include processing text with a plurality of contextual models, each of which can correspond to a node in a hierarchy of the plurality of contextual models. The method also can include identifying at least one of the contextual models relating to the text and processing subsequent user spoken utterances with the at least one identified contextual model.
    Type: Grant
    Filed: March 1, 2001
    Date of Patent: June 22, 2004
    Assignee: International Business Machines Corporation
    Inventor: Mark E. Epstein
  • Patent number: 6754624
    Abstract: A method and apparatus for enhancing coding efficiency by reducing illegal or other undesirable packet generation while encoding a signal. The probability of generating illegal or other undesirable packets while encoding a signal is reduced by first analyzing a history of the frequency of codebook values selected while quantizing speech parameters. Codebook entries are then reordered so that the index/indices that create illegal or other undesirable packets contain the least frequently used entry/entries. Reordering multiple codebooks for various parameters further reduces the probability that an illegal or other undesirable packet will be created during signal encoding. The method and apparatus may be applied to reduce the probability of generating illegal null traffic channel data packets while encoding eighth rate speech.
    Type: Grant
    Filed: February 13, 2001
    Date of Patent: June 22, 2004
    Assignee: Qualcomm, Inc.
    Inventors: Eddie-Lun Tik Choy, Arasanipalai K. Ananthapadmanabhan, Andrew P. DeJaco
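    The abstract above reorders codebook entries so that the indices which would produce illegal packets hold the least frequently selected entries. The sketch below uses invented usage counts and illegal indices to show the reordering step only.
```python
from collections import Counter

# Invented example: an 8-entry codebook, a usage history, and indices whose
# bit patterns would produce illegal/undesirable packets.
codebook = ["e0", "e1", "e2", "e3", "e4", "e5", "e6", "e7"]
usage = Counter({"e0": 500, "e1": 20, "e2": 310, "e3": 3,
                 "e4": 150, "e5": 75, "e6": 990, "e7": 1})
illegal_indices = {0, 7}

def reorder(codebook, usage, illegal_indices):
    """Place the least-used entries at illegal indices, the most-used elsewhere."""
    by_use = sorted(codebook, key=lambda e: usage[e])     # rarest entries first
    rare = by_use[:len(illegal_indices)]
    common = by_use[len(illegal_indices):][::-1]          # most-used entries first
    new = [None] * len(codebook)
    for i in sorted(illegal_indices):
        new[i] = rare.pop(0)                              # rare entry at illegal index
    for i in range(len(codebook)):
        if new[i] is None:
            new[i] = common.pop(0)
    return new

print(reorder(codebook, usage, illegal_indices))
```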
  • Patent number: 6754629
    Abstract: A method and system that combines voice recognition engines and resolves differences between the results of individual voice recognition engines using a mapping function. Speaker independent voice recognition engines and speaker-dependent voice recognition engines are combined. Hidden Markov Model (HMM) engines and Dynamic Time Warping (DTW) engines are combined.
    Type: Grant
    Filed: September 8, 2000
    Date of Patent: June 22, 2004
    Assignee: Qualcomm Incorporated
    Inventors: Yingyong Qi, Ning Bi, Harinath Garudadri
  • Patent number: 6754628
    Abstract: Methods and apparatus for facilitating speaker recognition, wherein, from target data that is provided relating to a target speaker and background data that is provided relating to at least one background speaker, a set of cohort data is selected from the background data that has at least one proximate characteristic with respect to the target data. The target data and the cohort data are then combined in a manner to produce at least one new cohort model for use in subsequent speaker verification. Similar methods and apparatus are contemplated for non-voice-based applications, such as verification through fingerprints.
    Type: Grant
    Filed: June 13, 2000
    Date of Patent: June 22, 2004
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Stephane H. Maes, Jiri Navratil
  • Publication number: 20040107100
    Abstract: A method is provided for real-time speaker change detection and speaker tracking in a speech signal. The method is a “coarse-to-refine” process consisting of two stages: pre-segmentation and refinement. In the pre-segmentation process, the covariance of the feature vectors of each segment of speech is first built. A distance is determined based on the covariances of the current segment and a previous segment, and the distance is used to determine whether there is a potential speaker change between these two segments. If there is no speaker change, the model of the currently identified speaker is updated by incorporating data of the current segment. Otherwise, if there is a speaker change, a refinement process is utilized to confirm the potential speaker change point.
    Type: Application
    Filed: November 29, 2002
    Publication date: June 3, 2004
    Inventors: Lie Lu, Hong-Jiang Zhang
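    The abstract above flags a potential speaker change when a covariance-based distance between the current and previous segments is large, and otherwise folds the current segment into the current speaker model. The sketch below uses a symmetric KL-style divergence between full-covariance Gaussians as a stand-in distance, on synthetic segments; the refinement stage and the patent's actual distance are not reproduced.
```python
import numpy as np

def gaussian_divergence(x, y):
    """Symmetric KL divergence between full-covariance Gaussians fit to x and y."""
    mx, my = x.mean(0), y.mean(0)
    Cx, Cy = np.cov(x, rowvar=False), np.cov(y, rowvar=False)
    iCx, iCy = np.linalg.inv(Cx), np.linalg.inv(Cy)
    dm = mx - my
    return float(0.5 * (np.trace(iCy @ Cx) + np.trace(iCx @ Cy) - 2 * x.shape[1]
                        + dm @ (iCx + iCy) @ dm))

def pre_segment(segments, threshold=10.0):
    """Yield indices where a potential speaker change is detected."""
    current = segments[0]
    for i in range(1, len(segments)):
        if gaussian_divergence(segments[i], current) > threshold:
            yield i                                        # hand off to refinement stage
            current = segments[i]                          # start a new speaker model
        else:
            current = np.vstack([current, segments[i]])    # update current speaker model

rng = np.random.default_rng(5)
segs = [rng.normal(0, 1, (300, 12)) for _ in range(3)] + \
       [rng.normal(1.5, 1, (300, 12)) for _ in range(3)]
print(list(pre_segment(segs)))                             # expected: [3]
```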
  • Patent number: 6741962
    Abstract: A speech recognition system for recognizing an input voice of a narrow frequency band. The speech recognition system includes: a frequency band converting unit for converting the input voice of the narrow frequency band into a pseudo voice of a wide frequency band which covers an entirety of the narrow frequency band and which is wider than the narrow frequency band.
    Type: Grant
    Filed: March 7, 2002
    Date of Patent: May 25, 2004
    Assignee: NEC Corporation
    Inventor: Kenichi Iso
  • Patent number: 6725196
    Abstract: A method and apparatus is provided for matching a first sequence of patterns representative of a first signal with a second sequence of patterns representative of a second signal. The system uses a plurality of different pruning thresholds (th) to control the propagation of paths which represent possible matchings between a sequence of second signal patterns and a sequence of first signal patterns ending at the current first signal pattern. In particular, the pruning threshold used for a given path during the processing of a current first signal pattern depends upon the position, within the sequence of patterns representing the second signal, of the second signal pattern which is at the end of the given path.
    Type: Grant
    Filed: March 20, 2001
    Date of Patent: April 20, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventors: Robert Alexander Keiller, Eli Tzirkel-Hancock, Julian Richard Seward