Correlation Patents (Class 704/237)
  • Patent number: 8175882
    Abstract: A method for task execution improvement includes: generating a baseline model for executing a task; recording a user executing the task; comparing the baseline model to the user's execution of the task; and providing feedback to the user based on the differences between the user's execution and the baseline model.
    Type: Grant
    Filed: January 25, 2008
    Date of Patent: May 8, 2012
    Assignee: International Business Machines Corporation
    Inventors: Sara H. Basson, Dimitri Kanevsky, Edward E. Kelley, Bhuvana Ramabhadran
  • Publication number: 20120095762
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
    Type: Application
    Filed: October 19, 2011
    Publication date: April 19, 2012
    Applicants: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki-wan EOM, Chang-woo HAN, Tae-gyoon KANG, Nam-soo KIM, Doo-hwa HONG, Jae-won LEE, Hyung-joon LIM
  • Patent number: 8140329
    Abstract: A method and apparatus are proposed for automatically recognizing observed audio data. An observation vector is created of audio features extracted from the observed audio data and the observed audio data is recognized from the observation vector. The audio features include features selected from a group of three types of features obtained from the observed audio data: (i) ICA features obtained by processing the observed audio data, (ii) first MFCC features obtained by removing a logarithm step from the conventional MFCC process, or (iii) second MFCC features obtained by applying the ICA process to results of a mel scale filter bank.
    Type: Grant
    Filed: April 5, 2004
    Date of Patent: March 20, 2012
    Assignee: Sony Corporation
    Inventors: Jian Zhang, Wei Lu, Xiaobing Sun
  • Patent number: 8103506
    Abstract: The present disclosure provides a method and system for converting a free text expression of an identity to a phonetic equivalent code. The conversion follows a set of rules based on phonetic groupings and compresses the expression to a shorter series of characters than the expression. The phonetic equivalent code may be compared to one or more other phonetic equivalent codes to establish a correlation between the codes. The phonetic equivalent code of the free text expression may be associated with the code of a known identity. The known identity may be provided to a user for confirmation of the identity. Further, a plurality of expressions stored in a database may be consolidated by converting the expressions to phonetic equivalent codes, comparing the codes to find correlations, and if appropriate reducing the number of expressions or mapping the expressions to a smaller number of expressions.
    Type: Grant
    Filed: September 20, 2007
    Date of Patent: January 24, 2012
    Assignee: United Services Automobile Association
    Inventors: Gregory Brian Meyer, James Elden Nicholson
  • Publication number: 20110288864
    Abstract: A method for detecting speech using a first microphone adapted to produce a first signal (x), and a second microphone adapted to produce a second signal (x2), the method comprising the steps of: (i) applying gain to the second signal to produce a normalised second signal, which signal is normalised relative to the first signal; (ii) constructing one or more signal components from the first signal and the normalised second signal; (iii) constructing an adaptive differential microphone (ADM) having a constructed microphone response constructed from the one or more signal components which response has at least one directional null; (iv) producing one or more ADM outputs (yf, yb) from the constructed microphone response in response to detected sound; (v) computing a ratio of a parameter of either a first signal component or a constructed microphone response to a parameter of an output of the ADM; (vi) comparing the ratio to an adaptive threshold value; (vii) detecting speech if the ratio is greater than or equ
    Type: Application
    Filed: November 19, 2010
    Publication date: November 24, 2011
    Applicant: NXP B.V.
    Inventors: Patrick Kechichian, Cornelis Pieter Janse, Rene Martinus Maria Derkx, Wouter Joos Tirry
  • Patent number: 8060365
    Abstract: A dialog processing system which includes a target expression data extraction unit for extracting a plurality of target expression data each including a pattern matching portion which matches an utterance pattern, the utterance pattern being inputted by an utterance pattern input unit and being an utterance structure derived from contents of field-independent general conversations, from among a plurality of utterance data which are inputted by an utterance data input unit and obtained by converting contents of a plurality of conversations in one field; a feature extraction unit for retrieving the pattern matching portions, respectively, from the plurality of target expression data extracted, and then for extracting feature quantities common to the plurality of pattern matching portions; and a mandatory data extraction unit for extracting mandatory data in the one field included in the plurality of utterance data by use of the feature quantities extracted.
    Type: Grant
    Filed: July 3, 2008
    Date of Patent: November 15, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Nobuyasu Itoh, Shiho Negishi, Hironori Takeuchi
  • Publication number: 20110276323
    Abstract: The illustrative embodiments described herein provide systems and methods for authenticating a speaker. In one embodiment, a method includes receiving reference speech input including a reference passphrase to form a reference recording, and receiving test speech input including a test passphrase to form a test recording. The method includes determining whether the test passphrase matches the reference passphrase, and determining whether one or more voice features of the speaker of the test passphrase match one or more voice features of the speaker of the reference passphrase. The method authenticates the speaker of the test speech input in response to determining that the reference passphrase matches the test passphrase and that one or more voice features of the speaker of the test passphrase match one or more voice features of the speaker of the reference passphrase.
    Type: Application
    Filed: May 6, 2010
    Publication date: November 10, 2011
    Applicant: Senam Consulting, Inc.
    Inventor: Serge Olegovich Seyfetdinov
  • Publication number: 20110231186
    Abstract: A speech detection method is presented, which includes the following steps. A first voice captured device samples a first signal and a second voice captured device samples a second signal. The first voice captured device is closer to a speech signal source than the second voice captured device. A first energy corresponding to the first signal within an interval is calculated, a second energy corresponding to the second signal within the interval is calculated, and a first ratio is calculated according to the first energy and the second energy. The first ratio is transformed into a second ratio. A threshold value is set. It is determined whether the speech signal source is detected by comparing the second ratio and the threshold value.
    Type: Application
    Filed: July 30, 2010
    Publication date: September 22, 2011
    Applicant: ISSC TECHNOLOGIES CORP.
    Inventors: Ying Tsung Lin, Yung Chen Ting, Pansop Kim
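The energy-ratio test in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented method: the abstract does not say how the first ratio is transformed into the second, so a decibel (log) transform is assumed here, and the function names, the 6 dB threshold, and the small `eps` guard are all hypothetical.

```python
import math


def frame_energy(samples):
    """Sum of squared samples over one analysis interval."""
    return sum(s * s for s in samples)


def detect_speech(near, far, threshold_db=6.0, eps=1e-12):
    """Return True when the near/far energy ratio exceeds the threshold.

    near: samples from the device closer to the talker (first signal).
    far:  samples from the more distant device (second signal).
    threshold_db and eps are illustrative values, not taken from the patent.
    """
    # First ratio: near-microphone energy over far-microphone energy.
    first_ratio = (frame_energy(near) + eps) / (frame_energy(far) + eps)
    # Second ratio: ASSUMED transform (decibels); the abstract leaves it unspecified.
    second_ratio = 10.0 * math.log10(first_ratio)
    return second_ratio > threshold_db


# A talker close to the first device produces a much stronger near signal.
near_frame = [0.5, -0.5, 0.5, -0.5]
far_frame = [0.1, -0.1, 0.1, -0.1]
speech_detected = detect_speech(near_frame, far_frame)
# Identical energy on both devices suggests a distant (non-speech) source.
ambient_detected = detect_speech(far_frame, far_frame)
```

A log transform is a common choice because it makes a fixed threshold behave the same at low and high signal levels, but any monotonic mapping would fit the wording of the abstract.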
  • Patent number: 8010356
    Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
    Type: Grant
    Filed: February 17, 2006
    Date of Patent: August 30, 2011
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
  • Patent number: 7974392
    Abstract: A communication device and method are provided for audibly outputting a received text message to a user, the text message being received from a sender. A text message to present audibly is received. An output voice to present the text message is retrieved, wherein the output voice is synthesized using predefined voice characteristic information to represent the sender's voice. The output voice is used to audibly present the text message to the user.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: July 5, 2011
    Assignee: Research In Motion Limited
    Inventor: Eric Ng
  • Patent number: 7974844
    Abstract: A speech recognition apparatus includes a first-candidate selecting unit that selects a recognition result of a first speech from first recognition candidates based on likelihood of the first recognition candidates; a second-candidate selecting unit that extracts recognition candidates of an object word contained in the first speech and recognition candidates of a clue word from second recognition candidates, acquires the relevance ratio associated with the semantic relation between the extracted recognition candidates of the object word and the extracted recognition candidates of the clue word, and selects a recognition result of the second speech based on the acquired relevance ratio; a correction-portion identifying unit that identifies a portion corresponding to the object word in the first speech; and a correcting unit that corrects the word on the identified portion.
    Type: Grant
    Filed: March 1, 2007
    Date of Patent: July 5, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Kazuo Sumita
  • Publication number: 20110153317
    Abstract: An apparatus for wireless communications includes a processing system. The processing system is configured to receive an input sound stream of a user, split the input sound stream into a plurality of frames, classify each of the frames as one selected from the group consisting of a non-speech frame and a speech frame, determine a pitch of each of the frames in a subset of the speech frames, and identify a gender of the user from the determined pitch. To determine the pitch, the processing system is configured to filter the speech frames to compute an error signal, compute an autocorrelation of the error signal, find a maximum autocorrelation value, and set the pitch to an index of the maximum autocorrelation value.
    Type: Application
    Filed: December 23, 2009
    Publication date: June 23, 2011
    Applicant: QUALCOMM INCORPORATED
    Inventors: Yinian Mao, Gene Marsh
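The pitch step described in the abstract above (compute an autocorrelation, find its maximum, take the index as the pitch) can be illustrated as follows. This is a minimal sketch under stated assumptions: it runs directly on the frame and omits the filtering that produces the error signal in the abstract, and the function name, the lag search range, and the 8 kHz sample rate are hypothetical.

```python
import math


def autocorr_pitch_lag(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation of the frame at this lag.
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


SAMPLE_RATE = 8000  # Hz, assumed for the example
# Synthetic 200 Hz tone: one period is 8000 / 200 = 40 samples.
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]

lag = autocorr_pitch_lag(tone, min_lag=20, max_lag=60)
pitch_hz = SAMPLE_RATE / lag
```

Restricting the search to a plausible lag range keeps the zero-lag peak, which is always the global maximum, out of the result.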
  • Patent number: 7917362
    Abstract: A method for determining a bit boundary of a repetition-coded signal including bits each having a plurality of epochs includes (a) counting the epochs repeatedly from an initial number to a predetermined number in a predetermined time, (b) sensing sign changes in the epochs, (c) recording each sensed sign change with a weighting function to a corresponding counting number of the epoch, and (d) determining the bit boundary according to a result of step (c).
    Type: Grant
    Filed: April 19, 2006
    Date of Patent: March 29, 2011
    Assignee: MediaTek Inc.
    Inventor: Jia-Horng Shieh
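Steps (a) to (d) of the abstract above amount to a weighted histogram of sign changes indexed by a repeating epoch counter. The sketch below is an assumption-laden illustration, not the patented implementation: the function name, the uniform default weighting, and the choice of the largest histogram bin as the decision rule are all hypothetical.

```python
def find_bit_boundary(epoch_signs, epochs_per_bit, weight=lambda n: 1.0):
    """Estimate which counter value coincides with the bit boundary.

    epoch_signs: sequence of +1/-1 values, one per epoch.
    epochs_per_bit: number of epochs that make up one repetition-coded bit.
    weight: hypothetical weighting function of the epoch index (uniform here).
    """
    histogram = [0.0] * epochs_per_bit
    for n in range(1, len(epoch_signs)):
        # (b) sense a sign change between consecutive epochs
        if epoch_signs[n] != epoch_signs[n - 1]:
            # (a) + (c) record it, weighted, at the current counting number
            histogram[n % epochs_per_bit] += weight(n)
    # (d) ASSUMED decision rule: the bin with the most weighted sign changes
    return max(range(epochs_per_bit), key=histogram.__getitem__)


# Five bits of four epochs each, preceded by three epochs of an earlier bit,
# so every real bit starts when the counter equals 3.
bits = [1, -1, 1, 1, -1]
stream = [-1] * 3 + [b for b in bits for _ in range(4)]
boundary = find_bit_boundary(stream, epochs_per_bit=4)
```

Because data bits only change sign at bit boundaries, sign changes pile up in a single bin even when some consecutive bits are equal.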
  • Publication number: 20110019805
    Abstract: Methods and systems are provided for searching audio records. Certain embodiments of the invention may be applied to search audio records containing a user's voice for instances where a specific sound, such as a word or phrase, is vocalized by the user. An audio sample is provided by recording the user vocalizing the sound. The audio sample is compared with the audio records to locate matches to the audio sample. In some embodiments, the audio records comprise recordings of calls between a near-end caller and a far-end caller, and the audio sample is a recording of a sound spoken by the near-end caller. The same input device may be used to record both the audio sample and the audio records.
    Type: Application
    Filed: January 14, 2009
    Publication date: January 27, 2011
    Applicant: ALGO COMMUNICATION PRODUCTS LTD.
    Inventor: Paul William Zoehner
  • Publication number: 20100299148
    Abstract: A method for measuring speech intelligibility includes inputting a speech waveform to a system. At least one acoustic feature is extracted from the waveform. From the acoustic feature, at least one phoneme is segmented. At least one acoustic correlate measure is extracted from the at least one phoneme and at least one intelligibility measure is determined. The at least one acoustic correlate measure is mapped to the at least one intelligibility measure.
    Type: Application
    Filed: March 29, 2010
    Publication date: November 25, 2010
    Inventors: Lee Krause, Mark Skowranski, Bonny Banerjee
  • Patent number: 7822614
    Abstract: A language analyzer performs speech recognition on a speech input by a speech input unit, specifies a possible word which is represented by the speech, and the score thereof, and supplies word data representing them to an agent processing unit. The agent processing unit stores process item data which defines a data acquisition process to acquire word data or the like, a discrimination process, and an input/output process, and wires or data defining transition from one process to another and giving a weighting factor to the transition, and executes a flow represented generally by the process item data and the wires to thereby control devices belonging to an input/output target device group. To which process in the flow the transition takes place is determined by the weighting factor of each wire, which is determined by the connection relationship between a point where the process has proceeded and the wire, and the score of word data.
    Type: Grant
    Filed: December 6, 2004
    Date of Patent: October 26, 2010
    Assignee: Kabushikikaisha Kenwood
    Inventor: Rika Koyama
  • Patent number: 7805301
    Abstract: A reliable full covariance matrix estimation algorithm for a pattern unit's state output distribution in a pattern recognition system is discussed. An intermediate hierarchical tree structure is built to relate models for product units. Full covariance matrices of the pattern unit's state output distribution are estimated based on all the related nodes in the tree.
    Type: Grant
    Filed: July 1, 2005
    Date of Patent: September 28, 2010
    Assignee: Microsoft Corporation
    Inventors: Ye Tian, Frank Kao-Ping Soong, Jian-Lai Zhou
  • Patent number: 7804956
    Abstract: The present invention provides a biometrics-based cryptographic key generation system and method. A user-dependent distinguishable feature transform unit provides a feature transformation for each authentic user, which receives N-dimensional biometric features and performs a feature transformation to produce M-dimensional feature signals, such that the transformed feature signals of the authentic user are compact in the transformed feature space while those of other users presumed as imposters are either diverse or far away from those of the authentic user. A stable key generation unit receives the transformed feature signals to produce a cryptographic key based on bit information respectively provided by the M-dimensional feature signals, wherein the length of the bit information provided by the feature signal of each dimension is proportional to the degree of distinguishability in the dimension.
    Type: Grant
    Filed: March 11, 2005
    Date of Patent: September 28, 2010
    Assignee: Industrial Technology Research Institute
    Inventors: Yao-Jen Chang, Tsu-Han Chen, Wen-De Zhang
  • Patent number: 7774337
    Abstract: A method for controlling a relational database system, with a query statement comprised of keywords being analyzed, with the RTN being formed of independent RTN building blocks. Each RTN building block has an inner, directed decision graph which is defined independently from the inner, directed decision graphs of the other RTN building blocks with at least one decision position along at least one decision path. The inner decision graphs of all RTN building blocks are run by means of the keywords in a selection step and all possible paths of this decision graph are followed until either no match with the respectively selected path is determined by the decision graph and the process is interrupted, or the respectively chosen path is run until the end.
    Type: Grant
    Filed: July 10, 2007
    Date of Patent: August 10, 2010
    Assignee: Mediareif Moestl & Reif Kommunikations-und Informationstechnologien OEG
    Inventor: Matthias Moestl
  • Patent number: 7756715
    Abstract: Apparatus, method, and medium for processing an audio signal using a correlation between bands are provided. The apparatus includes an encoding unit encoding an input audio signal and a decoding unit decoding the encoded input audio signal.
    Type: Grant
    Filed: November 17, 2005
    Date of Patent: July 13, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Junghoe Kim, Dohyung Kim, Sihwa Lee
  • Patent number: 7738635
    Abstract: A method for improving the recognition confidence of alphanumeric spoken input, suitable for use in a speech recognition telephony application such as a voice response system. An alphanumeric candidate is determined from the spoken input, which may be the best available representation of the spoken input. Recognition confidence is compared with a preestablished threshold. If the recognition confidence exceeds the threshold, the alphanumeric candidate is selected to represent the spoken input. Otherwise, present call data associated with the spoken input is determined. Call data may include automatic number identification (ANI) information, caller-ID information, and/or dialed number information service (DNIS) information. Information associated with the alphanumeric candidate and information associated with the present call data are correlated in order to select alphanumeric information that best represents the spoken input.
    Type: Grant
    Filed: January 6, 2005
    Date of Patent: June 15, 2010
    Assignees: International Business Machines Corporation, Nuance Communications, Inc.
    Inventors: Christopher Ryan Groves, Kevin James Muterspaugh
  • Patent number: 7680657
    Abstract: Possible segmentations for an audio signal are scored based on distortions for feature vectors of the audio signal and the total number of segments in the segmentation. The scores are used to select a segmentation and the selected segmentation is used to identify a starting point and an ending point for a speech signal in the audio signal.
    Type: Grant
    Filed: August 15, 2006
    Date of Patent: March 16, 2010
    Assignee: Microsoft Corporation
    Inventors: Yu Shi, Frank Kao-ping Soong, Jian-lai Zhou
  • Patent number: 7624020
    Abstract: An adapter for a text to text training. A main corpus is used for training, and a domain specific corpus is used to adapt the main corpus according to the training information in the domain specific corpus. The adaptation is carried out using a technique that may be faster than the main training. The parameter set from the main training is adapted using the domain specific part.
    Type: Grant
    Filed: September 9, 2005
    Date of Patent: November 24, 2009
    Assignee: Language Weaver, Inc.
    Inventors: Kenji Yamada, Kevin Knight, Greg Langmead
  • Patent number: 7610198
    Abstract: A method of searching a signed codebook to quantize a vector includes weighting a shape codevector in a set of shape codevectors with a weighting function for a Weighted Mean Square Error (WMSE) criteria, to produce a weighted shape codevector. The method further includes correlating the weighted shape codevector with the vector to produce a weighted correlation term. The method also includes determining, based on a sign of the weighted correlation term, a preferred one of a positive and a negative signed codevector associated with the shape codevector. The method further includes determining whether one of the signed codevectors does not belong to an illegal space defining illegal vectors.
    Type: Grant
    Filed: June 7, 2002
    Date of Patent: October 27, 2009
    Assignee: Broadcom Corporation
    Inventor: Jes Thyssen
  • Patent number: 7603269
    Abstract: A speech recognition grammar creating apparatus, which is capable of eliminating complex labor associated with preparing all rules by taking into account changes of the order of component elements of a speech-recognizing object and possible combinations of component elements including at least one component element that can be omitted. In the speech recognition grammar creating apparatus, an image edit section groups together at least one component element that cannot be omitted and at least one component element that can be omitted, as the speech-recognizing object, into a component element group as an omission-allowed group. An augmented BNF converting section creates the speech recognition grammar by expanding the component element group obtained by the grouping.
    Type: Grant
    Filed: June 29, 2005
    Date of Patent: October 13, 2009
    Assignee: Canon Kabushiki Kaisha
    Inventors: Kazue Kaneko, Michio Aizawa
  • Patent number: 7596495
    Abstract: A method is provided for recurrently estimating a spectrum of noise at each signal observation interval from a sound signal which contains the noise and which is observed at each signal observation interval. In the method, there are acquired an envelope of a previous spectrum of the noise which has been previously estimated from the sound signal observed at a previous signal observation interval, and an envelope of a current spectrum of the sound signal which is observed at a current signal observation interval subsequent to the previous signal observation interval. Then, a value of correlation is computed between the envelope of the previous spectrum of the noise and the envelope of the current spectrum of the sound signal. A current spectrum of the noise contained in the sound signal observed at the current signal observation interval is estimated in accordance with the computed value of the correlation and based on the previous spectrum of the noise and the current spectrum of the sound signal.
    Type: Grant
    Filed: March 29, 2005
    Date of Patent: September 29, 2009
    Assignee: Yamaha Corporation
    Inventors: Michiko Kazama, Mikio Tohyama, Toru Hirai
  • Patent number: 7496513
    Abstract: Input is received from at least two different input sources. Information from these sources is combined to provide a result. In a particular example, input from one source corresponds to potential recognition candidates, and input from another source corresponds to other potential candidates. These candidates are combined to select a result.
    Type: Grant
    Filed: June 28, 2005
    Date of Patent: February 24, 2009
    Assignee: Microsoft Corporation
    Inventors: Frank Kao-Ping Soong, Jian-Lai Zhou, Ye Tian
  • Patent number: 7412384
    Abstract: A digital signal processing method and learning method and devices therefor, and a program storage medium which are capable of further improving the waveform reproducibility of a digital signal. Self correlation coefficients are calculated by cutting parts out of the digital signal by multiple windows having different sizes, and the parts are classified based on the calculation results of the self correlation coefficients. Then, the digital signal is converted by the prediction method corresponding to the classified class, so that the conversion further suitable for the features of the digital signal can be conducted.
    Type: Grant
    Filed: July 31, 2001
    Date of Patent: August 12, 2008
    Assignee: Sony Corporation
    Inventors: Tetsujiro Kondo, Tsutomu Watanabe
  • Patent number: 7337109
    Abstract: A multiple-step adaptive method for time scaling synthesizes an S3[n] signal from an S1[n] signal and an S2[n] signal. The method comprises the following steps: (a) calculating a first magnitude of a cross-correlation function of the S1[n] signal and the S2[n] signal according to a first index; (b) comparing the first magnitude with a threshold value; (c) if the first magnitude is smaller than the threshold value, calculating a first reference magnitude of the cross-correlation function of the S1[n] signal and the S2[n] signal according to a first reference index behind the first index by a first determined number, or calculating a second reference magnitude of the cross-correlation function of the S1[n] signal and the S2[n] signal according to a second reference index behind the first index by a second number; (d) synthesizing the S3[n] signal by adding the S1[n] signal to the S2[n] signal in accordance with a maximum index corresponding to a largest magnitude among all the magnitudes calculated in (c).
    Type: Grant
    Filed: October 2, 2003
    Date of Patent: February 26, 2008
    Assignee: ALI Corporation
    Inventor: Gin-Der Wu
  • Patent number: 7284255
    Abstract: A system and method are disclosed for performing audience surveys of broadcast audio from radio and television. A small body-worn portable collection unit samples the audio environment of the survey member and stores highly compressed features of the audio programming. A central computer simultaneously collects the audio outputs from a number of radio and television receivers representing the possible selections that a survey member may choose. On a regular schedule the central computer interrogates the portable units used in the survey and transfers the captured audio feature samples. The central computer then applies a feature pattern recognition technique to identify which radio or television station the survey member was listening to at various times of day. This information is then used to estimate the popularity of the various broadcast stations.
    Type: Grant
    Filed: November 16, 1999
    Date of Patent: October 16, 2007
    Inventors: Steven G. Apel, Stephen C. Kenyon
  • Patent number: 7212968
    Abstract: A dynamic programming technique is provided for matching two sequences of phonemes both of which may be generated from text or speech. The scoring of the dynamic programming matching technique uses phoneme confusion scores, phoneme insertion scores and phoneme deletion scores which are obtained in advance in a training session and, if appropriate, confidence data generated by a recognition system if the sequences are generated from speech.
    Type: Grant
    Filed: October 25, 2000
    Date of Patent: May 1, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
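The dynamic programming match described in the abstract above can be illustrated with a standard alignment recurrence. This is a sketch under stated assumptions rather than the patented scoring scheme: the abstract uses confusion, insertion and deletion scores learned in a training session, whereas the costs below are made-up constants, and `align_cost` and `toy_confusion` are hypothetical names. Confidence data from a recognizer is not modelled.

```python
def align_cost(seq_a, seq_b, sub_cost, ins_cost=1.0, del_cost=1.0):
    """Minimum total cost of aligning two phoneme sequences.

    sub_cost(p, q) plays the role of a phoneme confusion score;
    ins_cost and del_cost stand in for insertion and deletion scores.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    # Aligning against an empty sequence costs only deletions or insertions.
    for i in range(1, rows):
        dp[i][0] = dp[i - 1][0] + del_cost
    for j in range(1, cols):
        dp[0][j] = dp[0][j - 1] + ins_cost
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = min(
                dp[i - 1][j - 1] + sub_cost(seq_a[i - 1], seq_b[j - 1]),
                dp[i - 1][j] + del_cost,   # phoneme of seq_a left unmatched
                dp[i][j - 1] + ins_cost,   # phoneme of seq_b left unmatched
            )
    return dp[-1][-1]


def toy_confusion(p, q):
    """Illustrative confusion cost: /t/ and /d/ are treated as easily confused."""
    if p == q:
        return 0.0
    return 0.5 if {p, q} == {"t", "d"} else 1.0


confusable = align_cost(["k", "ae", "t"], ["k", "ae", "d"], toy_confusion)
truncated = align_cost(["k", "ae", "t"], ["k", "ae"], toy_confusion)
```

With these toy costs, a /t/ for /d/ confusion is cheaper than dropping a phoneme, which is the kind of preference the trained scores would encode.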
  • Patent number: 7139705
    Abstract: A method of determining the time relation between an original or input speech signal (10) and an output speech signal (15) affected by time warping in a communications system, such as a VoIP (Voice over Internet Protocol) system. Corresponding speech bursts (11, 12; 16, 17) of the input (10) and output speech signal (15) are located in accordance with a predefined signal property thereof. The corresponding speech bursts (11, 12; 16, 17) thus located are time aligned (10, 30) for the correction of continuous and discontinuous warping effects. A performance estimate is generated by comparing the time aligned input and output speech signals (10, 30) applying cross-correlation techniques and PSQM (Perceptual Speech Quality Measure) or PSQM+ (Enhanced Perceptual Speech Quality Measure) techniques.
    Type: Grant
    Filed: November 13, 2000
    Date of Patent: November 21, 2006
    Assignee: Koninklijke KPN N.V.
    Inventors: John Gerard Beerends, Andries Pieter Hekstra
  • Patent number: 7130292
    Abstract: A method and apparatus for enhancing the receiving and information identification functions of multiple access communications systems by employing one or more optical processors configured as a bank of 1-D correlators. The present invention is particularly useful in a DS/SS CDMA communications system, resulting in a multiuser CDMA system that approaches carrier to noise performance (C/N) as opposed to being limited by multiple access interference (MAI). The correlators are arranged in parallel to detect and/or demodulate the received signal, in conjunction with one or more complex algorithms to perform near-optimum multiuser detection, perform multipath combining and/or perform carrier Doppler compensation.
    Type: Grant
    Filed: January 19, 2001
    Date of Patent: October 31, 2006
    Assignee: Essex Corporation
    Inventors: Terry M. Turpin, James L. Lafuse
  • Patent number: 6996291
    Abstract: After one or both of a pair of images are obtained, an auto-correlation function for one of those images is generated to determine a smear amount and possibly a smear direction. The smear amount and direction are used to identify potential locations of a peak portion of the correlation function between the pair of images. The pair of images is then correlated only at offset positions corresponding to the one or more of the potential peak locations. In some embodiments, the pair of images is correlated according to a sparse set of image correlation function value points around the potential peak locations. In other embodiments, the pair of images is correlated at a dense set of correlation function value points around the potential peak locations. The correlation function values of these correlation function value points are then analyzed to determine the offset position of the true correlation function peak.
    Type: Grant
    Filed: August 6, 2001
    Date of Patent: February 7, 2006
    Assignee: Mitutoyo Corporation
    Inventor: Michael Nahum
  • Patent number: 6965631
    Abstract: One embodiment of the present invention includes a circular shift register, K storage elements, and a code register. The circular shift register having N data samples circularly shifts a first data sample of the N data samples into a data position at a first clock frequency. The N data samples correspond to a signal received from one of K satellites in a global positioning system (GPS). The N data samples are loaded into the circular shift register at a second clock frequency. The K storage elements store K code sequences, respectively. Each of the K code sequences has N code samples and includes a first code sample being written at a code position corresponding to the data position at a third clock frequency. The K storage elements correspond to the K satellites. The code register stores the N code samples loaded from one of the K storage elements at a fourth clock frequency. The fourth clock frequency is K times faster than the first clock frequency.
    Type: Grant
    Filed: March 13, 2001
    Date of Patent: November 15, 2005
    Assignee: PRI Research & Development Corp.
    Inventors: Kaveh Shakeri, Alireza Mehrnia, Farshid Soheili-Najafabadi
  • Patent number: 6823305
    Abstract: Speaker normalization is carried out based on biometric information available about a speaker, such as his height, or a dimension of a bodily member or article of clothing. The chosen biometric parameter correlates with the vocal tract length. Speech can be normalized based on the biometric parameter, which thus indirectly normalizes the speech based on the vocal tract length of the speaker. The inventive normalization can be used in model formation, or in actual speech recognition usage, or both. Substantial improvements in accuracy have been noted at little cost. The preferred biometric parameter is height, and the preferred form of scaling is linear scaling with the scale factor proportional to the height of the speaker.
    Type: Grant
    Filed: December 21, 2000
    Date of Patent: November 23, 2004
    Assignee: International Business Machines Corporation
    Inventor: Ellen M. Eide
  • Patent number: 6687672
    Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.
    Type: Grant
    Filed: March 15, 2002
    Date of Patent: February 3, 2004
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
  • Publication number: 20030225581
    Abstract: Speech recognition is performed by matching between a characteristic quantity of an inputted speech and a composite HMM obtained by synthesizing a speech HMM (hidden Markov model) and a noise HMM for each speech frame of the inputted speech by use of the composite HMM.
    Type: Application
    Filed: March 14, 2003
    Publication date: December 4, 2003
    Applicant: International Business Machines Corporation
    Inventors: Tetsuya Takiguchi, Masafumi Nishimura
  • Publication number: 20030182117
    Abstract: A question as to the physical condition and a question as to the feeling are asked, answers are accepted by voice, acoustic features are extracted from the answer to the question as to the physical condition, and character string information is extracted from the answer to the question as to the feeling. A correlation between the acoustic features and the character string information is set, and character string information is identified from a newly accepted acoustic feature, thereby performing feeling presumption. Feeling is presumed from the voice in response to the subject's severalty and changes in the subject's age and physical condition. A medical examination by interview on the mental condition is performed on the subject by voice output and voice input, and the subject's mental condition is diagnosed based on the contents of the subject's answer and analysis of the answer by voice.
    Type: Application
    Filed: January 31, 2003
    Publication date: September 25, 2003
    Applicant: SANYO ELECTRIC CO., LTD.
    Inventors: Rie Monchi, Masakazu Asano, Hirokazu Genno
  • Publication number: 20030093273
    Abstract: There are provided: a data generating section 3 which differentiates an input aural signal, detects as a sample point a point where a differentiating value satisfies a predetermined condition, and obtains discrete amplitude data on detected sample points and timing data indicative of a time interval between the sample points, and a correlating section 4 for computing correlation data by using the amplitude data and the timing data. Input speech is recognized by matching correlation data, which is generated for input speech by a correlating section 4, with correlation data which is generated in the same manner in advance for a variety of speech and is stored in a data memory 6.
    Type: Application
    Filed: October 3, 2002
    Publication date: May 15, 2003
    Inventor: Yukio Koyanagi
  • Patent number: 6560575
    Abstract: An apparatus is provided for checking the consistency between two training words which can be used in, for example, a speech recognition or verification system. Two training examples are aligned using a dynamic programming alignment process and an average frame score is calculated from the alignment results together with the worst score in a number of consecutive frames. These values are then compared with similar values obtained from training examples which are known to be consistent to determine if the training examples are consistent.
    Type: Grant
    Filed: September 30, 1999
    Date of Patent: May 6, 2003
    Assignee: Canon Kabushiki Kaisha
    Inventor: Robert Alexander Keiller
  • Publication number: 20030009331
    Abstract: Pre-computed context-dependent phoneme representations of a number of constituents of a grammar are processed dynamically by a speech recognizer. The approach provides a configurable tradeoff between data size and recognition-time computation. This tradeoff can be obtained without sacrificing recognition accuracy, and in particular, allows full modeling of all cross-word phoneme contexts. In one aspect of the invention, a specification of a grammar is processed. This specification includes specifications of a number of constituents of the grammar. A first subset of the constituents of the grammar is selected, and the remaining constituents form a second subset. For each of the constituents in the first subset the method first includes processing the specification of the constituent to form a first processed representation that defines sequences of elements that are associated with that constituent and that includes words and references to constituents in the first subset.
    Type: Application
    Filed: July 16, 2001
    Publication date: January 9, 2003
    Inventors: Johan Schalkwyk, Michael S. Phillips
  • Publication number: 20020184018
    Abstract: To propose a digital signal processing method and learning method and devices therefor, and a program storage medium which are capable of further improving the waveform reproducibility of a digital signal. Self correlation coefficients D40 and D41 are calculated respectively by cutting parts out of the digital signal D10 by multiple windows having different sizes, and the parts are classified based on the calculation results D15 of the self correlation coefficients D40 and D41 and then, the digital signal D10 is converted by the prediction method corresponding to the classified class, so that the conversion further suitable for the features of the digital signal D10 can be conducted.
    Type: Application
    Filed: March 29, 2002
    Publication date: December 5, 2002
    Inventor: Tetsujiro Kondo
  • Patent number: 6314392
    Abstract: In a computerized method a continuous signal is segmented in order to determine statistically stationary units of the signal. The continuous signal is sampled at periodic intervals to produce a timed sequence of digital samples. Fixed numbers of adjacent digital samples are grouped into a plurality of disjoint sets or frames. A statistical distance between adjacent frames is determined. The adjacent sets are merged into a larger set of samples or cluster if the statistical distance is less than a predetermined threshold. In an iterative process, the statistical distance between the adjacent sets is determined, and as long as the distance is less than the predetermined threshold, the sets are iteratively merged to segment the signal into statistically stationary units.
    Type: Grant
    Filed: September 20, 1996
    Date of Patent: November 6, 2001
    Assignee: Digital Equipment Corporation
    Inventors: Brian S. Eberman, William D. Goldenthal
  • Patent number: 6275799
    Abstract: A first parameter set constituting reference patterns of each category in speech recognition based on pattern matching with a reference pattern is to be determined from a plurality of learning utterance data. The first parameter set is determined so that a third evaluation function, represented by a sum of a first evaluation function and a second evaluation function, is maximized. The first evaluation function represents a matching degree between all learning utterances and corresponding reference patterns. The second evaluation function represents a matching degree between elements of the first parameter set.
    Type: Grant
    Filed: February 2, 1995
    Date of Patent: August 14, 2001
    Assignee: NEC Corporation
    Inventor: Ken-ichi Iso
  • Patent number: 6253175
    Abstract: Systems and methods for processing acoustic speech signals which utilize the wavelet transform (and alternatively, the Fourier transform) as a fundamental tool. The method essentially involves “synchrosqueezing” spectral component data obtained by performing a wavelet transform (or Fourier transform) on digitized speech signals. In one aspect, spectral components of the synchrosqueezed plane are dynamically tracked via a K-means clustering algorithm. The amplitude, frequency and bandwidth of each of the components are, thus, extracted. The cepstrum generated from this information is referred to as “K-mean Wastrum.” In another aspect, the result of the K-mean clustering process is further processed to limit the set of primary components to formants. The resulting features are referred to as “formant-based wastrum.” Formants are interpolated in unvoiced regions and the contribution of the unvoiced turbulent part of the spectrum is added.
    Type: Grant
    Filed: November 30, 1998
    Date of Patent: June 26, 2001
    Assignee: International Business Machines Corporation
    Inventors: Sankar Basu, Stephane H. Maes
  • Patent number: 6201960
    Abstract: An improved method and system of measuring the perceived speech quality in mobile telecommunications networks is disclosed herein. In an embodiment of the invention, the method uses both radio link parameters and an objective measuring technique performed on received signals to estimate the speech quality perceived by the end-user. A radio link processing stage extracts temporal information from a set of available radio link parameters such as the BER, FER, RxLev, handover statistics, soft information, and speech energy. Concurrently, a speech processing stage is used to process a sequence of original signals and received signals, obtained from the output of a telecommunications system. The signal sequences are processed by an objective measuring technique such as Perceptual Speech Quality Measure (PSQM). The outputs from the radio link processing and speech processing stages are utilized to calculate an estimate for speech quality.
    Type: Grant
    Filed: June 24, 1997
    Date of Patent: March 13, 2001
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Tor Björn Minde, Anders Tomas Uvliden, Per Anders Karlsson, Per Gunnar Heikkilä
  • Patent number: 6199041
    Abstract: A method and system for transforming a sampling rate in speech recognition systems, in accordance with the present invention, includes the steps of providing cepstral based data including utterances comprised of segments at a reference frequency, the segments being represented by cepstral vector coefficients, converting the cepstral vector coefficients to energy bands in logarithmic spectra, filtering the energy bands of the logarithmic spectra to remove energy bands having a frequency above a predetermined portion of a target frequency and converting the filtered logarithmic spectra to modified cepstral vector coefficients at the target frequency. Another method and system convert system prototypes for speech recognition systems from a reference frequency to a target frequency.
    Type: Grant
    Filed: November 20, 1998
    Date of Patent: March 6, 2001
    Assignee: International Business Machines Corporation
    Inventors: Fu-Hua Liu, Michael A. Picheny
  • Patent number: 6157830
    Abstract: A method and system for measuring the speech quality in a mobile cellular telecommunications network using available radio link parameters is disclosed herein. In a preferred embodiment, the method includes receiving a set of radio link parameters, as defined in a standard or otherwise available, such as the BER, FER, RxLev, handover statistics, soft information, and speech energy. Temporal information is obtained from the radio link parameters to create a set of temporal parameters which can be statistically analyzed, for example, for the maximum and minimum, mean, standard deviation, and autocorrelation values for a time interval. The temporal parameters are combined to yield a set of correlated parameters that are more closely related to the speech quality. An estimator then uses the correlated parameters to calculate an estimate for the speech quality. The method of the present invention takes advantage of temporal information and correlated relationships from the transmitted parameters.
    Type: Grant
    Filed: May 22, 1997
    Date of Patent: December 5, 2000
    Assignee: Telefonaktiebolaget LM Ericsson
    Inventors: Tor Bjorn Minde, Anders Tomas Uvliden, Per Anders Karlsson, Per Gunnar Heikkilä
  • Patent number: 5787395
    Abstract: A voice recognizing method in which a plurality of voice recognition objective words are provided. Scores are accumulated for an unknown input voice signal as compared to the voice recognition objective words by using parameters which are calculated in advance. Upon receipt of an unknown voice signal, a corresponding voice recognition objective word is extracted and recognized. The voice recognition objective words are structured into an overlapping hierarchical structure by using correlation values between each pair of voice recognition objective words. This correlation may be computed from acoustic features, HMM parameters or the like. Score calculation is performed on the unknown input voice signal by using a dictionary of the voice recognition objective words structured in the hierarchical structure. Upon preliminary recognition, the dictionary of the voice recognition objective words is re-sorted without recalculation of the correlation values.
    Type: Grant
    Filed: July 18, 1996
    Date of Patent: July 28, 1998
    Assignee: Sony Corporation
    Inventor: Katsuki Minamino
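
The abstract of patent 5787395 above describes arranging the recognition vocabulary into an overlapping hierarchy from pairwise correlation values, then scoring an unknown input against that hierarchy instead of against every word. The following Python sketch illustrates that general idea only. The two-level layout, the greedy choice of representatives, the `threshold` and `top_k` parameters, and the function names `build_hierarchy` and `recognize` are all assumptions made for illustration; the abstract does not specify them, and this is not the patented procedure.

```python
from typing import Callable, Dict, List, Sequence


def build_hierarchy(
    corr: Sequence[Sequence[float]], threshold: float
) -> Dict[int, List[int]]:
    """Build a two-level overlapping hierarchy from a pairwise correlation matrix.

    corr[i][j] is assumed to be a symmetric similarity between words i and j
    (higher means more alike). Returns {representative: [children]}.
    """
    n = len(corr)
    reps: List[int] = []
    for w in range(n):
        # Assumption: a word becomes a representative only if it is not
        # strongly correlated with any representative chosen so far.
        if all(corr[w][r] < threshold for r in reps):
            reps.append(w)

    hierarchy: Dict[int, List[int]] = {r: [] for r in reps}
    for w in range(n):
        if w in hierarchy:
            continue
        # Overlap: a word is attached to every representative it correlates with.
        for r in reps:
            if corr[w][r] >= threshold:
                hierarchy[r].append(w)
    return hierarchy


def recognize(
    hierarchy: Dict[int, List[int]],
    score: Callable[[int], float],
    top_k: int = 1,
) -> int:
    """Score representatives first, then only the children of the best ones."""
    best_reps = sorted(hierarchy, key=score, reverse=True)[:top_k]
    candidates = set(best_reps)
    for r in best_reps:
        candidates.update(hierarchy[r])
    return max(candidates, key=score)


if __name__ == "__main__":
    # Hypothetical correlation values for a five-word vocabulary.
    corr = [
        [1.0, 0.8, 0.1, 0.7, 0.2],
        [0.8, 1.0, 0.2, 0.3, 0.1],
        [0.1, 0.2, 1.0, 0.9, 0.6],
        [0.7, 0.3, 0.9, 1.0, 0.1],
        [0.2, 0.1, 0.6, 0.1, 1.0],
    ]
    h = build_hierarchy(corr, threshold=0.5)
    print(h)
    # Hypothetical per-word scores for one unknown input.
    scores = [0.2, 0.1, 0.6, 0.9, 0.3]
    print(recognize(h, lambda w: scores[w]))
```

In this toy vocabulary, word 3 correlates with both representatives and so appears under each of them, which is the sense in which the structure overlaps. Only the representatives and the children of the best-scoring representative are scored, so the number of score evaluations can be smaller than the vocabulary size.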