Correlation Patents (Class 704/237)

Method and system for accent correction

Patent number: 8175882

Abstract: A method for task execution improvement includes: generating a baseline model for executing a task; recording a user executing the task; comparing the baseline model to the user's execution of the task; and providing feedback to the user based on the differences between the user's execution and the baseline model.

Type: Grant

Filed: January 25, 2008

Date of Patent: May 8, 2012

Assignee: International Business Machines Corporation

Inventors: Sara H. Basson, Dimitri Kanevsky, Edward E. Kelley, Bhuvana Ramabhadran
FRONT-END PROCESSOR FOR SPEECH RECOGNITION, AND SPEECH RECOGNIZING APPARATUS AND METHOD USING THE SAME

Publication number: 20120095762

Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.

Type: Application

Filed: October 19, 2011

Publication date: April 19, 2012

Applicants: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SAMSUNG ELECTRONICS CO., LTD.

Inventors: Ki-wan EOM, Chang-woo HAN, Tae-gyoon KANG, Nam-soo KIM, Doo-hwa HONG, Jae-won LEE, Hyung-joon LIM
Method and apparatus for automatically recognizing audio data

Patent number: 8140329

Abstract: A method and apparatus are proposed for automatically recognizing observed audio data. An observation vector is created from audio features extracted from the observed audio data, and the observed audio data is recognized from the observation vector. The audio features are selected from a group of three types of features obtained from the observed audio data: (i) ICA features obtained by processing the observed audio data, (ii) first MFCC features obtained by removing a logarithm step from the conventional MFCC process, or (iii) second MFCC features obtained by applying the ICA process to results of a mel scale filter bank.

Type: Grant

Filed: April 5, 2004

Date of Patent: March 20, 2012

Assignee: Sony Corporation

Inventors: Jian Zhang, Wei Lu, Xiaobing Sun
Free text matching system and method

Patent number: 8103506

Abstract: The present disclosure provides a method and system for converting a free text expression of an identity to a phonetic equivalent code. The conversion follows a set of rules based on phonetic groupings and compresses the expression to a shorter series of characters than the expression. The phonetic equivalent code may be compared to one or more other phonetic equivalent codes to establish a correlation between the codes. The phonetic equivalent code of the free text expression may be associated with the code of a known identity. The known identity may be provided to a user for confirmation of the identity. Further, a plurality of expressions stored in a database may be consolidated by converting the expressions to phonetic equivalent codes, comparing the codes to find correlations, and, if appropriate, reducing the number of expressions or mapping the expressions to a smaller number of expressions.

Type: Grant

Filed: September 20, 2007

Date of Patent: January 24, 2012

Assignee: United Services Automobile Association

Inventors: Gregory Brian Meyer, James Elden Nicholson
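
The conversion described in the abstract of patent 8103506 above (rules based on phonetic groupings that compress an expression into a shorter code, followed by code comparison) can be illustrated with a minimal sketch. The grouping table below is a Soundex-like stand-in chosen for illustration; the abstract does not give the patent's actual rule set, and the names `PHONETIC_GROUPS`, `phonetic_code`, and `same_identity` are hypothetical.

```python
# Hypothetical phonetic groups (Soundex-like); NOT the rule set of the patent,
# which the abstract does not specify.
PHONETIC_GROUPS = {
    "1": "BFPV",
    "2": "CGJKQSXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}
LETTER_TO_GROUP = {
    letter: code for code, letters in PHONETIC_GROUPS.items() for letter in letters
}


def phonetic_code(text: str) -> str:
    """Compress free text into a shorter string of phonetic-group digits."""
    code = []
    previous = None
    for ch in text.upper():
        # Vowels, H, W, Y, digits, and punctuation have no group and are dropped.
        group = LETTER_TO_GROUP.get(ch)
        # Adjacent letters from the same group collapse into one digit.
        if group is not None and group != previous:
            code.append(group)
        previous = group
    return "".join(code)


def same_identity(a: str, b: str) -> bool:
    """Treat two expressions as correlated when their codes are equal."""
    return phonetic_code(a) == phonetic_code(b)
```

With this table, spelling variants such as "Meyer" and "Meier" map to the same code, which is the kind of correlation the abstract relies on when consolidating database entries.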
SPEECH DETECTOR

Publication number: 20110288864

Abstract: A method for detecting speech using a first microphone adapted to produce a first signal (x), and a second microphone adapted to produce a second signal (x2), the method comprising the steps of: (i) applying gain to the second signal to produce a normalised second signal, which signal is normalised relative to the first signal; (ii) constructing one or more signal components from the first signal and the normalised second signal; (iii) constructing an adaptive differential microphone (ADM) having a constructed microphone response constructed from the one or more signal components which response has at least one directional null; (iv) producing one or more ADM outputs (yf, yb) from the constructed microphone response in response to detected sound; (v) computing a ratio of a parameter of either a first signal component or a constructed microphone response to a parameter of an output of the ADM; (vi) comparing the ratio to an adaptive threshold value; (vii) detecting speech if the ratio is greater than or equ

Type: Application

Filed: November 19, 2010

Publication date: November 24, 2011

Applicant: NXP B.V.

Inventors: Patrick Kechichian, Cornelis Pieter Janse, Rene Martinus Maria Derkx, Wouter Joos Tirry
Dialog processing system, dialog processing method and computer program

Patent number: 8060365

Abstract: A dialog processing system which includes: a target expression data extraction unit for extracting, from among a plurality of utterance data, a plurality of target expression data each including a pattern matching portion which matches an utterance pattern, where the utterance pattern is inputted by an utterance pattern input unit and is an utterance structure derived from contents of field-independent general conversations, and the utterance data are inputted by an utterance data input unit and obtained by converting contents of a plurality of conversations in one field; a feature extraction unit for retrieving the pattern matching portions, respectively, from the plurality of target expression data extracted, and then for extracting feature quantities common to the plurality of pattern matching portions; and a mandatory data extraction unit for extracting mandatory data in the one field included in the plurality of utterance data by use of the feature quantities extracted.

Type: Grant

Filed: July 3, 2008

Date of Patent: November 15, 2011

Assignee: Nuance Communications, Inc.

Inventors: Nobuyasu Itoh, Shiho Negishi, Hironori Takeuchi
SPEECH-BASED SPEAKER RECOGNITION SYSTEMS AND METHODS

Publication number: 20110276323

Abstract: The illustrative embodiments described herein provide systems and methods for authenticating a speaker. In one embodiment, a method includes receiving reference speech input including a reference passphrase to form a reference recording, and receiving test speech input including a test passphrase to form a test recording. The method includes determining whether the test passphrase matches the reference passphrase, and determining whether one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase. The method authenticates the speaker of the test speech input in response to determining that the reference passphrase matches the test passphrase and that one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase.

Type: Application

Filed: May 6, 2010

Publication date: November 10, 2011

Applicant: Senam Consulting, Inc.

Inventor: Serge Olegovich Seyfetdinov
SPEECH DETECTION METHOD

Publication number: 20110231186

Abstract: A speech detection method is presented, which includes the following steps. A first voice captured device samples a first signal and a second voice captured device samples a second signal. The first voice captured device is closer to a speech signal source than the second voice captured device. A first energy corresponding to the first signal within an interval is calculated, a second energy corresponding to the second signal within the interval is calculated, and a first ratio is calculated according to the first energy and the second energy. The first ratio is transformed into a second ratio. A threshold value is set. It is determined whether the speech signal source is detected by comparing the second ratio and the threshold value.

Type: Application

Filed: July 30, 2010

Publication date: September 22, 2011

Applicant: ISSC TECHNOLOGIES CORP.

Inventors: Ying Tsung Lin, Yung Chen Ting, Pansop Kim
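
The steps in the abstract of publication 20110231186 above (two energies over one interval, a first ratio, a transformed second ratio, and a threshold comparison) can be sketched as follows. The abstract does not say how the first ratio is transformed or which way the comparison runs; this sketch assumes a decibel transform and assumes speech is declared when the near microphone is sufficiently louder. The function names and the 6 dB default are illustrative only.

```python
import math


def frame_energy(samples):
    """Sum of squared samples over the analysis interval."""
    return sum(s * s for s in samples)


def speech_detected(near, far, threshold_db=6.0, eps=1e-12):
    """Compare the near/far energy ratio (in dB) with a threshold.

    `near` comes from the capture device closer to the talker, `far` from
    the other one. The dB transform and threshold are assumptions.
    """
    first_energy = frame_energy(near)
    second_energy = frame_energy(far)
    # eps guards against division by zero on silent frames.
    first_ratio = (first_energy + eps) / (second_energy + eps)
    second_ratio = 10.0 * math.log10(first_ratio)  # assumed transform
    return second_ratio > threshold_db
```

A talker close to the first device produces a much larger energy there than at the second device, while distant noise reaches both at similar levels, so the ratio stays near 0 dB for noise alone.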
Parameter learning in a hidden trajectory model

Patent number: 8010356

Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.

Type: Grant

Filed: February 17, 2006

Date of Patent: August 30, 2011

Assignee: Microsoft Corporation

Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
System and method for personalized text-to-voice synthesis

Patent number: 7974392

Abstract: A communication device and method are provided for audibly outputting a received text message to a user, the text message being received from a sender. A text message to present audibly is received. An output voice to present the text message is retrieved, wherein the output voice is synthesized using predefined voice characteristic information to represent the sender's voice. The output voice is used to audibly present the text message to the user.

Type: Grant

Filed: March 2, 2010

Date of Patent: July 5, 2011

Assignee: Research In Motion Limited

Inventor: Eric Ng
Apparatus, method and computer program product for recognizing speech

Patent number: 7974844

Abstract: A speech recognition apparatus includes a first-candidate selecting unit that selects a recognition result of a first speech from first recognition candidates based on likelihood of the first recognition candidates; a second-candidate selecting unit that extracts recognition candidates of an object word contained in the first speech and recognition candidates of a clue word from second recognition candidates, acquires the relevance ratio associated with the semantic relation between the extracted recognition candidates of the object word and the extracted recognition candidates of the clue word, and selects a recognition result of the second speech based on the acquired relevance ratio; a correction-portion identifying unit that identifies a portion corresponding to the object word in the first speech; and a correcting unit that corrects the word in the identified portion.

Type: Grant

Filed: March 1, 2007

Date of Patent: July 5, 2011

Assignee: Kabushiki Kaisha Toshiba

Inventor: Kazuo Sumita
GENDER DETECTION IN MOBILE PHONES

Publication number: 20110153317

Abstract: An apparatus for wireless communications includes a processing system. The processing system is configured to receive an input sound stream of a user, split the input sound stream into a plurality of frames, classify each of the frames as one selected from the group consisting of a non-speech frame and a speech frame, determine a pitch of each of the frames in a subset of the speech frames, and identify a gender of the user from the determined pitch. To determine the pitch, the processing system is configured to filter the speech frames to compute an error signal, compute an autocorrelation of the error signal, find a maximum autocorrelation value, and set the pitch to an index of the maximum autocorrelation value.

Type: Application

Filed: December 23, 2009

Publication date: June 23, 2011

Applicant: QUALCOMM INCORPORATED

Inventors: Yinian Mao, Gene Marsh
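
The pitch step in the abstract of publication 20110153317 above (compute an autocorrelation, find its maximum, and take the index of that maximum as the pitch) can be sketched as below. For brevity the sketch autocorrelates the frame directly and omits the filtering step that produces the error signal in the abstract. The lag search range, the 165 Hz decision threshold, and all function names are assumptions made for illustration.

```python
def autocorrelation(x, lag):
    """Unnormalized autocorrelation of x at a non-negative lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def pitch_lag(frame, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(frame, lag))


def guess_gender(lag, sample_rate, threshold_hz=165.0):
    """Map a pitch lag to a label using an assumed pitch threshold."""
    pitch_hz = sample_rate / lag
    return "male" if pitch_hz < threshold_hz else "female"


# Usage: a sawtooth with a 50-sample period stands in for a voiced frame.
frame = [(n % 50) - 24.5 for n in range(400)]
lag = pitch_lag(frame, min_lag=20, max_lag=80)
label = guess_gender(lag, sample_rate=8000)
```

Restricting the search to a plausible lag range keeps the trivial maximum at lag 0 and multiples of the true period out of the result.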
Method and apparatus for determining a bit boundary of a repetition-coded signal

Patent number: 7917362

Abstract: A method for determining a bit boundary of a repetition-coded signal including bits each having a plurality of epochs includes (a) counting the epochs repeatedly from an initial number to a predetermined number in a predetermined time, (b) sensing sign changes in the epochs, (c) recording each sensed sign change with a weighting function to a corresponding counting number of the epoch, and (d) determining the bit boundary according to a result of step (c).

Type: Grant

Filed: April 19, 2006

Date of Patent: March 29, 2011

Assignee: MediaTek Inc.

Inventor: Jia-Horng Shieh
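
Steps (a) to (d) in the abstract of patent 7917362 above can be read as building a histogram of sign changes indexed by a repeating epoch counter and then picking the counter value with the largest total. The sketch below follows that reading. The default of 20 epochs per bit, the uniform default weight, and the function name are assumptions; the abstract does not specify the weighting function.

```python
def find_bit_boundary(epochs, epochs_per_bit=20, weight=lambda prev, cur: 1.0):
    """Return the counter value at which sign changes accumulate the most weight.

    epochs: sequence of signed epoch values (for example, correlator outputs).
    weight: assumed weighting function applied to each sensed sign change.
    """
    histogram = [0.0] * epochs_per_bit
    for i in range(1, len(epochs)):
        # (b) sense a sign change between consecutive epochs
        if (epochs[i - 1] < 0) != (epochs[i] < 0):
            # (a) + (c): the counter wraps every epochs_per_bit epochs, and
            # the weighted change is recorded at the current counter value
            histogram[i % epochs_per_bit] += weight(epochs[i - 1], epochs[i])
    # (d) the bit boundary is the counter value with the largest total
    return max(range(epochs_per_bit), key=histogram.__getitem__)
```

Because data bits can only flip at a bit boundary, genuine sign changes pile up at one counter value, while noise-induced changes spread across all of them.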
METHODS AND SYSTEMS FOR SEARCHING AUDIO RECORDS

Publication number: 20110019805

Abstract: Methods and systems are provided for searching audio records. Certain embodiments of the invention may be applied to search audio records containing a user's voice for instances where a specific sound, such as a word or phrase, is vocalized by the user. An audio sample is provided by recording the user vocalizing the sound. The audio sample is compared with the audio records to locate matches to the audio sample. In some embodiments, the audio records comprise recordings of calls between a near-end caller and a far-end caller, and the audio sample is a recording of a sound spoken by the near-end caller. The same input device may be used to record both the audio sample and the audio records.

Type: Application

Filed: January 14, 2009

Publication date: January 27, 2011

Applicant: ALGO COMMUNICATION PRODUCTS LTD.

Inventor: Paul William Zoehner
Systems and Methods for Measuring Speech Intelligibility

Publication number: 20100299148

Abstract: A method for measuring speech intelligibility includes inputting a speech waveform to a system. At least one acoustic feature is extracted from the waveform. From the acoustic feature, at least one phoneme is segmented. At least one acoustic correlate measure is extracted from the at least one phoneme and at least one intelligibility measure is determined. The at least one acoustic correlate measure is mapped to the at least one intelligibility measure.

Type: Application

Filed: March 29, 2010

Publication date: November 25, 2010

Inventors: Lee Krause, Mark Skowranski, Bonny Banerjee
Device control, speech recognition device, agent device, control method

Patent number: 7822614

Abstract: A language analyzer performs speech recognition on a speech input by a speech input unit, specifies a possible word which is represented by the speech, and the score thereof, and supplies word data representing them to an agent processing unit. The agent processing unit stores process item data which defines a data acquisition process to acquire word data or the like, a discrimination process, and an input/output process, and wires or data defining transition from one process to another and giving a weighting factor to the transition, and executes a flow represented generally by the process item data and the wires to thereby control devices belonging to an input/output target device group. To which process in the flow the transition takes place is determined by the weighting factor of each wire, which is determined by the connection relationship between a point where the process has proceeded and the wire, and the score of word data.

Type: Grant

Filed: December 6, 2004

Date of Patent: October 26, 2010

Assignee: Kabushikikaisha Kenwood

Inventor: Rika Koyama
Covariance estimation for pattern recognition

Patent number: 7805301

Abstract: A reliable full covariance matrix estimation algorithm for a pattern unit's state output distribution in a pattern recognition system is discussed. An intermediate hierarchical tree structure is built to relate models for product units. Full covariance matrices of a pattern unit's state output distribution are estimated based on all the related nodes in the tree.

Type: Grant

Filed: July 1, 2005

Date of Patent: September 28, 2010

Assignee: Microsoft Corporation

Inventors: Ye Tian, Frank Kao-Ping Soong, Jian-Lai Zhou
Biometrics-based cryptographic key generation system and method

Patent number: 7804956

Abstract: The present invention provides a biometrics-based cryptographic key generation system and method. A user-dependent distinguishable feature transform unit provides a feature transformation for each authentic user, which receives N-dimensional biometric features and performs a feature transformation to produce M-dimensional feature signals, such that the transformed feature signals of the authentic user are compact in the transformed feature space while those of other users presumed as imposters are either diverse or far away from those of the authentic user. A stable key generation unit receives the transformed feature signals to produce a cryptographic key based on bit information respectively provided by the M-dimensional feature signals, wherein the length of the bit information provided by the feature signal of each dimension is proportional to the degree of distinguishability in the dimension.

Type: Grant

Filed: March 11, 2005

Date of Patent: September 28, 2010

Assignee: Industrial Technology Research Institute

Inventors: Yao-Jen Chang, Tsu-Han Chen, Wen-De Zhang
Method for controlling a relational database system

Patent number: 7774337

Abstract: A method for controlling a relational database system, with a query statement comprised of keywords being analyzed, with the RTN being formed of independent RTN building blocks. Each RTN building block has an inner, directed decision graph which is defined independently from the inner, directed decision graphs of the other RTN building blocks with at least one decision position along at least one decision path. The inner decision graphs of all RTN building blocks are run by means of the keywords in a selection step and all possible paths of this decision graph are followed until either no match with the respectively selected path is determined by the decision graph and the process is interrupted, or the respectively chosen path is run until the end.

Type: Grant

Filed: July 10, 2007

Date of Patent: August 10, 2010

Assignee: Mediareif Moestl & Reif Kommunikations-und Informationstechnologien OEG

Inventor: Matthias Moestl
Apparatus, method, and medium for processing audio signal using correlation between bands

Patent number: 7756715

Abstract: Apparatus, method, and medium for processing an audio signal using a correlation between bands are provided. The apparatus includes an encoding unit encoding an input audio signal and a decoding unit decoding the encoded input audio signal.

Type: Grant

Filed: November 17, 2005

Date of Patent: July 13, 2010

Assignee: Samsung Electronics Co., Ltd.

Inventors: Junghoe Kim, Dohyung Kim, Sihwa Lee
Correlating call data and speech recognition information in a telephony application

Patent number: 7738635

Abstract: A method for improving the recognition confidence of alphanumeric spoken input, suitable for use in a speech recognition telephony application such as a voice response system. An alphanumeric candidate is determined from the spoken input, which may be the best available representation of the spoken input. Recognition confidence is compared with a preestablished threshold. If the recognition confidence exceeds the threshold, the alphanumeric candidate is selected to represent the spoken input. Otherwise, present call data associated with the spoken input is determined. Call data may include automatic number identification (ANI) information, caller-ID information, and/or dialed number information service (DNIS) information. Information associated with the alphanumeric candidate and information associated with the present call data are correlated in order to select alphanumeric information that best represents the spoken input.

Type: Grant

Filed: January 6, 2005

Date of Patent: June 15, 2010

Assignees: International Business Machines Corporation, Nuance Communications, Inc.

Inventors: Christopher Ryan Groves, Kevin James Muterspaugh
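
The decision flow in the abstract of patent 7738635 above (accept the recognized candidate when confidence exceeds a threshold, otherwise correlate it with information tied to the call data) can be sketched as follows. How the correlation is scored is not stated in the abstract; this sketch assumes a position-by-position character match against entries already looked up from the call data (for example, account numbers associated with the caller's ANI). All names are hypothetical.

```python
def similarity(a, b):
    """Count positions at which two strings agree (0 if lengths differ)."""
    if len(a) != len(b):
        return 0
    return sum(x == y for x, y in zip(a, b))


def select_alphanumeric(candidate, confidence, threshold, call_data_entries):
    """Pick the string that best represents the spoken input.

    call_data_entries: alphanumeric strings associated with the present
    call data (assumed to have been retrieved beforehand).
    """
    # High confidence, or nothing to correlate against: keep the candidate.
    if confidence > threshold or not call_data_entries:
        return candidate
    # Otherwise choose the call-data entry closest to the candidate.
    return max(call_data_entries,
               key=lambda entry: similarity(candidate, entry))
```

For example, a low-confidence result "A8C123" would be replaced by "ABC123" if that account is linked to the calling number.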
Auto segmentation based partitioning and clustering approach to robust endpointing

Patent number: 7680657

Abstract: Possible segmentations for an audio signal are scored based on distortions for feature vectors of the audio signal and the total number of segments in the segmentation. The scores are used to select a segmentation and the selected segmentation is used to identify a starting point and an ending point for a speech signal in the audio signal.

Type: Grant

Filed: August 15, 2006

Date of Patent: March 16, 2010

Assignee: Microsoft Corporation

Inventors: Yu Shi, Frank Kao-ping Soong, Jian-lai Zhou
Adapter for allowing both online and offline training of a text to text system

Patent number: 7624020

Abstract: An adapter for a text to text training. A main corpus is used for training, and a domain specific corpus is used to adapt the main corpus according to the training information in the domain specific corpus. The adaptation is carried out using a technique that may be faster than the main training. The parameter set from the main training is adapted using the domain specific part.

Type: Grant

Filed: September 9, 2005

Date of Patent: November 24, 2009

Assignee: Language Weaver, Inc.

Inventors: Kenji Yamada, Kevin Knight, Greg Langmead
Robust quantization with efficient WMSE search of a sign-shape codebook using illegal space

Patent number: 7610198

Abstract: A method of searching a signed codebook to quantize a vector includes weighting a shape codevector in a set of shape codevectors with a weighting function for a Weighted Mean Square Error (WMSE) criteria, to produce a weighted shape codevector. The method further includes correlating the weighted shape codevector with the vector to produce a weighted correlation term. The method also includes determining, based on a sign of the weighted correlation term, a preferred one of a positive and a negative signed codevector associated with the shape codevector. The method further includes determining whether one of the signed codevectors does not belong to an illegal space defining illegal vectors.

Type: Grant

Filed: June 7, 2002

Date of Patent: October 27, 2009

Assignee: Broadcom Corporation

Inventor: Jes Thyssen
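
The sign selection in the abstract of patent 7610198 above rests on a short identity: for a sign s in {+1, -1}, the weighted error sum(w * (x - s*c)^2) equals sum(w*x^2) - 2*s*sum(w*c*x) + sum(w*c^2), so the error is smallest when s matches the sign of the weighted correlation term. The sketch below applies that rule and skips candidates that fall in an illegal space. The `is_legal` predicate and the function name are hypothetical; the abstract does not define the illegal space.

```python
def best_signed_codevector(x, shapes, w, is_legal=lambda v: True):
    """Search a sign-shape codebook under a weighted MSE criterion.

    Returns (shape_index, sign) of the best legal candidate, or None if
    every preferred candidate is illegal.
    """
    best = None
    for index, shape in enumerate(shapes):
        # Weighted correlation between the shape codevector and the input.
        corr = sum(wi * ci * xi for wi, ci, xi in zip(w, shape, x))
        # The sign of the correlation picks the preferred signed codevector.
        sign = 1 if corr >= 0 else -1
        candidate = [sign * ci for ci in shape]
        if not is_legal(candidate):
            continue  # candidate lies in the (assumed) illegal space
        error = sum(wi * (xi - vi) ** 2 for wi, xi, vi in zip(w, x, candidate))
        if best is None or error < best[0]:
            best = (error, index, sign)
    return None if best is None else (best[1], best[2])
```

Choosing the sign from the correlation halves the number of full error evaluations compared with testing both signed versions of every shape.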
Speech recognition grammar creating apparatus, control method therefor, program for implementing the method, and storage medium storing the program

Patent number: 7603269

Abstract: A speech recognition grammar creating apparatus is provided which is capable of eliminating the complex labor associated with preparing all rules by taking into account changes of the order of component elements of a speech-recognizing object and possible combinations of component elements including at least one component element that can be omitted. In the speech recognition grammar creating apparatus, an image edit section groups together at least one component element that cannot be omitted and at least one component element that can be omitted, as the speech-recognizing object, into a component element group as an omission-allowed group. An augmented BNF converting section creates the speech recognition grammar by expanding the component element group obtained by the grouping.

Type: Grant

Filed: June 29, 2005

Date of Patent: October 13, 2009

Assignee: Canon Kabushiki Kaisha

Inventors: Kazue Kaneko, Michio Aizawa
Current noise spectrum estimation method and apparatus with correlation between previous noise and current noise signal

Patent number: 7596495

Abstract: A method is provided for recurrently estimating a spectrum of noise at each signal observation interval from a sound signal which contains the noise and which is observed at each signal observation interval. In the method, there are acquired an envelope of a previous spectrum of the noise which has been previously estimated from the sound signal observed at a previous signal observation interval, and an envelope of a current spectrum of the sound signal which is observed at a current signal observation interval subsequent to the previous signal observation interval. Then, a value of correlation is computed between the envelope of the previous spectrum of the noise and the envelope of the current spectrum of the sound signal. A current spectrum of the noise contained in the sound signal observed at the current signal observation interval is estimated in accordance with the computed value of the correlation and based on the previous spectrum of the noise and the current spectrum of the sound signal.

Type: Grant

Filed: March 29, 2005

Date of Patent: September 29, 2009

Assignee: Yamaha Corporation

Inventors: Michiko Kazama, Mikio Tohyama, Toru Hirai
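
The recurrence in the abstract of patent 7596495 above (correlate the envelope of the previous noise spectrum with the envelope of the current signal spectrum, then form the new noise estimate from both spectra according to that correlation) can be sketched as below. The abstract does not give the envelope method or the update formula. This sketch assumes a moving-average envelope, a Pearson correlation, and a linear blend whose rate grows with the correlation; all of these, and the names used, are illustrative choices.

```python
import math


def envelope(spectrum, half_width=1):
    """Moving-average smoothing as a stand-in for the spectral envelope."""
    n = len(spectrum)
    out = []
    for k in range(n):
        lo, hi = max(0, k - half_width), min(n, k + half_width + 1)
        out.append(sum(spectrum[lo:hi]) / (hi - lo))
    return out


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if flat)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def update_noise(prev_noise, current, max_rate=0.5):
    """Blend toward the current spectrum in proportion to envelope correlation."""
    corr = correlation(envelope(prev_noise), envelope(current))
    rate = max_rate * max(0.0, min(1.0, corr))  # assumed mapping to a rate
    return [p + rate * (c - p) for p, c in zip(prev_noise, current)]
```

The intuition behind this choice is that a frame whose envelope resembles the noise envelope is probably noise only, so the estimate may follow it, whereas a dissimilar frame probably contains speech and the previous estimate is held.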
Combined input processing for a computing device

Patent number: 7496513

Abstract: Input is received from at least two different input sources. Information from these sources is combined to provide a result. In a particular example, input from one source corresponds to potential recognition candidates, and input from another source corresponds to other potential candidates. These candidates are combined to select a result.

Type: Grant

Filed: June 28, 2005

Date of Patent: February 24, 2009

Assignee: Microsoft Corporation

Inventors: Frank Kao-Ping Soong, Jian-Lai Zhou, Ye Tian
Digital signal processing method, learning method, apparatuses for them, and program storage medium

Patent number: 7412384

Abstract: A digital signal processing method, a learning method, devices therefor, and a program storage medium are provided which are capable of further improving the waveform reproducibility of a digital signal. Self-correlation coefficients are calculated by cutting parts out of the digital signal with multiple windows having different sizes, and the parts are classified based on the calculation results of the self-correlation coefficients. Then, the digital signal is converted by the prediction method corresponding to the classified class, so that a conversion better suited to the features of the digital signal can be conducted.

Type: Grant

Filed: July 31, 2001

Date of Patent: August 12, 2008

Assignee: Sony Corporation

Inventors: Tetsujiro Kondo, Tsutomu Watanabe
Multiple step adaptive method for time scaling

Patent number: 7337109

Abstract: A multiple step adaptive method for time scaling synthesizes an S3[n] signal from an S1[n] signal and an S2[n] signal. The method comprises the following steps: (a) calculating a first magnitude of a cross-correlation function of the S1[n] signal and the S2[n] signal according to a first index; (b) comparing the first magnitude with a threshold value; (c) if the first magnitude is smaller than the threshold value, calculating a first reference magnitude of the cross-correlation function of the S1[n] signal and the S2[n] signal according to a first reference index behind the first index by a first determined number, or calculating a second reference magnitude of the cross-correlation function of the S1[n] signal and the S2[n] signal according to a second reference index behind the first index by a second number; (d) synthesizing the S3[n] signal by adding the S1[n] signal to the S2[n] signal in accordance with a maximum index corresponding to a largest magnitude among all the magnitudes calculated in (c).

Type: Grant

Filed: October 2, 2003

Date of Patent: February 26, 2008

Assignee: ALI Corporation

Inventor: Gin-Der Wu
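
The abstract of patent 7337109 above describes checking the cross-correlation at a first index, trying further indices only when that magnitude is below a threshold, and then adding the two signals at the index with the largest magnitude. The sketch below is one loose reading of those steps: it walks forward from the first index in fixed steps, which generalizes the one or two reference indices named in the abstract. The step size, search limit, and function names are assumptions, and plain addition is used in the overlap because the abstract says "adding" (practical overlap-add schemes usually cross-fade instead).

```python
def cross_correlation(s1, s2, index):
    """Correlate s1, starting at `index`, with the beginning of s2."""
    overlap = min(len(s1) - index, len(s2))
    return sum(s1[index + n] * s2[n] for n in range(overlap))


def choose_index(s1, s2, first_index, threshold, step, max_index):
    """Return first_index if it is good enough, else the best index tried."""
    best_index = first_index
    best_mag = abs(cross_correlation(s1, s2, first_index))
    if best_mag >= threshold:
        return first_index  # the first magnitude passes the threshold
    for index in range(first_index + step, max_index + 1, step):
        mag = abs(cross_correlation(s1, s2, index))
        if mag > best_mag:
            best_index, best_mag = index, mag
    return best_index


def synthesize(s1, s2, index):
    """Add s2 onto s1 starting at `index`; assumes s2 covers the tail of s1."""
    overlap = len(s1) - index
    out = list(s1[:index])
    for n, value in enumerate(s2):
        out.append(s1[index + n] + value if n < overlap else value)
    return out
```

Skipping the extra correlation evaluations whenever the first index already exceeds the threshold is what makes the search adaptive rather than exhaustive.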
Audience survey system, and system and methods for compressing and correlating audio signals

Patent number: 7284255

Abstract: A system and method are disclosed for performing audience surveys of broadcast audio from radio and television. A small body-worn portable collection unit samples the audio environment of the survey member and stores highly compressed features of the audio programming. A central computer simultaneously collects the audio outputs from a number of radio and television receivers representing the possible selections that a survey member may choose. On a regular schedule the central computer interrogates the portable units used in the survey and transfers the captured audio feature samples. The central computer then applies a feature pattern recognition technique to identify which radio or television station the survey member was listening to at various times of day. This information is then used to estimate the popularity of the various broadcast stations.

Type: Grant

Filed: November 16, 1999

Date of Patent: October 16, 2007

Inventors: Steven G. Apel, Stephen C. Kenyon
Pattern matching method and apparatus

Patent number: 7212968

Abstract: A dynamic programming technique is provided for matching two sequences of phonemes both of which may be generated from text or speech. The scoring of the dynamic programming matching technique uses phoneme confusion scores, phoneme insertion scores and phoneme deletion scores which are obtained in advance in a training session and, if appropriate, confidence data generated by a recognition system if the sequences are generated from speech.

Type: Grant

Filed: October 25, 2000

Date of Patent: May 1, 2007

Assignee: Canon Kabushiki Kaisha

Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
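
The matching technique in the abstract of patent 7212968 above (dynamic programming over two phoneme sequences, scored with confusion, insertion, and deletion scores) can be sketched as a standard alignment recurrence. The scores here are treated as log-probability-like values where higher is better. The constant insertion and deletion scores and the toy confusion function are placeholders; in the abstract these come from a training session, and the optional confidence data is not modeled.

```python
def match_score(seq_a, seq_b, confusion, insertion=-1.0, deletion=-1.0):
    """Best alignment score between two phoneme sequences.

    confusion(a, b): score for decoding phoneme a as phoneme b (assumed callable).
    insertion / deletion: assumed constant scores for unmatched phonemes.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0.0] * cols for _ in range(rows)]
    # Borders: everything in one sequence is deleted or inserted.
    for i in range(1, rows):
        score[i][0] = score[i - 1][0] + deletion
    for j in range(1, cols):
        score[0][j] = score[0][j - 1] + insertion
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + confusion(seq_a[i - 1], seq_b[j - 1]),
                score[i - 1][j] + deletion,   # phoneme of seq_a unmatched
                score[i][j - 1] + insertion,  # phoneme of seq_b unmatched
            )
    return score[-1][-1]


def simple_confusion(a, b):
    """Toy confusion score: exact match is free, any mismatch costs 2."""
    return 0.0 if a == b else -2.0
```

Replacing `simple_confusion` with a trained table would let acoustically similar phonemes (for example "ae" and "eh") score closer to a match than dissimilar ones.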
Determination of the time relation between speech signals affected by time warping

Patent number: 7139705

Abstract: A method of determining the time relation between an original or input speech signal (10) and an output speech signal (15) affected by time warping in a communications system, such as a VoIP (Voice over Internet Protocol) system, wherein corresponding speech bursts (11, 12; 16, 17) of the input (10) and output speech signal (15) are located in accordance with a predefined signal property thereof. The corresponding speech bursts (11, 12; 16, 17) thus located are time aligned (10, 30) for the correction of continuous and discontinuous warping effects. A performance estimate is generated by comparing the time aligned input and output speech signals (10, 30), applying cross-correlation techniques and PSQM (Perceptual Speech Quality Measure) or PSQM+ (Enhanced Perceptual Speech Quality Measure) techniques.

Type: Grant

Filed: November 13, 2000

Date of Patent: November 21, 2006

Assignee: Koninklijke KPN N.V.

Inventors: John Gerard Beerends, Andries Pieter Hekstra
Optical processor enhanced receiver architecture (OPERA)

Patent number: 7130292

Abstract: A method and apparatus for enhancing the receiving and information identification functions of multiple access communications systems by employing one or more optical processors configured as a bank of 1-D correlators. The present invention is particularly useful in a DS/SS CDMA communications system, resulting in a multiuser CDMA system that approaches carrier to noise performance (C/N) as opposed to being limited by multiple access interference (MAI). The correlators are arranged in parallel to detect and/or demodulate the received signal, in conjunction with one or more complex algorithms to perform near-optimum multiuser detection, perform multipath combining and/or perform carrier Doppler compensation.

Type: Grant

Filed: January 19, 2001

Date of Patent: October 31, 2006

Assignee: Essex Corporation

Inventors: Terry M. Turpin, James L. Lafuse
Systems and methods for correlating images in an image correlation system with reduced computational loads

Patent number: 6996291

Abstract: After one or both of a pair of images are obtained, an auto-correlation function for one of those images is generated to determine a smear amount and possibly a smear direction. The smear amount and direction are used to identify potential locations of a peak portion of the correlation function between the pair of images. The pair of images is then correlated only at offset positions corresponding to the one or more of the potential peak locations. In some embodiments, the pair of images is correlated according to a sparse set of image correlation function value points around the potential peak locations. In other embodiments, the pair of images is correlated at a dense set of correlation function value points around the potential peak locations. The correlation function values of these correlation function value points are then analyzed to determine the offset position of the true correlation function peak.

Type: Grant

Filed: August 6, 2001

Date of Patent: February 7, 2006

Assignee: Mitutoyo Corporation

Inventor: Michael Nahum
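
The idea in the abstract of patent 6996291 above, correlating only at offsets near a predicted peak, can be sketched in one dimension as follows. This is a simplified illustration, not the patented method: the candidate offsets are supplied by hand rather than derived from an auto-correlation smear estimate, and the function names and data are hypothetical.

```python
def correlation_at(a, b, offset):
    """Sum of products a[i] * b[i + offset] over the overlapping samples."""
    total = 0.0
    for i, value in enumerate(a):
        j = i + offset
        if 0 <= j < len(b):
            total += value * b[j]
    return total


def best_offset(a, b, candidate_offsets):
    """Evaluate the correlation only at the candidate offsets; return the best."""
    return max(candidate_offsets, key=lambda off: correlation_at(a, b, off))


reference = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
displaced = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0]  # same pattern moved right by 3
candidates = [2, 3, 4]  # hypothetical offsets near the predicted peak
peak = best_offset(reference, displaced, candidates)
```

Only three correlation values are computed here instead of one per possible offset, which is the source of the reduced computational load.
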
Low power passive correlators for multichannel global positioning system signal receiver

Patent number: 6965631

Abstract: One embodiment of the present invention includes a circular shift register, K storage elements, and a code register. The circular shift register having N data samples circularly shifts a first data sample of the N data samples into a data position at a first clock frequency. The N data samples correspond to signal received from one of K satellites in a global positioning system (GPS). The N data samples are loaded into the circular shift register at a second clock frequency. The K storage elements store K code sequences, respectively. Each of the K code sequences has N code samples and includes a first code sample being written at a code position corresponding to the data position at a third clock frequency. The K storage elements correspond to the K satellites. The code register stores the N code samples loaded from one of the K storage elements at a fourth clock frequency. The fourth clock frequency is K times faster than the first clock frequency.

Type: Grant

Filed: March 13, 2001

Date of Patent: November 15, 2005

Assignee: PRI Research & Development Corp.

Inventors: Kaveh Shakeri, Alireza Mehrnia, Farshid Soheili-Najafabadi
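
The circular shift register in the abstract of patent 6965631 above can be pictured with a small software sketch: the data samples are rotated one position at a time and correlated against a stored code at each position. The code sequence and the delay below are hypothetical, only one of the K satellite codes is shown, and the clock-frequency relationships of the hardware are not modeled.

```python
def circular_correlate(data, code):
    """Correlate `code` against every circular shift of `data`."""
    n = len(data)
    return [
        sum(data[(i + shift) % n] * code[i] for i in range(n))
        for shift in range(n)
    ]


code = [1, -1, 1, 1, -1, -1, 1]   # hypothetical 7-chip code sequence
data = code[-2:] + code[:-2]      # the same code, circularly delayed by 2 chips
scores = circular_correlate(data, code)
code_phase = scores.index(max(scores))
```

The shift with the largest score gives the code phase of the received signal relative to the stored code.
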
Apparatus and method for speaker normalization based on biometrics

Patent number: 6823305

Abstract: Speaker normalization is carried out based on biometric information available about a speaker, such as his height, or a dimension of a bodily member or article of clothing. The chosen biometric parameter correlates with the vocal tract length. Speech can be normalized based on the biometric parameter, which thus indirectly normalizes the speech based on the vocal tract length of the speaker. The inventive normalization can be used in model formation, or in actual speech recognition usage, or both. Substantial improvements in accuracy have been noted at little cost. The preferred biometric parameter is height, and the preferred form of scaling is linear scaling with the scale factor proportional to the height of the speaker.

Type: Grant

Filed: December 21, 2000

Date of Patent: November 23, 2004

Assignee: International Business Machines Corporation

Inventor: Ellen M. Eide
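
The preferred scaling in the abstract of patent 6823305 above, linear with a factor proportional to the speaker's height, can be sketched as below. The reference height of 170 cm, the direction of the warp, and the formant values are assumptions made for illustration only.

```python
def scale_factor(height_cm, reference_height_cm=170.0):
    """Scale factor proportional to height (reference height is hypothetical)."""
    return height_cm / reference_height_cm


def normalize_frequencies(frequencies_hz, height_cm, reference_height_cm=170.0):
    """Linearly rescale a list of frequencies by the height-based factor."""
    factor = scale_factor(height_cm, reference_height_cm)
    return [f * factor for f in frequencies_hz]


formants = [500.0, 1500.0, 2500.0]  # hypothetical formant frequencies
normalized = normalize_frequencies(formants, height_cm=187.0)
```

A taller speaker tends to have a longer vocal tract and lower formants, so under these assumptions the frequencies are stretched upward toward the reference speaker.
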
Methods and apparatus for blind channel estimation based upon speech correlation structure

Patent number: 6687672

Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.

Type: Grant

Filed: March 15, 2002

Date of Patent: February 3, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
Speech recognition system and program thereof

Publication number: 20030225581

Abstract: Speech recognition is performed by matching between a characteristic quantity of an inputted speech and a composite HMM obtained by synthesizing a speech HMM (hidden Markov model) and a noise HMM for each speech frame of the inputted speech by use of the composite HMM.

Type: Application

Filed: March 14, 2003

Publication date: December 4, 2003

Applicant: International Business Machines Corporation

Inventors: Tetsuya Takiguchi, Masafumi Nishimura
Information processing method, information processing system, information processing apparatus, health care terminal apparatus, and recording medium

Publication number: 20030182117

Abstract: A question as to the physical condition and a question as to the feeling are asked, answers are accepted by voice, acoustic features are extracted from the answer to the question as to the physical condition, and character string information is extracted from the answer to the question as to the feeling. A correlation between the acoustic features and the character string information is set, and character string information is identified from a newly accepted acoustic feature, thereby performing feeling presumption. Feeling is presumed from the voice in response to the subject's severalty and changes in the subject's age and physical condition. A medical examination by interview on the mental condition is performed on the subject by voice output and voice input, and the subject's mental condition is diagnosed based on the contents of the subject's answer and analysis of the answer by voice.

Type: Application

Filed: January 31, 2003

Publication date: September 25, 2003

Applicant: SANYO ELECTRIC CO., LTD.

Inventors: Rie Monchi, Masakazu Asano, Hirokazu Genno
Speech recognition method and device, speech synthesis method and device, recording medium

Publication number: 20030093273

Abstract: There are provided: a data generating section 3 which differentiates an input aural signal, detects as a sample point a point where a differentiating value satisfies a predetermined condition, and obtains discrete amplitude data on detected sample points and timing data indicative of a time interval between the sample points, and a correlating section 4 for computing correlation data by using the amplitude data and the timing data. Input speech is recognized by matching correlation data, which is generated for input speech by a correlating section 4, with correlation data which is generated in the same manner in advance for a variety of speech and is stored in a data memory 6.

Type: Application

Filed: October 3, 2002

Publication date: May 15, 2003

Inventor: Yukio Koyanagi
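
The data generating step in the abstract of publication 20030093273 above can be sketched as follows. The abstract does not say which condition the differentiated value must satisfy; this sketch assumes a sign change of the first difference (a local extremum), and the function name and signal are hypothetical.

```python
def extract_sample_points(signal):
    """Return (amplitudes, intervals) at points where the first difference changes sign."""
    indices = []
    for i in range(1, len(signal) - 1):
        before = signal[i] - signal[i - 1]
        after = signal[i + 1] - signal[i]
        if before * after < 0:  # assumed condition: derivative changes sign
            indices.append(i)
    amplitudes = [signal[i] for i in indices]
    intervals = [b - a for a, b in zip(indices, indices[1:])]
    return amplitudes, intervals


signal = [0, 2, 5, 3, 1, 4, 6, 2]
amplitudes, intervals = extract_sample_points(signal)
```

The amplitude list and the interval list correspond to the amplitude data and timing data from which the correlating section would then compute its correlation data.
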
Speech processing apparatus and method

Patent number: 6560575

Abstract: An apparatus is provided for checking the consistency between two training words which can be used in, for example, a speech recognition or verification system. Two training examples are aligned using a dynamic programming alignment process and an average frame score is calculated from the alignment results together with the worst score in a number of consecutive frames. These values are then compared with similar values obtained from training examples which are known to be consistent to determine if the training examples are consistent.

Type: Grant

Filed: September 30, 1999

Date of Patent: May 6, 2003

Assignee: Canon Kabushiki Kaisha

Inventor: Robert Alexander Keiller
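
The two measures in the abstract of patent 6560575 above, an average frame score and the worst score over a run of consecutive frames, can be sketched with a basic dynamic time warping alignment. One-dimensional features, an absolute-difference frame cost, and a window of three frames are all assumptions; the comparison against values from known-consistent examples is left out.

```python
def dtw_path_costs(a, b):
    """Align two 1-D sequences with DTW and return the frame costs along the best path."""
    n, m = len(a), len(b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n, m
    costs = []
    while i > 0 and j > 0:
        costs.append(abs(a[i - 1] - b[j - 1]))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    costs.reverse()
    return costs


def consistency_scores(a, b, window=3):
    """Return (average frame cost, worst mean cost over `window` consecutive frames)."""
    costs = dtw_path_costs(a, b)
    window = min(window, len(costs))
    average = sum(costs) / len(costs)
    worst = max(
        sum(costs[k:k + window]) / window
        for k in range(len(costs) - window + 1)
    )
    return average, worst


first = [1, 2, 3, 4, 3, 2]
second = [1, 2, 2, 3, 4, 3, 2]   # same shape, one frame repeated
third = [1, 2, 9, 9, 3, 2]       # hypothetical inconsistent example
same_avg, same_worst = consistency_scores(first, second)
diff_avg, diff_worst = consistency_scores(first, third)
```

The windowed worst score catches a short badly matching stretch that an average over the whole word could hide.
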
Grammars for speech recognition

Publication number: 20030009331

Abstract: Pre-computed context-dependent phoneme representations of a number of constituents of a grammar are processed dynamically by a speech recognizer. The approach provides a configurable tradeoff between data size and recognition-time computation. This tradeoff can be obtained without sacrificing recognition accuracy, and in particular, allows full modeling of all cross-word phoneme contexts. In one aspect of the invention, a specification of a grammar is processed. This specification includes specifications of a number of constituents of the grammar. A first subset of the constituents of the grammar are selected, and the remaining of the constituents form a second subset. For each of the constituents in the first subset the method first includes processing the specification of the constituent to form a first processed representation that defines sequences of elements that are associated with that constituent and that includes words and references to constituents in the first subset.

Type: Application

Filed: July 16, 2001

Publication date: January 9, 2003

Inventors: Johan Schalkwyk, Michael S. Phillips
Digital signal processing method, learning method, apparatuses for them, and program storage medium

Publication number: 20020184018

Abstract: To propose a digital signal processing method and learning method and devices therefor, and a program storage medium which are capable of further improving the waveform reproducibility of a digital signal. Self correlation coefficients D40 and D41 are calculated respectively by cutting parts out of the digital signal D10 by multiple windows having different sizes, and the parts are classified based on the calculation results D15 of the self correlation coefficients D40 and D41 and then, the digital signal D10 is converted by the prediction method corresponding to the classified class, so that the conversion further suitable for the features of the digital signal D10 can be conducted.

Type: Application

Filed: March 29, 2002

Publication date: December 5, 2002

Inventor: Tetsujiro Kondo
Method and apparatus for clustering-based signal segmentation

Patent number: 6314392

Abstract: In a computerized method a continuous signal is segmented in order to determine statistically stationary units of the signal. The continuous signal is sampled at periodic intervals to produce a timed sequence of digital samples. Fixed numbers of adjacent digital samples are grouped into a plurality of disjoint sets or frames. A statistical distance between adjacent frames is determined. The adjacent sets are merged into a larger set of samples or cluster if the statistical distance is less than a predetermined threshold. In an iterative process, the statistical distance between the adjacent sets are determined, and as long as the distance is less than the predetermined threshold, the sets are iteratively merged to segment the signal into statistically stationary units.

Type: Grant

Filed: September 20, 1996

Date of Patent: November 6, 2001

Assignee: Digital Equipment Corporation

Inventors: Brian S. Eberman, William D. Goldenthal
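
The iterative merging in the abstract of patent 6314392 above can be sketched as follows. The abstract does not name the statistical distance; the absolute difference of frame means used here is a stand-in chosen for brevity, and the frame size, threshold, and samples are hypothetical.

```python
from statistics import mean


def segment(samples, frame_size, threshold):
    """Group samples into frames, then merge adjacent frames that are close."""
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    merged = True
    while merged and len(frames) > 1:
        merged = False
        # Stand-in distance: difference between the means of adjacent sets.
        distances = [
            abs(mean(frames[k]) - mean(frames[k + 1]))
            for k in range(len(frames) - 1)
        ]
        k = min(range(len(distances)), key=distances.__getitem__)
        if distances[k] < threshold:
            frames[k:k + 2] = [frames[k] + frames[k + 1]]
            merged = True
    return frames


samples = [1, 1, 1, 1, 1, 1, 5, 5, 5, 5, 9, 9]
units = segment(samples, frame_size=2, threshold=1.0)
```

Merging stops once every pair of neighbors is at least the threshold apart, leaving one set per stationary stretch of the signal.
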
Reference pattern learning system

Patent number: 6275799

Abstract: A first parameter set constituting reference patterns of each category in speech recognition based on pattern matching with a reference pattern is to be determined from a plurality of learning utterance data. The first parameter set is determined so that a third evaluation function, represented by a sum of a first evaluation function and a second evaluation function is maximized. The first evaluation function represents a matching degree between all learning utterances and corresponding reference patterns. The second evaluation function represents a matching degree between elements of the first parameter set.

Type: Grant

Filed: February 2, 1995

Date of Patent: August 14, 2001

Assignee: NEC Corporation

Inventor: Ken-ichi Iso
Wavelet-based energy binning cepstal features for automatic speech recognition

Patent number: 6253175

Abstract: Systems and methods for processing acoustic speech signals which utilize the wavelet transform (and alternatively, the Fourier transform) as a fundamental tool. The method essentially involves “synchrosqueezing” spectral component data obtained by performing a wavelet transform (or Fourier transform) on digitized speech signals. In one aspect, spectral components of the synchrosqueezed plane are dynamically tracked via a K-means clustering algorithm. The amplitude, frequency and bandwidth of each of the components are, thus, extracted. The cepstrum generated from this information is referred to as “K-mean Wastrum.” In another aspect, the result of the K-mean clustering process is further processed to limit the set of primary components to formants. The resulting features are referred to as “formant-based wastrum.” Formants are interpolated in unvoiced regions and the contribution of unvoiced turbulent part of the spectrum are added.

Type: Grant

Filed: November 30, 1998

Date of Patent: June 26, 2001

Assignee: International Business Machines Corporation

Inventors: Sankar Basu, Stephane H. Maes
Speech quality measurement based on radio link parameters and objective measurement of received speech signals

Patent number: 6201960

Abstract: An improved method and system of measuring the perceived speech quality in mobile telecommunications networks is disclosed herein. In an embodiment of the invention, the method uses both radio link parameters and an objective measuring technique performed on received signals to estimate the speech quality perceived by the end-user. A radio link processing stage extracts temporal information from a set of available radio link parameters such as the BER, FER, RxLev, handover statistics, soft information, and speech energy. Concurrently, a speech processing stage is used to process a sequence of original signals and received signals, obtained from the output of a telecommunications system. The signal sequences are processed by an objective measuring technique such as Perceptual Speech Quality Measure (PSQM). The outputs from the radio link processing and speech processing stages are utilized to calculate an estimate for speech quality.

Type: Grant

Filed: June 24, 1997

Date of Patent: March 13, 2001

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Tor Björn Minde, Anders Tomas Uvliden, Per Anders Karlsson, Per Gunnar Heikkilä
System and method for sampling rate transformation in speech recognition

Patent number: 6199041

Abstract: A method and system for transforming a sampling rate in speech recognition systems, in accordance with the present invention, includes the steps of providing cepstral based data including utterances comprised of segments at a reference frequency, the segments being represented by cepstral vector coefficients, converting the cepstral vector coefficients to energy bands in logarithmic spectra, filtering the energy bands of the logarithmic spectra to remove energy bands having a frequency above a predetermined portion of a target frequency and converting the filtered logarithmic spectra to modified cepstral vector coefficients at the target frequency. Another method and system convert system prototypes for speech recognition systems from a reference frequency to a target frequency.

Type: Grant

Filed: November 20, 1998

Date of Patent: March 6, 2001

Assignee: International Business Machines Corporation

Inventors: Fu-Hua Liu, Michael A. Picheny
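
The conversion chain in the abstract of patent 6199041 above (cepstra to log-spectral energy bands, removal of the upper bands, back to cepstra) can be sketched with a plain discrete cosine transform. This is a simplified illustration: it assumes the cepstra are an unnormalized DCT-II of the log band energies, that bands are ordered by frequency, and that the cut-off is given as a band count; real front ends differ in all three respects.

```python
import math


def dct(x):
    """Unnormalized DCT-II: log band energies -> cepstral coefficients."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
        for k in range(n)
    ]


def idct(c):
    """Inverse of dct(): cepstral coefficients -> log band energies."""
    n = len(c)
    return [
        (c[0] + 2 * sum(c[k] * math.cos(math.pi * k * (i + 0.5) / n)
                        for k in range(1, n))) / n
        for i in range(n)
    ]


def convert_cepstra(cepstra, keep_bands):
    """Drop the bands above the cut-off and recompute the cepstra."""
    log_bands = idct(cepstra)
    return dct(log_bands[:keep_bands])


log_bands = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # hypothetical log energies
reference_cepstra = dct(log_bands)
target_cepstra = convert_cepstra(reference_cepstra, keep_bands=4)
```

In this sketch the result matches what would have been computed directly from the lower four bands, which is the effect the band-removal step aims for.
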
Speech quality measurement in mobile telecommunication networks based on radio link parameters

Patent number: 6157830

Abstract: A method and system for measuring the speech quality in a mobile cellular telecommunications network using available radio link parameters is disclosed herein. In a preferred embodiment, the method includes receiving a set of radio link parameters, as defined in a standard or otherwise available, such as the BER, FER, RxLev, handover statistics, soft information, and speech energy. Temporal information is obtained from the radio link parameters to create a set of temporal parameters which can be statistically analyzed, for example, for the maximum and minimum, mean, standard deviation, and autocorrelation values for a time interval. The temporal parameters are combined to yield a set of correlated parameters that are more closely related to the speech quality. An estimator then uses the correlated parameters to calculate an estimate for the speech quality. The method of the present invention takes advantage of temporal information and correlated relationships from the transmitted parameters.

Type: Grant

Filed: May 22, 1997

Date of Patent: December 5, 2000

Assignee: Telefonaktiebolaget LM Ericsson

Inventors: Tor Bjorn Minde, Anders Tomas Uvliden, Per Anders Karlsson, Per Gunnar Heikkilä
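
The temporal parameters listed in the abstract of patent 6157830 above (maximum, minimum, mean, standard deviation, and autocorrelation over a time interval) can be computed as in the sketch below. The BER readings, the lag of one sample, and the function names are hypothetical, and the later combination into correlated parameters and the quality estimator are not shown.

```python
from statistics import mean, pstdev


def autocorrelation(values, lag=1):
    """Normalized autocorrelation of `values` at the given lag."""
    m = mean(values)
    denom = sum((v - m) ** 2 for v in values)
    if denom == 0:
        return 0.0
    num = sum(
        (values[i] - m) * (values[i + lag] - m)
        for i in range(len(values) - lag)
    )
    return num / denom


def temporal_parameters(values, lag=1):
    """Summarize one radio link parameter over a time interval."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": mean(values),
        "std": pstdev(values),
        "autocorr": autocorrelation(values, lag),
    }


# Hypothetical bit error rate readings (percent) over one interval.
ber_percent = [0.5, 0.5, 1.5, 1.5, 0.5, 0.5, 1.5, 1.5]
params = temporal_parameters(ber_percent)
```

The autocorrelation term is what distinguishes bursty errors from evenly spread ones that have the same mean and spread.
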
Word and pattern recognition through overlapping hierarchical tree defined by relational features

Patent number: 5787395

Abstract: A voice recognizing method in which a plurality of voice recognition objective words are provided. Scores are accumulated for an unknown input voice signal as compared to the voice recognition objective words by using parameters which are calculated in advance. Upon receipt of an unknown voice signal, a corresponding voice recognition objective word is extracted and recognized. The voice recognition objective words are structured into an overlapping hierarchical structure by using correlation values between each pair of voice recognition objective words. This correlation may be computed from acoustic features, HMM parameters or the like. Score calculation is performed on the unknown input voice signal by using a dictionary of the voice recognition objective words structured in the hierarchical structure. Upon preliminary recognition, the dictionary of the voice recognition objective words is resorted without recalculation of the correlation values.

Type: Grant

Filed: July 18, 1996

Date of Patent: July 28, 1998

Assignee: Sony Corporation

Inventor: Katsuki Minamino
