Preliminary Matching Patents (Class 704/247)

Methods and systems for secured access to devices and systems

Patent number: 7864987

Abstract: An access system in one embodiment that first determines that someone has correct credentials by using a non-biometric authentication method such as typing in a password, presenting a Smart card containing a cryptographic secret, or having a valid digital signature. Once the credentials are authenticated, then the user must take at least two biometric tests, which can be chosen randomly. In one approach, the biometric tests need only check a template generated from the user who desires access with the stored templates matching the holder of the credentials authenticated by the non-biometric test. Access desirably will be allowed when both biometric tests are passed.

Type: Grant

Filed: April 18, 2006

Date of Patent: January 4, 2011

Assignee: Infosys Technologies Ltd.

Inventors: Kumar Balepur Venkatanna, Rajat Moona, S V Subrahmanya
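
The flow in the abstract of patent 7864987 (a non-biometric credential check followed by at least two randomly chosen biometric tests) might be sketched as follows. This is a minimal illustration, not the patented implementation: the function name grant_access, the dictionary of test callables, and the test names are all hypothetical.

```python
import random


def grant_access(credential_ok, biometric_tests, num_tests=2, rng=random):
    """Return True only if the credential check and every chosen biometric test pass.

    credential_ok   -- result of the non-biometric step (password, smart card, signature)
    biometric_tests -- hypothetical mapping of test name -> zero-argument callable
                       returning True when the user's template matches the stored one
    """
    if not credential_ok:
        return False  # biometric tests are only attempted after credentials pass
    if len(biometric_tests) < num_tests:
        raise ValueError("not enough biometric tests available")
    # Pick the tests at random, as the abstract allows.
    chosen = rng.sample(sorted(biometric_tests), num_tests)
    return all(biometric_tests[name]() for name in chosen)
```

Because each test is checked only against the templates of the credential holder, every callable here performs a one-to-one comparison rather than a search over all enrolled users.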

Handheld electronic device and method for dual-mode disambiguation of text input

Patent number: 7843364

Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate text input. In response to an ambiguous editing input at a location preceding at least a portion of an output word, the software performs one disambiguation operation with respect to the editing input and another disambiguation operation with respect to the editing input in combination with the at least portion of the output word. The results are output in order of decreasing frequency value, with the results of the one disambiguation operation having the portion of the output word appended thereto.

Type: Grant

Filed: December 30, 2008

Date of Patent: November 30, 2010

Assignee: Research In Motion Limited

Inventors: Michael G. Elizarov, Vadim Fux, Dan Rubanovich

Replacing text representing a concept with an alternate written form of the concept

Patent number: 7831423

Abstract: A system enables a transcriptionist to replace a first written form (such as an abbreviation) of a concept with a second written form (such as an expanded form) of the same concept. For example, the system may display to the transcriptionist a draft document produced from speech by an automatic speech recognizer. If the transcriptionist recognizes a first written form of a concept that should be replaced with a second written form of the same concept, the transcriptionist may provide the system with a replacement command. In response, the system may identify the second written form of the concept and replace the first written form with the second written form in the draft document.

Type: Grant

Filed: May 25, 2006

Date of Patent: November 9, 2010

Assignee: Multimodal Technologies, Inc.

Inventor: Kjell Schubert

Target specific data filter to speed processing

Patent number: 7831424

Abstract: A method is presented which reduces data flow and thereby increases processing capacity while preserving a high level of accuracy in a distributed speech processing environment for speaker detection. The method and system of the present invention includes filtering out data based on a target speaker specific subset of labels using data filters. The method preserves accuracy and passes only a fraction of the data by optimizing target specific performance measures. Therefore, a high level of speaker recognition accuracy is maintained while utilizing existing processing capabilities.

Type: Grant

Filed: April 2, 2008

Date of Patent: November 9, 2010

Assignee: International Business Machines Corporation

Inventors: Upendra V. Chaudhari, Juan M. Huerta, Ganesh N. Ramaswamy, Olivier Verscheure

Method and apparatus for training a text independent speaker recognition system using speech data with text labels

Patent number: 7813927

Abstract: There is provided an apparatus for providing a Text Independent (TI) speaker recognition mode in a Text Dependent (TD) Hidden Markov Model (HMM) speaker recognition system and/or a Text Constrained (TC) HMM speaker recognition system. The apparatus includes a Gaussian Mixture Model (GMM) generator and a Gaussian weight normalizer. The GMM generator is for creating a GMM by pooling Gaussians from a plurality of HMM states. The Gaussian weight normalizer is for normalizing Gaussian weights with respect to the plurality of HMM states.

Type: Grant

Filed: June 4, 2008

Date of Patent: October 12, 2010

Assignee: Nuance Communications, Inc.

Inventors: Jiri Navratil, James H. Nealand, Jason W. Pelecanos, Ganesh N. Ramaswamy, Ran D. Zilca
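
The two components named in the abstract of patent 7813927 (a GMM generator that pools Gaussians from several HMM states, and a weight normalizer) can be illustrated with a short sketch. It assumes one-dimensional Gaussians stored as (weight, mean, variance) tuples and equal state priors unless others are supplied; those representation choices and the name pool_gmm are assumptions, not details from the patent.

```python
def pool_gmm(hmm_states, state_priors=None):
    """Pool the Gaussians of several HMM states into one GMM with weights summing to 1.

    hmm_states   -- list of states; each state is a list of (weight, mean, variance)
    state_priors -- optional per-state priors; equal priors are assumed if omitted
    """
    n_states = len(hmm_states)
    if state_priors is None:
        state_priors = [1.0 / n_states] * n_states
    pooled = []
    for prior, mixture in zip(state_priors, hmm_states):
        for weight, mean, variance in mixture:
            # Scale each within-state weight by the prior of its state.
            pooled.append((prior * weight, mean, variance))
    total = sum(weight for weight, _, _ in pooled)
    # Renormalize so the pooled mixture is a valid probability density.
    return [(weight / total, mean, variance) for weight, mean, variance in pooled]
```

For two states with mixtures [(0.5, 0.0, 1.0), (0.5, 1.0, 1.0)] and [(1.0, 3.0, 2.0)], the pooled weights under equal priors are 0.25, 0.25, and 0.5.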

Multiple sound fragments processing and load balancing

Patent number: 7788097

Abstract: A method, system and article of manufacture of recognizing a voice command. One embodiment of the invention comprises: receiving a voice input; using the number of sound fragments, determining a number of sound fragments to be processed in a first set of sound fragments; determining whether the first set of sound fragments of the voice input matches with the first set of sound fragments of a voice command; and if the first set of sound fragments matches with the first set of sound fragments of the voice command, then determining whether one or more remaining sound fragments matches with one or more remaining sound fragments of the voice command.

Type: Grant

Filed: October 31, 2006

Date of Patent: August 31, 2010

Assignee: Nuance Communications, Inc.

Inventors: Joseph H. McIntyre, Victor S. Moore

Speaker recognition in a multi-speaker environment and comparison of several voice prints to many

Patent number: 7778832

Abstract: One-to-many comparisons of callers' voice prints with known voice prints to identify any matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract at least a portion of the customer's voice to create a customer voice print, and it formats the segmented voice print for network transmission to a server. The server compares the customer's voice print with multiple known voice prints to determine any matches, meaning that the customer's voice print and one of the known voice prints are likely from the same person. The identification of any matches can be used for a variety of purposes, such as determining whether to authorize a transaction requested by the customer.

Type: Grant

Filed: September 26, 2007

Date of Patent: August 17, 2010

Assignee: American Express Travel Related Services Company, Inc.

Inventors: Vicki Broman, Vernon Marshall, Seshasayee Bellamkonda, Marcel Leyva, Cynthia Hanson

Quantizing feature vectors in decision-making applications

Patent number: 7769583

Abstract: A system, method and computer program product for classification of an analog electrical signal using statistical models of training data. A technique is described to quantize the analog electrical signal in a manner which maximizes the compression of the signal while simultaneously minimizing the diminution in the ability to classify the compressed signal. These goals are achieved by utilizing a quantizer designed to minimize the loss in a power of the log-likelihood ratio. A further technique is described to enhance the quantization process by optimally allocating a number of bits for each dimension of the quantized feature vector subject to a maximum number of bits available across all dimensions.

Type: Grant

Filed: May 13, 2006

Date of Patent: August 3, 2010

Assignee: International Business Machines Corporation

Inventors: Upendra V. Chaudhari, Hsin I. Tseng, Deepak S. Turaga, Olivier Verscheure

System and method for rescoring N-best hypotheses of an automatic speech recognition system

Patent number: 7761296

Abstract: A system and method for rescoring the N-best hypotheses from an automatic speech recognition system by comparing an original speech waveform to synthetic speech waveforms that are generated for each text sequence of the N-best hypotheses. A distance is calculated from the original speech waveform to each of the synthesized waveforms, and the text associated with the synthesized waveform that is determined to be closest to the original waveform is selected as the final hypothesis. The original waveform and each synthesized waveform are aligned to a corresponding text sequence on a phoneme level. The mean of the feature vectors which align to each phoneme is computed for the original waveform as well as for each of the synthesized hypotheses.

Type: Grant

Filed: April 2, 1999

Date of Patent: July 20, 2010

Assignee: International Business Machines Corporation

Inventors: Raimo Bakis, Ellen M. Eide
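
The rescoring idea in the abstract of patent 7761296 can be sketched roughly as below. The sketch assumes the phoneme-level alignment has already been done, so each hypothesis arrives as a tuple (text, original_segments, synthetic_segments) with one list of feature vectors per phoneme. The Euclidean distance summed over phonemes is an assumed choice; the abstract does not name a specific distance measure.

```python
import math


def mean_vector(frames):
    """Average the feature vectors aligned to one phoneme."""
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def hypothesis_distance(original_segments, synthetic_segments):
    """Sum the distances between per-phoneme mean vectors of the two waveforms."""
    total = 0.0
    for original, synthetic in zip(original_segments, synthetic_segments):
        total += math.dist(mean_vector(original), mean_vector(synthetic))
    return total


def rescore(hypotheses):
    """Return the text whose synthesized waveform is closest to the original."""
    best = min(hypotheses, key=lambda h: hypothesis_distance(h[1], h[2]))
    return best[0]
```

Comparing phoneme means rather than raw frames keeps the distance insensitive to differences in speaking rate between the original and the synthesized speech.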

SIMILAR SPEAKER RECOGNITION METHOD AND SYSTEM USING NONLINEAR ANALYSIS

Publication number: 20100145697

Abstract: Disclosed herein is a similar speaker recognition method and system using nonlinear analysis. The recognition method extracts a nonlinear feature of a sound signal through nonlinear analysis of the sound signal and combines the nonlinear feature with a linear feature such as spectrum. The method transforms sound data in a time domain into status vectors in a phase domain and uses a nonlinear time series analysis method capable of representing nonlinear features of the status vectors to extract nonlinear information of a sound. The method can overcome technical limitations of conventional linear algorithms. The recognition method can be applied to sound-related application systems other than speaker recognition systems.

Type: Application

Filed: October 28, 2009

Publication date: June 10, 2010

Applicant: IUCF-HYU INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY

Inventors: Young-Hun Kwon, Kun-Sang Lee, Sung-IL Yang, Sung-Wook Chang, Jung-Pa Seo, Min-Su Kim, In-Chan Baek

Device, system, and method of liveness detection utilizing voice biometrics

Publication number: 20100131273

Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.

Type: Application

Filed: November 25, 2009

Publication date: May 27, 2010

Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit

Method and apparatus for segmentation of audio interactions

Patent number: 7716048

Abstract: A method and apparatus for segmenting an audio interaction, by locating an anchor segment from each side of the interaction, iteratively classifying additional segments into one of the two sides, and scoring the resulting segmentation. If the score result is below a threshold, the process is repeated until the segmentation score is satisfactory or until a stopping criterion is met. The anchoring and the scoring steps comprise using additional data associated with the interaction, a speaker thereof, internal or external information related to the interaction or to a speaker thereof or the like.

Type: Grant

Filed: January 25, 2006

Date of Patent: May 11, 2010

Assignee: Nice Systems, Ltd.

Inventors: Oren Pereg, Moshe Waserblat

SPEAKER SELECTING DEVICE, SPEAKER ADAPTIVE MODEL CREATING DEVICE, SPEAKER SELECTING METHOD, SPEAKER SELECTING PROGRAM, AND SPEAKER ADAPTIVE MODEL MAKING PROGRAM

Publication number: 20100114572

Abstract: To enable selection of a speaker, the acoustic feature value of which is similar to that of an utterance speaker, with accuracy and stability, while adapting to changes even when the acoustic feature value of the speaker changes every moment. A speaker score calculating means (22) calculates a long-time speaker score (log likelihood of each of a plurality of speaker models stored in a speaker model storage section (31) with respect to the acoustic feature value) based on an arbitrary number of utterances, for example, and calculates a short-time speaker score based on a short-time utterance, for example. A long-time speaker selecting means (23) selects speakers corresponding to a predetermined number of speaker models having a high long-time speaker score.

Type: Application

Filed: February 29, 2008

Publication date: May 6, 2010

Inventors: Masahiro Tani, Tadashi Emori, Yoshifumi Onishi

Speech recognition device and speech recognition method

Patent number: 7711560

Abstract: A speech recognition apparatus equipped with the garbage acoustic model storage unit storing the garbage acoustic model which learned the collection of unnecessary words. A feature value calculation unit calculates the feature parameter necessary for recognition by acoustically analyzing the unidentified input speech including the non-language speech per frame which is a unit for speech analysis. A garbage acoustic score calculation unit calculates the garbage acoustic score by comparing the feature parameter and the garbage acoustic model, and a garbage acoustic score correction unit corrects the garbage acoustic score calculated by the garbage acoustic score calculation unit so as to raise it in the frame where the non-language speech is inputted.

Type: Grant

Filed: February 4, 2004

Date of Patent: May 4, 2010

Assignee: Panasonic Corporation

Inventors: Maki Yamada, Makoto Nishizaki, Yoshihisa Nakatoh, Shinichi Yoshizawa

Microphone initialization enhancement for speech recognition

Patent number: 7636661

Abstract: A method and arrangement for improved speech recognition in a telephonically challenging speakerphone in-car environment. The method includes receiving a signal from a microphone representative of speech to be recognized, performing detection of a transition in the signal indicative of switch on of the microphone, and, in response to the detection, performing speech recognition on the signal with reduced contribution from an initial portion thereof. The initial portion may be treated as optional speech, the speech recognition may be performed with a predetermined redundant sound, and a user may be requested to speak the predetermined redundant sound when speech recognition has fallen below a predetermined threshold. Thus, recognition may be made possible when otherwise it would not be possible, recognition match scoring will be increased as the low weighting given by deleted initial sounds will be eliminated and therefore confusion of the recognized phrase will be reduced.

Type: Grant

Filed: June 30, 2005

Date of Patent: December 22, 2009

Assignee: Nuance Communications, Inc.

Inventors: Adam Pieter De Leeuw, Steven Groeger, Stuart John Hayton

Method and apparatus for automatically generating a general extraction function calculable on an input signal, e.g. an audio signal to extract therefrom a predetermined global characteristic value of its contents, e.g. a descriptor

Patent number: 7624012

Abstract: The invention enables to generate a general function (4) which can operate on an input signal (Sx) to extract from the latter a value (DVex) of a global characteristic value expressing a feature (De) of the information conveyed by that signal. It operates by: generating at least one compound function (CF1-CFn), said compound function being generated from at least one of a set of elementary functions (EF1, EF2, . . .

Type: Grant

Filed: December 16, 2003

Date of Patent: November 24, 2009

Assignee: Sony France S.A.

Inventors: François Pachet, Aymeric Zils

Method and apparatus for determining the possibility of pattern recognition of time series signal

Patent number: 7603274

Abstract: A method and apparatus for determining the possibility of pattern recognition of time series signal independent of a pattern recognition ratio is provided. The method for determining the possibility of pattern recognition of time series signal includes extracting a time forward feature and a time reversed feature from an input signal having a time series pattern, generating time forward alignment and time reversed alignment by using the time forward feature and the time reversed feature, comparing the time forward alignment with the time reversed alignment to compute a likelihood of pattern recognition, and determining that the input signal can be recognized if the likelihood is larger than a predetermined threshold value.

Type: Grant

Filed: November 2, 2005

Date of Patent: October 13, 2009

Assignee: Samsung Electronics Co., Ltd.

Inventor: Kwangil Hwang

Speaker clustering and adaptation method based on the HMM model variation information and its apparatus for speech recognition

Patent number: 7590537

Abstract: A speech recognition method and apparatus perform speaker clustering and speaker adaptation using average model variation information over speakers while analyzing the quantity variation amount and the directional variation amount. In the speaker clustering method, a speaker group model variation is generated based on the model variation between a speaker-independent model and a training speaker ML model. In the speaker adaptation method, the model in which the model variation between a test speaker ML model and a speaker group ML model to which the test speaker belongs which is most similar to a training speaker group model variation is found, and speaker adaptation is performed on the found model. Herein, the model variation in the speaker clustering and the speaker adaptation are calculated while analyzing both the quantity variation amount and the directional variation amount. The present invention may be applied to any speaker adaptation algorithm of MLLR and MAP.

Type: Grant

Filed: December 27, 2004

Date of Patent: September 15, 2009

Assignee: Samsung Electronics Co., Ltd.

Inventors: Namhoon Kim, Injeong Choi, Yoonkyung Song

Handheld electronic device and method for dual-mode disambiguation of text input

Patent number: 7586423

Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate text input. In response to an ambiguous editing input at a location preceding at least a portion of an output word, the software performs one disambiguation operation with respect to the editing input and another disambiguation operation with respect to the editing input in combination with the at least portion of the output word. The results are output in order of decreasing frequency value, with the results of the one disambiguation operation having the portion of the output word appended thereto.

Type: Grant

Filed: June 30, 2006

Date of Patent: September 8, 2009

Assignee: Research In Motion Limited

Inventors: Michael G. Elizarov, Vadim Fux, Dan Rubanovich

Bio-phonetic multi-phrase speaker identity verification

Patent number: 7567901

Abstract: Systems and methods for bio-phonetic multi-phrase speaker identity verification are disclosed. Generally, a speaker identity verification engine generates a dynamic phrase including at least one dynamically-generated word. The speaker identity verification engine prompts a user to speak the dynamic phrase and receives a dynamic phrase utterance. The speaker identity verification engine extracts at least one voice characteristic from the dynamic phrase utterance and compares the at least one voice characteristic with a voice profile to generate a score. The speaker identity verification engine then determines whether to accept a speaker identity claim based on the score.

Type: Grant

Filed: April 13, 2007

Date of Patent: July 28, 2009

Assignee: AT&T Intellectual Property I, L.P.

Inventor: Hisao M. Chang

Handheld electronic device and method for disambiguation of compound text input and for prioritizing compound language solutions according to completeness of text components

Patent number: 7545290

Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound text input. The device is able to assemble language objects in the memory to generate compound language solutions. The device is able to prioritize compound language solutions according to various criteria, including the degree of completeness of the text components of a compound language solution.

Type: Grant

Filed: January 13, 2006

Date of Patent: June 9, 2009

Assignee: Research In Motion Limited

Inventors: Vadim Fux, Michael Elizarov

Method and apparatus for generating and updating a voice tag

Patent number: 7471775

Abstract: A method and apparatus (100) for updating a voice tag comprising N stored voice tag phoneme sequences includes a function (110) for determining (205) an accepted stored voice tag phoneme sequence for an utterance, a function (140) for extracting (210) a current set of M phoneme sequences having highest likelihoods of representing the utterance, a function (160) for updating (215) a reference histogram associated with the accepted voice tag, and a function (160) for updating (225) the voice tag with N selected phoneme sequences that are selected from the current set of M phoneme sequences and the set of N voice tag phoneme sequences, wherein the N selected phoneme sequences have phoneme histograms most closely matching the reference histogram. The method and apparatus (100) also generates a voice tag using some functions (110, 140, 160) that are common with the method and apparatus to update the voice tag, such as the extracting (410) of the current set of M phoneme sequences.

Type: Grant

Filed: June 30, 2005

Date of Patent: December 30, 2008

Assignee: Motorola, Inc.

Inventor: Yan Ming Cheng

Voice authentication system

Patent number: 7447632

Abstract: A voice authentication system includes: a standard template storage part 17 in which a standard template that is generated from a registered voice of an authorized user and featured with a voice characteristic of the registered voice is stored preliminarily in a state of being associated with a personal ID of the authorized user; an identifier input part 15 that allows a user who intends to be authenticated to input a personal ID; a voice input part 11 that allows the user to input a voice; a standard template/registered voice selection part 16 that selects a standard template and a registered voice corresponding to the inputted identifier; a determination part 14 that refers to the selected standard template and determines whether or not the inputted voice is a voice of the authorized user him/herself and whether or not presentation-use information is to be outputted by referring to a predetermined determination reference; a presentation-use information extraction part 19 that extracts information regarding

Type: Grant

Filed: September 29, 2005

Date of Patent: November 4, 2008

Assignee: Fujitsu Limited

Inventor: Taisuke Itou

Method and apparatus for training a text independent speaker recognition system using speech data with text labels

Patent number: 7447633

Abstract: There is provided an apparatus for providing a Text Independent (TI) speaker recognition mode in a Text Dependent (TD) Hidden Markov Model (HMM) speaker recognition system and/or a Text Constrained (TC) HMM speaker recognition system. The apparatus includes a Gaussian Mixture Model (GMM) generator and a Gaussian weight normalizer. The GMM generator is for creating a GMM by pooling Gaussians from a plurality of HMM states. The Gaussian weight normalizer is for normalizing Gaussian weights with respect to the plurality of HMM states.

Type: Grant

Filed: November 22, 2004

Date of Patent: November 4, 2008

Assignee: International Business Machines Corporation

Inventors: Jiri Navratil, James H. Nealand, Jason W. Pelecanos, Ganesh N. Ramaswamy, Ran D. Zilca

SYSTEM AND METHOD OF SPEECH RECOGNITION TRAINING BASED ON CONFIRMED SPEAKER UTTERANCES

Publication number: 20080243504

Abstract: An interactive speech recognition training process and system is disclosed. A speech recognition process is applied to a received speaker utterance. Utterance data are matched by the system with data in a grammar database and the speaker is requested to confirm a determined match. If the system determines from the speaker's response that the match is not confirmed, a negative score is assigned to the utterance data. If the match is determined by the system to be confirmed, a positive score is assigned to the utterance data. Scores for a plurality of such speaker utterances are accumulated in a log file, the accumulated scores used to adjust acoustic models for the grammar database.

Type: Application

Filed: March 30, 2007

Publication date: October 2, 2008

Applicant: Verizon Data Services, Inc.

Inventor: Parind Poi

Content selection systems and methods using speech recognition

Publication number: 20080228481

Abstract: Embodiments of the present invention improve content selection systems and methods using speech recognition. In one embodiment, the present invention includes a speech recognition method comprising storing content on an electronic device, wherein the content is associated with a plurality of content attribute values, adding the content attribute values to a first recognition set of a speech recognizer, receiving a speech input signal in said speech recognizer, generating a plurality of likelihood values in response to the speech input signal, wherein each likelihood value is associated with one content attribute value in the recognition set; and accessing the stored content based on the likelihood values.

Type: Application

Filed: March 13, 2007

Publication date: September 18, 2008

Applicant: Sensory, Incorporated

Inventor: Todd F. Mozer

Optimization of detection systems using a detection error tradeoff analysis criterion

Patent number: 7424425

Abstract: In detection systems, such as speaker verification systems, for a given operating point range, with an associated detection “cost”, the detection cost is preferably reduced by essentially trading off the system error in the area of interest with areas essentially “outside” that interest. Among the advantages achieved thereby are higher optimization gain and better generalization. From a measurable Detection Error Tradeoff (DET) curve of the given detection system, a criterion is preferably derived, such that its minimization provably leads to detection cost reduction in the area of interest. The criterion allows for selective access to the slope and offset of the DET curve (a line in case of normally distributed detection scores, a curve approximated by mixture of Gaussians in case of other distributions). By modifying the slope of the DET curve, the behavior of the detection system is changed favorably with respect to the given area of interest.

Type: Grant

Filed: May 19, 2002

Date of Patent: September 9, 2008

Assignee: International Business Machines Corporation

Inventors: Jiri Navratil, Ganesh N. Ramaswamy

Dynamic N-best algorithm to reduce recognition errors

Patent number: 7421387

Abstract: A method for reducing recognition errors. The method includes receiving an N-best list associated with an input of a computer based recognition system. The N-best list includes one or more hypotheses and associated confidence values. The input is classified in response to the N-best list, resulting in a classification. A re-scoring algorithm that is tuned for the classification is selected. The re-scoring algorithm is applied to the N-best list to create a re-scored N-best list. A hypothesis for the value of the input is selected based on the re-scored N-best list.

Type: Grant

Filed: May 18, 2004

Date of Patent: September 2, 2008

Assignee: General Motors Corporation

Inventor: Kurt S. Godden
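
The steps in the abstract of patent 7421387 (classify the input from its N-best list, pick a re-scoring algorithm tuned for that class, re-score, then select) can be sketched as follows. The classification rule (the confidence gap between the top two hypotheses), the class labels, and the prior-based re-scorer are illustrative assumptions; the abstract does not specify them.

```python
def classify(n_best, margin=0.2):
    """Hypothetical classifier: label the input by the gap between the top two confidences."""
    scores = sorted((confidence for _, confidence in n_best), reverse=True)
    if len(scores) < 2 or scores[0] - scores[1] >= margin:
        return "clear"
    return "ambiguous"


def keep_scores(n_best):
    """Re-scorer for clear inputs: leave the confidences unchanged."""
    return list(n_best)


def make_prior_rescorer(priors):
    """Build a re-scorer that multiplies each confidence by an assumed prior weight."""
    def rescore(n_best):
        return [(hyp, confidence * priors.get(hyp, 1.0)) for hyp, confidence in n_best]
    return rescore


def select_hypothesis(n_best, rescorers):
    """Classify the N-best list, apply the matching re-scorer, and return the top hypothesis."""
    label = classify(n_best)
    rescored = rescorers[label](n_best)
    return max(rescored, key=lambda pair: pair[1])[0]
```

Keeping the re-scorers in a mapping keyed by class label makes it straightforward to add a class with its own tuned algorithm without touching the selection logic.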

Adaptive multi-pass speech recognition system

Patent number: 7401017

Abstract: Method and apparatus for multi-pass speech recognition. An input device receives spoken input. A processor performs a first pass speech recognition technique on the spoken input and forms first pass results. The first pass results include a number of alternative speech expressions, each having an assigned score related to the certainty that the corresponding expression correctly matches the spoken input. The processor selectively performs a second pass speech recognition technique on the spoken input according to the first pass results. Preferably, the second pass attempts to correctly match the spoken input to only those expressions which were identified during the first pass. Otherwise, if one of the expressions identified by the first pass is assigned a score higher than a predetermined threshold (e.g., 95%), the second pass is not performed.

Type: Grant

Filed: April 4, 2006

Date of Patent: July 15, 2008

Assignee: Nuance Communications

Inventors: Hy Murveit, Ashvin Kannan, Ben Shahshahani, Chris Leggetter, Katherine Knill
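
The control flow in the abstract of patent 7401017 can be illustrated with a small sketch: run a first pass, skip the second pass when the best score exceeds the threshold, and otherwise run the second pass over only the first-pass candidates. The function names and the two recognizer callables are hypothetical placeholders; only the 95% threshold comes from the abstract.

```python
def recognize(spoken_input, first_pass, second_pass, threshold=0.95):
    """Two-pass recognition that skips the second pass on a confident first pass.

    first_pass  -- callable returning a list of (expression, score) alternatives
    second_pass -- callable taking the input and a shortlist of expressions
    """
    candidates = first_pass(spoken_input)
    best_expression, best_score = max(candidates, key=lambda c: c[1])
    if best_score > threshold:
        # The first pass is confident enough; the second pass is not performed.
        return best_expression
    # Restrict the second pass to expressions identified during the first pass.
    shortlist = [expression for expression, _ in candidates]
    return second_pass(spoken_input, shortlist)
```

Restricting the second pass to the shortlist is what lets a slower, more accurate recognizer run without searching the full grammar.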

Biometric voice authentication

Patent number: 7386448

Abstract: A system and method enrolls a speaker with an enrollment utterance and authenticates a user with a biometric analysis of an authentication utterance, without the need for a PIN (Personal Identification Number). During authentication, the system uses the same authentication utterance to identify who a speaker claims to be with speaker recognition, and verify whether the speaker is actually the claimed person. Thus, it is not necessary for the speaker to identify biometric data using a PIN. The biometric analysis includes a neural tree network to determine unique aspects of the authentication utterances for comparison to the enrollment authentication. The biometric analysis leverages a statistical analysis using Hidden Markov Models before authorizing the speaker.

Type: Grant

Filed: June 24, 2004

Date of Patent: June 10, 2008

Assignee: T-Netix, Inc.

Inventors: John C. Poss, Dag Boye, Mark W. Mobley

Adaptable communication techniques for electronic devices

Patent number: 7376434

Abstract: Improved approaches for users of electronic devices to communicate with one another are disclosed. The electronic devices have audio and/or textual output capabilities. The improved approaches can enable users to communicate in different ways depending on device configuration, user preferences, prior history, etc. In one embodiment, the communication between users is achieved by short audio or textual messages.

Type: Grant

Filed: August 2, 2006

Date of Patent: May 20, 2008

Assignee: IpVenture, Inc.

Inventors: C. Douglass Thomas, Peter P. Tong

Handheld electronic device with text disambiguation

Patent number: 7312726

Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device. The device enables editing during text entry and also provides a learning function that allows the disambiguation function to adapt to provide a customized experience for the user. The disambiguation function can be selectively disabled and an alternate keystroke interpretation system provided.

Type: Grant

Filed: June 2, 2004

Date of Patent: December 25, 2007

Assignee: Research In Motion Limited

Inventors: Vadim Fux, Michael G. Elizarov, Sergey V. Kolomiets

Method for improving speaker identification by determining usable speech

Patent number: 7177808

Abstract: Method for improving speaker identification by determining usable speech. Degraded speech is preprocessed in a speaker identification (SID) process to produce SID usable and SID unusable segments. Features are extracted and analyzed so as to produce a matrix of optimum classifiers for the detection of SID usable and SID unusable speech segments. Optimum classifiers possess a minimum distance from a speaker model. A decision tree based upon fixed thresholds indicates the presence of a speech feature in a given speech segment. Following preprocessing, degraded speech is measured in one or more time, frequency, cepstral or SID usable/unusable domains. The results of the measurements are multiplied by a weighting factor whose value is proportional to the reliability of the corresponding time, frequency, or cepstral measurements performed. The measurements are fused as information, and usable speech segments are extracted for further processing.

Type: Grant

Filed: August 18, 2004

Date of Patent: February 13, 2007

Assignee: The United States of America as represented by the Secretary of the Air Force

Inventors: Robert E. Yantorno, Daniel S. Benincasa, Stanley J. Wenndt, Brett Y. Smolenski
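
Illustrative sketch for patent 7177808 (not taken from the patent text): the abstract describes multiplying each measurement by a weighting factor proportional to its reliability and fusing the results to pick usable segments. A minimal Python version of that weighted fusion might look like the following. The function name, the normalization by the sum of weights, and the 0.5 threshold are assumptions made for illustration only.

```python
def fuse_usable_segments(segments, reliability, threshold=0.5):
    """Return indices of segments whose reliability-weighted score passes a threshold.

    segments: list of dicts mapping a measurement domain name to a score in [0, 1].
    reliability: dict mapping the same domain names to non-negative weights.
    threshold: hypothetical cut-off; the abstract does not state one.
    """
    total_weight = sum(reliability.values())
    usable = []
    for index, measures in enumerate(segments):
        # Weight each measurement by the reliability of its domain, then fuse by summing.
        fused = sum(reliability[name] * value for name, value in measures.items())
        if fused / total_weight >= threshold:
            usable.append(index)
    return usable
```

With weights of 1.0 for time, 1.0 for frequency, and 2.0 for cepstral measurements, a segment scoring 0.9, 0.8, and 0.7 fuses to 0.775 and is kept, while one scoring 0.2, 0.9, and 0.1 fuses to 0.325 and is dropped.
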
Multichannel voice detection in adverse environments

Patent number: 7146315

Abstract: A multichannel source activity detection system, e.g., a voice activity detection (VAD) system, and method that exploits spatial localization of a target audio source is provided. The method includes the steps of receiving a mixed sound signal by at least two microphones; Fast Fourier transforming each received mixed sound signal into the frequency domain; filtering the transformed signals to output a signal corresponding to a spatial signature of a source; summing an absolute value squared of the filtered signal over a predetermined range of frequencies; and comparing the sum to a threshold to determine if a voice is present. Additionally, the filtering step includes multiplying the transformed signals by an inverse of a noise spectral power matrix, a vector of channel transfer function ratios, and a source signal spectral power.

Type: Grant

Filed: August 30, 2002

Date of Patent: December 5, 2006

Assignee: Siemens Corporate Research, Inc.

Inventors: Radu Victor Balan, Justinian Rosca, Christophe Beaugeant
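
Illustrative sketch for patent 7146315 (not taken from the patent text): the abstract lists the steps as transforming each microphone signal to the frequency domain, filtering, summing the squared magnitude over a frequency range, and comparing against a threshold. The Python below walks through those steps for two microphones. It uses a naive DFT in place of an FFT to stay dependency-free, and the order of the multiplications, the conjugation of the transfer-function ratios, and all names are assumptions.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform (stands in for an FFT)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def detect_voice(mic1, mic2, noise_inv, ratios, source_power, bins, threshold):
    """Return (voice_present, energy) for one frame from two microphones.

    noise_inv[k]: 2x2 inverse noise spectral power matrix for bin k, as ((a, b), (c, d)).
    ratios[k]: pair of channel transfer function ratios for bin k.
    source_power[k]: source signal spectral power for bin k.
    bins: frequency bins to sum over; threshold: decision level (both hypothetical).
    """
    x1, x2 = dft(mic1), dft(mic2)
    energy = 0.0
    for k in bins:
        (a, b), (c, d) = noise_inv[k]
        # Multiply the transformed signals by the inverse noise matrix.
        y1 = a * x1[k] + b * x2[k]
        y2 = c * x1[k] + d * x2[k]
        # Project onto the transfer-function ratios and scale by source power.
        k1, k2 = ratios[k]
        z = source_power[k] * (complex(k1).conjugate() * y1 + complex(k2).conjugate() * y2)
        # Sum the absolute value squared of the filtered signal.
        energy += abs(z) ** 2
    return energy > threshold, energy
```
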
Database annotation and retrieval

Patent number: 7054812

Abstract: A system is provided for determining a sequence of sub-word units representative of at least two words output by a word recognition unit in response to an input word to be recognized. In a preferred embodiment, the word alternatives output by the recognition unit are converted into sequences of phonemes. An optimum alignment between these sequences is then determined using a dynamic programming alignment technique. The sequence of phonemes representative of the input sequences is then determined using this optimum alignment.

Type: Grant

Filed: April 25, 2001

Date of Patent: May 30, 2006

Assignee: Canon Kabushiki Kaisha

Inventors: Jason Peter Andrew Charlesworth, Philip Neil Garner
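
Illustrative sketch for patent 7054812 (not taken from the patent text): the abstract refers to finding an optimum alignment between phoneme sequences with a dynamic programming technique. The Python below shows one common form of such an alignment, a unit-cost edit-distance alignment with backtrace. The cost model and the gap symbol are assumptions, and the sketch stops at the alignment rather than deriving a single representative sequence.

```python
def align(seq_a, seq_b, gap="-"):
    """Align two phoneme sequences; return (cost, list of aligned pairs)."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] is the minimal cost of aligning seq_a[:i] with seq_b[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = cost[i - 1][j - 1] + int(seq_a[i - 1] != seq_b[j - 1])
            cost[i][j] = min(substitute, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back from the end of both sequences to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + int(seq_a[i - 1] != seq_b[j - 1]):
            pairs.append((seq_a[i - 1], seq_b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((seq_a[i - 1], gap))  # phoneme present only in seq_a
            i -= 1
        else:
            pairs.append((gap, seq_b[j - 1]))  # phoneme present only in seq_b
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs
```

For example, aligning ["k", "ae", "t"] with ["k", "t"] gives a cost of 1 and pairs the middle phoneme with a gap.
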
Method and system for verifying and enabling user access based on voice parameters

Patent number: 7054811

Abstract: A system for verifying and enabling user access, which includes a voice registration unit for providing a substantially unique and initial identification of each of a plurality of the speaker/users by finding the speaker/user's voice parameters in a voice registration sample and storing same in a database. The system also includes a voice authenticating unit for substantially absolute verification of an identity of one of said plurality of users. The voice authenticating unit includes a recognition unit for providing a voice authentication sample, and being operative with the database. The voice authenticating unit also includes a decision unit operative with the recognition unit and the database to decide whether the user is the same as the person of the same identity registered with the system, such that the identity of one of the plurality of users is substantially absolutely verified.

Type: Grant

Filed: October 6, 2004

Date of Patent: May 30, 2006

Assignee: Cellmax Systems Ltd.

Inventor: Ziv Barzilay
Method, apparatus, and computer readable media for minimizing the risk of fraudulent access to call center resources

Patent number: 6937702

Abstract: Method, apparatus, and computer-readable media for minimizing the risk of fraudulent access to call center resources. The invention described herein provides a method of minimizing fraudulent access to call center resources, with the method including at least the following. One or more authenticated biometric samples are associated with at least one person. The person then submits at least one test biometric sample during a login process to obtain authorization to access call center resources, for example, to process telephone calls or to receive training. This test biometric sample is captured and the differences between the test biometric sample and the one or more authenticated biometric samples are quantified. Depending on the degree of difference between the at least one authenticated biometric sample and the test biometric sample, the person's request for authorization to access call center resources is dispositioned.

Type: Grant

Filed: June 24, 2002

Date of Patent: August 30, 2005

Assignee: West Corporation

Inventors: Jill M. Vacek, Mark J. Pettay, Hendryanto Rilantono, Mahmood S. Akhwand, Gary L. West
Methods and apparatus for conversational name dialing systems

Patent number: 6925154

Abstract: Techniques for providing an automated conversational name dialing system for placing a call in response to an input by a user. One technique begins with the step of analyzing an input from a user, wherein the input includes information directed to identifying an intended recipient of a telephone call from the user. At least one candidate for the intended recipient is identified in response to the input, wherein the at least one candidate represents at least one potential match between the intended recipient and a predetermined vocabulary. A confidence measure indicative of a likelihood that the at least one candidate is the intended recipient is determined, and additional information is obtained from the user to increase the likelihood that the at least one candidate is the intended recipient, based on the determined confidence measure.

Type: Grant

Filed: May 3, 2002

Date of Patent: August 2, 2005

Assignee: International Business Machines Corporation

Inventors: Yuqing Gao, Bhuvana Ramabhadran, Chengjun Julian Chen, Hakan Erdogan, Michael A. Picheny
Voice recognition system and method

Publication number: 20040260549

Abstract: A voice recognition system includes an adaptive filter and a subtractor. The adaptive filter generates a simulated talk-back voice y(n) by setting a filter coefficient simulating a transfer system in which an input voice corresponding to a voice command and a talk-back voice output from a speaker are input into a microphone and by filtering a talk-back voice x(n). The subtractor extracts the input voice by subtracting the simulated talk-back voice y(n) from mixed sound input into the microphone. With this configuration, the talk-back voice is attenuated from the mixed sound including the input voice and the talk-back voice input into the microphone, and then, the mixed sound is supplied to a voice recognition engine. Accordingly, the user can input his/her voice during a talk-back operation without the need to interrupt it by pressing a speech button every time the user wishes to input the voice. The voice recognition operation time can be thus reduced.

Type: Application

Filed: April 30, 2004

Publication date: December 23, 2004

Inventors: Shuichi Matsumoto, Toru Marumoto
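
Illustrative sketch for publication 20040260549 (not taken from the application text): the abstract describes an adaptive filter that produces a simulated talk-back voice y(n) from x(n), and a subtractor that removes y(n) from the microphone signal. The Python below uses a normalized LMS (NLMS) update as a stand-in, since the abstract does not name the adaptation rule. The tap count, step size, and function name are assumptions.

```python
def cancel_talk_back(mic, talk_back, taps=4, step=0.5, eps=1e-8):
    """Subtract an adaptively estimated talk-back component from the mic signal.

    mic: samples picked up by the microphone (input voice plus talk-back).
    talk_back: the talk-back samples x(n) sent to the loudspeaker.
    Returns the subtractor output, one sample per input sample.
    """
    weights = [0.0] * taps   # filter coefficients modelling the speaker-to-mic path
    history = [0.0] * taps   # x(n), x(n-1), ... most recent first
    output = []
    for d, x in zip(mic, talk_back):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # simulated talk-back y(n)
        e = d - y                                         # subtractor output
        norm = sum(h * h for h in history) + eps
        # NLMS update (assumed): nudge the weights to shrink the residual.
        weights = [w + step * e * h / norm for w, h in zip(weights, history)]
        output.append(e)
    return output
```

When the microphone picks up only a filtered copy of the talk-back signal, the output should decay toward zero as the weights converge; any user speech that is uncorrelated with x(n) would remain in the output.
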
Bio-phonetic multi-phrase speaker identity verification

Publication number: 20040162726

Abstract: A speaker identity claim (SIC) utterance is received and recognized. The SIC utterance is compared with a voice profile registered under the SIC, and a first verification decision is based thereon. A first dynamic phrase (FDP) is generated, and a user is prompted to speak same. An FDP utterance is received, and compared with the voice profile registered under the SIC to make a second verification decision. If the second verification decision indicates a high or low confidence level, the speaker identity claim is accepted or rejected, respectively. If the verification decision indicates a medium confidence level, a second dynamic phrase (SDP) is generated, and the user is prompted to speak same. An SDP utterance is received, and compared with the voice profile registered under the SIC to make a third verification decision. The speaker identity claim is accepted or rejected based on the third verification decision.

Type: Application

Filed: February 13, 2003

Publication date: August 19, 2004

Inventor: Hisao M. Chang
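
Illustrative sketch for publication 20040162726 (not taken from the application text): the abstract describes up to three verification decisions, with the second dynamic phrase requested only when the first dynamic phrase yields a medium confidence level. The Python below mirrors that control flow. The numeric thresholds, the callables that stand in for prompting and scoring, and the choice to reject when the first decision fails are all assumptions.

```python
def verify_identity(score_sic, score_fdp, score_sdp, low=0.3, high=0.7, final=0.5):
    """Return "accept" or "reject" for a speaker identity claim.

    score_sic, score_fdp, score_sdp: zero-argument callables that prompt for the
    utterance (if needed) and return its similarity to the registered voice
    profile, in [0, 1]. They are called lazily, so the second dynamic phrase is
    only requested when required.
    """
    # First decision: the spoken identity claim against the registered profile.
    if score_sic() < low:  # assumption: a failed first decision rejects outright
        return "reject"

    # Second decision: first dynamic phrase (FDP).
    second = score_fdp()
    if second >= high:
        return "accept"
    if second < low:
        return "reject"

    # Medium confidence: fall back to a second dynamic phrase (SDP).
    return "accept" if score_sdp() >= final else "reject"
```
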
Identification apparatus and method

Publication number: 20040128131

Abstract: An audible command can be utilized to both permit identification of the speaker and to permit subsequent actions that comprise a corresponding response to the audible command when the identity of the speaker correlates with that of a previously authorized individual. Such identification can be supplemented with other identification mechanisms. Hierarchical levels of permission can be utilized, with or without confidence level thresholds, to further protect the device against unauthorized access and/or manipulation.

Type: Application

Filed: December 26, 2002

Publication date: July 1, 2004

Applicant: Motorola, Inc.

Inventors: William Campbell, Robert Gardner, Charles Broun
Speech recognition system and standard pattern preparation system as well as speech recognition method and standard pattern preparation method

Patent number: 6741962

Abstract: A speech recognition system for recognizing an input voice of a narrow frequency band. The speech recognition system includes: a frequency band converting unit for converting the input voice of the narrow frequency band into a pseudo voice of a wide frequency band which covers an entirety of the narrow frequency band and which is wider than the narrow frequency band.

Type: Grant

Filed: March 7, 2002

Date of Patent: May 25, 2004

Assignee: NEC Corporation

Inventor: Kenichi Iso
Speaker verification and speaker identification based on a priori knowledge

Patent number: 6697778

Abstract: Client speaker locations in a speaker space are used to generate speech models for comparison with test speaker data or test speaker speech models. The speaker space can be constructed using training speakers that are entirely separate from the population of client speakers, or from client speakers, or from a mix of training and client speakers. Reestimation of the speaker space based on client environment information is also provided to improve the likelihood that the client data will fall within the speaker space. During enrollment of the clients into the speaker space, additional client speech can be obtained when predetermined conditions are met. The speaker distribution can also be used in the client enrollment step.

Type: Grant

Filed: July 5, 2000

Date of Patent: February 24, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Roland Kuhn, Olivier Thyes, Patrick Nguyen, Jean-Claude Junqua, Robert Boman
Biometric identification system

Publication number: 20030229492

Abstract: In one embodiment the present invention provides a method for identity verification comprising the steps of: (a) comparing at least one first spoken voice print of a user speaking at least one piece of personal data against a first stored voice print of the user speaking said at least one piece of personal data; (b) comparing at least one second spoken voice print of the user speaking at least one piece of travel data against a second stored voice print of the user speaking said piece of travel data; and (c) determining if the user is a given individual based on the results of step (a) and step (b).

Type: Application

Filed: September 13, 2002

Publication date: December 11, 2003

Inventor: Marc Edward Nolan
Network-accessible speaker-dependent voice models of multiple persons

Publication number: 20030125947

Abstract: A voice model database server determines the identity of a speaker through a network over which the voice model database server provides to one or more speech-recognition systems output data regarding a person with access to the speech-recognition system receiving the output data. The voice model database server attempts to locate, based on the identity of the speaker, a voice model for the speaker. Finally, the voice model database server retrieves from a storage area the voice model for the speaker, if the voice model database server located a voice model for the speaker.

Type: Application

Filed: January 3, 2002

Publication date: July 3, 2003

Inventor: Michael Allen Yudkowsky
Speech enabled, automatic telephone dialer using names, including seamless interface with computer-based address book programs

Patent number: 6563911

Abstract: The present invention is a speech enabled automatic telephone dialer device, system, and method using a spoken name corresponding to name-telephone number data of computer-based address book programs. The invention includes user telephones connected to a PBX-type telephony mechanism, which is connected to a telephony board of a name dialer device. User computer workstations containing loaded address book programs with name-telephone number data are connected to the name dialer device. The name dialer device includes a host computer in a network; a telephony board for controlling the PBX for dialing; memory within the host computer for storing software and name-telephone number data; and, software to access computer-based address book programs, to receive voice inputs from the PBX-type telephony mechanism, to create converted phonemes from names to match voice inputs with specific name-telephone number data from the computer-based address book programs for initiating an automatic dialing.

Type: Grant

Filed: January 23, 2001

Date of Patent: May 13, 2003

Assignee: iVoice, Inc.

Inventor: Jerome R. Mahoney
System and method for computing and transmitting parameters in a distributed voice recognition system

Publication number: 20030004720

Abstract: A system and method for extracting acoustic features and speech activity on a device and transmitting them in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit and a server VR engine on a server. The local VR engine comprises a feature extraction (FE) module that extracts features from a speech signal, and a voice activity detection module (VAD) that detects voice activity within a speech signal. The system includes filters, framing and windowing modules, power spectrum analyzers, a neural network, a nonlinear element, and other components to selectively provide an advanced front end vector including predetermined portions of the voice activity detection indication and extracted features from the subscriber unit to the server.

Type: Application

Filed: January 28, 2002

Publication date: January 2, 2003

Inventors: Harinath Garudadri, Hynek Hermansky, Lukas Burget, Pratibha Jain, Sachin Kajarekar, Sunil Sivadas, Stephane N. Dupont, Maria Carmen Benitez Ortuzar, Nelson H. Morgan
Combining N-best lists from multiple speech recognizers

Publication number: 20020193991

Abstract: A method and system for utilizing multiple speech recognizers. The speech system includes a port through which an input audio stream may be received, at least two recognizers that may convert the input stream to text or commands, and a combiner able to combine lists of possible results from each recognizer into a combined list. The method includes receiving an input audio stream, routing the stream to one or more recognizers, receiving a list of possible results from each of the recognizers, combining the lists into a combined list and returning at least a subset of the list to the application.

Type: Application

Filed: June 13, 2001

Publication date: December 19, 2002

Applicant: Intel Corporation

Inventors: Steven M. Bennett, Andrew V. Anderson
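
Illustrative sketch for publication 20020193991 (not taken from the application text): the abstract describes combining the lists of possible results from several recognizers into one list and returning a subset of it. The Python below merges N-best lists by summing the scores of identical hypotheses. Summation as the merge rule, scores in [0, 1], and the function name are assumptions.

```python
def combine_nbest(nbest_lists, top_k=3):
    """Merge N-best lists from several recognizers into one ranked list.

    nbest_lists: one list per recognizer, each holding (hypothesis, score) pairs.
    top_k: size of the subset returned to the application.
    """
    totals = {}
    for nbest in nbest_lists:
        for text, score in nbest:
            # Hypotheses proposed by more than one recognizer accumulate score.
            totals[text] = totals.get(text, 0.0) + score
    # Highest combined score first; ties are broken alphabetically for stability.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]
```

If one recognizer ranks "call home" first and another ranks "call Rome" first, a hypothesis that appears in both lists can overtake either recognizer's top choice once the scores are combined.
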
Proofreading assistance techniques for a voice recognition system

Publication number: 20020184022

Abstract: A system that identifies recognized words from a voice recognition system that have the lowest possibility of being correct, and flags those words on a user interface, to help with proofreading.

Type: Application

Filed: June 5, 2001

Publication date: December 5, 2002

Inventor: Gary F. Davenport
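
Illustrative sketch for publication 20020184022 (not taken from the application text): the abstract describes flagging the recognized words least likely to be correct. The Python below marks the lowest-confidence words in a transcript. The per-word confidence values, the asterisk marker, and the number of words to flag are assumptions.

```python
def flag_unlikely_words(words, count=2, marker="*"):
    """Return the transcript with the `count` lowest-confidence words marked.

    words: list of (word, confidence) pairs in spoken order.
    """
    # Rank word positions from least to most confident and keep the lowest few.
    order = sorted(range(len(words)), key=lambda i: words[i][1])
    flagged = set(order[:count])
    return " ".join(
        f"{marker}{word}{marker}" if i in flagged else word
        for i, (word, _) in enumerate(words)
    )
```

For the input [("the", 0.98), ("whether", 0.41), ("is", 0.95), ("fine", 0.62)] with count=1, the result is "the *whether* is fine".
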
System, method and computer program product for large-scale street name speech recognition

Publication number: 20020133344

Abstract: A system, method and computer program product are provided for speech recognition. During operation, a database of words is maintained. Initially, a probability is assigned to each of the words which indicates a prevalency of use of the word. Further, an utterance is received for speech recognition purposes. Such utterance is matched with one of the words in the database based at least in part on the probability.

Type: Application

Filed: January 24, 2001

Publication date: September 19, 2002

Inventor: Bertrand A. Damiba
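
Illustrative sketch for publication 20020133344 (not taken from the application text): the abstract describes matching an utterance to a database word based at least in part on a probability reflecting how often the word is used. The Python below multiplies a per-word acoustic score by that usage probability and picks the best product. The product rule, the score ranges, and all names are assumptions.

```python
def match_utterance(acoustic_scores, usage_probability):
    """Pick the database word that best explains an utterance.

    acoustic_scores: dict mapping each candidate word to how well it matches
        the audio, in [0, 1] (hypothetical recognizer output).
    usage_probability: dict mapping each word to its prevalence of use.
    """
    return max(
        acoustic_scores,
        key=lambda word: acoustic_scores[word] * usage_probability[word],
    )
```

A street name that sounds slightly less like the utterance can still win if it is used far more often: with acoustic scores of 0.40 for "Main Street" and 0.45 for "Maine Street" and usage probabilities of 0.09 and 0.01, the sketch returns "Main Street".
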